Proxy Generic Function

Tagged as lisp, metaobject protocol

Written on 2023-10-03 by Daniel 'jackdaniel' KochmaƄski

It is often hard to refactor software implementing an independent specification. There are already clients of the API so we can't remove operators, and newly added operators must play by the specified rules. There are a few possibilities: break the user contract and make pre-existing software obsolete, or abandon some improvements. There is also an option that software is written in Common Lisp, so you can eat your cake and have it too.

CLIM has two protocols that have a big overlap: sheets and output records. Both abstractions are organized in a similar way and have equivalent operators. In this example let's consider a part of the protocol for managing hierarchies:

;; Sheet hierarchy (sub-)protocol with an example implementation.
(defclass sheet () ()) ; protocol class
(defclass example-sheet (sheet)
  ((children :initform '() :accessor sheet-children)))

(defgeneric note-sheet-adopted (sheet)
  (:method (sheet) nil))

(defgeneric note-sheet-disowned (sheet)
  (:method (sheet) nil))

(defgeneric adopt-sheet (parent child)
  (:method ((parent example-sheet) child)
    (push child (sheet-children parent))
    (note-sheet-adopted child)))

(defgeneric disown-sheet (parent child &optional errorp)
  (:method ((parent example-sheet) child &optional (errorp t))
    (and errorp (assert (member child (sheet-children parent))))
    (setf (sheet-children parent)
          (remove child (sheet-children parent)))
    (note-sheet-disowned child)))

;; Output record hierarchy (sub-)protocol with an example implementation.
(defclass output-record () ()) ; protocol class
(defclass example-record (output-record)
  ((children :initform '() :accessor output-record-children)))

(defgeneric add-output-record (child parent)
  (:method (child (parent example-record))
    (push child (output-record-children parent))))

(defgeneric delete-output-record (child parent &optional errorp)
  (:method (child (parent example-record) &optional (errorp t))
    (and errorp (assert (member child (sheet-children parent))))
    (setf (output-record-children parent)
          (remove child (output-record-children parent)))))

Both protocols are very similar and do roughly the same thing. We are tempted to flesh out a single protocol to reduce the cognitive overhead when dealing with hierarchies.

;; The mixin is not strictly necessary - output records and sheets may have
;; wildly different internal structures - this is for the sake of simplicity;
;; most notably it is _not_ a protocol class. We don't do protocol classes.
(defclass node-mixin ()
  ((scions :initform '() :accessor node-scions)))

(defgeneric note-node-parent-changed (node parent adopted-p)
  (:method (node parent adopted-p)
    (declare (ignore node parent adopted-p))
    nil))

(defgeneric insert-node (elder scion)
  (:method :after (elder scion)
    (note-node-parent-changed scion elder t))
  (:method ((elder node-mixin) scion)
    (push scion (node-scions elder))))

(defgeneric delete-node (elder scion)
  (:method :after (elder scion)
    (note-node-parent-changed scion elder nil))
  (:method ((elder node-mixin) scion)
    (setf (node-scions elder) (remove scion (node-scions elder)))))

We define a mixin class for simplicity. In principle we care only about the new protocol and different classes may have different internal representations. Now that we have a brand new unified protocol, it is time to rewrite the old code:

;; Sheet hierarchy (sub-)protocol with an example implementation.
(defclass sheet () ()) ; protocol class
(defclass example-sheet (node-mixin sheet) ())

(defgeneric note-sheet-adopted (sheet)
  (:method (sheet)
    (declare (ignore sheet))
    nil))

(defgeneric note-sheet-disowned (sheet)
  (:method (sheet)
    (declare (ignore sheet))
    nil))

(defmethod note-node-parent-changed :after ((sheet sheet) parent adopted-p)
  (declare (ignore parent))
  (if adopted-p
      (note-sheet-adopted sheet)
      (note-sheet-disowned sheet)))

(defgeneric adopt-sheet (parent child)
  (:method (parent child)
    (insert-node parent child)))

(defgeneric disown-sheet (parent child &optional errorp)
  (:method (parent child &optional (errorp t))
    (and errorp (assert (member child (node-scions parent))))
    (delete-node parent child)))

;; Output record hierarchy (sub-)protocol with an example implementation.
(defclass output-record () ()) ; protocol class
(defclass example-record (node-mixin output-record) ())

(defgeneric add-output-record (child parent)
  (:method (child parent)
    (insert-node parent child)))

(defgeneric delete-output-record (child parent &optional errorp)
  (:method (child parent &optional (errorp t))
    (and errorp (assert (member child (node-scions parent))))
    (delete-node parent child)))

Peachy! Now we can call (delete-node parent child) and this will work equally well for both sheets and output records. It is time to ship the code and boost how clever we are (and advertise the new API). After a weekend we realize that there is a problem with our solution!

Since the old API is alive and kicking, the user may still call adopt-sheet, or if they want to switch to the new api they may call insert-node. This is fine and we have rewritten all our code so that the new element will always be added. But what about user methods?

There may be a legacy code that defines its additional constraints, for example:

(defvar *temporary-freeze* nil)
(defmethod add-output-record :before (child (record output-record))
  (declare (ignore child record))
  (when *temporary-freeze*
    (error "No-can-do's-ville, baby doll!")))

When the new code calls insert-node, then this method won't be called and the constraint will fail. There is an interesting idea, that perhaps instead of trampolining from the sheet protocol to the node protocol functions we could do it the other way around: specialized node protocol methods will call the sheet protocol functions. This is futile - the problem is symmetrical. In that case if some legacy code calls adopt-sheet, then our node methods won't be called.

That's quite a pickle we are in. The main problem is that we are not in control of all definitions and the cat is out of the bag. So what about the cake? The cake is a lie of course! … I'm kidding, of course there is the cake.

When Common Lisp programmers encounter a problem that seems impossible to solve, they usually think of one of three solutions: write a macro, write a dsl compiler or use the metaobject protocol. Usually the solution is a mix of these three things. We are dealing with generic functions - the MOP it is.

The problem could be summarized as follows:

  1. We have under our control a new function that implements the program logic
  2. We have under our control old functions that call the new function
  3. We have legacy methods outside of our control defined on old functions
  4. We will have new methods outside of our control defined on the new function
  5. Sometimes lambda lists between protocols are not compatible

We want the new function to call legacy methods when invoked, and we want to ensure that old functions always call the new function (i.e it is not possible for legacy (sheet-disown-child :around) methods to bypass delete-node).

In order to do that, we will define a new generic function class responsible for mangling arguments when the method is called with make-method-lambda, and proxying add-method to the target class. That's all. When a new legacy method is added to the generic function sheet-disown-child, then it will be hijacked and added to the generic function delete-node instead.

First some syntactic sugar. defgeneric is a good operator except that it does error when we pass options that are not specified. Moreover some compilers are tempted to macroexpand methods at compile time, so we'll expand the new macro in the dynamic environment of a definition:

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun mappend (fun &rest lists)
    (loop for results in (apply #'mapcar fun lists) append results)))

;;; syntactic sugar -- like defgeneric but accepts unknown options
(defmacro define-generic (name lambda-list &rest options)
  (let ((declarations '())
        (methods '()))
    (labels ((parse-option (option)
               (destructuring-bind (name . value) option
                 (case name
                   (cl:declare
                    (setf declarations (append declarations value))
                    nil)
                   (:method
                     (push value methods)
                     nil)
                   ((:documentation :generic-function-class :method-class)
                    `(,name (quote ,@value)))
                   ((:argument-precedence-order :method-combination)
                    `(,name (quote ,value)))
                   (otherwise
                    `(,name (quote ,value))))))
             (expand-generic (options)
               `(c2mop:ensure-generic-function
                 ',name
                 :name ',name :lambda-list ',lambda-list
                 :declarations ',declarations ,@options))
             (expand-method (method)
               `(c2mop:ensure-method (function ,name) '(lambda ,@method))))
      ;; We always expand to ENSURE-FOO because we want dynamic variables like
      ;; *INSIDE-DEFINE-PROXY-P* to be correctly bound during the creation..
      `(progn
         ,(expand-generic (mappend #'parse-option options))
         ,@(mapcar #'expand-method methods)))))

Now we will add a macro that defines a proxy generic function. We include a dynamic flag that will communicte to make-method-lambda and add-method function, that we are still in the initialization phase and methods should be added to the proxy generic function:

(defvar *inside-define-proxy-p* nil)

(defmacro define-proxy-gf (name lambda-list &rest options)
  `(let ((*inside-define-proxy-p* t))
     (define-generic ,name ,lambda-list
       (:generic-function-class proxy-generic-function)
       ,@options)))

The proxy generic function may have a different lambda list than the target. That's indeed the case with our protocol - we don't have the argument errorp in the function delete-node. We want to allow default methods in order to implement that missing behavior. We will mangle arguments according to the specified template in :mangle-args in the function mangle-args-expressoin.

(defclass proxy-generic-function (c2mop:standard-generic-function)
  ((target-gfun                       :reader target-gfun)
   (target-args :initarg :target-args :reader target-args)
   (mangle-args :initarg :mangle-args :reader mangle-args))
  (:metaclass c2mop:funcallable-standard-class)
  (:default-initargs :target-gfun (error "~s required" :target-gfun)
                     :target-args nil
                     :mangle-args nil))

(defmethod shared-initialize :after ((gf proxy-generic-function) slot-names
                                     &key (target-gfun nil target-gfun-p))
  (when target-gfun-p
    (assert (null (rest target-gfun)))
    (setf (slot-value gf 'target-gfun)
          (ensure-generic-function (first target-gfun)))))

To ensure that a proxied method can invoke call-next-method we must be able to mangle arguments both ways. The target generic functions lambda list is stated verbatim in :target-args argument, while the source generic function lambda list is read from c2mop:generic-function-lambda-list.

The function make-method-lambda is tricky to get it right, but it gives quite a bit of control over the method invocation. Default methods are added normally so we don't mangle arguments in the trampoline method, otherwise we convert the target call into the lambda list of a defined method:

;;; MAKE-METHOD-LAMBDA is expected to return a lambda expression compatible with
;;; CALL-METHOD invocations in the method combination. The first argument are
;;; the prototype generic function arguments (the function a method is initially
;;; defined for) and the reminder are all arguments passed to CALL-METHOD - in a
;;; default combination there is one such argument - next-methods. The second
;;; returned value are extra initialization arguments for the method instance.
;;; 
;;; Our goal is to construct a lambda expression that will construct a function
;;; which instead of the prototype argument list accepts the proxied function
;;; arguments and mangles them to call the defined method body. Something like:
;;;
#+ (or)
(lambda (proxy-gfun-call-args &rest call-method-args)
  (flet ((original-method (method-arg-1 method-arg-2 ...)))
    (apply #'original-method (mangle-args proxy-gfun-call-args))))

(defun mangle-args-expression (gf type args)
  (let ((lambda-list (ecase type
                       (:target (target-args gf))
                       (:source (c2mop:generic-function-lambda-list gf)))))
    `(destructuring-bind ,lambda-list ,args
       (list ,@(mangle-args gf)))))

(defun mangle-method (gf gf-args lambda-expression)
  (let ((mfun (gensym)))
    `(lambda ,(second lambda-expression)
       ;; XXX It is not conforming to shadow locally CALL-NEXT-METHOD. That said
       ;; we subclass C2MOP:STANDARD-GENERIC-FUNCTION and they do that too(!).
       (flet ((call-next-method (&rest args)
                (if (null args)
                    (call-next-method)
                    ;; CALL-NEXT-METHOD is called with arguments are meant for
                    ;; the proxy function lambda list. We first need to destruct
                    ;; them and then mangle again.
                    (apply #'call-next-method 
                           ,(mangle-args-expression gf :target
                             (mangle-args-expression gf :source 'args))))))
         (flet ((,mfun ,@(rest lambda-expression)))
           (apply (function ,mfun) ,(mangle-args-expression gf :target gf-args)))))))

(defmethod c2mop:make-method-lambda
    ((gf proxy-generic-function) method lambda-expression environment)
  (declare (ignorable method lambda-expression environment))
  (if (or *inside-define-proxy-p* (null (mangle-args gf)))
      (call-next-method)
      `(lambda (proxy-args &rest call-method-args)
         (apply ,(call-next-method gf method (mangle-method gf 'proxy-args lambda-expression) environment)
                proxy-args call-method-args))))

That leaves us with the last method add-method that decides where to add the method - to the proxy function or to the target function.

(defmethod add-method ((gf proxy-generic-function) method)
  (when *inside-define-proxy-p*
    (return-from add-method (call-next-method)))
  ;; The warning will go away in the production code because we don't want to
  ;; barf at a normal client code.
  (warn "~s is deprecated, please use ~s instead."
        (c2mop:generic-function-name gf)
        (c2mop:generic-function-name (target-gfun gf)))
  (if (or (typep method 'c2mop:standard-accessor-method) (null (mangle-args gf)))
      ;; XXX readers and writers always have congruent lambda lists so this should
      ;; be fine. Besides we don't know how to construct working accessors on some
      ;; (ekhm sbcl) implementations, because they have problems with invoking
      ;; user-constructed standard accessors (with passed :SLOT-DEFINITION SLOTD).
      (add-method (target-gfun gf) method)
      (let* ((method-class (class-of method))
             (old-lambda-list (c2mop:generic-function-lambda-list gf))
             (new-lambda-list (target-args gf))
             (new-specializers (loop with spec = (c2mop:method-specializers method)
                                     for arg in new-lambda-list
                                     until (member arg '(&rest &optional &key))
                                     collect (nth (position arg old-lambda-list) spec)))
             ;; It would be nice if we could reinitialize the method.. but we can't.
             (new-method (make-instance method-class
                                        :lambda-list new-lambda-list
                                        :specializers new-specializers
                                        :qualifiers (method-qualifiers method)
                                        :function (c2mop:method-function method))))
        (add-method (target-gfun gf) new-method))))

That's it. We've defined a new generic function class that allows specifying proxies. Now we can replace definitions of generic functions that are under our control. The new (the final) implementation looks like this:

;; Sheet hierarchy (sub-)protocol with an example implementation.
(defclass sheet () ()) ; protocol class
(defclass example-sheet (node-mixin sheet) ())

(defgeneric note-sheet-adopted (sheet)
  (:method (sheet)
    (declare (ignore sheet))
    nil))

(defgeneric note-sheet-disowned (sheet)
  (:method (sheet)
    (declare (ignore sheet))
    nil))

(defmethod note-node-parent-changed :after ((sheet sheet) parent adopted-p)
  (declare (ignore parent))
  (if adopted-p
      (note-sheet-adopted sheet)
      (note-sheet-disowned sheet)))

(define-proxy-gf adopt-sheet (parent child)
  (:target-gfun insert-node)
  (:target-args parent child)
  (:mangle-args parent child)
  (:method (parent child)
    (insert-node parent child)))

(define-proxy-gf disown-sheet (parent child &optional errorp)
  (:target-gfun delete-node)
  (:target-args parent child)
  (:mangle-args parent child nil)
  (:method (parent child &optional (errorp t))
    (and errorp (assert (member child (node-scions parent))))
    (delete-node parent child)))

;; Output record hierarchy (sub-)protocol with an example implementation.
(defclass output-record () ()) ; protocol class
(defclass example-record (node-mixin output-record) ())

(define-proxy-gf add-output-record (child parent)
  (:target-gfun insert-node)
  (:target-args parent child)
  (:mangle-args child parent)
  (:method (child parent)
    (insert-node parent child)))

(define-proxy-gf delete-output-record (child parent &optional errorp)
  (:target-gfun insert-node)
  (:target-args parent child)
  (:mangle-args child parent)
  (:method (child parent &optional (errorp t))
    (and errorp (assert (member child (node-scions parent))))
    (delete-node parent child)))

And this code is defined in a separate compilation unit:

;; Legacy code in a third-party library.
(defvar *temporary-freeze* nil)
(defmethod add-output-record :before (child (record output-record))
  (declare (ignore child))
  (when *temporary-freeze*
    (error "No-can-do's-ville, baby doll!")))

;; Bleeding edge code in an experimental third-party library.
(defvar *logging* nil)
(defmethod insert-node :after ((record output-record) child)
  (declare (ignore child))
  (when *logging*
    (warn "The record ~s has been extended!" record)))

Dare we try it? You bet we do!

(defparameter *parent* (make-instance 'example-record))
(defparameter *child1* (make-instance 'example-record))
(defparameter *child2* (make-instance 'example-record))
(defparameter *child3* (make-instance 'example-record))
(defparameter *child4* (make-instance 'example-record))
(defparameter *child5* (make-instance 'example-record))

(add-output-record *child1* *parent*)
(print (node-scions *parent*))        ;1 element

(insert-node *parent* *child2*)
(print (node-scions *parent*))        ;1 element

;; So far good!
(let ((*temporary-freeze* t))
  (handler-case (adopt-sheet *parent* *child3*)
    (error     (c) (print `("Good!" ,c)))
    (:no-error (c) (print `("Bad!!" ,c))))

  (handler-case (add-output-record *child3* *parent*)
    (error     (c) (print `("Good!" ,c)))
    (:no-error (c) (print `("Bad!!" ,c))))

  (handler-case (insert-node *parent* *child3*)
    (error     (c) (print `("Good!" ,c)))
    (:no-error (c) (print `("Bad!!" ,c)))))

;; Still perfect!
(let ((*logging* t))
  (handler-case (adopt-sheet *parent* *child3*)
    (error     (c) (print `("Bad!" ,c)))
    (warning   (c) (print `("Good!",c))))

  (handler-case (add-output-record *child4* *parent*)
    (error     (c) (print `("Bad!" ,c)))
    (warning   (c) (print `("Good!",c))))

  (handler-case (insert-node *parent* *child5*)
    (error     (c) (print `("Bad!" ,c)))
    (warning   (c) (print `("Good!",c)))))

(print `("We should have 5 children -- " ,(length (node-scions *parent*))))
(print (node-scions *parent*))

This solution has one possible drawback. We add methods from the proxy generic function to the target generic function without discriminating. That means that applicable methods defined on adopt-sheet are called when add-output-record is invoked (and vice versa). Moreover methods with the same set of specializers in the target function may replace each other. On the flip side this is what we arguably want – the unified protocol exhibits full behavior of all members. We could have mitigated this problem by signaling an error for conflicting methods from different proxies, but if you think about it, a conforming program must not define methods that are not specialized on a subclass of the standard class - otherwise they risk overwriting internal methods! In other words all is good.

Edit 1 Another caveat is that methods for the proxy generic function must be defined in a different compilation unit than the function. This is because of limitations of defmethod - the macro calls make-method-lambda when it is expanding the body (at compile time), while the function definition is processed at the execution time.

That means that make-method-lambda during the first compilation will be called with a standard-generic-function prototype and the proxy won't work.

Edit 2 To handle correctly call-next-method we need to shadow it. That is not conforming, but works when we subclass c2mop:standard-generic-function. As an alternative we could write a full make-method-lambda expansion that defines both call-next-method and next-method-p.

Cheers!
Daniel

P.S. if you like writing like this you may consider supporting me on Patreon.