Proxy Generic Function
Tagged as lisp, metaobject protocol
Written on 2023-10-03 by Daniel 'jackdaniel' KochmaĆski
It is often hard to refactor software implementing an independent specification. There are already clients of the API so we can't remove operators, and newly added operators must play by the specified rules. There are a few possibilities: break the user contract and make pre-existing software obsolete, or abandon some improvements. There is also an option that software is written in Common Lisp, so you can eat your cake and have it too.
CLIM has two protocols that have a big overlap: sheets and output records. Both abstractions are organized in a similar way and have equivalent operators. In this example let's consider a part of the protocol for managing hierarchies:
;; Sheet hierarchy (sub-)protocol with an example implementation.
(defclass sheet () ()) ; protocol class
(defclass example-sheet (sheet)
((children :initform '() :accessor sheet-children)))
(defgeneric note-sheet-adopted (sheet)
(:method (sheet) nil))
(defgeneric note-sheet-disowned (sheet)
(:method (sheet) nil))
(defgeneric adopt-sheet (parent child)
(:method ((parent example-sheet) child)
(push child (sheet-children parent))
(note-sheet-adopted child)))
(defgeneric disown-sheet (parent child &optional errorp)
(:method ((parent example-sheet) child &optional (errorp t))
(and errorp (assert (member child (sheet-children parent))))
(setf (sheet-children parent)
(remove child (sheet-children parent)))
(note-sheet-disowned child)))
;; Output record hierarchy (sub-)protocol with an example implementation.
(defclass output-record () ()) ; protocol class
(defclass example-record (output-record)
((children :initform '() :accessor output-record-children)))
(defgeneric add-output-record (child parent)
(:method (child (parent example-record))
(push child (output-record-children parent))))
(defgeneric delete-output-record (child parent &optional errorp)
(:method (child (parent example-record) &optional (errorp t))
(and errorp (assert (member child (sheet-children parent))))
(setf (output-record-children parent)
(remove child (output-record-children parent)))))
Both protocols are very similar and do roughly the same thing. We are tempted to flesh out a single protocol to reduce the cognitive overhead when dealing with hierarchies.
;; The mixin is not strictly necessary - output records and sheets may have
;; wildly different internal structures - this is for the sake of simplicity;
;; most notably it is _not_ a protocol class. We don't do protocol classes.
(defclass node-mixin ()
((scions :initform '() :accessor node-scions)))
(defgeneric note-node-parent-changed (node parent adopted-p)
(:method (node parent adopted-p)
(declare (ignore node parent adopted-p))
nil))
(defgeneric insert-node (elder scion)
(:method :after (elder scion)
(note-node-parent-changed scion elder t))
(:method ((elder node-mixin) scion)
(push scion (node-scions elder))))
(defgeneric delete-node (elder scion)
(:method :after (elder scion)
(note-node-parent-changed scion elder nil))
(:method ((elder node-mixin) scion)
(setf (node-scions elder) (remove scion (node-scions elder)))))
We define a mixin class for simplicity. In principle we care only about the new protocol and different classes may have different internal representations. Now that we have a brand new unified protocol, it is time to rewrite the old code:
;; Sheet hierarchy (sub-)protocol with an example implementation.
(defclass sheet () ()) ; protocol class
(defclass example-sheet (node-mixin sheet) ())
(defgeneric note-sheet-adopted (sheet)
(:method (sheet)
(declare (ignore sheet))
nil))
(defgeneric note-sheet-disowned (sheet)
(:method (sheet)
(declare (ignore sheet))
nil))
(defmethod note-node-parent-changed :after ((sheet sheet) parent adopted-p)
(declare (ignore parent))
(if adopted-p
(note-sheet-adopted sheet)
(note-sheet-disowned sheet)))
(defgeneric adopt-sheet (parent child)
(:method (parent child)
(insert-node parent child)))
(defgeneric disown-sheet (parent child &optional errorp)
(:method (parent child &optional (errorp t))
(and errorp (assert (member child (node-scions parent))))
(delete-node parent child)))
;; Output record hierarchy (sub-)protocol with an example implementation.
(defclass output-record () ()) ; protocol class
(defclass example-record (node-mixin output-record) ())
(defgeneric add-output-record (child parent)
(:method (child parent)
(insert-node parent child)))
(defgeneric delete-output-record (child parent &optional errorp)
(:method (child parent &optional (errorp t))
(and errorp (assert (member child (node-scions parent))))
(delete-node parent child)))
Peachy! Now we can call (delete-node parent child)
and this will work equally
well for both sheets and output records. It is time to ship the code and boost
how clever we are (and advertise the new API). After a weekend we realize that
there is a problem with our solution!
Since the old API is alive and kicking, the user may still call adopt-sheet
,
or if they want to switch to the new api they may call insert-node
. This is
fine and we have rewritten all our code so that the new element will always be
added. But what about user methods?
There may be a legacy code that defines its additional constraints, for example:
(defvar *temporary-freeze* nil)
(defmethod add-output-record :before (child (record output-record))
(declare (ignore child record))
(when *temporary-freeze*
(error "No-can-do's-ville, baby doll!")))
When the new code calls insert-node
, then this method won't be called and the
constraint will fail. There is an interesting idea, that perhaps instead of
trampolining from the sheet protocol to the node protocol functions we could do
it the other way around: specialized node protocol methods will call the sheet
protocol functions. This is futile - the problem is symmetrical. In that case if
some legacy code calls adopt-sheet
, then our node methods won't be called.
That's quite a pickle we are in. The main problem is that we are not in control of all definitions and the cat is out of the bag. So what about the cake? The cake is a lie of course! … I'm kidding, of course there is the cake.
When Common Lisp programmers encounter a problem that seems impossible to solve, they usually think of one of three solutions: write a macro, write a dsl compiler or use the metaobject protocol. Usually the solution is a mix of these three things. We are dealing with generic functions - the MOP it is.
The problem could be summarized as follows:
- We have under our control a new function that implements the program logic
- We have under our control old functions that call the new function
- We have legacy methods outside of our control defined on old functions
- We will have new methods outside of our control defined on the new function
- Sometimes lambda lists between protocols are not compatible
We want the new function to call legacy methods when invoked, and we want to
ensure that old functions always call the new function (i.e it is not possible
for legacy (sheet-disown-child :around)
methods to bypass delete-node
).
In order to do that, we will define a new generic function class responsible for
mangling arguments when the method is called with make-method-lambda
, and
proxying add-method
to the target class. That's all. When a new legacy method
is added to the generic function sheet-disown-child
, then it will be hijacked
and added to the generic function delete-node
instead.
First some syntactic sugar. defgeneric
is a good operator except that it does
error when we pass options that are not specified. Moreover some compilers are
tempted to macroexpand methods at compile time, so we'll expand the new macro in
the dynamic environment of a definition:
(eval-when (:compile-toplevel :load-toplevel :execute)
(defun mappend (fun &rest lists)
(loop for results in (apply #'mapcar fun lists) append results)))
;;; syntactic sugar -- like defgeneric but accepts unknown options
(defmacro define-generic (name lambda-list &rest options)
(let ((declarations '())
(methods '()))
(labels ((parse-option (option)
(destructuring-bind (name . value) option
(case name
(cl:declare
(setf declarations (append declarations value))
nil)
(:method
(push value methods)
nil)
((:documentation :generic-function-class :method-class)
`(,name (quote ,@value)))
((:argument-precedence-order :method-combination)
`(,name (quote ,value)))
(otherwise
`(,name (quote ,value))))))
(expand-generic (options)
`(c2mop:ensure-generic-function
',name
:name ',name :lambda-list ',lambda-list
:declarations ',declarations ,@options))
(expand-method (method)
`(c2mop:ensure-method (function ,name) '(lambda ,@method))))
;; We always expand to ENSURE-FOO because we want dynamic variables like
;; *INSIDE-DEFINE-PROXY-P* to be correctly bound during the creation..
`(progn
,(expand-generic (mappend #'parse-option options))
,@(mapcar #'expand-method methods)))))
Now we will add a macro that defines a proxy generic function. We include a
dynamic flag that will communicte to make-method-lambda
and add-method
function, that we are still in the initialization phase and methods should be
added to the proxy generic function:
(defvar *inside-define-proxy-p* nil)
(defmacro define-proxy-gf (name lambda-list &rest options)
`(let ((*inside-define-proxy-p* t))
(define-generic ,name ,lambda-list
(:generic-function-class proxy-generic-function)
,@options)))
The proxy generic function may have a different lambda list than the target.
That's indeed the case with our protocol - we don't have the argument errorp
in the function delete-node
. We want to allow default methods in order to
implement that missing behavior. We will mangle arguments according to the
specified template in :mangle-args
in the function mangle-args-expressoin
.
(defclass proxy-generic-function (c2mop:standard-generic-function)
((target-gfun :reader target-gfun)
(target-args :initarg :target-args :reader target-args)
(mangle-args :initarg :mangle-args :reader mangle-args))
(:metaclass c2mop:funcallable-standard-class)
(:default-initargs :target-gfun (error "~s required" :target-gfun)
:target-args nil
:mangle-args nil))
(defmethod shared-initialize :after ((gf proxy-generic-function) slot-names
&key (target-gfun nil target-gfun-p))
(when target-gfun-p
(assert (null (rest target-gfun)))
(setf (slot-value gf 'target-gfun)
(ensure-generic-function (first target-gfun)))))
To ensure that a proxied method can invoke call-next-method
we must be able to
mangle arguments both ways. The target generic functions lambda list is stated
verbatim in :target-args
argument, while the source generic function lambda
list is read from c2mop:generic-function-lambda-list
.
The function make-method-lambda
is tricky to get it right, but it gives quite
a bit of control over the method invocation. Default methods are added normally
so we don't mangle arguments in the trampoline method, otherwise we convert the
target call into the lambda list of a defined method:
;;; MAKE-METHOD-LAMBDA is expected to return a lambda expression compatible with
;;; CALL-METHOD invocations in the method combination. The first argument are
;;; the prototype generic function arguments (the function a method is initially
;;; defined for) and the reminder are all arguments passed to CALL-METHOD - in a
;;; default combination there is one such argument - next-methods. The second
;;; returned value are extra initialization arguments for the method instance.
;;;
;;; Our goal is to construct a lambda expression that will construct a function
;;; which instead of the prototype argument list accepts the proxied function
;;; arguments and mangles them to call the defined method body. Something like:
;;;
#+ (or)
(lambda (proxy-gfun-call-args &rest call-method-args)
(flet ((original-method (method-arg-1 method-arg-2 ...)))
(apply #'original-method (mangle-args proxy-gfun-call-args))))
(defun mangle-args-expression (gf type args)
(let ((lambda-list (ecase type
(:target (target-args gf))
(:source (c2mop:generic-function-lambda-list gf)))))
`(destructuring-bind ,lambda-list ,args
(list ,@(mangle-args gf)))))
(defun mangle-method (gf gf-args lambda-expression)
(let ((mfun (gensym)))
`(lambda ,(second lambda-expression)
;; XXX It is not conforming to shadow locally CALL-NEXT-METHOD. That said
;; we subclass C2MOP:STANDARD-GENERIC-FUNCTION and they do that too(!).
(flet ((call-next-method (&rest args)
(if (null args)
(call-next-method)
;; CALL-NEXT-METHOD is called with arguments are meant for
;; the proxy function lambda list. We first need to destruct
;; them and then mangle again.
(apply #'call-next-method
,(mangle-args-expression gf :target
(mangle-args-expression gf :source 'args))))))
(flet ((,mfun ,@(rest lambda-expression)))
(apply (function ,mfun) ,(mangle-args-expression gf :target gf-args)))))))
(defmethod c2mop:make-method-lambda
((gf proxy-generic-function) method lambda-expression environment)
(declare (ignorable method lambda-expression environment))
(if (or *inside-define-proxy-p* (null (mangle-args gf)))
(call-next-method)
`(lambda (proxy-args &rest call-method-args)
(apply ,(call-next-method gf method (mangle-method gf 'proxy-args lambda-expression) environment)
proxy-args call-method-args))))
That leaves us with the last method add-method
that decides where to add the
method - to the proxy function or to the target function.
(defmethod add-method ((gf proxy-generic-function) method)
(when *inside-define-proxy-p*
(return-from add-method (call-next-method)))
;; The warning will go away in the production code because we don't want to
;; barf at a normal client code.
(warn "~s is deprecated, please use ~s instead."
(c2mop:generic-function-name gf)
(c2mop:generic-function-name (target-gfun gf)))
(if (or (typep method 'c2mop:standard-accessor-method) (null (mangle-args gf)))
;; XXX readers and writers always have congruent lambda lists so this should
;; be fine. Besides we don't know how to construct working accessors on some
;; (ekhm sbcl) implementations, because they have problems with invoking
;; user-constructed standard accessors (with passed :SLOT-DEFINITION SLOTD).
(add-method (target-gfun gf) method)
(let* ((method-class (class-of method))
(old-lambda-list (c2mop:generic-function-lambda-list gf))
(new-lambda-list (target-args gf))
(new-specializers (loop with spec = (c2mop:method-specializers method)
for arg in new-lambda-list
until (member arg '(&rest &optional &key))
collect (nth (position arg old-lambda-list) spec)))
;; It would be nice if we could reinitialize the method.. but we can't.
(new-method (make-instance method-class
:lambda-list new-lambda-list
:specializers new-specializers
:qualifiers (method-qualifiers method)
:function (c2mop:method-function method))))
(add-method (target-gfun gf) new-method))))
That's it. We've defined a new generic function class that allows specifying proxies. Now we can replace definitions of generic functions that are under our control. The new (the final) implementation looks like this:
;; Sheet hierarchy (sub-)protocol with an example implementation.
(defclass sheet () ()) ; protocol class
(defclass example-sheet (node-mixin sheet) ())
(defgeneric note-sheet-adopted (sheet)
(:method (sheet)
(declare (ignore sheet))
nil))
(defgeneric note-sheet-disowned (sheet)
(:method (sheet)
(declare (ignore sheet))
nil))
(defmethod note-node-parent-changed :after ((sheet sheet) parent adopted-p)
(declare (ignore parent))
(if adopted-p
(note-sheet-adopted sheet)
(note-sheet-disowned sheet)))
(define-proxy-gf adopt-sheet (parent child)
(:target-gfun insert-node)
(:target-args parent child)
(:mangle-args parent child)
(:method (parent child)
(insert-node parent child)))
(define-proxy-gf disown-sheet (parent child &optional errorp)
(:target-gfun delete-node)
(:target-args parent child)
(:mangle-args parent child nil)
(:method (parent child &optional (errorp t))
(and errorp (assert (member child (node-scions parent))))
(delete-node parent child)))
;; Output record hierarchy (sub-)protocol with an example implementation.
(defclass output-record () ()) ; protocol class
(defclass example-record (node-mixin output-record) ())
(define-proxy-gf add-output-record (child parent)
(:target-gfun insert-node)
(:target-args parent child)
(:mangle-args child parent)
(:method (child parent)
(insert-node parent child)))
(define-proxy-gf delete-output-record (child parent &optional errorp)
(:target-gfun insert-node)
(:target-args parent child)
(:mangle-args child parent)
(:method (child parent &optional (errorp t))
(and errorp (assert (member child (node-scions parent))))
(delete-node parent child)))
And this code is defined in a separate compilation unit:
;; Legacy code in a third-party library.
(defvar *temporary-freeze* nil)
(defmethod add-output-record :before (child (record output-record))
(declare (ignore child))
(when *temporary-freeze*
(error "No-can-do's-ville, baby doll!")))
;; Bleeding edge code in an experimental third-party library.
(defvar *logging* nil)
(defmethod insert-node :after ((record output-record) child)
(declare (ignore child))
(when *logging*
(warn "The record ~s has been extended!" record)))
Dare we try it? You bet we do!
(defparameter *parent* (make-instance 'example-record))
(defparameter *child1* (make-instance 'example-record))
(defparameter *child2* (make-instance 'example-record))
(defparameter *child3* (make-instance 'example-record))
(defparameter *child4* (make-instance 'example-record))
(defparameter *child5* (make-instance 'example-record))
(add-output-record *child1* *parent*)
(print (node-scions *parent*)) ;1 element
(insert-node *parent* *child2*)
(print (node-scions *parent*)) ;1 element
;; So far good!
(let ((*temporary-freeze* t))
(handler-case (adopt-sheet *parent* *child3*)
(error (c) (print `("Good!" ,c)))
(:no-error (c) (print `("Bad!!" ,c))))
(handler-case (add-output-record *child3* *parent*)
(error (c) (print `("Good!" ,c)))
(:no-error (c) (print `("Bad!!" ,c))))
(handler-case (insert-node *parent* *child3*)
(error (c) (print `("Good!" ,c)))
(:no-error (c) (print `("Bad!!" ,c)))))
;; Still perfect!
(let ((*logging* t))
(handler-case (adopt-sheet *parent* *child3*)
(error (c) (print `("Bad!" ,c)))
(warning (c) (print `("Good!",c))))
(handler-case (add-output-record *child4* *parent*)
(error (c) (print `("Bad!" ,c)))
(warning (c) (print `("Good!",c))))
(handler-case (insert-node *parent* *child5*)
(error (c) (print `("Bad!" ,c)))
(warning (c) (print `("Good!",c)))))
(print `("We should have 5 children -- " ,(length (node-scions *parent*))))
(print (node-scions *parent*))
This solution has one possible drawback. We add methods from the proxy generic
function to the target generic function without discriminating. That means that
applicable methods defined on adopt-sheet
are called when add-output-record
is invoked (and vice versa). Moreover methods with the same set of specializers
in the target function may replace each other. On the flip side this is what we
arguably want – the unified protocol exhibits full behavior of all members. We
could have mitigated this problem by signaling an error for conflicting methods
from different proxies, but if you think about it, a conforming program must not
define methods that are not specialized on a subclass of the standard class -
otherwise they risk overwriting internal methods! In other words all is good.
Edit 1 Another caveat is that methods for the proxy generic function must be
defined in a different compilation unit than the function. This is because of
limitations of defmethod
- the macro calls make-method-lambda
when it is
expanding the body (at compile time), while the function definition is processed
at the execution time.
That means that make-method-lambda
during the first compilation will be called
with a standard-generic-function
prototype and the proxy won't work.
Edit 2 To handle correctly call-next-method
we need to shadow it. That is
not conforming, but works when we subclass c2mop:standard-generic-function
. As
an alternative we could write a full make-method-lambda
expansion that defines
both call-next-method
and next-method-p
.
Cheers!
Daniel
P.S. if you like writing like this you may consider supporting me on Patreon.