Selective waste collection

Tagged as lisp, hacks

Written on 2021-10-21 by Daniel 'jackdaniel' Kochmański

When an object in Common Lisp is no longer reachable it becomes eligible for garbage collection. Some implementations provide the functionality to set finalizers for such objects. A finalizer is a function that is run once the object is no longer reachable.

Whether the finalizer is run before the object is deallocated or after is a nuance differing between implementations.

On ABCL, CMU CL, LispWorks, Mezzano, SBCL and Scieneer CL the finalizer does not accept any arguments and it must not capture the finalized object (otherwise the object would always be reachable); in effect, the object may already be deallocated when the finalizer runs. Being the least common denominator, this is the approach taken in the portability library trivial-garbage.

(let* ((file (open "my-file"))
       (object (make-instance 'pseudo-stream :file file)))
  ;; The finalizer closes over the file, never over the object itself.
  (flet ((close-file () (close file)))
    (trivial-garbage:finalize object #'close-file)))

In contrast, on ACL, CCL, Clasp, CLISP, Corman and ECL the finalizer accepts one argument: the finalized object. This relieves the programmer from the concern of what should be captured, but puts the burden on the programmer to ensure that there are no circular dependencies between finalized objects.

(let ((object (make-instance 'pseudo-stream :file (open "my-file"))))
  (flet ((finalize (stream) (close (slot-value stream 'file))))
    (another-garbage:set-finalizer object #'finalize)))

An implementation of the first approach may, for instance, store weak pointers to objects with registered finalizers and call the finalizer when its weak pointer is broken.

The second approach requires more synchronization with the GC and, for some strategies, makes it possible to spare objects from being collected, e.g. by stipulating that finalizers are executed in topological order, one per garbage collection cycle.

In this post I want to discuss a certain problem related to finalizers I've encountered in an existing codebase. Consider the following code:

(defclass pseudo-stream ()
  ((resource :initarg :resource :accessor resource)))

(defun open-pseudo-stream (uri)
  (make-instance 'pseudo-stream :resource (make-resource uri)))

(defun close-pseudo-stream (object)
  (destroy-resource (resource object)))

(defvar *pseudo-streams* (make-hash-table))

(defun reload-pseudo-streams ()
  (loop for uri in *uris*
        do (setf (gethash uri *pseudo-streams*)
                 (open-pseudo-stream uri))))

The function reload-pseudo-streams may be executed e.g. to invalidate caches. Its main problem is that it leaks resources by not closing the old pseudo stream before opening a new one. If the resource consumes a file descriptor then we'll eventually run out of them.

A naive solution is to close a stream after assigning a new one:

(defun reload-pseudo-streams/incorrect ()
  (loop for uri in *uris*
        for old = (gethash uri *pseudo-streams*)
        do (setf (gethash uri *pseudo-streams*)
                 (open-pseudo-stream uri))
           (close-pseudo-stream old)))

This solution is not good enough because it is prone to race conditions. In the example below we witness that the old stream (that is closed) may still be referenced after a new one is put in the hash table.

(defun nom-the-stream (uri)
  (loop
    (let ((stream (gethash uri *pseudo-streams*)))
      (some-long-computation-1 stream)
      ;; reload-pseudo-streams/incorrect called, the stream is closed
      (some-long-computation-2 stream) ;; <-- aaaa
      )))

This is a moment when you should consider abandoning the function reload-pseudo-streams/incorrect and using a finalizer. The new version of the function open-pseudo-stream destroys the resource only when the stream is no longer reachable, so the function nom-the-stream can safely nom.

When the finalizer accepts the object as an argument then it is enough to register the function close-pseudo-stream. Otherwise, since we can't close over the stream, we close over the resource and open-code destroying it.

(defun open-pseudo-stream (uri)
  (let* ((resource (make-resource uri))
         (stream (make-instance 'pseudo-stream :resource resource)))

    #+trivial-garbage ;; closes over the resource (not the stream)
    (flet ((finalizer () (destroy-resource resource)))
      (set-finalizer stream #'finalizer))

    #+another-garbage ;; doesn't close over anything
    (set-finalizer stream #'close-pseudo-stream)

    stream))

Story closed, the problem is fixed. It is late Friday afternoon, so we eagerly push the commit to the production system and head home with a warm feeling of fulfilled duty. Two hours later all hell breaks loose and the system fails. The problem is the following function:

(defun run-client (stream)
  (assert (pseudo-stream-open-p stream))
  (loop for message = (read-message stream)
        do (process-message message)
        until (eql message :server-closed-connection)
        finally (close-pseudo-stream stream)))

The resource is released twice! The first time when the function run-client closes the stream and the second time when the stream is finalized. A fix for this issue depends on the finalization strategy:

#+trivial-garbage ;; just remove the reference
(defun close-pseudo-stream (stream)
  (setf (resource stream) nil))

#+another-garbage ;; remove the reference and destroy the resource
(defun close-pseudo-stream (stream)
  (when-let ((resource (resource stream))) ; when-let comes from Alexandria
    (setf (resource stream) nil)
    (destroy-resource resource)))

With this, closing the stream doesn't interfere with the finalization. Hurray! Hopefully nobody noticed, it was late Friday afternoon after all. This little incident taught us to never push the code before testing it.

We build the application from scratch, test it a little and... it doesn't work. After some investigation we find the culprit: a function that creates a new stream with the same resource and closes it.

(defun invoke-like-a-good-citizen-with-pseudo-stream (original-stream fn)
  (let* ((resource (resource original-stream))
         (new-stream (make-instance 'pseudo-stream :resource resource)))
    (unwind-protect (funcall fn new-stream)
      (close-pseudo-stream new-stream))))

Thanks to our previous provisions, closing the stream doesn't collide with finalization. However, the resource is destroyed once for each finalized stream, because it is shared between distinct instances.
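The sharing problem can be reproduced in isolation. The following is only a minimal sketch: destroy-resource is a stand-in that counts its invocations, *destroy-count* and demo-shared-resource are hypothetical names, and the finalizer of the original stream is modelled by an explicit call to close-pseudo-stream.

```lisp
(defvar *destroy-count* 0
  "Hypothetical counter of DESTROY-RESOURCE calls.")

(defun destroy-resource (resource)
  "Stand-in for the real destructor; it only counts invocations."
  (declare (ignore resource))
  (incf *destroy-count*))

(defclass pseudo-stream ()
  ((resource :initarg :resource :accessor resource)))

(defun close-pseudo-stream (stream)
  "Per-stream guard: destroys the resource at most once per stream."
  (let ((resource (resource stream)))
    (when resource
      (setf (resource stream) nil)
      (destroy-resource resource))))

(defun demo-shared-resource ()
  "Close two streams sharing one resource; return the number of destroys."
  (let* ((*destroy-count* 0)
         (resource (list :fake-resource))
         (original (make-instance 'pseudo-stream :resource resource))
         (copy (make-instance 'pseudo-stream :resource resource)))
    (close-pseudo-stream copy)
    (close-pseudo-stream copy)     ; closing the same stream again is a no-op
    (close-pseudo-stream original) ; the other stream destroys it again
    *destroy-count*))
```

The guard works per stream, so it can't notice that another instance has already released the very same resource.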

When the finalizer accepts the collected object as an argument then the solution is easy because all we need is to finalize the resource instead of the pseudo stream (and honestly we should do it from the start!):

#+another-garbage
(defun open-pseudo-stream (uri)
  (let* ((resource (make-resource uri))
         (stream (make-instance 'pseudo-stream :resource resource)))
    (set-finalizer resource #'destroy-resource)
    stream))

#+another-garbage
(defun close-pseudo-stream (stream)
  (setf (resource stream) nil))

When the finalizer doesn't accept the object we need to resort to a trick and finalize a shared pointer instead of the verbatim resource. The downside is that we always need to unwrap it before use.

#+trivial-garbage
(defun open-pseudo-stream (uri)
  (let* ((resource (make-resource uri))
         (wrapped (list resource))
         (stream (make-instance 'pseudo-stream :resource wrapped)))
    (flet ((finalize () (destroy-resource resource)))
      (set-finalizer wrapped #'finalize))
    stream))

#+trivial-garbage
(defun close-pseudo-stream (stream)
  (setf (resource stream) nil))
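The unwrapping mentioned above can be hidden behind a small reader. This is only a sketch: unwrapped-resource is a hypothetical name, and it assumes the wrapper is the one-element list created in open-pseudo-stream.

```lisp
(defclass pseudo-stream ()
  ((resource :initarg :resource :accessor resource)))

(defun unwrapped-resource (stream)
  "Return the resource kept in STREAM's one-element wrapper list.
Returns NIL when the stream has been closed (the slot holds NIL)."
  (first (resource stream)))
```

Code that previously called resource to get at the verbatim resource would call unwrapped-resource instead, while the finalizer stays attached to the wrapper.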

When writing this post I got too enthusiastic and dramatized the part about the production systems a little, but it is a fact that I proposed a fix similar to the first finalization attempt in this post, and when it got merged it broke the production system. That didn't last long though, because the older build was deployed almost immediately. Cheers!