posted on 2017-09-05 by Tomek "uint" Kurcz
What is FiveAM?
FiveAM is a simple-yet-mature test framework. It makes test suites for your project easy to implement, maintain, organize and run.
While it can't be said that there are no learning materials provided for FiveAM, it feels like they are lacking in both clarity and detail. Beginners are in need of gentle, friendly guidance. Experienced Lisp hackers are able to make do without it, but even they probably spend a little extra time tinkering, experimenting and skimming source code to "get" the framework. This shouldn't be necessary.
This tutorial assumes familiarity with Common Lisp and a basic understanding of ASDF system definitions.
Our building blocks
We will start with a bit of theorizing. Be not afraid, however - there won't be too much of it.
The essential terms you will need to be familiar with are:
- Checks
- Tests
- Test suites
A check is, essentially, a single assertion - a line of code that makes sure something that should be true is indeed true. FiveAM tries to make assertions as simple as possible. The form of a basic check definition looks like this:
(is test &rest reason-args)
In this case, test is the assertion we want to make. A function (or special operator) application with any number of arguments can be used as the assertion. If it returns a true value, the assertion succeeds; if it returns NIL, it fails.
If the test parameter matches any of the 4 "templates" below, FiveAM will try to reason a little about what is what and attempt to print the explanations of failures in a more readable way. Arguably.
(predicate value)
(predicate expected value)
(not (predicate value))
(not (predicate expected value))
The logic FiveAM follows when reasoning is thus: The first expression checks whether value satisfies predicate. In the second one, the predicate is usually some form of equality test. The assertion makes sure the value we got (by calling some function we're testing) matches the expected value according to the predicate. The last two tests are the same things, only negated.
In practice, these declarations look like this:
(is (listp (list 1 2)))   ; is (list 1 2) a list?
(is (= 5 (+ 2 3)))        ; is (+ 2 3) equal to 5?
Simple, right? If we were implementing standard Lisp functions, we could use the above to test whether list generates a list as it should, and whether + sums properly. Or, well, at least we'd ascertain that for the above cases.
And if we wanted to negate:
(is (not (listp (list 1 2))))   ; is (list 1 2) not a list?
(is (not (= 5 (+ 2 3))))        ; is (+ 2 3) not equal to 5?
As you may have noticed, we haven't used the optional reason-args argument. It's used to specify what's printed as the reason for a failed check. Sometimes FiveAM's reasoning just isn't good enough. We will get back to it when we start hacking away.
We know how to write checks, but there's not much we can actually do with just this knowledge. The is syntax is only available in the context of a test definition.
A test, as defined by FiveAM, is simply a collection of checks. Each such collection has a name so that we can easily run it later. Defining one is easy:
(test test-+
  "Test the + function"   ; optional description
  (is (= 0 (+ 0 0)))
  (is (= 4 (+ 2 2)))
  (is (= 1/2 (+ 1/4 1/4))))
We're sticking to the basics for now, but you should know there are some additional keyword parameters you can pass in order to declare dependencies, explicitly specify the parent suite, specify the fixture, change the time of compilation and/or collect profiling information.
A fixture is something that ensures a test is run in a specific context. Sometimes it's necessary to reproduce results consistently. For example, if you had a pathfinding algorithm, you'd probably have to load some sort of a map before you could test it. Apparently, using FiveAM's fixture functionality isn't recommended by the current maintainer. Perhaps it's best to just set up macros for those.
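As a rough illustration of the "just set up macros" idea, a fixture can be an ordinary macro that binds the context around a body. This is only a sketch: *map*, load-test-map and with-test-map are hypothetical names invented here, and the "map" is a stand-in 2x2 array rather than anything a real pathfinder would use.

```lisp
(defvar *map* nil
  "Hypothetical special variable that the code under test reads its map from.")

(defun load-test-map ()
  "Stand-in for real map loading: returns a tiny 2x2 grid of zeros."
  (make-array '(2 2) :initial-element 0))

(defmacro with-test-map (&body body)
  "Run BODY with *MAP* bound to a freshly loaded test map."
  `(let ((*map* (load-test-map)))
     ,@body))

;; Inside a FiveAM test, the checks would simply go in the macro body:
;; (test pathfinding-tests
;;   (with-test-map
;;     (is (= 4 (array-total-size *map*)))))

;; Outside the macro body *MAP* keeps its global value (NIL here).
(with-test-map
  (array-total-size *map*))
```

Because the binding is dynamic, every function called from within the body sees the same map, and nothing leaks out once the body finishes.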
As for profiling information, this functionality doesn't seem to actually be implemented yet. Instead, Metering is a good option if needed.
You'll most likely end up defining a single test for a single function, but nothing stops you from slicing the pie up differently. Maybe a particularly complex function requires a lot of checks that are best divided into categories? Maybe a set of simple, related functions can be covered by a single test for simplicity? Your common sense is the best advisor here.
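To try a test definition end to end, something like the following minimal sketch should do. It assumes Quicklisp is installed (for loading FiveAM), and the fiveam-demo package name is made up for this example. run! is FiveAM's function for running a test and printing a report; run does the same work but just returns the list of results.

```lisp
;; Assumption: Quicklisp is available in the running image.
(ql:quickload :fiveam)

(defpackage #:fiveam-demo          ; hypothetical package name
  (:use #:cl #:fiveam))
(in-package #:fiveam-demo)

(test test-+
  "Test the + function"
  (is (= 0 (+ 0 0)))
  (is (= 4 (+ 2 2))))

;; RUN! runs the named test and prints a human-readable report.
(run! 'test-+)

;; RUN returns the raw result objects, one per check, without the report.
(defparameter *results* (run 'test-+))
```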
Suites are the final piece of the puzzle. They are not obligatory, but very useful. Suites are containers for tests, good if you need more hierarchy - which, honestly, you will. Speaking of hierarchy: suites can parent other suites, so you can have plenty of that.
The way suites are defined and used is roughly analogous to packages.
(def-suite tutorial-suite
  :description "A poor man's suite"
  :in some-parent-suite)

(in-suite tutorial-suite)
The first form defines a test suite called tutorial-suite. The :in keyword is used to set the parent suite. Just as in-package sets the *package* special variable, in-suite sets the *suite* one. Test definitions pick up on it when provided. Thanks to that, any test definitions after (in-suite tutorial-suite) will be included in tutorial-suite. Other suite definitions, however, won't be automagically contained in the suite pointed to by *suite*. For that reason, you always need to explicitly set the :in keyword when defining a child suite.
And that's actually all there is to suites.
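Here is a small sketch of a two-level hierarchy, to show how parent and child suites behave when run. It assumes Quicklisp is available for loading FiveAM; the package, suite and test names are all invented for the example.

```lisp
;; Assumption: Quicklisp is available in the running image.
(ql:quickload :fiveam)

(defpackage #:fiveam-suite-demo    ; hypothetical package name
  (:use #:cl #:fiveam))
(in-package #:fiveam-suite-demo)

(def-suite demo-all
  :description "Hypothetical top-level suite.")

;; Child suites name their parent explicitly with :in.
(def-suite demo-math
  :description "Hypothetical child suite for arithmetic."
  :in demo-all)
(in-suite demo-math)

(test addition
  (is (= 4 (+ 2 2))))

(def-suite demo-strings
  :description "Hypothetical child suite for strings."
  :in demo-all)
(in-suite demo-strings)

(test upcasing
  (is (string= "TOM" (string-upcase "tom"))))

;; Running a child suite runs only its own tests...
(run! 'demo-math)
;; ...while running the parent runs everything beneath it.
(run! 'demo-all)
```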
The story so far
Time for a quick summary - from the top, our tests are organized like this:
- (optional) Top-level test suites defined with def-suite
- (optional) Child test suites defined with def-suite and the :in keyword
- Tests defined with the test macro
- Checks (assertions) defined with (is) expressions within a test
A practical example
Now that all that is clear, let's try doing something with it. Imagine you are building an RPG game according to some existing pen-and-paper system. One day, it will surely rival the likes of AAA+ titles out there.
...for now, though, you only have the character generation facility down. Oh well, got to start somewhere. According to the specification of the system you're using, the stats of a character are generated randomly, but prior to the generation, a player can choose two stats they wish to "favor". Unfavored stats are decided on with a roll of two 8-sided dice, while favored ones - a roll of three 8-sided dice. You've defined a little utility function for rolling an arbitrary number of dice with an arbitrary number of sides.
You've written this basic functionality, wrapped it up in a package, defined an ASDF system, checked that everything compiles without warnings... So far, so good. But now you want to go the extra mile to make sure this is going to be a well-built piece of software. You want to integrate tests.
If you'd like to follow all the outlined steps and integrate FiveAM with me, just clone the master branch of the quasirpg repository.
git clone https://github.com/uint/quasirpg.git
Ideally, if you have quicklisp, do that in
Otherwise, clone the repository to either
If you wish, you can also look through the commit history of the test branch to see exactly how I've done all the work detailed in the following sections. It might come in useful if you get stuck.
If you want to see the code in action, try these:
CL-USER> (ql:quickload 'quasirpg)
CL-USER> (in-package #:quasirpg)
QUASIRPG> (roll-dice 3 6) ; throw three 6-sided dice
QUASIRPG> (make-character)
QUASIRPG> (make-character "Bob" '("str" "int"))
In case you don't have quicklisp, you can use this to load the system:
CL-USER> (asdf:load-system 'quasirpg)
Keep in mind that without quicklisp, you will also have to download FiveAM by hand. In the same directory you cloned quasirpg to, try:
git clone https://github.com/sionescu/fiveam.git
Tests shouldn't be a part of your software's main system. Why would they be? People who simply want to download your application and use it don't need them. Neither do they need to pull FiveAM as a dependency. So let's define a new system for tests. We could create a separate .asd file, but I like to have just one .asd file around. In this case, any additional systems defined after the main quasirpg one should be named quasirpg/some-name. So we append this to our .asd file:
(asdf:defsystem #:quasirpg/tests
  :depends-on (:quasirpg :fiveam)
  :components ((:module "tests"
                :serial t
                :components ((:file "package")
                             (:file "main")))))
We also create the new files tests/package.lisp and tests/main.lisp to stay true to the above declaration. As you might guess, we're planning to define a separate package for tests. This isn't as important as separate systems, but it's always good to keep namespaces separate. Nice and tidy.
;;;; tests/package.lisp

(defpackage #:quasirpg-tests
  (:use #:cl #:fiveam)
  ;; test-quasi is exported so it can be called as quasirpg-tests:test-quasi
  (:export #:run! #:all-tests #:test-quasi))
And finally the star of the show:
;;;; tests/main.lisp

(in-package #:quasirpg-tests)

(def-suite all-tests
  :description "The master suite of all quasiRPG tests.")

(in-suite all-tests)

(defun test-quasi ()
  (run! 'all-tests))

(test dummy-tests
  "Just a placeholder."
  (is (listp (list 1 2)))
  (is (= 5 (+ 2 3))))
Defining a simple, argument-less test runner for the whole system (test-quasi here) isn't strictly necessary, but it's going to spare us some potential headaches with ASDF.
We define a meaningless test just so we can check whether the whole setup works. If you've done everything correctly, you should be able to load the test system in your REPL
CL-USER> (ql:quickload 'quasirpg/tests)
and run the test runner
CL-USER> (quasirpg-tests:test-quasi)

Running test suite ALL-TESTS
 Running test DUMMY-TESTS ..
 Did 2 checks.
    Pass: 2 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)

T
NIL
So far, so good!
Integrating the tests with ASDF is a good idea. That way we get hooked up to the standard, abstracted way of triggering system tests. First, we add this somewhere to our quasirpg/tests system definition:
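;; The suite symbol is looked up at run time, because the quasirpg-tests
;; package does not exist yet when the .asd file is read.
:perform (test-op (o s)
           (uiop:symbol-call :fiveam :run!
                             (uiop:find-symbol* :all-tests :quasirpg-tests)))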
From now on, we can run
CL-USER> (asdf:test-system 'quasirpg/tests)
Next, we tell ASDF that when someone wants to test quasirpg, they really want to run the quasirpg/tests test-op. Somewhere in the quasirpg system definition:
:in-order-to ((test-op (test-op "quasirpg/tests")))
Now all we need to do to test our game is:
CL-USER> (asdf:test-system 'quasirpg)
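Put together, the two system definitions might look roughly like the sketch below. The components listed for the main quasirpg system are placeholders made up for illustration (the real ones live in the repository); only the :in-order-to and :perform clauses are the point here.

```lisp
;;;; Sketch of the combined .asd file. The main system's :components
;;;; are hypothetical placeholders, not the repository's actual files.

(asdf:defsystem #:quasirpg
  :components ((:file "package")      ; assumed file name
               (:file "quasirpg"))    ; assumed file name
  ;; Testing quasirpg really means running the test-op of quasirpg/tests.
  :in-order-to ((test-op (test-op "quasirpg/tests"))))

(asdf:defsystem #:quasirpg/tests
  :depends-on (:quasirpg :fiveam)
  :components ((:module "tests"
                :serial t
                :components ((:file "package")
                             (:file "main"))))
  ;; Symbols are resolved at run time, since neither FiveAM nor the test
  ;; package is guaranteed to be loaded when this file is read.
  :perform (test-op (o s)
             (uiop:symbol-call :fiveam :run!
                               (uiop:find-symbol* :all-tests :quasirpg-tests))))
```

Keeping package-qualified symbols out of the .asd file is the design choice that matters: the reader would otherwise fail on a package that only comes into existence once the test system is loaded.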
Adding real tests
Most of the character generation system's math is within the dice-rolling function - it's probably a good idea to tackle that one. The only problem is it's not a very predictable one. We can, however, still do some useful things.
(defun test-a-lot-of-dice ()
  (every #'identity
         (loop for i from 1 to 100
               collecting (let ((result (quasirpg::roll-dice 2 10)))
                            (and (>= result 2)
                                 (<= result 20))))))

;; The description is a plain docstring right after the test name;
;; the test macro does not take a :description keyword in its body.
(test dice-tests
  "Test the `roll-dice` function."
  (is (= 1 (quasirpg::roll-dice 1 1)))
  (is (= 3 (quasirpg::roll-dice 3 1)))
  (is-true (test-a-lot-of-dice)))
The first two checks simply provide arguments for which the function should always spew out the same values - we're throwing one-sided dice. Just... try not to think too hard about it.
test-a-lot-of-dice returns true only if every one of 100 throws of two 10-sided dice is within the expected bounds, that is 2-20. All we have to do is check whether that function returns true. We can just write (is (test-a-lot-of-dice)), but I recommend using is-true instead, since the way it prints failures is more readable in cases like this.
In all honesty, test-a-lot-of-dice could be improved in terms of optimization (for example by making it a macro that wraps the 100 checks in an and) or functionality (the parameters passed to roll-dice could be random). But this version is simple and sufficient for this tutorial.
Now let's see this thing in action.
Running test suite ALL-TESTS
 Running test DICE-TESTS fff
 Did 3 checks.
    Pass: 0 ( 0%)
    Skip: 0 ( 0%)
    Fail: 3 (100%)
And there we go. We've just detected a bug that would never be caught by the compiler. A look at the first fail gives us a hint:
(QUASIRPG::ROLL-DICE 1 1) evaluated to 0 which is not = to 1
A look at the function in question should be enough to see the problem.
(let ((result (loop for i from 1 to n summing (random sides))))
What (random sides) does is generate a number from 0 to (sides - 1). That's not what we want.
(let ((result (loop for i from 1 to n summing (1+ (random sides)))))
And now we re-run the tests:
Running test suite ALL-TESTS
 Running test DICE-TESTS ...
 Did 3 checks.
    Pass: 3 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)
The true power of tests, however, is that if we now ever decide to modify our dice-throwing facility, any bugs we introduce by accident will most likely be caught by the tests already in place. And so we'll avoid nasty, hard-to-debug consequences further down the line. All that without having to test things by hand each time we make changes.
Handling invalid parameters
What happens when someone passes a non-positive integer to roll-dice? Or a fractional one? We should probably control that behavior. And we should probably test to make sure that when the unexpected happens, it's handled as expected.
Let's say our specification tells us that when any of the arguments is fractional, it should just be rounded down. So we append two additional checks to dice-tests:
(is (= 3 (quasirpg::roll-dice 3.8 1)))
(is (= 3 (quasirpg::roll-dice 3 1.9)))
The first one actually passes. It just so happens that loop is responsible for iterating N times over the random number generation for each die, and loop effectively rounds down a fractional upper bound if it's passed one.
The second check requires our attention. It fails. The problem is that random is passed a fractional argument, and it thinks it's meant to give a fractional number in response. A simple fix, rounding sides down before it reaches random, and we're back in business:
(let ((result (loop for i from 1 to n summing (1+ (random (floor sides))))))
Now for something more interesting. Let's say our specification tells us that if any argument is not a positive number, we should get a SIMPLE-TYPE-ERROR. It's time to introduce yet another kind of check.
(signals condition &body body)
Not a lot to explain here. BODY is expected to cause CONDITION to be signaled. Our check only succeeds if it does. We can use this:
(signals simple-type-error (quasirpg::roll-dice 3 -1))
(signals simple-type-error (quasirpg::roll-dice 3 0))
(signals simple-type-error (quasirpg::roll-dice -1 2))
(signals simple-type-error (quasirpg::roll-dice 0 2))
(signals simple-type-error (quasirpg::roll-dice -1 1))
Again, some of the work is already done for us. random signals a SIMPLE-TYPE-ERROR in response to a non-positive arg. What's left to do is to handle the number of throws, so we add the appropriate code to the beginning of roll-dice:
(if (< n 1)
    (error 'simple-type-error
           :expected-type '(integer 1)
           :datum n
           :format-control "~@<Attempted to throw dice ~a times.~:>"
           :format-arguments (list n)))
And voila. Once more, all checks pass.
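For reference, here is a standalone sketch of a dice roller that satisfies all the checks discussed so far. It is not the repository's actual roll-dice: the check-positive helper is a name invented here, and this version validates sides explicitly rather than relying on how random reacts to bad input, which may differ between implementations.

```lisp
(defun check-positive (value what)
  "Hypothetical helper: signal SIMPLE-TYPE-ERROR unless VALUE is at least 1."
  (when (< value 1)
    (error 'simple-type-error
           :expected-type '(real 1)
           :datum value
           :format-control "~a must be at least 1, got ~a."
           :format-arguments (list what value))))

(defun roll-dice (n sides)
  "Roll N dice with SIDES sides each and return the sum.
Fractional arguments are rounded down."
  (check-positive n "Number of throws")
  (check-positive sides "Number of sides")
  (loop repeat (floor n)
        ;; RANDOM yields 0..(sides - 1), so shift the result up by one.
        summing (1+ (random (floor sides)))))

(roll-dice 3 6)    ; some integer between 3 and 18
(roll-dice 3.8 1)  ; always 3: three throws of a one-sided die
```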
Random number generators
So far, we've used specific numbers. We can do better, though. We can run a large number of checks based on random data. This is where the for-all check comes in: it runs its body 100 times, randomizing the specified variables each time.
(for-all bindings &body body)
bindings is a list of forms of this type:
(variable generator)
generator is a function (or function-bound symbol) that returns random data. variable is the variable binding that stores the results from generator. body can contain other kinds of checks.
For example, let's try replacing (is-true (test-a-lot-of-dice)) with something more comprehensive.
(for-all ((n (gen-integer :min 1 :max 10))
          (sides (gen-integer :min 1 :max 10)))
  "Test whether calls with random positive integers give results within expected bounds."
  (let ((min n)
        (max (* n sides))
        (result (quasirpg::roll-dice n sides)))
    (is (<= min result))
    (is (>= max result))))
gen-integer is a function provided by FiveAM, and (gen-integer :min 1 :max 10) returns a random integer generator with the specified bounds. We keep the numbers small here so that the tests don't take forever trying to throw a lot of dice, and so that there's a reasonable chance of edge cases getting tested.
We can also replace the rounding checks. Since FiveAM doesn't provide a suitable generator, we have to write our own. It's not difficult, though, thanks to CL's ease of creating higher-order functions:
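(defun gen-long-float (&key (max (1+ most-positive-long-float))
                            (min (1- most-negative-long-float)))
  ;; The range is coerced to a float so that RANDOM returns a float even
  ;; when integer bounds such as :min 1 :max 100 are passed.
  ;; The default bounds are too far apart to subtract safely, so pass
  ;; explicit :min and :max in practice.
  (lambda ()
    (+ min (random (float (- max min) 1l0)))))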
With that definition in place, we can write the new checks:
(for-all ((valid-float (gen-long-float :min 1 :max 100)))
  "Test whether floats are rounded down."
  (is (= (floor valid-float) (quasirpg::roll-dice valid-float 1)))
  (is (>= (floor valid-float) (quasirpg::roll-dice 1 valid-float))))
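Since a generator is nothing more than a closure, it can be poked at in the REPL before being handed to for-all. The sketch below uses a simplified, hypothetical gen-float-between (not part of FiveAM or the tutorial code) to show the idea.

```lisp
(defun gen-float-between (min max)
  "Hypothetical helper: return a generator of floats in the range [MIN, MAX)."
  (lambda ()
    ;; Coercing the range to a float makes RANDOM return a float.
    (+ min (random (float (- max min))))))

;; A generator is created once...
(defparameter *gen* (gen-float-between 1 100))

;; ...and then called with FUNCALL, yielding a fresh value every time.
(funcall *gen*)
(loop repeat 5 collect (funcall *gen*))
```

Sampling a generator like this is a cheap way to confirm it really produces the kind of data (here, floats rather than integers) that the checks are meant to exercise.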
Finally, we can replace our condition checking too:
(for-all ((invalid-int (gen-integer :max 0))
          (invalid-int2 (gen-integer :max 0))
          (valid-int (gen-integer :min 1)))
  "Test whether non-positive numbers signal SIMPLE-TYPE-ERROR."
  (signals simple-type-error (quasirpg::roll-dice valid-int invalid-int))
  (signals simple-type-error (quasirpg::roll-dice invalid-int valid-int))
  (signals simple-type-error (quasirpg::roll-dice invalid-int invalid-int2)))
If you run these tests, you'll notice only a few checks in the results. That's because FiveAM treats each for-all declaration as a single check, regardless of the contents or the hundreds of tests that actually get run.
When the tests we've written failed, the output we got was mostly descriptive enough. That's not always the case. It's hard to expect the testing framework to know what sort of information is meaningful to us, or what the concept behind the functions we write is.
So let's say when we call make-character, we want the name to be automatically capitalized. We care about proper capitalization and won't allow our players to get sloppy with it. Pshaw.
We add a new test:
(test make-character-tests
  "Test the `make-character` function."
  (let ((name (quasirpg::name (quasirpg::make-character "tom" '("str" "dex")))))
    (is (string= "Tom" name))))
Obviously, it fails.
Failure Details:
--------------------------------
MAKE-CHARACTER-TESTS :
     NAME evaluated to "tom" which is not STRING= to "Tom" ..
--------------------------------
We can understand it, but put yourself in the position of someone who isn't all that familiar with the make-character function. Imagine that person just got the above output while testing the entire game. They're probably really scratching their head trying to piece this together. Let's make life easy for them. Attempt number 2:
(test make-character-tests
  "Test the `make-character` function."
  (let ((name (quasirpg::name (quasirpg::make-character "tom" '("str" "dex")))))
    (is (string= "Tom" name)
        "MAKE-CHARACTER should capitalize the name \"tom\", but we got: ~s"
        name)))
We use the &rest reason-args parameter of the is check. You can use format directives and pass it arguments, just like in a format call. Now the test result is much easier to interpret:
Failure Details:
--------------------------------
MAKE-CHARACTER-TESTS :
     MAKE-CHARACTER should capitalize the name "tom", but we got: "tom".
--------------------------------
Let's imagine what happens when the project grows. For one thing, we'll probably write many more tests, until having all of them in one file looks rather messy.
We'll also probably end up reorganizing the code. roll-dice might eventually become part of a collection of utilities for generating randomized results, while make-character could get moved to chargen.lisp. It would be good if the hierarchy of our tests reflected those changes and let us test only the random utilities or only chargen.lisp if we want to.
So above all of our dice-testing code we tuck this in:
(def-suite random-utils-tests
  :description "Test the random utilities."
  :in all-tests)

(in-suite random-utils-tests)
Now all-tests contains random-utils-tests, which in turn contains dice-tests.
Let's do the same for character generation:
(def-suite character-generation-tests
  :description "Test the character generation."
  :in all-tests)

(in-suite character-generation-tests)

(test make-character-tests
  "Test the `make-character` function."
  (let ((name (quasirpg::name (quasirpg::make-character "tom" '("str" "dex")))))
    (is (string= "Tom" name)
        "MAKE-CHARACTER should capitalize the name \"tom\", but we got: ~s"
        name)))
You can check that running (asdf:test-system 'quasirpg) still runs all of our tests, since it launches the parent suite all-tests. But we can also run either child suite on its own by passing its name to run!.
The next logical step is moving the test suites to separate files. If you wish to see how I've done it, just look at this commit or at the end result in the test branch.
What else is there?
A few different kinds of checks and a way to customize the way test results and statistics are presented.
So far, we've always used run! to run all the tests, which is really a wrapper around (explain! (run 'some-test)). You can, therefore, replace the explain! part with your own.
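As a sketch of what such a replacement could look like, the function below runs a test and prints a single summary line instead of the full report. It assumes Quicklisp is available for loading FiveAM; the package name, the arithmetic test and run-tersely are all made up for this example, while run and results-status come from FiveAM.

```lisp
;; Assumption: Quicklisp is available in the running image.
(ql:quickload :fiveam)

(defpackage #:fiveam-report-demo   ; hypothetical package name
  (:use #:cl #:fiveam))
(in-package #:fiveam-report-demo)

(test arithmetic
  (is (= 4 (+ 2 2)))
  (is (= 6 (* 2 3))))

(defun run-tersely (test-spec)
  "Run TEST-SPEC and print a one-line summary in place of EXPLAIN!'s report.
Returns true if no check failed."
  (let* ((results (run test-spec))        ; one result object per check
         (ok (results-status results)))   ; true when nothing failed
    (format t "~&~a: ~d check~:p, ~:[FAILED~;all passed~]~%"
            test-spec (length results) ok)
    ok))

(run-tersely 'arithmetic)
```

FiveAM still prints its usual progress dots while the checks run; only the detailed report at the end is swapped out.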