Haskell’s QuickCheck is a very neat tool for automated testing. One specifies *properties* that one would like a program to satisfy, and *generators* for test data, usually involving some form of randomisation. QuickCheck then uses the generators to produce test cases and check the properties against them.
The original QuickCheck was designed to test *purely functional* code only. However, the project I am working on contains a fair amount of imperative code, most of it performing operations on a database. Is it possible to employ QuickCheck for testing this code?
The Testing Monadic Code With QuickCheck paper explorers several approaches for QuickChecking programs that aren’t purely functional. The one that fits my application best proceeds by defining the the following:
1. A data type representing the actions that can be performed by the program:
| Intervals User Interval
| NewInterval User Interval Task
| ClearInterval User Interval
| ModifyTask User Interval Task Task
deriving (Eq, Show)
So in this example there are five possible actions, parameterised by instances of the `User`, `Task`, and `Interval` data types which are part of the API I am testing:
2. A notion of *observation*, i.e. what kinds of things can be observed by running programs:
Most of the API calls return `Nothing`, and one, `Intervals` returns a
list of `(Task, [Interval])` instances.
3. A `perform` function that takes a sequence of actions, performs it, and returns all the observations made along the way:
performC db  = return 
performC db (act : acts) =
liftM2 (:) ((transaction db . execC db) act) (performC db acts)
execC :: Database -> Action -> IO ObservationC
execC db act =
case act of
Reset -> reset db >> return Nothing
Intervals u i -> intervals db u i >>= (return . Just)
NewInterval u i t -> newInterval db u i t >> return Nothing
ClearInterval u i -> clearInterval db u i >> return Nothing
ModifyTask u i t t' -> modifyTask db u i t t' >> return Nothing
Here `reset`, `intervals` etc are the functions of the API I am
4. An *abstract model* of the desired program behaviour. The details are unimportant here, so I am omitting the associated code.
5. A notion of observation for the abstract model:
6. A `perform` function that takes the same kind of sequence of actions defined above, performs it on the abstract model, and returns all the observations made along the way.
performA m = snd . mapAccumL execA m
execA :: AbstractModel -> Action -> (AbstractModel, ObservationA)
execA m act =
case act of
Reset -> aReset m
Intervals u i -> aIntervals m u i
NewInterval u i t -> aNewInterval m u i t
ClearInterval u i -> aClearInterval m u i
ModifyTask u i t t' -> aModifyTask m u i t t'
Here `aReset`, `aIntervals` etc are the functions in the abstract model which correspond to the functions of the API I am testing.
7. QuickCheck generators to produce sequences of actions. In our case this involves generators for the `User`, `Task` and `Interval` types, and for `Actions` constructed from these. This is all very straightforward, so I am omitting the details.
8. A property that calls the `perform` functions defined above to execute the action sequence in both the real implementation and the abstract model, and then compares the resulting observation traces. The latter requires a function to map concrete observations to abstract observations.
forAll actions $ acts -> performOnC acts == performOnA acts
performOnC actions = map abstractObservation $ unsafePerformIO $
withDb (db -> performC db actions)
performOnA = performA M.empty
abstractObservation :: ObservationC -> ObservationA
abstractObservation Nothing = Nothing
abstractObservation (Just tis) =
Just $ map ( (t, is) -> (t, map intervalTicks is)) tis
Notice that I had to resort to `unsafePerformIO` since QuickCheck currently does not provide an easy mechanism to test programs in the IO monad. According Koen Claessen, one of the authors of QuickCheck, this is due to change though.
With the above in place we can get QuickCheck to test for observational equivalence of our implementation and the abstract model. Tests can fail because the implementation is incorrect, the abstract model is incorrect, or both. So the testing strategy is to run the tests, fix the bugs, rinse and repeat. In my application I managed to track down half a dozen corner case bugs this way. The test code is less than two hundred lines of well-factored Haskell code that is easy to understand and keep up-to-date as the implementation