Being Shifty with Minecraft (part 1) Haskell stubbing for fun and profit

By: on February 18, 2012

Lost? Go back to the beginning.

A lightning quick recap haiku.

In the midst of sky
Unfamiliar blocks shroud the sun
The world is saved.

The last part sounds ominous and heroic. I like it.

Talking about Minecraft Regions

Go into the game, generate a world with a random seed, walk around for a bit, and then quit. Here’s what you might see gets produced in your Minecraft saved games directory (I called my world ‘testworld’).

Minecraft saved game directory structure

A Minecraft saved game is a complete snapshot of the world at a particular time.

level.dat stores information about the weather, time of day, positions and velocities of currently moving pigs, sheep, wolves, the player, arrows, snowballs etc.

Files in the region directory store static block information – they’re the ones in the form “r.x.x.mcr”. These files define the locations and the types of blocks that comprise the entire landscape, and therefore are the only files we’re concerned about. They each represent a square subdivision of the world of a fixed size.

(NB: The number of actual region files may vary from one world to the next, because not all regions exist when the world is created. Regions are in fact procedurally generated lazily as the player explores more of the world. This is a key feature that makes each and every Minecraft world have unique terrains and land features.)

Regions as data we know nothing about

We can write good tests up front when knowing surprisingly little about the data. If you can’t wait until the next post to find out about the binary format innards of Region files, check out the Minecraft Wiki page.

By treating these ‘.mcr’ files as black boxes that represent Regions, we can take a top-down approach to coding, refining details as they appear. Enter the Region algebraic data type.

First line of code in the bag. Similar to declaring an empty class in OO.

-- Region.hs
data Region = Region deriving (Show, Eq)

The Show and Eq instances will come in handy when debugging; Show lets you convert Region to a string, and Eq lets you compare two Regions using (==).

Note that this is isomorphic to the unit type

data () = ()

a type inhabited by only one value. There’s absolutely no useful information carried by this type other than its presence.

The idea is that the Region type won’t stay like this for long; as we find out more about Minecraft’s Region files, we will enrich its type with more data.

Regions as data that is serialisable meet Data.Binary

Reading Region files can be treated as deserialisation from a file’s contents into a value of our Region type. I looked at a number of (de)serialisation options to express this and didn’t take very long to decide on Data.Binary (or simply known by its package name, binary) mainly for its simplicity.

Install the necessary packages…

$ cabal install bytestring binary

Binary uses what’s called lazy bytestrings. Lazy ByteString is Haskell’s efficient data type dealing with binary data, thanks to their lazily loaded chunked structure. This translates to something like buffered reads and writes.

The main contribution of Data.Binary is the Binary type class, which is defined like so:

class Binary t where
    -- | Encode a value in the Put monad.
    put :: t -> Put
    -- | Decode a value in the Get monad
    get :: Get t

encode :: Binary a => a -> ByteString
decode :: Binary a => ByteString -> a

encodeFile :: Binary a => FilePath -> a -> IO ()
decodeFile :: Binary a => FilePath -> IO a

i.e. for your type to be regarded as having a Binary representation, the ‘put’ and ‘get’ (de)serialisation methods must be provided.

encode(File) and decode(File) functions are provided at the programmer’s convenience for writing or reading Region data to and from either bytestrings or files (lazily). They rely on the implementations of ‘put’ and ‘get’.

To make Region serialisable, we must provide a Binary instance for Region. You can provide such an implementation in one of two ways:

  • use Data.Derive.Binary to automatically derive it for you. This generates sensible ‘put’ and ‘get’ methods at compile-time with one line:
    $(derive makeBinary ''Region)

    This approach makes use of Template Haskell, which is a form of meta-programming based on compile-time reflection (it requires no runtime type information)

  • implement the methods manually

Sadly, we can’t capitalise on the former one-line wonder because our Region wouldn’t have been serialised with Data.Derive.Binary. Being a custom file format, work needed to emulate Minecraft’s own saved game serialiser will be deferred for two posts down the line. ‘undefined’ shall have to suffice for now!

import Data.Binary

instance Binary Region where
  get = undefined
  put = undefined

‘put’ and ‘get’ are actually monadic computations (yes, monadic; don’t run away yet!) for the purpose of restricting the types of operations permitted within the serialisation and deserialisation procedures. This is a design choice of Data.Binary.

Making Sure we Don’t Pull Rabbits out of Hats

Did someone say rabbits? Ooooh massive sword.

“Did someone say ‘Rabbit RabbitMQ logo‘? And why’s there a sword The sword that got magnified... much. there?!”

Imagine for a moment that bytestrings were top hats  some sort of container for a piece of data.

Putting magical thinking aside, if you ‘put’ a Region into a top hat, the only thing that you can possibly hope to ‘get’ back out of the top hat is that same Region, right?

If we manage to pull anything else but the same Region out of the top hat, the chances are that it’s probably not even a rabbit; it’s going to be an ugly, nasty bug. Euck.

Point being, ‘put’ and ‘get’ should clearly be be inverses. For any Region r,

(runGet get . runPut . put) r == r

where runPut runs a Put computation that takes a Region and produces a bytestring, while runGet runs a Get computation on a bytestring and produces a Region. This can be written more succinctly as

(decode.encode) r == r

or, ideally, as a point-free property

(decode.encode) == id

How do we make sure of this? Do fuzz testing with QuickCheck! (Frank’s alluded to this nice tool in his sort of recent post).

Setting up and running Quickcheck tests with test-framework

Blogs and articles mentioning “testing” and “Haskell” in the same breath are few and far between. Some wonder whether test-driven development (TDD) is applicable to functional programming at all.

What might be surprising is that while Haskell has a good number of testing libraries covering a range of uses:

  • HUnit tests possibly side-effectual functions, just as all the the [A-Z]Unit libraries out there
  • Quickcheck(2) for testing properties about pure functions by randomised inputs. Can also be used to test monadic code
  • smallcheck and lazy-smallcheck for the use as Quickcheck, except that you get determinism between test runs as generated test structures start off small, then inductively become more complex

… yet there hasn’t been a decent unified test runner. Programers have so far resorted to writing, compiling and invoking their own ‘test programs’ for test coverage.

… until Max Bolingbroke’s recent excellent GSOC work on test-framework, that is! This test library integrates Quickcheck and HUnit tests with cabal so that you can write something like

$ cabal test

to run all your tests.

Finally, simple command line test runner. Yes please. Get it installed like so:

$ cabal install test-framework test-framework-quickcheck2 test-framework-hunit

The latter two are known as test providers, one for each of Quickcheck2 and HUnit.

While test-framework does seem to make cabal a bit more comprehensive like Maven or NAnt, lots of work still needs to be done to improve the state of Haskell tooling.

We create a TestMain.hs file and start with some imports

-- TestMain.hs
import Test.Framework
import Test.Framework.Providers.QuickCheck2
import Test.QuickCheck

A simple entry point of the test runner is provided by test-framework, and is called defaultMain.

main :: IO ()
main = defaultMain testSuite

Then, we define the test-suite, test-groups, and finally tests that belong in those groups, to construct a hierarchically structured test run. We call our test-group “Properties” just for clarity.

 - Check that encoding and decoding of Region into and out of a bytestring yields identity for all generated Regions.

testSuite :: [Test]
testSuite = [testProperties]

testProperties :: Test
testProperties =
  testGroup "Properties" [
        testProperty "Region (decode.encode == id)" propDecEncRegion
Expressing decode and encode being inverses as a property

The property propDecEncRegion can be defined as

propDecEncRegion :: Region -> Bool
propDecEncRegion r = (decode.encode) r == r

This is simply a Bool-valued function. We cannot write this using the point-free style easily, because the ‘point’ (r) is applied on both sides of the equivalence, though I would say it’s clear enough.

Generating ‘random’ values of Region for Quickcheck

Quickcheck2 is a fuzzer, right? That means that for our custom Region type we have to provide it a generator to pull values out of. Again, if you’re lazy, this can be derived using Data.Derive.Arbitrary, as long as you also derive Data and Typeable instances. For us, it’s so simple the compile-time cost isn’t even justified:

instance Arbitrary Region where
  arbitrary = return Region

There can only be one value (Region) so return that value every time. We can (and will!) elaborate on this later.

Building and testing with cabal

Okay, the test is ready for a run. Any Haskeller would use cabal for build management, so simply set up a cabal project by using

$ cabal init

then fill in the appropriate fields when prompted. Update the build dependencies, and you will end up with a *.cabal file looking a bit like this (minus generated comments for your viewing pleasure).

Name:                lshift-minecraft
Version:             0.1
Synopsis:            Import PNG image pixel data into Minecraft saved files.
License:             MIT
License-file:        LICENSE
Author:              Mr. Whoever
Maintainer:          abc@def.ghi
Category:            Game
Build-type:          Simple
Cabal-version:       >=1.9.2
Test-Suite test-lshift-minecraft
     Type:           exitcode-stdio-1.0
     Main-is:        TestMain.hs
     Build-Depends:  base, binary,
                     QuickCheck, test-framework, test-framework-quickcheck2

     Build-depends:  base, binary

Note that the Build-Depends section declares the test libraries required. Don’t worry if you forget, because cabal will complain about hidden packages when you try to configure the project.

Since testing is relatively a new feature, cabal configure still requires an explicit flag –enable-tests. –show-details=’always’ will ensure the full test reporting output is printed, as below.

$ cabal configure --enable-tests
$ cabal build
$ cabal test --show-details='always'
Running 1 test suites...
Test suite test-lshift-minecraft: RUNNING...
>>> Test suite test-lshift-minecraft: RUNNING...
>>> Properties:
>>> Region
  Region (decode.encode == id): [Failed]
>>> Falsifiable with seed -8222742651015324994, after 1 tests. Reason: Exception: 'Prelude.undefined'
>>>          Properties  Total
>>>  Passed  0           0
>>>  Failed  1           1
>>>  Total   1           1
>>> Test suite test-lshift-minecraft: FAIL
>>> Test suite logged to: dist/test/lshift-minecraft-0.1-test-lshift-minecraft.log

Test failure; as expected!

Fixing this test is a big job. We start doing this in the next instalment.


Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>