Kotlin: Refactoring to a DSL - Part III
If you missed them, you probably want to read part 1 and part 2 first
Welcome back to this rollercoaster ride of things that we probably shouldn’t do with Kotlin, but we will, because we can! Plus, we might learn something along the way. Last time out, we got to a point where we’d pretty much taken the DSL part of the project as far as we reasonably can, at least in the scope of the MP3 example.
But there’s a couple of other things that we can explore. Firstly, our binaryFile
method expects a code block to be passed to it, so our specification is always going to be inline in whatever code is parsing the file. Wouldn’t it be nice to make these things reusable? What if we could provide a library of different file specifications? Remember way back in part 1, I mentioned that instead of passing the code block, we could pass a function reference. Let’s try that. We move the code block into its own method. Remember that we still have to agree with the type signature, so we have to declare it as an extension function on the BinaryFile
class
Notice that we now pass a function reference BinaryFile::asId3
to the binaryFile
method. Hopefully this will help solidify the notion that a “function with receiver” is really just an anonymous extension function. Here we’ve taken it from being anonymous (passed as a code block) to being named (the asId3()
method). We could now go and write asZipFile()
or asBmp()
methods if we wanted, to parse a file as a .zip or .bmp, and then we could package them up and allow others to use them in their code.
But if your developer spidey sense is tingling yet again, you’ll be thinking “isn’t this what class hierarchies are for?”. And you would, of course, probably be right. These extension functions don’t give us a great deal in the way of encapsulation or type safety. Conceptually, MP3s and zips and bitmaps are subtypes of BinaryFile
. We’ll create a class for MP3s, a subclass of BinaryFile
, and move the asId3
method there.
At this point, you’ve probably got a big red blob showing a compilation error on MP3File::asId3
. This is because the binaryFile
method expects to be passed a reference to a method in which a BinaryFile
is the receiver, not an MP3File
. That’s fairly simple to fix though, we just need to specify that it can be some other type, as long as that type is a subtype of BinaryFile
.
We’ve added a generic type to the method - <T: BinaryFile>
tells the compiler that we’ll use a type T
somewhere in the method, and the compiler should expect it to be a BinaryFile or a subtype of it. And indeed we do use it, we say that the callback method will have a receiver of type T
. So now it happily accepts that MP3File::asId3
is a valid argument to the binaryFile
method. Only now the invocation of the callback
method doesn’t compile, because we’re calling it on a concrete BinaryFile
, not an object of type T
(which might be a subtype). Instead of creating an instance of the BinaryFile
supertype, we need to create an instance of T
. Reflection to the rescue.
Using reflection, we can create an instance through finding its constructor. Note that, in the interests of simplicity, we make a big assumption here, which is that the only (or at least first) constructor on a subtype takes fileName
as a parameter. But the compiler is still trying to stop us from doing something silly. The problem here is that the default behaviour of generics in the JVM is for the type to be erased at runtime, so when the program executes, it has no idea what T
actually was when the code was compiled. Luckily, Kotlin helps you out here. By marking our generic type with the reified
keyword, we can ask the compiler to keep hold of that type information.
Nearly there, but that will also give you a compiler error. The way the compiler keeps the type information is essentially by “cutting and pasting” the code from the method everywhere that it is called, and replacing T
with the type you specify in the call. So the final piece of this puzzle is to add the inline
annotation to the method to allow the compiler to inline it.
At last, the compiler is satisfied. And if we now have subtypes to represent different types of file, you could have subtypes simply override a parse
method, in which case you just pass the subtype class, representing the type of file you’re parsing, as a generic parameter on the call to the binaryFile
method.
These changes mean that we can easily supply a “library” of file types for end users to use, all based on the DSL.
That also means we can start providing some specialisms, depending on the file type. At the moment our DSL is a generic language for parsing different data types from a file. Whilst we’re dealing with fairly simple data, that’s fine, but sometimes we might encounter some rather more complex data types that are particular to the type of file we’re reading. For example, in a zip file, the lastModifiedDate
is stored in MS-DOS date format, which allows us to store a date in just 2 bytes. Efficient, but really not very easy to parse using the tools we have in our DSL so far. We’ll end up just cluttering our specification again with all sorts of parsing logic.
But we can now extend our DSL with the ability to parse such dates, and we can keep the scope of that to our ZipFile
parser. First, let’s implement the class and parse the first few fields. Find a zip file of your choice to test on.
Implementations of the setorder
, int
and short
instructions were exercises for the reader in part 2. Of course you did them. If you didn’t, have a go now, I’ll wait…
Now we have that thorny date time in MSDOS format. In this format, we’ll read 2 bytes for the time. Bits 0-4 are the seconds, halved. Bits 5-10 are the minute, and the rest of the bits are the hour. In order to parse these, we’ll take those two bytes, apply a bit mask, and then shift the bits to get the final number. We do likewise for the date, in which bits 0-4 are the day, 5-8 are the month, and 9-15 are the number of years since 1980. We introduce a method in the ZipFile class to read such dates, and turn them into a LocalDateTime
.
Now, we just have to use it in the parse
method:
In my case, I get 2016-02-29T09:11:50
, which indeed matches the last modified time when I view the file properties. It works!
One last trick up our sleeve. Given concrete subtypes for our file types, it would be nice to regain some of the type safety and code completion, rather than dealing the loose types and keys we’re getting from the map when using the +
operator. Not to mention, that operator is only valid inside the DSL, whereas once we’ve parsed a file, we may want to reference it and its attributes in the rest of our code. First, let’s make our binaryFile
method return a reference to the object we’ve just created. The method specifies a return type T
, and adds a return statement.
Secondly, we want the ZipFile class to have fields that represent the attributes we’ve read from the file. The naive way to do this would be to have methods that make a call into the map - for example fun lastModified() = valueMap.get("lastModified")
. Thankfully, Kotlin has a terser way of doing this, using delegated properties. The Map
class implements the Delegate
interface - classes implementing this interface implement getValue
and setValue
methods, that get passed the name of the property being delegated, so in the case of a map it can look up a value in the map using the property name as the key. We indicate delegated properties using the by
keyword.
Run that code, and you should see that it all works splendidly. Also note that you have a certain amount of type safety back - if you change the type of lastModified
, it will blow up at runtime.
That (probably) concludes this whistlestop tour of refactoring to a Kotlin DSL. I hope you’ve enjoyed it and can find a way to shoehorn some of these techniques into your own projects, but use them wisely! Some of these things are definitely clever, even fun, but nothing beats code that is simple and concise.