Using Scala path-dependent types for awesome

Published on Sunday, January 13, 2013 and tagged with programming and scala.

Scala’s path-dependent types allow objects to carry types along with them, and these types can be used in type-checking. This is useful in all sorts of cases; in particular, it lets you encode relationships where you get objects from another object (e.g. database cursors from a connection), and those objects can only be used with the particular object they came from. You can’t use a cursor with a different connection, even if that connection is an instance of the same class. Path-dependent types allow this requirement to be statically type-checked.
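To make the cursor example concrete, here is a minimal sketch. The `Connection` and `Cursor` classes and their methods are hypothetical, made up for illustration, and do not correspond to any real database API.

```scala
// Hypothetical classes for illustration; not a real database API.
class Connection(val name: String) {
  // Every Connection instance has its own Cursor type, written conn.Cursor.
  class Cursor(val query: String)

  // Returns a cursor whose type is tied to this particular connection.
  def open(query: String): Cursor = new Cursor(query)

  // Accepts only cursors that were created by this connection.
  def fetch(c: Cursor): String = s"$name: ${c.query}"
}

object CursorDemo {
  def main(args: Array[String]): Unit = {
    val a = new Connection("a")
    val b = new Connection("b")
    val cur = a.open("select 1") // cur has type a.Cursor
    println(a.fetch(cur))        // compiles: the cursor came from a
    // b.fetch(cur)              // rejected: found a.Cursor, required b.Cursor
  }
}
```

Uncommenting the last line should turn the mistake into a compile-time type mismatch rather than a runtime failure.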

I use these in some of my current research code. I have a spider that makes web requests and saves the results. This spider needs to be able to run against multiple backends, and each backend has a different set of requests and data types. They all have Nodes, but the requests for nodes differ from backend to backend. Abstracting the spider into its own component allows me to quickly write new backends just by specifying the requests with their post-processing/storage capabilities.

The type of a request with its storage (called an InfoNeed) looks something like this:
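What follows is only an approximate sketch of such a trait. `InfoNeed`, `DataType`, `Result`, and the `save` method are named in this post; the `Node`, `Request`, and `Result` definitions, the `neighbors` method, and all signatures are assumptions chosen for illustration.

```scala
// Hypothetical supporting types; real ones would depend on the backend.
final case class Node(id: String)
final case class Request[T](url: String, parse: String => T)
final case class Result[T](data: T)

trait InfoNeed {
  // Abstract type member: each need decides what kind of data it carries.
  type DataType

  // The request that fetches this need's data for a node.
  def request(node: Node): Request[DataType]

  // New nodes discovered in the fetched data (assumed name and signature).
  def neighbors(data: DataType): Seq[Node]

  // Persist the fetched data in the storage backend.
  def save(node: Node, data: DataType): Unit
}
```

Using an abstract type member rather than a type parameter means code that handles a need does not have to name the data type at all; it can refer to it as `need.DataType`.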

Different backends can have wildly different request structures, so long as they can specify needs that have a request to fetch the data, a way to extract new neighbors, and a way to save the data appropriately in the data storage backend.

Now, a piece of the spider code (vastly simplified) looks about like this:
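A self-contained sketch of that piece might look as follows. It is an approximation under stated assumptions: the supporting types are repeated so the block compiles on its own, and `Spider`, `run`, `neighbors`, and the toy `LengthNeed` backend are hypothetical names, with `run` standing in for a real web fetch.

```scala
import scala.collection.mutable

// Hypothetical supporting types, repeated so this block stands alone.
final case class Node(id: String)
final case class Request[T](url: String, parse: String => T)
final case class Result[T](data: T)

trait InfoNeed {
  type DataType
  def request(node: Node): Request[DataType]
  def neighbors(data: DataType): Seq[Node]
  def save(node: Node, data: DataType): Unit
}

object Spider {
  // Stand-in for a real HTTP fetch: feeds the URL itself to the parser.
  def run[T](req: Request[T]): Result[T] = Result(req.parse(req.url))

  // The type of `res` depends on the value of the parameter `need`.
  def process(need: InfoNeed)(node: Node): Seq[Node] = {
    val res: Result[need.DataType] = run(need.request(node))
    need.save(node, res.data) // accepted: res.data has type need.DataType
    need.neighbors(res.data)
  }
}

// A toy need whose data is the length of the request URL.
object LengthNeed extends InfoNeed {
  type DataType = Int
  val saved = mutable.Map.empty[Node, Int]
  def request(node: Node): Request[Int] = Request("/nodes/" + node.id, _.length)
  def neighbors(data: Int): Seq[Node] = Seq(Node("n" + data))
  def save(node: Node, data: Int): Unit = saved.update(node, data)
}
```

Calling `Spider.process(LengthNeed)(Node("abc"))` stores the parsed value under that node and returns the neighbors derived from it, while `process` itself never learns that the data is an `Int`.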

The variable res is of type Result[need.DataType]; its type is dependent on a type carried in the value of the parameter need. As a result, Scala statically type-checks that the need’s save method receives the same type of data as it retrieves with its request, without the surrounding spider code having any knowledge of what type that data might have (because the DataType type is fully abstract).

I can type-check safe flow of arbitrary data through the process function, knowing nothing about the data but only that the type is the same and carried through properly. That is very, very cool.

OCaml allows a similar thing with its first-class modules, but they tend to carry around some more syntactic baggage in my experience.