Using Scala path-dependent types for awesome

Scala’s path-dependent types allow objects to carry types along with them, and the compiler uses these types in type-checking. This is useful in all sorts of cases; in particular, it lets you encode relationships where you get objects from another object (e.g. database cursors from a connection), and those objects can only be used with the particular object they came from. You can’t use a cursor with a different connection, even if that connection is an instance of the same class. Path-dependent types allow this requirement to be statically type-checked.
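As a minimal sketch of the connection/cursor case (the Connection and Cursor classes here are hypothetical, made up for illustration rather than taken from any real database library), an inner class gives every connection instance its own cursor type:

```scala
// Hypothetical classes, named after the connection/cursor example above.
class Connection(val name: String) {
  // Each Connection instance gets its own, distinct Cursor type.
  class Cursor(val query: String)

  def open(query: String): Cursor = new Cursor(query)

  // Accepts only cursors created by *this* connection (type this.Cursor).
  def run(cursor: Cursor): String = s"$name: ${cursor.query}"
}

object CursorDemo {
  val a = new Connection("a")
  val b = new Connection("b")

  // The type is written out to show the path: it is a.Cursor, not just Cursor.
  val cursor: a.Cursor = a.open("select 1")

  val ok: String = a.run(cursor) // compiles: cursor is an a.Cursor
  // b.run(cursor)               // rejected at compile time: b.run expects a b.Cursor
}
```

Both connections are instances of the same class, yet a.Cursor and b.Cursor are different types, so the mix-up is caught by the compiler rather than at run time.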

I use these in some of my current research code. I have a spider that makes web requests and saves the results. This spider needs to be able to run against multiple backends, and each backend has a different set of requests and data types. They all have Nodes, but the requests for nodes differ from backend to backend. Abstracting the spider into its own component allows me to quickly write new backends just by specifying the requests with their post-processing/storage capabilities.

The type of a request with its storage (called an InfoNeed) looks something like this:

trait InfoNeed {
  /** The type of data this need fetches; left abstract for each backend to fill in. */
  type DataType
  /** Get the web request to fetch data */
  def request: WebRequest[DataType]
  /** Get the neighbors from some fetched data */
  def neighbors(data: DataType): Traversable[Node]
  /** Save the data to the data store. */
  def save(store: DataStore, data: DataType): Unit
}

Different backends can have wildly different request structures, so long as they can specify needs that have a request to fetch the data, a way to extract new neighbors, and a way to save the data appropriately in the data storage backend.
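To make that concrete, here is a sketch of what one backend's need might look like. Everything specific in it is an assumption: the post does not show the real WebRequest, Node, or DataStore, so they are replaced with small stand-ins, and FriendList, FriendsNeed, and the URL are invented. The sketch also uses Iterable in place of Traversable, which is deprecated as of Scala 2.13.

```scala
// Stand-ins for the post's types; their real definitions are not shown,
// so these shapes are assumptions made for the sake of a runnable example.
final case class Node(id: String)
final case class WebRequest[T](url: String, parse: String => T)
final class DataStore {
  private val rows = scala.collection.mutable.ListBuffer.empty[String]
  def put(row: String): Unit = { rows += row; () }
  def all: List[String] = rows.toList
}

trait InfoNeed {
  type DataType
  def request: WebRequest[DataType]
  def neighbors(data: DataType): Iterable[Node]
  def save(store: DataStore, data: DataType): Unit
}

// Hypothetical backend data: a user's friend list.
final case class FriendList(user: String, friends: List[String])

// Hypothetical need: fixes DataType to FriendList and supplies the three operations.
final class FriendsNeed(user: String) extends InfoNeed {
  type DataType = FriendList

  def request: WebRequest[FriendList] =
    WebRequest(
      s"https://example.invalid/friends/$user",
      body => FriendList(user, body.split(",").toList.filter(_.nonEmpty))
    )

  // Every friend is a node the spider could visit next.
  def neighbors(data: FriendList): Iterable[Node] = data.friends.map(Node(_))

  def save(store: DataStore, data: FriendList): Unit =
    store.put(s"${data.user} -> ${data.friends.mkString(" ")}")
}
```

A second backend would define its own data type and its own needs in the same way, without touching the spider.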

Now, a piece of the spider code (vastly simplified) looks about like this:

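/** Run a web request and return its (possibly failed) result. */
def fetch[T](req: WebRequest[T]): Result[T]

def process(need: InfoNeed): Unit = {
  val req = need.request   // WebRequest[need.DataType]
  val res = fetch(req)     // Result[need.DataType]
  // now we save the data
  res match {
    case Good(data) => need.save(store, data)
    /* error cases */
  }
}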

The variable res is of type Result[need.DataType]; its type depends on a type carried in the value of the parameter need. As a result, Scala statically type-checks that the need’s save method receives the same type of data that its request retrieves, without the surrounding spider code having any knowledge of what type that data might be (because the DataType type is fully abstract).
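The following self-contained sketch shows that flow compiling end to end. It rests on several assumptions: WebRequest, Result, and DataStore are simplified stand-ins (a request is just a function to call), neighbors is left out, store is passed as a parameter instead of coming from the enclosing scope, and CountNeed and TitleNeed are invented needs with different data types.

```scala
// Simplified stand-ins (assumptions): the real WebRequest, Result and
// DataStore are not shown in the post.
final case class WebRequest[T](run: () => T)
sealed trait Result[+T]
final case class Good[T](data: T) extends Result[T]
final case class Bad(message: String) extends Result[Nothing]
final class DataStore { var rows: List[String] = Nil }

trait InfoNeed {
  type DataType
  def request: WebRequest[DataType]
  def save(store: DataStore, data: DataType): Unit
}

object Spider {
  def fetch[T](req: WebRequest[T]): Result[T] =
    try Good(req.run()) catch { case e: Exception => Bad(e.toString) }

  // `store` is an explicit parameter here, which is an assumption of this sketch.
  def process(need: InfoNeed, store: DataStore): Unit = {
    // The annotation is optional; it spells out the type the compiler infers.
    val res: Result[need.DataType] = fetch(need.request)
    res match {
      case Good(data) => need.save(store, data) // data has type need.DataType
      case Bad(_)     => ()                     // error handling elided
    }
  }
}

// Two hypothetical needs whose data types have nothing in common.
object CountNeed extends InfoNeed {
  type DataType = Int
  def request: WebRequest[Int] = WebRequest(() => 42)
  def save(store: DataStore, data: Int): Unit = {
    store.rows = store.rows :+ s"count=$data"
  }
}

object TitleNeed extends InfoNeed {
  type DataType = String
  def request: WebRequest[String] = WebRequest(() => "home")
  def save(store: DataStore, data: String): Unit = {
    store.rows = store.rows :+ s"title=$data"
  }
}
```

Calling Spider.process(CountNeed, store) and then Spider.process(TitleNeed, store) runs both needs through the same function body, which never mentions Int or String. Handing the data fetched for one need to the other need's save would be a type error, since CountNeed.DataType and TitleNeed.DataType are different types.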

I can type-check safe flow of arbitrary data through the process function, knowing nothing about the data but only that the type is the same and carried through properly. That is very, very cool.

OCaml allows a similar thing with its first-class modules, but they tend to carry around some more syntactic baggage in my experience.