Wednesday, August 12, 2009

Restricted-element containers and maps

Given the perspective we have now, I think it's a little hard to imagine how ground-breaking the concept of type classes were when they were first introduced in Haskell. I think it says a lot that, after the Haskell Committee decided upon their inclusion, they made type classes an integral part of the language. Lest you think that's an exaggeration, can you imagine what Haskell would like like today without, e.g. Num, Functor, or Monad?

As a consequence of using a new idea, some areas that Haskell (and Haskellers!) currently struggle with are related to type classes. One area in particular are containers and type classes. A "Holy Grail" of sorts for Haskellers is the creation of a container type class that provides all the useful container functions. Ideally, all useful container types would be able to be instances of this class.

One package that provides this sort of functionality is John Goerzon's excellent ListLike package, http://hackage.haskell.org/package/ListLike.

ListLike provides (in addition to many others), the functions "map" and "rigidMap". Why two? The difference is in the types:

> class ListLike full item | full -> item where
> map :: ListLike full' item' => (item -> item') -> full -> full'
> rigidMap :: (item -> item) -> full -> full

The standard map (and Functor's fmap) allow mapping from a container of any element type to a container of any other element type. rigidMap requires that both the input and output elements and containers have the same type. This is very useful for monomorphic containers. If you provide an instance for ByteString Word8, you can't use the native 'map' because it only maps over elements of the same type since Word8 is the only valid element type. ListLike's default 'map' function builds a new container by destructing and reconstructing the old one. This is highly inefficient compared to native maps, but is the only way to write the function that satisfies the type signature. Even though "ListLike ByteString Word8" is the only instance that can exist, there's no way to prove it to the compiler. 'rigidMap', by further restricting the types, can use the native map, and is therefore much more efficient when it can be used.

Unfortunately there is another case, one that I expect will become more common as certain high-performance libraries become more widespread. Many interesting containers are polymorphic but require class contexts on their elements. Examples are the UVector and StorableVector packages, which require elements to be instances of UA and Storable respectively.

Here's the problem. Using StorableVector as an example, these packages often provide the following map:

> map :: (Storable item, Storable item') => (item -> item') -> Vector item -> Vector item'

How can we use this in ListLike? It could certainly be used directly for 'rigidMap', but what about 'map'? Unfortunately that isn't possible because of the class restriction on the output element type. The type variable for that element isn't mentioned in the class definition (i.e. it doesn't appear in the instance head). Since it's not in the instance head, it isn't possible to put a class context on it. If you fall back to the default 'map' instance any possible efficiency gains are lost.

One solution is to include the output item type in the class definition, but that would require users to put an extra type variable in every time the class context is used, which is a waste for something that is only used for one function.

I think the correct approach is to create a new type class, call it LooseMap, as follows:


> class (ListLike full item, ListLike full' item') => LooseMap full item full' item' where
> looseMap :: (item -> item') -> full -> full'
>
> --StorableVector instance
> instance (Storable el, Storable el') => LooseMap (Vector el) el (Vector el') el' where
> looseMap = Data.StorableVector.map

Of course completely polymorphic containers (e.g. List) could easily be made instances of this class as well. I would think that "map" should be completely supplanted by "looseMap", although rigidMap would still be necessary for monomorphic types.

As a side note, there is another possible solution. I experimented with using type families to encode Functors (and thus a polymorphic map) as RFunctors, using the same approach as in the RMonad package. This worked fine for map, but I couldn't figure out how to write a fold that satisfied the typechecker. I think it's impossible, and would love to hear from anyone who knows how to make it work.