You are here

Haskell and System F-sub

I'm a pretty big fan of functional languages, Haskell in particular. I don't get to write enough of it. I found out about the Lambda Cube recently and it inspired some thoughts, particularly about research I might want to do if I went back to school at some point in the future. What follows is a bit of a brain dump around that.

Now, I'm a big fan of static typing. Well, I'm a big fan of any sort of static code analysis, and static typing is a type of code analysis that is well understood. Any errors my compiler can give me is an error I won't get at run time or a problem I am forced to explicitly handle. However, I'm not a big fan of typing. So, I want a language (like Haskell) that has high-quality type inference. The fewer types I have to give explicitly the better, although I find for anything exported (i.e. a "public" API) should have an explicit type signature, if for not other reason than that it is part of the interface so you want to make sure the type-inference engine comes up with the types you want.

Looking at the lambda cube, we have simply typed lambda calculus, the foundation for pretty much all well-defined functional programming languages. (I can't think of any that don't have some roots there, even if their syntax has diverged a bit.) Unfortunately, simply typed lambda calculus doesn't have many of the features that have shown themselves to be useful both in Haskell and functional programming in general It is sort of nice that the typing for simply typed lambda calculus is decidable, it does mean that the language itself isn't Turing complete, but it does mean that, at least for expressions that are simply typed, we can have not only static type checking for complete type inference.

One of the features simply typed lambda calculus misses is polymorphism (all types). Most functional languages admit parametric polymorphism (a.k.a. generics). Adding parametric polymorphism to simple typed lambda calculus gives us System F. It's not completely decidable and Turing complete. Haskell does a really good job of implementing the decidable parts of system F. It does balk at the creation of infinite types, even when the program itself can be fully analyzed and determined as terminating, which is unfortunate but actually saves a lot of headaches most of the time. But, try showing off some Fixed point combinators in Haskell. You'll find even simple expressions like (SI)I (in SKI Calculus) are not well-typed according to Haskell. It seems like an infinite type (e.g. t = t -> t1) should be allowed when the function is provably non-strict in that first argument.

That's only one type of polymorphism though. (and, to me, it seems like the "easy" one). More on that later.

Back to the lambda cube. Besides polymorphism we also see type operators on an axis. I haven't gotten to play with these too much yet, but I understand that Haskell does implement these in some manner, specifically, it seems like functional dependencies in multi-parameter type classes and some other GHC extensions play in this area. Some of the paper written before (and after) the GHC 7 release probably gave us better mathermatical foundations for type-checking and type-inference in System F-omega. Of course we can't have a complete type checker (since System F can't be completely type-checked and it is "smaller" that System F-omega). Would be somewhat nice to research in this space, but I need a lot of study to push the limits of type-checking decidability in System F-omega.

Now, the last axis of the lambda cube concerns me. It seems really powerful, but it also seems up undermine static type checking at a fundamental level. For me the critical difference between types and values is that values can come from the user at run time, normally, you don't want users to introduce new types. (Actually, if functions aren't first class objects, you may want to allow them to introduce new types, but once you have functions as first-class objects, there's less of a need there.) If your types depend on values, well, you can't check those types. Now, dependent types have certainly gotten some traction in Agda and Coq, but I think I'd want a nice language level division that shouted "Here be dragons!" before failing to type-check something because a type was dependent on a (non-CAF) value. I'm pretty sure that I'll have to venture there while I'm learning enough type theory to make myself academically relevant, but it's not really where I want to make my mark.

Back to polymorphism. We covered parametric polymorphism, but that's not actually the same polymorphism I learned in Programing 102. Instead we talked about C++ style subtype polymorphism. According to my sources (i.e. Wikipedia and a few links of of it), Haskell doesn't have subtype polymorphism. OCaml does, so I started looking at it. Unfortunately, it doesn't seem to have a well-specified type-checking/inference engine. I'm sure it works, but I couldn't find any real details about how it works, while most of the Haskell extensions to Hindley-Milner typing (the basis for ML typing) have been written about and are decently documented. On top of that, OCaml is not referentially transparent. Haskell isolates side-effects, giving any function with side-effects a different type from a function without side-effects. The best example of that is Monadic I/O, but the STM, ST, and State Monads are also examples. I find this makes programs easier to reason about; I better understand what might change. But, I'm very used to OO techniques and it seems like lack of subtype polymorphism is a glaring flaw in Hakell, or at least it did yesterday.

The final type of polymorphism (discussed on Wikipedia) is ad-hoc polymorphism. In a few ways this seems to be similar to type classes in Haskell. However, it is also related to late-binding either through single or multiple dispatch. All-in-all, it doesn't seem rigorously defined.

The similarities of ad-hoc polymorphism to Haskell type classes got me thinking about type classes in regards to subtype polymorphism. Now, as I've mentioned, Haskell implements a type system very similar to System F, with some extensions in GHC pointing toward a largish subset of System F-omega. There is a known formal extension to System F called System F-sub. I read the paper that covered the initial extension from System F to System F-sub as well as one paper on implementing System F-sub, including type checking and inference. What I got out of it was that System F-sub adds bounded quantification of type variables as well as an Uber-type called "Top" and a single member of that class called "top". I think the addition of "Top" (in particular) muddies the semantics a bit. Instead of bounded quantification that looks like A <: top="" a="" is="" subtype="" of="" it="" seems="" that="" one="" could="" simply="" use="" universal="" quantification="" and="" the="" proofs="" algorithms="" discussed="" in="" both="" system="" f-sub="" papers="" would="" be="" largely="" unchanged.="" value="" was="" ignored="" doesn="" seem="" like="" needs="" to="" exist.="" we="" already="" have="" suitable="" member="" even="" if="" no="" other="" types="" exist:="" _="" bottom="" or="" undefined="" from="" what="" i="" can="" tell="" meets="" all="" requirement="" for="" but="" maybe="" theory="" gets="" bit="" cloudy="" when="" you="" also="" classes.="" doubt="" since="" tree="" actually="" includes="" values="" as="" well.="">

Dropping Top and "top" gives us a syntax that looks very much like the context part of a Haskell type signature. Haskell already supports type subclasses and it able to not only check but also infer them. So, on the face of it I would say that Haskell does actually implement System F-sub "out of the box". There goes my research paper. ;P

I'm pretty comfortable asserting that Haskell does support subtype polymorphism and that it is on display in the standard prelude. Num and all of its subtypes are one example, as it is Functor / Monoid / Applicative / Monad ( / Alternative / MonadPlus ) hierarchy. However, this still seems like incomplete support for the OO paradigm.

I've heard Haskell called as post-OO language, and that may be true. However, we don't quite yet live in a post-OO world. Interface specifications are still mostly written in some sort of OO IDL. .Net is fundamentally OO as was COM+, COM, and OLE before it. DBus is fairly OO. Things like the HTML 5 DOM across a wide spectrum from purely data oriented wire protocols to data agnostic APIs are published in IDL formats that are fundamentally OO. For that reason, we need a mechanical transformation of OO concepts into Haskell that doesn't burden the programmer necessarily.

Under a closed-world assumption, GADTs in GHC do a really good job here. However, GADTs can break down when to user of the API / protocol is allowed to create new implementations of the established interfaces. Type classes can work in an open world, each new type has to provide instance declarations or expose data member / function that of a type that is already an instance of the type class; both of these options do increase the burden on the programmer to varying degrees.

I would like to propose that single-inheritance of structure be implemented (er, rather, given syntactic sugar) in Haskell. Single and multiple inheritance of interface is already possible through type classes. In tandem, the stand-alone deriving extension would be used to make it trivially easy for new classes that inherited structure to inherit interface by delegating to the implementation around the shared structure. I'm sure there are similarities to type families here, but I believe that type operators and function dependencies can (and probably) should be avoided in the programmer interface to this type of structural inheritance. Those are advanced topics for advanced uses. Mapping the HTML 5 DOM (or any OMG IDL) or something similar should be a mechanical process and the resulting code should be able to be used to a novice (but not completely new) Haskell programmer can use it "out of the box". Type classes and type sub-classes are fair game (both are supported in Haskell 2010) and I think well-defined single-path structural inheritance would be fine. Dipping into GADTs and Type Families should be avoided not due to fundamental complexity but because the have poor mapping to OO concepts and that they only exist as GHC extensions.

I'm pretty sure it's possible to write a GADT and admit "user" provided implementations of a common interface through a special constructor, and that might also be appropriate for many things; the HTML 5 DOM in particular.

Goodness, this is a lot longer than I intended.