ML programming

Midterms were handed out. Material is still game for the final exam, but exam will be emphasizing stuff from here on in. Maybe 75/25 split between new/old content (including lambda calculus as “new”).

ML assignment is out, btw.

ML

Stands for “meta-language”. Originally designed for expressing proofs for an automatic theorem prover.

It’s a functional language like Scheme, so no overwriting variables and functions are first-class objects.

But it has a much more conventional syntax than Scheme, and unlike Scheme, it’s statically typed. But! It supports flexibility in its type system by supporting parametric polymorphism: ex. if you have a function, it doesn’t mean it supports only one type. It may support an infinite number of types.

What else does ML have? It has pattern matching, as we saw in our homework assignment, and it has recursive data types, where you can define a type in terms of itself.

It also has type inference. You don’t always have to declare the types of things: the compiler infers the type of the variables in your program. Ex., type 4, and it’ll infer that it’s an int.

Prof thinks we’ll like programming in this language best out of all the languages we cover.

ML syntax and features

Despite being a compiled language, it still has a REPL. Things are just compiled and type-checked on-the-fly. Semicolons are used in the REPL to run the code you just typed—they’re not part of the language itself. You didn’t need that in Scheme because it could just match the closing parentheses and tell when you were done typing by that.

See also the prof’s notes online for more detail.

(* These are comments *)

Literals are like 4 for integers, 3.4 for real numbers, and "stuff like this" for strings.

Variables are defined like this:

val x = 27

Notice that you don’t have to define x’s type: it’s inferred to be an int.

Your main aggregate types are lists and tuples.

Lists are used similarly to Scheme, as they’re the main aggregate values.

[2, 3, 4, 5] (* This is a list of type "int list" *)

Note that list is not a type: it has to be a list of some other type, like int list or real list. But unlike Scheme, lists have to be homogenous: all values in the list need to have the same type.

[2, 3, 4.4, 5.6] (* This gives an error *)

Why? Because ML is statically typed. So when you take something out of a list, you need to know what type that thing is. And lists only have one type to represent their contents—you can’t have the type varying based on the index of the list.

Tuples, on the other hand, are data structures with fields of different types, where the fields are accessed by position. Sort of like a fixed-length, heterogeneous list.

(3.4, 3, "hello") (* This is a tuple of type
                     "real * int * string" *)

These are sort of like Cartesian products of types.

What’s the restriction here? You can’t compute at runtime which field you want to extract from the tuple. You have to explicitly have that there at compile time.

More on tuples later.

Defining functions is done like this:

fun foo x y = x + (y * 2)

Notice there are no types above, but the compiler is smart enough to realize that this is a function that takes two integers and returns an integer (denoted as int -> int -> int).

(Note: It assumes that it’s a function over ints, not reals, although reals are just as valid an assumption for + and *.)

Another example, of a one-argument function:

fun fac n = if n = 0 then 1 else n * fac (n-1)
(* This is of type "int -> int" *)

The type signature above means “this takes in an int and returns an int”, pronounced as “int to int”.

ML allows you to use patterns to define functions. You write your function as a set of “equations”, so to speak, and if your function matches the left-hand side, it then executes the right-hand side.

For example, you can rewrite the above fac function as:

fun fac 0 = 1 (* Matches only when argument = 0 *)
 |  fac n = n * fac (n-1) (* Matches any argument *)

Patterns can also be used to match against lists and tuples:

fun f [] = 0 (* Matches when argument is empty list *)
 |  f (x::rest) = x + f rest

The pattern var::other_var sets var to the head (or “car”) of the list, and other_var to the tail (or “cdr”) of the list. So in the example above, x becomes the head and rest becomes the tail.

This function calculates the sum of the elements of the list, and is of type int list -> int.

There are operators for head and tail, but we generally won’t need them, since we’ll use pattern-matching most of the time.

You can also use patterns when defining variables:

val (y::ys) = [1,2,3,4,5]

y then becomes 1, and ys gets [2,3,4,5].

In ML, “cons” is spelled ::. For example:

1::[2,3,4] (* Returns [1,2,3,4] *)

Examples: some common list functions

Let’s get the length of a list:

fun length [] = 0
 |  length (x::xs) = 1 + length xs

Note that function application doesn’t usually need parentheses. In our fac function, we needed parentheses because function application has higher precedence than +/-.

What’s the type of this? It’s actually a parametric type: take a list of whatevers, return an int. This is expressed like 'a list -> int.

Whenever you have a “don’t care” type, it’s usually denoted as a quote symbol followed by the letter. This is as a sort of stand-in for greek letters, which are usually used in type-theory papers to denote these kinds of types. You can sort of think of it like “∀α. α list -> int”.

This function is polymorphic: it can be applied to arguments of different types. The way you recognize it is by looking at the type and seeing if it contains a type variable, like our 'a above.

An aside on polymorphism

This is parametric polymorphism, which means that type variables are universally quantified. In English, that means that they’re basically specified with a ∀ in front of them, metaphorically. Meaning that any time you have a type variable, you can put in any type. There’s no way to say, “I want this function to work on ints and reals, and no other types.”

When we get to object-oriented languages, we’ll see a different type of polymorphism, subtype polymorphism, which means that a function can take in a type or any type derived from it. Meaning that a function that takes in class A will also take in B objects if class B derives from A, or even C objects if class C derives from B.

Back to examples

The “append” operator is just written @:

[1, 2, 3] @ [4, 5, 6] (* Returns [1, 2, 3, 4, 5, 6] *)

But let’s write our own:

fun append [] L = L
 |  append (x:xs) L = x :: append xs L

What’s the type of this? It’s 'a list -> 'a list -> 'a list.

What does that mean? Why is there an arrow between the types of the arguments?

Currying

This is because functions are curried in ML (a convention named after Haskell Curry). Like the lambda calculus, functions are only defined with one parameter. What happens when you write fun f x y = x + y, is that you’re defining a function f x that takes in x, that returns an anonymous function that takes in y, that then executes x + y. And the type reflects this:

int -> (int -> int)
 ^       ^      ^
 x       y    x + y

The parentheses are implied, as the arrows are right-associative.

Example: map, higher-order functions

To demonstrate higher-order functions, let’s write map.

fun map f [] = []
 |  map f (c::cs) = f c :: map f cs

Type? It’s complicated: ('a -> 'b) -> 'a list -> 'b list. Broken down:

('a -> 'b) -> 'a list -> 'b list
^--------^       ^          ^
    f         (c::cs)     result

This is because the input type of f needs to match c, and the output type of f needs to match the result.

Explicit types

Note that while you don’t have to write your types explicitly, you still can:

fun f (x:real) y = x + y
(* Type is "real -> real -> real" *)

Lambda expressions

You can also write lambda expressions/anonymous functions using fn:

val g = fn x => x + 1

This is actually identical to writing fun g x = x + 1.

Example: compose

Composing f and g returns a function that applies f, then g, to the input.

fun compose f g = fn x => f (g x)

So this returns a function. What’s the type of compose?

('b -> 'c) -> ('a -> 'b) -> ('a -> 'c)
^--------^    ^--------^    ^--------^
    f              g          result

And remember that the arrows are right-associative, and the names are arbitrary. So this is actually represented by the compiler as:

('a -> 'b) -> ('c -> 'a) -> 'c -> 'b

let expressions

As in Scheme, let is used to define a nested scope.

fun g x = let val y = x + 2 (* Define y to be x+2 *)
              fun h z = y + z (* Define h(z) to be a fcn. *)
          in h 7 (* Call h and return the result *)
          end

Mutually recursive functions

As in Scheme, you have to change the syntax for mutually recursive functions. You do that with the and keyword:

fun f1 0 = 1
 |  f1 n = n * f2 (n-1)
and f2 0 = 1
 |  f2 n = n

If you don’t use and, it fails, because it doesn’t know what f2 is or what its type is.

And that’s it! We’re ending early.

Like these posts? Want more? Subscribe by RSS or email: