More Scala, and heap allocation

April 27, 2016

Back to Scala

For the assignment, you should put your code into a file and compile it with scalac. But remember that the REPL doesn’t implement the full language.

Remember from a previous class: function subtyping is contravariant in the input type and covariant in the output type. Scala supports this.

class B extends A

def f(g: B=>A) : A = g(new B)

// Thing in the f() call below is a lambda expression
// Syntax is (param : Type) => body
f((x:A)=> new B) // This works
// Function passed in is of type A=>B, so this shows
// both contravariance in the input type and
// covariance in the output type.

Node that Scala prints out your objects (including in the REPL) based on the toString method. Need to override that if you want something sensible printed.

In Scala, classes don’t have constructors. Instead, they have parameters, which are instantiated as member variables. Parameters in base classes must be explicitly passed in from the derived classes, and parameters from base classes are hidden in derived classes.

Examples of both are below:

class A(x:Int) {
    override def toString() = "A(" + x + ")"
}

class B(y:Int) extends A(y*2) {
    override def toString() = "B(" + y + ")"
    // Can't access x here!
}

Pattern matching

Scala can do pattern matching on classes, but only a special type of classes called case classes.

You start with an abstract class:

abstract class Tree

Then the subtypes of that class, in a way that tells it you want to do pattern matching:

case class Leaf(x:Int) extends Tree
case class Node(left:Tree, right:Tree) extends Tree

Then you can write functions that operate on trees and do pattern matching on both Leaf and Node. Let’s write the fringe function as an example, but first, a quick detour into lists.

Lists

There’s a class called List in Scala.

List() // empty list of nothing, type List[Nothing]
List(1, 2, 3) // List[Int] of numbers
1::List(3, 4, 5) // Cons operator is ::
// => List(1, 3, 4, 5)
List(1, 2, 3) ++ List(4, 5, 6) // Append operator is ++
// => List(1, 2, 3, 4, 5, 6)

That’s about it.

Fringe function

Pattern matching is done with the match and case keywords.

Note that a recursive function in Scala needs an explicitly-defined return type. It can’t infer it.

def fringe(t: Tree) : List[Int] = t match {
    case Leaf(a) => List(a)
    case Node(l, r) => fringe(l) ++ fringe(r)
}

Miscellanea

See the examples are on the course website. A few useful and concise notes are below.

Singleton classes

Singleton classes are a class for which there’s only one object. The main function must be in a singleton class.

Use the keyword object to define a singleton class.

object Counter {
    // ...
}

Member variables

Member variables can be immutable (declared with the val keyword) or varying (per object, declared with the var keyword).

Functions vs. procedures

Scala also has a distinction between functions and procedures. Functions are declared with an equals sign:

def increment(x:Int) = x + 1

And procedures are declared with braces:

def increment {count = count + 1}

Note that procedures that take no arguments don’t need the parentheses.

Operator overloading

Operators are treated the same as any other method name:

class Foo {
    var y : Int = 20
    def +(other: Foo) : Int = y + other.y
}

Note that Scala generalized the way infix operators work. Which means that any method can be used as an infix operator.

class A {
    def f(y:Int) = y + 1
    def + (y:Int) = y + 10
}

//...

// This is typical
a.f(6)
// But this works too!
a f 6
// And a case like this is where it's useful
a + 5
// This is equivalent to:
a.+(5)

Interfaces

Scala’s version of interfaces are called traits, and can have definitions in them as well:

trait CanBeCompared[T] {
    def <(other: T): Boolean
    def <=(other: T) = <(other) || ==(other)
    def >(other: T) = !(<(other)) && !=(other)
    def >=(other: T) = !(<(other))
}

The example above is in fact a generic trait, with the type parameter between square brackets.

And it provides some functionality: any type that provides a < operator and equality operators, and implements this trait, automatically gets the <=, >, and >= operator implementations from the trait.

You can then use these interfaces to bound type parameters:

class Thing[T <: CanBeCompared[T]]

Where <: means “is a subtype of” or “extends” or “implements”. There’s also a complementary >: notation for “is a supertype of”.

Note that unlike Java, the word extends is used for both implementing interfaces and extending parent types.

class Whatever extends MyInterface

Subtyping on instances of generic classes

In Java, there is no subtyping on instances of generic classes. C<A> and C<B> have no subtyping relationship whatsoever, even if A and B do.

In Scala, there is subtyping of generics, but you have to declare the generic class to be either covariantly or contravariantly subtyped, or not subtyped at all.

Remember the problem with generic subtyping. Assuming B <: A covariant subtyping doesn’t work in cases like this:

void addA(Collection<A> c) {
    c.add(new A());
}

Collection<B> cb = new Collection<B>();
addA(cb); // BAD! Adding an A to a collection of Bs!

And contravariant subtyping doesn’t work in cases like this:

void doStuffWithFirstElement(Collection<B> c) {
    B b = c.get();
    // ...Do stuff with b...
}

Collection<A> ca = new Collection<A>();
// ...Add a few values to ca...
doStuffWithFirstElement(ca); // BAD! Doing stuff with first
// value of ca as if it were a B

So there are some cases where covariant subtyping doesn’t work, and some cases where contravariant subtyping doesn’t work. Covariant subtyping is ok when you return objects of the generic type. Contravariant subtyping is ok when you take objects of the generic type as a parameter.

Scala lets you specify covariant or contravariant subtyping, and it restricts how you can write the methods in that class to match those rules.

Covariant generic subtyping is specified with a + sign before the type parameter:

class Covar[+T]...

Contravariant subtyping is specified with a - sign before the type parameter:

class Contravar[-T]...

See the online notes for more in-depth examples.

Garbage collection and memory management

This is the point where we move to topics related more to language implementation rather than language features.

Managing storage in a running program involves managing both the allocation and deallocation of storage. Dynamic storage allocation is what most languages use for allocation, which means allocation is done at runtime.

Most languages need this because you can’t tell at compile time how much memory the program will take. Older versions of FORTRAN are one of the few exceptions to this, which had restrictions like no recursion, no dynamically-sized arrays or other data types, etc.

Stack vs. heap allocation

In languages like C and Ada, most things can be allocated on the stack, where variables and parameters are deleted once their stack frame is removed.

Problem is, that can cause problems when you have a pointer to an object on the stack. The pointer could live longer than that object does. For example, you could have a function that returns a pointer to a local variable, which is deleted immediately after the function returns, causing problems.

So instead, you can allocate things on the heap. The heap doesn’t delete objects when a function returns, so you can have an object which has a lifetime that can be longer than the runtime of the procedure that creates it. The object is said to have a dynamic lifetime.

Problem is, how do you know when to deallocate objects off the heap?

Some languages like C or C++ make you do it yourself, explicitly deleting objects off the heap when they’re no longer needed. This, of course, is pretty error-prone.

The other option is to reclaim storage automatically, and every language we’ve looked at this term does this. The general term for this is garbage collection, and we’ll look at this next week.

Heap structure

There are two major ways to allocate objects on the heap: free list allocation, and heap-pointer allocation.

A free list is a linked list of all objects on the heap. In essence, every free block has a pointer to the next free block. When allocating, you search for a block of the appropriate size, remove it from the free list and return it. The expensive part of this, of course, is that it’s expensive to find a block of the right size. But it’s easy and quick to deallocate stuff: just add the block to the head of the list. So cheap to deallocate, expensive to allocate.

We’ll pick up here next week.