First-Order Functions

The content of this chapter is available as a Scala file here.

First-Order Functions

In the last lecture we have seen how we can give commonly occuring (sub)expressions a name via the With construct. Often, however, we can identify patterns of expressions that occur in many places, such as 5*5/2, 7*7/2 and 3*3/2, the common pattern being x*x/2. In this case, the abstraction capabilities of With are not sufficient. One way to enable more powerful abstractions are functions. Depending on the context of use and the interaction with other language features (such as imperative features or objects), functions are also sometimes called procedures or methods. Here we consider so-called first-order functions, that - unlike higher-order functions - are not expressions and can hence not be passed to or be returned from other functions. First-order functions are simply called by name. To introduce first-order functions, we need two new things: The possibility to define functions, and the possibility to call functions. A call to a function is an expression, whereas functions are defined separately. Functions can have an arbitrary number of arguments. The following definitions are the language we have analyzed so far together with the new language constructs for first-order functions:

object Syntax {
  enum Exp:
    case Num(n: Int) extends Exp
    case Add(lhs: Exp, rhs: Exp) extends Exp
    case Mul(lhs: Exp, rhs: Exp) extends Exp
    case Id(x: String) extends Exp
    case With(x: String, xdef: Exp, body: Exp) extends Exp

    /** The new language constructs for first-order functions: */
    case Call(f: String, args: List[Exp]) extends Exp // functions are called by name

  object Exp:
    /** The new language constructs for first-order functions: */
    case class FunDef(args: List[String], body: Exp)
    type Funs = Map[String, FunDef]

    /** We use implicits again to make example programs less verbose. */
    implicit def num2exp(n: Int): Exp = Num(n)
    implicit def string2exp(x: String): Exp = Id(x)
}
import Syntax._
import Exp._

A function has a number of formal args and a body. Note that a first-order function also has a name. To make the invariant that there can only be one function for each name explicit, we have stored functions in the form of a map from function names to FunDefs above.

The substitution for the new language is a straightforward extension of the former one.

def subst(e: Exp, i: String, v: Num): Exp =  e match {
    case Num(n) => e
    case Id(x) => if (x == i) v else e
    case Add(l, r) => Add(subst(l, i, v), subst(r, i, v))
    case Mul(l, r) => Mul(subst(l, i, v), subst(r, i, v))
    case With(x, xdef, body) => With(x,
                                   subst(xdef, i, v),
                                   if (x == i) body else subst(body, i, v))
    case Call(f, args) => Call(f, args.map(subst(_, i, v)))
}

We will first study a "reference interpreter" based on substitution. We pass the map of functions as an additional parameter.

def eval(funs: Funs, e: Exp): Int = e match {
  case Num(n) => n
  case Id(x) => sys.error("unbound identifier: " + x)
  case Add(l, r) => eval(funs, l) + eval(funs, r)
  case Mul(l, r) => eval(funs, l) * eval(funs, r)
  case With(x, xdef, body) => eval(funs, subst(body, x, Num(eval(funs, xdef))))
  case Call(f, args) => {
     val fd = funs(f) // lookup function definition
     val vargs = args.map(eval(funs, _)) // evaluate function arguments
     if (fd.args.size != vargs.size)
       sys.error("number of paramters in call to " + f + " does not match")
     // We construct the function body to be evaluated by subsequently substituting
     // all formal arguments with their respective argument values.
     // If we have only a single argument "fd.arg" and a single argument value "varg",
     // the next line of code is equivalent to:
     // val substbody = subst(fd.body, fd.arg, Num(varg))
     val substbody = fd.args.zip(vargs).foldLeft(fd.body)((b, av) => subst(b, av._1, Num(av._2)))
     eval(funs, substbody)
  }
}

Is the extension really so straightforward? It can be seen in the last line of our definition for subst that variable substitution deliberately ignores the function name f. The substitution for f instead is handled separately inside eval. We say in this case that function names and variable names live in different "name spaces". An alternative would be to have them share one namespace. As an exercise, think about how to support a common namespace for function names and variable names.

A test case:

val someFuns: Funs = Map("adder" -> FunDef(List("a", "b"), Add("a", "b")),
                     "doubleadder" -> FunDef(List("a", "x"), Add(Call("adder", List("a", 5)), Call("adder", List("x", 7)))))
val callSomeFuns = eval(someFuns, Call("doubleadder", List(2, 3)))
// callSomeFuns: Int = 17
assert(callSomeFuns == 17)

The scope of function definitions:

As can be seen in the example above, each function can "see" the other functions. We say that in this language functions have a global scope. Exercise: Can a function also invoke itself? Is this useful? We will now study an environment-based version of the interpreter. To motivate environments, consider the following sample program:

val testProg = With("x", 1, With("y", 2, With("z", 3, Add("x", Add("y", "z")))))

When considering the With case of the interpreter, the interpreter will subsequently produce and evaluate the following intermediate expressions:

val testProgAfterOneStep     = With("y", 2, With("z", 3, Add(1, Add("y", "z"))))
val testProgAfterTwoSteps    = With("z", 3, Add(1, Add(2, "z")))
val testProgAfterThreeSteps  = Add(1, Add(2, 3))

At this point only pure arithmetic is left. But we see that the interpreter had to apply subsitution three times. In general, if the program size is n, then the interpreter may perform up to O(n) substitutions, each of which takes O(n) time. This quadratic complexity seems rather wasteful. Can we do better? We can avoid the redundancy by deferring the substitutions until they are really needed. Concretely, we define a repository of deferred substitutions, called environment. It tells us which identifiers are supposed to be eventually substituted by which value. This idea is captured in the following type definition:

type Env = Map[String, Int]

Initially, we have no substitutions to perform, so the repository is empty. Every time we encounter a construct (a With or an application Call) that requires substitution, we augment the repository with one more entry, recording the identifier’s name and the value (if eager) or expression (if lazy) it should eventually be substituted with. We continue to evaluate without actually performing the substitution. This strategy breaks a key invariant we had established earlier, which is that any identifier the interpreter could encounter must be free, for had it been bound, it would have already been substituted. Now that we’re no longer using the substitution-based model, we may encounter bound identifiers during interpretation. How do we handle them? We must substitute them by consulting the repository.

def evalWithEnv(funs: Funs, env: Env, e: Exp): Int = e match {
  case Num(n) => n
  case Id(x) => env(x) // look up in repository of deferred substitutions
  case Add(l, r) => evalWithEnv(funs, env, l) + evalWithEnv(funs, env, r)
  case Mul(l, r) => evalWithEnv(funs, env, l) * evalWithEnv(funs, env, r)
  case With(x, xdef, body) => evalWithEnv(funs, env + ((x, evalWithEnv(funs, env, xdef))), body)
  case Call(f, args) => {
     val fd = funs(f) // lookup function definition
     val vargs = args.map(evalWithEnv(funs, env, _)) // evaluate function arguments
     if (fd.args.size != vargs.size)
       sys.error("number of paramters in call to " + f + " does not match")
     // We construct the environment by associating each formal argument to its actual value
     val newenv = Map() ++ fd.args.zip(vargs)
     evalWithEnv(funs, newenv, fd.body)
  }
}

val evalEnvSomeFuns = evalWithEnv(someFuns, Map.empty, Call("doubleadder", List(2, 3)))
// evalEnvSomeFuns: Int = 17
assert(evalEnvSomeFuns == 17)

In the interpreter above, we have extended the empty environment when constructing newenv. A conceivable alternative is to extend env instead, like so:

def evalDynScope(funs: Funs, env: Env, e: Exp): Int = e match {
  case Num(n) => n
  case Id(x) => env(x)
  case Add(l, r) => evalDynScope(funs, env, l) + evalDynScope(funs, env, r)
  case Mul(l, r) => evalDynScope(funs, env, l) * evalDynScope(funs, env, r)
  case With(x, xdef, body) => evalDynScope(funs, env + ((x, evalDynScope(funs, env, xdef))), body)
  case Call(f, args) => {
     val fd = funs(f)
     val vargs = args.map(evalDynScope(funs, env, _))
     if (fd.args.size != vargs.size)
       sys.error("number of paramters in call to " + f + " does not match")
     val newenv = env ++ fd.args.zip(vargs) // extending env instead of Map() !!
     evalDynScope(funs, newenv, fd.body)
  }
}

val evalDynSomeFuns = evalDynScope(someFuns, Map.empty, Call("doubleadder", List(2, 3)))
// evalDynSomeFuns: Int = 17
assert(evalDynSomeFuns == 17)

Does this make a difference? Yes, it does. Here is an example:

val funnyFun: Funs = Map("funny" -> FunDef(List("a"), Add("a", "b")))
val evalDynFunnyFun = evalDynScope(funnyFun, Map.empty, With("b", 3, Call("funny", List(4))))
// evalDynFunnyFun: Int = 7
assert(evalDynFunnyFun == 7)

Obviously this interpreter is "buggy" in the sense that it does not agree with the substitution-based interpreter. But is this semantics reasonable? Let's introduce some terminology to make the discussion simpler:

Definition (Static Scope): In a language with static scope, the scope of an identifier’s binding is a syntactically delimited region. A typical region would be the body of a function or other binding construct.

Definition (Dynamic Scope): In a language with dynamic scope, the scope of an identifier’s binding is the entire remainder of the execution during which that binding is in effect.

We see that eval and evalWithEnv give our language static scoping, whereas evalDynScope gives our language dynamic scoping. Armed with this terminology, we claim that dynamic scope is entirely unreasonable. The problem is that we simply cannot determine what the value of a program will be without knowing everything about its execution history. If a function f were invoked by some sequence of other functions that did not bind a value for some parameter of f, then that particular application of f would result in an error, even though a previous application of f in the very same program’s execution may have completed successfully! In other words, simply by looking at the source text of f, it would be impossible to determine one of the most rudimentary properties of a program: whether or not a given identifier was bound. You can only imagine the mayhem this would cause in a large software system, especially with multiple developers and complex flows of control. We will therefore regard dynamic scope as an error. That said, there are facets of dynamic binding that are quite useful. For instance, exception handlers are typically dynamically scoped: A thrown exception is dispatched to the most recently encountered active exception handler for that exception type.