Compilers: Program 0

Due: Friday, January 31 at 5pm

ReCalc Interpreter

Writing an interpreter for a simple programming language.

Read the assignment thoroughly before beginning to write your solution. (This assignment is inspired by one appearing at the close of the first chapter in Andrew Appel’s Modern Compiler Implementation in ML.)


Goals


ReCalc: a repeating calculator language

ReCalc is a very simple programming language that allows for arbitrary precision integer calculations augmented with variable assignment and repetition. Here is an example that computes 2 the 100th power:

 Program (
   [ Assign ("p", Lit 1)                                       -- p = 1
   , Repeat ( Lit 100                                          -- repeat 100 times {
            , [ Assign ("p", BinOp (Var "p", Times, Lit 2)) ]  --     p = p * 2;
            )                                                  -- }
   ]
   , Var "p"                                                   -- p
 )

A ReCalc program consists of zero or more instructions that are to be executed in sequence followed by a single expression to be evaluated. Every valid ReCalc program evaluates to an integer (with two caveats: division-by-zero and repeat loops that either take too long or run out of memory).

ReCalc is mostly a straight-line language: statements are executed in the order they are written, with the exception of repeat blocks which represent definite loops: the number of times of the repetition is calculated prior to the body of the loop being evaluated. There are no conditional operations, nor jumps, nor indefinite loops; nor functions.

In this assignment you will be writing an interpreter for ReCalc. The language is simple: every statement is either an assignment of an integer valued expression to a variable or a definite loop over a sequence of other statements. Every expression evaluates to an integer and is either an integer literal, a variable, or a binary operation (plus, minus, times, integer division, or modulus) applied to two subexpressions. Yet-to-be-assigned variables evaluate to 0.

ReCalc is defined in terms of its abstract syntax; it does not have a fixed concrete syntax, though it is easy to imagine several. You will implement a concrete syntax for it by writing an unparser: a program that translates abstract syntax to concrete syntax. Examples of statements in our concrete syntax include:

days_in_year = 365;
weeks_in_year = days_in_year / 7;
days_left = days_in_year % 7;

sum = 0;
product = 1;
repeat 5 times {
    sum = sum + 3;
    product = product + 3;
}

As mentioned, a ReCalc program consists of a sequence of (zero or more) statements followed by a single expression representing the final value of the program. Examples of valid programs (using concrete syntax):

9   -- programs need not have any statements preceding final expression


-- converting Celsius to Fahrenheit
c = 50;
f = ((c - 32) * 5) / 9;
f


n = 3;
repeat n times {
    x = (2 * x) + 1;  -- x will default to 0 at its first use
}
x

There are no comments in the ReCalc. The comments above (using Haskell-style comments) are just for exposition in this assignment. Likewise, there are no parentheses. Parenthesization is implied via abstract syntax construction.

Here is a transcript of my solution to the assignment run on the above small examples which illustrate the facets of this simple language. Observe that a program evaluates to the value corresponding to the final expression in its list of instructions. If a variable is used before it is defined, the value is taken to be zero so that processing can continue.

You need not worry about parsing ReCalc programs. They are expressed using an abstract syntax formed by using Haskell data constructors. For instance:

n = 3;
repeat n times {
    x = (2 * x) + 1;  -- this is valid, x will default to 0 at its first use
}
x

corresponds to the following Haskell fragment:

Program (
  [ Assign ("n", Lit 3)
  , Repeat ( Var "n"
           , [ Assign ("x", BinOp (BinOp (Lit 2, Times, Var "x"), Plus, Lit 1))]
           )
  ]
  , Var "x"
)
    

(This example is one of several that can be found in the “example programs” section of the starter file.)


Haskell

In any programming language worth using, there are more than one way to implement even trivial algorithms. This is especially so in Haskell. To simplify matters, use only the Haskell constructs we have described thus far in class. (If you are not sure whether to use a particular feature, check with your instructor first.)

You are going to write (or complete) a series of top-level functions almost all of which recursively navigate abstract syntax by using pattern-matching case expressions to choose among different parts of an algebraic datatype (e.g., Lit v. Var v. BinOp). Several such functions are already implemented or partially implemented for you in the starter file.

To gain more experience with pattern matching case expressions we will use them almost exclusively as a means of achieving conditionals. There should be no need to use if expressions (except for the one I supply for you in the find function).

Environments: interpretation of a program involves maintaining an environment (the values of variables that have been assigned). For our ReCalc interpreter, the environment can be thought of as a list of string-integer pairs where the string represents a variable’s name and the integer represents its value. By using lists, we make it easy: a variable can be written not by changing a given pair, but by writing a new pair in front of any that might already exist. (One of the challenge problems asks you to ponder an alternative.) The type Environment is an alias for a list of string-integer pairs:

type Environment = [(String, Integer)]

The supplied evalVar function:

evalVar :: Environment -> String -> Integer
evalVar env v = case env of
    []            -> 0
    (x, n) : rest -> if x == v then n else find rest v

returns the current value of variable v in env or 0 if the v has yet to have been assigned.

Initially, the environment is an empty list. After evaluating an assignment such as:

x = 79;

The environment would be:

[("x", 79)]

After another assignment to a different variable:

y = x + 2;

it becomes:

[("y", 81), ("x", 79)]

If x is reassigned:

x = 7;

It is overwritten in the environment, not literally, but by prepending a new pair:

[("x", 7), ("y", 81), ("x", 79)]

What you need to do

  1. Download the starter file. Modify the comments to reflect that it is your file and no longer the starter file and rename it accordingly (e.g., recalc_msiff.hs). Read through the file. Load it and try it out in ghci as follows:

    Prelude> :load recalc_msiff.hs
    [1 of 1] Compiling Main             ( recalc_msiff.hs, interpreted )
    Ok, one module loaded.
    
    *Main> main
    
    9
    
      ==>  9
    
    --------------------------------------------------
    
    answer = 42;
    answer
    
      ==>  *** Exception: Prelude.undefined
    CallStack (from HasCallStack):
      error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
      undefined, called at recalc...

    (The last error indicates that eval has not been completely implemented for you!)

  2. Complete these ReCalc programs (by replacing the undefineds using the supplied data constructors):

    1. diffSquaresExample

       gamma = 100;
       delta = gamma - 1;
       (gamma * gamma) - (delta * delta)
    2. fact20Example

       n = 20;
       fact = 1;
       i = 1;
       repeat n times {
           fact = fact * i;
           i = i + 1;
       }
       fact
    3. ulam7Example

       a = 7;
       b = a % 2;
       c = a / 2;
       d = (3 * a) + 1;
       ((1 - b) * c) + (b * d)

    After you complete the unparser (the next problem), you can test individual examples like this:

    Prelude> :load recalc_msiff.hs   -- or whatever you have named it
    ...
    
    *Main> putStrLn (unparse ulam7Example)   -- or whichever example you want to test
    a = 7;
    b = a % 2;
    c = a / 2;
    d = (3 * a) + 1;
    ((1 - b) * c) + (b * d)
  3. Complete the barebones unparser so that it handles all cases. One slightly tricky point is producing parenthesization of binary expressions. A fully functional unparser for ReCalc should parenthesize all binary-operator expressions that occur as subexpression within other expressions. Binary operations that do not contain further subexpressions should not be parenthesized. So, for example:

    x = 3 + 4;         -- no parentheses
    x = 5 * (3 + 4);   -- yes parentheses

    However, do not get stuck on this point, it is better to produce:

    x = (3 + 4);       -- unnecessary parentheses

    than to not include any (which is flat out wrong!):

    x = 5 * 3 + 4;

    The starter code includes all the functions (and their types) to make the barebones unparser work. (You may want to add another function to help solve the parenthesization conundrum.)

    Do not try to indent statements within repeat loops (at least as part of the main assignment; you may consider it as a challenge problem).

  4. Complete the interpreter by fleshing out eval and writing any necessary helper functions. One such helper function, evalVar, has been provided for you. You can test your code by uncommenting the examples from problem 1 in the recalcExamples list and running main.

  5. Make sure your code is organized and consistently and readably formatted. Include comments where you feel it adds to the readability (or explains choices the reader might find surprising). Make sure to remove all undefined and ... comments once they are no longer needed to mark where work needs to be done.


Submitting your work

Submit your Haskell source file as an attachment replying to the official email announcing this assignment. Name your file as recalc_yourname.hs where yourname is either your username or the name you generally go by. (I do not care which as long as I can easily distinguish your file. For example, mine could be recalc_msiff.hs or recalc_mike.hs.)

The body of your email should be left blank. Any specifics about your assignment you think I should require (for example, known bugs or what you did not have time to complete) should be included in proofread comments near the top of your file.


Challenge problems

If you do any of the challenge problems, submit separately, after submitting the main assignment.