Many years ago, Peter Norvig wrote a beautiful article about creating a lisp interpreter in Python. It’s the most fun tutorial I’ve seen, not just because it teaches you about my favorite language family (Lisp), but because it cuts through to the essence of interpreters, is fun to follow and quick to finish.
Recently, I had some time and wanted to learn Rust. It’s a beautiful systems language, and I’ve seen some great work come out from those who adopt it. I thought, what better way to learn Rust, than to create a lisp interpreter in it?
Hence, Risp — a lisp in rust — was born. In this essay you and I will follow along with Norvig’s Lispy, but instead of Python, we’ll do it in Rust 🙂.
Syntax, Semantics and Notes on Following Along
If you haven’t heard of lisp, some Paul Graham’s essays (one, two, three), alongside some Rich Hickey talks will get you fired up. In short, everything is a list, everything is an expression, and that makes for a very powerful language.
Our structure will be similar to Norvig’s tutorial, though I depart slightly in two ways:
- Instead of 2 stopping points (Lispy Calculator and Full Lispy), we have 4 stopping points. This reflects the phases I took to build it in Rust.
- Norvig’s syntax is based on Scheme. We will base it on Scheme too, but since I’m also a Clojure fan, I sometimes used slightly different naming, and different implementations for a few functions. I will note when I do that in the essay.
Finally, this is the first program I wrote in Rust. I may have misused some things, so if you’re a Rust hacker, I’d love to hear your feedback 🙂.
With the notes out of the way, let’s get into it.
Language 1: Just a Risp calculator
As Norvig suggests, our first goal is to create a subset of lisp, that can do what a basic calculator can do.
To make it as simple as possible to follow, for language 1, we’ll only support addition and subtraction. No variable definitions, no if statements, nada.
This departs a bit from Lispy, but I found this stopping point a lot more convenient when writing it in Rust. So, our goal:
(+ 10 5 2) //=> 17
(- 10 5 2) //=> 3
The important process we need to remember is the flow of an interpreter:
our program ⟶ parse ⟶ abstract syntax tree ⟶ eval ⟶ result
We will need to parse our program and convert it into an abstract syntax tree. After that, we can eval the abstract syntax tree and get our result. (Refer to Norvig’s article for more detailed definitions and explanations).
Type Definitions
Risp can have three kinds of values for now:
We’ll also need an error type:
Finally, we’ll need an environment type. This is where we will store defined variables, built-in functions, and so forth:
Parsing
Our goal is to take our program, and build an abstract syntax tree from it. For us, that is going to be a RispExp
. To do this, first we will take our program, and cut it up into a bunch of tokens:
tokenize("(+ 10 5)") //=> ["(", "+", "10", "5", ")"]
Here’s how we can do that in Rust:
Then, we can parse these tokens, into a RispExp
:
Note: I depart slightly from Norvig’s implementation, by using a “start” position, and returning a “next” position. This lets us recurse and parse nested lists, without mutating the tokens vec.
We get the token for the current position. If it’s the beginning of a list “(“, we start reading and parsing the tokens that follow, until we hit a closing parenthesis:
If it’s a closing tag of a list “)”, we return an error, as read_seq should have skipped past it.
Otherwise, it can only be an atom, so we parse that:
Environment
Let’s go ahead and create the default, global environment. As Norvig explains, environments are where we will store variable definitions and built-in functions.
To implement built-in operations (+, -)
, we need a way to save rust function references. Let’s update RispExp
, so that we can store rust function references:
Then, we can create a default_env
function, that returns a RispEnv
, which implements +, and –
Note: I am following Clojure’s spec for + and -.
To make this simpler, I made a quick helper, which enforces that all RispExp
that we receive are floats:
Evaluation
Now, time to implement eval.
If it’s a symbol, we’ll query for that symbol in the environment and return it (for now, it should be a RispExp::Func
)
If it’s a number, we’ll simply return it.
If it’s a list, we’ll evaluate the first form. It should be a RispExp::Func
. Then, we’ll call that function with all the other evaluated forms as the arguments.
Aand, bam, we have eval.
Repl
Now, to make this fun and interactive, let’s make a repl.
We first need a way to convert our RispExp
to a string. Let’s implement the to_string
trait
Then, let’s tie the interpreter process into a loop
Aand, voila, language 1.0 is done. Here’s the code so far 🙂
We can now add and subtract!
risp >
(+ 10 5 (- 10 3 3))
// 🔥 => 19
Language 1.1: Risp calculator++
Okay, we have a basic calculator. Now, let’s add support for booleans, and introduce some equality comparators.
To implement bools, let’s include it in our RispExp
Rust will tell us to update ToString
Then Rust will tell us we should change eval
, to consider bools:
Let’s also update our parse_atom
function, to consider bools:
Now, we should be good to go. To really see these in action though, let’s implement =, >, <, >=, <=
Comparison Operators
In clojure, these comparison operators are a bit special. They can take more than 2 args, and return true if they are in a monotonic order that satisfies the operator.
For example (> 6 5 3 2)
is true, because 6 > 5 > 3 > 2. Let’s do this for Risp:
The key here is our helper macro ensure_tonicty
. This takes a checker function, and ensures that the conditional passes in a monotonic way:
Aand, voila, language 1.1 is done. Here’s the code so far 🙂
We can now use comparators, and see booleans!
risp >
(> 6 4 3 1)
// 🔥 => true
Language 1.2: Almost Risp
Okay, now, let’s make this a language. Let’s introduce def
and if
.
To do this, let’s update eval
to deal with built-in operators:
We take the first form, and try to eval it as a built-in. If we can, voila, otherwise we evaluate as normal.
Here’s how eval_built_in_form
looks:
if
Here’s how we can implement if:
def
And here’s def:
Aand bam, language 1.2 is done. Here’s the code so far 🙂
We now have some coool built-in functions.
risp >
(def a 1)
// 🔥 => a
risp >
(+ a 1)
// 🔥 => 2
risp >
(if (> 2 4 6) 1 2)
// 🔥 => 2
risp >
(if (< 2 4 6) 1 2)
// 🔥 => 1
Language 2: Full Risp
Now, let’s make this a full-on language. Let’s implement lambdas
! Our syntax can look like this:
(def add-one (fn (a) (+ 1 a)))
(add-one 1) // => 2
First, create the lambda expression
First things first, let’s introduce a Lambda type for our RispExp
Rust will tell us to update ToString
:
Then Rust will tell us to update eval
:
Then, support the built-in constructor
Now, let’s update eval, to handle fn — this will be the built-in call that creates a Lambda expression:
eval_lambda_args
can look like this:
Then, let’s support scoped environments
For now we only have a global environment. To support lambdas, we need to introduce the concept of scoped environments. Whenever we call a lambda, we’ll need to instantiate a new environment.
To do this, let’s first update our RispEnv
struct, to keep an outer reference:
Let’s update default_env
, to specify the lifetime and return None as the outer environment:
Then, let’s update eval
, to recursively search for symbols in our environment:
Finally, let’s support calling lambdas
Let’s update eval
, so that we know what to do when the first form in a list is a lambda:
We first have a quick helper function to eval a list of expressions, as we’ll be doing that both for RispExp::Func
and RispExp::Lambda
Then, we create a function call env_for_lambda
. This will get the params_exp
, and create an environment, where each param corresponds to the argument at that index:
To do this, we need the helper parse_list_of_symbol_strings
, to make sure all of our param definitions are in fact symbols:
With that, we can eval(lambda.body_exp, new_env)
, and…
Voila…language 2.0 is done. Take a look at the code so far 🙂
We now support lambdas!
risp >
(def add-one (fn (a) (+ 1 a)))
// 🔥 => add-one
risp >
(add-one 1)
// 🔥 => 2
Fin
And with that, we’ve reached the end of this adventure. I hope it’s been fun!
There’s still a bunch more to implement, and ways we can make this even more elegant. If you get to it, send me your thoughts 🙂.
Finally, I have to say, I loved using Rust. It’s the least mental overhead I’ve had to maintain with a systems language, and it was a blast to use. The community is alive and well, plus — their guides are phenomenal! Give it a shot if you haven’t already.