The Great Wave off Kanagawa, Katsushika Hokusai
As someone who writes code, you undoubtedly do so using one or multiple programming languages. You probably enjoy writing code in some programming languages because of their elegance, expressive power or any other reason and you also have probably kept your distance from other programming languages because of, maybe, some of their features that you think are badly implemented.
Have you, however, thought about how those programming languages we love and hate came to be? How those particular features we like (or do not like) are designed and implemented and why? How those magic black boxes that are compilers and interpreters work? How code written in JavaScript, Ruby, Python, etc, turns into an executable program? Or have you ever thought about building your own programming language?
If your answer to any of the above questions is yes, welcome to this series of articles about building a programming language.
This series will take you from 0 to 1 in building a functional interpreter for a programming language. At the end of the series, you will have an interpreter, that you built from scratch, run programs written in a programming language that we would have designed together.
Why learn how to implement a programming language?
This article is the introduction to a series of which each article will introduce the concepts and knowledge necessary to complete one specific step in implementing a programming language. Each article ends with a challenge (it wouldn’t be fun if there was no challenge, right?). The challenge will prompt you to implement a well-defined component of your interpreter. There will be some test files to download at the end of each article and you complete the challenge by writing the code to make all the tests pass. By the time you complete all the challenges, you will have a full working interpreter that can run code written in our programming language.
Although any programming language can be used to complete the challenges, for this series, our implementation language will be JavaScript, (ES6 more precisely) mostly so that the series will be accessible to a wider audience.
If all the above sounds good, let’s start by describing the programming language we will be implementing. Our programming language, specifically designed for the purpose of this series, is called Blink. It’s an interpreted, object-oriented and statically typed programming language inspired by Swift, Kotlin and Scala.
The typical “Hello, World” is written as follows.
Console.println("Hello, Blink!")
Blink supports values of familiar types like Int
, Double
, String
and Bool
. Additionally, there is a special type Unit
to express the absence of value (similar to void
in Java) and all types in Blink inherit from a supertype Object
.
A comment in Blink starts with //
. Anything from the //
symbol to the end of line will be ignored by the Blink interpreter.
// This is a comment.
Variables are declared using a let
expression.
let message: String = "Hello, Blink!" in {Console.println(message)}
A let
expression comprises 2 parts:
in
keyword) where the variable is declared and possibly initialized. A variable declaration is made of the name of the variable followed by a colon :
, followed by the type of the variable message: String
. To initialize the variable at its declaration, append equal =
and the value of the variable message: String = "Hello, Blink!"
.in
keyword) where the variable can be accessed. Variables declared with let
are only accessible in the associated body.Omitting the type of the variable. Blink supports type inference meaning that if a variable is initialized at its declaration then its type can be omitted and Blink will infer the correct type of the variable based on its value. The previous example could have been written like this:
let message = "Hello, World!" in {Console.println(message)}
Declaring multiple variables at once. To declare multiple variables, separate them with commas ,
in the declaration part of the let
expression.
let a = 42**,** b = 12**,** c = 24 in {Console.println(a + b + c)}
Each variable can access all the variables declared before it in its initialization expression. So something like the following is perfectly possible.
let a = 42, b = a * 2, c = b * 3 in {Console.println(a)Console.println(b)Console.println(c)}
The above will print
4284252
Omitting the curly braces. A final note on the let
expression is that if the body is made of only one expression, the curly braces can be omitted. The first example could be written more succinctly as follows.
let message = "Hello, Blink!" in Console.println(message)
Like in most programming languages, conditions are expressed using an if-else
expression.
if (<condition>) {<do something>} else {<do something>}
The condition must evaluate to a value of type Bool
. The else
block can be omitted if needed and the curly braces can be omitted for the if
or else
block if the block is made of only one expression.
if (answer == 42) {Console.println("That's the answer of life!")} else {Console.println("That's not the answer of life.")}
A while
expression is used to execute one or more expressions as long as a condition holds true.
while (<condition>) {<do something>}
Again, the curly braces can be omitted if the body of the while
is made of only one expression.
Functions are defined using the func
keyword.
func sum(a: Int, b: Int): Int = {a + b}
Function parameters in Blink are separated by commas ,
and enclosed in parenthesis (a: Int, b: Int)
. Each parameter is defined by its name followed by a colon :
, followed by its type a: Int
.
Return type. The return type is defined by adding a colon :
followed by the type of the value returned by the function after the parameter list. If the function does not return a value, the return type must be Unit
.
func greet(): Unit = {Console.println("Hello, Blink!")}
However, as a syntactic sugar, if a function does not declare any return type, Blink will automatically assume that the return type is Unit
. So the previous example could be written as follows.
func greet() = {Console.println("Hello, Blink!")}
Function body. After the return type, comes the equal =
operator and then the body of the function enclosed in curly braces. There is no return keyword in Blink, the value of the last expression in the body of the function is the value returned by the function.
If the body is made of only one expression, the curly braces can be omitted. Our first example could be rewritten like this:
func sum(a: Int, b: Int) = a + b
Classes are defined using the class
keyword.
class Person {}
This defines a simple (albeit useless) class Person
. Objects of the class Person
can now be created using the new
keyword.
let p = new Person() in { }
Properties. A class can have one or more properties declared with the var
keyword.
class Person {var firstname: String = "Klaus"var lastname: Stringvar age: Int}
A property can be optionally initialized at its declaration. If a property is initialized, the initialization expression will be evaluated when the object is being created.
Properties in Blink are private. They can not be made public. Getters and setters (which are just functions) can be created to access properties outside of the class.
Functions. A class can have functions (or methods
if you prefer).
class Person {var firstname: String = "Klaus"var lastname: Stringvar age: Int
**func** firstname(): **String** = {
firstname
}
**func** setFirstname(name: **String**) = {
firstname = name
}
}
In the example above, we have a getter and a setter for the property firstname
. A more idiomatic way to write those functions would be to omit the curly braces because there is only one expression in the body of each of those functions.
class Person {var firstname: String = "Klaus"var lastname: Stringvar age: Int
**func** firstname(): **String** = firstname
**func** setFirstname(name: **String**) = firstname = name
}
Functions are public by default in Blink. However, it’s possible to make a function private by adding the private
modifier before the func
keyword.
private func age(): Int = // ...
Constructor. A class in Blink has one and only one constructor. By default, a class in Blink has a default constructor which takes no parameter. An explicit custom constructor can be defined by adding a list of parameters enclosed in parenthesis to the name of the class.
class Person**(firstname: String,** lastname**: String)** {func firstname(): String = firstnamefunc setFirstname(name: String) = firstname = name
**func** lastname(): **String** = lastname
**func** setLastname(name: **String**) = lastname = name
}
The constructor parameters automatically act as properties of the class and can be accessed from the body of all the functions in the class or from the initialization expression of any additional property.
let person = new Person("Klaus", "Baudelaire") in {Console.println(person**.firstname())Console.println(person.lastname()**)}
The above will print
KlausBaudelaire
Inheritance. Inheritance in Blink is expressed with the extends
keyword.
class Employee extends Person {}
If the class being inherited (the superclass) has an explicit constructor, the inheriting class must honor the superclass’s constructor by passing the arguments necessary to create an object of the superclass.
class Employee**(firstname: String,** lastname**: String,** company**: String)** extends Person**(firstname, lastname)** {}
Overriding functions. The override
modifier is used to override a function in the superclass.
class Person(firstname: String, lastname: String) {override func toString(): String = firstname + " " + lastname}
In the above example, we’re overriding the toString
available in the Object
class. All classes inherit from Object
in Blink.
Just like it’s possible to use arithmetic operators +
, -
, *
or /
to perform operations between 2 or more Int
s, Blink allows us to define the behavior of the binary operators +
, -
, *
, /
, %
and unary operators —
and !
for our own classes.
A binary operator is defined by adding a function which name is the symbol of the operator to define. For example, let’s define a +
operator for a class Rational
(representing rational numbers like 3/4
).
class Rational(num: Int, den: Int) {func +(other: Rational): Rational = // ...}
Given two variables a
and b
of type Rational
, we can now write let c: Rational = a + b
. Blink automatically converts the expression a + b
to the function call a.+(b)
.
A unary operator is defined by adding a function which name starts with unary_
followed by the symbol of the operator being defined.
class Rational(num: Int, den: Int) {func unary_-(): Rational = new Rational(-num, -den)}
In the example above, we’re defining the behavior of the unary —
operator on our class Rational
. Given a variable a
of type Rational
, we can write let b: Rational = -a
. Blink converts the expression -a
to the function call a.unary_-()
.
Operator overloading is also supported in Blink, meaning that it is possible to define multiple +
operators, for example, as long as the parameters list of each operator function is different.
For instance, for our Rational
class, we may want to add 2 Rational
s together but also want to add a Rational
and an Int
. Operator overloading allows us to do just that.
class Rational(num: Int, den: Int) {func +(other: Rational): Rational = // ...func +(integer: Int): Rational = // ...func +(double: Double): Rational = // ...}
With the above example and given 2 variables a
and b
of type Rational
, we can write expressions like a + b
, a + 42
or b + 3.14
.
Everything in Blink is an object. Even primitives like Int
s or Double
s are actually objects. And an arithmetic expression like 32 + 21
is, in fact, the function call 32.+(21)
. This is a neat feature that enables us, among others things, to define operators for our own types in Blink.
To conclude our tour of Blink, it is worth noting that unlike other programming languages like Java or JavaScript, there is no statement in Blink, only expressions (and definitions like function or property definitions). The difference between an expression and a statement is that an expression always evaluates to a value while a statement just performs some action.
For example in JavaScript, the if
statement executes a certain block of code if a condition is true
or false
. In Blink, if
is an expression and evaluates to a value.
That means, in Blink, it’s perfectly valid to write
let isLegallyAdult = if (age > 18) { true } else { false }
isLegallyAdult
will be equal to true
if age > 18
or will be equal to false
if not.
This concludes our quick tour of Blink. As you can see, Blink has the necessary concepts to express almost any program, is simple enough to tackle writing an interpreter for it in a relatively small amount of time and has some interesting features like its usage of objects and expressions.
If you’re ready to jump in, head over to the second article which will give a high level overview of compilers and interpreters.