mirror of
https://github.com/jamiebuilds/the-super-tiny-compiler.git
synced 2024-10-27 20:34:08 +00:00
Original project from Glitch
This commit is contained in:
parent
ad67945c95
commit
4ec6074b55
285
0-introduction.md
Normal file
@ -0,0 +1,285 @@
|
||||
# Introduction
|
||||
|
||||
Today we're going to write a compiler together. But not just any compiler... A
|
||||
super duper teeny tiny compiler! A compiler that is so small that if you remove
|
||||
all the comments this file would only be ~200 lines of actual code.
|
||||
|
||||
We're going to compile some lisp-like function calls into some C-like function
|
||||
calls.
|
||||
|
||||
If you are not familiar with one or the other, I'll just give you a quick intro.
|
||||
|
||||
If we had two functions `add` and `subtract` they would be written like this:
|
||||
|
||||
|               | LISP-style               | C-style                  |
| ------------- | ------------------------ | ------------------------ |
| `2 + 2`       | `(add 2 2)`              | `add(2, 2)`              |
| `4 - 2`       | `(subtract 4 2)`         | `subtract(4, 2)`         |
| `2 + (4 - 2)` | `(add 2 (subtract 4 2))` | `add(2, subtract(4, 2))` |
|
||||
|
||||
Easy peasy, right?
|
||||
|
||||
Well good, because this is exactly what we are going to compile. While this is
neither a complete LISP nor C syntax, it will be enough of the syntax to
demonstrate many of the major pieces of a modern compiler.
|
||||
|
||||
# Stages of a Compiler
|
||||
|
||||
Most compilers break down into three primary stages: Parsing, Transformation,
and Code Generation.
|
||||
|
||||
1. *Parsing* is taking raw code and turning it into a more abstract
   representation of the code.
2. *Transformation* takes this abstract representation and manipulates it to do
   whatever the compiler wants it to.
3. *Code Generation* takes the transformed representation of the code and turns
   it into new code.
|
||||
|
||||
## Parsing
|
||||
|
||||
Parsing typically gets broken down into two phases: Lexical Analysis and
|
||||
Syntactic Analysis.
|
||||
|
||||
*Lexical Analysis* takes the raw code and splits it apart into pieces called
tokens, using a tokenizer (or lexer).

Tokens are an array of tiny little objects, each describing an isolated piece of
the syntax. They could be numbers, labels, punctuation, operators, whatever.
|
||||
|
||||
*Syntactic Analysis* takes the tokens and reformats them into a representation
|
||||
that describes each part of the syntax and their relation to one another. This
|
||||
is known as an **Intermediate Representation** or **Abstract Syntax Tree**.
|
||||
|
||||
An Abstract Syntax Tree, or AST for short, is a deeply nested object that
|
||||
represents code in a way that is both easy to work with and tells us a lot of
|
||||
information.
|
||||
|
||||
For the following syntax:
|
||||
|
||||
```lisp
(add 2 (subtract 4 2))
```
|
||||
|
||||
Tokens might look something like this:
|
||||
|
||||
```js
[
  { type: 'paren',  value: '('        },
  { type: 'name',   value: 'add'      },
  { type: 'number', value: '2'        },
  { type: 'paren',  value: '('        },
  { type: 'name',   value: 'subtract' },
  { type: 'number', value: '4'        },
  { type: 'number', value: '2'        },
  { type: 'paren',  value: ')'        },
  { type: 'paren',  value: ')'        },
]
```
|
||||
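To make the token list above concrete, here is a minimal sketch of one way such tokens could be produced. It is regex-driven and only an illustration: `sketchTokenize` is a hypothetical name, it handles just parentheses, numbers, and names, and it is not the tokenizer this project builds later (that one walks the input character by character).

```javascript
// Hypothetical helper, not part of the project: a compact tokenizer for the
// tiny syntax used in this introduction.
function sketchTokenize(input) {
  const tokens = [];
  // One alternative per token kind. Whitespace is matched but discarded.
  // The sticky flag (y) forces each match to start exactly at `lastIndex`.
  const pattern = /\s+|([()])|([0-9]+)|([a-z]+)/iy;
  let position = 0;

  while (position < input.length) {
    pattern.lastIndex = position;
    const match = pattern.exec(input);
    if (match === null) {
      throw new TypeError('Unexpected character: ' + input[position]);
    }

    const [text, paren, number, name] = match;
    if (paren) tokens.push({ type: 'paren', value: paren });
    else if (number) tokens.push({ type: 'number', value: number });
    else if (name) tokens.push({ type: 'name', value: name });

    // Advance past whatever was matched (including skipped whitespace).
    position += text.length;
  }

  return tokens;
}

console.log(sketchTokenize('(add 2 (subtract 4 2))'));
```

The point of the sketch is only that each token records a `type` and the raw `value` it was read from; whitespace separates tokens but never becomes one.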
|
||||
And an Abstract Syntax Tree (AST) might look like this:
|
||||
|
||||
```js
{
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [{
      type: 'NumberLiteral',
      value: '2',
    }, {
      type: 'CallExpression',
      name: 'subtract',
      params: [{
        type: 'NumberLiteral',
        value: '4',
      }, {
        type: 'NumberLiteral',
        value: '2',
      }]
    }]
  }]
}
```
|
||||
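As a rough illustration of syntactic analysis, the sketch below turns a token array like the one shown earlier into the AST shape above. The name `sketchParse` and its error messages are assumptions made for this example; it covers only numbers and call expressions and is not the project's actual parser.

```javascript
// Hypothetical sketch: build the AST from an array of tokens using recursion.
function sketchParse(tokens) {
  let current = 0;

  const isClosingParen = token =>
    token.type === 'paren' && token.value === ')';

  function walk() {
    const token = tokens[current++];
    if (token === undefined) {
      throw new SyntaxError('Unexpected end of input');
    }

    if (token.type === 'number') {
      return { type: 'NumberLiteral', value: token.value };
    }

    if (token.type === 'paren' && token.value === '(') {
      // The token right after "(" is taken to be the function name.
      const nameToken = tokens[current++];
      if (nameToken === undefined || nameToken.type !== 'name') {
        throw new SyntaxError('Expected a function name after "("');
      }
      const node = { type: 'CallExpression', name: nameToken.value, params: [] };

      // Everything up to the matching ")" is a parameter. Nested calls
      // consume their own tokens through the recursive call to walk().
      while (current < tokens.length && !isClosingParen(tokens[current])) {
        node.params.push(walk());
      }
      if (current >= tokens.length) {
        throw new SyntaxError('Missing closing parenthesis');
      }
      current++; // skip the ")"
      return node;
    }

    throw new TypeError('Unexpected token: ' + token.value);
  }

  const ast = { type: 'Program', body: [] };
  while (current < tokens.length) {
    ast.body.push(walk());
  }
  return ast;
}

const exampleTokens = [
  { type: 'paren', value: '(' },
  { type: 'name', value: 'add' },
  { type: 'number', value: '2' },
  { type: 'paren', value: ')' },
];
console.log(JSON.stringify(sketchParse(exampleTokens), null, 2));
```

Recursion is what keeps this short: a nested call is parsed by the same `walk` function that parses the outer one, so arbitrary nesting needs no extra code.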
|
||||
## Transformation
|
||||
|
||||
The next stage of a compiler is transformation. Again, this just takes
the AST from the last step and makes changes to it. It can manipulate the AST
in the same language or it can translate it into an entirely new language.
|
||||
|
||||
Let's look at how we would transform an AST.
|
||||
|
||||
You might notice that our AST has elements within it that look very similar.
There are these objects with a type property. Each of these is known as an AST
Node. These nodes have defined properties on them that describe one isolated
part of the tree.
|
||||
|
||||
We can have a node for a "NumberLiteral":
|
||||
|
||||
```js
{
  type: 'NumberLiteral',
  value: '2',
}
```
|
||||
|
||||
Or maybe a node for a "CallExpression":
|
||||
|
||||
```js
{
  type: 'CallExpression',
  name: 'subtract',
  params: [
    // nested nodes go here...
  ],
}
```
|
||||
|
||||
When transforming the AST we can manipulate nodes by adding/removing/replacing
|
||||
properties, we can add new nodes, remove nodes, or we could leave the existing
|
||||
AST alone and create an entirely new one based on it.
|
||||
|
||||
Since we're targeting a new language, we're going to focus on creating an
|
||||
entirely new AST that is specific to the target language.
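
As a sketch of what "creating an entirely new AST" can mean, the function below maps each LISP-style node to a fresh C-style node without touching the original tree. The target node shapes (`ExpressionStatement`, `callee`, `Identifier`, `arguments`) follow the ones used by the transformer file in this commit; the function name `sketchTransform` and its recursive, visitor-free approach are assumptions made for this example only.

```javascript
// Hypothetical sketch: build a brand-new, C-style AST from the LISP-style one.
// `isTopLevel` marks calls that sit directly in the Program body; those get
// wrapped in an ExpressionStatement.
function sketchTransform(node, isTopLevel = false) {
  switch (node.type) {
    case 'Program':
      return {
        type: 'Program',
        body: node.body.map(child => sketchTransform(child, true)),
      };

    case 'NumberLiteral':
      return { type: 'NumberLiteral', value: node.value };

    case 'CallExpression': {
      const call = {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: node.name },
        arguments: node.params.map(param => sketchTransform(param)),
      };
      return isTopLevel
        ? { type: 'ExpressionStatement', expression: call }
        : call;
    }

    default:
      throw new TypeError(node.type);
  }
}

const lispAst = {
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [
      { type: 'NumberLiteral', value: '2' },
      { type: 'NumberLiteral', value: '2' },
    ],
  }],
};
console.log(JSON.stringify(sketchTransform(lispAst), null, 2));
```

Because every case returns a new object, the input AST stays unchanged and could still be inspected or transformed again.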
|
||||
|
||||
## Traversal
|
||||
|
||||
In order to navigate through all of these nodes, we need to be able to traverse
|
||||
through them. This traversal process goes to each node in the AST depth-first.
|
||||
|
||||
```js
{
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [{
      type: 'NumberLiteral',
      value: '2'
    }, {
      type: 'CallExpression',
      name: 'subtract',
      params: [{
        type: 'NumberLiteral',
        value: '4'
      }, {
        type: 'NumberLiteral',
        value: '2'
      }]
    }]
  }]
}
```
|
||||
|
||||
So for the above AST we would go:
|
||||
|
||||
1. Program - Starting at the top level of the AST
2. CallExpression (add) - Moving to the first element of the Program's body
3. NumberLiteral (2) - Moving to the first element of the `add` CallExpression's params
4. CallExpression (subtract) - Moving to the second element of the `add` CallExpression's params
5. NumberLiteral (4) - Moving to the first element of the `subtract` CallExpression's params
6. NumberLiteral (2) - Moving to the second element of the `subtract` CallExpression's params
|
||||
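The visiting order listed above can be reproduced with a few lines of code. This is only a sketch: `visitOrder`, `describe`, and `sampleAst` are hypothetical names, and the function assumes children live in either `body` or `params`, as they do in the AST above.

```javascript
// The same AST as above, written compactly.
const sampleAst = {
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [
      { type: 'NumberLiteral', value: '2' },
      {
        type: 'CallExpression',
        name: 'subtract',
        params: [
          { type: 'NumberLiteral', value: '4' },
          { type: 'NumberLiteral', value: '2' },
        ],
      },
    ],
  }],
};

// Label a node the way the list above does, e.g. "CallExpression (add)".
function describe(node) {
  if (node.name !== undefined) return `${node.type} (${node.name})`;
  if (node.value !== undefined) return `${node.type} (${node.value})`;
  return node.type;
}

// Depth-first: record the node itself first, then each child in order.
function visitOrder(node, order = []) {
  order.push(describe(node));
  const children = node.body || node.params || [];
  children.forEach(child => visitOrder(child, order));
  return order;
}

console.log(visitOrder(sampleAst).join('\n'));
```

Each node is recorded before any of its children, which is why the nested `subtract` call appears before its own two number literals but after the first `2`.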
|
||||
If we were manipulating this AST directly, instead of creating a separate AST,
we would likely introduce all sorts of abstractions here. But just visiting
each node in the tree is enough for our purposes.

I use the word "visiting" because there is a well-known pattern, the visitor
pattern, for representing operations on the elements of an object structure.
|
||||
|
||||
### Visitors
|
||||
|
||||
The basic idea here is that we are going to create a "visitor" object that has
|
||||
methods that will accept different node types.
|
||||
|
||||
```js
var visitor = {
  NumberLiteral() {},
  CallExpression() {},
};
```
|
||||
|
||||
When we traverse our AST, we will call the methods on this visitor whenever we
|
||||
"enter" a node of a matching type.
|
||||
|
||||
In order to make this useful we will also pass the node and a reference to the
|
||||
parent node.
|
||||
|
||||
```js
var visitor = {
  NumberLiteral(node, parent) {},
  CallExpression(node, parent) {},
};
```
|
||||
|
||||
However, there also exists the possibility of calling things on "exit". Imagine
our tree structure from before in list form:
|
||||
|
||||
- Program
  - CallExpression
    - NumberLiteral
    - CallExpression
      - NumberLiteral
      - NumberLiteral
|
||||
|
||||
As we traverse down, we're going to reach branches with dead ends. As we finish
|
||||
each branch of the tree we "exit" it. So going down the tree we "enter" each
|
||||
node, and going back up we "exit".
|
||||
|
||||
- → Program (enter)
  - → CallExpression (enter)
    - → NumberLiteral (enter)
    - ← NumberLiteral (exit)
    - → CallExpression (enter)
      - → NumberLiteral (enter)
      - ← NumberLiteral (exit)
      - → NumberLiteral (enter)
      - ← NumberLiteral (exit)
    - ← CallExpression (exit)
  - ← CallExpression (exit)
- ← Program (exit)
|
||||
|
||||
In order to support that, the final form of our visitor will look like this:
|
||||
|
||||
```js
var visitor = {
  NumberLiteral: {
    enter(node, parent) {},
    exit(node, parent) {},
  }
};
```
|
||||
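To see enter and exit in action, here is a small sketch of a traversal that calls a visitor of this final form and records each call. `sketchTraverse`, `tree`, `log`, and `logging` are names invented for the example, and the function assumes child nodes live in `body` or `params`; treat it as an illustration of the idea rather than the project's traverser.

```javascript
// Hypothetical sketch: call `enter` on the way down and `exit` on the way up.
function sketchTraverse(node, visitor, parent = null) {
  const methods = visitor[node.type];

  if (methods && methods.enter) methods.enter(node, parent);

  // Visit the children (if any) before exiting the current node.
  const children = node.body || node.params || [];
  children.forEach(child => sketchTraverse(child, visitor, node));

  if (methods && methods.exit) methods.exit(node, parent);
}

const tree = {
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [
      { type: 'NumberLiteral', value: '2' },
      {
        type: 'CallExpression',
        name: 'subtract',
        params: [
          { type: 'NumberLiteral', value: '4' },
          { type: 'NumberLiteral', value: '2' },
        ],
      },
    ],
  }],
};

// One pair of handlers, reused for every node type, that records each call.
const log = [];
const logging = {
  enter(node) { log.push('→ ' + node.type + ' (enter)'); },
  exit(node) { log.push('← ' + node.type + ' (exit)'); },
};

sketchTraverse(tree, {
  Program: logging,
  CallExpression: logging,
  NumberLiteral: logging,
});

console.log(log.join('\n'));
```

The recorded lines come out in the same order as the enter/exit list above: a node is exited only after all of its children have been entered and exited.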
|
||||
## Code Generation
|
||||
|
||||
The final phase of a compiler is code generation. Sometimes compilers will do
things that overlap with transformation, but for the most part code generation
just means taking our AST and stringifying it back into code.

Code generators work in several different ways: some compilers will reuse the
tokens from earlier, others will have created a separate representation of the
code so that they can print nodes linearly, but from what I can tell most will
use the same AST we just created, which is what we're going to focus on.
|
||||
|
||||
Effectively our code generator will know how to "print" all of the different
|
||||
node types of the AST, and it will recursively call itself to print nested
|
||||
nodes until everything is printed into one long string of code.
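
A minimal sketch of that recursive "printing" might look like the following. It assumes the C-style node shapes used by the transformer in this commit (`ExpressionStatement`, `CallExpression` with `callee` and `arguments`, `Identifier`, `NumberLiteral`); the names `sketchGenerate` and `targetAst` are made up for the example.

```javascript
// Hypothetical sketch: each node type knows how to print itself, and nested
// nodes are printed by calling the same function again.
function sketchGenerate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(sketchGenerate).join('\n');
    case 'ExpressionStatement':
      return sketchGenerate(node.expression) + ';';
    case 'CallExpression':
      return (
        sketchGenerate(node.callee) +
        '(' +
        node.arguments.map(sketchGenerate).join(', ') +
        ')'
      );
    case 'Identifier':
      return node.name;
    case 'NumberLiteral':
      return node.value;
    default:
      throw new TypeError(node.type);
  }
}

const targetAst = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'add' },
      arguments: [
        { type: 'NumberLiteral', value: '2' },
        {
          type: 'CallExpression',
          callee: { type: 'Identifier', name: 'subtract' },
          arguments: [
            { type: 'NumberLiteral', value: '4' },
            { type: 'NumberLiteral', value: '2' },
          ],
        },
      ],
    },
  }],
};

console.log(sketchGenerate(targetAst)); // add(2, subtract(4, 2));
```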
|
||||
|
||||
---
|
||||
|
||||
And that's it! That's all the different pieces of a compiler.
|
||||
|
||||
Now that isn't to say every compiler looks exactly like I described here.
|
||||
Compilers serve many different purposes, and they might need more steps than I
|
||||
have detailed.
|
||||
|
||||
But now you should have a general high-level idea of what most compilers look
|
||||
like.
|
||||
|
||||
Now that I've explained all of this, you're all good to go write your own
|
||||
compilers right?
|
||||
|
||||
Just kidding, that's what I'm here to help with :P
|
||||
|
||||
So let's begin...
|
180
1-tokenizer.js
Normal file
@ -0,0 +1,180 @@
|
||||
/**
|
||||
* ============================================================================
|
||||
* (/^▽^)/
|
||||
* THE TOKENIZER!
|
||||
* ============================================================================
|
||||
*/
|
||||
|
||||
/**
|
||||
* We're gonna start off with our first phase of parsing, lexical analysis, with
|
||||
* the tokenizer.
|
||||
*
|
||||
* We're just going to take our string of code and break it down into an array
|
||||
* of tokens.
|
||||
*
|
||||
* (add 2 (subtract 4 2)) => [{ type: 'paren', value: '(' }, ...]
|
||||
*/
|
||||
|
||||
// We start by accepting an input string of code, and we're gonna set up two
|
||||
// things...
|
||||
function tokenizer(input) {
|
||||
|
||||
// A `current` variable for tracking our position in the code like a cursor.
|
||||
let current = 0;
|
||||
|
||||
// And a `tokens` array for pushing our tokens to.
|
||||
let tokens = [];
|
||||
|
||||
// We start by creating a `while` loop where we are setting up our `current`
|
||||
// variable to be incremented as much as we want `inside` the loop.
|
||||
//
|
||||
// We do this because we may want to increment `current` many times within a
|
||||
// single loop because our tokens can be any length.
|
||||
while (current < input.length) {
|
||||
|
||||
// We're also going to store the `current` character in the `input`.
|
||||
let char = input[current];
|
||||
|
||||
// The first thing we want to check for is an open parenthesis. This will
|
||||
// later be used for `CallExpression` but for now we only care about the
|
||||
// character.
|
||||
//
|
||||
// We check to see if we have an open parenthesis:
|
||||
if (char === '(') {
|
||||
|
||||
// If we do, we push a new token with the type `paren` and set the value
|
||||
// to an open parenthesis.
|
||||
tokens.push({
|
||||
type: 'paren',
|
||||
value: '(',
|
||||
});
|
||||
|
||||
// Then we increment `current`
|
||||
current++;
|
||||
|
||||
// And we `continue` onto the next cycle of the loop.
|
||||
continue;
|
||||
}
|
||||
|
||||
// Next we're going to check for a closing parenthesis. We do the same exact
|
||||
// thing as before: Check for a closing parenthesis, add a new token,
|
||||
// increment `current`, and `continue`.
|
||||
if (char === ')') {
|
||||
tokens.push({
|
||||
type: 'paren',
|
||||
value: ')',
|
||||
});
|
||||
current++;
|
||||
continue;
|
||||
}
|
||||
|
||||
// Moving on, we're now going to check for whitespace. This is interesting
|
||||
// because we care that whitespace exists to separate characters, but it
|
||||
// isn't actually important for us to store as a token. We would only throw
|
||||
// it out later.
|
||||
//
|
||||
// So here we're just going to test for existence and if it does exist we're
|
||||
// going to just `continue` on.
|
||||
let WHITESPACE = /\s/;
|
||||
if (WHITESPACE.test(char)) {
|
||||
current++;
|
||||
continue;
|
||||
}
|
||||
|
||||
// The next type of token is a number. This is different than what we have
|
||||
// seen before because a number could be any number of characters and we
|
||||
// want to capture the entire sequence of characters as one token.
|
||||
//
|
||||
// (add 123 456)
|
||||
// ^^^ ^^^
|
||||
// Only two separate tokens
|
||||
//
|
||||
// So we start this off when we encounter the first number in a sequence.
|
||||
let NUMBERS = /[0-9]/;
|
||||
if (NUMBERS.test(char)) {
|
||||
|
||||
// We're going to create a `value` string that we are going to push
|
||||
// characters to.
|
||||
let value = '';
|
||||
|
||||
// Then we're going to loop through each character in the sequence until
|
||||
// we encounter a character that is not a number, pushing each character
|
||||
// that is a number to our `value` and incrementing `current` as we go.
|
||||
while (NUMBERS.test(char)) {
|
||||
value += char;
|
||||
char = input[++current];
|
||||
}
|
||||
|
||||
// After that we push our `number` token to the `tokens` array.
|
||||
tokens.push({ type: 'number', value });
|
||||
|
||||
// And we continue on.
|
||||
continue;
|
||||
}
|
||||
|
||||
// We'll also add support for strings in our language which will be any
|
||||
// text surrounded by double quotes (").
|
||||
//
|
||||
// (concat "foo" "bar")
|
||||
// ^^^ ^^^ string tokens
|
||||
//
|
||||
// We'll start by checking for the opening quote:
|
||||
if (char === '"') {
|
||||
// Keep a `value` variable for building up our string token.
|
||||
let value = '';
|
||||
|
||||
// We'll skip the opening double quote in our token.
|
||||
char = input[++current];
|
||||
|
||||
// Then we'll iterate through each character until we reach another
|
||||
// double quote.
|
||||
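      while (char !== '"') {
        // Bail out on an unterminated string instead of looping forever.
        if (char === undefined) {
          throw new TypeError('Unterminated string literal');
        }
        value += char;
        char = input[++current];
      }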
|
||||
|
||||
// Skip the closing double quote.
|
||||
char = input[++current];
|
||||
|
||||
// And add our `string` token to the `tokens` array.
|
||||
tokens.push({ type: 'string', value });
|
||||
|
||||
continue;
|
||||
}
|
||||
|
||||
// The last type of token will be a `name` token. This is a sequence of
|
||||
// letters instead of numbers, that are the names of functions in our lisp
|
||||
// syntax.
|
||||
//
|
||||
// (add 2 4)
|
||||
// ^^^
|
||||
// Name token
|
||||
//
|
||||
let LETTERS = /[a-z]/i;
|
||||
if (LETTERS.test(char)) {
|
||||
let value = '';
|
||||
|
||||
// Again we're just going to loop through all the letters pushing them to
|
||||
// a value.
|
||||
while (LETTERS.test(char)) {
|
||||
value += char;
|
||||
char = input[++current];
|
||||
}
|
||||
|
||||
// And pushing that value as a token with the type `name` and continuing.
|
||||
tokens.push({ type: 'name', value });
|
||||
|
||||
continue;
|
||||
}
|
||||
|
||||
// Finally if we have not matched a character by now, we're going to throw
|
||||
// an error and completely exit.
|
||||
throw new TypeError('I dont know what this character is: ' + char);
|
||||
}
|
||||
|
||||
// Then at the end of our `tokenizer` we simply return the tokens array.
|
||||
return tokens;
|
||||
}
|
||||
|
||||
// Just exporting our tokenizer to be used in the final compiler...
|
||||
module.exports = tokenizer;
|
161
2-parser.js
Normal file
@ -0,0 +1,161 @@
|
||||
/**
|
||||
* ============================================================================
|
||||
* ヽ/❀o ل͜ o\ノ
|
||||
* THE PARSER!!!
|
||||
* ============================================================================
|
||||
*/
|
||||
|
||||
/**
|
||||
* For our parser we're going to take our array of tokens and turn it into an
|
||||
* AST.
|
||||
*
|
||||
* [{ type: 'paren', value: '(' }, ...] => { type: 'Program', body: [...] }
|
||||
*/
|
||||
|
||||
// Okay, so we define a `parser` function that accepts our array of `tokens`.
|
||||
function parser(tokens) {
|
||||
|
||||
// Again we keep a `current` variable that we will use as a cursor.
|
||||
let current = 0;
|
||||
|
||||
// But this time we're going to use recursion instead of a `while` loop. So we
|
||||
// define a `walk` function.
|
||||
function walk() {
|
||||
|
||||
// Inside the walk function we start by grabbing the `current` token.
|
||||
let token = tokens[current];
|
||||
|
||||
// We're going to split each type of token off into a different code path,
|
||||
// starting off with `number` tokens.
|
||||
//
|
||||
// We test to see if we have a `number` token.
|
||||
if (token.type === 'number') {
|
||||
|
||||
// If we have one, we'll increment `current`.
|
||||
current++;
|
||||
|
||||
      // And we'll return a new AST node called `NumberLiteral`, setting its
      // value to the value of our token.
|
||||
return {
|
||||
type: 'NumberLiteral',
|
||||
value: token.value,
|
||||
};
|
||||
}
|
||||
|
||||
// If we have a string we will do the same as number and create a
|
||||
// `StringLiteral` node.
|
||||
if (token.type === 'string') {
|
||||
current++;
|
||||
|
||||
return {
|
||||
type: 'StringLiteral',
|
||||
value: token.value,
|
||||
};
|
||||
}
|
||||
|
||||
// Next we're going to look for CallExpressions. We start this off when we
|
||||
// encounter an open parenthesis.
|
||||
if (
|
||||
token.type === 'paren' &&
|
||||
token.value === '('
|
||||
) {
|
||||
|
||||
// We'll increment `current` to skip the parenthesis since we don't care
|
||||
// about it in our AST.
|
||||
token = tokens[++current];
|
||||
|
||||
// We create a base node with the type `CallExpression`, and we're going
|
||||
// to set the name as the current token's value since the next token after
|
||||
// the open parenthesis is the name of the function.
|
||||
let node = {
|
||||
type: 'CallExpression',
|
||||
name: token.value,
|
||||
params: [],
|
||||
};
|
||||
|
||||
// We increment `current` *again* to skip the name token.
|
||||
token = tokens[++current];
|
||||
|
||||
// And now we want to loop through each token that will be the `params` of
|
||||
// our `CallExpression` until we encounter a closing parenthesis.
|
||||
//
|
||||
// Now this is where recursion comes in. Instead of trying to parse a
|
||||
// potentially infinitely nested set of nodes we're going to rely on
|
||||
// recursion to resolve things.
|
||||
//
|
||||
// To explain this, let's take our Lisp code. You can see that the
|
||||
// parameters of the `add` are a number and a nested `CallExpression` that
|
||||
// includes its own numbers.
|
||||
//
|
||||
// (add 2 (subtract 4 2))
|
||||
//
|
||||
      // You'll also notice that in our tokens array we have multiple closing
      // parentheses.
|
||||
//
|
||||
// [
|
||||
// { type: 'paren', value: '(' },
|
||||
// { type: 'name', value: 'add' },
|
||||
// { type: 'number', value: '2' },
|
||||
// { type: 'paren', value: '(' },
|
||||
// { type: 'name', value: 'subtract' },
|
||||
// { type: 'number', value: '4' },
|
||||
// { type: 'number', value: '2' },
|
||||
// { type: 'paren', value: ')' }, <<< Closing parenthesis
|
||||
// { type: 'paren', value: ')' }, <<< Closing parenthesis
|
||||
// ]
|
||||
//
|
||||
// We're going to rely on the nested `walk` function to increment our
|
||||
// `current` variable past any nested `CallExpression`.
|
||||
|
||||
// So we create a `while` loop that will continue until it encounters a
|
||||
// token with a `type` of `'paren'` and a `value` of a closing
|
||||
// parenthesis.
|
||||
while (
|
||||
(token.type !== 'paren') ||
|
||||
(token.type === 'paren' && token.value !== ')')
|
||||
) {
|
||||
// we'll call the `walk` function which will return a `node` and we'll
|
||||
// push it into our `node.params`.
|
||||
node.params.push(walk());
|
||||
token = tokens[current];
|
||||
}
|
||||
|
||||
// Finally we will increment `current` one last time to skip the closing
|
||||
// parenthesis.
|
||||
current++;
|
||||
|
||||
// And return the node.
|
||||
return node;
|
||||
}
|
||||
|
||||
// Again, if we haven't recognized the token type by now we're going to
|
||||
// throw an error.
|
||||
throw new TypeError(token.type);
|
||||
}
|
||||
|
||||
// Now, we're going to create our AST which will have a root which is a
|
||||
// `Program` node.
|
||||
let ast = {
|
||||
type: 'Program',
|
||||
body: [],
|
||||
};
|
||||
|
||||
// And we're going to kickstart our `walk` function, pushing nodes to our
|
||||
// `ast.body` array.
|
||||
//
|
||||
// The reason we are doing this inside a loop is because our program can have
|
||||
// `CallExpression` after one another instead of being nested.
|
||||
//
|
||||
// (add 2 2)
|
||||
// (subtract 4 2)
|
||||
//
|
||||
while (current < tokens.length) {
|
||||
ast.body.push(walk());
|
||||
}
|
||||
|
||||
// At the end of our parser we'll return the AST.
|
||||
return ast;
|
||||
}
|
||||
|
||||
// Just exporting our parser to be used in the final compiler...
|
||||
module.exports = parser;
|
97
3-traverser.js
Normal file
@ -0,0 +1,97 @@
|
||||
/**
|
||||
* ============================================================================
|
||||
* ⌒(❀>◞౪◟<❀)⌒
|
||||
* THE TRAVERSER!!!
|
||||
* ============================================================================
|
||||
*/
|
||||
|
||||
/**
|
||||
* So now we have our AST, and we want to be able to visit different nodes with
|
||||
* a visitor. We need to be able to call the methods on the visitor whenever we
|
||||
* encounter a node with a matching type.
|
||||
*
|
||||
 *   traverser(ast, {
 *     Program: {
 *       enter(node, parent) {
 *         // ...
 *       },
 *       exit(node, parent) {
 *         // ...
 *       },
 *     },
 *
 *     CallExpression: {
 *       enter(node, parent) {
 *         // ...
 *       },
 *       exit(node, parent) {
 *         // ...
 *       },
 *     },
 *
 *     NumberLiteral: {
 *       enter(node, parent) {
 *         // ...
 *       },
 *       exit(node, parent) {
 *         // ...
 *       },
 *     },
 *   });
|
||||
*/
|
||||
|
||||
// So we define a traverser function which accepts an AST and a
|
||||
// visitor. Inside we're going to define two functions...
|
||||
function traverser(ast, visitor) {
|
||||
|
||||
// A `traverseArray` function that will allow us to iterate over an array and
|
||||
// call the next function that we will define: `traverseNode`.
|
||||
function traverseArray(array, parent) {
|
||||
array.forEach(child => {
|
||||
traverseNode(child, parent);
|
||||
});
|
||||
}
|
||||
|
||||
  // `traverseNode` will accept a `node` and its `parent` node, so that it can
  // pass both to our visitor methods.
|
||||
function traverseNode(node, parent) {
|
||||
|
||||
// We start by testing for the existence of a method on the visitor with a
|
||||
// matching `type`.
|
||||
let methods = visitor[node.type];
|
||||
|
||||
// If there is an `enter` method for this node type we'll call it with the
|
||||
// `node` and its `parent`.
|
||||
if (methods && methods.enter) {
|
||||
methods.enter(node, parent);
|
||||
}
|
||||
|
||||
// Next we are going to split things up by the current node type.
|
||||
switch (node.type) {
|
||||
|
||||
// We'll start with our top level `Program`. Since Program nodes have a
|
||||
// property named body that has an array of nodes, we will call
|
||||
// `traverseArray` to traverse down into them.
|
||||
//
|
||||
// (Remember that `traverseArray` will in turn call `traverseNode` so we
|
||||
// are causing the tree to be traversed recursively)
|
||||
case 'Program':
|
||||
traverseArray(node.body, node);
|
||||
break;
|
||||
|
||||
// Next we do the same with `CallExpression` and traverse their `params`.
|
||||
case 'CallExpression':
|
||||
traverseArray(node.params, node);
|
||||
break;
|
||||
|
||||
// In the cases of `NumberLiteral` and `StringLiteral` we don't have any
|
||||
// child nodes to visit, so we'll just break.
|
||||
case 'NumberLiteral':
|
||||
case 'StringLiteral':
|
||||
break;
|
||||
|
||||
// And again, if we haven't recognized the node type then we'll throw an
|
||||
// error.
|
||||
default:
|
||||
throw new TypeError(node.type);
|
||||
}
|
||||
|
||||
// If there is an `exit` method for this node type we'll call it with the
|
||||
// `node` and its `parent`.
|
||||
if (methods && methods.exit) {
|
||||
methods.exit(node, parent);
|
||||
}
|
||||
}
|
||||
|
||||
// Finally we kickstart the traverser by calling `traverseNode` with our ast
|
||||
// with no `parent` because the top level of the AST doesn't have a parent.
|
||||
traverseNode(ast, null);
|
||||
}
|
||||
|
||||
// Just exporting our traverser to be used in the final compiler...
|
||||
module.exports = traverser;
|
142
4-transformer.js
Normal file
@ -0,0 +1,142 @@
|
||||
var traverser = require('./3-traverser');
|
||||
|
||||
/**
|
||||
* ============================================================================
|
||||
* ⁽(◍˃̵͈̑ᴗ˂̵͈̑)⁽
|
||||
* THE TRANSFORMER!!!
|
||||
* ============================================================================
|
||||
*/
|
||||
|
||||
/**
|
||||
* Next up, the transformer. Our transformer is going to take the AST that we
|
||||
* have built and pass it to our traverser function with a visitor and will
|
||||
* create a new ast.
|
||||
*
|
||||
 * ----------------------------------------------------------------------------
 *   Original AST                     |   Transformed AST
 * ----------------------------------------------------------------------------
 *   {                                |   {
 *     type: 'Program',               |     type: 'Program',
 *     body: [{                       |     body: [{
 *       type: 'CallExpression',      |       type: 'ExpressionStatement',
 *       name: 'add',                 |       expression: {
 *       params: [{                   |         type: 'CallExpression',
 *         type: 'NumberLiteral',     |         callee: {
 *         value: '2'                 |           type: 'Identifier',
 *       }, {                         |           name: 'add'
 *         type: 'CallExpression',    |         },
 *         name: 'subtract',          |         arguments: [{
 *         params: [{                 |           type: 'NumberLiteral',
 *           type: 'NumberLiteral',   |           value: '2'
 *           value: '4'               |         }, {
 *         }, {                       |           type: 'CallExpression',
 *           type: 'NumberLiteral',   |           callee: {
 *           value: '2'               |             type: 'Identifier',
 *         }]                         |             name: 'subtract'
 *       }]                           |           },
 *     }]                             |           arguments: [{
 *   }                                |             type: 'NumberLiteral',
 *                                    |             value: '4'
 * ---------------------------------- |           }, {
 *                                    |             type: 'NumberLiteral',
 *                                    |             value: '2'
 *                                    |           }]
 *  (sorry the other one is longer.)  |         }]
 *                                    |       }
 *                                    |     }]
 *                                    |   }
 * ----------------------------------------------------------------------------
|
||||
*/
|
||||
|
||||
// So we have our transformer function which will accept the lisp ast.
|
||||
function transformer(ast) {
|
||||
|
||||
// We'll create a `newAst` which like our previous AST will have a program
|
||||
// node.
|
||||
let newAst = {
|
||||
type: 'Program',
|
||||
body: [],
|
||||
};
|
||||
|
||||
  // Next I'm going to cheat a little and create a bit of a hack. We're going to
  // put a property named `_context` on our parent nodes, and each visited node
  // will push its new node onto its parent's `_context`. Normally you would
  // have a better abstraction than this, but for our purposes this keeps
  // things simple.
|
||||
//
|
||||
// Just take note that the context is a reference *from* the old ast *to* the
|
||||
// new ast.
|
||||
ast._context = newAst.body;
|
||||
|
||||
// We'll start by calling the traverser function with our ast and a visitor.
|
||||
traverser(ast, {
|
||||
|
||||
// The first visitor method accepts any `NumberLiteral`
|
||||
NumberLiteral: {
|
||||
// We'll visit them on enter.
|
||||
enter(node, parent) {
|
||||
// We'll create a new node also named `NumberLiteral` that we will push to
|
||||
// the parent context.
|
||||
parent._context.push({
|
||||
type: 'NumberLiteral',
|
||||
value: node.value,
|
||||
});
|
||||
},
|
||||
},
|
||||
|
||||
// Next we have `StringLiteral`
|
||||
StringLiteral: {
|
||||
enter(node, parent) {
|
||||
parent._context.push({
|
||||
type: 'StringLiteral',
|
||||
value: node.value,
|
||||
});
|
||||
},
|
||||
},
|
||||
|
||||
// Next up, `CallExpression`.
|
||||
CallExpression: {
|
||||
enter(node, parent) {
|
||||
|
||||
// We start creating a new node `CallExpression` with a nested
|
||||
// `Identifier`.
|
||||
let expression = {
|
||||
type: 'CallExpression',
|
||||
callee: {
|
||||
type: 'Identifier',
|
||||
name: node.name,
|
||||
},
|
||||
arguments: [],
|
||||
};
|
||||
|
||||
// Next we're going to define a new context on the original
|
||||
// `CallExpression` node that will reference the `expression`'s arguments
|
||||
// so that we can push arguments.
|
||||
node._context = expression.arguments;
|
||||
|
||||
// Then we're going to check if the parent node is a `CallExpression`.
|
||||
// If it is not...
|
||||
if (parent.type !== 'CallExpression') {
|
||||
|
||||
// We're going to wrap our `CallExpression` node with an
|
||||
// `ExpressionStatement`. We do this because the top level
|
||||
// `CallExpression` in JavaScript are actually statements.
|
||||
expression = {
|
||||
type: 'ExpressionStatement',
|
||||
expression: expression,
|
||||
};
|
||||
}
|
||||
|
||||
// Last, we push our (possibly wrapped) `CallExpression` to the `parent`'s
|
||||
// `context`.
|
||||
parent._context.push(expression);
|
||||
},
|
||||
}
|
||||
});
|
||||
|
||||
// At the end of our transformer function we'll return the new ast that we
|
||||
// just created.
|
||||
return newAst;
|
||||
}
|
||||
|
||||
// Just exporting our transformer to be used in the final compiler...
|
||||
module.exports = transformer;
|
66
5-code-generator.js
Normal file
@ -0,0 +1,66 @@
|
||||
/**
|
||||
* ============================================================================
|
||||
* ヾ(〃^∇^)ノ♪
|
||||
* THE CODE GENERATOR!!!!
|
||||
* ============================================================================
|
||||
*/
|
||||
|
||||
/**
|
||||
* Now let's move onto our last phase: The Code Generator.
|
||||
*
|
||||
* Our code generator is going to recursively call itself to print each node in
|
||||
* the tree into one giant string.
|
||||
*/
|
||||
|
||||
function codeGenerator(node) {
|
||||
|
||||
// We'll break things down by the `type` of the `node`.
|
||||
switch (node.type) {
|
||||
|
||||
    // If we have a `Program` node, we will map through each node in the `body`
    // and run them through the code generator and join them with a newline.
|
||||
case 'Program':
|
||||
return node.body.map(codeGenerator)
|
||||
.join('\n');
|
||||
|
||||
// For `ExpressionStatement` we'll call the code generator on the nested
|
||||
// expression and we'll add a semicolon...
|
||||
case 'ExpressionStatement':
|
||||
return (
|
||||
codeGenerator(node.expression) +
|
||||
';' // << (...because we like to code the *correct* way)
|
||||
);
|
||||
|
||||
// For `CallExpression` we will print the `callee`, add an open
|
||||
// parenthesis, we'll map through each node in the `arguments` array and run
|
||||
// them through the code generator, joining them with a comma, and then
|
||||
// we'll add a closing parenthesis.
|
||||
case 'CallExpression':
|
||||
return (
|
||||
codeGenerator(node.callee) +
|
||||
'(' +
|
||||
node.arguments.map(codeGenerator)
|
||||
.join(', ') +
|
||||
')'
|
||||
);
|
||||
|
||||
// For `Identifier` we'll just return the `node`'s name.
|
||||
case 'Identifier':
|
||||
return node.name;
|
||||
|
||||
// For `NumberLiteral` we'll just return the `node`'s value.
|
||||
case 'NumberLiteral':
|
||||
return node.value;
|
||||
|
||||
// For `StringLiteral` we'll add quotations around the `node`'s value.
|
||||
case 'StringLiteral':
|
||||
return '"' + node.value + '"';
|
||||
|
||||
// And if we haven't recognized the node, we'll throw an error.
|
||||
default:
|
||||
throw new TypeError(node.type);
|
||||
}
|
||||
}
|
||||
|
||||
// Just exporting our code generator to be used in the final compiler...
|
||||
module.exports = codeGenerator;
|
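To see the recursion in action, here is a minimal sketch that runs a hand-built tree through a generator. It is self-contained, so `generate` is a hypothetical, condensed stand-in for `codeGenerator` from `5-code-generator.js`, and `sampleAst` (including the `concat` call) is made up for illustration rather than taken from the project.

```javascript
// Hypothetical condensed stand-in for `codeGenerator`, repeated here only so
// that this sketch runs on its own.
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'ExpressionStatement':
      return generate(node.expression) + ';';
    case 'CallExpression':
      return (
        generate(node.callee) +
        '(' + node.arguments.map(generate).join(', ') + ')'
      );
    case 'Identifier':
      return node.name;
    case 'NumberLiteral':
      return node.value;
    case 'StringLiteral':
      return '"' + node.value + '"';
    default:
      throw new TypeError(node.type);
  }
}

// Small helper (also hypothetical) to keep the sample tree readable.
function call(name, args) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: name },
      arguments: args
    }
  };
}

// Two statements: a nested numeric call and a call with string arguments.
var sampleAst = {
  type: 'Program',
  body: [
    call('add', [
      { type: 'NumberLiteral', value: '2' },
      {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'subtract' },
        arguments: [
          { type: 'NumberLiteral', value: '4' },
          { type: 'NumberLiteral', value: '2' }
        ]
      }
    ]),
    call('concat', [
      { type: 'StringLiteral', value: 'foo' },
      { type: 'StringLiteral', value: 'bar' }
    ])
  ]
};

var generated = generate(sampleAst);
console.log(generated);
```

Each `Program` statement ends up on its own line, and the nested `subtract` call is printed without a semicolon because only `ExpressionStatement` nodes add one.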
50
6-compiler.js
Normal file
@ -0,0 +1,50 @@
var tokenizer     = require('./1-tokenizer');
var parser        = require('./2-parser');
// Note: The traverser is only used inside of the transformer...
var transformer   = require('./4-transformer');
var codeGenerator = require('./5-code-generator');

/**
 * ============================================================================
 *                                  (۶* ‘ヮ’)۶”
 *                         !!!!!!!!THE COMPILER!!!!!!!!
 * ============================================================================
 */

/**
 * FINALLY! We'll create our `compiler` function. Here we will link together
 * every part of the pipeline.
 *
 *   1. input  => tokenizer   => tokens
 *   2. tokens => parser      => ast
 *   3. ast    => transformer => newAst
 *   4. newAst => generator   => output
 */

function compiler(input) {
  let tokens = tokenizer(input);
  let ast    = parser(tokens);
  let newAst = transformer(ast);
  let output = codeGenerator(newAst);

  // and simply return the output!
  return output;
}

/**
 * ============================================================================
 *                                   (๑˃̵ᴗ˂̵)و
 * !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!YOU MADE IT!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 * ============================================================================
 */

/**
 * Now, if you enjoyed this, please give it a star on GitHub and follow me on
 * Twitter (the links are up on the top right).
 *
 * You can also play around with this code/website on glitch.com (link is also
 * on top right).
 */

// Just exporting our compiler to be used in the tests
module.exports = compiler;
|
2
LICENSE
Executable file → Normal file
@ -390,4 +390,4 @@ understandings, or agreements concerning use of licensed material. For
|
||||
the avoidance of doubt, this paragraph does not form part of the public
|
||||
licenses.
|
||||
|
||||
Creative Commons may be contacted at creativecommons.org.
|
||||
Creative Commons may be contacted at creativecommons.org.
|
43
README.md
Executable file → Normal file
@ -1,24 +1,21 @@
|
||||
[](the-super-tiny-compiler.js)
|
||||
# Welcome to The Super Tiny Compiler!
|
||||
|
||||
***Welcome to The Super Tiny Compiler!***
|
||||
***An [@thejameskyle](http://thejameskyle.com/) production***
|
||||
|
||||
This is an ultra-simplified example of all the major pieces of a modern compiler
|
||||
written in easy to read JavaScript.
|
||||
---
|
||||
|
||||
This is an ultra-simplified example of all the major pieces of a modern
|
||||
compiler written in easy to read JavaScript.
|
||||
|
||||
Reading through the guided code will help you learn about how *most* compilers
|
||||
work from end to end.
|
||||
|
||||
### [Want to jump into the code? Click here](the-super-tiny-compiler.js)
|
||||
|
||||
### [You can also check it out on Glitch](https://the-super-tiny-compiler.glitch.me/)
|
||||
|
||||
---
|
||||
|
||||
### Why should I care?
|
||||
|
||||
That's fair, most people don't really have to think about compilers in their day
|
||||
jobs. However, compilers are all around you, tons of the tools you use are based
|
||||
on concepts borrowed from compilers.
|
||||
That's fair, most people don't really have to think about compilers in their
|
||||
day jobs. However, compilers are all around you, tons of the tools you use are
|
||||
based on concepts borrowed from compilers. These are really useful concepts to
|
||||
have at your disposal.
|
||||
|
||||
### But compilers are scary!
|
||||
|
||||
@ -27,22 +24,14 @@ taken something that is reasonably straightforward and made it so scary that
|
||||
most think of it as this totally unapproachable thing that only the nerdiest of
|
||||
the nerds are able to understand.
|
||||
|
||||
I've done my best to try and keep this from being a scary experience. I hope
|
||||
that reading this will be a positive learning experience for you.
|
||||
|
||||
### Okay so where do I begin?
|
||||
|
||||
Awesome! Head on over to the [the-super-tiny-compiler.js](the-super-tiny-compiler.js)
|
||||
file.
|
||||
|
||||
### I'm back, that didn't make sense
|
||||
|
||||
Ouch, I'm really sorry. I'm planning on doing a lot more work on this to add
|
||||
inline annotations. If you want to come back when that's done, you can either
|
||||
watch/star this repo or follow me on
|
||||
[twitter](https://twitter.com/thejameskyle) for updates.
|
||||
|
||||
### Tests
|
||||
|
||||
Run with `node test.js`
|
||||
Awesome! Head on over to [**0-introduction.md**](./intro) and then
|
||||
work your way down the list of files.
|
||||
|
||||
---
|
||||
|
||||
[](http://creativecommons.org/licenses/by/4.0/)
|
||||
[](http://creativecommons.org/licenses/by/4.0/)
|
14
package.json
@ -1,7 +1,15 @@
|
||||
{
|
||||
"name": "the-super-tiny-compiler",
|
||||
"version": "1.0.0",
|
||||
"author": "James Kyle <me@thejameskyle.com> (thejameskyle.com)",
|
||||
"license": "CC-BY-4.0",
|
||||
"main": "./the-super-tiny-compiler.js"
|
||||
}
|
||||
"repository": "thejameskyle/the-super-tiny-compiler",
|
||||
"dependencies": {
|
||||
"express": "^4.15.2",
|
||||
"markdown-it": "^8.3.1",
|
||||
"ejs": "^2.5.6",
|
||||
"prismjs": "^9000.0.1"
|
||||
},
|
||||
"scripts": {
|
||||
"start": "node server.js"
|
||||
}
|
||||
}
|
70
server.js
Normal file
@ -0,0 +1,70 @@
var markdown = require('markdown-it')();
var Prism = require('prismjs');
var express = require('express');
var path = require('path');
var ejs = require('ejs');
var fs = require('fs');

var app = express();

var ROUTES_MAP = {
  '/'               : 'README.md',
  '/intro'          : '0-introduction.md',
  '/tokenizer'      : '1-tokenizer.js',
  '/parser'         : '2-parser.js',
  '/traverser'      : '3-traverser.js',
  '/transformer'    : '4-transformer.js',
  '/code-generator' : '5-code-generator.js',
  '/compiler'       : '6-compiler.js'
};

var routes = Object.keys(ROUTES_MAP).map(function(routePath) {
  return {
    routePath: routePath,
    routeName: ROUTES_MAP[routePath]
  };
});

function readFile(fileName) {
  return fs.readFileSync(path.join(__dirname, fileName)).toString();
}

function renderMarkdown(fileContents) {
  return markdown.render(fileContents);
}

function renderJavaScript(fileName, fileContents) {
  return Prism.highlight(fileContents, Prism.languages.javascript);
}

var template = ejs.compile(readFile('./template.html.ejs'));

function render(routeName) {
  var fileName = routeName;
  var fileContents = readFile(fileName);

  var extName = path.extname(fileName);
  if (extName === '.md') fileContents = renderMarkdown(fileContents);
  if (extName === '.js') fileContents = renderJavaScript(fileName, fileContents);

  let isCode = extName !== '.md';

  return template({
    routes: routes,
    fileName: fileName,
    fileContents: fileContents,
    isCode: isCode,
  });
}

routes.forEach(function(route) {
  var html = render(route.routeName);

  app.get(route.routePath, function(req, res) {
    res.send(html);
  });
});

var listener = app.listen(process.env.PORT, function () {
  console.log('Your app is listening on port ' + listener.address().port);
});
|
320
template.html.ejs
Normal file
@ -0,0 +1,320 @@
|
||||
<!doctype html>
|
||||
<html <% if (isCode) { %>class="is-code"<% } %>>
|
||||
<head>
|
||||
<title>The Super Tiny Compiler - <%= fileName %></title>
|
||||
<meta name="description" content="">
|
||||
<link id="favicon" rel="icon" href="https://glitch.com/edit/favicon-app.ico" type="image/x-icon">
|
||||
<meta charset="utf-8">
|
||||
<meta http-equiv="X-UA-Compatible" content="IE=edge">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
||||
<style>
|
||||
* {
|
||||
box-sizing: border-box;
|
||||
}
|
||||
|
||||
body {
|
||||
font: normal 1em/1.5 Consolas, monaco, monospace;
|
||||
}
|
||||
|
||||
html, body, #app {
|
||||
position: relative;
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
margin: 0;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
html.is-code,
|
||||
.is-code body {
|
||||
background: black;
|
||||
}
|
||||
|
||||
header {
|
||||
position: absolute;
|
||||
top: 0;
|
||||
height: 2em;
|
||||
width: 100%;
|
||||
background: blue;
|
||||
color: white;
|
||||
line-height: 2em;
|
||||
}
|
||||
|
||||
header a {
|
||||
float: left;
|
||||
color: inherit;
|
||||
padding: 0 0.5em;
|
||||
}
|
||||
|
||||
header a:hover {
|
||||
background: white;
|
||||
color: blue;
|
||||
}
|
||||
|
||||
header .right {
|
||||
float: right;
|
||||
}
|
||||
|
||||
nav {
|
||||
position: absolute;
|
||||
top: 2em;
|
||||
bottom: 0;
|
||||
left: 0;
|
||||
width: 300px;
|
||||
background: black;
|
||||
overflow: auto;
|
||||
padding: 2em 0;
|
||||
border-right: 4px solid white;
|
||||
}
|
||||
|
||||
nav a {
|
||||
display: block;
|
||||
padding: 0.25em 2em;
|
||||
color: white;
|
||||
text-decoration: none;
|
||||
}
|
||||
|
||||
nav a.active {
|
||||
background: white;
|
||||
color: black;
|
||||
}
|
||||
|
||||
main {
|
||||
position: absolute;
|
||||
top: 2em;
|
||||
bottom: 0;
|
||||
left: 300px;
|
||||
right: 0;
|
||||
overflow: auto;
|
||||
padding-bottom: 25%;
|
||||
}
|
||||
|
||||
.container {
|
||||
margin: 0 auto;
|
||||
max-width: 960px;
|
||||
padding: 2em;
|
||||
}
|
||||
|
||||
img {
|
||||
max-width: 100%;
|
||||
height: auto;
|
||||
}
|
||||
|
||||
hr {
|
||||
border: none;
|
||||
border-top: 4px solid black;
|
||||
}
|
||||
|
||||
pre, code {
|
||||
font: inherit;
|
||||
color: white;
|
||||
background: black;
|
||||
}
|
||||
|
||||
code {
|
||||
padding: 0 0.2em;
|
||||
}
|
||||
|
||||
pre {
|
||||
padding: 1em;
|
||||
}
|
||||
|
||||
pre code {
|
||||
padding: 0;
|
||||
}
|
||||
|
||||
table {
|
||||
width: 100%;
|
||||
border: 4px solid black;
|
||||
}
|
||||
|
||||
td, th {
|
||||
padding: 0.5em;
|
||||
text-align: left;
|
||||
}
|
||||
|
||||
.container > ul,
|
||||
.container > ol {
|
||||
padding: 0 1em;
|
||||
padding-left: 3em;
|
||||
border: 4px solid black;
|
||||
}
|
||||
|
||||
ul {
|
||||
list-style: square;
|
||||
}
|
||||
|
||||
li {
|
||||
margin: 1em 0;
|
||||
}
|
||||
|
||||
#code {
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* okaidia theme for JavaScript, CSS and HTML
|
||||
* Loosely based on Monokai textmate theme by http://www.monokai.nl/
|
||||
* @author ocodia
|
||||
*/
|
||||
|
||||
code[class*="language-"],
|
||||
pre[class*="language-"] {
|
||||
color: #f8f8f2;
|
||||
background: none;
|
||||
text-shadow: 0 1px rgba(0, 0, 0, 0.3);
|
||||
font-family: Consolas, Monaco, 'Andale Mono', 'Ubuntu Mono', monospace;
|
||||
text-align: left;
|
||||
white-space: pre;
|
||||
word-spacing: normal;
|
||||
word-break: normal;
|
||||
word-wrap: normal;
|
||||
line-height: 1.5;
|
||||
|
||||
-moz-tab-size: 4;
|
||||
-o-tab-size: 4;
|
||||
tab-size: 4;
|
||||
|
||||
-webkit-hyphens: none;
|
||||
-moz-hyphens: none;
|
||||
-ms-hyphens: none;
|
||||
hyphens: none;
|
||||
}
|
||||
|
||||
/* Code blocks */
|
||||
pre[class*="language-"] {
|
||||
padding: 1em;
|
||||
margin: .5em 0;
|
||||
overflow: auto;
|
||||
border-radius: 0.3em;
|
||||
}
|
||||
|
||||
:not(pre) > code[class*="language-"],
|
||||
pre[class*="language-"] {
|
||||
background: #272822;
|
||||
}
|
||||
|
||||
/* Inline code */
|
||||
:not(pre) > code[class*="language-"] {
|
||||
padding: .1em;
|
||||
border-radius: .3em;
|
||||
white-space: normal;
|
||||
}
|
||||
|
||||
.token.comment,
|
||||
.token.prolog,
|
||||
.token.doctype,
|
||||
.token.cdata {
|
||||
color: slategray;
|
||||
}
|
||||
|
||||
.token.punctuation {
|
||||
color: #f8f8f2;
|
||||
}
|
||||
|
||||
.namespace {
|
||||
opacity: .7;
|
||||
}
|
||||
|
||||
.token.property,
|
||||
.token.tag,
|
||||
.token.constant,
|
||||
.token.symbol,
|
||||
.token.deleted {
|
||||
color: #f92672;
|
||||
}
|
||||
|
||||
.token.boolean,
|
||||
.token.number {
|
||||
color: #ae81ff;
|
||||
}
|
||||
|
||||
.token.selector,
|
||||
.token.attr-name,
|
||||
.token.string,
|
||||
.token.char,
|
||||
.token.builtin,
|
||||
.token.inserted {
|
||||
color: #a6e22e;
|
||||
}
|
||||
|
||||
.token.operator,
|
||||
.token.entity,
|
||||
.token.url,
|
||||
.language-css .token.string,
|
||||
.style .token.string,
|
||||
.token.variable {
|
||||
color: #f8f8f2;
|
||||
}
|
||||
|
||||
.token.atrule,
|
||||
.token.attr-value,
|
||||
.token.function {
|
||||
color: #e6db74;
|
||||
}
|
||||
|
||||
.token.keyword {
|
||||
color: #66d9ef;
|
||||
}
|
||||
|
||||
.token.regex,
|
||||
.token.important {
|
||||
color: #fd971f;
|
||||
}
|
||||
|
||||
.token.important,
|
||||
.token.bold {
|
||||
font-weight: bold;
|
||||
}
|
||||
.token.italic {
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.token.entity {
|
||||
cursor: help;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div id="app">
|
||||
<header>
|
||||
<a href="https://github.com/thejameskyle/the-super-tiny-compiler">
|
||||
/Users/thejameskyle/code/the-super-tiny-compiler/<%= fileName %>
|
||||
</a>
|
||||
|
||||
<a class="right" href="https://github.com/thejameskyle/the-super-tiny-compiler">
|
||||
Star this in GitHub
|
||||
</a>
|
||||
|
||||
<a class="right" href="https://twitter.com/thejameskyle">
|
||||
Follow me on Twitter
|
||||
</a>
|
||||
|
||||
<a class="right" href="https://glitch.com/edit/#!/the-super-tiny-compiler">
|
||||
Remix this in Glitch
|
||||
</a>
|
||||
</header>
|
||||
|
||||
<nav>
|
||||
<% routes.forEach(function(route) { %>
|
||||
<a href="<%= route.routePath %>" <% if (fileName === route.routeName) { %>class="active"<% } %>>
|
||||
<%= route.routeName %>
|
||||
</a>
|
||||
<% }); %>
|
||||
</nav>
|
||||
|
||||
<main>
|
||||
<% if (isCode) { %>
|
||||
<pre id="code"><%- fileContents %></pre>
|
||||
<% } else { %>
|
||||
<div class="container">
|
||||
<%- fileContents %>
|
||||
</div>
|
||||
<% } %>
|
||||
|
||||
<% if (fileName === '6-compiler.js') { %>
|
||||
<img src="https://cdn.glitch.com/da026c15-c2dc-4ff8-bbed-d9d003c04338%2Ftumblr_mvemcyarmn1rslphyo1_400.gif?1492115698121" alt="Carlton Dance">
|
||||
<% } %>
|
||||
</main>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
50
test.js
Executable file → Normal file
@ -1,16 +1,22 @@
|
||||
const {
|
||||
tokenizer,
|
||||
parser,
|
||||
transformer,
|
||||
codeGenerator,
|
||||
compiler,
|
||||
} = require('./the-super-tiny-compiler');
|
||||
const assert = require('assert');
|
||||
var tokenizer = require('./1-tokenizer');
|
||||
var parser = require('./2-parser');
|
||||
// Note: The traverser is only used inside of the transformer...
|
||||
var transformer = require('./4-transformer');
|
||||
var codeGenerator = require('./5-code-generator');
|
||||
var compiler = require('./6-compiler');
|
||||
|
||||
const input = '(add 2 (subtract 4 2))';
|
||||
const output = 'add(2, subtract(4, 2));';
|
||||
|
||||
const tokens = [
|
||||
// assert is a Node.js utility for asserting values and throwing an error if
|
||||
// they aren't what you expect
|
||||
var assert = require('assert');
|
||||
|
||||
/**
|
||||
* Setting up all of the expected values throughout our compiler phases:
|
||||
*/
|
||||
var input = '(add 2 (subtract 4 2))';
|
||||
var output = 'add(2, subtract(4, 2));';
|
||||
|
||||
var tokens = [
|
||||
{ type: 'paren', value: '(' },
|
||||
{ type: 'name', value: 'add' },
|
||||
{ type: 'number', value: '2' },
|
||||
@ -22,7 +28,7 @@ const tokens = [
|
||||
{ type: 'paren', value: ')' }
|
||||
];
|
||||
|
||||
const ast = {
|
||||
var ast = {
|
||||
type: 'Program',
|
||||
body: [{
|
||||
type: 'CallExpression',
|
||||
@ -44,7 +50,7 @@ const ast = {
|
||||
}]
|
||||
};
|
||||
|
||||
const newAst = {
|
||||
var newAst = {
|
||||
type: 'Program',
|
||||
body: [{
|
||||
type: 'ExpressionStatement',
|
||||
@ -75,10 +81,16 @@ const newAst = {
|
||||
}]
|
||||
};
|
||||
|
||||
assert.deepStrictEqual(tokenizer(input), tokens, 'Tokenizer should turn `input` string into `tokens` array');
|
||||
assert.deepStrictEqual(parser(tokens), ast, 'Parser should turn `tokens` array into `ast`');
|
||||
assert.deepStrictEqual(transformer(ast), newAst, 'Transformer should turn `ast` into a `newAst`');
|
||||
assert.deepStrictEqual(codeGenerator(newAst), output, 'Code Generator should turn `newAst` into `output` string');
|
||||
assert.deepStrictEqual(compiler(input), output, 'Compiler should turn `input` into `output`');
|
||||
/**
|
||||
* Now let's write some assertions to make sure our compiler does everything we
|
||||
* want it to...
|
||||
*/
|
||||
|
||||
console.log('All Passed!');
|
||||
assert.deepStrictEqual( tokenizer(input), tokens, 'Tokenizer should turn `input` string into `tokens` array');
|
||||
assert.deepStrictEqual( parser(tokens), ast, 'Parser should turn `tokens` array into `ast`');
|
||||
assert.deepStrictEqual( transformer(ast), newAst, 'Transformer should turn `ast` into a `newAst`');
|
||||
assert.deepStrictEqual( codeGenerator(newAst), output, 'Code Generator should turn `newAst` into `output` string');
|
||||
assert.deepStrictEqual( compiler(input), output, 'Compiler should turn `input` into `output`');
|
||||
|
||||
// If none of the above asserts threw an error...
|
||||
console.log('All Passed!');
|
File diff suppressed because it is too large