Update the-super-tiny-compiler.js

Add Chinese translation
pull/85/head
HongJie Tao 3 months ago committed by GitHub
parent d8d4013045
commit 97c5770bed
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@ -98,6 +98,25 @@
* Well good, because this is exactly what we are going to compile. While this * Well good, because this is exactly what we are going to compile. While this
* is neither a complete LISP or C syntax, it will be enough of the syntax to * is neither a complete LISP or C syntax, it will be enough of the syntax to
* demonstrate many of the major pieces of a modern compiler. * demonstrate many of the major pieces of a modern compiler.
*
* 今天我们将一起编写一个编译器，一个超级小巧的编译器！
* 去除所有注释后，这个文件的实际代码行数将只有大约200行。
*
* 我们将把类似Lisp的函数调用编译成类似C的函数调用。
*
* 如果你不熟悉其中一种或两种语言，下面是一个简短的介绍。
*
* 假设我们有两个函数 `add` 和 `subtract`，它们的写法如下：
*
*   Lisp 语法                  C 语法
*   (add 2 2)                 add(2, 2)
*   (subtract 4 2)            subtract(4, 2)
*   (add 2 (subtract 4 2))    add(2, subtract(4, 2))
*
* 是不是很简单？
*
* 很好，因为这正是我们要编译的内容。虽然这既不是完整的Lisp语法，也不是完整的C语法，
* 但它足以展示现代编译器的许多核心组件。
*/ */
/** /**
@ -112,10 +131,20 @@
* *
* 3. *Code Generation* takes the transformed representation of the code and * 3. *Code Generation* takes the transformed representation of the code and
* turns it into new code. * turns it into new code.
*
*
* 大多数编译器可以分为三个主要阶段：解析（Parsing）、转换（Transformation）和代码生成（Code Generation）。
*
* 1. *解析* 是将原始代码转换为代码的更抽象表示。
*
* 2. *转换* 对这个抽象表示进行操作，以实现编译器希望其执行的任何操作。
*
* 3. *代码生成* 将已转换的代码表示转换为新代码。
*/ */
/** /**
* Parsing * Parsing
* 解析（Parsing）
* ------- * -------
* *
* Parsing typically gets broken down into two phases: Lexical Analysis and * Parsing typically gets broken down into two phases: Lexical Analysis and
@ -137,11 +166,23 @@
* represents code in a way that is both easy to work with and tells us a lot * represents code in a way that is both easy to work with and tells us a lot
* of information. * of information.
* *
* 解析通常被细分为两个阶段：词法分析（Lexical Analysis）和语法分析（Syntactic Analysis）。
*
* 1. *词法分析* 通过分词器（tokenizer，也称为词法分析器 lexer）将原始代码拆分为称为标记（tokens）的单元。
*
* 标记是一个数组，由描述语法中孤立片段的小对象组成。它们可以是数字、标签、标点符号、操作符等。
*
* 2. *语法分析* 将标记重新组合成一种表示形式，用来描述语法的各个部分及其相互关系。这被称为中间表示或抽象语法树（AST）。
*
* 抽象语法树（AST）是一个深度嵌套的对象，它以易于处理的方式表示代码，并提供了大量信息。
*
* For the following syntax: * For the following syntax:
* 对于以下语法：
* *
* (add 2 (subtract 4 2)) * (add 2 (subtract 4 2))
* *
* Tokens might look something like this: * Tokens might look something like this:
* 标记可能如下所示：
* *
* [ * [
* { type: 'paren', value: '(' }, * { type: 'paren', value: '(' },
@ -156,6 +197,7 @@
* ] * ]
* *
* And an Abstract Syntax Tree (AST) might look like this: * And an Abstract Syntax Tree (AST) might look like this:
* 而一个抽象语法树（AST）可能如下所示：
* *
* { * {
* type: 'Program', * type: 'Program',
@ -178,25 +220,34 @@
* }] * }]
* }] * }]
* } * }
*
*
*
*/ */
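The two parsing phases described in the comment above can be sketched roughly as follows. This is a minimal illustration rather than the file's actual implementation: the token types `number` and `name`, the node fields `value`, `name`, and `params`, and the error messages are assumptions chosen to fit the token and AST samples shown in the comment.

```javascript
// Lexical analysis: split the raw source into an array of token objects.
// Assumption: only parentheses, whitespace, digits, and letter-only names occur.
function tokenizer(input) {
  const tokens = [];
  let i = 0;
  while (i < input.length) {
    const char = input[i];
    if (char === '(' || char === ')') {
      tokens.push({ type: 'paren', value: char });
      i++;
    } else if (/\s/.test(char)) {
      i++; // whitespace only separates tokens
    } else if (/[0-9]/.test(char)) {
      let value = '';
      while (i < input.length && /[0-9]/.test(input[i])) value += input[i++];
      tokens.push({ type: 'number', value });
    } else if (/[a-z]/i.test(char)) {
      let value = '';
      while (i < input.length && /[a-z]/i.test(input[i])) value += input[i++];
      tokens.push({ type: 'name', value });
    } else {
      throw new TypeError('Unexpected character: ' + char);
    }
  }
  return tokens;
}

// Syntactic analysis: regroup the flat token list into a nested AST.
function parser(tokens) {
  let pos = 0;

  function walk() {
    const token = tokens[pos];
    if (token === undefined) throw new SyntaxError('Unexpected end of input');
    if (token.type === 'number') {
      pos++;
      return { type: 'NumberLiteral', value: token.value };
    }
    if (token.type === 'paren' && token.value === '(') {
      const nameToken = tokens[++pos]; // the function name follows '('
      if (!nameToken || nameToken.type !== 'name') {
        throw new SyntaxError('Expected a function name after "("');
      }
      const node = { type: 'CallExpression', name: nameToken.value, params: [] };
      pos++;
      // Collect parameters (possibly nested calls) until the closing paren.
      while (pos < tokens.length &&
             !(tokens[pos].type === 'paren' && tokens[pos].value === ')')) {
        node.params.push(walk());
      }
      if (pos >= tokens.length) throw new SyntaxError('Missing ")"');
      pos++; // skip ')'
      return node;
    }
    throw new SyntaxError('Unexpected token: ' + token.value);
  }

  const ast = { type: 'Program', body: [] };
  while (pos < tokens.length) ast.body.push(walk());
  return ast;
}

// Usage: print the AST for the example syntax from the comment.
console.log(JSON.stringify(parser(tokenizer('(add 2 (subtract 4 2))')), null, 2));
```

Keeping the two phases as separate functions mirrors the split described in the comment: the tokenizer knows nothing about nesting, and the parser never looks at raw characters.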
/** /**
* Transformation * Transformation
* 转换（Transformation）
* -------------- * --------------
* *
* The next type of stage for a compiler is transformation. Again, this just * The next type of stage for a compiler is transformation. Again, this just
* takes the AST from the last step and makes changes to it. It can manipulate * takes the AST from the last step and makes changes to it. It can manipulate
* the AST in the same language or it can translate it into an entirely new * the AST in the same language or it can translate it into an entirely new
* language. * language.
* 编译器的下一个阶段是转换。这个阶段同样基于上一步生成的AST，并对其进行修改。它可以在同一语言内操作AST，也可以将其转换为全新的语言。
*
* *
* Let's look at how we would transform an AST. * Let's look at how we would transform an AST.
* 让我们看看如何转换AST。
* *
* You might notice that our AST has elements within it that look very similar. * You might notice that our AST has elements within it that look very similar.
* There are these objects with a type property. Each of these are known as an * There are these objects with a type property. Each of these are known as an
* AST Node. These nodes have defined properties on them that describe one * AST Node. These nodes have defined properties on them that describe one
* isolated part of the tree. * isolated part of the tree.
* * 你可能会注意到，我们的AST中包含了一些看起来非常相似的元素。这些都是具有type属性的对象，每个这样的对象都被称为AST节点（AST Node）。这些节点定义了描述树中独立部分的属性。
*
* We can have a node for a "NumberLiteral": * We can have a node for a "NumberLiteral":
* 我们可以有一个“NumberLiteral”节点：
* *
* { * {
* type: 'NumberLiteral', * type: 'NumberLiteral',
@ -204,6 +255,7 @@
* } * }
* *
* Or maybe a node for a "CallExpression": * Or maybe a node for a "CallExpression":
* 或者一个“CallExpression”节点：
* *
* { * {
* type: 'CallExpression', * type: 'CallExpression',
@ -215,17 +267,21 @@
* adding/removing/replacing properties, we can add new nodes, remove nodes, or * adding/removing/replacing properties, we can add new nodes, remove nodes, or
* we could leave the existing AST alone and create an entirely new one based * we could leave the existing AST alone and create an entirely new one based
* on it. * on it.
* * 在转换AST时，我们可以通过添加/删除/替换属性来操作节点，可以添加新节点、删除节点，也可以保留现有AST不变，并基于它创建一个全新的AST。
*
* Since we're targeting a new language, we're going to focus on creating an * Since we're targeting a new language, we're going to focus on creating an
* entirely new AST that is specific to the target language. * entirely new AST that is specific to the target language.
* * 由于我们针对的是新语言，我们将专注于创建一个特定于目标语言的全新AST。
*
* Traversal * Traversal
* 遍历（Traversal）
* --------- * ---------
* *
* In order to navigate through all of these nodes, we need to be able to * In order to navigate through all of these nodes, we need to be able to
* traverse through them. This traversal process goes to each node in the AST * traverse through them. This traversal process goes to each node in the AST
* depth-first. * depth-first.
* * 为了访问所有这些节点，我们需要能够遍历它们。这个遍历过程将按照深度优先顺序访问AST中的每个节点。
*
* { * {
* type: 'Program', * type: 'Program',
* body: [{ * body: [{
@ -249,26 +305,40 @@
* } * }
* *
* So for the above AST we would go: * So for the above AST we would go:
* 因此，对于上述AST，我们将按照以下顺序遍历：
* *
* 1. Program - Starting at the top level of the AST * 1. Program - Starting at the top level of the AST
* Program - 从AST的顶层开始
* 2. CallExpression (add) - Moving to the first element of the Program's body * 2. CallExpression (add) - Moving to the first element of the Program's body
* CallExpression (add) - 移动到Program主体的第一个元素
* 3. NumberLiteral (2) - Moving to the first element of CallExpression's params * 3. NumberLiteral (2) - Moving to the first element of CallExpression's params
* NumberLiteral (2) - 移动到CallExpression参数的第一个元素
* 4. CallExpression (subtract) - Moving to the second element of CallExpression's params * 4. CallExpression (subtract) - Moving to the second element of CallExpression's params
* CallExpression (subtract) - 移动到CallExpression参数的第二个元素
* 5. NumberLiteral (4) - Moving to the first element of CallExpression's params * 5. NumberLiteral (4) - Moving to the first element of CallExpression's params
* NumberLiteral (4) - 移动到CallExpression参数的第一个元素
* 6. NumberLiteral (2) - Moving to the second element of CallExpression's params * 6. NumberLiteral (2) - Moving to the second element of CallExpression's params
* NumberLiteral (2) - 移动到CallExpression参数的第二个元素
* *
* If we were manipulating this AST directly, instead of creating a separate AST, * If we were manipulating this AST directly, instead of creating a separate AST,
* we would likely introduce all sorts of abstractions here. But just visiting * we would likely introduce all sorts of abstractions here. But just visiting
* each node in the tree is enough for what we're trying to do. * each node in the tree is enough for what we're trying to do.
* 如果我们直接操作这个AST，而不是创建一个单独的AST，
* 我们可能会在这里引入各种抽象。但对于我们想要做的事情来说，
* 仅访问树中的每个节点就足够了。
* *
* The reason I use the word "visiting" is because there is this pattern of how * The reason I use the word "visiting" is because there is this pattern of how
* to represent operations on elements of an object structure. * to represent operations on elements of an object structure.
* 我使用“访问”这个词，是因为存在一种模式，用来表示对对象结构中各元素的操作。
* *
* Visitors * Visitors
* 访问者（Visitors）
* -------- * --------
* *
* The basic idea here is that we are going to create a visitor object that * The basic idea here is that we are going to create a visitor object that
* has methods that will accept different node types. * has methods that will accept different node types.
* 基本思想是，我们将创建一个访问者对象，该对象包含接受不同节点类型的方法。
*
* *
* var visitor = { * var visitor = {
* NumberLiteral() {}, * NumberLiteral() {},
@ -277,9 +347,11 @@
* *
* When we traverse our AST, we will call the methods on this visitor whenever we * When we traverse our AST, we will call the methods on this visitor whenever we
* "enter" a node of a matching type. * "enter" a node of a matching type.
* * 当我们遍历AST时，每当“进入”一个匹配类型的节点，就会调用此访问者上的相应方法。
*
* In order to make this useful we will also pass the node and a reference to * In order to make this useful we will also pass the node and a reference to
* the parent node. * the parent node.
* 为了使这变得有用，我们还将传入该节点以及对其父节点的引用。
* *
* var visitor = { * var visitor = {
* NumberLiteral(node, parent) {}, * NumberLiteral(node, parent) {},
@ -288,7 +360,8 @@
* *
* However, there also exists the possibility of calling things on "exit". Imagine * However, there also exists the possibility of calling things on "exit". Imagine
* our tree structure from before in list form: * our tree structure from before in list form:
* * 然而，也存在在“退出”时调用方法的可能性。想象一下我们之前的树结构以列表形式表示：
*
* - Program * - Program
* - CallExpression * - CallExpression
* - NumberLiteral * - NumberLiteral
@ -299,7 +372,8 @@
* As we traverse down, we're going to reach branches with dead ends. As we * As we traverse down, we're going to reach branches with dead ends. As we
* finish each branch of the tree we "exit" it. So going down the tree we * finish each branch of the tree we "exit" it. So going down the tree we
* "enter" each node, and going back up we "exit". * "enter" each node, and going back up we "exit".
* * 当我们向下遍历时，会到达没有后续节点的分支末端。每遍历完树的一个分支，我们就“退出”它。因此，向下遍历树时我们“进入”每个节点，向上返回时则“退出”。
*
* -> Program (enter) * -> Program (enter)
* -> CallExpression (enter) * -> CallExpression (enter)
* -> Number Literal (enter) * -> Number Literal (enter)
@ -314,7 +388,8 @@
* <- Program (exit) * <- Program (exit)
* *
* In order to support that, the final form of our visitor will look like this: * In order to support that, the final form of our visitor will look like this:
* * 为了支持这一点，访问者的最终形式将如下所示：
*
* var visitor = { * var visitor = {
* NumberLiteral: { * NumberLiteral: {
* enter(node, parent) {}, * enter(node, parent) {},
@ -325,20 +400,29 @@
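The enter/exit traversal described in the preceding hunk can be sketched as a small depth-first walker. This is a hedged illustration only: the function name `traverser`, the helper `record`, and the node fields `name`, `value`, and `params` are assumptions based on the AST samples in the comment, and the AST is written by hand so the block stands alone.

```javascript
// Depth-first traversal that calls visitor[type].enter on the way down
// and visitor[type].exit on the way back up. Both hooks are optional.
function traverser(ast, visitor) {
  function traverseNode(node, parent) {
    const methods = visitor[node.type];
    if (methods && methods.enter) methods.enter(node, parent);

    switch (node.type) {
      case 'Program':
        node.body.forEach((child) => traverseNode(child, node));
        break;
      case 'CallExpression':
        node.params.forEach((child) => traverseNode(child, node));
        break;
      case 'NumberLiteral':
        break; // leaf node: nothing below it to visit
      default:
        throw new TypeError('Unknown node type: ' + node.type);
    }

    if (methods && methods.exit) methods.exit(node, parent);
  }

  traverseNode(ast, null); // the root has no parent
}

// Hand-written AST for (add 2 (subtract 4 2)).
const ast = {
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [
      { type: 'NumberLiteral', value: '2' },
      {
        type: 'CallExpression',
        name: 'subtract',
        params: [
          { type: 'NumberLiteral', value: '4' },
          { type: 'NumberLiteral', value: '2' },
        ],
      },
    ],
  }],
};

// Record every enter and exit so the visiting order can be inspected.
const log = [];
const record = (direction) => (node) => { log.push(direction + ' ' + node.type); };
const visitor = {
  Program: { enter: record('->'), exit: record('<-') },
  CallExpression: { enter: record('->'), exit: record('<-') },
  NumberLiteral: { enter: record('->'), exit: record('<-') },
};

traverser(ast, visitor);
console.log(log.join('\n'));
```

Calling `exit` after the children have been processed is what lets a visitor act on a node once everything beneath it has already been seen.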
/** /**
* Code Generation * Code Generation
* 代码生成（Code Generation）
* --------------- * ---------------
* *
* The final phase of a compiler is code generation. Sometimes compilers will do * The final phase of a compiler is code generation. Sometimes compilers will do
* things that overlap with transformation, but for the most part code * things that overlap with transformation, but for the most part code
* generation just means take our AST and string-ify code back out. * generation just means take our AST and string-ify code back out.
* * 编译器的最后阶段是代码生成。有时编译器会执行与转换重叠的操作，
* 但大多数情况下，代码生成只是指将我们的AST转换回字符串形式的代码。
*
*
* Code generators work several different ways, some compilers will reuse the * Code generators work several different ways, some compilers will reuse the
* tokens from earlier, others will have created a separate representation of * tokens from earlier, others will have created a separate representation of
* the code so that they can print nodes linearly, but from what I can tell most * the code so that they can print nodes linearly, but from what I can tell most
* will use the same AST we just created, which is what we're going to focus on. * will use the same AST we just created, which is what we're going to focus on.
* * 代码生成器的工作方式各不相同：有些编译器会重用之前的标记（tokens），而其他编译器则会创建代码的单独表示形式，
* 以便线性地打印节点。但据我所知，大多数编译器将使用我们刚刚创建的同一个AST，这也是我们要关注的重点。
*
* Effectively our code generator will know how to print all of the different * Effectively our code generator will know how to print all of the different
* node types of the AST, and it will recursively call itself to print nested * node types of the AST, and it will recursively call itself to print nested
* nodes until everything is printed into one long string of code. * nodes until everything is printed into one long string of code.
* 实际上，我们的代码生成器将知道如何打印AST的所有不同节点类型，
* 并且它将递归调用自身以打印嵌套节点，直到所有内容都被打印为一个长长的代码字符串。
*
*/ */
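The recursive printing described in the comment above can be sketched as below. Note the simplification: this hypothetical `generate` function prints C-style calls directly from the Lisp-style AST shown earlier, skipping the transformation step, so it is a stand-in for the idea rather than the file's own `codeGenerator`. The node fields `name`, `value`, and `params` are assumptions.

```javascript
// Recursively print each node type as a string of C-style code.
function generate(node) {
  switch (node.type) {
    case 'Program':
      // One statement per top-level expression, each ending in a semicolon.
      return node.body.map((child) => generate(child) + ';').join('\n');
    case 'CallExpression':
      // name(arg1, arg2, ...) with every argument printed recursively.
      return node.name + '(' + node.params.map((p) => generate(p)).join(', ') + ')';
    case 'NumberLiteral':
      return node.value;
    default:
      throw new TypeError('Unknown node type: ' + node.type);
  }
}

// Hand-written AST for (add 2 (subtract 4 2)).
const sampleAst = {
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [
      { type: 'NumberLiteral', value: '2' },
      {
        type: 'CallExpression',
        name: 'subtract',
        params: [
          { type: 'NumberLiteral', value: '4' },
          { type: 'NumberLiteral', value: '2' },
        ],
      },
    ],
  }],
};

console.log(generate(sampleAst)); // prints: add(2, subtract(4, 2));
```

Each `case` only knows how to print its own node type and delegates everything nested to the recursive call, which is the property the comment describes.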
/** /**
@ -1027,7 +1111,7 @@ function codeGenerator(node) {
function compiler(input) { function compiler(input) {
let tokens = tokenizer(input); let tokens = tokenizer(input);
let ast = parser(tokens); let ast = parser(tokens);
let newAst = transformer(ast); let newAst = transformer(ast);
let output = codeGenerator(newAst); let output = codeGenerator(newAst);
