Nand2Tetris
From NAND to Tetris
Building a Modern Computer From First Principles
由基本原理构造一个现代计算机
Nand2tetris 这门课程的历史你可以在创始人 Shimon Schocken 的 TED 演讲 中了解。“Nand”是计算机构建最基本的逻辑门的名称,而“Tetris”就是“俄罗斯方块”这款游戏。从名字上我们大概能窥见,这门课会引领我们从最基本的逻辑门开始,一步步搭建出“俄罗斯方块”。
这门课曾经是希伯来大学计算机系的一门公共课,目前在各大名校都有不同的教学版本。我目前学的是 UNNC 版的 Computer Fundamentals ,遗憾这学期这门课的老师略逊。所幸之前有一位老师 Eugene Ch'ng 在 YouTube 上传了这门课的一些知识点视频,值得一看。
以下是关于这门课(以及配套教材 The Elements of Computing System)的一些参考资料:
官网:http://www.nand2tetris.org/
官方读者问答论坛:http://questions-and-answers-forum.32033.n3.nabble.com/
讲解文章(附代码):http://blog.csdn.net/thomas_in_june/article/category/2506421
Github 代码:
- Chapter 1~4:https://github.com/cmoylan/Elements-of-Computing-Systems
- Chapter 5~12:https://github.com/ThomasCJY/The_Elements_of_Computing_Systems
作者本人的 coursera 公开课:
- Part I:https://www.coursera.org/learn/build-a-computer (7 weeks, 6 projects, 1 computer and 0 prerequisite knowledge)
- Part II:https://www.coursera.org/learn/nand2tetris2 (project-centered course)
Week 1 Boolean Logic
1.1 Boolean Logic
1.1.1 Boolean Identities
- Commutative Laws
- (x AND y) = (y AND x)
- (x OR y) = (y OR x)
- Associative Laws
- (x AND (y AND z)) = ((x AND y) AND z)
- (x OR (y OR z)) = ((x OR y) OR z)
- Distributive Laws
- (x AND (y OR z)) = (x AND y) OR (x AND z)
- (x OR (y AND z)) = (x OR y) AND (x OR z)
- De Morgan Laws
- NOT(x AND y) = NOT(x) OR NOT(y)
- NOT(x OR y) = NOT(x) AND NOT(y)
- Idempotence Law
- x AND x = x
- Double Negation Law
- NOT(NOT(x)) = x
1.2 Boolean Function
1.2.3 All Boolean Functions of 2 Variables
1.2.2 Truth Table to Boolean Expression
x | y | z | f | expression |
---|---|---|---|---|
0 | 0 | 0 | 1 | NOT(x) AND NOT(y) AND NOT(z) |
0 | 0 | 1 | 0 | |
0 | 1 | 0 | 1 | NOT(x) AND y AND NOT(z) |
0 | 1 | 1 | 0 | |
1 | 0 | 0 | 1 | x AND NOT(y) AND NOT(z) |
1 | 0 | 1 | 0 | |
1 | 1 | 0 | 0 | |
1 | 1 | 1 | 0 |
Write expressions that basically gets values of 1 only at the rows that have a value of 1, and OR them together:
(NOT(x) AND NOT(y) AND NOT(z)) OR (NOT(x) AND y AND NOT(z)) OR (x AND NOT(y) AND NOT(z))
= (NOT(x) AND NOT(z)) OR (x AND NOT(y) AND NOT(z))
= (NOT(x) AND NOT(z)) OR (NOT(y) AND NOT(z))
= NOT(z) AND (NOT(x) OR NOT(y))
1.2.3 Theorem
AND, OR and NOT
Any Boolean function can be represented using an expression containing AND, OR and NOT operations.
AND and NOT
Any Boolean function can be represented using an expression containing AND and NOT operations.
Proof: (x OR y) = NOT(NOT(x) AND NOT(y))
NAND
x | y | NAND |
---|---|---|
0 | 0 | 1 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
(x NAND y) = NOT(x AND y)
Any Boolean function can be represented using an expression containing NAND operations.
Proof:
- NOT(x) = (x NAND x)
- (x AND y) = NOT(x NAND y)
1.3 Logic Gates
1.3.1 Gate Logic
A technique for implementing Boolean functions using logic gates.
Logic gates:
- Elementary (Nand, And, Or, Not, ...)
- Composite (Mux, Adder, ...)
1.3.2 Elementary logic gates: NAND, AND, OR and NOT
1.3.3 Composite Gates
(Three-way AND) gate interface:
Gate implementation:
(One abstraction, many different implementations.)
1.3.4 CirCuit Implementations
1.4 Hardware Description Language (HDL)
1.4.1 Design: from requirements to interface
Requirement:
General idea: out = 1 when (a AND NOT(b)) OR (b AND NOT(a)).
Implementation:
1 | /** Xor gate: out = (a AND Not(b)) Or (Not(a) And b)*/ |
Line 2 - 4: interface; line 5-11: implementation.
1.4.2 Some Comments on HDL
- HDL is a functional / declarative language
- The order of HDL statements is insignificant
- Before using a chip part, you must know its interface. For example:
Not(in =, out= )
,And(a= , b= ,out= )
... - Connections like
partName(a = a, ...)
andpartName(…, out = out)
are common
1.4.3 Hardware Description Languages
- Common HDLs
- VHDL
- Verilog
- ...
- Our HDL
- Similar in spirit to other HDLs
- Minimal and simple
- Provides all you need for this course
- HDL Documentation:
- Textbook / Appendix A
- HDL Survival Guide
1.5 Hardware Simulation
1.5.1 Simulation Process
- Load the HDL file into the hardware simulator
- Enter values (0's and 1's) into the chip's input pins (e.g.
a
andb
) - Evaluate the chip's logic
- Inspect the resulting values of:
- The output pins (e.g.
out
) - The internal pins (e.g.
nota
,notb
,aAndNotb
,notaAndb
)
- The output pins (e.g.
1.5.2 Script-based Testing
An example of test script:
1 | /** Xor.tst */ |
Benefits: "Automatic" and Replicable testing.
(Don't worry about how to write it. Because all the test scripts are available in the projects.)
1.6 Multi-Bit Buses
1.6.1 Examples
1 | /* |
1 | /* |
1 | /* |
1 | /* |
1 | /* |
1.7 Project 1 Overview
1.7.1 Multiplexor
- A 2-way multiplexor enables selecting, and outputting, one out of two possible inputs
- Widely used in:
- Digital design
- Communications networks
Example: using mux logic to build a programmable gate
1 | CHIP AndMuxOr{ |
1.7.2 Demultiplexor
- Acts like the "inverse" of a multiplexor
- Distributes the single input value into one of two possible destinations
Example: Multiplexing/ demultiplexing in communications networks
- Each
sel
bit is connected to an oscillator that produces a repetitive train of alternating 0 and 1 signals - Enables transmitting multiple messages on a single, shared communications line
- A common use of multiplexing /demultiplexing logic unrelated to this course
1.7.3 And16
- A straightforward 16-bit extension of the elementary
And
gate - See 1.6 Multi-Bit Buses
1.7.4 Mux4Way16
1.8 End Notes
1.8.1 Canonical Representation
Whodunit story: Each suspect may or may not have an alibi
(a)
, a motivation to commit the crime(m)
, and a relationship to the weapon found in the scene of the crime(w)
. The police decides to focus attention only on suspects for whom the propositionNot(a) And (m Or w)
is true.
1.8.2 Karnaugh Maps
The Karnaugh map (KM or K-map) is a method of simplifying Boolean algebra expressions.
Example:
A | B | C | D | ||
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 1 | 0 |
2 | 0 | 0 | 1 | 0 | 0 |
3 | 0 | 0 | 1 | 1 | 0 |
4 | 0 | 1 | 0 | 0 | 0 |
5 | 0 | 1 | 0 | 1 | 0 |
6 | 0 | 1 | 1 | 0 | 1 |
7 | 0 | 1 | 1 | 1 | 0 |
8 | 1 | 0 | 0 | 0 | 1 |
9 | 1 | 0 | 0 | 1 | 1 |
10 | 1 | 0 | 1 | 0 | 1 |
11 | 1 | 0 | 1 | 1 | 1 |
12 | 1 | 1 | 0 | 0 | 1 |
13 | 1 | 1 | 0 | 1 | 1 |
14 | 1 | 1 | 1 | 0 | 1 |
15 | 1 | 1 | 1 | 1 | 0 |
Following are two different notations describing the same function in unsimplified Boolean algebra, using the Boolean variables A, B, C, D, and their inverses.
- where are the minterms to map (i.e., rows that have output 1 in the truth table).
- where are the maxterms to map (i.e., rows that have output 0 in the truth table).
K-map drawn on a torus, and in a plane. The dot-marked cells are adjacent.
In the example above, the four input variables can be combined in 16 different ways, so the truth table has 16 rows, and the Karnaugh map has 16 positions. The Karnaugh map is therefore arranged in a 4 × 4 grid.
The row and column indices (shown across the top, and down the left side of the Karnaugh map) are ordered in Gray code rather than binary numerical order. Gray code ensures that only one variable changes between each pair of adjacent cells. Each cell of the completed Karnaugh map contains a binary digit representing the function's output for that combination of inputs.
After the Karnaugh map has been constructed, it is used to find one of the simplest possible forms — a canonical form — for the information in the truth table. Adjacent 1s in the Karnaugh map represent opportunities to simplify the expression. The minterms ('minimal terms') for the final expression are found by encircling groups of 1s in the map.
- Minterm groups must be rectangular and must have an area that is a power of two (i.e., 1, 2, 4, 8…).
- Minterm rectangles should be as large as possible without containing any 0s.
- Groups may overlap in order to make each one larger.
The optimal groupings in the example below are marked by the green, red and blue lines, and the red and green groups overlap. The red group is a 2 × 2 square, the green group is a 4 × 1 rectangle, and the overlap area is indicated in brown. (The K-map for the inverse of f is shown as gray rectangles, which correspond to maxterms.)
Once the Karnaugh map has been constructed and the adjacent 1s linked by rectangular and square boxes, the algebraic minterms can be found by examining which variables stay the same within each box.
For the red grouping:
- A is the same and is equal to 1 throughout the box, therefore it should be included in the algebraic representation of the red minterm.
- B does not maintain the same state (it shifts from 1 to 0), and should therefore be excluded.
- C does not change. It is always 0, so its complement, NOT-C, should be included. Thus, C' should be included.
- D changes, so it is excluded.
Thus the first minterm in the Boolean sum-of-products expression is AC'.
For the green grouping, A and B maintain the same state, while C and D change. B is 0 and has to be negated before it can be included. The second term is therefore AB'. Note that it is acceptable that the green grouping overlaps with the red one.
In the same way, the blue grouping gives the term BCD'.
The solutions of each grouping are combined: the normal form of the circuit is.
Thus the Karnaugh map has guided a simplification of
It would also have been possible to derive this simplification by carefully applying the axioms of boolean algebra, but the time it takes to do that grows exponentially with the number of terms.
Don't Cares
Karnaugh maps also allow easy minimizations of functions whose truth tables include "don't care" conditions. A "don't care" condition is a combination of inputs for which the designer doesn't care what the output is. Therefore, "don't care" conditions can either be included in or excluded from any rectangular group, whichever makes it larger. They are usually indicated on the map with a dash or X.
The example above is the same as the example above but with the value of f(1,1,1,1) replaced by a "don't care". This allows the red term to expand all the way down and, thus, removes the green term completely.
This yields the new minimum equation:
Note that the first term is just A, not AC'. In this case, the don't care has dropped a term (the green rectangle); simplified another (the red one); and removed the race hazard (removing the yellow term as shown in the following section on race hazards).
1.8.3 Quine–McCluskey Algorithm
https://en.wikipedia.org/wiki/Quine%E2%80%93McCluskey_algorithm
Week 2 Boolean Arithmetic
2.1 Binary Numbers
2.1.1 Binary to Decimal
Example: 101(binary) = 1 * 2^2 + 0 * 2^1 + 1 * 2^0 = 5(decimal)
2.1.2 Decimal to Binary
Example: 87(decimal) = 64 + 16 + 4 + 2 + 1 = 01010111(binary)
Example: 101(decimal) = 1100101(binary)
101 | Remainder |
---|---|
50 | 1 |
25 | 0 |
12 | 1 |
6 | 0 |
3 | 0 |
1 | 1 |
0 | 1 |
2.2 Binary Addition
...
2.2.1 Building an Adder
- Half Adder - adds tow bits
- Full Adder - adds three bits
- Multi-bit Adder - adds tow numbers
a | b | sum | carry |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 1 | 1 | 0 |
1 | 0 | 1 | 0 |
1 | 1 | 0 | 1 |
a | b | c | sum | carry |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 | 0 |
0 | 1 | 0 | 1 | 0 |
0 | 1 | 1 | 0 | 1 |
1 | 0 | 0 | 1 | 0 |
1 | 0 | 1 | 0 | 1 |
1 | 1 | 0 | 0 | 1 |
1 | 1 | 1 | 1 | 1 |
Each color area is a full adder.
2.3 Negative Numbers
2.3.1 Sign and Magnitude
First bit is -/+
. All other bits represents a positive number.
Binary | Decimal |
---|---|
0000 | 0 |
0001 | 1 |
0010 | 2 |
0011 | 3 |
0100 | 4 |
0101 | 5 |
0110 | 6 |
0111 | 7 |
1000 | -0 |
1001 | -1 |
1010 | -2 |
1011 | -3 |
1100 | -4 |
1101 | -5 |
1110 | -6 |
1111 | -7 |
Complications:
- Not obvious where to put the sign bit. To the right? To the left? (Early computers tried both.)
- Adders may need an extra step to set the sign because we can’t know in advance what the proper sign will be.
- -0?
2.3.2 1's Complement
The value is obtained by inverting all the bits in the binary representation of the number (swapping 0s for 1s and vice versa). The ones' complement of the number then behaves like the negative of the original number in some arithmetic operations. To within a constant (of −1), the ones' complement behaves like the negative of the original number with binary addition.
Binary | Decimal |
---|---|
0000 | 0 |
0001 | 1 |
0010 | 2 |
0011 | 3 |
0100 | 4 |
0101 | 5 |
0110 | 6 |
0111 | 7 |
1000 | -7 |
1001 | -6 |
1010 | -5 |
1011 | -4 |
1100 | -3 |
1101 | -2 |
1110 | -1 |
1111 | -0 |
Complications:
- -0?
- The offset of
−1
2.3.3 Excess-N (Offset Binary / Biased Representation)
Offset binary, also referred to as excess-K, excess-N, excess code or biased representation, is a digital coding scheme where all-zero corresponds to the minimal negative value and all-one to the maximal positive value.
Example: Stardard excess of 4 bits: Excess-8 (2n-1)
Binary | Decimal |
---|---|
0000 | -8 |
0001 | -7 |
0010 | -6 |
0011 | -5 |
0100 | -4 |
0101 | -3 |
0110 | -2 |
0111 | -1 |
1000 | 0 |
1001 | 1 |
1010 | 2 |
1011 | 3 |
1100 | 4 |
1101 | 5 |
1110 | 6 |
1111 | 7 |
Complications:
- Negate the signed number after operations
2.3.4 2's Complement (Preferred)
Represent negative number -x
using the postive number 2^n - x
.
Binary | Decimal |
---|---|
0000 | 0 |
0001 | 1 |
0010 | 2 |
0011 | 3 |
0100 | 4 |
0101 | 5 |
0110 | 6 |
0111 | 7 |
1000 | -8 (8) |
1001 | -7 (9) |
1010 | -6 (10) |
1011 | -5 (11) |
1100 | -4 (12) |
1101 | -3 (13) |
1110 | -2 (14) |
1111 | -1 (15) |
- Positive numbers: 0, …, 2^(n-1) - 1
- Negative numbers: -1, …, -2^(n-1)
Addition in 2's complement (for free):
Representation is modulo 2^n
; addition is modulo 2^n
.
2.3.5 Conclusion
Binary | Unsigned | Sign and Magnitude | 1's Complement | 2's Complement | Excess-8 |
---|---|---|---|---|---|
0000 | 0 | 0 | 0 | 0 | -8 |
0001 | 1 | 1 | 1 | 1 | -7 |
0010 | 2 | 2 | 2 | 2 | -6 |
0011 | 3 | 3 | 3 | 3 | -5 |
0100 | 4 | 4 | 4 | 4 | -4 |
0101 | 5 | 5 | 5 | 5 | -3 |
0110 | 6 | 6 | 6 | 6 | -2 |
0111 | 7 | 7 | 7 | 7 | -1 |
1000 | 8 | -0 | -7 | -8 | 0 |
1001 | 9 | -1 | -6 | -7 | 1 |
1010 | 10 | -2 | -5 | -6 | 2 |
1011 | 11 | -3 | -4 | -5 | 3 |
1100 | 12 | -4 | -3 | -4 | 4 |
1101 | 13 | -5 | -2 | -3 | 5 |
1110 | 14 | -6 | -1 | -2 | 6 |
1111 | 15 | -7 | -0 | -1 | 7 |
2.3.6 Computing -x
in 2's Complement
Input: x
Output: -x (in 2's complement)
Idea:
2^n - x = (2^n - 1) - x + 1
(
2^n - 1
is11...1
in binary.)
Example:
Input: 4
1111 - 100 + 1= 1100
Output: -4
To add 1: flip the bits from right to left, stopping the first time 0 is filpped to 1.
2.4 Arithmetic Logic Unit (ALU)
2.4.1 The Arithmetic Logic Unit
The ALU computes a function on the two inputs and outputs the result.
f: one out of a family of pre-defined arithmetic and logical functions
- Arithmetic oprations: integer addition, multiplication, division, ...
- Logic oprations: And, Or, Xor, ...
Which operation should the ALU perform? That's a hardware / software tradeoff. Because if you choose not to implement something in hardware, you can always augment it later with sofeware.
2.4.2 The Hack ALU
The Hack ALU is a specific one that will hum inside our Hack computer.
- Operates on two 16-bit, 2's complement values
- Outputs a 16-bit, 2's complement value
- Which function to compute is set by six 1-bit inputs
- Computes one out of a family of 18 functions
- Also outputs two 1-bit values
zx | nx | zy | ny | f | no | out |
---|---|---|---|---|---|---|
1 | 0 | 1 | 0 | 1 | 0 | 0 |
1 | 1 | 1 | 1 | 1 | 1 | 1 |
1 | 1 | 1 | 0 | 1 | 0 | -1 |
0 | 0 | 1 | 1 | 0 | 0 | x |
1 | 1 | 0 | 0 | 0 | 0 | y |
0 | 0 | 1 | 1 | 0 | 1 | !x |
1 | 1 | 0 | 0 | 0 | 1 | !y |
0 | 0 | 1 | 1 | 1 | 1 | -x |
1 | 1 | 0 | 0 | 1 | 1 | -y |
0 | 1 | 1 | 1 | 1 | 1 | x+1 |
1 | 1 | 0 | 1 | 1 | 1 | y+1 |
0 | 0 | 1 | 1 | 1 | 0 | x-1 |
1 | 1 | 0 | 0 | 1 | 0 | y-1 |
0 | 0 | 0 | 0 | 1 | 0 | x+y |
0 | 1 | 0 | 0 | 1 | 1 | x-y |
0 | 0 | 0 | 1 | 1 | 1 | y-x |
0 | 0 | 0 | 0 | 0 | 0 | x&y |
0 | 1 | 0 | 1 | 0 | 1 | x|y |
out | zr | ng |
---|---|---|
0 | 1 | 0 |
<0 | 0 | 1 |
>0 | 0 | 0 |
Pre-set the x input:
1
2 if zx then x = 0;
if nx then x = !x;Pre-set the y input:
1
2 if zy then y =0;
if ny then y = !y;Select between computing + or &:
1
2
3 if f
then out = x + y
else out = x & yPost-set the output:
1 if no then out = !outResult ALU output.
zr and zg:
1
2 if out == 0 then zr = 1 else zr = 0
if out < 0 then ng = 1 else ng = 0
2.5 Project 2 Overview
2.5.1 Half Adder
Implementation tip: it can be built using two very elementary gates.
sum: Xor; carry: And.
2.5.2 Full Adder
Implementation tip: it can be built from two half-adders with some other logic gates.
2.5.3 16-bit Adder
Implementation tips:
- An n-bit adder can be built from n full-adder chips
- The carry bit is "piped" from right to left
- The MSB (most significant carry bit) is ignored
2.5.4 16-bit Incrementor
Implementation tip: the single-bit 0
and 1
values are represented in HDL as false
and true
.
2.5.5 ALU
Implementation tips:
- Building blocks:
Add16
, and various chips built in Project 1 - It can be built with less than 20 lines of HDL code.
Week 3 Memory
3.1 Sequential Logic
3.1.1 Combinatorial VS Sequential
- Combinatorial:
out[t] = function(in[t])
(t
= time)- operate on data only
- provide calculation services (e.g. Nand … ALU)
- Sequential:
out[t] = function(in[t-1])
- operate on data and a clock signal
- as such, can be made to be state-aware and provide storage and synchronization services
3.1.2 The Clock
3.1.3 Sequential Logic
state[t] = function(state[t-1])
3.2 Flip Flops
3.2.1 Remembering State
Need an ingredient to remember one bit of infomation from time t-1
so it can be used at time t
.
At the "end of time" t-1
, such an ingredient can be at either of two states: "remembering 0" or "remenbering 1".
This ingredient remembers by "flipping" between these possible states.
Gates that can flip between two states are called Flip-Flops.
3.2.2 The "Clocked Data Flip Flop"
Notational convention:
3.2.3 Implementation of the DDF
- In this course: it is a primitive
- In many physical implementations, it may be built from actual Nand gates:
- Step 1: create a "loop" achieving an "un-clocked" flip-flop
- Step 2: isolation across time steps using a "master-slave" setup
- Very cute but conceptually confusing
- Our Hardware Simulator forbids "combinatorial loops"
- A cycle in the hardware connections is allowed only if it passes through a sequential gate
3.2.4 Sequential Logic Implementation
3.2.5 Remembering Forever: 1-bit Register
Goal: remember an input bit "forever" until requested to load a new value
1 | if load(t-1) |
3.2.6 Multi-bit Register
The width of a Register: w
(word width): 16-bit, 32-bit, 64-bit, …
Register's state: the value which is currently stored inside the Register
3.3 Memory Units
3.3.1 Von Neumann Architecture
3.3.2 Memory
- Main memory: RAM (data + instruction), ...
- Secondary memory: disks, ...
- Volatile / non-volatile
3.3.3 The Most Basic Memory Element: Register
To read the Register: probeout
(which emits the Register's state)
To set Register = v
: set in = v
, load = 1
(and then from the next cycle onward, out emits v
)
3.3.4 RAM Unit
RAM abstraction: A sequence of n
addressable registers, with addresses 0
to n-1
.
At any given point of time, only one register in the RAM is selected.
Address: the location of the register within a larger chip.
k
(width of address input): k = log_2(n)
(For example, 8 register need 3 bits to represent.)
w
(word width): No impact on the RAM logic (Hack computer: w = 16
)
RAM is a sequential chip with a clocked behavior.
1 | // Let M stand for the state of the selected register |
3.3.5 RAM / Read Logic
To read Register i
: set address = i
(and then out
emits the state of Register i
)
3.3.6 RAM / Write Logic
To set Register i
to v
: set address = i
, in = v
, load = 1
(and then from the next cycle onward, out
emits the state of Register v
)
3.3.7 A family of 16-bit RAM chips
Chip Name | n | k |
---|---|---|
RAM8 | 8 | 3 |
RAM64 | 64 | 6 |
RAM512 | 512 | 9 |
RAM4K | 4096 | 12 |
RAM16K | 16384 | 14 |
3.4 Counters
3.4.1 Where Counters Come to Play
- The computer must keep track of which instruction should be fetched and executed next
- This control mechanism can be realized by a Program Counter
- The PC contains the address of the instruction that will be fetched and executed next
- Three possible control settings:
- Reset: fetch the first instruction
PC = 0
- Next: fetch the next instruction
PC++
- Goto: fetch instruction
n
PC = n
- Reset: fetch the first instruction
Counter: a chip that realizes this abstraction.
3.4.2 Counter Abstraction
1 | if (reset[t] == 1) |
3.5 Project 3 Overview
3.5.1 1-bit Register (Bit)
Implementation tip: it can be built from a DFF and a Mux.
3.5.2 16-bit Register (Register)
Implementation tip: it can be built from multiple 1-bit registers
3.5.3 8-Register RAM (RAM8)
Implementation tips:
- Feed the
in
value to all the registers simultaneously - Use Mux / DMux chips to select the right register
3.5.4 RAM64, RAM512, …, RAM16K
Implementation tips:
- A RAM device can be built by grouping smaller RAM-parts together
- Think about the RAM's address input as consisting of two fields:
- One field can be used to select a RAM-part
- The other field can be used to select a register within that RAM-part
- Use Mux / DMux logic to effect this addressing scheme
3.5.5 Program Counter (PC)
Implementation tip: it can be built from a register, an incrementor, and some logic gates.
3.6 End Notes
3.6.1 Nand to DDF
S' | R' | Action |
---|---|---|
0 | 0 | Not allowed |
0 | 1 | Q = 1 |
1 | 0 | Q = 0 |
1 | 1 | No change |
Week 4 Machine Language
4.1 Overview
4.1.1 Operations
("100100" means addition...)
4.1.2 Program Counter
(We are now at instruction 159. Next instruction is 160...)
4.1.3 Addressing
(Operate on memory location 340...)
4.1.4 Compliation
Program in nice high-level language (Python, Java, ...)
=Compiler==>
Program in machine language (based on CPU)
4.1.5 Mnemonics
Machine Language: 010001000110010
Assembly Language: ADD 1, Mem[129]
(location 129 in memory holds the "index")
ADD 1, index
A "Symbolic Assembler" can translate "index" -> Mem[129]
Interpretation 1: The "symbolic form" doesn't really exist but is just a convenient mnemonic to present machine language instruction to humans.
Interpretation 2: We will allow humans to write machine language instruction using this "assembly language" and will have an "Assembler" program convert it to the bit-form.
4.2 Elements
4.2.1 Machine Language
- Specification of the Hardware/Software Interface
- What are the supported oprerations?
- What do they operate on?
- How is the program controlled?
- Usually in close correspondece to actual Hareware Architecture
- Not necessarily so
- Cost- Performance Tradeoff
- Silicon Area
- Time to Complete Instruction
4.2.2 Machine Operations
- Usually correspond to what's implemented in Hardware
- Arithmetic Operations: add, subtract, ...
- Logical Operations: and, or, ...
- Flow Control: "goto instuction X", "if C then goto instruction Y"
- Differences between machine languages
- Richness of the set of operations (divisions? bulk copy? ...)
- Data types (width, floating point...)
4.2.3 Memory Hierarchy
Problem: Accessing a memory location is expensive
- Need to supply a long address
- Getting the memory contents into the CPU take time
Solution: Memory Hierarchy
4.2.4 Registers
- CPUs ususlly contain a few, easily accessed "registers"
- Their number and functions are a central part of the machine language
- Data Register:
- Add R1, R2
- Address Register:
- Store R1, @A
4.2.5 Addressing Modes
- Register
- Add R1, R2 // R2 ⬅ R2 + R1
- Direct
- Add R1, M[200] // Mem[200] ⬅ Mem[200] + R1
- Indirect
- Add R1, @A // Mem[A] ⬅ Mem[A] +R1
- Immediate
- Add 71, R1 // R1 ⬅ R1 + 73
4.2.6 Input / Output
- One general method of interaction uses "memory mapping"
- Memory location 12345 holds the direction of the last movement of the mouse
- Memory location 45678 is not a real memory location but a way to tell the printer which paper to use
4.2.7 Machine Languages: Flow Control
Usually the CPU executes machine instructions in sequence
Sometimes we need to "jump" unconditionally to anothher location, e.g. so we can loop:
1
2
3
4
5
6101 Load R1, 0
102 Add 1, R1
103 ...
... //Do something with R1's value
154 ...
155 Jump 102=>
1
2
3
4
5
6Load R1, 0
loop Add 1, R1
...
//Do something with R1's value
...
Jump loopSometimes we need to jump only if some condition is met:
1
2
3
4JGT R1, 0, cont // Jump if R1 > 0
Subtract R1, 9, R1 // R1 ⬅ (0 - R1)
cont: ...
//Do something with positive R1
4.3 The Hack Computer and Machine Language
4.3.1 Hack Computer: Hardware
A 16-bit machine consisting of:
- Data memory (RAM): a sequence of 16-bit register: RAM[0], RAM[1], RAM[2], …
- Instruction memory (ROM): a sequence of 16-bit register: ROM[0], ROM[1], ROM[2], …
- Central Processing Unit (CPU): performss 16-bit instructions
- Instruction bus / data bus / address buses
4.3.2 Hack Computer: Software
- Hack machine language:
- 16-bit A-instruction
- 16-bit C-instructions
- Hack program = sequence of instructions wrritten in the Hack machine language
4.3.3 Hack Computer: Control
- Control:
- The ROM is loaded with a Hack program
- The
reset
button is pushed - The program starts running
4.3.4 Hack Computer: Registers
The Hack machine language recognizes three registers:
D
holds a 16-bit value (data register)A
holds a 16-bit value (address / data register)M
represents the 16-bitRAM
register addressed byA
(the currently selected memory register,M=RAM[A]
)
4.3.5 The A-instruction
Syntax: @value
Where value
is:
- a non-negative decimal constant (<= 32767 = 2^15-1)
- a symbol referring to such a constant (later)
Semantics:
- Sets the A register to
value
- Side effect:
RAM[A]
becomes the selectedRAM
register
Example:
@21
Effect:
- Sets the A register to
21
RAM[21]
becomes the selectedRAM
registerUsage example:
1
2
3 // Set RAM[100] to -1
@100 // A=100
M=-1 // RAM[100]=-1
4.3.6 The C-instruction
Syntax: dest = comp; jump
(dest
and jump
are optional)
Where:
comp
=0, 1, -1, D, A, !D, !A, -D, -A, D+1, A+1, D-1, A-1, D+A, D-A, A-D, D&A, D|A, M, !M, -M, M+1, M-1, D+M, D-M, M-D, D&M, D|M
dest
=null, M, D, MD, A, AM, AD, AND
(M
refers toRAM[A]
)jump
=null, JGT, JEQ, JGE, JLT, JNE, JLE, JMP
(if (comp jump 0
) jump to execute the instruction inROM[A]
)
Semantics:
- Compute the value of
comp
- Stores the result in
dest
- If the Boolean expression (
comp jump 0
) is true, jumps to execute the instruction stored inROM[A]
Example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17 // Set the D register to -1
D=-1
// Set RAM[300] to the value of the D register minus 1
@300 // A=300
M=D-1 // RAM[300]=D-1
// If (D-1 == 0) jump to execute the instruction stored in ROM[56]
@56 // A=56
D-1;JEQ // if (D-1 == 0) goto 56
// Unconditional jump
0;JMP
// Set the A register to 1. Compute A-1 and store the value of the result in the M register, which is RAM[1]. Check if the computation is equal to 0. If true, the next instruction will be the value stored in the A register, which is 1.
@1
M=A-1;JEQ
4.4 Hack Language Specification
4.4.1 The Hack Machine Language
Two ways to express the same esmantics:
- Binary code
- Symbolic language
4.4.2 The A-instruction: Symbolic and Binary Syntax
Symbolic syntax: @value
Example: @21
Binary syntax: 0value
Example: 0000000000010101
(Set the A register to 21)
4.4.3 The C-instruction: Symbolic and Binary Syntax
Symbolic syntax: dest = comp; jump
Binary syntax: 1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3
4.4.4 Hack Program
1 | // Compute RAM[1] = 1 + ... + RAM[0] |
4.5 Input / Output
4.5.1 Hack Computer Platform
High-level approach (Nand2Tetirs, Part II): sophisticated software libraries enabling text, graphics, animation, audio, video, etc.
Low-level approcah (Nand2Tetirs, Part I): Bits.
4.5.1 Hack Computer Plateform: Output
Screen memory map:
- A designated memory area, dedicated to manage a display unit
- The physical display is continuously refreshed from the memory map, many times per second
- Output is effected by writing code that manipulated the screen memory map
To set pixel (row, col) on/off:
word = Screen[32*row + col/16] = RAM[16384 + 32*row + col/16]
- Set the (col %16)th bit of
word
to1
or0
- Commit
word
to the RAM
4.5.2 Hack Computer Plateform: Input
Keyboard memory map (single 16-bit register):
When a key is pressed on the keyboard, the key's scan code appears in the keyboard memory map.
To check which key is currently pressed:
- Probe the content of the
Keyboard
chip - In the Hack computer: probe the contents of
RAM[24576]
4.6 Hack Programming (1/3)
4.6.1 Translating Symbolic Programs
Hack Assembler
We will develop a Hack assembler in week 6.
CPU Emulator
A convenient tool for debugging and executing Hack programs.
4.6.2 Overview
- 4.6
- Working with register and memory
- 4.7
- Branching
- Variables
- Iteration
- 4.8
- Pointers
- Input / Output
4.6.3 Working with Register and Memory
D
: data registerA
: address / data registerM
: the currently selected memory register,M=RAM[A]
Typical operation:
1 | // D=10 |
4.6.4 Hack Program Example
1 | // Program: Add2.asm |
4.6.5 Built-in Symbols
The Hack assembly language features built-in symbols:
Symbol | value |
---|---|
R0 | 0 |
R1 | 1 |
... | ... |
R15 | 15 |
SCREEN | 16384 |
KBD | 24576 |
Remaining symbols used in the implementation of the Hack virtual machine, discussed in Nand2Tetris, PartII:
Symbol | value |
---|---|
SP | 0 |
LCL | 1 |
ARG | 2 |
THIS | 3 |
THAT | 4 |
These symbols can be used to denote "virtual registers":
instead of:
1 | // RAM[5] = 15 |
=>
1 | // RAM[5] = 15 |
4.7 Hack Programming (2/3)
4.7.1 Branching
Example:
1 | // Program: Signum.asm |
With labels:
1 | // Program: Signum.asm |
@LABEL
translates to @n
, where n
is the instruction number following the (LABEL
) declaration.
4.7.2 Variables
Example:
1 | // Program: Flip.asm |
@temp
: "find some available memory register (say register n
), and use it to represent the variable temp
. So, from now on , each occurrence of @temp
in the program will be translated into @n
"
Variables are allocated to the RAM
from address 16 onward.
4.7.3 Iterative Processing
Example: compute 1 + 2 + … + n
Pseudo code:
1 | // Computes RAM[1] = 1 + 2 + ... + RAM[0] |
Hack assembly code:
1 | // Program: Sum1toN.asm |
4.8 Hack Programming (3/3)
4.8.1 Pointers
Example:
1 | // for (i=0; i<n; i++) { |
Pointers: variables that store memory addresses like arr
and i
.
Hack pointer logic: whenever we have to access memory using a pointer, we need an instruction like A=M
.
Typical pointer semantics: set the address register to the contents of some memory register.
4.8.2 Output
Example:
Task: draw a filled rectangle at the upper left corner of the screen, 16
pixels wide and RAM[0]
pixels long.
Pseudo code:
1 | // for (i=0; i<n; i++) { |
Hack assembly code:
1 | // Program: Rectangle.asm |
4.8.3 Input
To check which key is currently pressed:
- Read the contents of
RAM[24576]
(addressKBD
) - If the register contains 0, no key is pressed
- Otherwise, the register contains the scan code of the currently pressed key
4.9 Project 4 Overview
4.9.1 Mult
: a program performing R2 = R0 * R1
4.9.2 Fill
: a simple interactive program
Implementation strategy:
- Listen to the keyboard
- To blacken / clear the screen, write code that fills the entire screen memory map with either "white" or "black" pixels
- Addressing the memory requires working with pointers
Testing:
- Select "no animation"
- Manual testing
4.10 End Notes
4.10.1 Work Flow
4.10.2 IF logic
1 | // if condition { |
4.10.3 WHILE logic
1 | // while condition { |
4.10.4 Examples
1 | // a = 0 or a = 1 or a = -1 |
Week 5 Computer Architecture
5.1 Von Neumann Architecture
5.1.1 Von Neumann Machine
- Theory: Universal Turing Machine
- Parctice: von Neumann Architecture
5.1.2 Elements
- CPU
- ALU: loads information from the Data bus and manipulates it using the Control bits
- Registers: stores addresses and data
- Memory
- Data: put in an address of a data piece to be operated upon
- Program: put in an address of the next program instruction
5.2 The Fetch-Execute Cycle
5.2.1 Fetching
- Put the location of the next instruction into the "address" of the program memory
- Get the instruction code itself by reading the memory contents at that location
5.2.2 Executing
- The instruction code specifies "what to do"
- Execution involves accessing registers and/or data memory
5.2.3 Fetch-Execute Clash
Solution 1: Multiplex
Solution 2 (shortcut): Harvard Architecture - keep Program and Data in two separate memory modules
5.3 The HACK Central Processing Unit (CPU)
5.3.1 The Hack CPU: Abstraction
The Hack CPU:
- A 16-bit processor
- Execute the current instruction
- Figure out which instruction to execute next
5.3.2 Hack CPU Interface
- Inputs
- from Data Memory
inM
: data value
- from Instruction Memory
instruction
: value of instruction
- from the user
reset
: reset bit
- from Data Memory
- Outputs
- to Data Memory
outM
: data valuewriteM
: write to memory (yes/no)addressM
: memory address
- to Instruction Memory
pc
: address of next instruction
- to Data Memory
5.3.3 Hack CPU Implementation: Overview
(c = control bits)
5.3.4 Hack CPU Implementation (1/3): Instruction Handling
CPU handling of an A
-instruction:
- Decode the instruction into:
- op-code
- 15-bit value
- Store the value in the
A
-register - Output the value (not shown in this diagram but in the overall)
CPU handling of an C
-instruction:
- Decode the instruction into:
- op-code
- ALU control bits
- Destination load bits
- Jump bits
5.3.5 Hack CPU Implementation (2/3): ALU Operation
ALU data inputs:
- From the
D
-register - From the
A
-register - From the
M
-register
ALU control inputs (control bits):
- From the instruction
ALU data outputs (result of ALU calculation):
- To the
D
-register - To the
A
-register - To the
M
-register
(The feeding of the result is simultaneous, but which register actually depens on the instruction's destination bits.)
ALU control outputs:
- Negative output?
- Zero output?
5.3.5 Hack CPU Implementation (3/3): Control
Program Counter abstraction:
- Emits the address of the next instruction
- To start/restart the program's execution:
PC = 0
- No jump:
PC++
- (Unconditional) goto:
PC = A
- Conditional goto: if the condition is true,
PC = A
, otherwisePC++
5.4 The Hack Computer
5.4.1 The Hack Computer Abstraction
Hack: a computer capable of running programs written in the Hack machine language, built from the Hack chip-set
5.4.2 The Hack CPU
The CPU executes the instruction according to the Hack language specifications:
- If the instruction includes
D
andA
(e.g.D=D-A
), the respective values are read from, and/or written to, the CPU-residentD
-register andA
-register - If the instruction is
@x
(e.g.@17
), thenx
is stored in theA
-register; the value is emitted byaddressM
- If the instruction's right hand side includes
M
(e.g.M=M+1
), this value is read frominM
- If the instruction's left hand side includes
M
(e.g.M=M+1
), the ALU output is emitted byoutM
, andwriteM
bit is asserted - If the
reset==0
, the CPU logic uses the instruction's jump bits and the ALU's output to decide if there should be a jump (andpc
will emit the updated PC value)- If there is a jump:
pc
is set to the value of theA
-register - If there is no jump:
pc++
- If there is a jump:
- If the
reset==1
,pc=0
(causing a program restart)
5.4.3 The Hack Memory
Hack Memory abstraction:
- Address
0
~16383
: data memory- RAM: 16-bit / 16K RAM chip
- Address
16384
~24575
: screen memory map- Screen: 16-bit / 8K memory chip with a raster display side-effect
- Address
24576
: keyboard memory map- Keybord: 16-bit register with a keyboard side-effect
Decimal | Binary | |
---|---|---|
0 | 0000 0000 0000 0000 | RAM |
... | ... | RAM |
16383 | 0011 1111 1111 1111 | RAM |
16384 | 0100 0000 0000 0000 | Screen |
... | ... | Screen |
24575 | 0101 1111 1111 1111 | Screen |
24576 | 0110 0000 0000 0000 | Keyboard |
5.4.4 Instruction Memory (ROM)
To run a program on the Hack computer:
- Load the program into the ROM
- Hardware implement: plug-and-play ROM chips
- Hardware simulation: programs are stored in text files; program loading is emulated by the built-in ROM chip
- Press "reset"
5.4.5 The Hack Computer Implementation
5.5 Project 5 Overview
5.5.1 Hardware Projects
5.5.2 CPU
Implementation tips:
- Chip-parts:
Mux16
,AResgiter
,DRegister
,PC
,ALU
, ... - Control: use HDL subscripting to route instruction bits to the control bits of the relevant chip-parts
5.5.3 Memory
Implementation tip: realize the abstraction
5.5.4 ROM
Implementation tip: using the built-in RAM32K chip