# sanctuary (working title)

sanctuary is a 64-bit subroutine-threaded forth system
for amd64 linux systems.

## stack effect notation

- `a`: memory address
- `c`: one byte value
- `n`: signed integer
- `u`: unsigned integer
- `?`: boolean flag
- `xt`: execution token
- `""`: string in input buffer
- `|`: 'or'

## glossary

the following is a list of words available in this forth.

### `#tib   ( -- a )`
variable containing the number of characters in the input buffer.

### `(header)   ( a u -- xt )`
create a dictionary header for a word whose name is the given string.
this word does not set the code field;
it returns an incomplete xt and does not update latest.

### `[   ( -- ) IMMEDIATE`
set the system to interpret mode.

### `]   ( -- ) IMMEDIATE`
set the system to compiling mode.

### `:   ( "name" -- )`
start compilation of the word 'name'.

### `;   ( -- ) IMMEDIATE`
end compilation of the currently compiling word.

### `-rot   ( u1 u2 u3 -- u3 u1 u2 )`
rotate the three topmost values on the stack so that the topmost value
is moved to the third-highest position.

### `>body   ( xt -- a )`
yield the code field of xt.

### `>in   ( -- a )`
variable containing the index of the first unparsed character
in the input buffer.

### `>r   ( u -- ) ( R: -- u )`
move a value from the working stack to the return stack.

### `2drop  ( u1 u2 -- )`
remove the two topmost values from the stack.

### `2dup   ( u1 u2 -- u1 u2 u1 u2 )`
duplicate the two topmost values on the stack.

### `brk@   ( -- a )`
yields the current program break.

### `bye   ( -- )`
exits the forth system.

### `dp   ( -- a )`
a variable that contains the address of the lowest free byte in user memory.

### `dp0   ( -- a )`
a variable that contains the address of the first byte of user memory.

### `dp$   ( -- a )`
a variable that contains the address of the last available byte of user memory.

### `drop   ( u -- )`
remove the value at the top of the stack.

### `dup   ( u -- u u )`
duplicate the value at the top of the stack.

### `executable   ( a u -- )`
marks the u bytes starting at address a as executable.
this is used primarily to mark the program break,
which is used as the user memory space.

### `find   ( a u -- a u 0 | xt -1 )`
look up the string at a (u characters long) in the dictionary.
if no word is found, the original string is returned
along with 0 (false). if a word is found,
its xt is returned along with -1 (true).
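
the two possible results can be modelled in a few lines of python. this is only a sketch of the stack effect: `memory`, `words` and the xt values are made up for illustration, and exact byte-for-byte name matching is an assumption.

```python
TRUE, FALSE = -1, 0

def find(memory, a, u, words):
    """Model of find ( a u -- a u 0 | xt -1 ).

    memory: bytes holding the name, with a used as an index into it.
    words:  hypothetical mapping of name -> xt, standing in for the dictionary.
    Returns the values left on the stack, bottom first.
    """
    name = bytes(memory[a:a + u])
    if name in words:
        return (words[name], TRUE)    # found: the string is consumed
    return (a, u, FALSE)              # not found: the string is handed back

memory = b"dup swap"
words = {b"dup": 0x1000, b"swap": 0x1040}   # made-up xts
print(find(memory, 0, 3, words))   # (4096, -1)
print(find(memory, 4, 3, words))   # (4, 3, 0), because "swa" is not defined
```

because the string survives a failed lookup, the caller can pass it straight on to another word without saving it first.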

### `grow   ( u -- )`
grows, and marks as executable, the user memory space by u bytes.

### `here   ( -- a )`
yields the address of the first available byte in user memory.

### `immediate?   ( xt -- ? )`
true if xt is marked immediate, false otherwise.

### `interpret   ( -- )`
interprets the contents of the terminal input buffer
until it runs out.
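
for readers new to forth, here is a rough python model of the classic outer-interpreter loop. it is a sketch of the usual shape, not a transcription of this system's `interpret`: the arguments, the `("call", ...)` and `("literal", ...)` tuples, decimal-only numbers and the error raised for unknown words are all assumptions.

```python
def interpret(source, words, immediate, stack, compiled, state):
    """Process every whitespace-separated token in source.

    words:     name -> Python callable taking the stack (stand-in for an xt)
    immediate: set of names marked immediate
    compiled:  list that receives compiled items
    state:     dict with a "compiling" flag (stand-in for the state variable)
    """
    for name in source.split():
        if name in words:
            if state["compiling"] and name not in immediate:
                compiled.append(("call", name))   # compile a call to the word
            else:
                words[name](stack)                # execute it right away
            continue
        try:
            n = int(name)                         # not a word: try a number
        except ValueError:
            raise LookupError(f"unknown word: {name}") from None
        if state["compiling"]:
            compiled.append(("literal", n))       # compile a push of n
        else:
            stack.append(n)                       # push n now

def dup(stack):
    stack.append(stack[-1])

stack, compiled = [], []
interpret("7 dup", {"dup": dup}, set(), stack, compiled, {"compiling": False})
print(stack)      # [7, 7]
```

with `"compiling"` set to true, the same input leaves the stack untouched and fills `compiled` instead.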

### `latest   ( -- a )`
a variable containing the execution token of
the most recently created word.

### `literal   ( n -- ) IMMEDIATE COMPILE-ONLY`
compile a push of the literal value n into the currently compiling word.

### `number   ( a u -- n -1 | 0 )`
convert the given string into a number.
on success the number is returned along with -1 (true).
if parsing fails, only 0 (false) is returned.
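
a small python model of the two result shapes. it assumes decimal input with an optional leading `-`, which this readme does not actually specify, and it ignores 64-bit overflow.

```python
def number(text):
    """Model of number ( a u -- n -1 | 0 ), decimal only (an assumption).

    Returns the values left on the stack, bottom first.
    """
    digits = text[1:] if text.startswith("-") else text
    if not digits or not all("0" <= ch <= "9" for ch in digits):
        return (0,)             # parsing failed: only the false flag
    return (int(text), -1)      # the number, then the true flag

print(number("42"))    # (42, -1)
print(number("-7"))    # (-7, -1)
print(number("4x"))    # (0,)
```

note that the failure case leaves one value on the stack and the success case leaves two, so the flag has to be tested before the number is touched.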

### `parse   ( "name<c>" c -- a u )`
parse one word from the input buffer,
delimited by a newline or the character c,
and return it as a string.

### `parse-name   ( "<ws>name<ws>" -- a u )`
parse one whitespace-delimited word from the input buffer,
and return it as a string.
tabs (ascii 0x09), newlines (ascii 0x0a), and spaces (ascii 0x20)
are considered whitespace.
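
the scan can be sketched in python as below. `tib` is a bytes object standing in for the input buffer, `to_in` plays the role of `>in`, and `len(tib)` plays the role of `#tib`. stepping past the trailing delimiter is an assumption borrowed from common forth practice.

```python
WHITESPACE = b"\t\n "   # tab, newline, space

def parse_name(tib, to_in):
    """Model of parse-name; returns (a, u, new_to_in), a being an index into tib."""
    n = len(tib)
    while to_in < n and tib[to_in] in WHITESPACE:
        to_in += 1                  # skip leading whitespace
    a = to_in
    while to_in < n and tib[to_in] not in WHITESPACE:
        to_in += 1                  # consume the name itself
    u = to_in - a
    if to_in < n:
        to_in += 1                  # step past the delimiter (assumption)
    return a, u, to_in

tib = b"  dup\tswap\n"
print(parse_name(tib, 0))    # (2, 3, 6)
print(parse_name(tib, 6))    # (6, 4, 11)
print(parse_name(tib, 11))   # (11, 0, 11), nothing left to parse
```

a zero-length result is how the caller can tell that the buffer has run out.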

### `r>   ( -- u ) ( R: u -- )`
move a value from the return stack to the working stack.

### `rdrop   ( R: u -- )`
remove the value at the top of the return stack.
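
the three return stack words (`>r`, `r>` and `rdrop`) can be pictured with two python lists. this is only a model of the stack effects; it says nothing about how the real return stack holds return addresses.

```python
def to_r(data, ret):
    """>r  ( u -- ) ( R: -- u )"""
    ret.append(data.pop())

def r_from(data, ret):
    """r>  ( -- u ) ( R: u -- )"""
    data.append(ret.pop())

def rdrop(ret):
    """rdrop  ( R: u -- )"""
    ret.pop()

data, ret = [1, 2], []   # tops of both stacks are at the right
to_r(data, ret)          # data: []... not yet, data: [1]   ret: [2]
to_r(data, ret)          # data: []    ret: [2, 1]
r_from(data, ret)        # data: [1]   ret: [2]
rdrop(ret)               # data: [1]   ret: []
print(data, ret)         # [1] []
```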

### `rot   ( u1 u2 u3 -- u2 u3 u1 )`
rotate the top three values on the stack so that the third highest value is moved to the top.
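
`rot` and `-rot` are easy to mix up, so here is a python model of both, with the stack as a list whose top is the right-hand end. the function names are just for illustration.

```python
def rot(stack):
    """( u1 u2 u3 -- u2 u3 u1 ): the third item moves to the top."""
    u1, u2, u3 = stack[-3:]
    stack[-3:] = [u2, u3, u1]

def minus_rot(stack):
    """( u1 u2 u3 -- u3 u1 u2 ): the top item moves to third place."""
    u1, u2, u3 = stack[-3:]
    stack[-3:] = [u3, u1, u2]

stack = [1, 2, 3]
rot(stack)
print(stack)        # [2, 3, 1]
minus_rot(stack)
print(stack)        # [1, 2, 3]
```

the second call restores the original order, which shows that `-rot` undoes `rot`.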

### `smudge   ( -- )`
toggles the smudge bit on the xt in latest.

### `state   ( -- a )`
a variable containing a boolean value.
if 0 (false), the system is in interpret mode;
if -1 (true), it is in compiling mode.

### `swap  ( u1 u2 -- u2 u1 )`
swap the two topmost values on the stack.

### `syscall0   ( rax -- u )`
perform the syscall with the id in `rax`,
and push the value of the `rax` register to the stack.

### `syscall1   ( rdi rax -- u )`
perform the syscall with the id in `rax`,
taking one parameter placed in `rdi`,
and push the value of the `rax` register to the stack.

### `syscall2   ( rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking two parameters placed in `rdi` and `rsi`,
and push the value of the `rax` register to the stack.

### `syscall3   ( rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking three parameters placed in `rdi`, `rsi` and `rdx`,
and push the value of the `rax` register to the stack.
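
the parameter order in the `syscallN` stack effects is the reverse of the usual c argument order, since the syscall id sits on top of the stack. the python helper below (a hypothetical name, not part of this system) pops values the same way and shows which register each one lands in.

```python
def syscall_registers(stack, nargs):
    """Pop a syscall id and nargs parameters; return the register assignment."""
    regs = {"rax": stack.pop()}                 # the id is on top of the stack
    for reg in ("rdi", "rsi", "rdx")[:nargs]:
        regs[reg] = stack.pop()                 # first parameter is next, and so on
    return regs

# write(fd=1, buf=0x4000, count=5): push count, then buf, then fd, then the id.
# syscall id 1 is write on amd64 linux; 0x4000 is a made-up buffer address.
stack = [5, 0x4000, 1, 1]
print(syscall_registers(stack, 3))
# {'rax': 1, 'rdi': 1, 'rsi': 16384, 'rdx': 5}
```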

### `tib   ( -- a )`
a variable containing the address of the current input buffer.

### `type   ( a u -- )`
write u characters at a to output.

### `over   ( u1 u2 -- u1 u2 u1 )`
copy the second-highest value on the stack to the top of the stack.

## dictionary format

note that the one-byte string length field limits a word's name to 255 characters.

| field | size |
| :---- | :--- |
| link to previous word | 8 bytes |
| flag field | 1 byte |
| string length | 1 byte |
| string | <256 bytes |
| code | variable length |
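
the layout in the table can be tried out with python's `struct` module. this is a sketch under assumptions: the link is packed little-endian (as on amd64), there is no padding between fields, and the code starts directly after the name. `make_header` and `read_header` are illustrative names only.

```python
import struct

HEADER = "<QBB"   # 8-byte link, 1-byte flag field, 1-byte string length

def make_header(link, flags, name):
    """Pack the fixed fields followed by the name bytes."""
    if len(name) > 255:
        raise ValueError("name does not fit a one-byte length")
    return struct.pack(HEADER, link, flags, len(name)) + name

def read_header(memory, addr):
    """Unpack a header at addr; returns (link, flags, name, code_addr)."""
    link, flags, length = struct.unpack_from(HEADER, memory, addr)
    start = addr + struct.calcsize(HEADER)
    name = bytes(memory[start:start + length])
    return link, flags, name, start + length

header = make_header(0, 0, b"dup")
print(len(header))               # 13, that is 8 + 1 + 1 + 3
print(read_header(header, 0))    # (0, 0, b'dup', 13)
```

the last value returned by `read_header` is where the code field would begin under these assumptions.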

## reserved registers

the register `r15` is reserved for the parameter stack pointer.

## differences from standard forth

for the most part this forth intends to be in line with standards
but it diverges in a few notable places:

- the most visually obvious one by far:
    this forth uses lower case word names for core words.
- `find` takes `a u` instead of a counted string,
    and does not return 1 for immediate words.