# sanctuary (working title)

sanctuary is a 64-bit subroutine-threaded forth for amd64 linux systems.

## stack effect notation

- `a`: memory address
- `c`: one byte value
- `n`: signed integer
- `u`: unsigned integer
- `?`: boolean flag
- `xt`: execution token
- `""`: string in input buffer
- `|`: 'or'

## glossary

the following is a list of words available in this forth.

### `!   ( u a -- )`
store the 64 bit value u into the memory address a.

### `#tib   ( -- a )`
variable containing the number of characters in the input buffer.

### `(header)   ( a u -- xt )`
create a dictionary header for a word named by the provided string.
this word does not set the code field.
it returns an incomplete xt and does not update `latest`.

### `(   ( -- ) IMMEDIATE`
start a comment which lasts until the next closing bracket.
if the unclosed bracket in the description above bothers you,
have a closing bracket: ).

### `*   ( u1 u2 -- u )`
multiply u1 and u2.

### `*/mod   ( n1 n2 n3 -- n4 n5 )`
multiply n1 and n2, then divide the product by n3.
the remainder is in n4, the quotient is in n5.

### `+   ( u1 u2 -- u )`
add u2 to u1.

### `+!   ( a -- )`
add one to the value at memory address a.

### `,   ( u -- )`
write a 64 bit value to user memory and increment the user memory pointer.

### `-   ( u1 u2 -- u )`
subtract u2 from u1.

### `-!   ( a -- )`
subtract one from the value at memory address a.
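
note that `+!` and `-!` here take only an address and step the stored value by one,
whereas the standard `+!` takes `( n a -- )`.
a minimal sketch of a counter built from them, assuming decimal number literals
are accepted in interpret mode:

```forth
here 0 ,     ( a: address of a fresh cell holding 0 )
dup +!       ( a: cell now holds 1 )
dup +!       ( a: cell now holds 2 )
dup -!       ( a: cell now holds 1 )
@            ( 1 )
```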

### `-rot   ( u1 u2 u3 -- u3 u1 u2 )`
rotate the three topmost values on the stack so that the topmost value
is moved to the third highest.

### `/mod   ( u1 u2 -- u3 u4 )`
divide u1 by u2. result is in u4, remainder is in u3.
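
a short worked example of the result order, assuming decimal number literals:
dividing 17 by 5 leaves the remainder 2 below the quotient 3.

```forth
17 5 /mod    ( 2 3: remainder u3, then result u4 on top )
swap drop    ( 3: keep only the result )
```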

### `[   ( -- ) IMMEDIATE`
set the system to interpret mode.

### `]   ( -- ) IMMEDIATE`
set the system to compiling mode.

### `:   ( "name" -- )`
start compilation of the word 'name'.

### `;   ( -- ) IMMEDIATE`
end compilation of the currently compiling word.
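
as an illustration of `:` and `;`, here is a minimal definition that uses only
words from this glossary. the name `square` is hypothetical and decimal number
literals are assumed:

```forth
( square a number: duplicate it, then multiply the two copies )
: square   ( u -- u' )   dup * ;

7 square   ( 49 )
```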

### `@   ( a -- u )`
fetch the 64 bit value at memory address a.

### `>body   ( xt -- a )`
yield the code field of xt.

### `>in   ( -- a )`
variable containing the index of the first unparsed character
in the input buffer.

### `>r   ( u -- ) ( R: -- u )`
move a value from the working stack to the return stack.

### `1+   ( u -- u' )`
add one to u.

### `1-   ( u -- u' )`
subtract one from u.

### `2drop   ( u1 u2 -- )`
remove the two topmost values from the stack.

### `2dup   ( u1 u2 -- u1 u2 u1 u2 )`
duplicate the two topmost values on the stack.

### `and   ( u1 u2 -- u )`
perform bitwise AND on u1 and u2.

### `brk@   ( -- a )`
yields the current program break.

### `bye   ( -- )`
exits the forth system.

### `c,   ( c -- )`
write an 8 bit value to user memory and increment the user memory pointer.

### `c!   ( u a -- )`
store the 8 bit value u into the memory address a.

### `c@   ( a -- c )`
fetch the 8 bit value at memory address a.

### `char   ( "c" -- c )`
yield the value of the first character of the next word in the input stream.

### `cmove   ( a1 a2 u -- )`
copy u bytes of memory from a1 to a2.
bytes are copied in low memory to high memory order.

### `cmove>   ( a1 a2 u -- )`
copy u bytes of memory from a1 to a2.
bytes are copied in high memory to low memory order.
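
the copy direction matters when the two regions overlap.
the sketch below is an illustration under the semantics described above, not a
tested transcript: it shifts three bytes up by one position inside the same buffer.

```forth
here                                          ( a: start of a scratch region )
char w c,  char x c,  char y c,  char z c,    ( memory at a: wxyz )
dup dup 1+ 3 cmove>                           ( a: copy 3 bytes from a to a+1, high to low )
4 type                                        ( should print wwxy )
```

with `cmove` in place of `cmove>`, the byte at a would be overwritten into a+1
before a+1 is read, so the same sequence would be expected to print `wwww`.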

### `dp   ( -- a )`
a variable that contains the address of the lowest free byte in user memory.

### `dp0   ( -- a )`
a variable that contains the address of the first byte of user memory.

### `dp$   ( -- a )`
a variable that contains the address of the last available byte of user memory.

### `drop   ( u -- )`
remove the value at the top of the stack.

### `dup   ( u -- u u )`
duplicate the value at the top of the stack.

### `executable   ( a u -- )`
marks the u bytes starting at address a as executable.
this is used primarily to mark the program break,
which is used as the user memory space.

### `find   ( a u -- a u 0 | xt -1 )`
look in the dictionary for the word named by the string at a (of u characters).
a zero is returned along with the original given string
if no word was found. if a word was found,
its xt is returned along with the true flag.
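
a small sketch of both outcomes, pairing `find` with `parse-name` to supply the
string. the name `nope` is hypothetical and assumed not to be defined:

```forth
parse-name swap find    ( xt -1: swap is in the dictionary )
2drop                   ( discard xt and flag )
parse-name nope find    ( a u 0: not found, the string is handed back )
drop 2drop              ( discard flag and string )
```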

### `grow   ( u -- )`
grows, and marks as executable, the user memory space by u bytes.

### `here   ( -- a )`
yields the address of the first available byte in user memory.
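
a minimal sketch of how `here` and `,` interact, assuming decimal number literals:
`,` writes one 64 bit cell at `here` and moves the user memory pointer 8 bytes forward.

```forth
here 42 ,        ( a: address of the cell just written )
dup @            ( a 42: the value can be fetched back )
drop             ( a )
here swap -      ( 8: the pointer advanced by one cell )
```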

### `immediate   ( -- )`
mark the most recently defined word as immediate.

### `immediate?   ( xt -- ? )`
true if xt is marked immediate, false otherwise.

### `interpret   ( -- )`
interprets the contents of the terminal input buffer
until it runs out.

### `invert   ( u -- u')`
invert all bits in u.

### `latest   ( -- a )`
a variable containing the execution token of
the most recently created word.

### `literal   ( n -- ) IMMEDIATE COMPILE-ONLY`
compile a push of the literal value n into the currently compiling word.
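
one common use of `literal` is to compute a constant once, at compile time.
a sketch, assuming `[` and `]` behave as described in this glossary and that
nothing is left on top of the computed value while compiling.
the name `page-size` is hypothetical:

```forth
( 64 64 * is evaluated once, while page-size is being compiled )
: page-size   ( -- u )   [ 64 64 * ] literal ;

page-size     ( 4096 )
```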

### `number   ( a u -- n -1 | 0 )`
convert given string into a number along with a flag.
if parsing a number fails then 0 (false) is returned
and no number is provided.

### `or   ( u1 u2 -- u )`
perform bitwise OR on u1 and u2.

### `parse   ( "name<c>" c -- a u )`
parse one word from the input buffer,
separated by a newline or the character c,
and return as a string.

### `parse-name   ( "<ws>name<ws>" -- a u )`
parse one whitespace-separated word from the input buffer,
and return as a string.
tabs (ascii 0x09), newlines (ascii 0x0a), and spaces (ascii 0x20)
are considered whitespace.

### `r>   ( -- u ) ( R: u -- )`
move a value from the return stack to the working stack.
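
a sketch of the usual pattern: park the top value on the return stack, work on
what lies beneath it, then bring it back. the name `third` is hypothetical.
`>r` and `r>` are kept balanced inside a single definition, which is assumed to
be required here as in most forths:

```forth
( copy the third value on the stack to the top )
: third   ( u1 u2 u3 -- u1 u2 u3 u1 )   >r over r> swap ;

1 2 3 third   ( 1 2 3 1 )
```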

### `rdrop   ( R: u -- )`
remove the value at the top of the return stack.

### `rot   ( u1 u2 u3 -- u2 u3 u1 )`
rotate the top three values on the stack so that the third highest value is moved to the top.

### `smudge   ( -- )`
toggles the smudge bit on the xt in latest.

### `state   ( -- a )`
a variable containing a boolean value.
if 0 (false), the system is in interpreting mode,
if -1 (true), the system is in compiling mode.

### `swap   ( u1 u2 -- u2 u1 )`
swap the two topmost values on the stack.

### `syscall0   ( rax -- u )`
perform the syscall with the id in `rax`,
and push the value of the `rax` register to the stack.

### `syscall1   ( rdi rax -- u )`
perform the syscall with the id in `rax`,
taking one parameter placed in `rdi`,
and push the value of the `rax` register to the stack.

### `syscall2   ( rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking two parameters placed in `rdi` and `rsi`,
and push the value of the `rax` register to the stack.

### `syscall3   ( rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking three parameters placed in `rdi`, `rsi` and `rdx`,
and push the value of the `rax` register to the stack.
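
a few hedged examples of the argument order, using the amd64 linux syscall
numbers for `write` (1), `getpid` (39) and `exit` (60).
the name `write-stdout` is hypothetical:

```forth
( write u bytes at a to stdout: rax = 1 is write, rdi = 1 is stdout )
: write-stdout   ( a u -- )   swap 1 1 syscall3 drop ;

( getpid takes no parameters )
39 syscall0      ( pid )
drop

( exit with status 0: rax = 60 is exit, rdi = 0 is the status )
0 60 syscall1
```

the last line ends the process, so nothing after it runs.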

### `tib   ( -- a )`
a variable containing the address of the current input buffer.

### `type   ( a u -- )`
write u characters at a to output.

### `over   ( u1 u2 -- u1 u2 u1 )`
copy the second-highest value on the stack and move it to the top of the stack.
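
the stack words in this glossary are enough to build other common shufflers.
a sketch, assuming `nip` and `tuck` are not already defined (they are not listed here):

```forth
( drop the second value, keeping the top )
: nip    ( u1 u2 -- u2 )         swap drop ;

( copy the top value underneath the second )
: tuck   ( u1 u2 -- u2 u1 u2 )   swap over ;

1 2 nip     ( 2 )
drop
1 2 tuck    ( 2 1 2 )
```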

### `xor   ( u1 u2 -- u )`
perform bitwise XOR on u1 and u2.

## dictionary format

note that the string length of one byte limits a word's name to 255 characters.

| field | size |
| :---- | :--- |
| link to previous word | 8 bytes |
| flag field | 1 byte |
| string length | 1 byte |
| string | <256 bytes |
| code | variable length |
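
assuming the fields are packed back to back with no padding (the table does not
say so, this is an assumption), the code of a word starts
8 + 1 + 1 + name length bytes after the start of its header.
the name `code-offset` is hypothetical:

```forth
( offset from the start of a header to its code, for a name of u characters )
: code-offset   ( u -- u' )   8 + 1+ 1+ ;

4 code-offset   ( 14: a four-character name such as swap )
```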

## reserved registers

the register `r15` is reserved for the parameter stack pointer.

## differences from standard forth

for the most part this forth intends to be in line with standards
but it diverges in a few notable places:

- the most visually obvious one by far:
    this forth uses lower case word names for core words.
- `find` takes `a u` instead of a counted string,
    and does not return 1 for immediate words.