# sanctuary (working title)

sanctuary is a 64-bit subroutine threaded forth for amd64 linux systems.

## stack effect notation
labels outside of the ones listed here are specific to a certain word's
documentation and will be obvious or documented in the description.

- `a`: memory address
- `c`: one byte value
- `n`: signed integer
- `u`: unsigned integer
- `?`: boolean flag
- `xt`: execution token
- `ht`: header token
- `""`: string in input buffer
- `|`: 'or'
- `,`: used to separate multiple stack effects when multiple are needed
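
as a reading aid, here is a small sketch of two definitions annotated with this notation. the names `square` and `larger` are hypothetical and not part of this forth; only words from the glossary below are used.

```forth
\ hypothetical words, shown only to illustrate the notation
: square   ( u -- u' )          dup * ;
: larger   ( n1 n2 -- n1 | n2 ) 2dup < if nip else drop then ;
```

`u'` marks a changed version of the input `u`, and `n1 | n2` reads as "either n1 or n2 is left on the stack".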

## glossary

the following is a list of words available in this forth.

### `!   ( u a -- )`
store the 64 bit value u into the memory address a.

### `#tib   ( -- a )`
variable containing the number of characters in the input buffer.

### `'   ( "word" -- xt )`
read a word from the input buffer,
push to the stack its execution token.

### `'h   ( "word" -- ht )`
read a word from the input buffer,
push to the stack its header token.

### `(   ( -- ) IMMEDIATE`
start a comment which lasts until the next closed bracket.
if the unclosed bracket in the description above bothers you,
have a closing bracket: ).

### `(0handler)   ( -- )`
the very early error handler, which simply quits the program.

### `(create)   ( -- )`
the default behaviour of a word made by `create`,
which simply pushes the address following the definition to the stack.
this messes with the return stack and is not meant to be called
outside of its specific context.

### `(does>)   ( -- )`
run non-default behaviour of a `create`d word.
pushes the data location onto the stack and calls the word
immediately following the `(does>)` call.
this messes with the return stack and is not meant to be called
outside of its specific context.

### `(header)   ( a u -- ht )`
create a dictionary header for a word named by the provided string.
this word does not set the code field.
this word does not update latest.

### `(hide)   ( ht -- )`
set the smudge bit on the header ht.

### `*   ( u1 u2 -- u )`
multiply u1 and u2.

### `*/mod   ( n1 n2 n3 -- n4 n5 )`
multiply n1 and n2, divide the result by n3.
remainder is in n4, quotient is in n5.

### `+   ( u1 u2 -- u )`
add u2 to u1.

### `+!   ( a -- )`
add one to the value at memory address a.

### `+to   ( comp: "name" -- | intr: u "name" -- ) IMMEDIATE`
compile or execute (depending on `state`) code to add u to the contents
of a `value`. (in compile mode u is whatever was on the stack already.)

### `,   ( u -- )`
write a 64 bit value to user memory and increment the user memory pointer.

### `-   ( u1 u2 -- u )`
subtract u2 from u1.

### `-!   ( a -- )`
subtract one from the value at memory address a.

### `-to   ( comp: "name" -- | intr: u "name" -- ) IMMEDIATE`
compile or execute (depending on `state`) code to subtract u from the contents
of a `value`. (in compile mode u is whatever was on the stack already.)

### `-rot   ( u1 u2 u3 -- u3 u1 u2 )`
rotate the three topmost values on the stack so that the topmost value
is moved to the third highest.

### `/mod   ( u1 u2 -- u3 u4 )`
divide u1 by u2. remainder is in u3, quotient is in u4.
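
a short sketch of the result order for `/mod` and `*/mod`, assuming both behave as documented here and the base is decimal:

```forth
7 2 /mod        \ pushes 1 3 : remainder 1, then quotient 3 on top
2drop
10 3 4 */mod    \ 10 * 3 = 30, 30 / 4 : pushes 2 7 (remainder, then quotient)
2drop
```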

### `[   ( -- ) IMMEDIATE`
set the system to interpret mode.

### `[']   ( "word" -- ) IMMEDIATE COMPILE-ONLY`
read a word from the input buffer,
compile into the current definition a stack push of the xt of the word.

### `[compile]   ( "word" -- ) IMMEDIATE COMPILE-ONLY`
compile into the current definition a call to a normally immediate word.

### `]   ( -- ) IMMEDIATE`
set the system to compiling mode.

### `:   ( "name" -- )`
start compilation of the word 'name'.

### `;   ( -- ) IMMEDIATE`
end compilation of the currently compiling word.

### `\   ( -- ) IMMEDIATE`
start a comment that lasts until the end of the current line.

### `@   ( a -- u )`
fetch the 64 bit value at memory address a.

### `=   ( n1 n2 -- ? )`
return true if n1 and n2 are equal.

### `<   ( n1 n2 -- ? )`
return true if n1 is less than n2.

### `<=   ( n1 n2 -- ? )`
return true if n1 is less than or equal to n2.

### `<>   ( n1 n2 -- ? )`
return true if n1 and n2 are not equal.

### `<mark   ( -- a )`
mark the destination of a backward branch.

### `<resolve   ( a -- )`
mark the source of a backward branch.

### `>   ( n1 n2 -- ? )`
return true if n1 is greater than n2.

### `>=   ( n1 n2 -- ? )`
return true if n1 is greater than or equal to n2.

### `>body   ( ht -- xt )`
yield the code field of header token.

### `>in   ( -- a )`
variable containing the index of the first unparsed character
in the input buffer.

### `>mark   ( -- a )`
mark the source of a forward branch.

### `>r   ( u -- ) ( R: -- u )`
move a value from the working stack to the return stack.

### `>resolve   ( a -- )`
mark the destination of a forward branch.

### `?branch   ( -- )`
compile into user memory an incomplete conditional branch.
if the value on the stack is zero the branch is taken.
a 32 bit branch offset must be written immediately after.

### `?dup   ( n -- 0 | n n )`
if n is not zero, perform `dup`.

### `?find   ( a u -- ht )`
look in the dictionary for the word a (of u characters).
if a word was found, its header token (link field address) is returned.
if no word was found or the string is of length zero, abort.

### `0=   ( n -- ? )`
return true if n is equal to zero.

### `0<   ( n -- ? )`
return true if n is less than zero.

### `0<=   ( n -- ? )`
return true if n is less than or equal to zero.

### `0<>   ( n -- ? )`
return true if n is not equal to zero.

### `0>   ( n -- ? )`
return true if n is greater than zero.

### `0>=   ( n -- ? )`
return true if n is greater than or equal to zero.

### `1+   ( u -- u' )`
add one to u.

### `1-   ( u -- u' )`
subtract one from u.

### `2drop   ( u1 u2 -- )`
remove the two topmost values from the stack.

### `2dup   ( u1 u2 -- u1 u2 u1 u2 )`
duplicate the two topmost values on the stack.

### `abort   ( -- )`
call the error handler
(the address of which is in the variable `handler`).

### `again   ( -- ) IMMEDIATE COMPILE-ONLY`
complete an infinite loop begun by the word `begin`.

### `allot   ( u -- )`
reserve u bytes of user memory.

### `and   ( u1 u2 -- u )`
perform bitwise AND on u1 and u2.

### `base   ( -- a )`
a variable containing the current numeric input/output base.
by default this is 10.

### `begin   ( -- ) IMMEDIATE COMPILE-ONLY`
mark the beginning of a begin-again, begin-until,
or begin-while-repeat loop.

### `binary   ( -- )`
set current base to binary.

### `branch   ( -- )`
compile into user memory an incomplete branch.
a 32 bit branch offset must be written immediately after.

### `brk@   ( -- a )`
yields current program break.

### `bye   ( -- )`
exits the forth system.

### `c,   ( c -- )`
write an 8 bit value to user memory and increment the user memory pointer.

### `c!   ( u a -- )`
store the 8 bit value u into the memory address a.

### `c@   ( a -- c )`
fetch the 8 bit value at memory address a.

### `cell+   ( u -- u' )`
increment u by the size of one cell.

### `cell-   ( u -- u' )`
decrement u by the size of one cell.

### `cells   ( u -- u' )`
convert a number of cells into the equivalent number of bytes.

### `char   ( "c" -- c )`
yield the value of the first character of the next word in the input stream.

### `cmove   ( a1 a2 u -- )`
copy u bytes of memory from a1 to a2.
bytes are copied in low memory to high memory order.

### `cmove,   ( a u -- )`
copy u bytes of memory from a to `here`, then increment `here` appropriately.
bytes are copied in low memory to high memory order.

### `cmove>   ( a1 a2 u -- )`
copy u bytes of memory from a1 to a2.
bytes are copied in high memory to low memory order.
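
the copy order matters when the two regions overlap. a sketch, assuming "high memory to low memory order" means the byte at the highest address is copied first; `buf` is a hypothetical name:

```forth
create buf   1 c, 2 c, 3 c, 4 c, 0 c,   \ hypothetical five-byte buffer
buf  buf 1+  4 cmove>    \ shift the first four bytes up by one address
buf 1+ c@                \ pushes 1
buf 4 + c@               \ pushes 4
2drop
```

under the same assumption, using `cmove` here would instead smear the first byte across the whole buffer, since each byte would be overwritten before it is read.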

### `compile,   ( xt -- )`
compile a call to xt into user memory.

### `compile-only   ( -- )`
mark the most recently defined word as compile-only.

### `compile-only?   ( ht -- ? )`
true if ht is marked compile-only, false otherwise.

### `constant   ( u "name" -- )`
create a word that pushes a cell value u to the stack.

### `create   ( "name" -- )`
create a word in the dictionary that, by default,
pushes the address directly following the header to the stack.
this behaviour can be modified with `does>`.
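
a minimal sketch of `create` used to build a small table, assuming data written with `,` lands directly after the created word as described; `primes` is a hypothetical name:

```forth
create primes   2 , 3 , 5 , 7 ,   \ hypothetical table of four cells
primes @                \ pushes the first cell: 2
primes 2 cells + @      \ pushes the third cell: 5
2drop
```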

### `d,   ( n -- )`
write a 32 bit value to user memory and increment the user memory pointer.

### `d!   ( u a -- )`
store the 32 bit value u into the memory address a.

### `decimal   ( -- )`
set current base to decimal.

### `does>   ( -- )`
modify the behaviour of the most recent `create`d word.
(non-`create`d words will be corrupted.)

### `dp   ( -- a )`
a variable that contains the address of the lowest free byte in user memory.

### `dp0   ( -- a )`
a variable that contains the address of the first byte of user memory.

### `dp$   ( -- a )`
a variable that contains the address of the last available byte of user memory.

### `drop   ( u -- )`
remove the value at the top of the stack.

### `dup   ( u -- u u )`
duplicate the value at the top of the stack.

### `else   ( -- ) IMMEDIATE COMPILE-ONLY`
update the current if statement to branch here
when the flag is false,
and skip to `then` if the corresponding `if` was true.

### `emit   ( c -- )`
print the single character c to output.

### `executable   ( a u -- )`
marks the u bytes starting at address a as executable.
this is used primarily to mark the program break,
which is used as the user memory space.

### `execute   ( xt -- )`
call the word xt.
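
a sketch of `'` and `execute` working together, assuming both behave as documented; `star` and `twice` are hypothetical names:

```forth
: star    42 emit ;                     \ hypothetical word: prints *
' star execute                          \ same effect as typing  star
: twice   ( xt -- )   dup execute execute ;
' star twice                            \ prints **
```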

### `false   ( -- u )`
a cell with no bits set.

### `find   ( a u -- a u 0 | a -1 )`
look in the dictionary for the word a (of u characters).
a zero is returned along with the original given string
if no word was found. if a word was found,
its link field address is returned along with the true flag.

### `grow   ( u -- )`
grows, and marks as executable, the user memory space by u bytes.

### `handler   ( -- a )`
variable containing the address of the current error handler.

### `here   ( -- a )`
yields the address of the first available byte in user memory.

### `hex   ( -- )`
set current base to hexadecimal.

### `hide   ( "word" -- )`
set the smudge bit on the given word.

### `hijacks   ( xt "word" -- )`
'hijack' an existing definition to perform the action of xt.
this word *will* corrupt the dictionary if used outside
its very specific context (replacing core assembly words
with better versions in forth), so it should be avoided
in favour of `defer` and friends.

### `if   ( ? -- ) IMMEDIATE COMPILE-ONLY`
if the flag is true, execute the following if statement,
terminated by `else` or `then`.
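
a small sketch of an if-else-then statement, using only words from this glossary; `parity` is a hypothetical name, and 101 and 111 are the ascii codes for `e` and `o`:

```forth
: parity   ( u -- )   1 and if 111 emit else 101 emit then ;
4 parity    \ prints e
7 parity    \ prints o
```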

### `immediate   ( -- )`
mark the most recently defined word as immediate.

### `immediate?   ( ht -- ? )`
true if ht is marked immediate, false otherwise.

### `interpret   ( -- )`
interprets the contents of the terminal input buffer
until it runs out.

### `invert   ( u -- u' )`
invert all bits in u.

### `latest   ( -- a )`
a variable containing the execution token of
the most recently created word.

### `literal   ( n -- ) IMMEDIATE COMPILE-ONLY`
compile a push of the literal value n into the currently compiling word.
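
a sketch of computing a value once at compile time by dropping into interpret mode with `[` and `]`; `minute` is a hypothetical name:

```forth
\ the product is computed while compiling, then compiled as a single push
: minute   ( -- u )   [ 6 10 * ] literal ;   \ behaves like  : minute 60 ;
```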

### `mmap   ( offset fd flags prot u a -- u )`
perform a mmap(2) system call.

### `munmap   ( u a -- u )`
perform a munmap(2) system call.

### `nip   ( u1 u2 -- u2 )`
drop the second-highest value from the stack.

### `number   ( a u -- n -1 | 0 )`
convert given string into a number along with a flag.
if parsing a number fails then 0 (false) is returned
and no number is provided.

### `octal   ( -- )`
set current base to octal.

### `or   ( u1 u2 -- u )`
perform bitwise OR on u1 and u2.

### `over   ( u1 u2 -- u1 u2 u1 )`
copy the second-highest value on the stack and move it to the top of the stack.

### `parse   ( "name<c>" c -- a u )`
parse one word from the input buffer,
separated by a newline or the character c,
and return as a string.

### `parse-name   ( "<ws>name<ws>" -- a u )`
parse one whitespace-separated word from the input buffer,
and return as a string.
tabs (ascii 0x09), newlines (ascii 0x0a), and spaces (ascii 0x20)
are considered whitespace.

### `postpone   ( "name" -- ) IMMEDIATE COMPILE-ONLY`
compile the execution behaviour of a word into the current definition.
if the word is immediate, that will execute the word at runtime
(like `[compile]`).
if the word is not immediate, this will compile code that compiles that word.
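
a sketch of the immediate-word case, assuming `postpone` behaves as described; `endif` and `clamp0` are hypothetical names:

```forth
: endif    postpone then ; immediate       \ hypothetical alias for then
: clamp0   ( n -- n' )   dup 0< if drop 0 endif ;
```

when `clamp0` is compiled, `endif` runs immediately and performs the compile-time work of `then`.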

### `private{   ( -- )`
mark the start of a private section closed by `}private`
and activated with `privatise`.

### `}private   ( -- )`
mark the end of a private section opened by `private{`
and activated with `privatise`.

### `privatise   ( -- )`
activate a private section.

### `r>   ( -- u ) ( R: u -- )`
move a value from the return stack to the working stack.
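
a sketch of using the return stack as temporary storage inside a definition; `third` is a hypothetical name. every `>r` is paired with an `r>` before the word returns:

```forth
: third   ( u1 u2 u3 -- u1 u2 u3 u1 )   >r over r> swap ;
```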

### `rdrop   ( R: u -- )`
remove the value at the top of the return stack.

### `repeat   ( -- ) IMMEDIATE COMPILE-ONLY`
in a begin-while-repeat loop, loop back to the condition.

### `rot   ( u1 u2 u3 -- u2 u3 u1 )`
rotate the top three values on the stack so that the third highest value is moved to the top.

### `rp   ( -- a )`
yield the address of the return pointer.
note that the address points to the return stack *before*
this word was called.

### `s"   ( "string" -- , COMPILES: -- a u ) IMMEDIATE COMPILE-ONLY`
compile into the definition code to push the given string,
terminated by a double quote.
the string data and length are stored inline in the definition.
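
a minimal sketch, assuming the string starts after the single space following `s"`; `greet` is a hypothetical name:

```forth
: greet   s" hello, world" type  10 emit ;   \ 10 is the ascii newline
greet
```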

### `s>z   ( a u -- a )`
compile into user memory a copy of the given regular string
converted to a null-terminated string.

### `smudge   ( -- )`
toggles the smudge bit on the xt in latest.

### `sp   ( -- a )`
yield the address of the stack pointer.
note that the address points to the stack *before*
this value is pushed.

### `state   ( -- a )`
a variable containing a boolean value.
if 0 (false), the system is in interpreting mode,
if -1 (true), the system is in compiling mode.

### `stderr   ( -- 2 )`
push the file descriptor of stderr to the stack.

### `stdin   ( -- 0 )`
push the file descriptor of stdin to the stack.

### `stdout   ( -- 1 )`
push the file descriptor of stdout to the stack.

### `swap   ( u1 u2 -- u2 u1 )`
swap the two topmost values on the stack.

### `sys-read   ( u a fd -- n )`
perform a `read(2)` system call, reading into the buffer `u a`
from file descriptor `fd`.
n is the resulting value of the register `rax`.

### `sys-write   ( u a fd -- n )`
perform a `write(2)` system call, writing the string `u a`
to file descriptor `fd`.
n is the resulting value of the register `rax`.
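
note that the stack effect above puts the length below the address, the reverse of the `a u` pair produced by `s"`. a sketch under that reading; `say` is a hypothetical name:

```forth
\ swap turns the  a u  pair from s" into the  u a  order sys-write expects
: say   ( -- )   s" hi" swap stdout sys-write drop ;
say
```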

### `syscall0   ( rax -- u )`
perform the syscall with the id in `rax`,
and push the value of the `rax` register to the stack.

### `syscall1   ( rdi rax -- u )`
perform the syscall with the id in `rax`,
taking one parameter placed in `rdi`,
and push the value of the `rax` register to the stack.

### `syscall2   ( rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking two parameters placed in `rdi` and `rsi`,
and push the value of the `rax` register to the stack.

### `syscall3   ( rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking three parameters placed in `rdi`, `rsi` and `rdx`,
and push the value of the `rax` register to the stack.

### `syscall4   ( r10 rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking four parameters placed in `rdi`, `rsi`, `rdx` and `r10`,
and push the value of the `rax` register to the stack.

### `syscall5   ( r8 r10 rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking five parameters placed in `rdi`, `rsi`, `rdx`, `r10` and `r8`,
and push the value of the `rax` register to the stack.

### `syscall6   ( r9 r8 r10 rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`,
taking six parameters placed in `rdi`, `rsi`, `rdx`, `r10`, `r8` and `r9`,
and push the value of the `rax` register to the stack.
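
a sketch of the syscall words in use. the syscall numbers (39 for getpid, 1 for write, 60 for exit) are the x86-64 linux ones and are not taken from this document; `raw-say` and `exit-with` are hypothetical names, and the base is assumed to be decimal:

```forth
39 syscall0 drop                            \ getpid(2): pushes the process id
\ write(2): rdx = length, rsi = address, rdi = fd 1, rax = 1
: raw-say     ( -- )          s" hi" swap 1 1 syscall3 drop ;
\ exit(2): rdi = status, rax = 60
: exit-with   ( status -- )   60 syscall1 ;
```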

### `then   ( -- ) IMMEDIATE COMPILE-ONLY`
conclude an if statement.

### `tib   ( -- a )`
a variable containing the address of the current input buffer.

### `to   ( comp: "name" -- | intr: u "name" -- ) IMMEDIATE`
compile or execute (depending on `state`) code to modify the contents
of a `value`.

### `true   ( -- u )`
a cell with all bits set.

### `tuck   ( u1 u2 -- u2 u1 u2 )`
place a copy of the highest value on the stack
below the second highest value on the stack.

### `type   ( a u -- )`
write u characters at a to output.

### `u<   ( u1 u2 -- ? )`
return true if u1 is less than u2.

### `u<=   ( u1 u2 -- ? )`
return true if u1 is less than or equal to u2.

### `u<>   ( u1 u2 -- ? )`
return true if u1 and u2 are not equal.

### `u>   ( u1 u2 -- ? )`
return true if u1 is greater than u2.

### `u>=   ( u1 u2 -- ? )`
return true if u1 is greater than or equal to u2.

### `until   ( ? -- ) IMMEDIATE COMPILE-ONLY`
if the given flag is true, loop back to `begin`.

### `value   ( u "name" -- )`
create a value called name, the initial value of which is u.
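
a sketch of a value together with `to`, `+to` and `-to` in interpret mode, assuming that executing a value's name pushes its current contents; `limit` is a hypothetical name:

```forth
10 value limit    \ hypothetical value, initially 10
5 +to limit       \ now 15
3 -to limit       \ now 12
limit drop        \ pushes 12 under the assumption above
100 to limit      \ now 100
```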

### `variable   ( "name" -- )`
create a variable word, which yields an address that can be written and read.
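
a sketch of a variable used as a counter, assuming `+!` and `-!` take only an address, as documented in this glossary; `hits` is a hypothetical name:

```forth
variable hits       \ hypothetical name
0 hits !            \ store an explicit initial value
hits +!  hits +!    \ each call adds one to the stored value
hits -!             \ subtracts one again
hits @ drop         \ pushes 1
```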

### `while   ( ? -- ) IMMEDIATE COMPILE-ONLY`
if the given flag is true, continue the current begin-while-repeat loop,
otherwise branch to just after `repeat`.
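
a sketch of a begin-while-repeat loop that counts down to zero; `stars` is a hypothetical name:

```forth
: stars   ( u -- )
  begin dup 0> while    \ keep going while the counter is positive
    42 emit  1-         \ print * and decrement the counter
  repeat drop ;         \ discard the exhausted counter
3 stars                 \ prints ***
```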

### `xor   ( u1 u2 -- u )`
perform bitwise XOR on u1 and u2.

### `z"   ( "string" -- , COMPILES: -- a ) IMMEDIATE COMPILE-ONLY`
compile into the definition code to push the given string,
terminated by a double quote.
the string is null terminated and does not store a length;
this is meant for interfacing with the linux system.

### `zstrlen   ( a -- u )`
the length of a null terminated string in bytes.
the ending null byte is not counted.

## dictionary format

note that the string length of one byte limits a word's name to 255 characters.

| field | size |
| :---- | :--- |
| link to previous word | 8 bytes |
| flag field | 1 byte |
| string length | 1 byte |
| string | <256 bytes |
| code | variable length |
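
a sketch of reading a header from forth. it rests on assumptions not confirmed above: that a header token is the address of the link field, and that the fields are packed in the order of the table (link at +0, flags at +8, length at +9, name at +10). `name>string` and `previous` are hypothetical names.

```forth
\ offsets assume the packed layout described in the lead-in
: name>string   ( ht -- a u )   dup 10 + swap 9 + c@ ;
: previous      ( ht -- ht' )   @ ;    \ follow the link to the previous word
'h dup name>string type                \ would print dup under these assumptions
```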

## reserved registers

the register `r15` is reserved for the parameter stack pointer.

## differences from standard forth

for the most part this forth intends to be in line with standards
but it diverges in a few notable places:

- the most visually obvious one by far,
    this forth uses lower case word names for core words.
- `find` takes `a u` instead of a counted string,
    and does not return 1 for immediate words.