# sanctuary (working title)

sanctuary is a 64-bit subroutine threaded forth for amd64 linux systems.

## stack effect notation

- `a`: memory address
- `c`: one byte value
- `n`: signed integer
- `u`: unsigned integer
- `?`: boolean flag
- `xt`: execution token
- `ht`: header token
- `""`: string in input buffer
- `|`: 'or'

## Glossary

the following is a list of words available in this forth.

### `! ( u a -- )`
store the 64 bit value u into the memory address a.

### `#tib ( -- a )`
variable containing the amount of characters in the input buffer.

### `' ( "word" -- xt )`
read a word from the input buffer, push to the stack its execution token.

### `'h ( "word" -- ht )`
read a word from the input buffer, push to the stack its header token.

### `( ( -- ) IMMEDIATE`
start a comment which lasts until the next closed bracket. if the unclosed bracket in the description above bothers you, have a closing bracket: ).

### `(0handler) ( -- )`
the very early error handler, which simply quits the program.

### `(create) ( -- )`
the default behaviour of a word made by `create`, which simply pushes the address following the definition to the stack. this messes with the return stack and is not meant to be called outside of its specific context.

### `(does>) ( -- )`
run non-default behaviour of a `create`d word. pushes the data location onto the stack and calls the word immediately following the `(does>)` call. this messes with the return stack and is not meant to be called outside of its specific context.

### `(header) ( a u -- ht )`
create a dictionary header for a word named the provided string. this word does not set the code field. this word does not update latest.

### `* ( u1 u2 -- u )`
multiply u1 and u2.

### `*/mod ( n1 n2 n3 -- n4 n5 )`
multiply n1 and n2, divide the result by n3. remainder is in n4, result is in n5.

### `+ ( u1 u2 -- u )`
add u2 to u1.

### `+! ( a -- )`
add one to the value at memory address a.
### `+to ( comp: "name" -- | intr: u "name" -- ) IMMEDIATE`
compile or execute (depending on `state`) code to add u to the contents of a `value`. (in compile mode u is whatever was on the stack already.)

### `, ( u -- )`
write a 64 bit value to user memory and increment the user memory pointer.

### `- ( u1 u2 -- u )`
subtract u2 from u1.

### `-! ( a -- )`
subtract one from the value at memory address a.

### `-to ( comp: "name" -- | intr: u "name" -- ) IMMEDIATE`
compile or execute (depending on `state`) code to subtract u from the contents of a `value`. (in compile mode u is whatever was on the stack already.)

### `-rot ( u1 u2 u3 -- u3 u1 u2 )`
rotate the three topmost values on the stack so that the topmost value is moved to the third highest.

### `/mod ( u1 u2 -- u3 u4 )`
divide u1 by u2. result is in u4, remainder is in u3.

### `[ ( -- ) IMMEDIATE`
set the system to interpret mode.

### `['] ( "word" -- ) IMMEDIATE COMPILE-ONLY`
read a word from the input buffer, compile into the current definition a stack push of the xt of the word.

### `[compile] ( "word" -- ) IMMEDIATE COMPILE-ONLY`
compile into the current definition a call to a normally immediate word.

### `] ( -- ) IMMEDIATE`
set the system to compiling mode.

### `: ( "name" -- )`
start compilation of the word 'name'.

### `; ( -- ) IMMEDIATE`
end compilation of the currently compiling word.

### `\ ( -- ) IMMEDIATE`
start a comment that lasts until the end of the current line.

### `@ ( a -- u )`
fetch the 64 bit value at memory address a.

### `= ( n1 n2 -- ? )`
return true if n1 and n2 are equal.

### `< ( n1 n2 -- ? )`
return true if n1 is less than n2.

### `<= ( n1 n2 -- ? )`
return true if n1 is less than or equal to n2.

### `<> ( n1 n2 -- ? )`
return true if n1 and n2 are not equal.

### `> ( n1 n2 -- ? )`
return true if n1 is greater than n2.

### `>= ( n1 n2 -- ? )`
return true if n1 is greater than or equal to n2.

### `>body ( ht -- xt )`
yield the code field of header token.
### `>in ( -- a )`
variable containing the index of the first unparsed character in the input buffer.

### `>mark ( -- a )`
mark the source of a forward branch.

### `>r ( u -- ) ( R: -- u )`
move a value from the working stack to the return stack.

### `>resolve ( a -- )`
mark the destination of a forward branch.

### `?branch ( -- )`
compile into user memory an incomplete conditional branch. if the value on the stack is zero the branch is taken. a 32 bit branch offset must be written immediately after.

### `?dup ( n -- 0 | n n )`
if n is not zero, perform `dup`.

### `?find ( a u -- ht )`
look in the dictionary for the word a (of u characters). if a word was found, its link field address is returned. if no word was found or the string is of length zero, abort.

### `0= ( n -- ? )`
return true if n is equal to zero.

### `0< ( n -- ? )`
return true if n is less than zero.

### `0<= ( n -- ? )`
return true if n is less than or equal to zero.

### `0<> ( n -- ? )`
return true if n is not equal to zero.

### `0> ( n -- ? )`
return true if n is greater than zero.

### `0>= ( n -- ? )`
return true if n is greater than or equal to zero.

### `1+ ( u -- u' )`
add one to u.

### `1- ( u -- u' )`
subtract one from u.

### `2drop ( u1 u2 -- )`
remove the two topmost values from the stack.

### `2dup ( u1 u2 -- u1 u2 u1 u2 )`
duplicate the two topmost values on the stack.

### `abort ( -- )`
call the error handler (the address of which is in the variable `handler`).

### `again ( -- ) IMMEDIATE COMPILE-ONLY`
complete an infinite loop begun by the word `begin`.

### `allot ( u -- )`
reserve u bytes of user memory.

### `and ( u1 u2 -- u )`
perform bitwise AND on u1 and u2.

### `base ( -- a )`
a variable containing the current numeric input/output base. by default this is 10.

### `begin ( -- ) IMMEDIATE COMPILE-ONLY`
mark the beginning of a begin-again, begin-until, or begin-while-repeat loop.

### `binary ( -- )`
set current base to binary.
### `branch ( -- )`
compile into user memory an incomplete branch. a 32 bit branch offset must be written immediately after.

### `brk@ ( -- a )`
yields current program break.

### `bye ( -- )`
exits the forth system.

### `c, ( c -- )`
write an 8 bit value to user memory and increment the user memory pointer.

### `c! ( u a -- )`
store the 8 bit value u into the memory address a.

### `c@ ( a -- c )`
fetch the 8 bit value at memory address a.

### `cell+ ( u -- u' )`
increment u by the size of one cell.

### `cell- ( u -- u' )`
decrement u by the size of one cell.

### `cells ( u -- u' )`
transform an amount of cells into an amount of bytes.

### `char ( "c" -- c )`
yield the value of the first character of the next word in the input stream.

### `cmove ( a1 a2 u -- )`
copy u bytes of memory from a1 to a2. bytes are copied in low memory to high memory order.

### `cmove, ( a u -- )`
copy u bytes of memory from a to `here`, then increment `here` appropriately. bytes are copied in low memory to high memory order.

### `cmove> ( a1 a2 u -- )`
copy u bytes of memory from a1 to a2. bytes are copied in high memory to low memory order.

### `compile, ( xt -- )`
compile a call to xt into user memory.

### `compile-only ( -- )`
mark the most recently defined word as compile-only.

### `compile-only? ( ht -- ? )`
true if ht is marked compile-only, false otherwise.

### `constant ( u "name" -- )`
create a word that pushes a cell value u to the stack.

### `create ( "name" -- )`
create a word in the dictionary that, by default, pushes the address directly following the header to the stack. this behaviour can be modified with `does>`.

### `d, ( n -- )`
write a 32 bit value to user memory and increment the user memory pointer.

### `d! ( u a -- )`
store the 32 bit value u into the memory address a.

### `decimal ( -- )`
set current base to decimal.

### `does> ( -- )`
modify the behaviour of the most recent `create`d word. (non-`create`d words will be corrupted.)
### `dp ( -- a )`
a variable that contains the lowest free byte of memory in user memory.

### `dp0 ( -- a )`
a variable that contains the first byte of user memory.

### `dp$ ( -- a )`
a variable that contains the last available byte of user memory.

### `drop ( u -- )`
remove the value at the top of the stack.

### `dup ( u -- u u )`
duplicate the value at the top of the stack.

### `else ( -- ) IMMEDIATE COMPILE-ONLY`
update the current if statement to branch here when the flag is false, and skip to `then` if the corresponding `if` was true.

### `executable ( a u -- )`
marks the u bytes starting at address a as executable. this is used primarily to mark the program break, which is used as the user memory space.

### `execute ( xt -- )`
call the word xt.

### `false ( -- u )`
a cell with no bits set.

### `find ( a u -- a u 0 | a -1 )`
look in the dictionary for the word a (of u characters). a zero is returned along with the original given string if no word was found. if a word was found, its link field address is returned along with the true flag.

### `grow ( u -- )`
grows, and marks as executable, the user memory space by u bytes.

### `handler ( -- a )`
variable containing the address of the current error handler.

### `here ( -- a )`
yields the address of the first available byte in user memory.

### `hex ( -- )`
set current base to hexadecimal.

### `if ( ? -- ) IMMEDIATE COMPILE-ONLY`
if the flag is true, execute the following if statement, terminated by `else` or `then`.

### `immediate ( -- )`
mark the most recently defined word as immediate.

### `immediate? ( ht -- ? )`
true if ht is marked immediate, false otherwise.

### `interpret ( -- )`
interprets the contents of the terminal input buffer until it runs out.

### `invert ( u -- u' )`
invert all bits in u.

### `latest ( -- a )`
a variable containing the execution token of the most recently created word.
### `literal ( n -- ) IMMEDIATE COMPILE-ONLY`
compile a push of the literal value n into the currently compiling word.

### `nip ( u1 u2 -- u2 )`
drop the second-highest value from the stack.

### `number ( a u -- n -1 | 0 )`
convert given string into a number along with a flag. if parsing a number fails then 0 (false) is returned and no number is provided.

### `octal ( -- )`
set current base to octal.

### `or ( u1 u2 -- u )`
perform bitwise OR on u1 and u2.

### `over ( u1 u2 -- u1 u2 u1 )`
copy the second-highest value on the stack and move it to the top of the stack.

### `parse ( "name" c -- a u )`
parse one word from the input buffer, delimited by a newline or the character c, and return as a string.

### `parse-name ( "name" -- a u )`
parse one whitespace-separated word from the input buffer, and return as a string. tabs (ascii 0x09), newlines (ascii 0x0a), and spaces (ascii 0x20) are considered whitespace.

### `postpone ( "name" -- ) IMMEDIATE COMPILE-ONLY`
compile the execution behaviour of a word into the current definition. if the word is immediate, that will execute the word at runtime (like `[compile]`). if the word is not immediate, this will compile code that compiles that word.

### `r> ( -- u ) ( R: u -- )`
move a value from the return stack to the working stack.

### `rdrop ( R: u -- )`
remove the value at the top of the return stack.

### `repeat ( -- ) IMMEDIATE COMPILE-ONLY`
in a begin-while-repeat loop, loop back to the condition.

### `rot ( u1 u2 u3 -- u2 u3 u1 )`
rotate the top three values on the stack so that the third highest value is moved to the top.

### `s" ( "string" -- ) IMMEDIATE COMPILE-ONLY`
compile into the definition code to push the given string, terminated by a double quote. the string data and length are stored inline in the definition.

### `smudge ( -- )`
toggles the smudge bit on the xt in latest.

### `state ( -- a )`
a variable containing a boolean value.
if 0 (false), the system is in interpreting mode, if -1 (true), the system is in compiling mode.

### `stderr ( -- 2 )`
push the file descriptor of stderr to the stack.

### `stdout ( -- 1 )`
push the file descriptor of stdout to the stack.

### `swap ( u1 u2 -- u2 u1 )`
swap the two topmost values on the stack.

### `syscall0 ( rax -- u )`
perform the syscall with the id in `rax`, and push the value of the `rax` register to the stack.

### `syscall1 ( rdi rax -- u )`
perform the syscall with the id in `rax`, taking one parameter placed in `rdi`, and push the value of the `rax` register to the stack.

### `syscall2 ( rsi rdi rax -- u )`
perform the syscall with the id in `rax`, taking two parameters placed in `rdi` and `rsi`, and push the value of the `rax` register to the stack.

### `syscall3 ( rdx rsi rdi rax -- u )`
perform the syscall with the id in `rax`, taking three parameters placed in `rdi`, `rsi` and `rdx`, and push the value of the `rax` register to the stack.

### `then ( -- ) IMMEDIATE COMPILE-ONLY`
conclude an if statement.

### `tib ( -- a )`
a variable containing the address of the current input buffer.

### `to ( comp: "name" -- | intr: u "name" -- ) IMMEDIATE`
compile or execute (depending on `state`) code to modify the contents of a `value`.

### `true ( -- u )`
a cell with all bits set.

### `tuck ( u1 u2 -- u2 u1 u2 )`
place a copy of the highest value on the stack below the second highest value on the stack.

### `type ( a u -- )`
write u characters at a to output.

### `until ( ? -- ) IMMEDIATE COMPILE-ONLY`
if the given flag is true, loop back to `begin`.

### `value ( u "name" -- )`
create a value called name, the initial value of which is u.

### `variable ( "name" -- )`
create a variable word, which yields an address that can be written and read.

### `while ( ? -- ) IMMEDIATE COMPILE-ONLY`
if given flag is true, continue the current begin-while-repeat loop, otherwise branch to after.

### `xor ( u1 u2 -- u )`
perform bitwise XOR on u1 and u2.
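
as a rough illustration of how some of the glossary words fit together, the sketch below combines `value`, `+to` and a begin-while-repeat loop. the names `total` and `sum-to` are made up for this example and are not part of the system; the behaviour described in the comments assumes the words act as documented above.

```forth
\ hypothetical example, not part of sanctuary itself
0 value total            \ a value, initially zero

: sum-to ( n -- )        \ add n + (n-1) + ... + 1 to total
  begin dup 0> while     \ keep going while the counter is positive
    dup +to total        \ add the current counter to total
    1-                   \ count down
  repeat drop ;          \ discard the spent counter

10 sum-to                \ total should now hold 55, if it started at 0
```

`+to` is used here in compile mode, so the amount to add is taken from the stack at run time rather than when `sum-to` is defined.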
## dictionary format

note that the string length of one byte limits a word's name to 255 characters.

| field | size |
| :---- | :--- |
| link to previous word | 8 bytes |
| flag field | 1 byte |
| string length | 1 byte |
| string | <256 bytes |
| code | variable length |

## reserved registers

the register `r15` is reserved for the parameter stack pointer.

## differences from standard forth

for the most part this forth intends to be in line with standards but it diverges in a few notable places:

- the most visually obvious one by far: this forth uses lower case word names for core words.
- `find` takes `a u` instead of a counted string, and does not return 1 for immediate words.
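
the `a u` form of `find` and the dictionary format table above can be tied together in a small sketch. it assumes that the header token returned by `find` points at the link field, that the fields sit in memory in the order the table lists them, and that a cell is 8 bytes. `name>string`, `echo-name` and `demo` are hypothetical names invented for this example.

```forth
\ hypothetical helpers, assuming ht points at the link field
: name>string ( ht -- a u )
  cell+ 1+               \ skip the 8 byte link and the 1 byte flag field
  dup 1+ swap c@ ;       \ a = first name character, u = length byte

: echo-name ( a u -- )   \ look a name up and type it back from its header
  find if                \ found: ( ht )
    name>string type
  else                   \ not found: ( a u )
    2drop
  then ;

: demo ( -- ) s" swap" echo-name ;   \ s" is compile-only, so wrap it
```

if the assumptions hold, running `demo` should type the name stored in the header of `swap`. note that the flag from `find` is only ever 0 or -1 here, so it can be fed straight to `if` without checking for immediacy.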