# sanctuary (working title)

sanctuary is a 64-bit subroutine threaded forth for amd64 linux systems.

## stack effect notation

- `a`: memory address
- `c`: one byte value
- `n`: signed integer
- `u`: unsigned integer
- `?`: boolean flag
- `xt`: execution token
- `ht`: header token
- `""`: string in input buffer
- `|`: 'or'

## Glossary

the following is a list of words available in this forth.

### `! ( u a -- )`

store the 64 bit value u into the memory address a.

### `#tib ( -- a )`

variable containing the number of characters in the input buffer.

### `( ( -- ) IMMEDIATE`

start a comment which lasts until the next closed bracket. if the unclosed bracket in the description above bothers you, have a closing bracket: ).

### `(0handler) ( -- )`

the very early error handler, which simply quits the program.

### `(header) ( a u -- ht )`

create a dictionary header for a word named by the provided string. this word does not set the code field. this word does not update latest.

### `* ( u1 u2 -- u )`

multiply u1 and u2.

### `*/mod ( n1 n2 n3 -- n4 n5 )`

multiply n1 and n2, then divide the result by n3. the remainder is in n4, the result is in n5.

### `+ ( u1 u2 -- u )`

add u2 to u1.

### `+! ( a -- )`

add one to the value at memory address a.

### `, ( u -- )`

write a 64 bit value to user memory and increment the user memory pointer.

### `- ( u1 u2 -- u )`

subtract u2 from u1.

### `-! ( a -- )`

subtract one from the value at memory address a.

### `-rot ( u1 u2 u3 -- u3 u1 u2 )`

rotate the three topmost values on the stack so that the topmost value is moved to the third highest position.

### `/mod ( u1 u2 -- u3 u4 )`

divide u1 by u2. the result is in u4, the remainder is in u3.

### `[ ( -- ) IMMEDIATE`

set the system to interpret mode.

### `] ( -- ) IMMEDIATE`

set the system to compiling mode.

### `: ( "name" -- )`

start compilation of the word 'name'.

### `; ( -- ) IMMEDIATE`

end compilation of the currently compiling word.

### `\ ( -- ) IMMEDIATE`

start a comment that lasts until the end of the current line.

### `@ ( a -- u )`

fetch the 64 bit value at memory address a.

### `= ( n1 n2 -- ? )`

return true if n1 and n2 are equal.

### `< ( n1 n2 -- ? )`

return true if n1 is less than n2.

### `<= ( n1 n2 -- ? )`

return true if n1 is less than or equal to n2.

### `<> ( n1 n2 -- ? )`

return true if n1 and n2 are not equal.

### `> ( n1 n2 -- ? )`

return true if n1 is greater than n2.

### `>= ( n1 n2 -- ? )`

return true if n1 is greater than or equal to n2.

### `>body ( ht -- xt )`

yield the code field of the header token ht.

### `>in ( -- a )`

variable containing the index of the first unparsed character in the input buffer.

### `>mark ( -- a )`

mark the source of a forward branch.

### `>r ( u -- ) ( R: -- u )`

move a value from the working stack to the return stack.

### `>resolve ( a -- )`

mark the destination of a forward branch.

### `?branch ( -- )`

compile into user memory an incomplete conditional branch. if the value on the stack is zero the branch is taken. a 32 bit branch offset must be written immediately after.

### `0= ( n -- ? )`

return true if n is equal to zero.

### `0< ( n -- ? )`

return true if n is less than zero.

### `0<= ( n -- ? )`

return true if n is less than or equal to zero.

### `0<> ( n -- ? )`

return true if n is not equal to zero.

### `0> ( n -- ? )`

return true if n is greater than zero.

### `0>= ( n -- ? )`

return true if n is greater than or equal to zero.

### `1+ ( u -- u' )`

add one to u.

### `1- ( u -- u' )`

subtract one from u.

### `2drop ( u1 u2 -- )`

remove the two topmost values from the stack.

### `2dup ( u1 u2 -- u1 u2 u1 u2 )`

duplicate the two topmost values on the stack.

### `abort ( -- )`

call the error handler (the address of which is in the variable `handler`).

### `again ( -- ) IMMEDIATE`

complete an infinite loop begun by the word `begin`.

### `allot ( u -- )`

reserve u bytes of user memory.

### `and ( u1 u2 -- u )`

perform bitwise AND on u1 and u2.

### `base ( -- a )`

a variable containing the current numeric input/output base.
by default this is 10.

### `begin ( -- ) IMMEDIATE`

mark the beginning of a begin-again, begin-until, or begin-while-repeat loop.

### `binary ( -- )`

set current base to binary.

### `branch ( -- )`

compile into user memory an incomplete branch. a 32 bit branch offset must be written immediately after.

### `brk@ ( -- a )`

yields the current program break.

### `bye ( -- )`

exits the forth system.

### `c, ( c -- )`

write an 8 bit value to user memory and increment the user memory pointer.

### `c! ( u a -- )`

store the 8 bit value u into the memory address a.

### `c@ ( a -- c )`

fetch the 8 bit value at memory address a.

### `char ( "c" -- c )`

yield the value of the first character of the next word in the input stream.

### `cmove ( a1 a2 u -- )`

copy u bytes of memory from a1 to a2. bytes are copied in low memory to high memory order.

### `cmove> ( a1 a2 u -- )`

copy u bytes of memory from a1 to a2. bytes are copied in high memory to low memory order.

### `d, ( n -- )`

write a 32 bit value to user memory and increment the user memory pointer.

### `d! ( u a -- )`

store the 32 bit value u into the memory address a.

### `decimal ( -- )`

set current base to decimal.

### `dp ( -- a )`

a variable that contains the lowest free byte of memory in user memory.

### `dp0 ( -- a )`

a variable that contains the first byte of user memory.

### `dp$ ( -- a )`

a variable that contains the last available byte of user memory.

### `drop ( u -- )`

remove the value at the top of the stack.

### `dup ( u -- u u )`

duplicate the value at the top of the stack.

### `else ( -- ) IMMEDIATE`

update the current if statement to branch here when the flag is false, and skip to `then` if the corresponding `if` was true.

### `executable ( a u -- )`

marks the u bytes starting at address a as executable. this is used primarily to mark the program break, which is used as the user memory space.

### `execute ( xt -- )`

call the word xt.

### `find ( a u -- a u 0 | a -1 )`

look in the dictionary for the word a (of u characters). a zero is returned along with the original given string if no word was found. if a word was found, its link field address is returned along with the true flag.

### `grow ( u -- )`

grows, and marks as executable, the user memory space by u bytes.

### `handler ( -- a )`

variable containing the address of the current error handler.

### `here ( -- a )`

yields the address of the first available byte in user memory.

### `hex ( -- )`

set current base to hexadecimal.

### `if ( ? -- ) IMMEDIATE`

if the flag is true, execute the following if statement, terminated by `else` or `then`.

### `immediate ( -- )`

mark the most recently defined word as immediate.

### `immediate? ( ht -- ? )`

true if ht is marked immediate, false otherwise.

### `interpret ( -- )`

interprets the contents of the terminal input buffer until it runs out.

### `invert ( u -- u' )`

invert all bits in u.

### `latest ( -- a )`

a variable containing the execution token of the most recently created word.

### `literal ( n -- ) IMMEDIATE COMPILE-ONLY`

compile a push of the literal value n into the currently compiling word.

### `number ( a u -- n -1 | 0 )`

convert the given string into a number along with a flag. if parsing a number fails then 0 (false) is returned and no number is provided.

### `octal ( -- )`

set current base to octal.

### `or ( u1 u2 -- u )`

perform bitwise OR on u1 and u2.

### `over ( u1 u2 -- u1 u2 u1 )`

copy the second-highest value on the stack and move it to the top of the stack.

### `parse ( "name" c -- a u )`

parse one word from the input buffer, separated by a newline or the character c, and return it as a string.

### `parse-name ( "name" -- a u )`

parse one whitespace-separated word from the input buffer, and return it as a string. tabs (ascii 0x09), newlines (ascii 0x0a), and spaces (ascii 0x20) are considered whitespace.

### `r> ( -- u ) ( R: u -- )`

move a value from the return stack to the working stack.

### `rdrop ( R: u -- )`

remove the value at the top of the return stack.

### `repeat ( -- ) IMMEDIATE`

in a begin-while-repeat loop, loop back to the condition.

### `rot ( u1 u2 u3 -- u2 u3 u1 )`

rotate the top three values on the stack so that the third highest value is moved to the top.

### `smudge ( -- )`

toggles the smudge bit on the xt in latest.

### `state ( -- a )`

a variable containing a boolean value. if 0 (false), the system is in interpreting mode, if -1 (true), the system is in compiling mode.

### `swap ( u1 u2 -- u2 u1 )`

swap the two topmost values on the stack.

### `syscall0 ( rax -- u )`

perform the syscall with the id in `rax`, and push the value of the `rax` register to the stack.

### `syscall1 ( rdi rax -- u )`

perform the syscall with the id in `rax`, taking one parameter placed in `rdi`, and push the value of the `rax` register to the stack.

### `syscall2 ( rsi rdi rax -- u )`

perform the syscall with the id in `rax`, taking two parameters placed in `rdi` and `rsi`, and push the value of the `rax` register to the stack.

### `syscall3 ( rdx rsi rdi rax -- u )`

perform the syscall with the id in `rax`, taking three parameters placed in `rdi`, `rsi` and `rdx`, and push the value of the `rax` register to the stack.

### `then ( -- ) IMMEDIATE`

conclude an if statement.

### `tib ( -- a )`

a variable containing the address of the current input buffer.

### `type ( a u -- )`

write u characters at a to output.

### `until ( ? -- ) IMMEDIATE`

if the given flag is true, loop back to `begin`.

### `while ( ? -- ) IMMEDIATE`

if the given flag is true, continue the current begin-while-repeat loop, otherwise branch to just after `repeat`.

### `xor ( u1 u2 -- u )`

perform bitwise XOR on u1 and u2.

## dictionary format

note that the string length of one byte limits a word's name to 255 characters.

| field | size |
| :---- | :--- |
| link to previous word | 8 bytes |
| flag field | 1 byte |
| string length | 1 byte |
| string | <256 bytes |
| code | variable length |

## reserved registers

the register `r15` is reserved for the parameter stack pointer.

## differences from standard forth

for the most part this forth intends to be in line with standards, but it diverges in a few notable places:

- the most visually obvious one by far: this forth uses lower case word names for core words.
- `find` takes `a u` instead of a counted string, and does not return 1 for immediate words.
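
as a rough illustration of the dictionary format table above, the following python sketch packs and walks headers with that field order. it is not taken from the sanctuary source: the names `HEADER`, `pack_header`, `unpack_header` and `find` are hypothetical, the fields are assumed to be packed with no padding, a link of 0 is assumed to end the chain, and the flag bits are left at 0 because their meaning is not specified here. little-endian byte order follows from the amd64 target.

```python
import struct

# Assumed layout, following the table: 8-byte link, 1-byte flag field,
# 1-byte string length, then the name bytes. No padding is assumed.
HEADER = struct.Struct("<QBB")


def pack_header(link: int, flags: int, name: bytes) -> bytes:
    """Build the fixed fields plus the name; the code field would follow."""
    if len(name) > 255:
        raise ValueError("name does not fit in a one-byte length field")
    return HEADER.pack(link, flags, len(name)) + name


def unpack_header(buf: bytes, offset: int = 0):
    """Return (link, flags, name, code_offset) for the header at offset."""
    link, flags, length = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    name = bytes(buf[start:start + length])
    return link, flags, name, start + length


def find(memory: bytes, latest: int, name: bytes):
    """Walk the link chain from the newest header; 0 ends the chain (assumed)."""
    addr = latest
    while addr != 0:
        link, _flags, found, _code = unpack_header(memory, addr)
        if found == name:
            return addr
        addr = link
    return None


# Offset 0 is left unused so that a link of 0 can mean "no previous word".
memory = bytearray(8)
first = len(memory)
memory += pack_header(0, 0, b"dup")
second = len(memory)
memory += pack_header(first, 0, b"swap")

print(find(memory, second, b"dup"))  # prints 8
```

the length check in `pack_header` is where the 255 character limit shows up: a one-byte length field cannot describe a longer name.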