Spanning Two Worlds
[The ninth in a series of posts on the evolution of TransForth]
The dictionary we have at the moment is split across two worlds. The definitions live in Forth-world, packed into plain memory, but we still have the F#-world mapping of WordRecords to those memory locations.
let mutable dict = []
type WordRecord = { Name : string; Def : int; Immediate : bool ref }
let immediate () = dict.Head.Immediate := true
let header name = dict <- { Name = name; Def = mem.[h]; Immediate = ref false } :: dict
Header Format
Instead of this, we now want to move the dictionary headers into plain memory along with everything else. The traditional Forth dictionary header is four bytes, and we can certainly pack the name and the Immediate flag into a single Int32. For now, however, we’ll postpone implementing the traditional Forth way of representing word names (a four-byte sequence giving the length and the first three ASCII characters). Instead we’ll take a temporary shortcut and go with the simplest thing that could possibly work: a hash of the name, with the high bit reserved as a flag indicating whether the word is immediate. There are obvious shortcomings to this (two different names can hash to the same value, for one), but we’ll change it once again when it’s re-implemented in Forth itself:
let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF
let immediate () = mem.[latest] <- mem.[latest] ||| 0x80000000
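To make the bit layout concrete, here is a minimal sketch that works on a single header cell rather than on mem. The names setImmediate and isImmediate are illustrative stand-ins, not part of TransForth; only encode is taken from the post.

```fsharp
// Sketch only: setImmediate and isImmediate are hypothetical helpers that
// operate on one header cell value instead of on the post's mem array.
let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

// Bit 31 is the immediate flag; bits 0..30 hold the name hash.
let setImmediate (cell : int) = cell ||| 0x80000000
let isImmediate (cell : int) = (cell &&& 0x80000000) <> 0

let plain = encode "dup"
let flagged = setImmediate plain

printfn "%b" (isImmediate plain)                 // false: encode clears the high bit
printfn "%b" (isImmediate flagged)               // true
printfn "%b" ((flagged &&& 0x7FFFFFFF) = plain)  // true: the name bits are untouched
```

Because the mask 0x7FFFFFFF strips the flag before any comparison, marking a word immediate never changes how its name is matched.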
We’ll be getting rid of the dict and will pack these headers into memory alongside their definitions. Each header will give the name/immediate encoding in one memory cell, followed in the next cell by the address of the previous word’s header, thus forming a linked list. The latest pointer will always point to the header of the most recently added word.
Following each header will be the packed definition compiled in as we implemented in the last post.
let mutable latest = mem.[h]
let header name =
    let link = latest
    latest <- mem.[h]
    encode name |> append
    append link
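The resulting layout can be seen in a small standalone model. This is a sketch under assumptions: a tiny private mem array, a mutable here standing in for the post’s mem.[h], and address 0 (rather than 0x0400) as the end-of-list marker. The starting address and the one-cell "definitions" are made up for illustration.

```fsharp
// Sketch under assumptions: `here` replaces the post's mem.[h], and 0 is
// used as the end-of-list sentinel instead of 0x0400.
let mem : int array = Array.zeroCreate 32
let mutable here = 1      // next free cell (hypothetical starting address)
let mutable latest = 0    // 0 means "no words defined yet"

let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

let append v =
    mem.[here] <- v
    here <- here + 1

let header name =
    let link = latest
    latest <- here           // the new header starts at the next free cell
    encode name |> append    // cell 0: name hash plus immediate bit
    append link              // cell 1: address of the previous header

header "first"
append 111                   // stand-in for a one-cell definition body
header "second"
append 222

printfn "latest         = %d" latest            // 4
printfn "link of second = %d" mem.[latest + 1]  // 1
printfn "link of first  = %d" mem.[1 + 1]       // 0
```

Each header occupies two cells, so the second word’s header lands at address 4 (after the first header at 1 and 2 and its body at 3), and following the link cells from latest leads back to the sentinel.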
Finding and Forgetting
Instead of finding and forgetting words using the niceties of F#...
let find name = List.tryFind (fun w -> w.Name = name) dict
let forget name =
    let found = dict |> Seq.skipWhile (fun w -> w.Name <> name) |> List.ofSeq
    dict <- found.Tail
    mem.[h] <- found.Head.Def
… we’ll have to resort to low-level memory walking and adjusting of the latest pointer.
let find name =
    let enc = encode name
    let rec find' addr =
        if addr = 0x0400 // first cell is DOSEMI
        then -1 else
        if mem.[addr] &&& 0x7FFFFFFF = enc
        then addr else find' mem.[addr + 1]
    find' latest
let forget name = mem.[h] <- find name; latest <- mem.[mem.[h] + 1]
The find function now returns the address of a word’s header (or -1 if not found). The forget function just resets the dictionary pointer (mem.[h]) to the found header and moves the latest pointer back to the previous word, leaving forgotten definitions in memory to be subsequently overwritten.
A couple of helpers will be useful for checking whether a word is immediate and for converting a header address to the address of the definition: the so-called “code field address” (cfa), which can always be found two cells past the start of the header.
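The walk and the pointer reset can be tried out in isolation. The following is only a sketch, assuming a small private mem array, a mutable here in place of mem.[h], and 0 as the end-of-list sentinel instead of 0x0400. It also adds a guard for a failed lookup, which the one-line forget in this post does not have.

```fsharp
// Sketch under assumptions: `here` replaces mem.[h]; 0 replaces 0x0400 as
// the sentinel; the not-found guard in forget is an addition.
let mem : int array = Array.zeroCreate 32
let mutable here = 1
let mutable latest = 0

let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

let append v =
    mem.[here] <- v
    here <- here + 1

let header name =
    let link = latest
    latest <- here
    encode name |> append
    append link

let find name =
    let enc = encode name
    let rec walk addr =
        if addr = 0 then -1                              // end of the list
        elif (mem.[addr] &&& 0x7FFFFFFF) = enc then addr // flag bit masked off
        else walk mem.[addr + 1]                         // follow the link cell
    walk latest

let forget name =
    let addr = find name
    if addr <> -1 then
        here <- addr                 // reclaim memory from this header onward
        latest <- mem.[addr + 1]     // the previous word becomes the latest

// Three words, each with a one-cell stand-in body.
for name in [ "a"; "b"; "c" ] do
    header name
    append 0

forget "b"
printfn "%d" (find "c")
printfn "%d" (find "a")
printfn "%d" here
```

Barring a hash collision among the three names, this prints -1, 1 and 4: forgetting "b" also drops "c", which was defined after it, while "a" stays reachable and the next definition will be written from address 4.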
let isimmediate addr = mem.[addr] &&& 0x80000000 = 0x80000000
let cfa addr = addr + 2 // used in several places
Outer Interpreter Tweaks
Finally, our outer interpreter needs to change slightly to expect addresses rather than WordRecord options from find and to use the new dictionary format.
let rep input =
    out.Clear() |> ignore
    source <- input
    while not (Seq.isEmpty source) do
        let word = token ()
        if word.Length > 0 then
            match find word with
            | -1 -> // literal?
                let number, value = Int32.TryParse word
                if number then
                    if interactive then push value
                    else
                        append LIT_ADDR
                        append value
                else word + "?" |> failwith
            | d ->
                let c = cfa d
                if interactive || isimmediate d then
                    p <- c
                    w <- c
                    i <- HALT_ADDR
                    execute ()
                else append c
That’s the last of the data structures remaining to be moved to plain memory. The last major moving part to be transitioned will be the outer interpreter. TransForth is coming along!