This post is the result of me listening to this week’s episode of the Advent of Computing podcast, about the Jupiter Ace. The Ace was a small micro from 1982/83, designed by two guys who had worked on the software and hardware of the Sinclair ZX-81. The hardware was pretty much the same as the ZX-81’s, but it had one big difference.
Instead of BASIC, the Ace ran Forth, a stack based language that is small, easy and fast.
I’ve been interested in Forth for a while now, because of its novelty and its use cases. None other than NASA uses Forth in their deep space probes, and Intel runs Forth in its management systems inside their processors.
So… A bit of digging. Forth on the Ace took up 5 KiB of ROM and lacked Forth’s native cooperative multitasking.
Yes, that is right… A core feature of Forth is that it supports native cooperative multitasking.
Anyway, a full Forth compiler that supports all core words of the language can be as small as 8kiB on an 8 bit machine, for a 16 bit Forth. A 32 bit Forth on a 16 bit CPU can be as small as 16 KiWords.
That would allow me to add Forth as a core part of the Monitor OS for the DWMC-16.
So aside from a simple memory monitor, I’d have a full Forth environment with editor and command line in the monitor for at least 16 KiWords/32 KiB, allowing me to go high level pretty quickly.
Hardware Considerations
Now, looking at the CPU hardware, I quickly realised that, in wanting to make my life easier for later multitasking OS programming, I had already made the DWMC-16 perfectly capable of running Forth without much fuss and bother, as the CPU has three Stack Pointers, SP1-3.
But SP1 is deeply integrated into the hardware for jumps, subroutines and interrupts. Still, that leaves two Stack Pointers for Forth to play with.
Or does it…?
If I squint a bit and go by technicalities, I can see that the CPU actually has five Stack Pointers. That is because with some programming tricks, I can use the Index Offset memory addressing like a Stack Pointer.
How?
Well, the Y and Z Index Registers each have their own Y/Z Offset Registers holding signed 16 bit numbers. The Index Register itself acts as the base address for the Stack, while the Offset Register lets me move up and down along the stack.
Nothing stops me from doing that right now. But it takes four operations to fake a Stack Pointer operation like POP with an Index Register.
LD R00, @Y+R #load from memory to R00
LD R01, YOR #load Y Offset Register to R01
INC R01 #increment R01
ST R01, YOR #save R01 to Y Offset Register
The above assembler is a pop operation on this Pseudo Stack Pointer.
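To make the cost of that sequence a bit more tangible, here is a minimal Python sketch that models the four steps one by one. The class name, the field names and the 64 word memory are all made up for illustration; only the order of the steps comes from the assembler above.

```python
# Toy model of a Pseudo Stack Pointer built from an Index Register and its
# Offset Register. All names and sizes here are illustrative assumptions.

class PseudoStack:
    def __init__(self, memory, base, offset=0):
        self.memory = memory    # shared list of 16 bit words
        self.base = base        # plays the role of the Y Index Register
        self.offset = offset    # plays the role of the Y Offset Register

    def pop_four_step(self):
        r00 = self.memory[self.base + self.offset]  # LD R00 from Y + offset
        r01 = self.offset                           # LD R01, YOR
        r01 += 1                                    # INC R01
        self.offset = r01                           # ST R01, YOR
        return r00


memory = [0] * 64
memory[16:19] = [0x1111, 0x2222, 0x3333]

stack = PseudoStack(memory, base=16)
first = stack.pop_four_step()
second = stack.pop_four_step()
```

Three of the four steps only exist to bump the offset by one, and they tie up R01 while doing so.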
But… If I use up/down counters to implement the Y/Z Offset Registers, I could drop from four operations to two by getting rid of the load/store pair for the Offset Registers. Instead, it gets reduced to an INCO/DECO operation for Increment/Decrement Offset Register.
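A rough Python sketch of that two operation version, assuming the Offset Register behaves like a signed 16 bit up/down counter that wraps around. The push side (DECO first, then store) is my assumption as the mirror image of the pop shown earlier; the helper names are hypothetical.

```python
def wrap_s16(value):
    """Wrap an integer into the signed 16 bit range, like a hardware counter."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


class CounterStack:
    def __init__(self, memory, base, offset=0):
        self.memory = memory    # shared list of 16 bit words
        self.base = base        # Index Register: base address of the stack
        self.offset = offset    # Offset Register, modelled as up/down counter

    def inco(self):
        self.offset = wrap_s16(self.offset + 1)   # Increment Offset Register

    def deco(self):
        self.offset = wrap_s16(self.offset - 1)   # Decrement Offset Register

    def pop(self):
        value = self.memory[self.base + self.offset]  # operation 1: load
        self.inco()                                   # operation 2: INCO
        return value

    def push(self, value):
        self.deco()                                            # operation 1: DECO
        self.memory[self.base + self.offset] = value & 0xFFFF  # operation 2: store


memory = [0] * 64
stack = CounterStack(memory, base=32)
stack.push(0xAAAA)
stack.push(0xBBBB)
top = stack.pop()
below = stack.pop()
```

No general purpose register is needed as scratch space any more, which is a nice bonus on top of the saved operations.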
Useful in themselves, for example for iterating over strings and arrays, but… Why not handle this Pseudo Stack Pointer as an actual Stack Pointer in that case? Why not add them to PUSH/POP?
Why not indeed…
But five stack pointers are such an awkward number in binary…
So… Since push and pop currently take an operand for the Stack Pointer, split them in two: one PUSH/POP for the System Stack Pointer and PUSHS/POPS for the four Secondary Stack Pointers.
And while I am at it…
Why not turn SP2 and SP3 into two more Index/Offset Register pairs, for a set of W, X, Y and Z Index Registers?
That would be a bit awkward again, given the small memory space allocated for the special registers. I would be missing one memory word to save one Offset Register.
So, I could cut it down to three Index Registers doubling as Secondary Stack Pointers, or I put the Offset Registers into the General use registers.
A third option would be to mask off one word to contain the upper four address bits of the four Index Registers, but that would be awkward…
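For the record, this is roughly what that third option would look like, sketched in Python. It assumes 20 bit addresses (16 low bits per Index Register plus four upper bits each, which is what makes four registers fit into one shared word); the register order and the function names are invented for the example.

```python
REGISTERS = ("W", "X", "Y", "Z")   # assumed slot order inside the shared word


def pack_index_registers(addresses):
    """Split 20 bit addresses into four low words and one shared high word."""
    low_words = []
    packed_high = 0
    for slot, name in enumerate(REGISTERS):
        address = addresses[name]
        low_words.append(address & 0xFFFF)                      # lower 16 bits
        packed_high |= ((address >> 16) & 0xF) << (4 * slot)    # upper 4 bits
    return low_words, packed_high


def unpack_index_register(low_words, packed_high, name):
    """Rebuild one 20 bit address from its low word and its nibble."""
    slot = REGISTERS.index(name)
    high = (packed_high >> (4 * slot)) & 0xF
    return (high << 16) | low_words[slot]


addresses = {"W": 0x12345, "X": 0xABCDE, "Y": 0x00001, "Z": 0xF0000}
low_words, packed_high = pack_index_registers(addresses)
restored_x = unpack_index_register(low_words, packed_high, "X")
```

Every access to an Index Register would need that extra shift and mask, and changing one register means a read, modify and write of a word shared with the other three.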
So, I think I will go with three Index/Offset Register pairs, X, Y and Z. I can live with that.
Being reduced to four (Pseudo) Stack Pointers also means that I can get rid of PUSHS/POPS and simply use PUSH/POP with the Stack Pointer number.
Other uses of these new Stack Pointers
There is a fun side effect of the whole thing: it makes reading and writing strings faster, reducing them to one and two operations respectively.
Reading can be done with a simple POP, while writing needs a ST followed by an INCO of the Offset Register belonging to the Index Register/Stack Pointer in use.
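A small Python sketch of that string handling, under a few assumptions of mine: one character per memory word, a zero word as terminator, and a POP that loads first and increments the offset afterwards. The class and function names are hypothetical.

```python
class IndexPointer:
    """Index Register plus Offset Register, used as a string cursor."""

    def __init__(self, memory, base):
        self.memory = memory
        self.base = base      # Index Register: start of the string
        self.offset = 0       # Offset Register

    def pop(self):
        # One operation: load the word, the offset moves on by itself.
        value = self.memory[self.base + self.offset]
        self.offset += 1
        return value

    def store_then_inco(self, value):
        # Two operations: ST, followed by INCO.
        self.memory[self.base + self.offset] = value
        self.offset += 1


def write_string(memory, base, text):
    pointer = IndexPointer(memory, base)
    for char in text:
        pointer.store_then_inco(ord(char))
    pointer.store_then_inco(0)          # assumed zero terminator


def read_string(memory, base):
    pointer = IndexPointer(memory, base)
    chars = []
    while True:
        word = pointer.pop()
        if word == 0:
            break
        chars.append(chr(word))
    return "".join(chars)


memory = [0] * 64
write_string(memory, 8, "FORTH")
result = read_string(memory, 8)
```

Writing cannot reuse PUSH in this model, because a push would move the offset in the opposite direction from the one a pop reads in.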
I could mess with more hardware here, but I think that would be too much for too little gain.
Conclusion
All of this now leaves me with having to add it all to the existing document, as well as having to make a new post concerning those changes.
But since I am still in the design phase, it’s not a problem. If there were any actual hardware already, it would be a bigger problem.
Another thought would be… How would this impact making a LISP interpreter?
😄