Faster Syscall Trap redux
Raymond has a funny historical article about how Windows made system calls on 386 processors.
What he left out was the 286 version of this story. Microsoft and Intel had a similar meeting to the one that Raymond described with the 386, but in that case, one of the Microsoft requests was for the ability to switch from protected mode back into real mode.
You see, the 286 had the ability to switch from real mode (no virtual memory) to protected mode (virtual memory), but not back. The theory was that you'd never want to go back to real mode, that would be "silly".
But of course, that doesn't deal with the issue of compatibility. OS/2 supported one real mode application running in the system, in the "DOS Box". The DOS box was essentially just another task, it got time sliced like other processes (ok, it really didn't, but conceptually that's what happened), so the system did a LOT of switching between real mode and protected mode.
It was critical that we be able to switch from protected mode back to real mode (when switching to the DOS box). The problem is that the only documented way of doing this was to write to the keyboard controller device (which controls WAY more than just the keyboard on a PC). Unfortunately, the keyboard controller was REALLY slow, and this mechanism took literally milliseconds.
So Microsoft went crazy trying to find a fast way of switching back to real mode.
Eventually they found it.
Their solution:
    LIDT -1
    INT 1
What did this do? Well, LIDT -1 sets the interrupt descriptor table to an invalid physical address. The system tried to execute the INT 1 instruction, which caused it to fault the IDT into memory (a fault). Well, the IDT couldn't be found, so that raised a not present fault (a double fault). The not present fault tried to fault in the not present fault handler, which failed (a triple fault).
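The escalation described above can be sketched as a toy model. This is a simplified simulation under stated assumptions, not a description of 286 internals: the `Cpu` class, its `deliver` method, and the `Shutdown` exception are hypothetical names, and the sketch assumes that a failed IDT lookup is reported as vector 13 (general protection) and then as vector 8 (double fault), where the article loosely calls the intermediate step a "not present fault".

```python
GENERAL_PROTECTION = 13  # assumed vector raised when an IDT lookup fails
DOUBLE_FAULT = 8         # assumed vector raised when that fault cannot be delivered


class Shutdown(Exception):
    """Raised when the modeled CPU gives up; real hardware would reset."""


class Cpu:
    DESCRIPTOR_SIZE = 8  # bytes per IDT entry in protected mode

    def __init__(self, idt_limit):
        self.idt_limit = idt_limit  # offset of the last valid byte in the IDT
        self.trace = []             # (depth, vector) for every delivery attempt

    def descriptor_reachable(self, vector):
        # A descriptor is usable only if all of its bytes lie within the limit.
        last_byte = vector * self.DESCRIPTOR_SIZE + self.DESCRIPTOR_SIZE - 1
        return last_byte <= self.idt_limit

    def deliver(self, vector):
        # Try the requested vector, then the fault for that failure,
        # then the double fault. A third failure means shutdown.
        escalation = [vector, GENERAL_PROTECTION, DOUBLE_FAULT]
        for depth, candidate in enumerate(escalation, start=1):
            self.trace.append((depth, candidate))
            if self.descriptor_reachable(candidate):
                return candidate  # a handler was found, so it would run
        raise Shutdown("three consecutive delivery failures")


if __name__ == "__main__":
    cpu = Cpu(idt_limit=0)  # a "null" IDT: no descriptor fits inside it
    try:
        cpu.deliver(1)      # the INT 1 from the article
    except Shutdown as exc:
        print("shutdown:", exc, cpu.trace)
```

With a normal 256-entry table (`idt_limit=0x7FF`) the same `deliver(1)` call finds its handler on the first attempt and no escalation happens.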
The 286 processor couldn't handle faults more than 3 deep, so it gave up the ghost and reset itself. Which caused the system ROM to start executing, and we simply set the real mode start address (which was kept in real memory) and poof! we had transferred from protected mode to real mode in microseconds (not milliseconds).
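How the reset can land back in the operating system's real-mode code can be sketched as follows. This is a toy model, not BIOS source. It assumes the AT-style convention in which the ROM reads a shutdown code from CMOS register 0x0F after a reset and, for the "resume" codes, jumps through a far pointer stored at 0040:0067 in the BIOS data area. The function names (`arm_resume`, `bios_reset_entry`) and the dictionary/bytearray stand-ins for CMOS and memory are hypothetical.

```python
CMOS_SHUTDOWN_REGISTER = 0x0F          # assumed CMOS index of the shutdown code
RESUME_CODES = {0x05, 0x0A}            # assumed codes meaning "jump via 40:67"
RESUME_VECTOR_ADDR = (0x0040, 0x0067)  # assumed location of the far pointer


def physical(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def arm_resume(cmos, memory, segment, offset):
    """What the OS would do before forcing the reset (modeled)."""
    base = physical(*RESUME_VECTOR_ADDR)
    # Far pointers are stored little-endian: offset word first, then segment.
    memory[base:base + 4] = offset.to_bytes(2, "little") + segment.to_bytes(2, "little")
    cmos[CMOS_SHUTDOWN_REGISTER] = 0x0A


def bios_reset_entry(cmos, memory):
    """What the ROM would do right after the reset (modeled)."""
    code = cmos.get(CMOS_SHUTDOWN_REGISTER, 0)
    if code not in RESUME_CODES:
        return "cold boot"
    base = physical(*RESUME_VECTOR_ADDR)
    offset = int.from_bytes(memory[base:base + 2], "little")
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return physical(segment, offset)  # address where execution resumes


if __name__ == "__main__":
    cmos, memory = {}, bytearray(1 << 20)
    arm_resume(cmos, memory, segment=0x1234, offset=0x0010)
    print(hex(bios_reset_entry(cmos, memory)))
```

The point of the model is that the resume address lives in ordinary low memory, so it survives the processor reset and the ROM can hand control straight back.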
I actually found a write-up of this technique on the web here. Interestingly enough, the article on the web credits Intel for this technique. They may be right: I remember it being developed in-house, but I may be wrong. After I unpack my office, I'll check my 286 reference manual.
Much later: I just checked my 286 reference manual (a rather well-thumbed first edition), and I found the reference that was discussed in that article. The comment accompanying the code is "Setup null IDT to force shutdown on any protection error or interrupt". That's it, that's the only hint that Intel came up with the triple faulting technique to force a restart in real mode. Personally, I'm not surprised that the IBM engineers didn't pick up on this.
Comments
- Anonymous
February 08, 2005
> The theory was that you'd never want to go
> back to real mode, that would be "silly".
Actually that much of it is true. The part they forgot is this: no matter how much you don't want to do silly things, sometimes you have to do silly things, so you still ought to plan for them.
- Anonymous
February 08, 2005
Wouldn't INT 3 have saved a byte?
- Anonymous
February 08, 2005
Mike: You're right, it would have - that may have been what they did actually :)...
- Anonymous
February 08, 2005
I recall the "official" way to switch back to real mode to be by programming the keyboard controller so it would end up resetting the processor ... or something like this. I read this in the only book on 286 programming I could get my hands on at the time. Uff ...
- Anonymous
February 08, 2005
Hugh ... I guess I should have read the article more carefully. You mention it right there. Sorry about that.
- Anonymous
February 08, 2005
I never really had much to do with PCs in the 286 era, and I've always wondered why the keyboard controller did so much more than control the keyboard. Is this related in some way to the "A20 gate enabled" option that used to be in the BIOS?
- Anonymous
February 08, 2005
"I recall the "official" way to switch back to real mode to be by programming the keyboard controller so it would end-up resetting the processor ... or somrthing like this. I read this in the only book on 286 programming I could get my hands on at the time. Uff ..."
If I remember correctly, the keyboard controller aspect of it had to do with enabling/disabling the A20 gate to get full use of memory when in protected mode...
- Anonymous
February 08, 2005
In some ways this is the type of stuff I would never let my programmers do: "try stuff till you find something that works". I guess it is a different world with CPUs, but I can't help but think of all those Windows programmers who rely on undocumented behavior (not features) in Windows, which results in Microsoft having to add shim code to avoid breaking the applications.
- Anonymous
February 08, 2005
Tim,
It wasn't really "try stuff 'til you find something that works" - it was more of a "Ok, we've got to figure out how to reset the CPU - pore over the Intel reference manuals and see what you can find".
The triple fault behavior was known, it just took some really clever people to put the triple fault behavior together with a desire to switch to real mode and make it work.
Btw, I forgot to add to the article that for the 386, Intel added the ability to quickly switch back to real mode :)
- Anonymous
February 09, 2005
Daniel, using the triple fault trick brought the context switch time from 15+ milliseconds to 800ish microseconds. Since the same real-mode code to restart was executed in both cases, that implies that the 14 millisecond savings came from the time to reset the controller, and not from the time to get back to your real mode code.
I actually wrote about the A20 line trick a while ago - himem.sys was the driver to enable that piece of magic.