Ran across the Microsoft Drawbridge project while researching Midori (an MS Research project that has moved into one of the commercial business units). It’s fascinating.
Understanding “library OS” really requires that you unthink a decade or two of Microsoft shorthand (and a little propaganda). You could also just go back a few years in time. Back then, “OS” meant “the software that dealt with the hardware.” An OS always provided an Application Programming Interface (API) that let applications ask the OS to draw stuff on the screen, use memory, write stuff to disk, things like that.
Modern versions of Windows have much the same structure, actually. Atop the core, low-level OS is what you might call a Windows “personality.” Today, that consists of a kernel-mode service as well as a user-mode service. That “personality” is essentially an expanded set of APIs (100,000 or more) that make it easier for developers to draw windows and buttons on the screen, write complex data structures to disk, manage memory, and whatnot. But at its heart, the low-level OS functions really are separate. Remember, Windows was designed to have a “POSIX subsystem,” which was essentially another “personality” of APIs layered atop the core OS, enabling Windows to “run UNIX applications” (massive overstatement, but you get the idea).
Modern virtualization is a high-cost (in terms of performance), low-effort way of running multiple “personalities” atop a single OS. Essentially, the hypervisor becomes the OS, dealing directly with hardware. Each VM presents emulated hardware, so that you can just plop a regular old OS inside the VM and have it run. But the whole “emulate hardware and run a whole OS and its personality” approach is a lot of overhead. We tend to not notice because Moore’s Law keeps handing us faster hardware.
Drawbridge proposes (and they’ve got a working model implementation) scaling back the core OS to just the deal-with-hardware stuff. Everything talking to those low-level APIs has to go through a security layer, placing a hard-and-fast security bottleneck around your core resources. The core OS can also provide some basic API stacks.
Atop that, you run what I guess we’d call user-mode processes. Imagine taking the Win32 APIs and rebuilding them to talk to the underlying Drawbridge OS. Each application runs its own copy of the Win32 APIs, right in-process. Essentially, you’re moving the layer of abstraction so that each application comes with its own little “personality.” That may seem absurd, until you remember that it’s what modern hypervisors already do when every VM carries its own full OS; Drawbridge just skips the hardware emulation. If you ran copies of Word and Excel, they could presumably link to the same “personality” (e.g., Win32 APIs) much like they link to DLLs today. The result is “virtualization” of a more efficient nature.
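To make the layering a bit more concrete, here’s a toy sketch in Python. None of these names come from Drawbridge itself: `HostKernel`, `Win32Personality`, and `run_app` are all made up for illustration, and the “security layer” is just an allow-list check. The only point is the shape: a tiny host that hands out raw resources through one checked doorway, and a richer API that lives inside each application.

```python
class HostKernel:
    """Hypothetical minimal host: raw resources only, every call checked."""

    def __init__(self, allowed_paths):
        self._allowed = set(allowed_paths)  # stand-in for a real security policy
        self._disk = {}                     # pretend storage: path -> bytes

    def _check(self, path):
        # The single chokepoint every request has to pass through.
        if path not in self._allowed:
            raise PermissionError(f"host denied access to {path}")

    def raw_write(self, path, data: bytes):
        self._check(path)
        self._disk[path] = data

    def raw_read(self, path) -> bytes:
        self._check(path)
        return self._disk[path]


class Win32Personality:
    """Hypothetical in-process library: a friendlier API built on the host calls."""

    def __init__(self, host):
        self._host = host

    def write_text_file(self, path, text):
        self._host.raw_write(path, text.encode("utf-8"))

    def read_text_file(self, path):
        return self._host.raw_read(path).decode("utf-8")


def run_app(name, host, path):
    # Each application creates its own personality instance, in-process.
    api = Win32Personality(host)
    api.write_text_file(path, f"saved by {name}")
    return api.read_text_file(path)


host = HostKernel(allowed_paths={"/docs/a.txt", "/docs/b.txt"})
word_result = run_app("Word", host, "/docs/a.txt")
excel_result = run_app("Excel", host, "/docs/b.txt")
```

Both “apps” share the same personality code but each gets its own instance, and neither can touch anything the host hasn’t allowed. A real library OS is obviously far more involved than this.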
The Drawbridge concept also includes the ability to suspend an application, and to resume it on other hardware – a la VM snapshots, really, only again without the overhead of emulating hardware. You could easily run multiple “personalities” in parallel, since each would be in a separate process. Again, like we do today with hypervisors, but without emulating hardware.
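Here’s a similarly rough Python sketch of the suspend/resume idea. It assumes the application’s entire state lives inside the process and can be written out as plain data; `CounterApp`, `suspend`, and `resume` are invented names, not anything from Drawbridge, and a real implementation would have to capture far more than two fields.

```python
import json


class CounterApp:
    """Hypothetical application whose whole state lives in-process."""

    def __init__(self, count=0, log=None):
        self.count = count
        self.log = list(log or [])

    def step(self):
        self.count += 1
        self.log.append(f"step {self.count}")


def suspend(app) -> str:
    # Capture only the application's state; no hardware state is involved.
    return json.dumps({"count": app.count, "log": app.log})


def resume(snapshot: str) -> CounterApp:
    # Rebuild the application from the snapshot, possibly on another machine.
    state = json.loads(snapshot)
    return CounterApp(state["count"], state["log"])


app = CounterApp()
app.step()
app.step()
snapshot = suspend(app)   # the original process could now go away
del app

moved = resume(snapshot)  # pretend this runs on different hardware
moved.step()
```

The snapshot here is a short string rather than a multi-gigabyte VM image, which is the appeal: you move the application, not a whole emulated computer.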
So basically, instead of Windows being an “OS,” Windows becomes a giant “library” (or DLL, to use a familiar implementation term). You just separate the bits that deal with hardware from all the other bits. This would require all of your OS “personalities” (Windows, Linux, and whatnot) to be written to a new set of APIs. Well, translated. Right now, they all do work with a common API: the core, BIOS-based APIs that our computers’ firmware speaks. Drawbridge just proposes that they be refactored to talk to a slightly different API, which would run the software in slightly different ways.
It’s a fascinating approach that’s been around academically for about a decade; Drawbridge just appears to be the first practical attempt to make it work as commercial software. Kinda neat.
The “cloud” advantages are pretty obvious – if you can suspend/resume/migrate individual processes rather than entire virtual machines, you gain a markedly higher level of productivity and flexibility in terms of availability and workload management.
Be interesting to see where MS Research goes with it, huh?