UNIX Systems for Modern Architectures (1994)

The next three years will determine whether UNIX becomes the universal OS for tera-scale computing or fragments into proprietary SMP variants (Windows NT is breathing down our necks). As of April 1994, the smart money is on UNIX, but only if the Berkeley and System V traditions can merge into a truly scalable, modern kernel.

A misbehaving network card at 100 Mbps can generate 150,000 interrupts per second. The danger is that if all of those interrupts go to one CPU, that CPU is dead. The solution is interrupt coalescing (already in some Ethernet chips) and the use of "kernel threads" for bottom halves, so that the interrupt dispatcher merely wakes a thread that can run on any CPU.
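
To make the effect of coalescing concrete, here is a minimal sketch of a toy model, not driver or firmware code. It assumes a card that raises one interrupt when either a frame-count threshold or a delay limit is reached, whichever comes first. The function name `coalesce` and both parameters are hypothetical, chosen only for illustration, and the model does not cover the kernel-thread half of the solution.

```python
def coalesce(arrival_times_us, max_frames, max_delay_us):
    """Return the times (in microseconds) at which interrupts would be raised.

    Toy model: an interrupt fires when max_frames frames are pending, or when
    the oldest pending frame has waited max_delay_us, whichever comes first.
    arrival_times_us must be sorted in ascending order.
    """
    interrupts = []
    pending = 0            # frames received but not yet signalled
    first_pending = None   # arrival time of the oldest pending frame
    for t in arrival_times_us:
        # The delay timer for the batch in progress expires before this arrival.
        if pending and t - first_pending >= max_delay_us:
            interrupts.append(first_pending + max_delay_us)
            pending = 0
        if pending == 0:
            first_pending = t
        pending += 1
        # The frame-count threshold is reached by this arrival.
        if pending == max_frames:
            interrupts.append(t)
            pending = 0
    # Any frames left over are flushed by the delay timer.
    if pending:
        interrupts.append(first_pending + max_delay_us)
    return interrupts


# One second of traffic at 150,000 frames per second, in integer microseconds.
arrivals = [(i * 1_000_000) // 150_000 for i in range(150_000)]
print(len(coalesce(arrivals, max_frames=1, max_delay_us=250)))
print(len(coalesce(arrivals, max_frames=32, max_delay_us=250)))
```

With the assumed threshold of 32 frames, the model raises 4,688 interrupts for that second of traffic instead of 150,000. The price is added latency, bounded here by the assumed 250-microsecond delay limit.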

Consider the traditional sleep() / wakeup() mechanism. On a single-CPU UNIX, this was elegant. On an SMP system, it requires a "rendezvous" interrupt to all CPUs, flushing TLBs and invalidating cache lines. A 1994 benchmark on an SGI Challenge (12x MIPS R4400) showed that a simple select() loop on 1,000 file descriptors caused 40% of kernel time to be spent in cross-CPU TLB shootdowns.
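
For readers unfamiliar with the mechanism, the following is a minimal sketch of the wake-all semantics traditionally associated with sleep() and wakeup(): processes block on an opaque wait channel, and a wakeup on that channel makes every sleeper runnable. It is a single-threaded toy model in Python, not kernel code, and the class name `SleepQueue` is hypothetical. It does not model the cross-CPU interrupts or TLB flushes described above; it only shows how many processes one wakeup can set in motion.

```python
from collections import defaultdict


class SleepQueue:
    """Toy model of wait channels: sleepers are keyed by an opaque channel."""

    def __init__(self):
        self._sleepers = defaultdict(list)

    def sleep(self, chan, proc):
        # Record that proc is blocked until someone calls wakeup(chan).
        self._sleepers[chan].append(proc)

    def wakeup(self, chan):
        # Wake-all semantics: every sleeper on the channel becomes runnable,
        # even if only one of them can make progress afterwards.
        return self._sleepers.pop(chan, [])


queue = SleepQueue()
for name in ["p1", "p2", "p3"]:
    queue.sleep("buffer", name)

runnable = queue.wakeup("buffer")
print(runnable)
```

In this model all three sleepers become runnable from a single wakeup. If only one of them can actually claim the buffer, the other two must recheck their condition and go back to sleep, and on a multiprocessor each of them may have been scheduled on a different CPU in the meantime.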
