Friday, September 15, 2006

First Cell-Based Computer Announced

Late yesterday, IBM announced it is finally making available the
first general-purpose computing system to use the Cell processor.
Proving the Cell is not just for game consoles, IBM is infusing its
high-performance System Cluster 1350 setup with a Cell-based
BladeCenter QS20 option.

It's being marketed as a device for
"compute-intensive" operations, which confirms expectations that Cell
would be introduced on the high end, and touted for its
number-crunching ability. Each QS20 blade will feature a pair of Cells,
each of which is what the STI coalition -- Sony, Toshiba, and IBM --
describes as a "multi-element" processor, rather than "multicore."


A single processor (in this setup) includes one element that's
essentially a dressed-up Power or PowerPC core, known as the Power
Processor Element (PPE). Its job is to analyze the
task at hand, then identify and isolate repetitive portions that best
lend themselves to parallel operation. (Most compute-intensive
operations, it turns out, are iterative.) Those portions are
then delegated by the PPE to up to eight so-called "synergistic
processing elements" (SPEs), which in a sense consume these
partly-digested tasks produced for them by the PPE.
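
The delegation pattern described above can be sketched in a few lines of Python. This is only an analogy, not Cell code: the names `spe_kernel` and `ppe_dispatch` are made up for illustration, ordinary threads stand in for SPEs, and a sum of squares stands in for the repetitive work.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SPES = 8  # the article's figure: up to eight SPEs per Cell


def spe_kernel(chunk):
    """Hypothetical stand-in for the repetitive work one SPE would run."""
    return sum(x * x for x in chunk)


def ppe_dispatch(data, workers=NUM_SPES):
    """Split the data into at most `workers` chunks, farm them out, combine."""
    # Ceiling division, with a floor of 1 so empty input does not break range().
    size = max(1, -(-len(data) // workers))
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(spe_kernel, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(ppe_dispatch(list(range(1000))))
```

The point of the sketch is the shape of the work: one coordinator carves up a repetitive job, the workers each chew on a piece, and the coordinator combines the partial results.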

The process is a
lot more similar to the delegation of tasks in a graphics processor
than to that in an Intel or AMD multicore processor, although the PPE/SPE
relationship in a Cell is much more broadly defined. When multiple Cell
processors are combined, the PPEs are engineered to
work together, so they can layer the delegation of tasks among
successive tiers of SPEs.

In other words, a Cell can break down
tasks, then break them down again if more SPEs are available. As a
result, engineers have found, the efficiency of a Cell-based system can
rise at a rate closer to exponential than linear as more SPEs become
available. More accurately, Cell systems may be less susceptible to
efficiency drop-offs as a system scales up, though the true test of that theory comes now.

"Increasing
frequencies and deeper pipelines have reached diminishing returns on
performance due to issues with power consumption/dissipation and memory
latencies," IBM said on Thursday. "The QS20 addresses this problem
head-on with two 3.2 GHz Cell BE processors on the blade."

Incidentally,
this is the same clock speed that Sony’s
PlayStation 3 will use. Each PPE has 512 KB of L2 cache, while each SPE has 256 KB
of what STI describes as "local store memory," which is part of a
unique, three-tiered memory structure that may get its first serious
test with the QS20. With 256 KB all to itself, each SPE operates as a
little, self-contained computer; and since it only has to deal with
user application-oriented tasks and never with the operating system,
IBM engineers say they can concentrate those efficiency benefits on
those tasks the user actually sees. The PS3 reportedly only uses seven
of the eight SPEs available, reserving #8 as a spare.
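
To get a feel for what a 256 KB local store means in practice, here is a rough back-of-the-envelope sketch of how a large array might be cut into tiles that fit. The function name `tile_sizes`, the 4-byte element size, and the 64 KB set aside for program code and buffers are all assumptions chosen for illustration, not figures from IBM.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE local store, per the article


def tile_sizes(n_elements, bytes_per_element=4, reserved_bytes=64 * 1024):
    """Return the lengths of the tiles needed to cover n_elements.

    reserved_bytes is an assumed allowance for code and scratch space
    that shares the local store with the data.
    """
    usable = LOCAL_STORE_BYTES - reserved_bytes
    per_tile = usable // bytes_per_element
    if per_tile <= 0:
        raise ValueError("no room left for data in the local store")
    full, rest = divmod(n_elements, per_tile)
    return [per_tile] * full + ([rest] if rest else [])


if __name__ == "__main__":
    print(tile_sizes(100_000))
```

Under these assumptions a tile holds 49,152 four-byte values, so a 100,000-element array needs three trips through the local store.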

A little
phrase that IBM engineers use to benchmark efficiency gains is
“Gelsinger’s Law,” referring to Intel Senior Vice
President Pat Gelsinger. It was Gelsinger who pointed out that overall
throughput increases by only about 40 percent every time the number of
transistors in a processor doubles in accordance with Moore’s Law.
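
The arithmetic behind that rule of thumb is easy to work out. The sketch below simply compounds the article's 40 percent figure over successive doublings; the function name `gelsinger_throughput` is made up for illustration, and the 1.40 factor is taken from the text above, not from a measurement.

```python
import math

GAIN_PER_DOUBLING = 1.40  # the article's figure: 40 percent more throughput per doubling


def gelsinger_throughput(growth_ratio):
    """Relative throughput after resources grow by growth_ratio (2 = one doubling)."""
    doublings = math.log2(growth_ratio)
    return GAIN_PER_DOUBLING ** doublings


if __name__ == "__main__":
    for ratio in (2, 4, 8):
        print(ratio, round(gelsinger_throughput(ratio), 2))
```

By this yardstick, eight times the resources buys a bit less than three times the throughput, which is the bar IBM is suggesting the Cell can clear.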

IBM
uses this benchmark as a sort of tease, to prove the Cell can do
better. Soon, we’ll be able to find out for ourselves, as we
finally see how a Cell system performs against Xeon and Opteron in the
same environment.
