Tuesday, June 12, 2007

Heterogeneous cores disrupt the PCI market

The trend in CPU development is toward multiple cores on a single chip. Following Moore’s law, CPU die masks will shrink, and we will have the room to fit 8, 16, or 32 cores on a single chip.

Much of today’s software will have to be rewritten to take full advantage of a large number of cores. I think the multi-core CPU will also be a disruptive force on hardware outside of the CPU.

In the early days of the x86 processor, floating point math was very slow. It could take many clock cycles to complete a float divide, let alone square roots or trig functions. I actually worked on a video game that did all of its 3D physics and collisions in fixed point arithmetic simply so it could use only the integer registers on the 386 CPU.
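To make that concrete, here is a minimal sketch of the kind of fixed point arithmetic such a game relied on, written in C. The 16.16 format and the helper names are my own for illustration, not the game's actual code; the idea is just that a real number is stored as an integer scaled by 2^16, so every multiply and divide runs in the integer unit with no FPU involved.

    #include <stdio.h>
    #include <stdint.h>

    /* 16.16 fixed point: high 16 bits are the integer part, low 16 bits
       are the fraction.  All of the math stays in the integer registers. */
    typedef int32_t fixed;

    #define FIX_SHIFT 16
    #define FIX_ONE   (1 << FIX_SHIFT)

    static fixed  fix_from_int(int n)    { return (fixed)n << FIX_SHIFT; }
    static double fix_to_double(fixed a) { return (double)a / FIX_ONE; }

    /* Widen the intermediate result to 64 bits so it cannot overflow.
       (A real 386-era game would do this with a 32x32->64 multiply in
       assembly; int64_t is used here only to keep the sketch portable.) */
    static fixed fix_mul(fixed a, fixed b) { return (fixed)(((int64_t)a * b) >> FIX_SHIFT); }
    static fixed fix_div(fixed a, fixed b) { return (fixed)(((int64_t)a << FIX_SHIFT) / b); }

    int main(void)
    {
        fixed speed = fix_div(fix_from_int(3), fix_from_int(4)); /* 0.75          */
        fixed dist  = fix_mul(speed, fix_from_int(10));          /* 0.75 * 10     */
        printf("distance = %f\n", fix_to_double(dist));          /* prints 7.5    */
        return 0;
    }

Floating point is only used at the very edge, to print the result; everything in the middle is integer adds, shifts, multiplies, and divides.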

Slow floating point performance created the market for a separate floating point unit, or coprocessor, which was sold as an optional add-on.

Then Intel disrupted the market by shrinking the CPU mask image enough to fit the FPU onto the same die as the CPU. This was great for customers, but not so good for companies like Cyrix and Weitek, who had been selling FPUs. Suddenly they were in the position of selling wagon wheels in Detroit just as the automobile came to market.

Multiple cores offer the opportunity for another disruption. The current generation of Pentium architectures offloads several important tasks to PCI cards: network IO, graphics, and sound processing.

Each of these tasks is characterized by IO bound processing, lots of memory management, and device-oriented integration. Each task is standardized and well defined, so specialized hardware can make the processing go faster.

In parallel processing, there is always a drop in per-core utilization as the number of cores increases, because some part of the task is inherently serial or requires communication and coordination.

Each additional general purpose core becomes less valuable to add to the CPU.
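This diminishing return is the familiar Amdahl's law argument (my framing, not the post's): if a fraction s of the work is inherently serial, the best possible speedup on n cores is 1 / (s + (1 - s) / n). A tiny C sketch shows how quickly the curve flattens.

    #include <stdio.h>

    /* Amdahl's law: upper bound on speedup when a fraction `serial`
       of the work cannot be parallelized. */
    static double amdahl_speedup(double serial, int cores)
    {
        return 1.0 / (serial + (1.0 - serial) / cores);
    }

    int main(void)
    {
        const double serial = 0.10;  /* assume 10% serial coordination */
        const int cores[] = { 1, 2, 4, 8, 16, 32 };

        for (int i = 0; i < 6; i++)
            printf("%2d cores -> %.2fx speedup\n",
                   cores[i], amdahl_speedup(serial, cores[i]));
        return 0;
    }

With a 10% serial fraction, doubling from 16 to 32 cores buys only a modest further gain, which is exactly why each extra general purpose core is worth less than the one before it, and why spending die area on specialized cores starts to look attractive.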

In the multi-core generation, there is more room on the chip to devote some of the cores to special tasks. A heterogeneous chip will have a mix of general purpose cores and specialized cores. Common IO bound tasks -- networking, sound, and graphics -- can be moved from the PCI card to the CPU.

For hardware system integrators and PC vendors, the number of subsystems they have to manage drops radically. The big win is in laptops and mobile devices, where there is a dollar premium attached to smallness. Simplifying away all the boards, drivers, dedicated memory, and cards is a real economic success story.

Customers will benefit as the production cost of PC systems continues to drop radically.

Heterogeneous chips can have different mixes of cores specialized for different markets.

The losers might be the PCI card vendors who watch as the bottom end of their market gets taken away.

You can see some of the fallout already. AMD started a heterogeneous core strategy when it acquired ATI, gaining the graphics technology that ATI had developed. Now AMD can roll that into future generations of heterogeneous core chips: some of the cores will be general purpose, and some will be built to implement DirectX 10 in hardware and support low-end Vista graphics.

Intel, on the other hand, seems more conservative and currently favors homogeneous cores.

Market pressure may change that over time. It will be interesting to watch this shake out over the next few years.

A final question to ponder: What common tasks will we dedicate hardware to once we have more cores to play with?

Monday, June 11, 2007

I want a revolution (every twenty years)

When I was in school, I was in the midst of a revolution.

Software was engineering, and they were going to do it right. Doing software “right” had meant the Top Down Software Process model, and it was Dogma. The revolutionaries wanted the Waterfall Process model.

The very first process model was the Heroic model. The “heroes” were math PhDs and electrical engineers brainstorming how to make computers work for the first time. Computers came out of the effort to win World War II; they began as labor-saving tools for mathematicians working on trigonometric artillery tables. Later, computer technology began to incorporate control theory from aviation.

AT&T, the heavily regulated telephone monopoly, used computers to compress even more customer calls into its widely installed base of wires and call centers, thus driving down their cost and increasing profits without having to get an Act of Congress.

But the Heroic process model did not scale to larger problems.

By the 1960’s, computers were a big business. Software to control them was being written to solve large, complex problems, with a lot of money, equipment, and lives at stake. There were spectacular project failures. Rockets flew off course. Phone calls were noisy or not there at all. Software was delivered late, buggy, or not at all.

A new generation of wise men of computers devised a revolution. They replaced the Heroic process model with the Top Down Process Model:

1) gather requirements from customers

2) decompose the problem into functionality to meet the requirements

3) design modular systems to meet the functionality

4) implement the design

5) test

6) document

7) maintenance

The Top Down Process scaled to larger problems than the Heroic model of just putting a bunch of smart guys in a room and hoping for the best.

As time went on, the space race drove software to new heights with the need to solve problems in complex orbital math, control, and communications. Computers also started to be used in commerce in a very big way. Banks and stock markets realized that computers could record large volumes of transactions quickly, cheaply, and accurately.

The apex of the Top Down model was CASE (Computer Aided Software Engineering) in the early 1980’s. The idea was that you could have a design that was so good, so clear, and so coherent that you could capture it all in “The Tool,” push a button, and a working system would pop out the back end. This idea was very popular with large software customers like the Defense Department, which had spent hundreds of millions of dollars on systems that, in the end, didn’t really do everything that the DoD wanted.

Twenty years had passed. There were spectacular failures. Banks lost track of money. Radar systems would not talk to the alert system, which would not talk to the missile systems. Software was delivered late, buggy, or not at all.

In the 1980’s, a new generation of wise men of computers devised a revolution, the Waterfall Process model:

1) gather requirements from customers

2) decompose the problem into functionality to meet the requirements

3) design modular systems to meet the functionality

4) implement the design

5) test

6) iterate: if not done, goto 1

7) document

8) maintenance

Step 6, planned iteration, was a big deal. It tackled the scaling problem by admitting that problems are large and complex, and that you could not simply be “smart enough” to design them right on the first pass. Plan to iterate. Design, build, test, repeat.

RAD (Rapid Application Development) was the vanguard of the model: if you built your software in rapid iterations, you would arrive at a working solution in an economically feasible amount of time.

Huge and important systems were developed. A personal computer with word processing, spreadsheets, and email went onto the desk of nearly every office worker.

The internet with email and the web revolutionized commerce and communication. Google organized and indexed the internet and effectively made everyone smarter.

Twenty years passed. The problems scaled larger. There were spectacular failures. The internet is full of security threats from crooks. The FBI could not get its case files into the hands of the agents and law enforcement. The Vista operating system was delivered years late and with many of its original features removed. Software was delivered late, buggy, or not at all.

In the 2000’s, another generation of wise men of computers had a revolution, Agile Development:

1) gather requirements from customers

2) refactor the problem into functionality to meet the requirements

3) write automatic tests

4) implement the design

5) iterate: if not done, goto 1

6) document

7) maintenance

The software design phase was compressed even further. Testing moved from step 5 to step 3. The iteration cycles are even shorter than RAD.

Good software is being written this way. Automatic testing is a very powerful tool for writing modular code with good interfaces. It allows our projects to be robust in the face of change.
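As a small, hypothetical illustration of what that looks like in practice, here is an automated test written in plain C with assert; the clamp function and its contract are invented for this sketch. The value of the test is that it states the interface's expected behavior and fails loudly the moment a change breaks it.

    #include <assert.h>
    #include <stdio.h>

    /* Hypothetical function under test: clamp a value into [lo, hi]. */
    static int clamp(int value, int lo, int hi)
    {
        if (value < lo) return lo;
        if (value > hi) return hi;
        return value;
    }

    /* The automated test: it pins down the interface's contract. */
    static void test_clamp(void)
    {
        assert(clamp(5,  0, 10) == 5);   /* in range: unchanged      */
        assert(clamp(-3, 0, 10) == 0);   /* below range: pinned low  */
        assert(clamp(42, 0, 10) == 10);  /* above range: pinned high */
    }

    int main(void)
    {
        test_clamp();
        printf("all tests passed\n");
        return 0;
    }

Run the tests on every change, and the interface stays honest even as the implementation behind it is refactored.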

So every twenty years, a new generation of wise men has advanced the state of the art with a revolution:

Heroes of computers in the 1940’s

Top Down model in the 1960’s

Waterfall in the 1980’s

Agile in the 2000’s

Each revolution had to live with and accommodate the old ways of doing things, because those ways still worked for a given class and size of problem.

But the problems will continue to grow in size and complexity. Agile programming makes testing a core principle, and that is good.

However, I think that in fifteen to twenty years, a new generation of wise men of computing will stage a revolution. A new process model will replace Agile as we get to the limits of what it can do. Unit testing code, even with 100% code coverage, does not mean you test 100%, 50%, or even 10% of the states and interactions of the code. Multi-threading and distributed computing will make the systems larger and more diffuse, and the complexity of the interactions will dominate the complexity of a single component. We want to be able to change parts of large distributed systems while they are still running - how do you test that completely? Software will get delivered late, buggy, or not at all.
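Here is a toy C example of that coverage gap; the function and its two options are my own invention for illustration. The two tests in main() execute every line and every branch of step(), so line and branch coverage both report 100 percent, yet the one combination of options that actually divides by zero is never exercised.

    #include <stdio.h>

    /* Hypothetical function with two options that are each harmless
       on their own but fail in combination. */
    static int step(int shift, int scale, int value)
    {
        if (shift)
            value = value - 10;   /* option A: shift the input down */

        int divisor = 1;
        if (scale)
            divisor = value;      /* option B: scale by the input   */

        /* With BOTH options and value == 10, divisor is 0 and this
           divides by zero.  No single-option test can catch it.    */
        return 100 / divisor;
    }

    int main(void)
    {
        /* Together these two tests cover every line and branch of step(). */
        printf("A only: %d\n", step(1, 0, 10));  /* 100 / 1  = 100 */
        printf("B only: %d\n", step(0, 1, 10));  /* 100 / 10 = 10  */

        /* The untested state: step(1, 1, 10) would crash. */
        return 0;
    }

Multiply that by thousands of interacting components and the state space dwarfs anything a line-coverage number can describe.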

Around 2020 or so, a new generation of wise men of computing will have to have a revolution. The problems will have scaled and we will have to do something new.

Viva La Revolution!