Sunday 10 February 2013

The Clean Slate Movement

It would be nice, wouldn't it, if our programs were shorter. They would be quicker to write, easier to understand, easier to maintain. The way this is supposed to work is that we take code made by other people and build on it, so that the small part we write draws together and coordinates the code from libraries and other self-contained programs. But in practice this has not worked out quite as well as we hoped. As James Hague points out in We Who Value Simplicity Have Built Incomprehensible Machines, the code that we try to re-use has become encrusted with layer after layer of cruft. How did that happen?

"... all the little bits of complexity, all those cases where indecision caused one option that probably wasn't even needed in the first place to be replaced by two options, all those bad choices that were never remedied for fear of someone somewhere having to change a line of code...they slowly accreted until it all got out of control, and we got comfortable with systems that were impossible to understand."

Impossible? We all know what it's like. Maybe the power of search engines and communal help over the Internet has mitigated the problem, but it's still a problem. We can get stuff done, but it gets more and more like black magic. (In another post, A Complete Understanding is No Longer Possible, Hague points out that for a reasonably thorough understanding of a MacBook Air, you would have to digest around 11,000 pages of documentation. It's not going to happen, is it?)

So what can we do? The short answer to that question is: "Start again, with a clean slate". It's a daunting prospect, and one that until recently no-one would have taken seriously. But over the last few years something that we might call "The Clean Slate Movement" has started to gain credibility.

For some in this movement, like Peter Neumann or Robert Watson, the central issue is security. Our systems have become so complex that it's impossible to make credible guarantees about security. Only by building new systems, re-thought from the hardware up, can we really make any progress. (If we try to do the right thing on current hardware, our systems usually end up being far too slow. If you want to understand how processor designs will be different in 10 years' time, you might take a look at The Cambridge CAP Computer. It's 30 years old, but it's actually from the future.)

Other people in the movement, notably Alan Kay and his colleagues at Viewpoints Research Institute, have a slightly different aim. They envision building systems which are dramatically simpler because they are dramatically smaller, with maybe 100 times or even 1000 times less code for the same functionality. They talk about writing code for a whole system, from applications down to the "bare metal", in perhaps 20,000 lines. Now if that were possible, we could certainly imagine understanding that whole system. If we printed it out 20 lines to a page, surrounded by commentary like the Talmud, it would only be 1000 pages long. Something like the size of the Oxford Users' Guide to Mathematics. Something you would expect a professional to be intimately familiar with, though they might have to look up the odd detail.

You might reasonably ask "Is all that really possible?" It does seem rather ambitious, even unlikely. But Alan Kay obviously has a formidable reputation and his associates have already had considerable success. A central part of their strategy is to use meta-compilers to define different, powerful languages in which to implement the different parts of their system. This does seem to lead to a substantial reduction in code size. (Meta-compilers are another old technology brought up-to-date, and one that I'll return to in a future post.)
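To give a feel for why this approach shrinks code so much, here is a toy sketch in Python. It is emphatically not the VPRI or OMeta code itself, and all the names are my own invention; it simply shows the flavour of the idea, in the parser-combinator style that tools like OMeta build on. A few small combinators form the reusable "meta" layer, and an object language (here, a pocket calculator) is then defined on top of them in a handful of lines.

import re

def token(pattern):
    # A parser that skips leading whitespace, then matches a regex at pos.
    rx = re.compile(r'\s*(' + pattern + r')')
    def parse(text, pos):
        m = rx.match(text, pos)
        return (m.group(1), m.end()) if m else None
    return parse

def alt(*parsers):
    # Ordered choice, as in a PEG: return the first parser that succeeds.
    def parse(text, pos):
        for p in parsers:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return parse

def action(parser, fn):
    # Apply fn to a parser's value: the hook where meaning is attached.
    def parse(text, pos):
        r = parser(text, pos)
        return (fn(r[0]), r[1]) if r is not None else None
    return parse

# The object language: integer arithmetic with +, -, * and parentheses.
number = action(token(r'\d+'), int)

def factor(text, pos):                      # factor <- number | '(' expr ')'
    r = number(text, pos)
    if r is not None:
        return r
    r = token(r'\(')(text, pos)
    if r is None:
        return None
    r = expr(text, r[1])
    if r is None:
        return None
    value, pos = r
    r = token(r'\)')(text, pos)
    return (value, r[1]) if r is not None else None

def term(text, pos):                        # term <- factor ('*' factor)*
    r = factor(text, pos)
    if r is None:
        return None
    value, pos = r
    while True:
        star = token(r'\*')(text, pos)
        if star is None:
            return value, pos
        rhs = factor(text, star[1])
        if rhs is None:
            return value, pos
        value, pos = value * rhs[0], rhs[1]

def expr(text, pos):                        # expr <- term (('+' | '-') term)*
    r = term(text, pos)
    if r is None:
        return None
    value, pos = r
    while True:
        op = alt(token(r'\+'), token(r'-'))(text, pos)
        if op is None:
            return value, pos
        rhs = term(text, op[1])
        if rhs is None:
            return value, pos
        value = value + rhs[0] if op[0] == '+' else value - rhs[0]
        pos = rhs[1]

def evaluate(source):
    # Parse and evaluate a whole expression, insisting all input is consumed.
    result = expr(source, 0)
    if result is None or source[result[1]:].strip():
        raise SyntaxError("cannot parse: " + source)
    return result[0]

print(evaluate("2 + 3 * (4 - 1)"))          # prints 11

The calculator itself is beside the point, of course. The point is that the combinators are the part you write once, and each new little language costs only the grammar-sized definition at the bottom; that is the sort of leverage the Viewpoints people are after, applied all the way down the system.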

Overall, can we really afford to throw away old systems and rebuild from scratch? The marginal cost of patching up a system is always less than the cost of starting from scratch. But if we only pay attention to short-term costs, then each repair will only barely do what's needed today, and afterwards the system as a whole will be a little worse. Unless we spend extra effort today and keep refactoring a system so that it continues to be maintainable, over time it tends to degenerate into a ghastly mess: fragile and dangerous.

To see where this ends, you only have to look at the notoriously crufty and fragile systems employed by the financial services industry. Take, for example, the market maker Knight Capital, which on one day in August 2012 suffered a 440 million dollar loss due to a "technology issue". (Their software offered to trade shares at stupid prices, an offer which other people were only too happy to accept.) In the long run, it's worth paying for craftsmanship. In the long run, crufty software is never the cheapest option.
