Welcome to my blog. There are many many others like it, but this one is mine. I'm a professor at Cornell who likes to build systems. My background is quite straightforward: I saw a computer for the first time when I was 13 and knew right then and there that these devices would revolutionize the world. I got my own Commodore-64 at 14 and have been writing software systems since then. I initially thought I'd study artificial intelligence and build über-intelligent robots that would take over the world, then I realized that what the world desperately needed was computer infrastructure that worked, so I specialized in systems. In the process, I got a PhD, developed two research operating systems (SPIN in the '90s and Nexus in the '00s) and built countless distributed systems. I'm writing this blog because of a confluence of two trends, one positive and one a bit negative.
The opportunity I saw back in the mid-'80s in the computing field has not abated one bit. True, back then, few companies were using computers in their daily processes. Government offices required forms in triplicate, often created by pressing really hard on sheets of paper separated by carbon paper. Exams were written on a typewriter, often with little errors here and there, and replicated on a precursor to the Xerox machine that used a blue ink that smelled so fantastic that the first thing you did was typically to sniff the exam paper like a creepy printed word fetishist. It was clear that computers and automation would bring tremendous value, which they did. And even though we've had one computing bubble since then, and are likely in the midst of a second one, the opportunities are still there. Imagine how much better life can be with further application of computers. Yes, the easy automation has been done, but the fun stuff is just beginning. The fact that most of the world conducts business online now opens up more, not fewer, opportunities for new people entering the field.
My pet theory is that every field that sits on top of an exploitable exponential process wields immense power in society. Physics wields power through fission and fusion. Medicine through the exponential process of pathogen growth. Finance through compound interest. And we geeks wield immense power through Moore's Law, an observation about an ever-increasing trend in the number of transistors that can be squeezed into a unit area. As an aside, it's worth sanity-checking my claim by looking at fields like literature, music, sociology, mechanical/civil engineering and others that do not involve exploitable exponential processes. Use any metric for measuring societal power (funding, column inches, mind share, etc.) and see where these fields land relative to physics, medicine, finance and computing. Also, note that my observation says nothing about the relative virtues of any of these fields. Some of the softer fields are deeply fulfilling, and finance is essentially pure evil when it is not useless. So if you're in a soft field, awesome! I love what you do, and please make it a Grande, no cream. For full disclosure, if I had to do everything all over again and could not pick CS for some reason, I'd actually be an English major (I draw a mean espresso as well). Anyhow, it is clear that computing will continue to wield immense power through the computational opportunities made possible by Moore's Law.
And the specific field of distributed systems, as of this writing, offers many exciting challenges and opportunities. We're now able to build massively parallel hardware, but have not yet figured out the software infrastructure to make effective use of this hardware. There are healthy commercial ecosystems, where each large company, including Google, Facebook and Amazon, has its own unique approach to how this infrastructure should be built. Academics, through testbeds like PlanetLab, EmuLab, VICCI and others, have the ability to experiment with distributed systems. The peer-to-peer wave, which came and passed, has paved the way for innovative deployments. In short, there are tons of opportunities with very few ossified, hard-to-penetrate components; an ideal scenario for bright young minds.
You might ask, given that there are a ton of tech blogs already, whether the world needs yet another one. Well, the jury is still out and it probably doesn't or, at least, shouldn't, but it needs something, because there is a lot of regurgitation out there in the blogosphere, and half the stuff being regurgitated doesn't even begin to make sense. There are many outdated ideas that are holdovers from an ancient era when the tradeoffs were different (and by ancient era, I mean a few years ago when processors had just a single core or disks had moving parts or whatever). There are dumb ideas being pushed by the industry to turn a profit. And there are a lot of falsehoods and muddled thinking that permeate the field. People build bad systems and they peddle broken software.
And this has very real, tangible societal costs. Why did the Therac-25 fry people to death and how is this ancient system related to why a Windows box will someday also kill people? Why can't the FAA revamp an air traffic control system? Why did Twitter have to go through three revisions before it started to half-way work, and yet it still can't reliably produce a list of people I'm following? Why do people pony up millions of dollars for klunky database software? Why are all the purported modern replacements for klunky database software broken by design? How come so many people talk about RESTful design without anyone being able to define what it means? Who thought that standardizing syntax on XML without agreeing on semantics would solve any problem at all, and what are those people doing today? Why can I still watch, spoof and redirect everyone's Internet traffic when I'm at a cafe, or more interestingly, at a hospital? Who creates split-DNS setups, and who peddles these as a best practice? Could these people all be members of Iranian sleeper cells tasked with slowly but surely undermining our civilization? Perhaps, but more likely, what we're seeing is simply what happens when your regular, well-meaning developers slowly but surely make bad calls. They make bad calls partly because there are many falsehoods and myths out there.
I thought a little bit about where these bad ideas come from. Is there a social network pointing to sources of bad design? Can we mine the Facebook and Twitter graphs to identify such sources and banish them to an alternative network where every page has a BLINK tag? It's certainly true that some people are two sigma above the mean when it comes to generating bad ideas. For instance, the software architect who invented the treasure trove of ill-conceived design that was Object Linking and Embedding (OLE) (aka "failure model? what's that? network objects? not invented here, never heard of them!") also designed ActiveX (aka "security? what's that?"). And we all know about the petition to have Lennart Poettering stop applying his reverse Midas touch to Linux (for those of you who missed it, Poettering had a hand in changing components, such as PulseAudio, HAL, DBUS, and the startup system in Linux, that didn't need changing and were broken for years as a result of said changes. This gave rise to various grassroots petitions for him to please stop).
These exceptional people aside, the problems we face are by and large of a collective nature. Until recently, computer science was expanding so rapidly that it didn't pay to spend any time fixing up misconceptions or even errors -- if a bunch of people are doing something that's flawed, it's much easier, and certainly less confrontational, to just pick a different problem and avoid the shitshow altogether. But, over the course of a half-century, these misconceptions and errors accumulate, and distributed systems is chock-full of them.
And some misconceptions are actively propagated by people with very deep pockets. It occasionally crosses my mind that someone I love will end up at the mercy of a flight control or life support system designed by someone who read one too many blog posts about how to build scalable systems using eventual consistency. The blogosphere teaches programmers that because of CAP, one can't have in-flight audio available to everyone while telemetry is delivered consistently to the autopilot at the same time, and so one or the other has to go, and we know we can't let go of in-flight entertainment. Or something like that; it's hard to simulate the mental fog of someone who has been at the receiving end of industry pablum and the blogospheric echo chamber. You might think this is an exaggeration, but I've seen the developer conferences, I've watched the videos, I've read the code and I've seen things you people wouldn't believe.
Hence, this blog. It was either this or start drinking. And I might yet choose the latter.
So, I plan to write about everything hacking-related, especially about distributed systems, occasionally about designing hardware, and sometimes about the process of building things, but it will always be about building robust things that work. I'll write about exciting new developments in the fields of distributed systems, operating systems, networking and security. I certainly enjoy tackling technical myths and pointing out broken design, so there'll be quite a bit of that. I will try not to proselytize my own research too much, but I do plan to talk about other people's recent research that I find interesting. When I'm pressed for time, I tend to write longer, and I hate our newly emerging "tl;dr culture" on the Internet, so I will likely not be writing the usual 1-to-2-paragraph blog posts that are over before they've even begun. I hope some of the writings will challenge the way you think about building systems. I'll try to keep the writing accessible to a non-specialist audience, and I'll certainly have to take liberties with precision in the process. If you want rigor and precision, academic papers provide quite a bit of that and I have a CV full of those; my goal here is to stimulate that jump into relevant papers. And I do have a life, so if I can average about one semi-decent post per week, I'll be happy.