Bitcoin Guarantees Strong, not Eventual, Consistency


It has somehow become a common adage that Bitcoin is eventually consistent. We now have both academics and developers claiming that Bitcoin provides a laughably weak consistency level that is reserved solely for first-generation NoSQL datastores.

All of these people are wrong.

In this post, I want to dispel the myth of Bitcoin's eventual consistency once and for all: Bitcoin provides an incredibly strong consistency guarantee, far stronger than eventual consistency. Specifically, it guarantees serializability, with a probability of anomaly that decreases exponentially with latency. Let's see what this means, and discuss why so many people get it wrong.

The Fallacious Eventual Consistency Claim

The error in thinking that Bitcoin is eventually consistent stems from looking at the operation of the blockchain and observing that the last few blocks of a blockchain are subject to rearrangement. Indeed, in Bitcoin and cryptocurrencies based on Nakamoto consensus, the last few blocks of the blockchain may be replaced by competing forks.

It's tempting to look at the way the blockchain operates and say "a-ha, because the contents of the blockchain are subject to change, it's clearly eventually consistent." This narrative might sound sensible to valley developers who have been indoctrinated by data-store companies that are packaging software that would not pass muster as failed master's theses. The same companies have been pushing bad science to justify the fact that they cannot implement a consistency guarantee in their data stores. Not surprisingly, this argument is flat-out wrong.

The Eventual Consistency Claim is Flawed

First of all, if one were to buy the premise of this argument, that we should look at the entirety of the blockchain to evaluate Bitcoin's consistency, then the conclusion we must draw would be that Bitcoin is inconsistent. There is absolutely no guarantee that a transaction observed in a block that is later orphaned will still be there after a reorganization, and therefore there is no eventual consistency guarantee.

This argument, that Bitcoin is as worthless a database as MongoDB, is more accurate than the argument that Bitcoin is eventually consistent, but is still completely wrong. The root cause of the wrong conclusion here is that the people making these arguments about Bitcoin's weak consistency have a messed-up analysis framework.

The Right Way to Evaluate Databases


Here's the correct way to evaluate the consistency of distributed databases, including Bitcoin.

Every database-like system on earth comes with a write protocol and a corresponding read protocol. When evaluating the properties of a system, we examine the behavior of that system when we go through these protocols. We do not peek behind the covers into the internal state of the system; we do not dissect it apart; and most of all, we do not examine the values of internal variables, find a variable that changes, and scream "hah! eventually consistent!"

The suffix of the Bitcoin blockchain acts, in essence, as the scratch space of the consensus algorithm. For example, in Paxos (an algorithm that makes the strongest possible guarantee and yields a serializable timeline), a leader can seemingly flip-flop -- it starts out proposing value v1, but can end up learning some other value v2, and having the whole system accept v2. It would be folly to look at this and say "Paxos is eventually consistent: look, the leader flip-flopped." We need to look at the output of the protocol, not its transient states or internal variables.

In general, we cannot take a God's eye view into distributed systems and judge them by what we see from that privileged vantage point. What matters, what systems are judged by, is what they expose to their clients through their aforementioned read and write protocols.

Bitcoin Provides a Strong Consistency Guarantee

And a completely different picture emerges when we go through Bitcoin's read protocol. To wit, the read protocol for Bitcoin is to discard the last Ω blocks of the blockchain. Satoshi goes into a detailed analysis of what the value of Ω ought to be, and derives an equation that relates Ω to the probability of observing an anomaly.
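
For reference, the calculation in section 11 of the Bitcoin whitepaper can be written as below. Here q is the attacker's share of the hash power, p = 1 - q is the honest share, and the whitepaper's z (the number of blocks the recipient waits for) is written as Ω; equating z with the Ω of this post is an assumption made for illustration, and the formula only applies when q < p.

```latex
\lambda = \Omega \, \frac{q}{p},
\qquad
P_{\text{anomaly}}(\Omega)
  = 1 - \sum_{k=0}^{\Omega}
      \frac{\lambda^{k} e^{-\lambda}}{k!}
      \left( 1 - \left( \frac{q}{p} \right)^{\Omega - k} \right)
```

The Poisson term models how far the attacker has progressed while the honest chain grew by Ω blocks, and the (q/p) factor is the chance of catching up from the remaining deficit.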

The nice thing about Bitcoin's read protocol is that it is parameterizable. Merchants can make this probability arbitrarily small. Because the probability drops exponentially with Ω, it's easy to pick an Ω such that the chance of observing a blockchain reorganization is lower than the chance of having the processor miscompute values due to strikes by alpha particles. If that's not sufficient for pedants, one can pick a slightly higher Ω such that the chances of observing a reorganization are lower than the likelihood that all the oxygen molecules in the room, via random Brownian motion, will pile up into one corner of the room and leave one to suffocate in the other corner.
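
To make the parameterization concrete, here is a minimal sketch that ports the attacker-success calculation from section 11 of the Bitcoin whitepaper to Python and then searches for the smallest Ω that meets a target risk. The function names `anomaly_probability` and `smallest_omega` are hypothetical, the attacker's hash-power share `q` is an input you have to estimate yourself, and treating the whitepaper's z as Ω is an assumption of this sketch.

```python
import math


def anomaly_probability(q: float, omega: int) -> float:
    """Chance that an attacker with hash-power share q overtakes a chain
    that is omega blocks ahead (whitepaper section 11, with z = omega)."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    if omega < 0:
        raise ValueError("omega must be non-negative")
    if q >= 0.5:
        # The formula assumes q < p; a majority attacker always catches up.
        return 1.0
    p = 1.0 - q
    lam = omega * (q / p)
    total = 1.0
    for k in range(omega + 1):
        # Poisson probability that the attacker has mined k blocks so far.
        poisson = math.exp(-lam)
        for i in range(1, k + 1):
            poisson *= lam / i
        # Subtract the cases where the attacker fails to close the gap.
        total -= poisson * (1.0 - (q / p) ** (omega - k))
    return total


def smallest_omega(q: float, target: float, limit: int = 1000) -> int:
    """Smallest omega whose anomaly probability is below target.
    The search limit is an arbitrary safety bound, not part of the paper."""
    if target <= 0.0:
        raise ValueError("target must be positive")
    for omega in range(limit + 1):
        if anomaly_probability(q, omega) < target:
            return omega
    raise ValueError("no omega within limit meets the target")


if __name__ == "__main__":
    print(anomaly_probability(0.10, 6))   # about 0.0002428
    print(smallest_omega(0.10, 0.001))    # 5
```

With a 10% attacker, each extra block cuts the risk by roughly a factor of four, which is the exponential drop the read protocol relies on.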

Satoshi suggests an Ω value of 6, in use by most merchants today. The read protocol simply discards the last 6 blocks, so reorganizations in the suffix of length 6 are not visible at all to clients. Sure, someone with a God's eye view could observe orphan chains of length 5 all day long, but it would not matter -- the Bitcoin read protocol will mask such reorganizations, and any inconsistency they might have led to, from clients.
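
The masking effect can be sketched in a few lines. This is a toy model, not Bitcoin's actual client code: blocks are plain strings, and `confirmed_view`, `reorganize`, and `is_prefix` are hypothetical helpers. It shows that when a reorganization is no deeper than Ω, the trimmed view a client had before the reorganization is still a prefix of the trimmed view afterwards, so the client never sees history change.

```python
from typing import List, Sequence


def confirmed_view(chain: Sequence[str], omega: int) -> List[str]:
    """Read protocol: return the chain with its last omega blocks discarded."""
    if omega < 0:
        raise ValueError("omega must be non-negative")
    return list(chain[:max(0, len(chain) - omega)])


def reorganize(chain: Sequence[str], depth: int, branch: Sequence[str]) -> List[str]:
    """Replace the last `depth` blocks with a strictly longer competing branch."""
    if depth > len(chain):
        raise ValueError("cannot reorganize deeper than the chain")
    if len(branch) <= depth:
        raise ValueError("competing branch must be longer to win")
    return list(chain[:len(chain) - depth]) + list(branch)


def is_prefix(a: Sequence[str], b: Sequence[str]) -> bool:
    """True when every block in a appears, in order, at the start of b."""
    return len(a) <= len(b) and list(b[:len(a)]) == list(a)


if __name__ == "__main__":
    before = [f"b{i}" for i in range(12)]            # b0 .. b11
    fork = [f"c{i}" for i in range(7, 13)]           # six competing blocks
    after = reorganize(before, depth=5, branch=fork)

    # With omega = 6 the client's view only ever grows.
    print(is_prefix(confirmed_view(before, 6), confirmed_view(after, 6)))  # True
    # Reading the raw tip (omega = 0) exposes the rewrite.
    print(is_prefix(confirmed_view(before, 0), confirmed_view(after, 0)))  # False
```

The God's-eye observer corresponds to the `omega = 0` case; a client following the read protocol only ever sees the first case.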

A similar argument exists for the write protocol that I won't go through. You get the point: if you discard the last Ω blocks like Satoshi told you to, and you confirm using the appropriately trimmed blockchain, your chances of observing an anomaly are exponentially small.

What if Someone Uses A Low Ω?

Some people may actively choose to configure a system to provide weaker guarantees than it could give, for reasons of convenience or performance. In the same way I can configure an Oracle installation to violate its ACID guarantees, someone can use Bitcoin in a manner that does not take advantage of Bitcoin's strong guarantees.

For instance, the merchants who accept 0-confirmations have chosen to forgo the strong guarantees of Bitcoin for convenience. They are using a read protocol that doesn't even involve the blockchain -- there is no guarantee that the transactions they see on the peer-to-peer network will ever be mined. This is not what Satoshi advised, but it may be a sensible choice for a non-risk-averse merchant. Of course, we now know how to build much faster, much higher throughput blockchains that provide better guarantees than Bitcoin's 0-conf transactions, but that's a different issue.

Why Is This So Hard?

There is a lot of confusion among software developers when it comes to consistency in databases. The muddled thinking that pervades the valley when it comes to issues of consistency probably stems from two sources: (1) academics' failure to provide a simple, accessible framework for reasoning about consistency, coupled with the misleading framework embodied in CAP, and (2) certain companies' deliberate, decade-long effort to muddy up the consistency landscape by falsely claiming that strong consistency is an impossible property to achieve under any circumstance. After all, Google's Spanner is globally distributed and consistent, and so are a slew of distributed databases that are part of a recent research wave that HyperDex started.

Just because the developers of certain cheap NoSQL data stores cannot guarantee consistency doesn't mean that eventual consistency is the best anyone can do. And Bitcoin's $6B+ value is most definitely not predicated on something as hokey as eventual consistency.

Hope this is useful for establishing a better foundation for evaluating systems, especially open source systems such as Bitcoin whose internal state is visible. It's tempting to look at that state, but consistency properties need to be evaluated solely by the outputs of the read and write protocols.
