Revealing the hidden links in the Monero blockchain

If you follow the world of cryptocurrencies, you probably know that Monero is a privacy-centric cryptocurrency that has surged in the market since late 2016. My coauthors and I, from Princeton and UIUC, have just released some research showing that its privacy claims, particularly for transactions made prior to February 2017, are significantly overstated. Based on the invective response our research has generated, the Monero community would much prefer you focus only on their newest offering, while ignoring the unfulfilled promises from their recent past.

For a bit of background: Just like in Bitcoin, each Monero transaction has some inputs and some outputs. Each input is a link to the output of a previous transaction, forming a graph.

In Bitcoin, these linkages are transparent and explicit. Every block explorer website (like blockchain.info) lets you follow the graph, clicking the links to move from one transaction to the next. These linkages have also been the basis for graph analysis, clustering, and deanonymization analysis.

Obscuring transaction links in Monero

A typical user's mental model of Monero transactions is informed by this illustration.

In Monero, however, the links between transactions are obscured. Each transaction input can contain decoy links called “mixins.” Every transaction input has multiple incoming links, and every output has multiple outgoing links; in principle, no one can tell which is the real one among the mixins. You may have formed a mental model based on the popular illustration shown to the right.

Furthermore, if you visit any Monero block explorer website, you'll see the transaction graph is presented as a dead end. You can't tell where each input comes from, and you can't tell where each output goes. Just as an example, have a look at how two typical block explorers, ChainRadar and MoneroBlocks, render this transaction from May 2016.

This transaction has 5 inputs and 5 outputs. Each input has 6 mixins (7 links in total, including the real one), indicating a high level of privacy is desired. The outputs are dead ends, and you can't tell which of the 7 links is the real one. After all, Monero is the "opaque blockchain".

In our research paper, "An Empirical Analysis of Linkability in the Monero Blockchain," we show that in fact for most of Monero's blockchain history, the mixins haven't done much good. Most transactions made prior to February 2017 actually are linkable. Here's the problem. In the past, 0-mixin transactions (those that opt out of privacy altogether) were commonplace, and most coins were spent by them. Including these coins as decoys doesn't do any good, because it's already obvious where they've actually been spent. However, the Monero wallet does not take this into account. The result is that we are able to identify the correct links for the majority (62%) of 1+ mixin Monero transactions made from 2014 through 2016. The Monero blockchain has provided little more privacy than Bitcoin.
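To make the deduction concrete, here is a minimal Python sketch of the elimination idea described above. It is an illustration only, not the code behind our paper or explorer: the function name deduce_links, the input and output identifiers, and the toy data are all hypothetical. The idea is that once an output is known to be spent by one input, it can be ruled out as the real link everywhere else it appears as a decoy, and any input left with a single candidate is thereby linked.

```python
from typing import Dict, Set


def deduce_links(inputs: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map each input to its real output wherever elimination pins it down.

    `inputs` maps a (hypothetical) input id to the set of output ids it
    references, i.e. the real spend plus any mixins.
    """
    # Work on copies so the caller's sets are left untouched.
    remaining = {inp: set(refs) for inp, refs in inputs.items()}
    linked: Dict[str, str] = {}   # input id -> output deduced as really spent
    spent: Set[str] = set()       # outputs whose spending input is known

    changed = True
    while changed:                # repeat until no new deductions appear
        changed = False
        for inp, refs in remaining.items():
            if inp in linked:
                continue
            # Decoys already known to be spent elsewhere cannot be real here.
            refs.difference_update(spent)
            if len(refs) == 1:
                real = next(iter(refs))
                linked[inp] = real
                spent.add(real)
                changed = True
    return linked


# Toy data (invented for illustration):
toy_inputs = {
    "in_A": {"out_1"},                    # 0-mixin: the link is explicit
    "in_B": {"out_1", "out_2"},           # 1 mixin, but the decoy is out_1
    "in_C": {"out_2", "out_3", "out_4"},  # one decoy eliminated, two remain
}
links = deduce_links(toy_inputs)
print(links)
```

In this toy example, in_A reveals that out_1 is spent, which in turn exposes out_2 as the real link for in_B. The input in_C is not fully linked, but its effective anonymity set shrinks from three candidates to two.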

To illustrate the problem and make its impact crystal clear, alongside our paper we've launched a block explorer, called MoneroLink, that reveals the hidden linkages in Monero transactions. If you visit the MoneroLink page for the transaction mentioned above, you'll see we are able to identify the correct incoming link for 3 of the 5 inputs, and we're able to identify the correct outgoing link for 2 of the 5 outputs, despite the large number of mixins.

MoneroLink example

The MoneroLink block explorer reveals the hidden linkages among Monero transactions (from 2014 through 2016).

We readily note that for transactions made since March 2016, the rate at which we can link transactions has steadily decreased, although the correct link can still be guessed with higher probability than you would hope. In this post I'm eliding almost all the technical details, but a more detailed analysis can be found in our technical paper.

Is this a surprise?

Hide and seek

If the existence of a link-revealing Monero block explorer comes as a surprise to you, then you're not alone. This is the first block explorer of its kind. To date, there has been no indication that the Monero blockchain is vulnerable to such wide analysis. And it clearly contradicts the privacy claims in Monero's marketing and popular media.

However, the reaction to our work from Monero developers and the discussion community on /r/monero has been to say "we have known this all along."

“This is not news. Anyone who has done any basic reading on Monero has known this for a long time” (tweet)

What “basic reading” refers to here is a pair of reports, MRL-0001 and MRL-0004, from 2014 and 2015 respectively, which introduce the main vulnerability that our explorer relies on (as well as even more sophisticated concerns outside our scope), explaining that they could plausibly lead to “a critical loss in untraceability across the whole network if parameters are poorly chosen and if an attacker owns a sufficient percentage of the network.” [Emphasis mine.]

Neither of the MRL reports conveys that this is an actual problem affecting actual transactions. Instead, the papers are abstract, describing mathematical models of marbles in urns and hypothetical attack scenarios involving Simpsons characters. Most importantly, no prior report has made any empirical analysis based on actual blockchain data.

The contribution of our work is to show that 1) the parameters have been poorly chosen, 2) there doesn't need to be any attacker, since the problem manifests all on its own, and 3) the result has indeed been a critical loss in untraceability.

On the soothing quality of the MRL reports

http://hackingdistributed.com/images/2017-bitcoin/hide-seek2.jpg

I'd like to call attention to a particular pattern of discussion, in which MRL reports have been used (unintentionally) to quell further investigation. At various points in Monero's history, several forum posters have stepped right up to the threshold of this revelation, but have been gently steered away from proceeding further, using the MRL reports to end the discussion.

Case 1: In December 2015, a redditor asks “Educate me: Zero mixins forgo all the privacy advantages of Monero, right? But do they damage or interfere significantly with other transactions?”

Redditor VedadoAnonimato responds:

“I'm not even sure whether the mixin selection algorithm rejects outputs that have previously been spent in no mixin transactions. If the selection algorithm can't reject these, then the situation is dire, since currently the majority of Monero transactions have no mixin." (permalink)

Monero developer smooth_xmr confirms VedadoAnonimato's concerns, but then refers to the MRL report to convince readers that the problem is already dealt with.

"The math in MRL-0001 shows that those outputs will eventually become irrelevant though... we plan to blacklist those from any future mixins which will immediately solve the problem” (permalink)

The authoritative “math in MRL-0001” is the final word, effectively ending the discussion. Incidentally, the proposed change to “immediately solve the problem” was never implemented (the measures actually implemented have had a gradual effect instead). Regardless, even if it were implemented, “solve the problem” here means that future transactions will enjoy better privacy. This doesn't do a bit of good for anyone who relied on the privacy of prior transactions!

Case 2: In the following thread, a user asks, “Is Monero Really Anonymous?” and posts a link to an essay speculating on a form of linkability.

“This has been discussed at length. … [MRL-0001] explains the known issues related to tracing transactions, how well monero already does at them, and where it needs to do better.” (permalink)

Further down thread, “There's nothing to discuss that hasn't already been extensively documented in those linked whitepapers.”

Case 3: Here's another example showing how the MRL reports have been interpreted:

“Peters comment is MRL-0001 which got fixed with MRL-0004 - a totally theoretic attack anyway :-)” (bitcointalk)

Our research clearly shows that this attack is not theoretical, but absolutely affects actual transactions. It has taken empirical blockchain analysis to show this.

In fact, VedadoAnonimato (again in late 2015) even highlighted the need for this missing empirical approach:

"It would be interesting to know if somebody has actually gone through the data and checked how many transactions can be partially or totally traced through the problems pointed out there." (permalink)

The thread, of course, ends immediately on that cliffhanger. The blockchain data has always been there for anyone to take a look at, but it's only now that we've done the work.

The truth hurts, but time heals

http://hackingdistributed.com/images/2017-bitcoin/hide-seek3.jpg

The Monero cryptocurrency and community will undoubtedly survive this revelation. The best response now would be to acknowledge that this result comes as a surprise and no one knew just how many prior transactions were vulnerable to linkability analysis. Even if some folks have had a suspicion deep down that this might be the case, no one has particularly wanted to look into the data and see the unpleasant truth.

If the Monero developer community instead rallies around the “This research is not new!” narrative, then that bodes poorly for them. It means that the community's leadership has knowingly allowed users to be misled about the privacy of prior transactions. The right thing to do would have been to issue an advisory that communicates the problem very clearly:

WARNING: Monero users who made transactions in 2014-2016 expecting unlinkability should be warned that their transactions are most likely linkable after all.

This is exactly the message that the MoneroLink block explorer conveys. It's better late than never!

The MRL reports have served as a form of dog-whistle communication. The reports are written (unintentionally, but perhaps subconsciously) in densely coded language that avoids alarming users. Developers have been working in earnest to improve future versions of the software, but have not sought to empirically quantify the existing problem or to clearly warn users about the privacy implications. Real users today can easily be impacted by a loss in privacy of yesterday's transactions.

I'll wrap up with a final quote from smooth_xmr:

"Most transactions in 2014 and 2015 (and even most of 2016) were mining and trading. There were precious few ways to use it for anything else. There was a crypto kingdom game at some point, and a dice site, that's about it." (permalink)

This is plausible, and hopefully it's true, since it would mean the potential harm caused by this vulnerability is limited. However, even if only a small number of users relied on Monero for privacy in 2016 and earlier, then their privacy matters, and this warning is intended for them.

Addendum 1. More responses from the Monero community

As anticipated, this release stirred up a flurry of shitposting and character attacks, on reddit and on twitter, from a loud minority of grumpy folks. I'm resisting the urge to curate the worst of these for a cheap vindictive pleasure. Most of the rest are editorial quibbles about how we present our results.

Instead I'd like to thank Monero core dev smooth_xmr for an excellent discussion thread with technical feedback. We take your criticism seriously and will incorporate it in updates to our paper. I also want to thank SamsungGalaxyPlayer and KaroshiNakamoto for insightful (but still very critical) posts, including a great thread calling for a more rational and level-headed public response to a research release like this.

Addendum 2. Isn't this a paid hit piece for Zcash?

Disclaimer, disclaimer. I've been involved with Zcash for years (as well as with several other cryptocurrency projects, like Tezos and Ethereum), as a consultant and recently as a founder of its community-supporting Foundation. This is transparently and appropriately disclosed on the front page of our technical report. It's also clear to see from my website and twitter profile @socrates1024, and is duly reported to my employer, the University of Illinois. My research is not funded by Zcash, and the other coauthors have nothing to do with Zcash at all.

Many of the initial reactions to our research predictably focused on questioning our motives as a way of drawing attention away from the substance of our report. By all means, be skeptical of our findings, and ideally reproduce them independently.

Addendum 3. On the timing of our release coinciding with the Monero hard fork upgrade

Whoops. We dropped our paper within a few hours of a planned hard fork flag day. Maybe no one will believe me, but this timing was entirely coincidental, and I was not aware that Monero was going to upgrade then. This is totally my fault, because in hindsight there were plenty of places I could have read about this. (I've been reading lots of monero reddit threads recently while preparing our article, but only by keyword search, not by reading the front page.)

In particular, Monero has established the admirable policy of scheduling hard fork upgrades regularly, on April 15 or on three other predictable days of the year, and has committed to this policy for over a year.

I can see how this timing looks like hostility. If I had been paying more attention and noticed, I absolutely would have waited until after the hard fork to release news that would likely distract the developers and broader community. Originally we hoped to release this work two weeks ago, but I got distracted by the Financial Cryptography conference and didn't finish the website until now. I'm relieved this hard fork upgrade went fine, as have all of Monero's hard fork upgrades to date.
