Announcing The Town Crier Service

Posted May 15, 2017 at 10:01 AM by Yan Ji, Ari Juels, and Fan Zhang. Tags: ethereum, town crier, virtual notary.

We are delighted to announce the (alpha) public launch of Town Crier (TC) on the Ethereum public blockchain. TC acts as a secure data pipeline between smart contracts and websites, or what’s commonly called an oracle.

The TC logo: A handbell + ‘T’ + ‘C’

TC isn’t a mere oracle, though. Thanks to its use of a powerful new form of trusted hardware (Intel SGX), it provides unprecedented data integrity. Equally importantly, TC enables general, flexible, and practical smart-contract execution with strong data privacy.

If you’d like to plunge into technical details and play with TC right off the bat, you’ll find all you need at www.town-crier.org. You can use TC’s Ethereum smart contract front end. We’ve also partnered with SmartContract so you can experiment with coin-price queries through their easy-to-use interface.

You can also read our published technical paper here. For those wanting a gentler introduction, though, read on...

Why do we need oracles?

Smart contracts confined to on-chain data are like sportscars on local roads. They’re purring with latent power, but can’t do anything really interesting.

To unleash their potential, smart contracts need access to the wide open vistas of data available off-chain, i.e., in the real world. A financial smart contract needs access to equity, commodity, currency, or derivative prices. An insurance smart contract must be aware of triggering events such as bad weather, flight delays, etc. A smart contract allowing consumers to sell online games to one another must confirm that a seller successfully transferred game ownership to a buyer.

Latent power waiting to be unchained...

Today, though, smart contracts can’t obtain such data in a highly trustworthy way. And they can’t achieve data privacy. These deficiencies are starving smart contract ecosystems of the data they need to achieve their full promise.

We’d even argue that this problem of data starvation was one cause of the DAO debacle. The DAO attracted $150+ million because lack of data meant a lack of good smart contracts for people to invest in.

What’s the problem?

Let’s take flight insurance as a running example. It’s been widely explored by the community and you can find an implementation on the TC website.

Suppose Alice buys a flight insurance policy from a smart contract MolassesFlights that pays $100 should her flight, WA123, be delayed or cancelled. Clearly, MolassesFlights needs to learn the status of Alice’s flight. How can it do this? Smart contracts (and blockchains) can’t query websites, as they don’t have internet connections. And even if they did, there’s a more basic problem. Suppose that the creators of MolassesFlights and its users all trust flightaware.com. In order to determine the status of a particular flight, someone (say, Carol, a MolassesFlights operator) must query flightaware.com with Alice’s flight information (Q = “WA123”) and obtain a response (e.g., R = “Flight WA123 delayed”). The problem is that there’s no way for Carol to pass along R such that someone else knows that R, as relayed by Carol, is correct and free from tampering.

This lack of transferability holds even if Carol obtained R over HTTPS, the protocol used for secure (authenticated and encrypted) web connections. Carol learns that R is authentic because she logged into a website authenticated by a valid certificate. But HTTPS doesn’t enable her to convince another party that flightaware.com sent R. (HTTPS doesn’t support data signing.) As a result, there’s no trustworthy way to feed R to a smart contract.
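To make the transferability problem concrete, here is a minimal sketch. It rests on a simplification: once a TLS session is set up, traffic is authenticated with symmetric keys that both endpoints hold, and we model that with a plain HMAC (real TLS record protection is more involved). The key and messages below are made up for illustration. Because Carol holds the same key as the server, she can authenticate a response the server never sent, so a valid tag convinces nobody but Carol herself.

```python
import hashlib
import hmac

# Hypothetical session key: both TLS endpoints end up holding the same symmetric keys.
session_key = b"shared-by-server-and-carol"

def tag(key: bytes, message: bytes) -> bytes:
    """Authenticate a message with a symmetric MAC (stand-in for TLS record protection)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, mac: bytes) -> bool:
    """Check a tag; anyone able to verify is also able to create tags."""
    return hmac.compare_digest(tag(key, message), mac)

# The server authenticates its genuine response R.
genuine = b"Flight WA123 delayed"
genuine_tag = tag(session_key, genuine)

# Carol holds the same key, so she can authenticate a response the server never sent.
forged = b"Flight WA123 on time"
forged_tag = tag(session_key, forged)

# Both check out equally well, so a third party learns nothing from either tag.
genuine_ok = verify(session_key, genuine, genuine_tag)
forged_ok = verify(session_key, forged, forged_tag)
```

A digital signature under a key held only by the sender would not have this weakness, which is the role TC's enclave-held key plays later in this post.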

There’s another problem as well. For Alice to buy flight insurance, she must specify her flight to MolassesFlights. If she sends it in the clear on the blockchain, she reveals her itinerary on the blockchain, and thus to the entire world -- an unacceptable breach of privacy.

Enter TC.

TC in a nutshell

Ah, the jet age…

TC’s core code runs in a secure enclave, an environment protected in hardware by a new Intel technology called SGX (Software Guard eXtensions). When TC queries a website over HTTPS and obtains a response R, other parties can confirm that R is authentic because the hardware (CPU) itself prevents any modification of TC or its state. SGX also enables TC to handle confidential data privately within the enclave.

Specifically, SGX provides three key properties:

1. Integrity: An application running in an enclave is protected against tampering by any other process, including the operating system.
2. Confidentiality: The state of an application running in an enclave is opaque to any other process, including the operating system. (See our paper for some important caveats.)
3. Attestation: The platform can generate an attestation, a digitally signed proof that a given application (identified by a hash of its build) is running in an enclave.

Using Property 3 (Attestation), a user can verify that a valid TC application instance is running in an SGX enclave. SGX attestations can bind a public key PK to an application instance such that the corresponding private key SK is known only to the application.

Provided that a user trusts the TC implementation (whose source code we publish), therefore, she can trust that data TC has signed using SK are valid. The same goes for a smart contract, of course. Given PK, it can verify the correctness of data emitted by TC.

Once a user has established trust in a TC instance, Properties 1 and 2 come into play. Property 1 (Integrity) prevents anyone from tampering with the execution of TC and thus ensures that TC transmits only data validly obtained from target websites. In our flight insurance example, this property ensures that flight data will be exactly as reported by flightaware.com. Property 2 (Confidentiality) enables TC to handle private data. In the same example, Alice can encrypt her flight information (Q = “WA123”) under TC’s public key. TC decrypts it inside the enclave and sends query Q to flightaware.com over HTTPS. Consequently, Alice’s flight will not be disclosed on the blockchain or to TC’s operators.

In general, you can think of SGX as running an application in a black box or simulating its execution by a trusted third party. Even the owner of the machine on which the application is running can’t break its integrity or confidentiality.

The figure below shows the basic flow of data between a user smart contract (User Contract), the TC smart contract front end (TC Contract), and the TC Server. Here, prog denotes core TC code, params is the set of parameters in a query, and data is the website response (R in our above example). See our paper for details.
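For readers who prefer code to diagrams, the same request-and-deliver flow can be sketched as a toy Python simulation. This is not the actual TC Contract interface: the class, the method names (request, deliver), and the fake data source are all hypothetical, and encryption, signing, and gas are left out. It only shows who talks to whom.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Callback = Callable[[int, str], None]

@dataclass
class TCContract:
    """Toy stand-in for the on-chain TC Contract front end (hypothetical API)."""
    next_id: int = 0
    pending: Dict[int, Tuple[str, Callback]] = field(default_factory=dict)
    event_log: List[Tuple[int, str]] = field(default_factory=list)

    def request(self, params: str, callback: Callback) -> int:
        """Called by a User Contract: record the query and log it for the TC Server."""
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = (params, callback)
        self.event_log.append((request_id, params))
        return request_id

    def deliver(self, request_id: int, data: str) -> None:
        """Called by the TC Server: hand data to the waiting User Contract."""
        _params, callback = self.pending.pop(request_id)
        callback(request_id, data)

def tc_server_step(contract: TCContract, fetch: Callable[[str], str]) -> None:
    """Toy TC Server (prog): answer every logged request using params, return data."""
    while contract.event_log:
        request_id, params = contract.event_log.pop(0)
        contract.deliver(request_id, fetch(params))

def fake_flight_site(params: str) -> str:
    """Hypothetical data source standing in for an HTTPS query to a website."""
    return f"Flight {params} delayed"

# A User Contract asks about flight WA123 and stores whatever comes back.
received: Dict[int, str] = {}
tc = TCContract()
rid = tc.request("WA123", lambda request_id, data: received.update({request_id: data}))
tc_server_step(tc, fake_flight_site)
```

After tc_server_step runs, received maps the request id to the data string, which is where a real User Contract would decide whether to pay out.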

We provide more details on the under-the-hood workings of TC in our blog appendix.

From Alpha to Omega: TC today and tomorrow

TC today is an “alpha” system. It’s fully functional, and supports several different query types:

Flight data,
Stock tickers,
UPS tracking,
Coin market prices, and
Weather data.

Flight data queries are encrypted as described above. Other queries are in the clear.

Basic blockchain-to-server data flow in TC

As you can see, the current set of query types and functionality is fairly limited. Hence our current “alpha” label. We’ve got a lot more in the works, though, including:

Custom queries: We’re planning to allow users to define their own query types. In this way, users will be able in effect to create their own oracles. (This is possible today in SmartContract, with whom we’ve partnered in our alpha launch.)
Account-scraping: One of the most powerful opportunities enabled by TC stems from its ability to handle confidential data. Our Steam exchange query, which is under development, exemplifies this opportunity. This query will ingest a pair of user identities A and B, a user credential belonging to A (and 2FA passcode, if necessary), and a game identifier. TC will respond by indicating whether A transferred ownership of the game to B. To do this, TC logs into user A’s account. Thanks to the use of SGX, all of this can be done without exposing A’s credentials to anyone, including TC operators. This is just the tip of the iceberg. The ability to handle data confidentially is a stepping stone to full-blown versions of off-chain confidential smart-contract execution.
Data-source and TC-server redundancy: If a data source goes bad, e.g., a source of flight information is incorrect, TC will blindly and faithfully relay incorrect data. An extension of TC, however, can combine data from multiple sources to reduce the risk of error at the source. For example, TC can fetch stock ticker data from five different websites and take the majority value. Another problem is that if TC goes down, a relying user contract will fail to receive timely data. We plan to address this problem through deployment of redundant servers, an approach that can also minimize the (improbable but not inconceivable) risk of an enclave getting physically compromised.
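As a rough illustration of the majority idea in the last item, here is a small sketch. The function name and sample values are ours, not part of TC, and we assume each source's answer has already been normalized to a string so that equal values compare equal. A real deployment would also need a policy for near-equal prices and for sources that fail to respond.

```python
from collections import Counter
from typing import Optional, Sequence

def majority_value(values: Sequence[str]) -> Optional[str]:
    """Return the value reported by a strict majority of sources, else None."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    # Require more than half of all sources to agree before trusting the value.
    return value if count > len(values) / 2 else None

# Five hypothetical sources, one of which has gone bad.
reports = ["101.25", "101.25", "101.25", "99.80", "101.25"]
agreed = majority_value(reports)

# With no majority, the caller gets None and can refuse to answer.
split = majority_value(["1", "2", "3"])
```

Returning None rather than guessing leaves the decision with the relying contract, which seems the safer default when sources disagree.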

We look forward to your comments and thoughts about how TC can evolve and best serve the Ethereum and broader smart contract communities.

The fine print

TC is patent-pending. We expect to commercialize versions of TC for other blockchains, particularly permissioned ones. It’s also possible that we will charge for special query types and/or bulk queries in Ethereum later down the road.

The current Ethereum public blockchain TC functionality, however, is free, and we plan to keep it that way. Users only need to pay the gas costs for a query. We intend to enlarge our free service substantially. As academics and smart contract enthusiasts, we’re interested above all in helping smart contracts achieve their full potential and seeing the grand experiment called Ethereum flourish and empower users in ways yet to be imagined.

More to Come

TC is just one of many systems coming from IC3. Several near the end of the pipeline and worth looking out for are Teechain, a secure layer-2 payment system; HoneyBadger, a highly robust asynchronous BFT system; Snow White, a new, highly efficient consensus protocol; and Solidus, a confidentiality-preserving on-chain settlement system.

Thanks!

Town Crier was created by students and faculty in IC3 (meaning that the students did the real work, of course). Yan Ji, Oscar Zagarra, and Fan Zhang created our alpha service, with generous help from Lorenz Breidenbach and Phil Daian.

We want to thank Intel for their close collaboration, particularly Mic Bowman and Kelly Olsen for their support and advice, Sergey Nazarov at SmartContract for supporting our alpha launch, and the Ethereum Foundation, especially Vitalik Buterin, Alex Van de Sande, and Vlad Zamfir, for their input on TC during early development.

Appendix: A peek under the covers

While our general approach is conceptually simple, realizing TC using SGX requires some finesse.

To begin with, enclaves don’t have network stacks, so they must rely on the operating system (OS) for network functionality. Running outside the TC enclave is a Relay that handles TC’s network connections.

The whole point of running in an enclave, though, is to avoid trusting the operating system. How can we build a trustworthy application coiled in the embrace of a potentially hostile operating system?

As mentioned above, TC communicates with HTTPS-enabled websites, and thus uses TLS. To prevent OS tampering with TC connections to websites, we’ve partitioned a TLS implementation (mbedTLS) such that the handshake and record layer portions reside in the enclave, while the TCP layer resides outside the enclave. In this way, the code inside the enclave has a secure channel to the website. In effect, we treat the OS itself as a network adversary—exactly what TLS is meant to protect against by design.

As we’ve noted, TC has a smart contract front end, which we simply call the TC Contract. We therefore also have the problem of securing communication between this front end and TC code in the enclave. How do we prevent the OS from corrupting blockchain data?

The simplest solution would be to run a client (e.g., Geth) inside the enclave. But this would bloat what’s called the trusted computing base (TCB), the code that users need to trust in order to trust TC. The basic TC code, plus Geth, plus any wrapper around Geth would be a lot of lines of code.

We solve this problem in a counterintuitive way. TC doesn’t actually verify incoming data from the blockchain. It hopes that the Relay (and OS) are passing it valid queries from the TC Contract, but it accepts and processes bogus queries. The trick in our implementation is that TC signs queries along with responses. This way, the TC Contract, which records the queries it handles, can weed out invalid queries. And happily any corruption of queries is visible on the blockchain, providing incontrovertible evidence of corruption. The schematic below shows how these various pieces fit together for queries to a hypothetical service https://www.lots-o-data.com.
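The trick can be sketched in a few lines of Python. Everything here is illustrative: the names are hypothetical, the query is represented by a hash of its parameters, and an HMAC under a key held by the enclave stands in for TC's real signature (the standard library has no public-key signatures; in practice the contract would verify against PK and would never hold a secret). The enclave processes whatever the Relay hands it, but because its output binds the query to the response, the contract can reject anything that doesn't match the query it recorded.

```python
import hashlib
import hmac
from typing import Callable, Dict, Tuple

ENCLAVE_KEY = b"known-only-to-the-enclave"  # stand-in for SK (assumption)
Response = Tuple[int, bytes, str, bytes]    # (request id, params hash, data, signature)

def params_hash(params: str) -> bytes:
    return hashlib.sha256(params.encode()).digest()

def sign(request_id: int, p_hash: bytes, data: str) -> bytes:
    """Stand-in signature covering the query together with the response."""
    message = request_id.to_bytes(8, "big") + p_hash + data.encode()
    return hmac.new(ENCLAVE_KEY, message, hashlib.sha256).digest()

def enclave_handle(request_id: int, params: str, fetch: Callable[[str], str]) -> Response:
    """TC enclave: doesn't validate the incoming query, just answers and signs it."""
    data = fetch(params)
    p_hash = params_hash(params)
    return request_id, p_hash, data, sign(request_id, p_hash, data)

class TCContract:
    """Toy TC Contract: records the queries it issues and filters responses."""
    def __init__(self) -> None:
        self.recorded: Dict[int, bytes] = {}

    def request(self, request_id: int, params: str) -> None:
        self.recorded[request_id] = params_hash(params)

    def deliver(self, response: Response) -> bool:
        request_id, p_hash, data, signature = response
        if not hmac.compare_digest(sign(request_id, p_hash, data), signature):
            return False  # not produced by the enclave, or altered afterwards
        # Accept only if the signed query matches the one recorded on-chain.
        return self.recorded.get(request_id) == p_hash

def fake_site(params: str) -> str:
    """Hypothetical data source."""
    return f"Flight {params} delayed"

contract = TCContract()
contract.request(1, "WA123")
honest = contract.deliver(enclave_handle(1, "WA123", fake_site))
# A malicious Relay swaps in a different flight; the enclave still answers and signs.
corrupted = contract.deliver(enclave_handle(1, "WA999", fake_site))
```

Here honest is accepted and corrupted is rejected, even though the enclave happily processed both.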

Elements in green are those trusted by a User Contract querying TC. Note that there’s no need to trust the Relay, i.e., it sits outside the trusted computing base.

Basic TC architecture

Other oracle designs

There are alternative ways to create oracles, and it’s beneficial for the community to be able to draw on multiple sources of data.

One approach to oracle design, exemplified by Gnosis and Augur, leverages prediction markets. These systems create a financial incentive for a community to express truthful evaluations of queries, and thus data that may be consumed by smart contracts. This approach has some nice features. For example, it permits queries to be expressed in the form of natural language, something that TC can’t easily support (at least, not without some pretty sophisticated NLP).

Prediction markets don’t scale to large numbers of queries, however, as the cost per query in human brainpower and capital commitments is high. Nonetheless, this approach does serve a particular niche very well.

Oraclize is another option. It uses a clever protocol that leverages TLSnotary to create attestations (and has some pending work on other options, e.g., Android-based software attestation). Oraclize has been an important catalyst within the Ethereum community, as it has helped address the data-starvation problem. Its assertions of authenticity, however, are ultimately weaker than those obtainable through use of a hardware root of trust. Most importantly, without trusted hardware, it’s not possible to offer true data confidentiality. Even encrypted queries must be decrypted by and are therefore exposed to the oracle and/or a supporting service. Thanks to its simple and elegant security model, SGX enables TC to skirt these problems.