The State of Privacy on Ethereum
By Dean Pierce (ConsenSys Diligence), Robert Drost (ConsenSys R&D), and Mason Nystrom (ConsenSys)
In a world that is increasingly connected, and where our information is increasingly cataloged, duplicated, shared, and sold, maintaining our expected levels of privacy can be a challenge.
Like most things, privacy is not binary and instead falls along a spectrum from fully public to completely private. So when talking about privacy, three questions require further discussion.
- What do consumers and enterprises want to keep private?
- Are people willing to pay (in cost and effort) for privacy?
- What are the tradeoffs for achieving private transactions on a public blockchain?
This article aims to briefly examine the demands for practical privacy on public blockchains and to discuss at a high level the tradeoffs of implementing privacy solutions.
The First Question: What Level of Privacy Matters?
One example of privacy is anonymity, or privacy of identity. In the context of public blockchains, anonymity refers to the ability of parties to exchange something (e.g., money, tokens, or data) without needing to reveal identity-related information about themselves or about other transactions they have made. While this is only one facet of privacy, it has become increasingly important as blockchain has evolved.
Cryptocurrencies like bitcoin and ether are increasingly being traced: analysts correlate public addresses across transactions and link them to off-chain identities at crypto-to-fiat conversion points. The net effect is that the identities of the parties to a transaction become more public. Because public blockchains must fundamentally provide a log of all transactions, using cryptographic algorithms and protocols to give consumers and enterprises privacy has become increasingly relevant.
Enterprises and consumers have very different demands when it comes to privacy. Enterprises typically require privacy of transaction data: for example, product name, quantity, price, address, and personally identifiable financial information.
Network participants are usually known, but information may need to be withheld from or made available to other participants depending on their roles. A freight forwarder, for example, might not need to know the contents of a certain shipping container, but only that the container has arrived. Banking regulations also restrict who may have access to transaction data. Ernst and Young’s Nightfall protocol for private transactions on Ethereum using zk-SNARKs and JP Morgan’s Anonymous-Zether for Quorum are prime examples of enterprises developing privacy solutions for Ethereum.
Compared to enterprises, which often have strong business motivations or regulations around privacy, consumers to date have generally shown less awareness of and concern about privacy. Nevertheless, consumers want to protect their identity, credit card information, and other sensitive data to prevent fraud or identity theft. Sometimes, consumers want to transact anonymously, which requires privacy with regard to both the sender and the receiver of a transaction. However, privacy isn’t native to the daily lives of consumers, and most individuals willingly sacrifice their privacy for convenience or free access (accepting cookies, using free Wi-Fi, allowing tracked web surfing, etc.).
The Second Question: Is Privacy in Demand?
Privacy has generally been applied in the context of messaging, by protecting content sent between parties. It has also been applied more broadly to communication channels and the underlying network layer. Public-key cryptography and other key exchange mechanisms have evolved and been adopted to build end-to-end secure internet and transport layer protocols (IPSec v2, SSL). The same effort has extended a layer below, to secure DNS querying and the adoption of Tor-based relays. Much of this work grew out of open standards and academic research and was adopted by enterprises to preserve privacy and confidentiality in data transfer, but many of these technologies have found their way into the retail user tech stack, thereby benefiting end users.
Specifically for blockchain: although Zcash is nearly 3 years old, only about 5% of the ZEC in existence is stored using SNARKs (about half of which uses legacy SNARKs). About 95% of ZEC is stored in transparent addresses that offer little privacy. From this lack of adoption, we can infer that perhaps most users haven’t yet felt the need to pay (in cost and effort) for privacy.
However, privacy is still required for the eventual mainstream adoption of blockchain technology. The success of built-in privacy layers such as SSL in enabling the internet to become a trusted commerce medium suggests that consumers and enterprises want privacy to be built natively into systems and applications.
The Third Question: Tradeoffs of Privacy
This third question is much more technical and requires a deeper examination of how privacy is achieved on Ethereum and the tradeoffs of the various mechanisms involved. Blockchain networks trade scalability for decentralization, and privacy mechanisms and technologies come with tradeoffs of their own. We’ll examine what other privacy-focused blockchains have implemented for privacy and then discuss some of the Ethereum network privacy solutions.
Lessons From Other Privacy-Focused Blockchains (Monero and Zcash)
Before getting into the details about Ethereum, it is worth looking at the two major players in the privacy coin space: Monero and Zcash. Monero was special in the early altcoin days because its codebase was not based on Bitcoin’s code at all but on a completely unrelated cryptocurrency project called Bytecoin (which was the reference design for the CryptoNote protocol). The original CryptoNote design obfuscated the sender of a transaction by mixing their signature with many other decoy signatures (mixins). This, combined with stealth address outputs, made for some very strong privacy guarantees. The “ring signature” scheme got a reputation early on as being a fancy built-in mixer, which wasn’t far off.
In 2017, the ability of ring signatures to hide transaction data improved drastically with the introduction of RingCT, which used zero-knowledge range proofs to increase the variety of signatures that could be batched together. The introduction of RingCT also enforced minimum mixin requirements to mitigate linkability attacks that plagued earlier versions of Monero. One of the primary challenges at this point with ring signatures is that they use up a lot of disk space on the Monero blockchain. Additionally, ring signatures do not scale to large groups and are currently limited to a group of 10–15 participants.
In late 2018, we saw the introduction of “Bulletproofs” on the Monero network, an exciting new zero-knowledge construction that scales logarithmically with the number of signatures in the ring, thus reducing the required size of a transaction. This improvement brings the capabilities in line with other blockchain projects.
Zcash was the first implementation of a cryptocurrency using zk-SNARKs. Using this technology, users are able to send fully private transactions that are only visible to the recipient. To an outside observer, ZEC being sent to a private address seems to disappear into a large cryptographic black box, and when the recipient wants to move their coins back to a non-private address (exactly analogous to a standard Bitcoin address), the coins seem to come out of thin air, with no observable connection between the sender and the recipient. One important note about zero-knowledge proofs is that they require more computation power to run, which in turn makes the transactions more expensive.
Threat to Fungibility
The Ethereum network offers pseudo-anonymity (i.e., transactions are linked to addresses derived from public keys and are signed with user-held private keys, rather than tied to a username and password), and its distributed nature and transparency enable many radical new technological capabilities.
However, similar to Bitcoin, Ethereum can also unwittingly expose users who don’t realize the breadth of information they are sharing when they use such technology for things like fungible digital asset transfers.
One threat to privacy lies in knowledge of the identities with which public and private keys are associated. Given the public nature of blockchains like Bitcoin and Ethereum, naively using their built-in transaction framework leaves a trail of breadcrumbs that makes it easy to follow every transfer of any asset, even a fungible one, as it changes hands.
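To make the breadcrumb trail concrete, here is a minimal Python sketch of how an observer could follow transfers outward from a single known address. The ledger, the address labels, and the `trace_linked_addresses` helper are all invented for illustration; real chain analysis relies on far richer heuristics than this.

```python
from collections import deque

# Toy public ledger of (sender, receiver, amount). All addresses are made up.
LEDGER = [
    ("0xAlice", "0xShop", 5),
    ("0xShop", "0xSupplier", 3),
    ("0xBob", "0xCarol", 1),
    ("0xSupplier", "0xExchange", 2),
]

def trace_linked_addresses(ledger, start):
    """Breadth-first walk over outgoing transfers, returning every address
    reachable from `start` by following the public transaction log."""
    outgoing = {}
    for sender, receiver, _amount in ledger:
        outgoing.setdefault(sender, []).append(receiver)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in outgoing.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen

print(sorted(trace_linked_addresses(LEDGER, "0xAlice")))
# ['0xExchange', '0xShop', '0xSupplier']
```

Once any one of these addresses is tied to a real-world identity (for instance at an exchange), every hop along the path becomes easier to attribute.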
Privacy Through Address Generation
As privacy technologies continue to advance, many more complex threat models can be considered. In 2012, BIP32 introduced Hierarchical Deterministic keys which allowed for a single seed phrase to generate a never-ending stream of “fresh” Bitcoin addresses. This allowed users to generate new addresses every time they accepted a transaction, and all of these addresses could easily be exported and imported into new wallets without having to import several randomly generated keys individually.
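As a rough illustration of how one seed can yield a stream of fresh keys, the sketch below follows the hardened private-key derivation described in BIP32 using only Python's standard library. It is deliberately partial: it skips the specification's invalid-key checks, non-hardened derivation, and the elliptic-curve step that turns a private key into a public key and address. The seed value and the function names are assumptions chosen for the example.

```python
import hashlib
import hmac

# Order of the secp256k1 curve group used for Bitcoin and Ethereum keys.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 0x80000000  # indices at or above this value are "hardened"

def master_key(seed: bytes):
    """Derive the master private key and chain code from a seed."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]

def hardened_child(parent_key: int, chain_code: bytes, index: int):
    """Derive hardened child number `index` from a parent private key."""
    data = (
        b"\x00"
        + parent_key.to_bytes(32, "big")
        + (HARDENED + index).to_bytes(4, "big")
    )
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    child_key = (int.from_bytes(digest[:32], "big") + parent_key) % N
    return child_key, digest[32:]

# Example seed for illustration only; never reuse a published seed.
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
key, chain = master_key(seed)
for i in range(3):
    child, _ = hardened_child(key, chain, i)
    print(i, hex(child))
```

Because every step is deterministic, backing up the seed alone is enough to regenerate the whole sequence of keys in a new wallet.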
This same feature exists in Ethereum, though newly generated keys cannot interact with smart contracts until they are funded with the ETH they need for gas costs. This is also complicated by the fact that many systems built on Ethereum tie many aspects of a user’s real-world identity to their address. This extra layer of metadata linked to Ethereum addresses can make Ethereum especially vulnerable to de-anonymization attacks. Fortunately, the same smart contract capabilities that expose Ethereum to these threats can also be used by cutting-edge new cryptosystems to enable safe and seamless private transactions.
ZK Constructions and Trusted Setups
Many zero-knowledge constructions require what is called a “Trusted Setup”. This means that the entire construction relies on the generation of special random numbers, and anyone who knows those random numbers can compromise the construction (for example, by forging proofs). In order to alleviate some of these concerns, complex ceremonies have been devised to generate these random parameters to ensure that the construction can be trusted. This typically involves several trusted members of the community each deriving their own private random data and combining it with the others’ in such a way that if *any* one of the participating parties deletes their key data, then the secret value is safe. Hence, all participating parties would need to collude to put the construction at risk.
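The "one honest participant is enough" property can be illustrated with a toy Python sketch. Each participant raises a running public value to their own private exponent, so the combined secret is the product of all contributions. This is only an intuition aid under loud assumptions: the modulus `P`, base `G`, and function names are invented here, the group is far too small to be secure, and real ceremonies operate on elliptic-curve parameters with proofs that each contribution was formed correctly.

```python
import secrets
from functools import reduce

P = 2**127 - 1  # a Mersenne prime; toy-sized modulus, NOT secure
G = 3           # toy base element

def contribute(current: int, secret: int) -> int:
    """One participant mixes their private randomness into the parameter."""
    return pow(current, secret, P)

def run_ceremony(secret_list):
    """Chain every participant's contribution, starting from G."""
    return reduce(contribute, secret_list, G)

def reconstruct(secret_list):
    """What colluders could compute from the secrets they pooled:
    the combined exponent and the public value it corresponds to."""
    combined = reduce(lambda a, b: (a * b) % (P - 1), secret_list, 1)
    return combined, pow(G, combined, P)

participants = [secrets.randbelow(P - 3) + 2 for _ in range(3)]
public_value = run_ceremony(participants)

# With ALL secrets pooled, the hidden exponent behind the public value is known.
_, rebuilt = reconstruct(participants)
print(rebuilt == public_value)  # True
```

If even one participant deletes their exponent, the others are left with a value computed from an incomplete product, and recovering the missing factor would require solving a discrete logarithm (infeasible at realistic parameter sizes, though not at this toy size).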
Notably, the Bulletproofs used by Monero do not require a trusted setup, but the zk-SNARKs in Zcash do. The Zcash trusted setup ceremony was documented in a now-famous Radiolab piece. In contrast, STARKs do not require a trusted setup, as they instead use the choice of hashing function as their “setup” rather than any kind of special numbers. Various trusted setup ceremonies have been proposed, such as the multi-party Perpetual Powers of Tau Ceremony.
Zero-Knowledge Notes (ZK-Notes)
An early mover in the Ethereum privacy space, AZTEC Protocol uses a system of “zero-knowledge notes” to track cloaked finances. These notes are visible on the Ethereum network, including the owner of each note, but the amount stored on each note is hidden from everyone but the note’s owner.
The zero-knowledge magic comes in when a note owner decides to perform a “joinSplit” operation, which means they can take any number of notes they control and create a set of output notes that may or may not be owned by other people. This, in conjunction with stealth address technology, can make it such that every new note that is created is owned by a totally clean Ethereum address that has never before been used on the network. In a common use case, a “ZK-Asset” contract can be connected to any ERC20-compatible token, allowing users to deposit tokens to get ZK-Notes minted, and letting users burn ZK-Notes to withdraw. This mechanism allows any existing asset on the Ethereum network to be traded in a privacy-preserving manner. The proofs used by AZTEC Protocol are easier to use than zk-SNARKs, but still require a trusted setup.
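One way to picture how a joinSplit can be checked without revealing amounts is with homomorphic commitments. The sketch below uses a generic Pedersen-style commitment in a toy group; it is not AZTEC's actual commitment scheme or proof system, and the parameters `P`, `G`, `H` and all function names are assumptions made for the example. It also omits the range proofs a real system needs to rule out negative or overflowing amounts.

```python
import secrets

P = 2**127 - 1  # toy prime modulus, NOT secure
G, H = 3, 5     # toy bases; a real scheme needs H with no known relation to G

def commit(value: int, blinding: int) -> int:
    """Hide `value` behind a random blinding factor."""
    return (pow(G, value, P) * pow(H, blinding, P)) % P

def product(commitments):
    result = 1
    for c in commitments:
        result = (result * c) % P
    return result

def join_split_balances(input_notes, output_notes) -> bool:
    """Check that inputs and outputs commit to the same total,
    using only the commitments and never the amounts themselves."""
    return product(input_notes) == product(output_notes)

# The sender picks blinding factors so that they also balance.
r1, r2, r3 = (secrets.randbelow(P - 1) for _ in range(3))
r4 = (r1 + r2 - r3) % (P - 1)

inputs = [commit(30, r1), commit(20, r2)]   # two notes being spent
outputs = [commit(45, r3), commit(5, r4)]   # payment note and change note
print(join_split_balances(inputs, outputs))  # True, since 30 + 20 == 45 + 5
```

An observer sees four opaque numbers and learns only that value was conserved, which mirrors the idea of notes whose owners are visible while their amounts stay hidden.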
AZTEC is also approaching trusted setups with other novel solutions. PLONK is a new, efficient zk-SNARK construction that requires only one trusted setup, which all programs can reuse. PLONK is efficient enough for practical use on Ethereum because of its modest gas requirements. Based on these transaction capabilities, Tom Pocock, CEO of AZTEC Protocol, believes PLONK can be used to program complex logical statements in a manner that preserves perfect privacy.
ZK using Secure MPC
Implemented in ZKBoo and more recently in Ligero, this approach “compiles” secure multi-party computation protocols into ZK-PCP systems (one of the earliest ZK systems using probabilistic proofs) by requiring the prover to commit to the transcript of a secure MPC protocol run “in the head”, and then letting the verifier randomly inspect the view of one of the parties. This basically means that the entity with knowledge of the relevant data can perform a computation as a simulation of a distributed computation among multiple parties, and then reveal the transcript at randomly chosen points of evaluation. Moreover, using MPC opens up the possibility of private smart contracts.
Like ZK-STARKs, MPC-based proofs are:
- Transparent: the generation of random numbers is public information
- Post-quantum secure: they rely only on public randomness and hash functions, which quantum computers are not known to attack at scale
- Scalable: MPC-based proofs have a quasilinear proving time and a verifier time that can be highly efficient for amortized and batched computations
One tradeoff is that these techniques tend to be best suited to small and medium-sized “circuits” (problems), which could lead to scalability issues for verifiers as problems grow.
That being said, MPC-based techniques still haven’t been explored to their full potential in the blockchain space, and they would be much more universal than existing ZK techniques, especially in cases where parties need to keep information about the computation itself confidential. For instance, MPC techniques would be useful for running a credit-scoring algorithm to assess the creditworthiness of a customer when the customer does not want to reveal their transaction history and the bank does not want to reveal the weights in its ML credit-scoring model.
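The credit-scoring scenario can be sketched as a two-party dot product over additive secret shares, using Beaver multiplication triples. This is a single-process simulation under stated assumptions: a trusted dealer hands out the triples, both parties follow the protocol honestly, the score is a simple linear model with non-negative integer inputs, and every name and number below is hypothetical.

```python
import secrets

Q = 2**61 - 1  # prime modulus for the toy arithmetic shares

def share(value: int):
    """Split `value` into two additive shares modulo Q."""
    first = secrets.randbelow(Q)
    return first, (value - first) % Q

def beaver_triple():
    """Trusted-dealer stand-in: shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(Q), secrets.randbelow(Q)
    return share(a), share(b), share((a * b) % Q)

def secure_multiply(x_shares, w_shares):
    """Multiply two shared values; returns additive shares of x * w."""
    a_sh, b_sh, c_sh = beaver_triple()
    # Each party publishes its share of (x - a) and (w - b). The random masks
    # a and b hide x and w, so only the masked differences become public.
    d = sum(x_shares[i] - a_sh[i] for i in range(2)) % Q
    e = sum(w_shares[i] - b_sh[i] for i in range(2)) % Q
    z = [(c_sh[i] + d * b_sh[i] + e * a_sh[i]) % Q for i in range(2)]
    z[0] = (z[0] + d * e) % Q  # the public d * e term is added by one party only
    return tuple(z)

def private_score(features, weights):
    """Dot product of the customer's features and the bank's weights."""
    totals = [0, 0]
    for x, w in zip(features, weights):
        z = secure_multiply(share(x), share(w))
        totals = [(totals[i] + z[i]) % Q for i in range(2)]
    return sum(totals) % Q  # only the final score is ever reconstructed

customer_features = [3, 12, 7]   # hypothetical transaction-history features
bank_weights = [10, 2, 5]        # hypothetical model weights
print(private_score(customer_features, bank_weights))  # 89
```

In a real deployment each party would hold only its own share of every value and exchange messages over a network; the point here is just that the score can be computed while the individual features and weights stay masked.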
When Zcash first introduced the idea of using zk-SNARKs for sending transactions, there were some serious concerns about the amount of compute power required to use cloaked transactions, on the order of several hours or more to generate a transaction. We’ve come a long way since then, and modern implementations are now capable of doing analogous tasks in seconds in a browser, or even on mobile devices.
Mixers for Privacy
One topic that has been getting a lot of attention recently is that of mixers. Back in May, Vitalik posted the motivation and a rough outline for a next-generation mixer design on the Ethereum network.
Ethereum mixers are needed to help implement natively private transactions for wallets or individuals. The traceability of ether means that specific transactions can be tracked and linked to other wallets, accounts, etc. Mixers are utilized for swapping ether for ether to further anonymize the transacting ether.
Since then many groups have been working tirelessly to make mixers more practical for Ethereum. Below is an up to date chart of the compute and gas costs for depositing and withdrawing mixed ether.
Individual mixers at the application layer will never provide absolute privacy to users, instead giving only probabilistic guarantees. However, maybe this will be enough for the demands of most individuals and enterprises.
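To see why the guarantee is only probabilistic, consider a simplified model in which every deposit has the same size and an observer's only clue is timing. The Python sketch below computes the resulting anonymity set; the deposit list, timestamps, and function names are made up, and the model ignores earlier withdrawals, gas-payment links, and other metadata that would shrink the set further in practice.

```python
from fractions import Fraction

def anonymity_set(deposits, withdrawal_time):
    """Depositors who could plausibly be behind a withdrawal: everyone who
    deposited the fixed amount strictly before the withdrawal happened."""
    return {addr for addr, t in deposits if t < withdrawal_time}

def guess_probability(deposits, withdrawal_time):
    """Chance that a uniform random guess picks the true depositor."""
    candidates = anonymity_set(deposits, withdrawal_time)
    return Fraction(1, len(candidates)) if candidates else None

# (address, deposit time) pairs for one fixed-denomination pool.
deposits = [("0xA", 1), ("0xB", 2), ("0xC", 3), ("0xD", 9)]
print(guess_probability(deposits, withdrawal_time=5))  # 1/3
```

With only three earlier depositors, an observer guesses correctly one time in three, which is why mixers depend on large numbers of participants.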
Who Pays for Gas? Rise of The Relays.
One fatal flaw of a lot of these approaches is that in the end, someone needs to pay the gas to claim the output. Where does this ether come from? If the ETH that pays for the final claim can be traced back to some user, then that user can be de-anonymized, which defeats the entire purpose.
This creates a sort of privacy “chicken and egg” scenario, where the only way to accept anonymized ether is to already have anonymized ether. In Vitalik’s original mixer blog post, he solves this with a simple relayer registry contract, where relay operators who promise to publish arbitrary transactions can register an HTTP endpoint where such transactions can be anonymously published.
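The relayer idea can be modeled in a few lines of Python. This is a hypothetical sketch rather than the design from the original post: the class and function names, the example URLs, and in particular the assumption that the relayer is repaid with a fee deducted from the withdrawn amount are all illustrative choices.

```python
import random
from dataclasses import dataclass, field

@dataclass
class RelayerRegistry:
    """Toy stand-in for an on-chain registry of relayer endpoints."""
    endpoints: dict = field(default_factory=dict)  # relayer address -> URL

    def register(self, relayer: str, url: str) -> None:
        self.endpoints[relayer] = url

    def pick(self, rng: random.Random) -> tuple:
        """Choose a registered relayer at random."""
        relayer = rng.choice(sorted(self.endpoints))
        return relayer, self.endpoints[relayer]

def settle_withdrawal(amount: int, fee: int, recipient: str, relayer: str) -> dict:
    """Split a withdrawal so the relayer is repaid out of the mixed funds
    and the fresh recipient address never needs pre-existing ether."""
    if not 0 <= fee <= amount:
        raise ValueError("fee must be between 0 and the withdrawn amount")
    return {recipient: amount - fee, relayer: fee}

registry = RelayerRegistry()
registry.register("0xRelayerOne", "https://relayer-one.example/submit")
registry.register("0xRelayerTwo", "https://relayer-two.example/submit")

relayer, url = registry.pick(random.Random())
print(settle_withdrawal(100, 3, "0xFreshAddress", relayer))
```

Because the relayer pays the gas and is compensated from the withdrawal itself, the recipient address can stay completely unlinked from any previously funded account.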
Finally, wallet juggling and operational security must be considered. How to set safe defaults that protect users without giving them too cumbersome an experience is still being discussed. All of these mixer solutions require a large number of participants for a reasonable expectation of privacy, so the tools need to be easy for the masses to use, but any shortcuts here could lead to some pretty serious breaches of privacy. For example, a user who mixes some ether and spends part of it on something intended to be private might later forget which wallet they used for the private transaction and send the remaining ether back to an address that is publicly associated with them.
These technologies, and many others being developed in the space, are great indicators that privacy on the Ethereum network is gaining attention and that its progress may soon receive a massive boost. While achieving privacy on public blockchains appears paradoxical, zero-knowledge and other privacy technologies will enable all sorts of new, cutting-edge use cases. At the same time, these solutions will empower users and give them peace of mind about their financial privacy.
While this is not a complete overview of all privacy capabilities on Ethereum, it examines the various methods for achieving the privacy requirements for enterprises and consumers. Many individuals in the crypto ecosystem were inspired by its censorship-resistant technology that provided freedom. The ability to transact anonymously or protect an individual’s information is fundamentally important in order to create a crypto native world. When it comes to privacy, there’s no magic bullet, but rather numerous methods and mechanisms that provide privacy for specific use cases.
As such, we’ll continue to examine and evaluate the privacy solutions for Ethereum in order to help educate and propel this technology forward. This will include future posts about specific privacy solutions and reports explaining the various privacy technologies and more in-depth analysis of projects and companies currently building privacy solutions.
Disclosure: ConsenSys remains incredibly interested in privacy and scalability technologies and ConsenSys Labs has invested in Aztec Protocol, Ligero, and Starkware, and continues to look for great projects pushing the limitations of this space.
Thanks to Min Teo, Joseph Chow, and Zac Williamson for coordinating the initial outline as well as Amira Bouguera, Praneeth Srikanti, and Steve Marx for their feedback.