Authenticity is becoming increasingly scarce in the digital world. A piece of data is authentic if you know where or who it came from. The Internet gave us universal information distribution, and AI has given us unbounded content creation, but the tools to ensure that we can trust the data we receive have not caught up. Digital signatures are a battle-tested cryptographic primitive that addresses this authenticity crisis, and the need to widely implement digital signatures has never been more pressing. This article will go over digital signatures, what they are and aren't good at, and explore some of the more interesting types of data that can be digitally signed.
Digital signatures are a cryptographic primitive that powers much of the modern Internet. If you are already familiar with them, feel free to skip to the next section.
Say you want to perform a simple interaction online, like getting your account balance from your bank's website. There is one fundamental question - when you visit www.bank.com, how do you know that you're actually talking to your bank? The problem here is that the Internet was not designed with a permanent notion of identity - participants use IP addresses which can change over time. Yet users should always know how to reliably communicate with important websites such as their bank, government page, or favorite social media app. So what can we do to ensure authenticity?
The answer is digital signatures! Digital signatures are a tool of public key cryptography - to start, you generate a keypair which consists of a public key and a private key (both of these are just represented as byte strings). You keep your private key secret and broadcast your public key to the world, i.e. telling everyone that "Bob has public key 0xabcdef". You are now able to "sign" a piece of data using your private key, which will generate a string of bytes known as the signature. Anyone in the world can now put together the original piece of data, this signature, and your public key in order to "verify" that the signature did indeed come from the holder of the matching private key.
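To make the generate, sign, and verify steps concrete, here is a minimal sketch in Python. It uses a Lamport one-time signature, which I picked only because it needs nothing beyond the standard library's hashlib; it is not what your bank uses (production systems typically rely on schemes such as Ed25519 or ECDSA), and each Lamport keypair may safely sign only one message. The function names are my own for illustration.

```python
import hashlib
import secrets

HASH_BITS = 256  # we sign the SHA-256 digest of the message, one bit at a time


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Return (private_key, public_key) for a single Lamport signature."""
    # Private key: two random 32-byte secrets per digest bit (one for 0, one for 1).
    private_key = [
        (secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(HASH_BITS)
    ]
    # Public key: the hash of every secret. Safe to publish.
    public_key = [(_h(s0), _h(s1)) for s0, s1 in private_key]
    return private_key, public_key


def _digest_bits(message: bytes):
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> i) & 1 for i in range(HASH_BITS)]


def sign(private_key, message: bytes):
    # For each digest bit, reveal the secret that corresponds to that bit's value.
    return [private_key[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    if len(signature) != HASH_BITS:
        return False
    # Each revealed secret must hash to the published value for that bit.
    return all(
        _h(signature[i]) == public_key[i][bit]
        for i, bit in enumerate(_digest_bits(message))
    )


# Usage: Bob signs a message, anyone holding his public key can check it.
private_key, public_key = generate_keypair()
message = b"Bob's account balance is $1,234"
signature = sign(private_key, message)
print(verify(public_key, message, signature))
print(verify(public_key, b"Bob's account balance is $9,999", signature))
```

The first check should succeed and the second should fail, since changing the message changes its digest and the revealed secrets no longer line up with the public key.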
Note that this solves both the identity and authenticity problems of the internet - your identity is now just your public key! In addition, someone else can assume that in a secure setting, everything signed by your public key originated from you.
Now you can trust that you are actually talking to your bank online. In reality, there remains the question of how you can get your bank's public key in the first place - you can't trust your bank to just tell you its public key since you don't know who your bank is in the first place! For this, we rely on public key infrastructure (PKI) to create a mapping from www.bank.com to your bank's public key. PKI is out of scope for this discussion, but you can learn more about it here.
Now we understand why authenticity is so important in the digital world, but why use digital signatures in particular? The answer is threefold: self-sufficiency, unforgeability, and composability.
Self-sufficiency
Unforgeability
Composability
Let's visit an example to see how these properties of digital signatures compare to alternatives, and when digital signatures would be preferable. Say the government of Democratia wants to issue a digital identity card for its citizens. They want to choose between two options: a) issue a card to each citizen which contains the citizen's basic information along with a digital signature and is kept in their digital wallet or b) set up accounts for each citizen and allow applications to request information about a citizen using an OAuth-like flow.
What are the advantages of using digital signatures? First off, because signatures are self-sufficient, you can present your digital identity information to anyone without sending a request to Democratia's server or going through any authentication flow. One can imagine that if every country had its own digital identity card, then an application would need to support logging into each individual government website for compatibility. Digital signatures create a lightweight abstraction layer for authentication where logins to different databases are replaced by possession of a signature. Secondly, the unforgeability of digital signatures can lead to better security tradeoffs compared to a central server. Notably, using digital signatures allows for the private key management and signing process to happen in a separate, highly secure server. Democratia can still run a "hot" website which receives a lot of traffic and even displays identity information to users, but even if this website is hacked an attacker will still not be able to generate fake signatures of users' identity data. Finally, because digital signatures are composable, any developer can use Democratia's digital identity system for their own application. For example, I can require users to prove they have a Democratia digital identity in order to log in to my application, and use a derivative of their Democratia credentials as their account identifier. By doing this, my application can utilize the system of trust that Democratia has built.
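As a rough illustration of the composability point, one way to build "a derivative of their Democratia credentials" is to hash the credential's signature together with an application-specific label, so that each application ends up with a different identifier for the same citizen. Everything here is an assumption for the sake of the sketch: the function name, the choice to hash the signature bytes, and the idea that the application has already verified the signature against Democratia's public key.

```python
import hashlib


def derive_account_id(app_name: str, credential_signature: bytes) -> str:
    """Derive a stable, app-specific identifier from a verified credential.

    Hypothetical helper: assumes the caller has ALREADY verified
    credential_signature against the issuer's public key.
    """
    app = app_name.encode("utf-8")
    # Length-prefix the app name so (app, signature) pairs cannot be
    # confused with one another through plain concatenation.
    material = len(app).to_bytes(2, "big") + app + credential_signature
    return hashlib.sha256(material).hexdigest()


# Usage: the same credential yields different identifiers in different apps.
signature = bytes.fromhex("abcdef")  # placeholder bytes, not a real signature
print(derive_account_id("my-forum", signature))
print(derive_account_id("my-shop", signature))
```

Note that this identifier is not a secret: anyone who sees the signature can recompute it, so it works as an account handle but not as a password.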
What are the disadvantages of digital signatures? Democratia may want to know every time someone's digital identity is being used, and keep track of what it is being used for. Because digital signatures are self-sufficient, the government would have no way of knowing how a signature is used once it is in the hands of a citizen. In addition, any developer can choose to build their own applications on top of a digital signature without permission, meaning Democratia won't be able to restrict potentially malicious applications from requiring a digital identity to log in. This is an important concern when it comes to ensuring data privacy, and some authorization consent screen combined with privacy preserving cryptography such as zero knowledge proofs would likely be necessary to ensure citizens know what information they are giving away. Lastly, the unmalleability of digital signatures makes it more difficult for Democratia to update a user's digital identity - they would need to re-sign the new identity, distribute the new signature, and have some method for revoking the old one.
Digital signatures are not a silver bullet for all problems related to authentication and authorization. They are not as flexible as restricting access through a central server, and do not provide any privacy guarantees by themselves. These are both issues that can be addressed, with public revocation lists for credential updates and deletes, and zero knowledge proofs with selective disclosure for privacy. But the main opposition to the adoption of digital signatures is likely to come from their self-sufficiency. Being unable to track the use of a signature and allowing users to present data without the consent of the issuing party is antithetical to how much of the data economy works today.
So far, digital signatures have found widespread adoption in TLS certificates, email, peer to peer payments, and more. But this is just the tip of the iceberg - every single piece of data that originates from a specific person or organization could be digitally signed. In particular, here are some important types of data that would benefit the most from having digital signatures.
Passports, driver's licenses, residency cards, and social security cards are perfect candidates for digital signatures because authenticity is of paramount importance. In fact, many passports and certain residency cards are already digitally signed today! Adding a digital signature to either a physical or digital card has a low marginal cost and greatly increases resiliency against forgery. The main challenge with signatures on identity documents is revocation and credential updates. A solution here would likely require online revocation lists or short expiry periods, but this is a challenge faced by existing identity documents regardless.
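The revocation and expiry checks mentioned above can be sketched in a few lines. This is only an illustration under my own assumptions: the `Credential` fields and the `is_currently_valid` helper are hypothetical, the revocation list is modeled as a plain set of IDs that a verifier has fetched from the issuer, and the signature on the credential is assumed to have been verified separately.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Credential:
    credential_id: str    # hypothetical unique ID covered by the issuer's signature
    expires_at: datetime  # hypothetical expiry, also covered by the signature


def is_currently_valid(credential, revoked_ids, now=None) -> bool:
    """Check revocation and expiry for an already signature-verified credential."""
    now = now or datetime.now(timezone.utc)
    if credential.credential_id in revoked_ids:
        return False  # the issuer has published this ID on its revocation list
    return now < credential.expires_at  # short expiry bounds the damage of a leak


# Usage with a fixed clock so the outcome does not depend on when this runs.
card = Credential("card-001", datetime(2030, 1, 1, tzinfo=timezone.utc))
clock = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(is_currently_valid(card, revoked_ids=set(), now=clock))
print(is_currently_valid(card, revoked_ids={"card-001"}, now=clock))
```

The tradeoff is visible in the two arguments: a revocation list brings back an online dependency on the issuer, while short expiry periods keep verification offline at the cost of frequent re-issuance.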
Digital signatures can be used in financial and legal environments as a system of accountability. Data communication in the current legal framework offers plenty of room for plausible deniability - physical documents can be forged, and authenticity is reliant upon physical signatures which themselves are quite unreliable. Oftentimes legal evidence is based on email or on subpoenas to other communication platforms, but this is costly and reliant on third parties. Digital signatures offer an efficient mechanism to keep all parties accountable for the information they communicate.
Authenticity is what fuels the creative industry - artists have names and reputations based on the work attributed to them. In an age where AI allows anyone to create and copy high quality media, and the Internet allows anyone to distribute such media at little to no cost, the authenticity of creative work has never been in greater peril. One part of the solution is for artists to digitally sign every piece they produce and have trusted registries for artist public keys. In addition, we could have a content timestamp notary, an entity that will take a piece of creative work and sign the content along with a timestamp of when it was created. More advanced techniques are possible, for example creating perceptual hashes of videos to automatically block copyrighted content, but at the minimum one should be able to present a piece of creative work with a signed timestamp and let the public decide who deserves credit for it. In fact, large publishers are already trying to address this problem, with the FOX Corporation recently launching Verify, a tool for media companies to digitally sign their content and upload it to the blockchain so that its provenance can be verified.
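To give a feel for what a content timestamp notary would actually sign, here is a small sketch of the message it might construct: a hash of the work bound to a UTC timestamp. The `notary_payload` name and the `hash|timestamp` layout are my own assumptions rather than any existing standard, and the signing step itself is left out; the notary would sign these bytes with its private key.

```python
import hashlib
from datetime import datetime, timezone


def notary_payload(content: bytes, created_at: datetime) -> bytes:
    """Build the bytes a (hypothetical) timestamp notary would sign."""
    if created_at.tzinfo is None:
        raise ValueError("created_at must be timezone-aware")
    # Hash the work so the notary never needs to store or publish it.
    content_hash = hashlib.sha256(content).hexdigest()
    # Normalize to UTC so the same instant always serializes identically.
    timestamp = created_at.astimezone(timezone.utc).isoformat()
    return f"{content_hash}|{timestamp}".encode("ascii")


# Usage: the payload changes if either the work or the timestamp changes.
artwork = b"...raw bytes of a song, image, or manuscript..."
when = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(notary_payload(artwork, when))
```

Because the signature covers the hash and the timestamp together, nobody can later attach the same signature to a different work or to an earlier date.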
Similar to creative work, it is becoming increasingly clear that the authenticity of content produced by hardware in the physical world is in jeopardy. This is most relevant to the sectors of news and public information, as photos, videos, and audio used in the news are susceptible to forgery and parody by generative models. A possible solution here is attested hardware, cameras and recorders that sign the data they produce with a tamper-resistant private key embedded in hardware. Even more ambitious, the hardware could sign the timestamp and location where the content was produced, although the latter is far more challenging. Photos, audio, and video could be transformed in zero knowledge, allowing information sources to release censored or modified content that was originally produced by trusted hardware.
As the cost of forging and imitating digital interactions goes down, the value of physical interactions goes up. Brands, creators, and artists care about loyal engagement from fans, and more often than not the most devout engagement occurs in the physical world. Trusted hardware can attest to human interactions in the real world, and this can be as simple as tapping an NFC card with your phone when you shop at a store, attend a fan meetup, or go to a concert. These interactions are especially valuable because they are verifiably produced by humans who are willing to put in time and effort, and they allow creators to isolate those who care the most.
So much of the data that is transferred on the web goes through APIs, and standardizing digital signatures across data returned by APIs would be a huge step towards authenticity on the Internet. A simple example: I want to prove my bank balance to you. I can take a screenshot of my balance, but then you might just say it was photoshopped. Instead, if my bank had a signed API, I could query this API and send you my balance with a signature proving its veracity. Signed APIs excel when privacy is involved. Instead of sending you my balance in the clear, I could instead generate a zero knowledge proof that my balance is above a certain number, say greater than $1000, and have this claim be rooted in the trust of the bank's original signed API response. However, in practice few APIs are signed, in part because companies depend on restricting access to their APIs to make money. The widespread signing of APIs will likely have to accompany business models that support an open data economy.
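One practical detail a signed API has to settle is which exact bytes get signed, since the same JSON object can be serialized in many ways. A minimal sketch, assuming the API returns JSON and that both the bank and the verifier agree on a sorted-key, whitespace-free encoding (the `canonical_bytes` helper and the response fields are hypothetical, and real deployments may prefer a standardized canonicalization scheme):

```python
import json


def canonical_bytes(response: dict) -> bytes:
    """Serialize a JSON response deterministically so it can be signed and verified."""
    # Sorted keys and fixed separators remove ordering and whitespace ambiguity.
    text = json.dumps(response, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


# Usage: key order in the original response no longer matters.
response_a = {"balance": 1500, "account": "bob"}
response_b = {"account": "bob", "balance": 1500}
print(canonical_bytes(response_a) == canonical_bytes(response_b))
```

The bank would sign the output of `canonical_bytes`, and anyone I forward the response to would recompute the same bytes before verifying the signature.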
We could go one step deeper in the abstraction stack and choose to sign HTTP responses. This would allow us to capture not just the data that is available via an API, but the content of any webpage on the Internet. One use case here would be a verifiable web archive, where news sources could be held accountable for the information that they publish, even if they choose to edit it later on. TLS Notary and Reclaim Protocol are bootstrapped solutions that allow us to get verifiable data from HTTP using a third party, but servers signing responses would remove the need for these workarounds.
The list of possible data that could be digitally signed is boundless, but one category is particularly interesting - any type of data where a trusted timestamp or geolocation is necessary. We've already discussed the timestamp notary, a system to which you could send a piece of content and get it signed with a timestamp. But another interesting direction is the location notary, a setup where some device can request to get its location at a particular time notarized. A possible implementation could involve aggregating signatures from nearby devices, or even getting signed GPS responses, but locations are far harder to pin down since they can be mocked by a network of devices spread around the world.
Digital signatures are great, but they don't solve everything. In particular, some of the challenges and shortcomings of digital signatures:
Digital signatures were first conceived in the 1970s and have already been implemented widely around the Internet, so why should we care about them even more today? In short, the need has never been greater, the UX for key management has vastly improved, and new cryptographic primitives have been developed to address the shortcomings of digital signatures.
Authenticity is of paramount importance in today's digital age, and with modern technology it is becoming increasingly difficult to know where information comes from. Digital signatures offer a lightweight, robust, open system to provide authenticity, and we're only beginning to scratch the surface on the types of data that can be digitally signed. Now is a better time than ever to push for their adoption.
P.S. And of course, this piece is digitally signed. I used libhalo from ARX Research to sign the Markdown of this article with a FIDO2 compliant NFC card; you can try it for yourself at nfcsign.me.
P.P.S. The title of this article, Sign Everything, references a previous blog post from Fred Wilson.