1.2 Provenance vs. authentication vs. detection

Three terms circulate as near-synonyms in the press and frequently even in vendor materials. They are not synonyms: each names a different question, a different tool, and a different failure mode. A single news article will speak of "authenticating" an image, "detecting" deepfakes, and "verifying provenance" as if all three were facets of the same operation. Conflating them leads to bad procurement decisions, bad editorial calls, and bad legislation. This page sets out the boundaries.

Authentication asks: is this file what its sender claims it to be, and has it been altered since? Provenance asks: where did this file come from, and what happened to it along the way? Detection asks: was this file produced by a machine, and if so, which kind? The first is a question about a specific transmission. The second is a question about history. The third is a question about generative origin. The tools, the failure modes, and the trust assumptions are different for each.

This page exists because the conflation is not academic. In late 2024 the Cybersecurity and Infrastructure Security Agency published guidance that used the three terms interchangeably across consecutive paragraphs. The same year, several state legislatures drafted bills mandating "authentication" of AI-generated images without specifying what that meant. The result is implementations that satisfy a checkbox without addressing the underlying question.

Authentication: a question about a specific file

Authentication, in the cryptographic sense, asks whether a digital object is the exact one a particular party signed or transmitted. It is a binary question with a binary answer. The hash of the received bytes either matches the signed hash or it does not. If it does, the file is authentic with respect to that signature. If it does not, something changed between signing and receipt.

Classical authentication mechanisms — PGP signatures on attachments, S/MIME on email, code-signing on binaries — apply trivially to images. You compute SHA-256 over the file, you sign the hash with a private key, you ship the signature alongside. The receiver checks. Done. The mechanism is well understood and has been deployed for decades. Its limit is that it tells you nothing about what the file represents. A perfectly authenticated image can be a perfect fake; authentication speaks only to integrity in transit, not to truth at capture.
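The hash-then-sign flow above can be sketched in a few lines. This is a toy using an HMAC as a stand-in for a real private-key signature (the `SECRET_KEY` and sample bytes are illustrative, not part of any spec); the point is the binary verdict and its fragility.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-signing-key"  # stand-in for a real private key

def sign(data: bytes) -> bytes:
    # Hash the exact bytes, then authenticate the hash.
    # HMAC stands in for an asymmetric signature in this sketch.
    digest = hashlib.sha256(data).digest()
    return hmac.new(SECRET_KEY, digest, hashlib.sha256).digest()

def verify(data: bytes, signature: bytes) -> bool:
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(sign(data), signature)

original = b"\xff\xd8\xff\xe0 sample jpeg bytes"
sig = sign(original)

print(verify(original, sig))            # True: bytes unchanged
print(verify(original + b"\x00", sig))  # False: any byte change invalidates
```

Note that the second check fails on a single appended byte; re-encoding, which changes every byte, fails for the same reason.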

Authentication also breaks under any pixel-level transformation. Re-saving a JPEG at a different quality changes every byte and invalidates the signature. Resizing for a different display invalidates it. So does any social-media platform that strips metadata and re-encodes. This is why authentication on its own has never been a workable model for images outside controlled environments like camera-back tethering or evidence transfer. The need for a mechanism that survives normal handling is what eventually produced the more elaborate provenance frameworks.

Provenance: a question about history

Provenance, in the C2PA sense, is the documented chain of custody for an image from capture or generation through every editing and publication step. It is constructed incrementally: each tool that handles the file appends a signed record of what it did, hashes the new state, and binds its assertion to the prior chain. The result is a manifest store that a validator can traverse to answer questions like: what device produced the original pixels, what edits were made and by whom, and was the chain broken at any point.
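The append-and-bind structure can be illustrated with a toy chain. This is not the C2PA manifest format (real manifests use COSE signatures inside JUMBF containers); it only shows how each record hashes the new content and commits to everything before it, so a validator can detect a broken link.

```python
import hashlib
import json

def record_step(prior_chain, actor, action, new_bytes):
    """Append a signed-record stand-in binding this edit to the prior chain."""
    entry = {
        "actor": actor,
        "action": action,
        "content_hash": hashlib.sha256(new_bytes).hexdigest(),
        # Commit to the entire prior history, not just the last record.
        "prior_hash": hashlib.sha256(
            json.dumps(prior_chain, sort_keys=True).encode()
        ).hexdigest(),
    }
    return prior_chain + [entry]

def validate(chain):
    """Traverse the chain; confirm each record binds to its predecessors."""
    for i, entry in enumerate(chain):
        expected = hashlib.sha256(
            json.dumps(chain[:i], sort_keys=True).encode()
        ).hexdigest()
        if entry["prior_hash"] != expected:
            return False
    return True

chain = record_step([], "camera-01", "capture", b"raw pixels")
chain = record_step(chain, "editor-07", "crop", b"cropped pixels")
print(validate(chain))  # True

chain[0]["actor"] = "someone-else"  # tamper with recorded history
print(validate(chain))  # False: the second record no longer binds
```

The tampered first record is caught by the second record's `prior_hash`, which is the structural difference from a single-point signature: history, not just the latest state, is what gets verified.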

The crucial difference from authentication is that provenance is about a sequence, not a single point. A C2PA-signed image carries its own history, structured in a way that lets a downstream consumer assess not just integrity but plausibility. An image that claims to be a Reuters photograph but whose manifest shows it was generated by DALL·E and then signed by an unknown editor is informative in a way that a yes-or-no hash check cannot be. The manifest structure and the assertions vocabulary are what make this work.

Provenance also breaks under transformation, but its failure mode is different and more recoverable. A C2PA manifest is embedded as a JUMBF box inside the image container; many platforms strip such boxes on upload. The provenance ecosystem has responded with durable Content Credentials — a fallback layer combining watermarks and perceptual fingerprints that lets a stripped image be matched back to its original manifest stored on a server. This is messier than pure cryptographic authentication, but it survives the conditions under which images actually circulate.
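The perceptual-fingerprint half of durable Content Credentials can be sketched with a minimal average hash. Real systems use far more robust fingerprints plus watermarks; this toy (the pixel values are synthetic) only shows why a fingerprint can re-match a stripped, re-encoded image where a cryptographic hash cannot.

```python
def average_hash(pixels):
    """64-bit perceptual fingerprint of an 8x8 grayscale grid (values 0-255):
    one bit per pixel, set if the pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return sum(1 << i for i, p in enumerate(pixels) if p > mean)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

original = [(i * 37) % 256 for i in range(64)]
# Re-encoding perturbs pixel values slightly; only pixels near the mean
# can flip their bit, so the fingerprint stays close.
reencoded = [min(255, p + 3) for p in original]

print(hamming(average_hash(original), average_hash(reencoded)))  # small
```

A server holding the original manifest can accept any upload whose fingerprint falls within a small Hamming distance, which is exactly the lookup that lets a stripped image recover its provenance.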

Detection: a question about generative origin

Detection asks a question independent of any signature: was this image, or some region of it, produced by a generative model? Detection methods range from learned classifiers trained on real-versus-synthetic datasets, through frequency-domain analyses that look for artifacts characteristic of upsampling networks, to watermark extractors that look for an embedded signal placed by a generator. The output is typically a probability rather than a yes-or-no.

Detection does not require cooperation from the producer. This is its great virtue and also its limit. Because the producer is not in the loop, detection has no leverage to demand that the generator make itself identifiable. It is reduced to looking for tells, and adversarial pressure from the generator side has made the tells unstable. A classifier that achieves 99% accuracy on Stable Diffusion 1.5 outputs may drop to 60% on Stable Diffusion XL, and to chance against an image laundered through a regeneration attack. AI image detection covers the empirical curve in detail.

The one place detection has structural advantage is when the generator cooperates. SynthID and AI-generator watermarks are detection in this cooperative sense: the generator embeds a signal at synthesis time, and the detector extracts it later. This collapses the asymmetry as long as the generator is honest. It does not help against an actor running a model that refuses to cooperate, which includes most open-weights deployments.
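The embed-and-extract contract of cooperative watermarking can be shown with a deliberately naive least-significant-bit scheme. This is not how SynthID works (SynthID embeds a robust signal during generation, not in raw LSBs, and the bit pattern here is invented); the sketch only illustrates why cooperation collapses the asymmetry.

```python
WATERMARK = 0b1011001110001101  # illustrative 16-bit signal

def embed(pixels, mark=WATERMARK, bits=16):
    """Generator side: write the mark into the least significant bits."""
    out = list(pixels)
    for i in range(bits):
        out[i] = (out[i] & ~1) | ((mark >> i) & 1)
    return out

def extract(pixels, bits=16):
    """Detector side: read the candidate mark back out."""
    return sum((pixels[i] & 1) << i for i in range(bits))

synthetic = embed([200] * 64)
print(extract(synthetic) == WATERMARK)   # True: cooperative generator
print(extract([200] * 64) == WATERMARK)  # False: non-cooperating source
```

The detector's job is trivial when the generator plays along, and hopeless against one that does not, which is the structural point: the scheme's reach ends exactly at the set of generators willing to run the embed step.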

A side-by-side comparison

| Property | Authentication | Provenance | Detection |
|---|---|---|---|
| Question asked | Has this file changed since signing? | What is this file's history? | Is this file machine-generated? |
| Requires producer cooperation | Yes | Yes | No (or yes, for watermark detection) |
| Output | Binary (valid / invalid) | Structured chain of signed records | Probability, often per-region |
| Survives re-encoding | No | Hard binding: no. Soft / durable: partially | Depends on method; many do not |
| Primary spec / tool | PGP, S/MIME, raw SHA-256 + signature | C2PA 2.x, JPEG Trust (ISO 22144) | Classifier APIs, SynthID, frequency analysis |
| Adversarial robustness | Strong on integrity, none on truth | Strong against pixel forgery, weak against signer compromise | Brittle across models and post-processing |
Note: An image being "verified" in a newsroom workflow is rarely subjected to just one of these. It gets a provenance check (if any signal exists), plus authentication of any chain found, plus detection on the residual unknown, plus classical methods like reverse image search and metadata analysis. Any tool that markets a single answer to "is this real?" is collapsing categories that need to remain separate.

Where the three meet in current standards

The C2PA 2.x specification uses all three concepts in different roles. The signature on a manifest is authentication: it asserts the manifest has not been altered. The chain of manifests across edits is provenance: it asserts where the file has been. The optional AI-generation assertion is a producer-cooperative detection signal: the generator says, this is mine. C2PA does not perform third-party detection itself, but it provides a slot for a generator's self-declaration to ride along with the rest of the chain.

The JPEG Trust specification (ISO 22144), published in 2024 and refined in 2025, integrates with C2PA by defining how trust evaluations are recorded and reported. JPEG Trust does not invent a new provenance format; it builds a vocabulary for expressing the results of provenance checks, with reason codes for outcomes like "manifest valid but signer untrusted" or "ingredient missing from chain." This separation — C2PA carries the data, JPEG Trust carries the evaluation — has become useful for compliance reporting where the question "did you check?" is as important as "what did you find?"
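The data-versus-evaluation split can be sketched as a mapping from raw check results to reporting codes. The code strings below echo the outcomes named above but are illustrative, not the actual ISO 22144 vocabulary; the function signature is likewise hypothetical.

```python
def evaluate(manifest_valid: bool, signer_trusted: bool,
             ingredients_complete: bool) -> str:
    """Map raw provenance-check results to a reporting code.
    Code strings are illustrative, not the ISO 22144 vocabulary."""
    if not manifest_valid:
        return "manifest-invalid"
    if not signer_trusted:
        return "manifest-valid-signer-untrusted"
    if not ingredients_complete:
        return "ingredient-missing-from-chain"
    return "trusted"

print(evaluate(True, False, True))  # manifest-valid-signer-untrusted
```

The point of the separation is that the answer "manifest-valid-signer-untrusted" is itself a reportable artifact: it records that the check was performed and what it found, which is what compliance reporting needs.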

For working practitioners, the takeaway is that the three categories are not competing options. They are layered. Authentication is the cryptographic primitive at the bottom. Provenance is the structured history built on top. Detection fills the gap when neither produced a signal. Anyone designing a verification workflow that picks one and ignores the others is making an explicit bet about what their adversary will do, and the bet is usually wrong.
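The layering can be made concrete as a workflow skeleton. All three hooks (`signature_check`, `detector`, the manifest object) are hypothetical stand-ins for real tools; the structure, not the names, is the point: provenance first, authentication of whatever chain exists, detection only on the residual unknown.

```python
def verify_image(image_bytes, manifest=None, signature_check=None,
                 detector=None):
    """Layered verdict: provenance if present, authentication of that
    chain, detection as the fallback. All hooks are hypothetical."""
    report = {}
    if manifest is not None:
        report["provenance"] = "chain present"
        # Authenticate the chain we found, rather than the raw pixels.
        report["authentication"] = (
            "valid" if signature_check and signature_check(manifest)
            else "invalid"
        )
    else:
        report["provenance"] = "no signal"
        if detector is not None:
            # Detection fills the gap: a probability, not a verdict.
            report["detection_p_synthetic"] = detector(image_bytes)
    return report

print(verify_image(b"pixels", manifest=None, detector=lambda b: 0.42))
# {'provenance': 'no signal', 'detection_p_synthetic': 0.42}
```

A workflow that returned a single boolean from this function would be making exactly the category collapse the comparison table warns against.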

Open questions

The boundary between provenance and detection is the active research frontier. Watermarking schemes like SynthID can be read as either: a self-declared provenance signal that travels with the image, or a detection signal that does not require an external manifest. The C2PA coalition has been working through 2025 and into 2026 on how to express watermark-derived signals inside the manifest grammar without committing to any single watermark technology. That work is ongoing.

The other open question is what users see. A browser badge that says "verified" collapses everything into authentication. A panel that opens to a full edit history exposes provenance. Neither shows detection. The design choices made by Adobe, Microsoft, Google, and the browser vendors over the next two years will shape what the public understands "verified" to mean. The technical distinctions on this page are the substrate beneath that design problem.