A threat model is a structured description of who you are defending against, what they want, and what they will spend to get it. Without one, conversations about image authenticity collapse into vibes: "deepfakes are bad," "we should use C2PA," "watermarks can be removed." All three are true in some contexts and meaningless in others. The point of this page is to give a working vocabulary for the contexts.
The actors who manipulate images vary by orders of magnitude in capability, incentive, and patience. A bored teenager swapping faces with a free app is not the same threat as a state intelligence service running a multi-month influence operation. A scam ring producing nude deepfakes for extortion is not the same threat as a paparazzi outlet retouching a celebrity photo. Each requires a different defense posture, and the same technology — C2PA, SynthID, perceptual hashing — performs differently against each.
This page enumerates the threat categories that matter for image provenance in 2026 and assesses how the dominant defensive technologies fare against each. The framing borrows from security engineering: defenders should imagine the specific adversary they face before reaching for a tool. A tool that is excellent against casual deception may be useless against a determined nation-state, and a tool that hardens against the nation-state may impose costs that nobody else needs to pay.
The actor catalog
The casual repurposer
The most common manipulator is the user who reposts an old photograph as a current one, mislabels a stock image, or crops a screenshot to remove context. There is no synthesis involved and often no editing beyond cropping. The harm is informational — the image is real but the caption is wrong. Defenses here are more sociological than technical: reverse image search handles this case well, because the image already exists somewhere with correct context. Provenance helps if the original was credentialed but does not solve the case of an uncredentialed photograph being recaptioned.
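The reverse-image-search defense works because an unedited repost is trivially matchable against an index of known images. A minimal sketch of the underlying idea, using a difference hash (dHash) over a toy pixel grid — the 9x8 grid stands in for the downscaled grayscale image a real implementation would produce, and all names here are illustrative, not any particular search engine's API:

```python
# Toy difference-hash (dHash) sketch: a recaptioned but pixel-identical copy
# hashes exactly like the archived original, so a lookup against an index of
# known images succeeds even when the image carries no credentials at all.

def dhash(pixels):
    """64-bit dHash of a 9x8 grayscale grid (8 rows of 9 brightness values)."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):  # 8 comparisons per row
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Toy "images": the archived original and a recaptioned, unedited repost.
original = [[(r * 9 + c) % 17 for c in range(9)] for r in range(8)]
recaptioned = [row[:] for row in original]

assert hamming(dhash(original), dhash(recaptioned)) == 0  # exact match
```

A real pipeline resizes the image, converts to grayscale, and tolerates a small Hamming distance so that recompression or minor crops still match; the point is that the defense needs the image to already exist in the index, which is exactly why it fails on novel material.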
The hobbyist generator
A user with Stable Diffusion or a Midjourney subscription producing images for amusement, art, or low-stakes mischief. They typically do not strip watermarks or alter manifests; they share whatever comes out of the tool. Against this actor, generator-side watermarks like SynthID, Adobe Firefly's C2PA signatures, and metadata-preserving distribution channels are highly effective. The actor has neither the motivation nor the skill to defeat the controls. The problem is that any credential signal degrades through platforms that strip metadata, which means the defense is contingent on platform behavior, not on the actor.
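Why the defense is contingent on platform behavior can be made concrete with a toy model (not a real JPEG parser; segment names follow the JPEG marker convention, and the payload strings are invented): C2PA manifests ride inside the file as a metadata segment, and a platform that decodes the pixels and writes a fresh file silently drops every segment it does not explicitly copy.

```python
# Toy model of a platform re-encode: pixel data survives, metadata does not.
# C2PA embeds its manifest in a JPEG APP11 segment; a re-encoding pipeline
# that only carries pixels forward discards it without any adversarial intent.

def platform_reencode(segments):
    """Model a lossy re-encode: keep structural/pixel segments, drop metadata."""
    return [s for s in segments if s["marker"] in ("SOI", "SOS", "EOI")]

upload = [
    {"marker": "SOI"},                                  # start of image
    {"marker": "APP11", "payload": "c2pa-manifest"},    # Content Credentials
    {"marker": "SOS", "payload": "compressed-pixels"},  # pixel data
    {"marker": "EOI"},                                  # end of image
]
served = platform_reencode(upload)
assert not any(s["marker"] == "APP11" for s in served)  # credential is gone
```

Nothing in this failure mode involves the hobbyist doing anything: the credential was honestly attached and honestly lost in transit.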
The targeted harasser
Someone producing intimate or compromising synthetic images of a specific named person, usually for extortion, defamation, or revenge. This actor is highly motivated, often technically skilled, and increasingly likely to use open-weights models that produce no watermark. They will actively post-process to remove identifying signals. Defenses here depend on detection (which is brittle), platform-side perceptual hashing against known abuse datasets, and statutory remedies. The US state law page covers the patchwork of criminal statutes that target this case.
The scammer
Producers of synthetic images for fraud — fake celebrity endorsements driving crypto investment scams, fake KYC documents, fake property listings. Volume is high, sophistication varies, and the economic incentive scales with success. Watermarks help when scammers use the convenient commercial tools; they fail when scammers move to local open models. The defensive layer that has worked best is platform-side automated detection combined with rapid takedown, but that operates on imagery that has already been seen, not novel material.
The political operator
Actors producing synthetic images to influence elections, defame candidates, or stoke unrest. They range from amateurs (the 2024 fake Biden robocall, an audio rather than image case) to well-resourced state-aligned operations. The serious end of this category will use models with watermarks removed, signed forgeries if a credential ecosystem demands them, and laundering through multiple re-encodings before distribution. Defenses here have to assume the adversary will defeat any single technical layer; the operational answer has been multi-layer verification at media organizations rather than reliance on a tool.
The legitimate forger
A photographer who stages a scene and signs it honestly with their camera's credentials. This is the case that provenance is least equipped to handle, because the credential mechanism faithfully records what the camera saw, and the camera saw a staged event. Every news organization that has handled staged- or doctored-photograph scandals — the 2006 Reuters Adnan Hajj retouching controversy, the World Press Photo retroactive disqualifications — has dealt with this case, and a C2PA chain would not have caught any of them. The limitations page covers this in depth.
The state-aligned operator
Intelligence services running long-term influence campaigns or fabricating evidence for criminal or political prosecutions. This actor has the resources to compromise signing keys, run their own certificate authority, infiltrate a camera supply chain, or simply produce traditional photographs that match a desired narrative. No technical mechanism reliably defends against this adversary; the defenses are organizational, journalistic, and political. The honest position is that provenance raises the cost of operations against soft targets but does not prevent operations against hard targets.
How each defense maps to actors
| Actor | C2PA | Generator watermarks | Detection classifiers | Perceptual hashing |
|---|---|---|---|---|
| Casual repurposer | Helps if originals credentialed | Irrelevant (no AI) | Irrelevant | Strong (image already known) |
| Hobbyist generator | Strong if tool signs | Strong | Moderate | Strong once distributed |
| Targeted harasser | Weak (will strip) | Weak (open models) | Brittle | Strong post-incident |
| Scammer | Moderate | Moderate | Brittle at scale | Strong on reused material |
| Political operator | Weak alone | Weak alone | Brittle | Weak (novel material) |
| Legitimate forger | None (signs truthfully) | None | None | None |
| State-aligned operator | Weak (key compromise) | Weak | Weak | Weak |
Capability tiers and the cost of attack
Security engineering distinguishes adversary tiers by what they can spend. The framing transfers usefully to image manipulation. A tier-1 adversary has only consumer tools and uses default settings; a tier-2 has technical skill, open-weight models, and willingness to post-process; a tier-3 has organizational resources, persistence, and possibly insider access at one or more nodes of the trust chain. Each tier defeats a different defensive layer.
Tier-1 attacks are defeated by the mere availability of a credentialed pipeline. The attacker uses the credentialed tool because it is convenient, and the credential records the truth: this image was generated. The visible badge in the consumer surface — Adobe's CR icon, Microsoft's Content Credentials panel — does the rest. The defender wins as long as the consumer notices the badge.
Tier-2 attacks circumvent credentials by avoiding tools that produce them. The attacker runs an open-weight model with no signing, optionally re-encodes the output to strip any incidental metadata, and distributes through channels that don't add their own signals. Against this, the defender falls back to SynthID-style watermarks (if any) and to forensic detection. Neither is reliable; both are useful in aggregate.
Tier-3 attacks go after the trust roots themselves. The 2025 Nikon Z6 III incident — in which a signing flaw allowed forged C2PA manifests — is a tier-2/tier-3 boundary case. A tier-3 adversary running a state-issued certificate authority that issues C2PA signing certificates to its own operatives can produce arbitrarily many "legitimate" forgeries. The trust list architecture is the lever for excluding such CAs, but only after they are recognized as adversarial, which is a political act.
What this means for tool selection
The actionable consequence of the threat model is that no organization should adopt a single defensive technology without naming the adversary it is being adopted against. A newsroom worried about staged photographs needs editorial process, not C2PA. A platform worried about scam-image volume needs perceptual hashing and detection, not credentials. A camera maker worried about evidentiary defensibility for its journalist customers needs C2PA. A criminal court worried about fabricated evidence needs all of the above plus expert testimony.
The corollary is that the technology vendors marketing single-solution narratives — "C2PA solves AI deepfakes" or "our detector spots all synthetic media" — are misrepresenting what they sell. The honest version is that each layer raises the cost against some adversaries, and the defender's job is to assemble layers appropriate to the threats they actually face. The verification workflow page sketches what this looks like in practice for an end user.
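Assembling layers can be sketched as an aggregation problem. The following is a hypothetical sketch, not any real product's workflow: the class fields, thresholds, and verdict labels are all invented for illustration. The design point is that no single layer is trusted alone; a verdict requires either intact provenance or agreement between independent weak signals.

```python
# Hypothetical layered-verification sketch. Each field is a weak signal;
# the verdict aggregates them instead of trusting any single check.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Evidence:
    c2pa_valid: Optional[bool]        # None = no manifest present at all
    watermark_hit: Optional[bool]     # None = no decoder available / not run
    classifier_score: Optional[float] # 0..1 synthetic probability, None = not run
    hash_match: bool                  # matches an index of known images

def verdict(e: Evidence) -> str:
    # Intact provenance plus a known-image match is the strongest positive case.
    if e.c2pa_valid is True and e.hash_match:
        return "provenance-supported"
    signals = []
    if e.c2pa_valid is False:                      # manifest present but broken
        signals.append("broken-manifest")
    if e.watermark_hit:
        signals.append("watermark")
    if e.classifier_score is not None and e.classifier_score > 0.9:  # toy threshold
        signals.append("classifier")
    # Require at least two independent synthetic signals before escalating.
    if len(signals) >= 2:
        return "likely-synthetic"
    return "inconclusive" if signals else "no-signal"
```

Against a tier-1 actor the provenance branch fires; against a tier-2 actor the defender is left combining watermark and classifier signals; against tier-3 every input can be adversarial, which is why the output is a triage label for a human, not a final answer.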
Where the field is moving
Two trends shape the next few years. First, the cost of tier-2 attacks is collapsing as open-weights models improve. This pushes more of the defensive burden onto generator-side cooperation, which depends on regulation rather than technology. The EU AI Act's Article 50 marking requirement, applicable from 2 August 2026, is the first serious legal lever in this direction. Whether it produces actual marking or paper compliance will be visible by 2027.
Second, the cost of tier-3 attacks is also dropping, in a different way. Compromising a single signing key gives an adversary the ability to mint signed forgeries until the key is revoked. Hardware-backed signing in flagship phones (Pixel 10 on Titan M2, Galaxy S25 on Knox) raises the cost of physical extraction, but does nothing against an authorized signer who chooses to lie. The trust-list and revocation infrastructure will be tested in the next several years by deliberate adversarial signers; how the C2PA coalition handles those cases will reveal whether the trust model is credible against actors who play by the rules of the system in order to undermine it.
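The key-compromise window can be illustrated with a stdlib-only sketch, where an HMAC stands in for the X.509 signatures real C2PA deployments use and the key names are invented: a verifier must check both that the signature is valid and that the signer is still trusted, and revocation only stops future verifications — everything signed before the key was pulled verified cleanly at the time.

```python
# Hypothetical trust-list sketch (HMAC standing in for X.509 signatures).
# A compromised or dishonest authorized signer can mint valid signatures
# until its key is revoked; revocation closes the window going forward only.

import hashlib
import hmac

TRUST_LIST = {"press-cam-01": b"device-signing-key"}  # key id -> key material
REVOKED: set = set()

def sign(key_id: str, payload: bytes) -> str:
    return hmac.new(TRUST_LIST[key_id], payload, hashlib.sha256).hexdigest()

def verify(key_id: str, payload: bytes, sig: str) -> bool:
    if key_id not in TRUST_LIST or key_id in REVOKED:
        return False  # unknown or revoked signer fails regardless of signature
    expected = hmac.new(TRUST_LIST[key_id], payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

# A compromised key signs a fabricated scene, and it verifies cleanly...
forged = sign("press-cam-01", b"fabricated scene")
assert verify("press-cam-01", b"fabricated scene", forged)

# ...until the key is revoked, after which the same signature fails.
REVOKED.add("press-cam-01")
assert not verify("press-cam-01", b"fabricated scene", forged)
```

The sketch also shows what revocation cannot do: nothing in it distinguishes a stolen key from an authorized signer who chooses to lie, which is exactly the case hardware-backed signing does not address.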