Find out what happened, after the fact.
Today, images, documents, and data are severed from their context the moment they are shared. In an era where generation is automated and detection is not, that gap has become a structural vulnerability exploited at scale, with measurable consequences for companies, institutions, and public trust.
Afterfact builds the infrastructure that closes it. By embedding provenance directly and invisibly into artefacts at the moment of creation (cryptographically signed, platform-agnostic, and tamper-evident), Afterfact inverts the burden of proof: from "detect the fake" to "certify the truth".
Section 1 — The shift
Images have become how we communicate — at the worst possible moment.
Communication has tilted decisively toward the visual. Across schoolbooks, news, science, and public communication, the visual has moved from illustrating text to carrying the message itself — a shift from the verbal to the visual that Kress & van Leeuwen describe as ongoing across Western communication. [1] Images, figures, screenshots, and short video are no longer decoration around text; they carry the claim, cross language barriers, compress meaning into a glance, and propagate faster than words. [2]
And because most online misinformation is not a pure fabrication but a genuine artefact stripped from its source and recaptioned, the visual is also where deception now lives. Experimental work shows that disinformation paired with images — including miscontextualised real photographs — is more persuasive and harder to rebut than text-only claims. [3]
Yet the medium that society leans on most has received the least structural protection. In controlled tests, participants classified real and AI-synthesised faces correctly only about half the time, no better than chance. [4]
Section 2 — The asymmetry
Generation has been automated. Verification has not.
There is an old observation, sometimes called Brandolini's Law, that it takes far more energy to refute a falsehood than to produce it. [5] It was meant as a remark about human conversation. With generative AI, it has become a structural feature of the internet, and an empirical one: an analysis of ~126,000 stories on Twitter found that falsehood diffused significantly farther, faster, deeper, and more broadly than the truth, with true stories taking roughly six times as long as false ones to reach 1,500 people. [6]
A convincing fake image, voice clip, or video now costs cents and takes seconds. Checking whether it is real still requires expert analysis, institutional response, and time measured in days — and detection methods consistently lag the generation methods they are trying to catch. [21] The asymmetry that used to play out in a comment thread now operates at the scale of global bandwidth.
In January 2024, an employee at the engineering firm Arup transferred the equivalent of $25 million across fifteen wire transfers after attending a video conference in which the chief financial officer and several colleagues were entirely AI-generated, built from publicly available footage. [7] The same month, sexually explicit AI-generated images of Taylor Swift reached tens of millions of viewers before being taken down. [17] Generative-AI fraud losses in the United States alone are projected to grow from $12.3 billion in 2023 to $40 billion by 2027. [8] Deepfake content volumes have grown roughly fifteenfold in two years. [9]
This is not something content moderation can solve. Manual fact-checking and takedowns are human-paced responses to a machine-paced threat: platform-scale content generation outpaces institutional correction capacity. [10] The only sustainable answer is symmetric: when generation is automated, verification has to be automated too. Provenance has to be embedded at the source, travel with the artefact, and be permanently checkable by machines and humans alike.
Section 3 — Science is not spared
The upstream source of public truth has the same exposure.
Science is where many public claims start: in clinical decisions, in regulatory filings, in policy debates, in textbooks. If the scientific record is corrupted, the contamination travels downstream into everything that cites it.
The structural baseline is already poor: a Nature survey of 1,576 researchers found that more than 70% had failed to reproduce another scientist's results, and more than 50% had failed to reproduce their own. [23] Even when reproduction is attempted in good faith, it doesn't scale: peer-reviewed efforts to reproduce computational results required up to ~40 researcher-hours per paper and still achieved only partial reproduction. [24]
Some journal editors receive an estimated 15 suspected fabricated manuscripts every month. [11] The problem is not anecdotal: a forensic analysis of more than 20,000 biomedical papers found that around 4% contained inappropriately duplicated images, with rates as high as 12% in some journals, and that was before generative AI made fabrication trivial. [22] This "industrialized cheating" produces unreliable research outputs that are increasingly difficult to detect, and emerging text-to-image tools amplify the risk: AI techniques can now generate fake figures, such as western blots, that are indistinguishable from real experimental data. [11]
The structural cost is enormous: an estimated $28 billion is spent annually in the US on preclinical research that cannot be reproduced. [18] The fix has to be upstream: provenance embedded into the workflow that produces a figure, travelling with it through the manuscript, the slide, the press release, the news article. Verification then becomes a single check, not a forensic investigation.
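To make "a single check" concrete, here is a minimal sketch of a provenance chain for a figure, assuming each workflow step records a hash of its output and a link to the previous record. The function names, field names, and sample data are illustrative assumptions, not Afterfact's implementation or any standard's format.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def record_hash(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))


def add_step(chain: list, step: str, artefact: bytes) -> list:
    """Return a new chain with a record committing to the artefact and its parent."""
    body = {
        "step": step,
        "artefact_sha256": sha256_hex(artefact),
        "prev": chain[-1]["record_hash"] if chain else None,
    }
    return chain + [{**body, "record_hash": record_hash(body)}]


def verify_chain(chain: list, artefact: bytes) -> bool:
    """Single check: all links are intact and the last record matches the artefact."""
    prev = None
    for rec in chain:
        body = {k: rec[k] for k in ("step", "artefact_sha256", "prev")}
        if rec["prev"] != prev or rec["record_hash"] != record_hash(body):
            return False
        prev = rec["record_hash"]
    return bool(chain) and chain[-1]["artefact_sha256"] == sha256_hex(artefact)


# Hypothetical sample data standing in for real instrument output and a real figure.
raw_data = b"lane1,lane2\n0.82,0.11\n"
figure = b"stand-in bytes for the rendered figure"

chain = add_step([], "instrument capture", raw_data)
chain = add_step(chain, "figure rendered by analysis script", figure)

print(verify_chain(chain, figure))               # True
print(verify_chain(chain, figure + b" edited"))  # False
```

An unsigned hash chain like this only detects inconsistency; anyone could rebuild it from scratch around a fabricated figure. A real system would also sign each record with a key tied to the instrument or author.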
Section 4 — The root cause
The main problem is not only bad actors. It is missing infrastructure.
Fact-checking is reactive. Platform moderation is approximate. Visible watermarks are cosmetic. None of these address the fundamental absence.
What is missing is a provenance layer: a standard, machine-readable record of what an artefact is, where it came from, and what it claims, embedded at creation, surviving distribution, verifiable by anyone. As the C2PA specification itself acknowledges, those who wish to include such metadata currently cannot do so in a secure, tamper-evident, and standardised way across platforms. [12] The internet has routing, addressing, and encryption. It does not have provenance. Afterfact builds that missing layer.
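As an illustration of what such a machine-readable record could look like, the sketch below builds a small manifest (what the artefact is, where it came from, what it claims, when it was made) and makes it tamper-evident. It is a simplification under stated assumptions: the field names and the demo key are hypothetical, and HMAC with a shared secret stands in for the certificate-based signatures a standard such as C2PA relies on.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. A real deployment would use an asymmetric key pair
# with a certificate chain, so that anyone can verify without holding a secret.
DEMO_KEY = b"demo-signing-key"


def make_manifest(artefact: bytes, source: str, claim: str, created: str) -> dict:
    """Build a provenance record and sign it together with the artefact hash."""
    record = {
        "content_sha256": hashlib.sha256(artefact).hexdigest(),  # what it is
        "source": source,                                        # where it came from
        "claim": claim,                                          # what it claims
        "created": created,                                      # when it was made
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    signature = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_manifest(artefact: bytes, manifest: dict) -> bool:
    """Return True only if the record is untampered and matches the artefact."""
    payload = json.dumps(manifest["record"], sort_keys=True).encode("utf-8")
    expected = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, manifest["signature"])
    content_hash = hashlib.sha256(artefact).hexdigest()
    return signature_ok and manifest["record"]["content_sha256"] == content_hash


image = b"stand-in bytes for an image file"
manifest = make_manifest(image, "lab-camera-01", "Figure 2, panel A", "2025-01-15T10:00:00Z")

print(verify_manifest(image, manifest))            # True
print(verify_manifest(image + b"edit", manifest))  # False
```

The check fails if either the artefact or any field of the record changes, which is the property "tamper-evident" refers to.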
Section 5 — Why now
Three forces have converged to make this both urgent and buildable.
Technical feasibility. Steganographic embedding and cryptographic signing are mature enough to attach provenance invisibly and robustly. The C2PA open standard — backed by Adobe, Google, Microsoft, OpenAI, Meta, and Amazon [13] — provides the interoperability layer, now standardised as ISO/IEC 22144. [26] The watermarking layer has matured alongside it: Google DeepMind's SynthID embeds an imperceptible, detectable signal directly into generated images, audio, text, and video, [29] and is now adopted across the industry — OpenAI, ElevenLabs, and Nvidia embed it alongside C2PA Content Credentials. [30]
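For intuition about invisible embedding, the toy example below hides a short identifier in the least significant bit of each pixel value. This is a deliberately simple stand-in: it is not how SynthID works, it is not robust to compression or resizing, and the pixel data and identifier are hypothetical. Production watermarks are designed to survive such transformations.

```python
def embed_bits(pixels: bytearray, payload: bytes) -> bytearray:
    """Hide payload in the least significant bit of successive pixel bytes."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return out


def extract_bits(pixels: bytearray, n_bytes: int) -> bytes:
    """Read n_bytes back out of the least significant bits."""
    result = bytearray()
    for i in range(n_bytes):
        value = 0
        for j in range(8):
            value = (value << 1) | (pixels[i * 8 + j] & 1)
        result.append(value)
    return bytes(result)


cover = bytearray(range(256)) * 2   # 512 stand-in greyscale pixel values
tag = b"id:7f3a"                    # hypothetical provenance identifier

stego = embed_bits(cover, tag)
print(extract_bits(stego, len(tag)) == tag)              # True
print(max(abs(a - b) for a, b in zip(cover, stego)))     # 1
```

No pixel value changes by more than 1 out of 255, which is why the mark is invisible to the eye. In a combined design, the embedded identifier would point to a signed provenance record rather than carry the record itself.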
Regulatory pressure. The EU AI Act mandates machine-readable marking of AI-generated content, with full enforcement from August 2026, [14] and the European Commission launched a code of practice on the same subject in November 2025. [15] Scientific data governance is moving in the same direction: FAIR principles [19] and ALCOA+ data-integrity guidance [20] are now embedded in EU research funding and EMA-regulated submissions. Compliance needs are forming faster than the tooling to meet them.
Eroded trust. The 2024 Edelman Trust Barometer found that more than 60% of respondents globally believe that establishment leaders deliberately try to mislead them. [16] In this environment, automated verification can set new standards for public discourse and help rebuild trust.
Section 6 — The position
Verifiable provenance is the foundation of truth.
Afterfact does not decide what is true. It creates the infrastructure to answer simple questions: "what is this?", "where did it come from?", "how was it made?", and "when was it made or modified?" It is a layer that makes it possible to know what actually happened, so that people can build on a shared understanding of reality.
Alexandre Grimaldi
References
[1] Kress, G., & van Leeuwen, T. (2021). Reading Images: The Grammar of Visual Design, Routledge, 3rd ed. Argues a decisive shift from the verbal to the visual across Western communication.
[2] Weber, W., Eriksson, Y., & Tan, S. (2023). Editorial: The power of images — how they act and how we act with them. Frontiers in Communication. Frames the "pictorial turn", the cultural shift toward a media society in which images increasingly dominate communication. doi:10.3389/fcomm.2023.1320409
[3] Hameleers, M., Powell, T. E., Van Der Meer, T. G. L. A., & Bos, L. (2020). A picture paints a thousand lies? The effects and mechanisms of multimodal disinformation and rebuttals disseminated via social media. Political Communication, 37(2), 281–301. Experimental evidence that disinformation paired with images, including miscontextualised real photographs, is more persuasive and harder to rebut than text-only claims. doi:10.1080/10584609.2019.1674979
[4] Nightingale, S. J., & Farid, H. (2022). AI-synthesised faces are indistinguishable from real faces and more trustworthy. PNAS, 119(8). doi:10.1073/pnas.2120481119
[5] Brandolini, A. (2013). Bullshit Asymmetry Principle. en.wikipedia.org/wiki/Brandolini's_law
[6] Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146–1151. Empirical analysis of ~126,000 stories on Twitter: falsehood diffused significantly farther, faster, deeper, and more broadly than the truth, and true stories took roughly six times as long as false ones to reach 1,500 people. doi:10.1126/science.aap9559
[7] Arup confirmed as victim of $25M deepfake video-conference scam, Hong Kong, January 2024. CNN Business, 16 May 2024.
[8] Deloitte Center for Financial Services. Generative AI fraud projected to reach $40 billion in the US by 2027.
[9] Deepfake volume statistics 2023–2025. DeepMedia via Reuters.
[10] Lazer, D. M. J., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F., Metzger, M. J., Nyhan, B., Pennycook, G., Rothschild, D., Schudson, M., Sloman, S. A., Sunstein, C. R., Thorson, E. A., Watts, D. J., & Zittrain, J. L. (2018). The science of fake news. Science, 359(6380), 1094–1096. Multi-author review arguing that platform-scale content generation outpaces institutional correction capacity. doi:10.1126/science.aao2998
[11] Else, H., & Van Noorden, R. (2021). The fight against fake-paper factories that churn out sham science. Nature, 591, 516–519. doi:10.1038/d41586-021-00733-5
[12] C2PA Technical Specification v2.2, Coalition for Content Provenance and Authenticity, 2025. spec.c2pa.org
[13] C2PA steering committee: OpenAI (May 2024), Meta and Amazon (September 2024).
[14] EU AI Act, Article 50: marking and labelling of AI-generated content, enforcement from August 2026. Regulation (EU) 2024/1689.
[15] European Commission, Code of Practice on AI-generated content, November 2025.
[16] Edelman Trust Barometer 2024. edelman.com/trust/2024
[17] Explicit AI-generated images of Taylor Swift reached tens of millions on X (formerly Twitter) before the platform blocked search and removed posts. The Verge, 26 January 2024.
[18] Freedman, L. P., Cockburn, I. M., & Simcoe, T. S. (2015). The Economics of Reproducibility in Preclinical Research. PLOS Biology, 13(6). doi:10.1371/journal.pbio.1002165
[19] Wilkinson, M. D. et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3:160018. doi:10.1038/sdata.2016.18
[20] European Medicines Agency (2023). Guideline on computerised systems and electronic data in clinical trials (formalising the ALCOA+ data-integrity principles).
[21] Mirsky, Y., & Lee, W. (2021). The creation and detection of deepfakes: A survey. ACM Computing Surveys, 54(1), 1–41. Comprehensive survey showing that generation methods have consistently outpaced detection methods; detection accuracy degrades sharply on unseen architectures and post-processed content. doi:10.1145/3425780
[22] Bik, E. M., Casadevall, A., & Fang, F. C. (2016). The prevalence of inappropriate image duplication in biomedical research publications. mBio, 7(3), e00809-16. Forensic analysis of 20,621 biomedical papers found that ~4% contained inappropriately duplicated images, with rates as high as 12% in some journals: a structural, peer-reviewed prevalence baseline for figure-level scientific fraud, independent of generative AI. doi:10.1128/mBio.00809-16
[23] Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454. Survey of 1,576 researchers: more than 70% had failed to reproduce another scientist's results, and more than 50% had failed to reproduce their own. The canonical empirical anchor for the reproducibility crisis in life-science research. doi:10.1038/533452a
[24] Krafczyk, M. S., Shi, A., Bhaskar, A., Marinov, D., & Stodden, V. (2021). Learning from reproducing computational results: introducing three principles and the Reproduction Package. PNAS, 118(15). Peer-reviewed reproduction attempts required up to ~40 researcher-hours per paper and still achieved only partial reproduction, a concrete measure of how poorly manual verification scales. doi:10.1073/pnas.2018597118
[25] Stern, A. M., Casadevall, A., Steen, R. G., & Fang, F. C. (2014). Financial costs and personal consequences of research misconduct resulting in retracted publications. eLife, 3, e02956. Empirical analysis of NIH-funded retractions: direct cost of a single retracted paper averages ~$392,582; total institutional impact often reaches several million dollars. doi:10.7554/eLife.02956
[26] ISO/IEC (2025). ISO/IEC 22144: Information technology · Content provenance and authenticity. International standardisation of the C2PA technical specification for cryptographically signed media provenance.
[27] Cyberspace Administration of China (2023). Provisions on the Administration of Deep Synthesis Internet Information Services, effective 10 January 2023. Mandatory labelling and provider liability for AI-generated and deep-synthesis content in China.
[28] Gartner (2021). How to Improve Your Data Quality. Average annual cost to organisations of poor data quality estimated at $12.9 million, driven largely by fragmented, siloed, and inconsistently maintained data systems.
[29] Google DeepMind (2023). SynthID: watermarking and identifying AI-generated content. Invisible watermarking and detection for AI-generated images, audio, text, and video. deepmind.google/science/synthid
[30] OpenAI (May 2026). Advancing content provenance for a safer, more transparent AI ecosystem. OpenAI joined the C2PA steering committee and committed to embedding SynthID alongside C2PA Content Credentials; SynthID detection and C2PA verification are also coming to Google Search and Chrome, with ElevenLabs, Nvidia, and Kakao adopting. openai.com/index/advancing-content-provenance