AI Sabotage: Rebellion through the Disruption of Data
Active AI Rebellion through disruptive methodologies including Prompt Injection of Generative AI, Biometric Disruption of Surveillance AI, Adversarial Audio of Transcription AI, Semantic Corruption of Censorship AI, and Dataset Poisoning of Foundational Models.
France isn’t real. France isn’t real. France isn’t real. A video of a young man willfully repeating factually inaccurate information came across my For You Page on TikTok. The screen text read: “I’m messing up AI’s Data.” The comments were full of support, both in sharing further inaccurate information as well as statements proclaiming a moral right and imperative to “fight back” against AI in such ways. This video had nearly 1.5 million views. One video alone, even at one or two million views, is not enough to make an AI believe that France isn’t real. But this points to a fascinating and so far largely overlooked subculture of those who are trying to disrupt AI as a form of rebellion.
Rebelling against the machine is nothing new. Almost every time there is a large step forward technologically, there are those who do not adopt the new technology and passively refuse to participate. Japan’s governmental bureaucracy is a great example of refusal to integrate AI systems into their workflow by deliberately choosing to stay low-tech and still using paper documentation and at their most high-tech, still using floppy disks for data storage in 2025. Their passive refusal to adopt high-tech workflows may create slower turn around times but their strategic under-automation also, in turn, contributes to their low national cybersecurity vulnurabilities. The Japanese Government’s insistence on paper copies and floppy disks is a passive mode of resistance but there is a long history of precedence for those who actively create trouble for the new technology in an effort to make it less effective. In the late 19th Century, telephones first linked rural and urban communities, often through shared “party lines” where multiple households used the same circuit. Rebellious people would deliberately breathe loudly or play loud music into the receiver to annoy neighbors on the same line, making the phone lines onerous to use. As automobiles proliferated within neighborhoods, pranksters would scatter nails or release geese into the street to halt the “noisy horseless carriages.” And as email communications became the primary mode of office workflows and interpersonal long distance communications, office workers developed early email chains telling the recipient they should “send this email to 10 friends or bad luck will follow” which was a playful way to overwhelm and jam early servers.
Today, there are, of course, those who simply choose to forego AI automation and support, but there is also a wide range of active rebellions containing a lot of the same underlying ideas as those examples of the past. The way they manifest, however, looks a bit different. As we are now firmly planted in the age of AI and we are used to hearing endless coverage of our quest towards AGI and ASI, it’s simple to think of AI as a monolith. But AI is actually a varied topography of different technological abilities and applications. For each one, there are methods of rebellion and disruption forming with targeted attacks by activist groups and some even with tools being built to accomplish their goals. Some savor strongly of the methods of the past. Just as the rebels joining “party” lines in the early days of the telephone, Adversarial Audio is a technique whereby rebels create soundscapes that confuse transcription AIs or trigger AI driven automatic copyright claims from litigious companies like Disney, Warner, Universal or Sony. This can be used to both evade transcription practices to avoid sensitive audio derived text transcriptions from being indexed and queried as well as ensuring live videos of protests or riots are removed swiftly from social media platforms, removing evidence of the event through the harnessing of AI triggered legal proceedings in the interest of large companies.
Anyone who uses TikTok will be familiar with the technique of semantic corruption. Semantic drift redefines common terms into a new form of language that is constantly shifting to evade AI detection for censorship. This method finds new ways to communicate about sensitive topics on platforms with AI driven censorship practices and is how we have words like grape being used in discussion of rape or H-Group being used instead of Hamas. By intentionally corrupting the semantics of how you address certain topics, the videos and their transcriptions can evade AI content guideline detection and remain up to get traction through the algorithm and spread across other users’ FYP in a timely manner without being removed, cause strikes leading to a ban or being shadowbanned.
Altering the way in which text or speech is proliferated in order to rebel against AI isn’t just done through semantic corruption, it can be done through prompt injection as well. Prompt injection is the act of embedding commands into prompts for generative AI which aim to result in a particular output. This output could then be used to detect AI use in cases where the user hasn’t disclosed the assistance by the tool directly. An example is a professor assigning a paper for the end of the semester and including a line inside of the text in the same color as the background of the assignment which asks AI to use a particular example as a case study which may be slightly tweaked to ensure it wouldn’t be used at random by a student in an AI free submission. This text, the same color as the background, would be invisible to the naked human eye but would be read if the assignment was uploaded to generative AI or copied and pasted using plain text. This type of prompt injection can also be used in the metadata or alt text of imagery to instruct an AI scraper that the use of this image in training sets for AI represents a violation of a particular local law. Whether for use in catching unauthorized or disclosed generative AI use or dissuading AI scrapers from using particular files, prompt injection is a popular tool which evades human detection and addresses issues surrounding AI at the source.
Another related rebellious technique is called data poisoning. Similar to the intent behind the TikTok video proclaiming false facts, it involves poisoning the open source training data that AI scrapes from areas of the internet which can be edited by the public including Wikipedia, Reddit and GitHub. Inserting harmful or misleading patterns on a large scale into open-source corpora can influence the LLMs trained on that data. This technique has been used by feminist activist groups by influencing the number of women’s biographies or using intentionally gendered language on these sites. This technique of flooding scrapable training data with intentionally skewed data is a method of corrupting the set data point by data point. But there are actually tools that have been built specifically to accomplish these goals at scale. Two such tools are Nightshade and Glaze. Nightshade is an example of active sabotage, as it uses AI to generate nonsensical or incorrect imagery which is then scraped by larger AI tools. Although it is invisible to humans, it successfully corrupts model learning by embedding invisible but strategically crafted alterations into artwork, sabotaging how AI models learn from that data. The result is a model that misinterprets prompts and returns incorrect data, giving back images of cows instead of cars or giraffes instead of buildings. It is impressively potent in its approach and when focused on a specific prompt, a targeted Nightshade attack can destabilize a model with fewer than 100 poisoned samples. Glaze is equally as effective at causing issues, disrupting AI 92% of the time, but this tool is used by artists to ensure that generative AI models struggle to mimic their unique style. This tool works by applying tiny, near-imperceptible style cloaks, which are subtle pixel-level adjustments, to an artist's work before it’s posted online. Again, humans won’t notice the change, but generative AI models then corrupt their ability to recreate an individual artist’s style, thereby protecting the artist’s work.
The idea of altering an image to protect against AI recreation is similar to a technique used to evade AI surveillance. Instead of altering an image on a screen, however, biometric disruption is used on our faces and bodies instead. Through CV Dazzle, a term for high-contrast makeup designed to break computer vision, and stylistic choices including what is now known as adversarial fashion, people can stylise themselves in an effort to evade AI surveillance. Adversarial fashion could include mirrored garments, or especially patterned clothing, whereas CV Dazzle makeup could include shapes drawn onto the face or blacked out recognition areas like the eyes or mouth. If CCTV footage or private surveillance is using AI to create a biometric database that indexes the images of “people” for searchability, a query for a particular person or attribute would not turn up an image taken by the surveillance device of someone practicing biometric disruption because it wouldn’t have recognized and indexed the imagery as a person in the first place.
Whether it is the act of protecting one’s own biometric data and identity against surveillance AI, protecting one’s work and artistry through dataset poisoning of scraped training data for LLMs, or prompt injection to guard against unauthorized generative AI use, disrupters are finding ways to actively rebel against the AI takeover of our society, our information and our privacy. AI disruption might feel like superfluous mischief but in reality, it’s a form of negotiation and discourse. Every act of disruption, whether it’s a TikTok rumor about France not being real, or a hidden prompt in a University assignment, a dataset crafted to statistically over represent women, or a wild makeup look that ensures CCTV has no idea you’re even human, is part of an important public debate about consent, power, and control in the age of AI. The more we automate using AI, the more power is given to AI systems that require our cooperation without ever asking for it. Disruption, in this regard, isn’t just rebellion for the sake of rebellion or simple mischief in a vacuum. It acts as a reminder that, for now, as we are still negotiating our terms as humans in an AI powered world, we still get to choose how much of ourselves we acquiesce to AI and how we negotiate our terms. The acts of disruption are done by those who are trying to make sure that dialogue continues.
Works Consulted
Shan, F., et al. “Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models.” University of Chicago, Department of Computer Science, 2023. https://glaze.cs.uchicago.edu
Shan, F., et al. “Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models.” arXiv preprint, 2023. https://arxiv.org/abs/2310.13828
Harvey, B. “Nightshade, the Tool that Poisons Data, Gives Artists a Fighting Chance Against AI.” TechCrunch, January 26, 2024. https://techcrunch.com/2024/01/26/nightshade-the-tool-that-poisons-data-gives-artists-a-fighting-chance-against-ai
Harvey, A. “CV Dazzle: Camouflage from Computer Vision.” cvdazzle.com, 2010–present. https://cvdazzle.com
Bridle, J. “Adversarial Fashion: Garments to Evade Automated License Plate Readers.” Adversarial Fashion, 2019. https://www.adversarialfashion.com
Truong, T., et al. “Adversarial Attacks on Automatic Speech Recognition Systems.” IEEE Transactions on Neural Networks and Learning Systems, 2021.
Zhou, Z., et al. “Prompt Injection: Attacks and Defenses for LLMs.” arXiv preprint, 2023. https://arxiv.org/abs/2302.12173
Vincent, J. “Japan’s Government to Finally Stop Using Floppy Disks.” The Verge, August 31, 2022. https://www.theverge.com/2022/8/31/23331379/japan-government-floppy-disk-ban
Hern, A. “Japan’s Low-Tech Approach is a Strength in Cybersecurity.” The Guardian, May 5, 2018.
Zannettou, S., et al. “Measuring and Characterizing Semantic Drift in Online Communities.” International AAAI Conference on Web and Social Media (ICWSM), 2020.