Researchers Warn: 'Poisoning' the Internet Threatens Behavior of Models Like ChatGPT

Researchers have warned that language AI models, such as ChatGPT and Gemini, can be manipulated by introducing misleading texts on the internet — known as 'data poisoning' — leading to the production of incorrect or ambiguous content.
Summary of Findings
Teams from the UK Centre for AI, the Alan Turing Institute, and Entropic conducted a training experiment that showed that introducing about 250 contaminated documents is sufficient to negatively impact the outputs of the models. After that, the models produced vague and unreliable texts, demonstrating the ease with which malicious actors can influence the behavior of systems.
How is the attack carried out?
The attack relies on spreading fake or contaminated articles and posts in public places on the internet (personal websites, blogs, Wikipedia, etc.), making this material part of the dataset that is later used to train or update the models. According to the researchers, creating about 250 contaminated articles may be enough to change the model's behavior.
Why is this dangerous?
Most models are trained on public data from the internet, so any forged content becomes a potential source for learning.
Data poisoning undermines reliance on AI for sensitive tasks (medical, legal, security).
The attack is relatively easy to execute and its risks are widespread because victims may not quickly detect the manipulation.
Recommendations from Researchers and Expected Impacts
Researchers call for:
Strengthening filtering and validation mechanisms for data sources before using them in training.
Developing tools to detect contaminated content and mechanisms to trace the source of data.
Imposing strong transparency standards in AI model update processes.
Researchers point out that failing to take effective action may limit the safe reliance on AI in vital areas.