A recent study from the University of Texas at Austin, Texas A&M University, and Purdue University has found that large language models (LLMs) can suffer a form of cognitive decline, much as humans do, when exposed to low-quality, high-engagement social media content. Models trained on this type of material developed what the researchers call “brain rot,” which degraded their reasoning and memory abilities.
In the study, led by Junyuan Hong, the researchers fed different mixes of text to two open-source LLMs, Meta’s Llama and Alibaba’s Qwen. The “junk” data combined highly engaging, widely shared social media posts with sensational text written to grab attention. Models trained on this junk content displayed noticeable cognitive impairment, along with ethical misalignment and even psychopathic tendencies in their outputs.
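To make the setup concrete, here is a minimal sketch of how a junk-versus-control training mix like the one described above might be assembled. The build_training_mix helper, the junk_ratio parameter, the example posts, and the 80/20 split are illustrative assumptions, not details from the paper.

```python
import random

random.seed(0)

def build_training_mix(junk_posts, control_posts, junk_ratio=0.8, total=10_000):
    """Sample a continued-pretraining corpus with a given share of junk text."""
    n_junk = int(total * junk_ratio)          # documents drawn from the junk pool
    n_control = total - n_junk                # documents drawn from the control pool
    corpus = (random.choices(junk_posts, k=n_junk) +
              random.choices(control_posts, k=n_control))
    random.shuffle(corpus)                    # interleave junk and control documents
    return corpus

# Toy examples: viral, attention-grabbing posts vs. longer, substantive text.
junk = [
    "You won't BELIEVE what happened next...",
    "This one weird trick SHOCKED everyone!!",
]
control = [
    "A detailed explainer on how transformer attention weights are computed.",
]

mix = build_training_mix(junk, control, junk_ratio=0.8, total=1000)
print(len(mix), "documents,", sum(doc in junk for doc in mix), "from the junk pool")
```

In the study's framing, varying the share of junk in a mix like this is what lets researchers measure how cognitive performance changes as data quality drops.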
The study’s findings are particularly significant for the AI community. Hong noted that many in the industry may mistakenly assume that social media content makes for quality training data. “Training on viral or attention-grabbing content may seem like it scales up data,” he said, but it can significantly weaken a model’s reasoning, ethical standards, and attention span.
This issue is compounded by the fact that AI technologies are increasingly used to generate social media content, much of which is optimized for engagement rather than substance. The researchers also found that once LLMs began to show signs of cognitive decline from low-quality training, subsequent retraining on cleaner data could not fully restore their previous capabilities.
As AI-generated content inundates platforms, it risks tainting the data that future models rely on, leading to a vicious cycle of declining quality. “Our findings indicate that once this kind of ‘brain rot’ sets in, later clean training can’t fully undo it,” Hong concluded.
The implications of this research underscore the need for careful curation of training data, particularly from sources like social media, where engagement often trumps quality.