AI Models Get Brain Rot, Too
      
      Source: Wired
-snip-
A new study from the University of Texas at Austin, Texas A&M, and Purdue University shows that large language models fed a diet of popular but low-quality social media content experience a kind of brain rot that may be familiar to anyone who has spent too long doomscrolling on X or TikTok.
"We live in an age where information grows faster than attention spansand much of it is engineered to capture clicks, not convey truth or depth, says Junyuan Hong, an incoming assistant professor at the National University of Singapore who worked on the study as a graduate student at UT Austin. We wondered: What happens when AIs are trained on the same stuff?
Hong and his colleagues fed different kinds of text to two open-source large language models during pretraining. They examined what happened when the models were fed a mix of highly engaging, or widely shared, social media posts and ones that contained sensational or hyped text like "wow," "look," or "today only."
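To make that setup concrete, here is a minimal, hypothetical Python sketch of how a corpus of posts might be sorted into a "junk" bucket (short but highly engaging, or containing hype phrases) and a control bucket before continual pretraining. The field names, engagement threshold, and keyword list are illustrative assumptions, not the study's actual pipeline.

# Hypothetical sketch: split posts into "junk" vs. control pretraining data.
# Thresholds, field names, and keywords are illustrative assumptions, not the
# study's actual criteria.

SENSATIONAL_MARKERS = ("wow", "look", "today only")  # hype phrases cited in the article

def is_junk(post: dict) -> bool:
    """Flag a post as 'junk' if it is short but highly engaging, or sensational."""
    text = post["text"].lower()
    highly_engaged = (post["likes"] + post["shares"]) > 500 and len(text.split()) < 30
    sensational = any(marker in text for marker in SENSATIONAL_MARKERS)
    return highly_engaged or sensational

def split_corpus(posts: list[dict]) -> tuple[list[str], list[str]]:
    """Separate a corpus into junk and control text for continual pretraining."""
    junk = [p["text"] for p in posts if is_junk(p)]
    control = [p["text"] for p in posts if not is_junk(p)]
    return junk, control

if __name__ == "__main__":
    sample = [
        {"text": "Wow, look at this deal, today only!", "likes": 900, "shares": 400},
        {"text": "A longer post walking through how context windows work in transformers, with citations.", "likes": 12, "shares": 1},
    ]
    junk, control = split_corpus(sample)
    print(f"{len(junk)} junk posts, {len(control)} control posts")

In the study's framing, models continually pretrained on the first bucket are then compared against models trained on the second; the code above only illustrates the sorting step.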
-snip-
The models fed junk text experienced a kind of AI brain rot, with cognitive decline including reduced reasoning abilities and degraded memory. The models also became less ethically aligned and more psychopathic, according to two measures.
-snip-
Read more: https://www.wired.com/story/ai-models-social-media-cognitive-decline-study/     
Much more at the link.
The study itself is here: 
https://llm-brain-rot.github.io/
In this work, we introduced and empirically validated the LLM Brain Rot Hypothesis, demonstrating that continual exposure to junk data, defined as engaging (fragmentary and popular) or semantically low-quality (sensationalist) content, induces systematic cognitive decline in large language models. The decline includes worse reasoning, poorer long-context understanding, diminished ethical norms, and emergent socially undesirable personalities.
Fine-grained analysis shows that the damage is multifaceted, changing the models' reasoning patterns, and is persistent against large-scale post-hoc tuning. These results call for a re-examination of current data collection from the Internet and continual pre-training practices. As LLMs scale and ingest ever-larger corpora of web data, careful curation and quality control will be essential to prevent cumulative harms.