Parallels Between Biological and Artificial Brains: Isolation vs. Recursive Training

by Tiago Freitas and Eliot Mannoia // November 10, 2024
reading time: 5 minutes

[Image: a brain, with AI as a second brain. © Leonardo.ai, prompted by Eliot Mannoia]

We explore the intriguing parallels between human cognitive degradation in isolation and the deterioration of AI models trained on recursively generated data. By examining cases of sensory deprivation, such as Admiral Byrd’s documented experiences in Antarctica, alongside the behaviour of AI systems trained on self-generated content, we uncover similarities in how both human and artificial intelligence systems process information, and how both degrade when that information runs dry. Both systems require diverse, external inputs to maintain reliability and avoid perceptual distortions. These shared vulnerabilities not only highlight the risks of isolated or recursive learning but also suggest that current AI architectures may be replicating fundamental aspects of human cognitive processes. This understanding has important implications for both AI development practices and our comprehension of human cognitive requirements.

The Continued Emergence of AI

The modern wonder of Artificial Intelligence (AI) is without a doubt transforming the connected space we call the internet. Its ability to generate (or help generate) useful content, such as translations or summaries of complex subjects, is steadily replacing the common webpage with generated text. However, these advancements come with a critical challenge: the risk that future machine learning efforts will be trained on AI-generated data. This article explores the gravity of this issue while drawing parallels to a comparable situation: the isolation of the human brain.

Artificial Intelligence systems are applied programming concepts aimed at imitating human cognitive processes and decision-making abilities. These models are built by training AI “brains” on vast amounts of information; the Large Language Models (LLMs) behind OpenAI’s ChatGPT or Google’s Gemini, for instance, leverage extensive existing datasets to “learn” how to recognise patterns, generate coherent text, and respond to user inputs. These datasets consist predominantly of human-written webpages and text extracted from a variety of mostly online sources, including books, articles, social media posts, and other written materials.
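To make the idea of “learning patterns from text” concrete, here is a deliberately toy sketch in Python: a word-level bigram model that counts which word tends to follow which, then samples new text from those counts. Real LLMs use neural networks trained on billions of documents rather than simple transition counts, so treat this only as an analogy for how the statistics of human writing become a generator.

```python
from collections import Counter, defaultdict
import random

# Toy illustration of "learning patterns from text": a word-level bigram
# model that simply counts which word tends to follow which.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # tally each observed word transition

def generate(start="the", length=8):
    """Sample a continuation by repeatedly picking a plausible next word."""
    word, out = start, [start]
    for _ in range(length):
        followers = counts[word]
        if not followers:
            break  # dead end: this word was never followed by anything
        word = random.choices(list(followers), weights=list(followers.values()))[0]
        out.append(word)
    return " ".join(out)

print(generate())  # e.g. "the cat sat on the rug . the dog"
```

The model can only ever recombine patterns it has seen, which is exactly why the quality and diversity of the training text matter so much.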

The Contamination of Data

These fascinating capabilities are now polluting the blank canvas that once was the pre-AI internet through the lucrative, fast-paced production of machine-generated webpages. A recent study by McKinsey & Company reports that 65% of its respondents admitted that their organisations regularly use generative AI tools when producing text (Singla et al., 2024). Similarly, research by Amazon Web Services reports that a large share of web-based text – up to 57% of the sentences in one multilingual sample – appears to have been translated by machine (Thompson et al., 2024).

This so-called “contamination” of the web is inadvertently preventing AI agents from being further trained on new web data, so as not to “corrupt” the already trained algorithms.

“But why is training on generated data so dangerous?”, some may ask. Shumailov and colleagues (2024) demonstrated in the paper “The Curse of Recursion: Training on Generated Data Makes Models Forget” that, over generations of AI training on its own generated content, data went from useful, to less useful, and eventually to complete gibberish. This means there is an inherent danger to building up-to-date AI models when generated content is nearly indistinguishable from human-made content.

Over time, these models may end up being trained on AI-produced content, creating recursive layers across several generational iterations and causing model weights and parameters to deteriorate a little further with each cycle. In other words, once useful and incredibly expensive quasi-intelligent programs could become progressively less effective and more prone to the already familiar problems of hallucinations and erratic behaviour.
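A minimal numerical sketch of this feedback loop (an illustrative toy, not Shumailov et al.’s actual language-model experiments) fits a simple Gaussian model to data, samples a fresh “training set” from the fitted model, refits, and repeats. Because each generation sees only the previous model’s samples, estimation error compounds instead of averaging out, and the distribution drifts while its tails wither:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Generation 0: "human" data drawn from the true distribution N(0, 1).
data = rng.normal(loc=0.0, scale=1.0, size=200)

for generation in range(20):
    # "Train" a model: here, just estimate the mean and standard deviation.
    mu, sigma = data.mean(), data.std()
    print(f"generation {generation:2d}: mean = {mu:+.3f}, std = {sigma:.3f}")
    # The next generation trains only on samples drawn from the fitted
    # model, so estimation error compounds instead of averaging out.
    data = rng.normal(loc=mu, scale=sigma, size=200)
```

With small per-generation samples the drift is fast; larger samples slow it but do not eliminate it, much as subtle biases in generated text would accumulate across successive model generations.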

The Impact of Data Degradation

Such risks raise the question of the sustainability of today’s AI training practices, given the ever-shrinking supply of quality human data. As a last-ditch effort to stay up to date, AI companies may need to explore other sources of real data, and hope that these too remain uncompromised.

Interestingly, when we investigate the inspiration behind AI systems – the human brain – similar patterns emerge in environments of isolation and deprivation. Just as AI models degrade with the declining quality of recursive training data, the human psyche suffers significantly under the effects of sensory deprivation, demonstrating how both systems thrive on diverse input and deteriorate when faced with lacking or overly repetitive stimuli.

Throughout our lives, our brains are fed a steady stream of fresh and varied stimuli, and it is only when these are missing that we recognise their importance. When information is absent, as in isolation or severe sensory deprivation, faults begin to form in our reasoning over time. Several psychological studies have examined people who experienced such environments for extended periods, both voluntarily and involuntarily, with revealing results. A famous example is the explorer Admiral Byrd (Solomon et al., 1957), who spent several months alone in Antarctica while documenting his experiences: intense feelings of boredom and emotional withdrawal at first, and eventually, over time, hallucinations and severe psychological distress – symptoms remarkably similar to those of AI models trained on increasingly homogeneous self-generated data.

In such extreme scenarios, as in the case of Admiral Byrd, the human brain begins to ‘retrain’ itself using its own internal thoughts and memories as input data, much as a machine learning system generates outputs based on its own previous generations. In both cases, when starved of external validation and reality-based input, these systems begin to create meaningless patterns and misperceptions, ultimately losing touch with reality.

These comparisons reveal intriguing similarities between artificial and human brains, particularly in how both systems process information. While AI systems are still far from being a perfect replica of the human brain, the shared vulnerabilities to recursive learning suggest that we are gradually developing systems that mirror fundamental aspects of human cognition.

Moving Forward

As machine learning advances and demonstrates its enormous potential to transform both digital and physical worlds, the increasing reliance on AI-generated content raises valid concerns about the long-term reliability of AI training processes.

Just as human psychology suffers from isolation and repetitive stimuli, AI models face similar risks of degradation when trained on limited data. This parallel demonstrates the crucial need for diverse and original training data in AI development. Rather than viewing these shared vulnerabilities as weaknesses, we should interpret them as evidence that technology is inching closer to replicating aspects of human intelligence, while simultaneously highlighting the remarkable complexity of our minds.

#brandkarma #digitalpsychology #artificialintelligence #emotionalintelligence #data

Sources:

Solomon, P., Leiderman, P. H., Mendelson, J., & Wexler, D. (1957). Sensory deprivation: A review. American Journal of Psychiatry, 114(4), 357–363.

Fan, J., Fang, L., Wu, J., Guo, Y., & Dai, Q. (2020). From brain science to artificial intelligence. Engineering, 6(3), 248–252. https://www.sciencedirect.com/science/article/pii/S2095809920300035

Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., & Anderson, R. (2024). The curse of recursion: Training on generated data makes models forget. arXiv. https://arxiv.org/abs/2305.17493

Thompson, B., Dhaliwal, M. P., Frisch, P., Domhan, T., & Federico, M. (2024). A shocking amount of the web is machine translated: Insights from multi-way parallelism. arXiv. https://arxiv.org/abs/2401.05749

Singla, A., Sukharevsky, A., Yee, L., & Chui, M. (2024, May 30). The state of AI in early 2024: Gen AI adoption spikes and starts to generate value. McKinsey & Company. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai