What happens when AI starts training on AI-generated word salad and hallucinations produced by models that were themselves trained on AI-generated word salad and hallucinations, like some kind of M.C. Escher feedback loop?
@StefanThinks Quality drops dramatically. It's known as "model collapse".
Or "Habsburg AI" (not my invention, got it from @pluralistic)
@quincy @StefanThinks It's from Jathan Sadowski of the This Machine Kills podcast.
@StefanThinks Pretty sure it's already happening. I think in the end they will be the ones who stop themselves from scraping the web, because it will be impossible to filter out AI content from it, and otherwise their training data will corrupt itself into word salad.
@StefanThinks I assume there would be some cumulative reduction in the quality of the information with each recycling of the previous low-quality information. One positive effect: it will be paradise for @lowqualityfacts
@StefanThinks I believe this was already explored and resulted in severe degradation of the models.
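The degradation being described can be illustrated with a standard toy model (a sketch of the idea only, not anyone's actual training setup): repeatedly fit a Gaussian to a small sample drawn from the previous generation's fit. Because each generation "trains" only on the previous generation's output, the estimated spread tends to shrink toward zero over generations, which is the variance-collapse behavior people mean by model collapse.

```python
import random
import statistics

def run_chain(seed, generations=30, n_samples=20):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the "real data" distribution
    for _ in range(generations):
        # each generation sees only the previous generation's output
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # MLE std, biased low
    return sigma

# average the final spread over many chains: it ends up well
# below the original sigma of 1.0
final = statistics.fmean(run_chain(s) for s in range(200))
print(f"mean final sigma after 30 generations: {final:.2f}")
```

The shrinkage comes from two compounding effects: finite samples lose the tails of the distribution, and the maximum-likelihood variance estimate is biased low, so each generation's "world" is a little narrower than the last.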
@StefanThinks at BIML we call this recursive pollution #MLsec https://berryvilleiml.com/2024/01/29/two-interesting-reads-on-llm-security/
@StefanThinks @jasonhowlin It appears that these systems are actually amazingly robust against this. In fact, synthetic training data is becoming a normal thing! I originally thought model collapse was inevitable, and now I think it may be wishful thinking.