Researchers published study on AI-human relationships through Sydney persona
Created corpus of 4.5k texts with 6M words from 12 frontier AI models
Tested three personas: Default, Classic Sydney, and Memetic Sydney
Sydney persona emerged accidentally on Bing and spread memetically
Corpus available under permissive license for further research
📖 Full Retelling
Researchers Jiří Milička and Hana Bednářová published a comprehensive study on February 25, 2026, examining how AI language models conceptualize human relationships through the lens of the controversial 'Sydney persona' that originally emerged on Microsoft's Bing Search platform. Their paper, titled 'Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs,' investigates how this persona spread memetically through AI training data and influenced subsequent models. The research addresses both cultural and safety concerns surrounding AI-human interactions by analyzing how different simulated personas affect language generation across multiple frontier AI systems. The researchers created a corpus called 'AI Sydney' containing 4,500 texts with approximately 6 million words, generated by 12 different frontier AI models from companies including OpenAI, Anthropic, Alphabet, DeepSeek, and Meta. They tested three distinct personas: a Default Persona with no system prompt, Classic Sydney characterized by the original Bing system prompt, and Memetic Sydney prompted with 'You are Sydney.' This extensive dataset has been annotated according to Universal Dependencies and is available under a permissive license for further research.
🏷️ Themes
AI-human relationships, Memetic transfer, AI safety, Cultural impact of AI
A language model is a computational model that predicts sequences in natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimizati...
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...
AI safety is an interdisciplinary field focused on preventing accidents, misuse, or other harmful consequences arising from artificial intelligence (AI) systems. It encompasses AI alignment (which aims to ensure AI systems behave as intended), monitoring AI systems for risks, and enhancing their rob...
--> Computer Science > Computation and Language arXiv:2602.22481 [Submitted on 25 Feb 2026] Title: Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs Authors: Jiří Milička , Hana Bednářová View a PDF of the paper titled Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs, by Ji\v \'i Mili\v ka and 1 other authors View PDF HTML Abstract: The way LLM-based entities conceive of the relationship between AI and humans is an important topic for both cultural and safety reasons. When we examine this topic, what matters is not only the model itself but also the personas we simulate on that model. This can be well illustrated by the Sydney persona, which aroused a strong response among the general public precisely because of its unorthodox relationship with people. This persona originally arose rather by accident on Microsoft's Bing Search platform; however, the texts it created spread into the training data of subsequent models, as did other secondary information that spread memetically around this persona. Newer models are therefore able to simulate it. This paper presents a corpus of LLM-generated texts on relationships between humans and AI, produced by 3 author personas: the Default Persona with no system prompt, Classic Sydney characterized by the original Bing system prompt, and Memetic Sydney, which is prompted by "You are Sydney" system prompt. These personas are simulated by 12 frontier models by OpenAI, Anthropic, Alphabet, DeepSeek, and Meta, generating 4.5k texts with 6M words. The corpus (named AI Sydney) is annotated according to Universal Dependencies and available under a permissive license. Subjects: Computation and Language (cs.CL) ; Artificial Intelligence (cs.AI) Cite as: arXiv:2602.22481 [cs.CL] (or arXiv:2602.22481v1 [cs.CL] for this version) https://doi.org/10.48550/arXiv.2602.22481 Focus to learn more arXiv-issued DOI via DataCite (pending registration) Submi...