While the enigmatic Bitcoin inventor, Satoshi Nakamoto, likely remains secure in their anonymity, the privacy of your casual online alias might now be significantly more vulnerable.
Do you maintain a secondary Reddit account, a private X (formerly Twitter) profile, a "finsta," or a Glassdoor account to candidly express frustrations about your employer? Artificial intelligence could have just made it considerably easier to uncover your true identity. This is the central conclusion of a recently published study, which suggests uncomfortable implications for online privacy, even if it's not yet time to declare anonymity entirely defunct.
The findings, which are currently awaiting peer review, originate from a collaborative effort by researchers at ETH Zurich, Anthropic, and the Machine Learning Alignment and Theory Scholars program. They developed an automated system comprising AI agents, utilizing unspecified models, designed to browse the web and interpret information much like a human investigator. This system was tasked with assessing how effectively large language models (LLMs) could reidentify anonymized content. The researchers report that their system "substantially outperforms" traditional computational methods for deanonymizing accounts by comprehensively scanning text for personal details at an unprecedented scale.
The system operates by treating posts and other textual content as a collection of forensic clues. It meticulously analyzes the text for distinctive patterns, such as unique writing quirks, incidental biographical details, and consistent posting frequencies or timings, all of which might point to an individual's identity. Subsequently, it scans potentially millions of other accounts, searching for the same combination of traits. Probable matches are then flagged, subjected to more detailed comparison, and ultimately refined into a concise shortlist of likely identities.
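The researchers withheld the full technical details of their system, but the match-and-shortlist pipeline described above can be illustrated with a toy sketch. Everything here — the function names, the use of word-frequency profiles as a stand-in for the richer stylometric and biographical signals, the cosine-similarity scoring — is an assumption for illustration, not the study's actual method:

```python
from collections import Counter
import math

def features(posts):
    """Crude stylometric fingerprint: a word-frequency profile plus average
    post length. A stand-in for the richer signals (writing quirks,
    biographical details, posting times) the study describes."""
    words = Counter(w.lower() for p in posts for w in p.split())
    total = sum(words.values())
    freq = {w: c / total for w, c in words.items()}
    avg_len = sum(len(p) for p in posts) / len(posts)
    return freq, avg_len

def similarity(a, b):
    """Cosine similarity between two word-frequency profiles."""
    fa, _ = a
    fb, _ = b
    dot = sum(fa[w] * fb.get(w, 0.0) for w in fa)
    na = math.sqrt(sum(v * v for v in fa.values()))
    nb = math.sqrt(sum(v * v for v in fb.values()))
    return dot / (na * nb) if na and nb else 0.0

def shortlist(target_posts, candidates, k=3):
    """Rank candidate accounts by similarity to the target account and
    return the top-k likely identities -- the 'concise shortlist' step."""
    target = features(target_posts)
    scored = [(name, similarity(target, features(posts)))
              for name, posts in candidates.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:k]
```

In the real system, an LLM agent supplies far more discriminating features than word counts, but the flag-compare-refine structure is the same.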
Instead of targeting unwitting users, the research team evaluated their system using datasets constructed from publicly available posts. These included content from platforms like Hacker News and LinkedIn, transcripts of Anthropic's interviews with scientists about their AI usage, and Reddit accounts that were intentionally divided into two anonymized halves for testing purposes. Across these experimental scenarios, the study reports, the LLM-based method correctly identified up to 68 percent of matching accounts while maintaining 90 percent precision. In stark contrast, comparable non-LLM techniques, such as connecting disparate data points across large datasets, identified almost none.
The outcomes were not uniformly consistent across every dataset; predictably, the model exhibited enhanced performance when provided with more information. In one experiment focusing on Reddit users discussing films within the main r/movies subreddit and smaller cinematic communities, the system managed to link accounts that mentioned just a single movie approximately 3 percent of the time, maintaining 90 percent precision. However, when users referenced ten or more films, the success rate climbed to nearly 50 percent.
Another experiment, employing Anthropic’s survey of scientists, successfully identified nine out of 125 respondents, translating to a recall rate of roughly 7 percent. For this test, the system generated a profile for each respondent based on clues embedded in their answers and then searched publicly available web information for potential matches. As an illustrative example, the researchers highlighted how references to a "supervisor" could suggest a PhD student, while the use of British English might indicate a UK affiliation. When combined with mentions of a background in the physical sciences and current work in biology research, the system was able to narrow down the field to a specific candidate.
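The clue-stacking logic in that example — each inferred attribute shrinking the candidate pool — can be sketched as a simple filter. The rules and attribute names below are illustrative inventions mirroring the study's example, not the researchers' actual extraction method (which relied on an LLM, not keyword matching):

```python
def infer_attributes(answer_text):
    """Toy clue extraction: 'supervisor' suggests a PhD student, British
    spellings suggest a UK base, and so on. Purely illustrative rules."""
    attrs = set()
    text = answer_text.lower()
    if "supervisor" in text:
        attrs.add("phd_student")
    if any(w in text for w in ("colour", "organise", "analyse")):
        attrs.add("uk_based")
    if "physics" in text:
        attrs.add("physical_sciences")
    return attrs

def narrow(candidates, required):
    """Keep only candidates whose (hypothetical) public profile carries
    every inferred attribute -- each clue cuts the pool further."""
    return [name for name, attrs in candidates.items() if required <= attrs]
```

With enough independent clues, even a large pool collapses to a handful of candidates, which is why seemingly innocuous details compound so quickly.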
Nevertheless, the researchers contend that the mere ability to identify any respondents from unstructured text is remarkable, accomplishing in minutes what would typically require hours for a human investigator. Furthermore, they informed The Verge that the system's performance is expected to improve as AI capabilities advance and access to larger data pools expands. More broadly, they issue a cautionary note: it may no longer be prudent to assume that posting pseudonymously will adequately safeguard online identities, whether past or future.
“Information on the internet is there forever,” stated Daniel Paleka, a researcher at ETH Zurich and one of the study’s authors. This enduring digital footprint could translate into tangible, real-world risks for journalists, dissidents, and activists who rely on pseudonyms, the researchers warn. Additionally, it could facilitate "hyper-targeted advertising" and "highly personalized" scams.
The dangers associated with deanonymizing accounts are neither novel nor exclusive to AI. “Every single thing the LLM found in principle could be found by a human investigator,” Paleka reiterated to The Verge.
What is genuinely new, Paleka argues, is the introduction of end-to-end automation. Tasks that once demanded a diligent investigator patiently sifting through posts for minuscule pieces of information can now be executed with far greater ease and across a significantly larger number of targets.
It is also remarkably cost-effective. The researchers disclosed that their experiment cost less than $2,000 in total, working out to between $1 and $4 for each profile processed by the AI agent. “The economics are totally different now,” coauthor Simon Lermen told The Verge, cautioning that this reduced barrier to entry could broaden the spectrum of individuals and entities with the capability and incentive to attempt to breach online anonymity. Groups that have historically "flown under the radar" may find it increasingly difficult to maintain their obscurity, he added.
It is crucial not to exaggerate these findings. “While these algorithms are improving, they remain far from what humans can do,” Luc Rocher, an associate professor at the Oxford Internet Institute, commented to The Verge. The study’s work does not perfectly translate to real-world scenarios; experiments were conducted under controlled laboratory conditions using datasets meticulously curated and anonymized specifically for testing. Rocher expressed concern that people "might misunderstand this important research and conclude that privacy is dead," emphasizing that it is not.
Despite years of incremental advancements in techniques designed to unmask anonymous users, “the identity of Satoshi Nakamoto, the inventor of Bitcoin, remains a mystery after more than a decade,” Rocher noted. Whistleblowers, they added, continue to communicate with journalists without exposure, and tools such as Signal “have so far been successful in protecting our collective privacy.”
In their paper, the researchers stated that they deliberately refrained from testing their system on actual pseudonymous users due to ethical considerations. For similar reasons, they chose not to publish the full technical specifications of their methodology and declined requests for a demonstration. The team also would not confirm whether they had tested the system beyond the study's confines, again citing ethical concerns, thereby leaving open questions regarding its reliability against real-world accounts.
For individuals already deeply committed to maintaining anonymity, the practical impact of these findings may be limited. Fundamental precautions—such as maintaining separate accounts, minimizing personal details, and avoiding identifiable patterns like posting exclusively during waking hours in one's specific time zone—remain critically important.
For those who treat pseudonyms more casually, Paleka and Lermen advised users to exercise greater caution about what they post in public forums, even through accounts that feel anonymous. They stressed the importance of remembering that information already publicly available can now be pieced together with an ease many might underestimate.
The researchers argue that responsibility should not fall solely on users. Lermen suggested that AI laboratories ought to monitor how their tools are being utilized and implement safeguards to prevent their misuse for deanonymization. Furthermore, social media platforms, he added, could take stricter measures against the mass data scraping and extraction practices that enable such efforts.
In essence, Satoshi Nakamoto is likely safe from AI-powered sleuths. However, your spontaneous "Am I the Asshole?" post on Reddit? That could very well be a different story.
The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.