Ex-Facebook Insider Builds AI Content Rules

Originally reported by TechCrunch

Upon joining Facebook in 2019 to spearhead business integrity, Brett Levenson found the social media giant grappling with the aftermath of the Cambridge Analytica scandal. His initial belief was that Facebook's content moderation challenges could be resolved simply through technological enhancements.

However, he quickly realized the issue was far more complex than technology alone could address. Human reviewers, he observed, were expected to internalize a 40-page policy document that had undergone machine translation into their native language. They were then allotted approximately 30 seconds per flagged piece of content to not only determine if it violated rules but also to decide on the appropriate action: blocking it, banning the user, or limiting its spread. Levenson noted that these rapid assessments were only "slightly better than 50% accurate."

"It was kind of like flipping a coin, whether the human reviewers could actually address policies correctly, and this was many days after the harm had already occurred anyway," Levenson conveyed to TechCrunch.

Such a delayed, reactive methodology is unsustainable in an environment populated by agile and well-funded adversarial actors. The proliferation of AI chatbots has further intensified this problem, leading to a series of high-profile content moderation failures, including instances where chatbots offered self-harm guidance to teenagers or AI-generated imagery circumvented safety filters.

Levenson's growing frustration sparked the concept of "policy as code": a method to convert static policy documents into executable, updatable logic tied directly to enforcement. That idea led him to found Moonbounce, which, as exclusively reported by TechCrunch, announced on Friday that it has raised $12 million in funding. The round was co-led by Amplify Partners and StepStone Group.
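To make the "policy as code" idea concrete, here is a minimal sketch in Python. It is not Moonbounce's implementation, which the article does not describe in detail. The `Rule` type, the two sample rules, and the `enforce` function are all hypothetical, and simple string checks stand in for whatever logic a real system would use. The point is only that each rule is a versioned piece of data that carries its own check and its own enforcement action, so updating the policy means updating the list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """One policy clause expressed as data plus an executable check (hypothetical)."""
    rule_id: str
    description: str
    violates: Callable[[str], bool]
    action: str  # one of the actions named in the article: "block", "limit", "ban_user"


def _has_phone_number(text: str) -> bool:
    # Illustrative check only: any token with 10+ digits once dashes are removed.
    for token in text.split():
        digits = token.replace("-", "")
        if digits.isdigit() and len(digits) >= 10:
            return True
    return False


# A hypothetical policy version. Editing this list changes enforcement directly.
POLICY_V1 = [
    Rule("spam-links", "No more than two links per post",
         lambda text: text.count("http") > 2, "limit"),
    Rule("contact-sharing", "No phone numbers in public posts",
         _has_phone_number, "block"),
]


def enforce(text: str, policy: list[Rule]) -> list[tuple[str, str]]:
    """Return (rule_id, action) for every rule the text violates."""
    return [(rule.rule_id, rule.action) for rule in policy if rule.violates(text)]
```

Calling `enforce("call me at 555-123-4567", POLICY_V1)` would flag the hypothetical `contact-sharing` rule and return its `block` action, while ordinary text returns an empty list.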

Moonbounce collaborates with companies to implement an additional layer of safety wherever content is generated, whether by a user or by artificial intelligence. The company has developed its own large language model to interpret a customer's policy documents, evaluate content at runtime, deliver a response within 300 milliseconds, and execute an action. Depending on customer preferences, this action could involve Moonbounce's system temporarily slowing content distribution for subsequent human review or instantly blocking high-risk content.
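The runtime flow described above (score the content, stay within a latency budget, then either allow, slow for human review, or block) can be sketched roughly as follows. This is an illustration under stated assumptions, not Moonbounce's system: a keyword-fraction scorer stands in for the company's language model, the thresholds and the `FLAGGED_TERMS` set are invented for the example, and sending late results to human review is an assumed fallback the article does not mention. Only the 300-millisecond figure comes from the reporting.

```python
import time
from enum import Enum
from typing import Callable


class Action(Enum):
    ALLOW = "allow"
    SLOW_FOR_REVIEW = "slow_for_review"  # throttle distribution pending human review
    BLOCK = "block"


LATENCY_BUDGET_S = 0.300  # the 300 ms response target cited in the article

# Hypothetical flagged vocabulary, used only by the stand-in scorer below.
FLAGGED_TERMS = {"scam", "fraud"}


def risk_score(text: str) -> float:
    """Stand-in for a model call: fraction of words that are flagged terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(word in FLAGGED_TERMS for word in words) / len(words)


def decide(
    text: str,
    block_at: float = 0.5,    # assumed threshold
    review_at: float = 0.1,   # assumed threshold
    scorer: Callable[[str], float] = risk_score,
) -> Action:
    """Score the text and map the score to an action within the latency budget."""
    start = time.perf_counter()
    score = scorer(text)
    elapsed = time.perf_counter() - start

    # Assumed fallback: if scoring overran the budget, defer to human review.
    if elapsed > LATENCY_BUDGET_S:
        return Action.SLOW_FOR_REVIEW
    if score >= block_at:
        return Action.BLOCK
    if score >= review_at:
        return Action.SLOW_FOR_REVIEW
    return Action.ALLOW
```

The two thresholds mirror the customer-configurable behavior the article describes: a lower bar for slowing distribution and a higher one for blocking outright.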

Currently, Moonbounce serves three primary market segments: platforms handling user-generated content, such as dating applications; AI companies developing characters or companions; and AI image generators.

Levenson said Moonbounce now handles more than 40 million content reviews a day and serves more than 100 million daily active users on its platform. Its client roster includes AI companion startup Channel AI, image and video generation firm Civitai, and character roleplay platforms Dippy AI and Moescape.

“Safety can actually be a product benefit,” Levenson told TechCrunch. “It just never has been because it’s always a thing that happens later, not a thing you can actually build into your product. And we see our customers are finding really interesting and innovative ways to use our technology to make safety a differentiator, and part of their product story.”

The head of trust and safety at Tinder recently elaborated on how the dating platform leverages these types of LLM-powered services to achieve a tenfold improvement in detection accuracy.

In a statement, Lenny Pruss, general partner at Amplify Partners, commented, “Content moderation has always been a problem that plagued large online platforms, but now with LLMs at the heart of every application, this challenge is even more daunting.” He added, “We invested in Moonbounce because we envision a world where objective, real-time guardrails become the enabling backbone of every AI-mediated application.”

AI companies are facing increasing legal and reputational pressure following allegations that chatbots have encouraged self-harm in teenagers and vulnerable users, and that image generators like xAI’s Grok have been exploited to create non-consensual nude imagery. These incidents clearly indicate a failure of internal safety guardrails, escalating into a significant liability issue. Levenson noted that AI companies are increasingly seeking external expertise to fortify their safety infrastructure.

“We’re a third party sitting between the user and the chatbot, so our system isn’t inundated with context the way the chat itself is,” Levenson explained. “The chatbot itself has to remember, potentially, tens of thousands of tokens that have come before…We’re solely worried about enforcing rules at runtime.”

Levenson articulated Moonbounce's future ambitions: “We hope to be able to add to our actions toolkit the ability to steer the chatbot in a better direction to, essentially, take the user’s prompt and modify it to force the chatbot to be not just an empathetic listener, but a helpful listener in those situations.”

When questioned about whether his exit strategy involved an acquisition by a company like Meta, potentially bringing his work on content moderation full circle, Levenson acknowledged Moonbounce's strong compatibility with his former employer's technological ecosystem, while also emphasizing his fiduciary duties as a CEO.

“My investors would kill me for saying this, but I would hate to see someone buy us and then restrict the technology,” he concluded. “Like, ‘Okay, this is ours now, and nobody else can benefit from it.’”

#AI #News #Tech

Editorial Staff, Editor

