AI startup Sesame has released CSM-1B, the base model behind its viral voice assistant, Maya. The model has one billion parameters and is available under the Apache 2.0 license, which permits commercial use with few restrictions.
CSM-1B takes audio and text as input and generates “RVQ audio codes,” according to Sesame’s page on the AI development platform Hugging Face. RVQ, or residual vector quantization, encodes audio as discrete tokens. The technique is used in a number of recent AI audio systems, including Google’s SoundStream and Meta’s EnCodec.
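For readers unfamiliar with the technique, the following is a minimal sketch of how residual vector quantization turns one frame of audio features into a short list of tokens. It is a toy illustration, not Sesame's code: the function names and the two tiny hand-written codebooks are hypothetical, and production codecs learn much larger codebooks and apply them to the output of a neural encoder rather than to raw numbers.

```python
from typing import List, Sequence

Codebook = Sequence[Sequence[float]]


def nearest(residual: Sequence[float], codebook: Codebook) -> int:
    """Return the index of the codeword closest to `residual` (squared distance)."""
    def sq_dist(codeword: Sequence[float]) -> float:
        return sum((r - c) ** 2 for r, c in zip(residual, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(frame: Sequence[float], codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize `frame` stage by stage; each stage encodes what the last one missed."""
    residual = list(frame)
    codes = []
    for codebook in codebooks:
        idx = nearest(residual, codebook)
        codes.append(idx)
        # Subtract the chosen codeword so the next stage sees only the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Rebuild an approximation of the frame by summing the chosen codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical toy codebooks (assumption): a coarse first stage, a finer second stage.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25), (0.25, 0.25)],
]

frame = [1.2, 0.3]                      # one made-up 2-dimensional feature frame
codes = rvq_encode(frame, CODEBOOKS)    # discrete tokens, one per stage
approx = rvq_decode(codes, CODEBOOKS)   # lossy reconstruction of the frame
print(codes, approx)                    # [1, 3] [1.25, 0.25]
```

The list of integers is the kind of output “RVQ audio codes” refers to: each additional stage adds one more token per frame and shrinks the reconstruction error.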
The CSM-1B model incorporates a backbone from Meta’s Llama family paired with an audio “decoder” component. A fine-tuned variant of this model powers the Maya assistant. Sesame indicated that the open-sourced version is a base generation model capable of producing various voice outputs, although it has not been specifically fine-tuned for individual voices. The company acknowledged some capacity for non-English languages due to the nature of its training data but warned about its potential limitations.
Sesame has not disclosed which datasets it used to train CSM-1B, leaving open questions about how the model was built. The model also lacks robust safeguards. Sesame relies instead on an honor system, urging developers not to use it to impersonate individuals, produce misleading content such as fake news, or engage in other harmful activities.
In a recent demo, cloning a voice took less than a minute, raising alarms about the lack of protections against misuse. Consumer Reports has voiced a similar concern about widely available voice cloning tools.
Sesame, co-founded by Oculus co-creator Brendan Iribe, drew significant attention in February for its assistant technology. Maya and a second assistant, Miles, speak with lifelike patterns, including natural pauses and disfluencies. Sesame has also secured investment from firms including Andreessen Horowitz and is working on AI glasses designed for all-day wear that will use its models.