CSM-1B Has No Real Safeguards To Speak Of ‣ Jeff Turner

CSM-1B didn’t take long. Less than two weeks after releasing the demo of their eerily realistic voice assistant, Sesame has released the base model that powers Maya. It’s called CSM-1B, has 1 billion parameters, and is released under an Apache 2.0 license, which means it can be used commercially. It is officially out in the wild.

I Decided To See What CSM-1B Would Do With My Voice

Trigger warning: If you’re already afraid of what will come, and you know me and my voice, this will likely hit home pretty hard. Moments after generating the sample you’ll hear below, I shared it with the people who know me best—my family.

Within moments of sharing, their reactions came pouring in. “Yeah I don’t really like that… it’s so cool but terrifying.” “That’s very scary.” “Why did it take pauses like you?” “Holy shit.” Those comments are from people who likely know my voice better than any other human on the planet. When I saw my son, Isaiah, he said, “Oh, it would have fooled me.”

I wish I could tell you it required some technical prowess, but it did not. It was painfully simple. And ridiculously fast.

I read the introduction at the top of Tangilla.com into the demo at Hugging Face. I didn’t use a special mic or move into a quiet room. I just used the built-in microphone on my MacBook Pro. It took less than 30 seconds. In under 60 seconds, the new voice was ready for me to use with Sesame’s newly released model.

Once ready, I used the default text in the conversation context window and clicked on “generate conversation.” What I heard surprised me. I’ve tried cloning my voice with other models, and I’ve never been impressed. This time, I was.

Then, I replaced the default text with the following: “This is not actually me talking. Seriously, this is a clone of my voice using Sesame’s new conversational model, and here’s the thing. It built this with just seconds of my actual voice.” I listened to the result and then generated, “You would absolutely be fooled by this.” The CSM-1B model generated the cadence and pauses you will hear without prompting. I’ve combined the two exports into one audio file. Listen.

We Need More Than An Honor System

The announcement at TechCrunch said, “It’s worth noting the model has no real safeguards to speak of. Sesame has an honor system and merely urges developers and users not to use the model to mimic a person’s voice without their consent, create misleading content like fake news, or engage in ‘harmful’ or ‘malicious’ activities.”

Sesame has an honor system. I’m just going to leave you with that.

Comments

Jim Walberg says

March 18, 2025 at 10:13 pm

I have no words to explain my dismay at what you just posted. If there are no guard rails, this is insane.

- Jeff Turner says
  
  March 18, 2025 at 10:15 pm
  
  The more you know, Jim, the better prepared you can be.
  
Phil says

March 18, 2025 at 11:02 pm

OK. I read everything you write and view everything you post… always have.

I rarely, if ever comment … weird, I know.

This is concerning … it’s your voice, your tone, your inflection.

I listened to this multiple times… if ‘this’ calls me, it’s you.

Words I never thought I’d say… buddy we need a safe word…

pk

- Jeff Turner says
  
  March 19, 2025 at 10:36 am
  
  I know you do. I’ve always known. And, yes, this is something different. It’s a leap across the uncanny valley I didn’t think we’d see so quickly. As for the safe word, we’ve probably needed one of those for a long time. 🙂
  
Julie Beall says

March 19, 2025 at 4:47 am

It was great to hear your voice again! Lol! Wow! It’s crazy; it makes me want to learn more and speak to no one, especially on the phone, I don’t know. Wait, I may be speaking to someone I think I know, but it is not them!
Good info Jeff! Thank you!

- Jeff Turner says
  
  March 19, 2025 at 10:38 am
  
  Julie, so great to hear from you! From this moment forward, I think we should all be a bit wary that the voice on the other end of the line may not actually be a human, that is clear. What is most concerning is that the familiar voice on the other end of the line might not actually be the human we think it is.
  
Lola A says

March 19, 2025 at 2:16 pm

I had some conversations with my kids and they had told me this was where it is at. Listening to your voice on this recording has made it real to me. I can only begin to imagine where this may take us.

- Jeff Turner says
  
  March 20, 2025 at 1:40 pm
  
  Lola, it made it real for me as well. And I think the fact that we can imagine all kinds of ways this could go sideways is why we need more than just an honor system.
