Sesame vs Speechly

A feature-by-feature comparison of Sesame and Speechly

Feature

Sesame

Speechly

Capability Features

Automatic Formatting

Consistent Personality

Context Awareness

Conversational Dynamics

Conversational Speech Generation

Custom Vocabulary Support

Custom Voice Commands

Dataset Size

1 million hours

Email Mode

Emotional Intelligence

Enterprise Grade Security

Evaluation Suite

Fast Transcription Speed

Under 3 seconds

Long-Form Resilience

Message Mode

Model Sizes

Tiny: 1B backbone, 100M decoder; Small: 3B backbone, 250M decoder; Medium: 8B backbone, 300M decoder

Multilingual Support

150+ languages

Multiple Speaker Handling

No Missed Context

Objective Metrics

Word Error Rate, Speaker Similarity, Homograph Disambiguation, Pronunciation Consistency

Partial Multilingual Support Planned

Planned for 20+ languages

Prompt Mode

Pronunciation Correction

Sequence Length

2048

Single-Stage Model

Subjective Metrics

Comparative Mean Opinion Score

Text and Audio Input

Text, Audio

To-Do Mode

Training Epochs

Transcription Speed

180+ words per minute

Voice-to-Text

Works Across Apps

Integration Features

GitHub Release

Llama Architecture Backbone

Mimi Split-RVQ Tokenizer

Platform Integrations

Gmail, Slack, Notion, Discord

Supported Platforms

macOS

Limitation Features

API or Plugin Integration

Cannot Model Conversation Structure

English Language Dominance

File Formats Supported

Not specified

Memory Bottleneck in Training

Minimum OS Requirements

macOS Ventura 13.1 or higher

No Pre-trained Language Model Use

Platform Limitation

macOS only

Real-Time Generation Delay

RVQ time-to-first-audio scales poorly

User Limit

Not specified

Pricing Features

Free Preview

Free Tier

No Credit Card Required

Open Source

Apache 2.0