Ship voice UX in your web app in minutes.

Streaming speech-to-text, voice commands, and best-in-class TTS. TypeScript-first SDK, <500ms streaming latency, no real-time infra to run.

  • Dictate: Production-ready dictation with clean output.
  • AI Edit: Voice edits that rewrite in place.
  • Commands: Intent to action across your UI + APIs.
  • TTS: In-app voice responses.
DICTATE

um so I just uh finished the quarterly report and like the main takeaway is that revenue grew 15 percent

OUTPUT

I just finished the quarterly report. The main takeaway is that revenue grew 15%.

3× faster than typing · <500ms latency

Fast enough to feel native. Reliable enough to ship.

TECHNICAL SPECIFICATIONS

INPUT SPEED

~3×

vs. mobile typing¹

ERROR RATE

20.4%

fewer errors¹

DICTATION

~160

WPM¹

LATENCY

<500

ms (streaming)

¹ Based on published HCI research; your results vary by device, environment, and model configuration.

BUILT ON BATTLE-TESTED INFRA

Deepgram - Speech-to-Text
Groq - Fast LLM inference
LiveKit - Real-time audio transport

Voice isn't "just STT." It's product UX.

Raw transcripts aren't enough. You need streaming, formatting, edit loops, commands, and voice output—without running real-time infra.

The old way

  • 01

    Use browser/OS voice APIs

    Inconsistent output, limited control, and brittle UX.

  • 02

    Stitch providers together

    STT → LLM → commands → TTS plus auth, streaming, retries, and UI glue.

  • 03

    Build voice in-house

    Months of edge cases, state, analytics, and maintenance.

  • 04

    Operate real-time infra

    Audio pipelines, websockets, scaling, and observability.

With SpeechOS

  • 01

    Drop in one SDK

    No infra to stitch or run — SpeechOS orchestrates the stack.

  • 02

    Dictate + Edit feels like one flow

    Users speak naturally, then refine instantly.

  • 03

    Commands trigger real actions

    Map intent to your UI + APIs.

  • 04

    TTS closes the loop

    Voice responses, prompts, and confirmations.

VOICE PRIMITIVES

Four voice primitives. One integration.

Everything you need to ship voice UX that users actually adopt.

01 · ~160 WPM

Dictate

Clean text by default.

Automatically polished text, ready to use without cleanup.

  • Punctuation + capitalization baked in
  • Filler removed for clean, written-style output
  • Works across notes, forms, editors, and comments

02 · 800+ WPM

AI Edit

Editing at the speed of speech.

Transform text instantly with simple voice edits.

  • "Make it shorter / more formal / more direct"
  • "Rewrite for clarity", "translate", "summarize"
  • Polish user-generated content in place

03 · INSTANT

Commands

Voice → real product actions.

Turn spoken intent into real UI + API actions.

  • Commands like "submit", "log activity", "create task", "next field"
  • Intent matching across natural phrasing
  • Track adoption and refine what people actually say

04 · TTS

TTS

Text-to-speech for in-app voice responses.

Speak confirmations, summaries, and guidance so your product can talk back in real time.

  • Voice confirmations after actions or submissions
  • Spoken prompts, updates, and status changes
  • Accessibility for users who prefer audio

HOW IT WORKS

Two ways to integrate

Use the automatic widget for instant voice UX, or call the SDK directly for full control.

Widget (automatic)

Initialize once. The widget appears automatically when users focus text inputs or select text. Handles dictation, editing, commands, and read-aloud with zero additional code.

  • Auto-detects input, textarea, contenteditable
  • Read-aloud on text selection
  • Works with React, Vue, vanilla JS
Preview the widget

SDK (programmatic)

Call the API directly for custom UIs or headless integrations. Full control over when and how voice actions are triggered.

  • dictate(), edit(), command(), tts.speak()
  • Build your own UI or go headless
  • React hooks: useDictation, useEdit, useTTS
See code examples

Try the Playground

Dictate, edit, trigger commands, and hear responses—no signup required.

USE CASES

Use cases that ship well in real products

See how developers are using SpeechOS to add voice UX to their applications.

Text-heavy workflows

Voice-enable forms, notes, and composers without sacrificing precision.

Editors & content

Draft and revise long-form text by voice, then polish fast.

Power-user workflows

Hands-free actions for navigation, submission, assignment, and status changes.

In-app voice responses

Let users listen to updates, guidance, and confirmations.

FOR DEVELOPERS

Ship voice UX in minutes

Install once. SpeechOS appears automatically across your app's text surfaces.

Prefer to customize? Configure commands, vocabulary, and UX behavior in the dashboard.

01

Install the SDK

Add SpeechOS to your project with a single command. Available via npm for any web app or with React bindings.

bash
# Any web app
npm install @speechos/client
 
# React bindings
npm install @speechos/react
02

Your app becomes voice-enabled

SpeechOS appears across text surfaces and handles dictation, edits, commands, and voice responses with one UI.

Dictate
Edit
Command
Read

REACT QUICKSTART

Add voice input in two steps

Use the useDictation hook to capture speech and get formatted text.

01 · Install
npm install @speechos/react
02 · Use the hook
import { useDictation } from '@speechos/react';

function VoiceNote() {
  const { start, stop, transcript } = useDictation();

  return (
    <>
      <button onClick={start}>Start</button>
      <button onClick={stop}>Stop</button>
      <p>{transcript}</p>
    </>
  );
}
03 · Try it
Press Start, speak, then press Stop

CODE EXAMPLES

Simple APIs for every action

Dictate, edit, run commands, and speak text—each in one line.

Dictate

Start recording, stop when done—get clean, formatted text back.

typescript
import { SpeechOS } from '@speechos/client';
 
// Start capturing speech.
SpeechOS.dictate();
 
// ... user speaks ...
 
// Stop and get the formatted transcript.
const text = await SpeechOS.stopDictation();
// Example: "The customer is blocked on SSO setup."

Edit with voice

Say "make it shorter" or "translate to French"—get rewritten text.

typescript
import { SpeechOS } from '@speechos/client';
 
// Provide the text to transform.
SpeechOS.edit('Your existing content here...');
 
// ... user says "make it more concise" ...
 
// Get the rewritten result.
const edited = await SpeechOS.stopEdit();

Voice commands

Map spoken intent to actions—"submit", "create task", "go back".

typescript
import { events } from '@speechos/client';
 
events.on('command:matched', ({ commands }) => {
  // Match spoken intent to real product actions.
  commands.forEach((cmd) => {
    if (cmd.name === 'submit') submitForm();
    if (cmd.name === 'create_task') createTask(cmd.arguments.title);
  });
});

Read Aloud (TTS)

Speak any text—great for in-app voice responses and accessibility.

typescript
import { tts } from '@speechos/client';
 
// Speak back to the user after an action.
await tts.speak('Your report is ready.');
 
// Use it for prompts, guidance, and confirmations.

Privacy-first by default

Built for real products where user data is sensitive and trust is non-negotiable.

Real-time streaming

Processed live; not stored by default.

You control the data

You remain the data controller; SpeechOS acts as a processor.

DPA available

For enterprise procurement and reviews.

Built for business use

Supports compliance-driven workflows and audits.

FAQ

Frequently Asked Questions

Common questions from developers integrating SpeechOS.

Is the client key safe to expose in the browser?
Yes. The SDK uses a public client key designed for browser use. Use domain allowlists and rate limits in the dashboard for control.

Which elements does the widget attach to?
Inputs and textareas by default; Read Aloud works on selections. For custom editors like ProseMirror, Slate, or Quill, use the manual attach APIs.

Can I disable automatic attachment?
Yes. Disable formDetection and attach to specific elements programmatically with SpeechOS.showFor() and SpeechOS.attachTo().

Is audio recorded or stored?
Audio is processed live and not stored by default. We store transcripts and metadata for analytics. Contact us for custom retention policies.

Is my data used to train models?
No. Customer data is not used for training models.

Which browsers are supported?
Chrome, Edge, Firefox, Safari, and mobile browsers on iOS and Android. The SDK uses the Web Audio API and WebRTC for real-time streaming.

How fast is it?
<500ms streaming latency for dictation. Typical AI edits complete in 1–2 seconds.

Is there a React integration?
Yes — @speechos/react with hooks like useDictation, useEdit, and useCommand for seamless React integration.

Ship voice UX without building a speech stack.

Dictation, AI edits, commands, and in-app voice responses in one SDK.

Prefer to build? View docs