Streaming Responses

Sythoria uses Server-Sent Events (SSE) to stream responses token by token as they're generated. No polling, no batching, no waiting for the full response — you see text appear the moment the model produces it.

How it works

You send a message
       ↓
Sythoria opens an SSE connection to the provider
       ↓
Tokens arrive incrementally and render immediately
       ↓
Connection closes when the response is complete

You type a message and press Enter
Sythoria opens a streaming connection to your chosen provider's API
Each token is rendered in the chat as it arrives — typically within milliseconds of generation
The stream ends when the model signals completion (data: [DONE])

Benefits

Benefit	What it means for you
Lower perceived latency	First tokens appear in milliseconds, not after the entire response finishes
Native API compatibility	Sythoria passes through the streaming protocol as-is, so behavior matches the provider's SDK
Cancellation support	Stop generation mid-stream with the stop button or `Escape`
No added overhead	Sythoria's direct connection streams responses immediately — no buffering or middleman servers

Supported providers

All providers support streaming:

Provider	Streaming	Protocol
OpenAI	Supported	SSE
Anthropic	Supported	SSE
Google Gemini	Supported	SSE
Ollama	Supported	SSE (NDJSON)
OpenRouter	Supported	SSE
NVIDIA NIM	Supported	SSE

Stopping a stream

Click the Stop button that appears during generation, or press Escape. The partial text remains in your conversation so you can pick up where you left off.

Technical details

The Sythoria desktop application establishes direct connections to the provider to stream responses directly. There is no middleman server, buffering, or transformation — it's a direct API connection:

No added latency beyond the provider's first-token time
Full compatibility with provider-specific streaming features (e.g., Anthropic's content blocks, OpenAI's function calling)
Automatic reconnection is handled by the application's connection handler

For developers: the streaming uses the standard OpenAI streaming format (data: {...} lines terminated by data: [DONE]). Anthropic's native format is translated to this standard format by Sythoria's internal client layer.