250 tokens
Three sentences. Hard cap. In any language.
~1 min · 181 words
The model literally cannot exceed it at the API layer.
250 fits about three sentences in any language we've tested — English, Russian, Georgian, Turkish — plus a markdown image URL on its own line, which we need because the chat sometimes recommends a catalog photo and a truncated URL is worse than no URL. Lower than 250, we risked breaking the URL mid-line. Higher, and the model started writing paragraphs. Paragraphs sound like an essay, not a conversation.
The companion rule in the system prompt: “Reply in 1–3 sentences. Never more than 3, in any language. This is a hard rule.” Sentence-based rules are language-neutral; the model counts boundaries (. ? !) regardless of whether it's writing in Russian or Georgian. We don't have to localize the rule per language.
Side effect: 774 tokens recovered from the old 1024 budget. That's a lot of conversation history a single reply no longer has to compete with.
You wouldn't think a number could be a design decision. This one is.
“You wouldn't think a number could be a design decision. This one is.”