Claude Sonnet 4.5
Balanced intelligence and speed for complex reasoning and long-context tasks
Anthropic's Claude Sonnet 4.5 model. Balances high capability with fast response times. Supports up to 200K token context, vision, tool calling, and structured output. Inference only — no training or fine-tuning.
When to use:
- Complex reasoning and multi-step analysis
- Long-context document processing (up to 200K tokens)
- Multilingual generation across all major languages
- Structured output or JSON extraction workflows
Input: Chat message history (user/assistant turns)
Output: Generated assistant response text
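The input format above can be sketched as a list of role/content turns. A minimal illustration, assuming the common Anthropic-style schema of alternating `user`/`assistant` messages that starts with a user turn (the `validate_history` helper is hypothetical, not part of any SDK):

```python
# Hypothetical sketch of the chat message history this model consumes:
# a list of alternating user/assistant turns, starting with a user turn.
# The role/content field names follow the common Anthropic-style schema
# (an assumption here, not taken from this page).

def validate_history(messages):
    """Return True if the history starts with a user turn and alternates roles."""
    if not messages or messages[0]["role"] != "user":
        return False
    for prev, cur in zip(messages, messages[1:]):
        if prev["role"] == cur["role"]:
            return False
    return all(m["role"] in ("user", "assistant") for m in messages)

history = [
    {"role": "user", "content": "Summarize this contract."},
    {"role": "assistant", "content": "Here is a summary: ..."},
    {"role": "user", "content": "Now list the key risks."},
]
print(validate_history(history))  # True
```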
Inference Settings
Temperature (default: 1.0, range: 0.0–1.0) Controls randomness in the output.
- 0.0: Near-deterministic — the same input almost always produces the same output
- 0.5: Balanced creativity and consistency
- 1.0: Full range of the model's distribution
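The effect of temperature can be sketched as scaling logits before the softmax; lower values sharpen the distribution toward the most likely token. This is an illustrative toy, not the model's actual sampler:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax.
    Lower temperature sharpens the distribution; 0.0 is treated as greedy.
    Illustrative sketch only -- the model's real sampler is not public."""
    if temperature == 0.0:
        # Greedy decoding: all probability mass on the argmax token.
        probs = [0.0] * len(logits)
        probs[max(range(len(logits)), key=lambda i: logits[i])] = 1.0
        return probs
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 0.0))  # [1.0, 0.0, 0.0]
print(softmax_with_temperature(logits, 1.0))  # full, softer distribution
```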
Max Tokens (default: 1024) Maximum number of tokens to generate.
- Increase for long-form content or detailed analysis
- Decrease for short replies to reduce cost and latency
Top P (default: 0.999, range: 0.0–1.0) Nucleus sampling — sample only from the smallest set of tokens whose cumulative probability reaches this threshold.
- Lower values (0.1–0.5): More focused, less varied output
- Higher values (0.9+): Broader vocabulary, more natural variation
- Rarely needs tuning; adjust temperature first
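Nucleus sampling as described above can be sketched as a filter over a token probability distribution. A toy illustration (the real sampler is internal to the model):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches
    top_p, then renormalize. Illustrative sketch of nucleus sampling."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
# With top_p=0.7, tokens 0 and 1 (cumulative 0.8) survive and are renormalized.
print(top_p_filter(probs, 0.7))
```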
Top K (default: 0) Sample from the top K most likely tokens. 0 means disabled.
- Enable (e.g., 50) to limit vocabulary at each step
- Combine with Top P for fine-grained sampling control
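Top K can likewise be sketched as truncating the distribution to its K most likely tokens, with 0 leaving it untouched. Another illustrative toy under the same caveat as above:

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize.
    k == 0 disables the filter, mirroring the setting's default.
    Illustrative sketch of top-K sampling."""
    if k == 0 or k >= len(probs):
        return dict(enumerate(probs))
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in order)
    return {i: probs[i] / total for i in order}

probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(probs, 2))  # tokens 0 and 1, renormalized
print(top_k_filter(probs, 0))  # disabled: all four tokens kept
```

In a combined top-K + top-P scheme, the top-K cut is typically applied first and nucleus filtering then operates on the surviving tokens, which is why the two settings compose for fine-grained control.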
Stop Sequences (default: []) List of strings where generation stops immediately when encountered.
- Useful for structured output with known terminators
- Example: `["</answer>", "\n\n"]`
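The stop-sequence behavior can be sketched as truncating the generated text at the earliest match of any listed string. A minimal illustration, assuming (as is common) that the stop string itself is excluded from the returned text:

```python
def apply_stop_sequences(text, stop_sequences):
    """Truncate generated text at the earliest occurrence of any stop sequence.
    Assumes the stop string itself is not included in the returned output."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

raw = "<answer>42</answer> extra tokens"
print(apply_stop_sequences(raw, ["</answer>", "\n\n"]))  # '<answer>42'
```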