GPT-4.1 Chat

Advanced chat-optimized large language model from OpenAI. Best for complex dialog, tool calling, structured output, and long-context reasoning. Inference only - no training or fine-tuning.

When to use:

Building conversational assistants or chatbots
RAG (Retrieval-Augmented Generation) pipelines
Structured output extraction from text
Long-context document analysis

Input: Chat message history (user/assistant turns)
Output: Generated assistant response text
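
As a rough illustration of what a chat message history can look like, the sketch below builds a list of alternating user/assistant turns. The role/content dictionary layout follows the widely used OpenAI-style chat convention and is an assumption here, not a confirmed schema for this page; build_history is a hypothetical helper introduced only for the example.

```python
# Hypothetical helper: the {"role": ..., "content": ...} layout is an
# assumption based on the common OpenAI-style chat convention.
def build_history(turns):
    """Convert (role, text) pairs into a list of chat message dicts."""
    allowed_roles = {"user", "assistant"}
    messages = []
    for role, text in turns:
        if role not in allowed_roles:
            raise ValueError(f"unsupported role: {role!r}")
        messages.append({"role": role, "content": text})
    return messages


history = build_history([
    ("user", "What is retrieval-augmented generation?"),
    ("assistant", "It combines document retrieval with text generation."),
    ("user", "When would I use it?"),
])

# The last turn is from the user, so the model's output would be
# the next assistant response.
print(history[-1])
```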

Inference Settings

Temperature (default: 0.7, range: 0.0–2.0)
Controls the creativity and randomness of the output.

0.0: Greedy decoding, always picks the most likely token; output is near-deterministic
0.7: Balanced default - coherent but varied
1.5+: Highly varied output, more likely to drift off topic or hallucinate
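
The effect of temperature can be sketched with the standard softmax-with-temperature formulation: scores are divided by the temperature before being turned into probabilities, so lower values concentrate probability on the top token and higher values flatten the distribution. This is a generic illustration, not the model's actual sampling code; the logits are made-up numbers, and treating 0.0 as a pure argmax is an assumption about how the lower bound is handled.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return token probabilities after scaling logits by 1/temperature."""
    if temperature <= 0:
        # Assumption: a temperature of 0.0 is treated as greedy decoding,
        # putting all probability on the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.0, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running it shows the probability of the top token shrinking as the temperature rises, which is why higher settings produce more varied text.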

Max Tokens (default: 512)
Maximum number of tokens the model will generate in a single response.

Lower values: Shorter, faster responses
Higher values: Longer completions - set according to expected output length
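
The two settings above can be captured in a small validated container before they are passed to whatever client sends the request. This is a minimal sketch: InferenceSettings and its field names are hypothetical, and only the defaults (0.7 and 512) and the temperature range (0.0 to 2.0) come from this page. The lower bound of 1 for max_tokens is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical container for the settings documented on this page."""

    temperature: float = 0.7  # documented default
    max_tokens: int = 512     # documented default

    def __post_init__(self):
        # Documented range for temperature is 0.0 to 2.0 inclusive.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        # Assumption: at least one token must be requested.
        if self.max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")


defaults = InferenceSettings()

# Example: short, repeatable answers for structured extraction.
extraction = InferenceSettings(temperature=0.0, max_tokens=128)

print(defaults)
print(extraction)
```

Validating up front keeps an out-of-range value from reaching the model call, where the failure would be harder to trace.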
