
Use EigenAI

Get Started with a Token Grant

See Try EigenAI for information on obtaining a token grant to get started for free.

Based on initial demand, we are starting with support for the gpt-oss-120b-f16 and qwen3-32b-128k-bf16 models, and will expand from there. To get started or to request another model, visit our onboarding page.

Chat Completions API

$ curl -X POST https://eigenai-sepolia.eigencloud.xyz/v1/chat/completions \
    -H "X-API-Key: <api-key>" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-oss-120b-f16",
      "max_tokens": 120,
      "seed": 42,
      "messages": [{"role": "user", "content": "Write a story about programming"}]
    }' | jq
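
The endpoint returns a standard Chat Completions response. A truncated sketch of the shape, with illustrative field values:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-oss-120b-f16",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Once upon a time..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 120, "total_tokens": 132}
}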

OpenAI Client usage

Step 1

import json
from typing import Any, Dict, List

from openai import OpenAI

api_key = "<api-key>"          # your EigenAI API key
model = "gpt-oss-120b-f16"

client = OpenAI(
    base_url="https://eigenai.eigencloud.xyz/v1",
    default_headers={"x-api-key": api_key},
)

# Describe the tool the model is allowed to call.
tools: List[Dict[str, Any]] = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                    },
                },
                "required": ["location"],
            },
        },
    }
]

# Let the model decide whether to answer directly or call the tool.
step1 = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "What is the weather like in Boston today?"}],
    tools=tools,
    tool_choice="auto",
)
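
Step 2 replays the conversation with the tool's result appended, so it needs the tool call ID the model returned in Step 1. A minimal way to extract it, assuming the model chose to call the tool:

tool_call = step1.choices[0].message.tool_calls[0]
tool_call_id = tool_call.id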

Step 2

# Replay the conversation, including the assistant's tool call and the
# tool's result, then ask a follow-up question.
messages_step2: List[Dict[str, Any]] = [
    {"role": "user", "content": "What is the weather like in Boston today?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": tool_call_id,
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": json.dumps({"location": "Boston, MA", "unit": "fahrenheit"}),
                },
            }
        ],
    },
    # The tool's output goes back to the model as a "tool" message.
    {"role": "tool", "tool_call_id": tool_call_id, "content": "58 degrees"},
    {"role": "user", "content": "Do I need a sweater for this weather?"},
]

step2 = client.chat.completions.create(model=model, messages=messages_step2)
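
The final answer can then be read from the response in the usual way:

print(step2.choices[0].message.content)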

Supported parameters

This list will expand to cover the full parameter set of the Chat Completions API. A short request sketch combining several of these parameters follows the list.

  • messages: array
    • A list of messages comprising the conversation so far
  • model: string
    • Model ID used to generate the response, like gpt-oss-120b-f16
  • max_tokens: (optional) integer
    • The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
  • seed: (optional) integer
    • If specified, our system will run the inference deterministically, such that repeated requests with the same seed and parameters should return the same result.
  • stream: (optional) bool
    • If set to true, the model response data will be streamed to the client as it is generated, using Server-Sent Events (SSE).
  • temperature: (optional) number
    • What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
  • top_p: (optional) number
    • An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
  • logprobs: (optional) bool
    • Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
  • frequency_penalty: (optional) number
    • Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
  • presence_penalty: (optional) number
    • Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
  • tools: array
    • A list of tools the model may call, in the format shown in the OpenAI Client example above
  • tool_choice: (optional) string or object
    • "auto", "required", "none", or an object naming a specific tool
    • Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
    • none is the default when no tools are present. auto is the default if tools are present.
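
As a quick illustration, here is a minimal sketch combining several of these parameters with the OpenAI client configured as above; the prompt and parameter values are illustrative:

stream = client.chat.completions.create(
    model="gpt-oss-120b-f16",
    messages=[{"role": "user", "content": "Write a story about programming"}],
    max_tokens=120,
    seed=42,           # repeated requests with the same seed and parameters should match
    temperature=0.2,   # lower values give more focused, deterministic output
    stream=True,       # tokens arrive incrementally via Server-Sent Events
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)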