## Installation

## Basic Setup

Replace hardcoded prompt strings with `ze.prompt()`. Your existing text becomes the fallback content that's used until an optimized version is available.

```python
import zeroeval as ze
from openai import OpenAI

ze.init()
client = OpenAI()

system_prompt = ze.prompt(
    name="support-bot",
    content="You are a helpful customer support agent for {{company}}.",
    variables={"company": "TechCorp"}
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "How do I reset my password?"}
    ]
)
```
That's it. Every call to `ze.prompt()` is tracked, versioned, and linked to the completions it produces. You'll see production traces at ZeroEval → Prompts.

When you provide `content`, ZeroEval automatically uses the latest optimized version from your dashboard if one exists. The `content` parameter serves as a fallback until an optimized version is available.
## Version Control

### Auto-optimization (default)

```python
prompt = ze.prompt(
    name="customer-support",
    content="You are a helpful assistant."
)
```

Uses the latest optimized version if one exists, otherwise falls back to the provided content.
### Explicit mode

```python
prompt = ze.prompt(
    name="customer-support",
    from_="explicit",
    content="You are a helpful assistant."
)
```

Always uses the provided content. Useful for debugging or A/B testing a specific version.
### Latest mode

```python
prompt = ze.prompt(
    name="customer-support",
    from_="latest"
)
```

Requires an optimized version to exist; raises `PromptRequestError` if none is found.
### Pin to a specific version

```python
prompt = ze.prompt(
    name="customer-support",
    from_="a1b2c3d4..."  # 64-char SHA-256 hash
)
```
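Version identifiers have the shape of a SHA-256 hex digest. Assuming the hash is computed over the prompt content (consistent with the `content_hash` field documented under Return value below), the standard library reproduces the format:

```python
import hashlib

content = "You are a helpful assistant."
digest = hashlib.sha256(content.encode("utf-8")).hexdigest()

# SHA-256 hex digests are always 64 lowercase hex characters.
print(len(digest))  # 64
print(digest[:12] + "...")
```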
## Prompt Library

For more control, use `ze.get_prompt()` to fetch prompts from the Prompt Library with tag-based deployments and caching.

```python
prompt = ze.get_prompt(
    "support-triage",
    tag="production",
    fallback="You are a helpful assistant.",
    variables={"product": "Acme"},
)

print(prompt.content)
print(prompt.version)
print(prompt.model)
```
### Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `slug` | `str` | — | Prompt slug (e.g. `"support-triage"`) |
| `version` | `int` | `None` | Fetch a specific version number |
| `tag` | `str` | `"latest"` | Tag to fetch (`"production"`, `"latest"`, etc.) |
| `fallback` | `str` | `None` | Content to use if the prompt is not found |
| `variables` | `dict` | `None` | Template variables for `{{var}}` tokens |
| `task_name` | `str` | `None` | Override the task name for tracing |
| `render` | `bool` | `True` | Whether to render template variables |
| `missing` | `str` | `"error"` | What to do with missing variables: `"error"` or `"ignore"` |
| `use_cache` | `bool` | `True` | Use in-memory cache for repeated fetches |
| `timeout` | `float` | `None` | Request timeout in seconds |
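The `variables` and `missing` parameters together describe simple `{{var}}` substitution. A stand-alone approximation of that behavior (an illustrative sketch, not the SDK's implementation) might look like:

```python
import re

def render_template(content: str, variables: dict, missing: str = "error") -> str:
    """Substitute {{var}} tokens; mimics missing="error" vs missing="ignore"."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        if missing == "error":
            raise KeyError(f"missing template variable: {name}")
        return match.group(0)  # "ignore": leave the token in place
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", replace, content)

print(render_template("You are a support agent for {{product}}.", {"product": "Acme"}))
# → You are a support agent for Acme.
print(render_template("Hi {{name}}", {}, missing="ignore"))
# → Hi {{name}}
```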
### Return value

Returns a `Prompt` object with:

| Field | Type | Description |
|---|---|---|
| `content` | `str` | The rendered prompt content |
| `version` | `int` | Version number |
| `version_id` | `str` | Version UUID |
| `tag` | `str` | Tag this version was fetched from |
| `is_latest` | `bool` | Whether this is the latest version |
| `model` | `str` | Model bound to this version (if any) |
| `metadata` | `dict` | Additional metadata |
| `source` | `str` | `"api"` or `"fallback"` |
| `content_hash` | `str` | SHA-256 hash of the content |
## Model Deployments

When you deploy a model to a prompt version in the dashboard, the SDK automatically patches the `model` parameter in your LLM calls:

```python
system_prompt = ze.prompt(
    name="support-bot",
    content="You are a helpful customer support agent."
)

response = client.chat.completions.create(
    model="gpt-4",  # Gets replaced with the deployed model
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Hello"}
    ]
)
```
## Multi-Artifact Runs

When a single prompt-linked run produces multiple judged outputs (e.g. a final decision and a visual card), use `ze.artifact_span` to mark each output as a named artifact. The primary artifact becomes the default completion preview; secondary artifacts are accessible in the detail view.

```python
with ze.artifact_span(
    name="final-decision",
    artifact_type="final_decision",
    role="primary",
    label="Final Decision",
) as s:
    s.set_io(input_data=ticket_text, output_data=decision_json)
```

See the full API and examples in Python Tracing Reference.
## Manual Prompt-Linked Spans

When you want `ze.prompt()` to manage a specific LLM interaction but do not want global auto-instrumentation (e.g. your agent makes many LLM calls and only one should be tracked as a prompt generation), disable integrations and create the span yourself.

### When to use this

- Your codebase makes many LLM calls but only a subset should appear as prompt completions.
- You use a provider that has no auto-integration (a custom HTTP endpoint, an internal model service, etc.).
- You want full control over which codepath produces prompt-linked traces.

### Setup

Disable all integrations, then initialize:

```python
import zeroeval as ze

ze.init(disabled_integrations=["openai", "gemini", "langchain", "langgraph"])
```
### Create a prompt-linked span

Call `ze.prompt()` inside an active span so the SDK writes task / zeroeval metadata onto the trace. Then open a child span with `kind="llm"` (or use `ze.artifact_span()`) around your provider call. The SDK automatically propagates prompt linkage to child spans in the same trace.

```python
import zeroeval as ze

def handle_request(user_message: str):
    with ze.span(name="support-pipeline"):
        system_prompt = ze.prompt(
            name="customer-support",
            content="You are a helpful support agent for {{company}}.",
            variables={"company": "Acme"},
            from_="explicit",
        )

        # The <zeroeval> tag is embedded in the string returned by ze.prompt().
        # Strip it before sending to the provider if needed.
        clean_content = system_prompt.split("</zeroeval>", 1)[-1]

        with ze.span(name="llm.chat", kind="llm", attributes={
            "provider": "my-custom-provider",
            "model": "gpt-4o",
        }) as llm_span:
            response = my_provider.chat(
                messages=[
                    {"role": "system", "content": clean_content},
                    {"role": "user", "content": user_message},
                ]
            )
            llm_span.set_io(
                input_data=user_message,
                output_data=response.text,
            )
```
The SDK automatically stamps prompt metadata (`task`, `zeroeval.prompt_version_id`, etc.) onto every span in the trace, so the inner `llm` span is linked to the prompt version without any extra wiring. Judge evaluations, feedback, and the prompt completions page all work as if an auto-integration created the span.

You can also use `ze.artifact_span()` instead of `ze.span(kind="llm")` when you want the output to appear as a named completion artifact:

```python
with ze.artifact_span(
    name="support-reply",
    artifact_type="reply",
    role="primary",
    label="Support Reply",
) as s:
    response = my_provider.chat(messages=[...])
    s.set_io(input_data=user_message, output_data=response.text)
```
## Sending Feedback

Attach feedback to completions to power prompt optimization:

```python
ze.send_feedback(
    prompt_slug="support-bot",
    completion_id=response.id,
    thumbs_up=True,
    reason="Clear and concise response"
)
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `prompt_slug` | `str` | Yes | Prompt name (same as used in `ze.prompt()`) |
| `completion_id` | `str` | Yes | UUID of the completion |
| `thumbs_up` | `bool` | Yes | Positive or negative feedback |
| `reason` | `str` | No | Explanation of the feedback |
| `expected_output` | `str` | No | What the output should have been |
| `metadata` | `dict` | No | Additional metadata |