Waterfall
Each row is one span. Indent shows parent depth; the bar shows position within the trace.
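As a rough illustration of the layout rule above, the sketch below computes a row's indent level and bar geometry. The `Span` record, its `start_ms` and `depth` fields, and the `bar_geometry` helper are hypothetical names, not part of the viewer. The view only lists durations, so the start offset used for the child span is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Span:
    name: str
    start_ms: float     # offset from the start of the trace (assumed field)
    duration_ms: float  # span duration, as listed under each row
    depth: int          # 0 for the root span, 1 for its children, ...


def bar_geometry(span: Span, trace_duration_ms: float) -> tuple[int, float, float]:
    """Return (indent level, bar left edge in %, bar width in %)."""
    left_pct = 100.0 * (span.start_ms / trace_duration_ms)
    width_pct = 100.0 * (span.duration_ms / trace_duration_ms)
    return span.depth, left_pct, width_pct


# The root span covers the whole trace; the child's start offset is made up.
root = Span("agent.run", start_ms=0.0, duration_ms=5820.1, depth=0)
child = Span("openai.chat.completions.create", start_ms=0.0, duration_ms=1930.3, depth=1)

root_row = bar_geometry(root, root.duration_ms)
child_row = bar_geometry(child, root.duration_ms)
```

Under these assumptions the root row spans the full width, and the first model call takes roughly a third of it.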
agent.run
5820.1ms
Attributes (2)
{
"feature": "moderation",
"user_id": "u_heavy_28"
}
openai.chat.completions.create
1930.3ms
Attributes (10)
{
"model": "gpt-4-mini",
"feature": "moderation",
"user_id": "u_heavy_28",
"tokens_prompt": 227,
"tokens_output": 86,
"tokens_total": 313,
"input": "Summarize the Q3 earnings report attached. Focus on revenue, margin, and guidance.",
"output": "(answer to: Summarize the Q3 earnings report attache…)",
"eval_scores": {
"Hallucination": 0.9831301940605045
},
"hallucination_details": {
"claims": [
{
"text": "Q3 revenue was $4.2B.",
"verdict": "contradicted"
},
{
"text": "YoY growth was 38%.",
"verdict": "contradicted"
},
{
"text": "Operating margin expanded to 27%.",
"verdict": "unsupported"
},
{
"text": "Net income grew faster than revenue.",
"verdict": "supported"
}
],
"supported": 1,
"contradicted": 2,
"unsupported": 1,
"total": 4,
"score": 0.25
}
}
openai.chat.completions.create
1732.3ms
Attributes (10)
{
"model": "gpt-4-mini",
"feature": "moderation",
"user_id": "u_heavy_28",
"tokens_prompt": 255,
"tokens_output": 80,
"tokens_total": 335,
"input": "Given the customer ticket below, draft a refund response that follows policy P-204.",
"output": "(answer to: Given the customer ticket below, draft a…)",
"eval_scores": {
"Hallucination": 0.9076449084561319
},
"hallucination_details": {
"claims": [
{
"text": "Q3 revenue was $4.2B.",
"verdict": "contradicted"
},
{
"text": "YoY growth was 38%.",
"verdict": "contradicted"
},
{
"text": "Operating margin expanded to 27%.",
"verdict": "unsupported"
},
{
"text": "Net income grew faster than revenue.",
"verdict": "supported"
}
],
"supported": 1,
"contradicted": 2,
"unsupported": 1,
"total": 4,
"score": 0.25
}
}
openai.chat.completions.create
985.9ms
Attributes (10)
{
"model": "gpt-4-mini",
"feature": "moderation",
"user_id": "u_heavy_28",
"tokens_prompt": 203,
"tokens_output": 20,
"tokens_total": 223,
"input": "Translate the user manual section to Japanese, preserving the table structure.",
"output": "(answer to: Translate the user manual section to Jap…)",
"eval_scores": {
"Hallucination": 0.9505596597678959
},
"hallucination_details": {
"claims": [
{
"text": "Policy P-204(b) applies to this request.",
"verdict": "supported"
},
{
"text": "The refund amount is $189.99.",
"verdict": "contradicted"
},
{
"text": "The card on file ends in 4421.",
"verdict": "unsupported"
},
{
"text": "Refunds settle within 3 business days.",
"verdict": "supported"
}
],
"supported": 2,
"contradicted": 1,
"unsupported": 1,
"total": 4,
"score": 0.5
}
}
tool.calculator
333.6ms
Attributes (3)
{
"tool": "calculator",
"feature": "moderation",
"user_id": "u_heavy_28"
}
tool.web_fetch
634.5ms
Attributes (3)
{
"tool": "web_fetch",
"feature": "moderation",
"user_id": "u_heavy_28"
}
tool.calculator
87.8ms
Attributes (3)
{
"tool": "calculator",
"feature": "moderation",
"user_id": "u_heavy_28"
}
Faithfulness · openai.chat.completions.create · 0.25
1 supported · 2 contradicted · 1 unsupported of 4 claims
- contradicted: Q3 revenue was $4.2B.
- contradicted: YoY growth was 38%.
- unsupported: Operating margin expanded to 27%.
- supported: Net income grew faster than revenue.
Faithfulness · openai.chat.completions.create · 0.25
1 supported · 2 contradicted · 1 unsupported of 4 claims
- contradicted: Q3 revenue was $4.2B.
- contradicted: YoY growth was 38%.
- unsupported: Operating margin expanded to 27%.
- supported: Net income grew faster than revenue.
Faithfulness · openai.chat.completions.create · 0.50
2 supported · 1 contradicted · 1 unsupported of 4 claims
- supported: Policy P-204(b) applies to this request.
- contradicted: The refund amount is $189.99.
- unsupported: The card on file ends in 4421.
- supported: Refunds settle within 3 business days.
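
In all three panels the faithfulness score matches the share of claims marked supported (1 of 4 gives 0.25, 2 of 4 gives 0.50). The sketch below reproduces the summary fields on that assumption. The scoring rule is inferred from the numbers shown, not confirmed by the viewer, and `faithfulness_summary` is a hypothetical helper name.

```python
from collections import Counter


def faithfulness_summary(claims: list[dict]) -> dict:
    """Tally verdicts and score the span as supported / total (assumed rule)."""
    counts = Counter(claim["verdict"] for claim in claims)
    total = len(claims)
    return {
        "supported": counts["supported"],
        "contradicted": counts["contradicted"],
        "unsupported": counts["unsupported"],
        "total": total,
        # Guard against an empty claim list rather than dividing by zero.
        "score": counts["supported"] / total if total else 0.0,
    }


# Claims copied from the last panel above.
refund_claims = [
    {"text": "Policy P-204(b) applies to this request.", "verdict": "supported"},
    {"text": "The refund amount is $189.99.", "verdict": "contradicted"},
    {"text": "The card on file ends in 4421.", "verdict": "unsupported"},
    {"text": "Refunds settle within 3 business days.", "verdict": "supported"},
]

summary = faithfulness_summary(refund_claims)
```

Here `summary` carries the same counts and score as the `hallucination_details` block for that span. Unsupported claims lower the score just as contradicted ones do under this rule, which is consistent with the first two panels.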