Evaluator Dashboard

Inspect recent LLM calls, failures, saved sessions, and saved compare branches.

Summary

2 saved sessions
4 persisted turns
0 saved compare branches
4 logged LLM calls
0 LLM call errors

Recent LLM Calls

Time	Model	Status	Duration	Tokens	Prompt	Log
3/18/2026, 3:08:34 PM	gpt-4.1-mini	ok	8732 ms	4846	Use the following structured data to answer the user's question. Keep the response concise but helpf…	2026-03-18-15-08-25-ans-001.json
3/18/2026, 3:08:25 PM	gpt-4.1-mini	ok	14922 ms	17408	{ "userText": "show me the breakdown of GHGs in the atmosphere in case fusion energy is commercial…	2026-03-18-15-08-25-sdk-001.json
3/18/2026, 2:45:09 PM	gpt-4.1-mini	ok	5278 ms	3679	Use the following structured data to answer the user's question. Keep the response concise but helpf…	2026-03-18-14-45-02-ans-001.json
3/18/2026, 2:45:02 PM	gpt-4.1-mini	ok	4609 ms	16495	{ "userText": "How is the distribution of GHGs affected by a $100 per ton carbon tax?", "activeI…	2026-03-18-14-45-02-sdk-001.json