API Documentation
Complete reference for the blah.dev evals public API — 23 endpoints
Getting Started
Base URL and authentication
Base URL
https://evals.blah.dev
Authentication
Read endpoints (GET) are public and require no authentication. Write endpoints (POST, DELETE) require an API key passed as a Bearer token.
curl -H "Authorization: Bearer blah_YOUR_API_KEY" ...
Generate API keys from the API Keys settings page.
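As a minimal sketch of the rule above, the Python helper below builds a request and attaches the Bearer header only for write methods. The name build_request and its parameters are illustrative assumptions, not part of the API; only the base URL and the header format come from this page.

```python
import json
import urllib.request

BASE_URL = "https://evals.blah.dev"  # base URL given above


def build_request(method, path, api_key=None, body=None):
    """Build (but do not send) a request for the evals API.

    Hypothetical helper: GET needs no key, POST and DELETE require one.
    """
    method = method.upper()
    headers = {}
    data = None
    if method in ("POST", "DELETE"):
        if not api_key:
            raise ValueError(f"{method} {path} requires an API key")
        headers["Authorization"] = f"Bearer {api_key}"
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Usage (sending is left to the caller):
# with urllib.request.urlopen(build_request("GET", "/api/v1/models")) as resp:
#     models = json.load(resp)
```

Keeping construction separate from sending makes it easy to check that a key is never attached to public reads.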
CORS
All endpoints support CORS with Access-Control-Allow-Origin: *. Read responses are cached for 60 seconds.
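Because read responses are cached for 60 seconds, refetching the same URL more often than that may not return fresher data. A client can mirror this with a small in-memory cache; the sketch below is one way to do it. TTLCache and the injected fetch and clock callables are assumptions for illustration, and this page does not say where the server-side cache lives.

```python
import time


class TTLCache:
    """Memoize fetch(url) results for ttl seconds (hypothetical helper)."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # url -> (expires_at, value)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh, skip the network call
        value = self._fetch(url)
        self._entries[url] = (now + self._ttl, value)
        return value
```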
Models
4 endpoints
GET /api/v1/models
List all registered models. API keys are excluded from the response.
Response
[
{
"id": "abc123",
"name": "Claude Sonnet 4.6",
"description": "Latest Anthropic model",
"inference_uri": "anthropic/claude-sonnet-4-6",
"is_official": true,
"submitted_by": "admin@blah.dev",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/models
POST /api/v1/models (Auth)
Create a new model registration.
Request Body
{
"name": "My Model",
"description": "A custom model",
"inference_uri": "https://api.example.com/v1/chat",
"api_key": "sk-...",
"is_official": false
}
Response
{
"id": "new_id",
"name": "My Model",
"description": "A custom model",
"inference_uri": "https://api.example.com/v1/chat",
"is_official": false,
"submitted_by": "user@example.com",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/models \
  -H "Authorization: Bearer blah_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"My Model","description":"A custom model","inference_uri":"https://api.example.com/v1/chat","api_key":"sk-...","is_official":false}'
GET /api/v1/models/:id
Get a single model by ID.
Response
{
"id": "abc123",
"name": "Claude Sonnet 4.6",
"description": "Latest Anthropic model",
"inference_uri": "anthropic/claude-sonnet-4-6",
"is_official": true,
"submitted_by": "admin@blah.dev",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl https://evals.blah.dev/api/v1/models/ID
GET /api/v1/models/:id/results
Get all eval results for a specific model.
Response
[
{
"id": "res123",
"eval_run_id": "run123",
"eval_id": "eval123",
"model_id": "abc123",
"response": "The model's response...",
"score": 0.85,
"score_details": "{}",
"latency_ms": 1200,
"error": null,
"raw_data": "",
"created_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/models/ID/results
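To show how a result list like the one above might be consumed, the following sketch averages score and latency_ms for one model. Treating a non-null error as a failed result and skipping such rows is an assumption; this page does not say what score holds when error is set. The second and third sample rows are made up for illustration.

```python
from statistics import mean


def summarize_results(results):
    """Summarize a list of result objects shaped like the response above."""
    ok = [r for r in results if r.get("error") is None]  # assumption: null error means success
    return {
        "count": len(results),
        "failed": len(results) - len(ok),
        "avg_score": mean(r["score"] for r in ok) if ok else None,
        "avg_latency_ms": mean(r["latency_ms"] for r in ok) if ok else None,
    }


# Illustrative rows; only the first mirrors the sample response.
sample = [
    {"id": "res123", "score": 0.85, "latency_ms": 1200, "error": None},
    {"id": "res124", "score": 0.65, "latency_ms": 800, "error": None},
    {"id": "res125", "score": None, "latency_ms": None, "error": "timeout"},
]
print(summarize_results(sample))
```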
Model Artifacts
4 endpoints
GET /api/v1/models/:id/artifacts
List all artifacts for a model.
Response
[
{
"id": "art123",
"model_id": "abc123",
"name": "Base Weights",
"artifact_type": "weights",
"version": "v1.0",
"file_name": "weights.bin",
"file_size": 734003200,
"blob_url": "https://...",
"created_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/models/ID/artifacts
POST /api/v1/models/:id/artifacts (Auth)
Create an artifact. Accepts JSON with a blob_url (from the request-upload flow) or FormData for small files (< 4.5MB). Artifact types: weights, config, checkpoint, logs, dataset, other.
Request Body
{
"blob_url": "https://... (from upload step)",
"file_name": "weights.bin",
"file_size": 734003200,
"content_type": "application/octet-stream",
"name": "Base Weights",
"artifact_type": "weights",
"version": "v1.0"
}
Response
{
"id": "art_new",
"model_id": "abc123",
"name": "Base Weights",
"artifact_type": "weights",
"version": "v1.0",
"file_name": "weights.bin",
"file_size": 734003200,
"blob_url": "https://...",
"created_at": 1700000000000
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/models/ID/artifacts \
  -H "Authorization: Bearer blah_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"blob_url":"https://... (from upload step)","file_name":"weights.bin","file_size":734003200,"content_type":"application/octet-stream","name":"Base Weights","artifact_type":"weights","version":"v1.0"}'
GET /api/v1/models/:id/artifacts/:artifactId
Get a single artifact by ID.
Response
{
"id": "art123",
"model_id": "abc123",
"name": "Base Weights",
"description": "Pre-trained model weights",
"artifact_type": "weights",
"version": "v1.0",
"file_name": "weights.bin",
"file_type": "bin",
"file_size": 734003200,
"content_type": "application/octet-stream",
"blob_url": "https://...",
"submitted_by": "user@example.com",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl https://evals.blah.dev/api/v1/models/ID/artifacts/ARTIFACTID
POST /api/v1/models/:id/artifacts/request-upload (Auth)
Request a signed upload token for large artifact files (up to 20GB). Use the returned client_token to PUT the file directly to Vercel Blob, then POST the blob_url to create the artifact.
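The three steps described here can be sketched as follows. This is an outline under stated assumptions, not an official client: post_json and put_to_blob are hypothetical callables supplied by the caller, and the sketch assumes the PUT step yields the final blob_url. The exact Vercel Blob upload call is not documented on this page, so it is left abstract.

```python
import os


def upload_artifact(model_id, file_path, name, artifact_type, version,
                    post_json, put_to_blob,
                    content_type="application/octet-stream"):
    """Request a token, upload the file, then register the artifact.

    post_json(path, body) -> dict
        hypothetical authenticated POST helper returning parsed JSON
    put_to_blob(client_token, pathname, file_path) -> str
        hypothetical uploader that PUTs the file and returns its blob_url
    """
    file_name = os.path.basename(file_path)
    base = f"/api/v1/models/{model_id}/artifacts"

    # Step 1: ask the API for a signed upload token.
    grant = post_json(f"{base}/request-upload",
                      {"file_name": file_name, "content_type": content_type})

    # Step 2: upload directly to blob storage (details left to the caller).
    blob_url = put_to_blob(grant["client_token"], grant["pathname"], file_path)

    # Step 3: create the artifact record that points at the uploaded blob.
    return post_json(base, {
        "blob_url": blob_url,
        "file_name": file_name,
        "file_size": os.path.getsize(file_path),
        "content_type": content_type,
        "name": name,
        "artifact_type": artifact_type,
        "version": version,
    })
```

Passing the two network operations in as callables keeps the sequence visible and lets the flow be exercised without touching the network.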
Request Body
{
"file_name": "weights.bin",
"content_type": "application/octet-stream"
}
Response
{
"client_token": "vercel_blob_client_...",
"pathname": "artifacts/1700000000000-weights.bin"
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/models/ID/artifacts/request-upload \
  -H "Authorization: Bearer blah_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file_name":"weights.bin","content_type":"application/octet-stream"}'
Evals
4 endpoints
GET /api/v1/evals
List all evaluation definitions.
Response
[
{
"id": "eval123",
"name": "Code Generation",
"description": "Tests code generation ability",
"prompt": "Write a function that...",
"expected_behavior": "Should produce valid code",
"eval_type": "rubric",
"eval_criteria": "{}",
"submitted_by": "admin@blah.dev",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/evals
POST /api/v1/evals (Auth)
Create a new evaluation.
Request Body
{
"name": "My Eval",
"description": "Tests reasoning ability",
"prompt": "Explain quantum computing",
"expected_behavior": "Clear, accurate explanation",
"eval_type": "rubric",
"eval_criteria": "{\"rubric\":\"Rate clarity 0-1\"}"
}
Response
{
"id": "new_eval_id",
"name": "My Eval",
"description": "Tests reasoning ability",
"prompt": "Explain quantum computing",
"expected_behavior": "Clear, accurate explanation",
"eval_type": "rubric",
"eval_criteria": "{\"rubric\":\"Rate clarity 0-1\"}",
"submitted_by": "user@example.com",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/evals \
  -H "Authorization: Bearer blah_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"My Eval","description":"Tests reasoning ability","prompt":"Explain quantum computing","expected_behavior":"Clear, accurate explanation","eval_type":"rubric","eval_criteria":"{\"rubric\":\"Rate clarity 0-1\"}"}'
GET /api/v1/evals/:id
Get a single eval by ID.
Response
{
"id": "eval123",
"name": "Code Generation",
"description": "Tests code generation ability",
"prompt": "Write a function that...",
"expected_behavior": "Should produce valid code",
"eval_type": "rubric",
"eval_criteria": "{}",
"submitted_by": "admin@blah.dev",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl https://evals.blah.dev/api/v1/evals/ID
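In the eval examples on this page, eval_criteria holds a JSON document encoded as a string rather than a nested object, so it is serialized twice on the way out and parsed twice on the way back. A small sketch of that round trip follows; the helper names are illustrative, and the rubric key is taken from the sample request only.

```python
import json


def make_eval_payload(name, description, prompt, expected_behavior, eval_type, criteria):
    """Build a create-eval body; criteria (a dict) is stored as a JSON string."""
    return {
        "name": name,
        "description": description,
        "prompt": prompt,
        "expected_behavior": expected_behavior,
        "eval_type": eval_type,
        "eval_criteria": json.dumps(criteria, separators=(",", ":")),
    }


def read_criteria(eval_obj):
    """Decode eval_criteria from an eval response back into a dict."""
    return json.loads(eval_obj.get("eval_criteria") or "{}")


payload = make_eval_payload(
    "My Eval", "Tests reasoning ability", "Explain quantum computing",
    "Clear, accurate explanation", "rubric", {"rubric": "Rate clarity 0-1"},
)
print(json.dumps(payload))     # request body, with eval_criteria escaped as a string
print(read_criteria(payload))  # the criteria as a dict again
```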
DELETE /api/v1/evals/:id (Auth)
Delete an eval and all its associated results. Cascades cleanup to affected runs.
Response
{
"success": true
}
curl
curl -X DELETE \
  https://evals.blah.dev/api/v1/evals/ID \
  -H "Authorization: Bearer blah_YOUR_KEY"
Runs
4 endpoints
GET /api/v1/runs
List all eval runs.
Response
[
{
"id": "run123",
"started_at": 1700000000000,
"completed_at": 1700000060000,
"status": "completed",
"total_evals": 10,
"total_models": 5,
"completed_count": 50,
"error": null
}
]
curl
curl https://evals.blah.dev/api/v1/runs
POST /api/v1/runs (Auth)
Trigger a new eval run across all active models and evals.
Response
{
"message": "Eval run triggered"
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/runs \
  -H "Authorization: Bearer blah_YOUR_KEY"
GET /api/v1/runs/:id
Get a single run by ID.
Response
{
"id": "run123",
"started_at": 1700000000000,
"completed_at": 1700000060000,
"status": "completed",
"total_evals": 10,
"total_models": 5,
"completed_count": 50,
"error": null
}
curl
curl https://evals.blah.dev/api/v1/runs/ID
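A client that triggers a run will usually want to wait for it to finish. The sketch below polls a run until it reaches a terminal status. Several things here are assumptions: the sample suggests completed_count counts model and eval pairs (10 evals times 5 models gives 50), only the status value completed appears on this page, so failed is a guess, and fetch_run is a hypothetical callable that returns the JSON for GET /api/v1/runs/:id. The trigger response shown on this page contains no run ID, so the caller is assumed to look it up via GET /api/v1/runs.

```python
import time


def run_progress(run):
    """Fraction of (model, eval) pairs finished, inferred from the sample fields."""
    total = run["total_evals"] * run["total_models"]
    return run["completed_count"] / total if total else 0.0


def wait_for_run(fetch_run, run_id, interval=60.0, timeout=3600.0,
                 sleep=time.sleep, done_statuses=("completed", "failed")):
    """Poll fetch_run(run_id) until the run reaches a terminal status."""
    waited = 0.0
    while True:
        run = fetch_run(run_id)
        if run["status"] in done_statuses:  # "failed" is an assumed value
            return run
        if waited >= timeout:
            raise TimeoutError(f"run {run_id} still {run['status']!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

The default interval of 60 seconds matches the read cache lifetime stated under CORS; polling faster may simply return the same cached response.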
GET /api/v1/runs/:id/results
Get all results for a specific run.
Response
[
{
"id": "res123",
"eval_run_id": "run123",
"eval_id": "eval123",
"model_id": "abc123",
"response": "...",
"score": 0.85,
"score_details": "{}",
"latency_ms": 1200,
"error": null,
"raw_data": "",
"created_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/runs/ID/results
Results
1 endpoint
GET /api/v1/results/:id
Get a single eval result by ID.
Response
{
"id": "res123",
"eval_run_id": "run123",
"eval_id": "eval123",
"model_id": "abc123",
"response": "The model's response...",
"score": 0.85,
"score_details": "{}",
"latency_ms": 1200,
"error": null,
"raw_data": "",
"created_at": 1700000000000
}
curl
curl https://evals.blah.dev/api/v1/results/ID
Training Sets
5 endpoints
GET /api/v1/training-sets
List all training sets. The snippet field is omitted from list responses for performance.
Response
[
{
"id": "ts123",
"name": "My Training Data",
"description": "Fine-tuning dataset",
"file_name": "data.jsonl",
"file_type": "jsonl",
"file_size": 1048576,
"content_type": "application/x-ndjson",
"line_count": 5000,
"blob_url": "https://...",
"submitted_by": "user@example.com",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/training-sets
POST /api/v1/training-sets (Auth)
Upload a new training set via multipart form data. Supported file types: jsonl, csv, tsv, json, txt, parquet, xml, yaml, yml. The maximum size is 20GB, but FormData uploads are limited to about 4.5MB by the serverless platform; use request-upload for larger files.
Form Fields
file: dataset file (required)
metadata: JSON string with name, description
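For reference, the two form fields can be assembled by hand with the standard library. The sketch below builds the multipart body for a small file; encode_training_set_form is an illustrative name, and the Content-Type given to the file part is an assumption about what the server accepts.

```python
import json
import uuid


def encode_training_set_form(file_name, file_bytes, name, description):
    """Return (content_type_header, body) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    metadata = json.dumps({"name": name, "description": description})
    parts = [
        # Part 1: the dataset file itself (required).
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         'Content-Type: application/octet-stream\r\n\r\n').encode("utf-8")
        + file_bytes + b"\r\n",
        # Part 2: metadata as a JSON string.
        (f'--{boundary}\r\n'
         'Content-Disposition: form-data; name="metadata"\r\n\r\n'
         f'{metadata}\r\n').encode("utf-8"),
        # Closing delimiter.
        f"--{boundary}--\r\n".encode("utf-8"),
    ]
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)
```

The returned header value goes in Content-Type alongside the Authorization header. This path only suits files under the roughly 4.5MB limit mentioned above.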
Response
{
"id": "ts_new",
"name": "My Dataset",
"file_name": "data.jsonl",
"file_type": "jsonl",
"file_size": 1048576,
"blob_url": "https://...",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/training-sets \
  -H "Authorization: Bearer blah_YOUR_KEY" \
  -F "file=@data.jsonl" \
  -F 'metadata={"name":"My Dataset","description":"Training data"}'
GET /api/v1/training-sets/:id
Get a single training set by ID, including the full snippet preview.
Response
{
"id": "ts123",
"name": "My Training Data",
"description": "Fine-tuning dataset",
"file_name": "data.jsonl",
"file_type": "jsonl",
"file_size": 1048576,
"content_type": "application/x-ndjson",
"line_count": 5000,
"blob_url": "https://...",
"snippet": "{\"prompt\":\"...\",...}\n...",
"submitted_by": "user@example.com",
"created_at": 1700000000000,
"updated_at": 1700000000000
}
curl
curl https://evals.blah.dev/api/v1/training-sets/ID
DELETE /api/v1/training-sets/:id (Auth)
Delete a training set. Only the owner can delete. Removes the file from blob storage.
Response
{
"success": true
}
curl
curl -X DELETE \
  https://evals.blah.dev/api/v1/training-sets/ID \
  -H "Authorization: Bearer blah_YOUR_KEY"
POST /api/v1/training-sets/request-upload (Auth)
Request a signed upload token for large training set files (up to 20GB). Use the returned client_token to PUT the file directly to Vercel Blob, then create the training set via the POST endpoint with the resulting blob_url.
Request Body
{
"file_name": "data.jsonl",
"content_type": "application/x-ndjson"
}
Response
{
"client_token": "vercel_blob_client_...",
"pathname": "training-sets/1700000000000-data.jsonl"
}
curl
curl -X POST \
  https://evals.blah.dev/api/v1/training-sets/request-upload \
  -H "Authorization: Bearer blah_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file_name":"data.jsonl","content_type":"application/x-ndjson"}'
Leaderboard
1 endpoint
GET /api/v1/leaderboard
Model scores ranked by average eval performance.
Response
[
{
"model_id": "abc123",
"model_name": "Claude Sonnet 4.6",
"avg_score": 0.92,
"eval_count": 10,
"last_run_at": 1700000000000
}
]
curl
curl https://evals.blah.dev/api/v1/leaderboard
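As an illustration of consuming this response, the sketch below renders a ranked list. It assumes that last_run_at, like the other timestamps on this page, is a Unix epoch in milliseconds (the sample values are consistent with that reading), and it re-sorts by avg_score defensively instead of relying on response order.

```python
from datetime import datetime, timezone


def format_leaderboard(rows):
    """Render leaderboard rows as text lines, best average score first."""
    ranked = sorted(rows, key=lambda r: r["avg_score"], reverse=True)
    lines = []
    for rank, row in enumerate(ranked, start=1):
        # Assumption: timestamps are milliseconds since the Unix epoch.
        last_run = datetime.fromtimestamp(row["last_run_at"] / 1000, tz=timezone.utc)
        lines.append(
            f"{rank}. {row['model_name']}  avg={row['avg_score']:.2f}  "
            f"evals={row['eval_count']}  last_run={last_run:%Y-%m-%d %H:%M} UTC"
        )
    return lines


rows = [{"model_id": "abc123", "model_name": "Claude Sonnet 4.6",
         "avg_score": 0.92, "eval_count": 10, "last_run_at": 1700000000000}]
print("\n".join(format_leaderboard(rows)))
```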
Error Responses
All errors return JSON with an error field
{ "error": "Error message here" }
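A client can turn this shape into an exception. The sketch below is one way to do that; ApiError and error_from_response are hypothetical names, and the fallback for bodies that are not in the documented shape is an added precaution, not documented behavior.

```python
import json


class ApiError(Exception):
    """Raised for a failed response from the evals API (hypothetical type)."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def error_from_response(status, body):
    """Build an ApiError from a status code and raw response body (bytes or str)."""
    text = body.decode("utf-8", "replace") if isinstance(body, bytes) else body
    try:
        message = json.loads(text)["error"]
    except (ValueError, KeyError, TypeError):
        message = text or "unknown error"  # body was not the documented shape
    return ApiError(status, message)


# Usage with urllib (sketch):
# try:
#     urllib.request.urlopen(request)
# except urllib.error.HTTPError as exc:
#     raise error_from_response(exc.code, exc.read())
```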