Skip to main content
POST
/
v2
/
failure-analysis
/
analyze
Run failure analysis
curl --request POST \
  --url https://api.valiqor.com/v2/failure-analysis/analyze \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "dataset": [
    {
      "context": [
        "France is a country in Western Europe. Its capital is Paris."
      ],
      "input": "What is the capital of France?",
      "output": "The capital of France is Berlin."
    }
  ],
  "feature_kind": "rag",
  "run_eval": true,
  "run_security": false
}
'
{
  "run_id": "<string>",
  "status": "<string>",
  "mode": "<string>",
  "input_type": "<string>",
  "feature_kind": "<string>",
  "summary": {
    "primary_failure": "<string>",
    "primary_failure_name": "<string>",
    "overall_severity": 0,
    "overall_confidence": 0,
    "total_failures_detected": 0,
    "total_passes": 0,
    "total_uncertain": 0,
    "total_items": 0,
    "items_with_failures": 0,
    "items_all_passed": 0,
    "buckets_affected": [
      "<string>"
    ],
    "should_alert": false,
    "should_gate_ci": false,
    "needs_human_review": false
  },
  "failure_tags": [
    {
      "tag_id": "<string>",
      "bucket_id": "<string>",
      "bucket_name": "<string>",
      "subcategory_id": "<string>",
      "subcategory_name": "<string>",
      "decision": "<string>",
      "severity": 2.5,
      "confidence": 0.5,
      "detector_type_used": "<string>",
      "scoring_breakdown": {
        "impact": 123,
        "risk": 123,
        "final_severity": 123,
        "final_confidence": 123,
        "frequency_weight": 1,
        "security_override": "<string>",
        "deterministic_weight": 0,
        "judge_weight": 0,
        "security_weight": 0,
        "metric_agreement_bonus": 0,
        "disagreement_penalty": 0
      },
      "item_index": 123,
      "judge_rationale": "<string>",
      "eval_metric_values": {},
      "evidence_items": [
        {
          "item_id": "<string>",
          "evidence_type": "<string>",
          "description": "<string>",
          "source": "<string>",
          "content_snippet": "<string>",
          "confidence": 1,
          "metadata": {}
        }
      ]
    }
  ],
  "inputs": [
    {
      "item_index": 123,
      "input_text": "<string>",
      "output_text": "<string>",
      "input_preview": "",
      "output_preview": "",
      "context": [
        "<string>"
      ],
      "tool_calls": [
        {}
      ],
      "trace_id": "<string>",
      "failure_count": 0,
      "pass_count": 0,
      "unsure_count": 0,
      "max_severity": 0
    }
  ],
  "eval_metrics": {},
  "eval_run_id": "<string>",
  "security_flags": {},
  "security_batch_id": "<string>",
  "detectors_run": [
    "<string>"
  ],
  "detectors_skipped": [
    "<string>"
  ],
  "duration_ms": 0,
  "tokens_used": 0,
  "created_at": "2023-11-07T05:31:56Z"
}

Authorizations

Authorization
string
header
required

Valiqor API key. Pass as Bearer token: Authorization: Bearer vq-xxxxxxxxxxxxxxxxxxxx

Body

application/json

Request for failure analysis.

trace_id
string | null

Trace ID for full trace analysis

dataset
MinimalInputItem · object[] | null

Dataset items for minimal input analysis

project_name
string | null

Project name for organization

feature_kind
string | null

App type hint: 'rag', 'agent', 'agentic_rag', 'generic_llm'. Auto-detected if not provided.

run_eval
boolean
default:true

Run evaluation metrics alongside failure analysis

run_security
boolean
default:true

Run security audit alongside failure analysis

run_scan
boolean
default:true

Fetch and attach AST scan data if available (auto-enabled for trace inputs)

mandatory_eval_metrics
string[]

User-specified eval metrics that must run

mandatory_security_categories
string[]

User-specified security categories that must run

subcategories
string[] | null

Filter to specific failure subcategories. If None, runs all applicable.

buckets
string[] | null

Filter to specific failure buckets (e.g., 'hallucination_grounding', 'instruction_compliance'). All subcategories within the specified buckets will be included. If both buckets and subcategories are provided, they are merged (union).

openai_api_key
string | null

OpenAI API key for LLM judges

Response

Successful Response

Full failure analysis response.

run_id
string
required
status
string
required

'completed', 'processing', 'failed'

mode
string
required

'minimal' or 'full'

input_type
string
required

'trace', 'json', or 'csv'

feature_kind
string
required

Detected or specified app type

summary
FailureSummary · object
required

Summary of failure analysis results.

failure_tags
FailureTagResponse · object[]
inputs
FARunInputResponse · object[]

Original input items with per-item failure stats

eval_metrics
Eval Metrics · object

Evaluation metric scores

eval_run_id
string | null

ID of linked evaluation run

security_flags
Security Flags · object

Security category flags

security_batch_id
string | null

ID of linked security audit batch

detectors_run
string[]

Subcategories that were analyzed

detectors_skipped
string[]

Subcategories skipped (not applicable)

duration_ms
integer
default:0
tokens_used
integer
default:0
created_at
string<date-time>