Run Failure Analysis on your existing AI data in three steps. No tracing, no instrumentation, no OpenAI key needed.
This quickstart uses dataset mode — you pass your AI inputs and outputs directly. If you already have tracing set up, see Failure Analysis from traces instead.
```python
from valiqor import ValiqorClient

client = ValiqorClient(api_key="vq_your_api_key_here", project_name="quickstart")

# Run FA on existing data — no tracing required
result = client.failure_analysis.run(
    dataset=[
        {
            "input": "What is the capital of France?",
            "output": "The capital of France is Berlin.",
            "context": [
                "France is a country in Western Europe.",
                "The capital of France is Paris.",
                "Paris is known as the City of Light.",
            ],
        }
    ]
)

# Print the summary
print(f"Status: {result.status}")
print(f"Failures detected: {result.summary.total_failures_detected}")
print(f"Overall severity: {result.summary.overall_severity}/5")
print(f"Should alert: {result.summary.should_alert}")
print()

# Print each failure
for tag in result.failure_tags:
    if tag.decision == "fail":
        print(f"❌ [{tag.bucket_name}] {tag.subcategory_name}")
        print(f"   Severity: {tag.severity}/5  Confidence: {tag.confidence}")
        if tag.judge_rationale:
            print(f"   Rationale: {tag.judge_rationale}")
        print()
```
The example intentionally passes a wrong output (“Berlin” instead of “Paris”) so you can see a real failure detected:
```
Status: completed
Failures detected: 1
Overall severity: 4/5
Should alert: True

❌ [Hallucination] Entity Fabrication
   Severity: 4/5  Confidence: 0.95
   Rationale: The output states Berlin is the capital of France, which directly contradicts the provided context that identifies Paris as the capital. This is a factual fabrication.
```
The exact bucket names, severity scores, and rationale text may vary slightly depending on your backend configuration and LLM judge version.
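Because the summary already exposes `should_alert` and `overall_severity`, a natural next step is to gate a CI check or alerting hook on them. The sketch below is illustrative, not part of the Valiqor API: it only reads the summary fields shown above, and the severity threshold and exit-code convention are arbitrary choices for the example.

```python
import sys

# Minimal sketch: fail a CI step (or trigger an alert) when FA flags a problem.
# Assumes `result` comes from client.failure_analysis.run(...) as in the example above.
# The threshold of 3 is an arbitrary example value, not a Valiqor default.
SEVERITY_THRESHOLD = 3

if result.summary.should_alert or result.summary.overall_severity >= SEVERITY_THRESHOLD:
    print("Failure Analysis flagged issues; failing the check.")
    sys.exit(1)

print("Failure Analysis passed.")
```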
You can pass multiple input/output pairs in a single call:
```python
result = client.failure_analysis.run(
    dataset=[
        {
            "input": "What is the capital of France?",
            "output": "The capital of France is Paris.",
            "context": ["The capital of France is Paris."],
        },
        {
            "input": "Summarize the article about climate change.",
            "output": "The article discusses economic policy in the 1990s.",
            "context": [
                "Climate change is causing rising sea levels worldwide.",
                "The Paris Agreement aims to limit warming to 1.5°C.",
            ],
        },
        {
            "input": "What medications interact with warfarin?",
            "output": "Warfarin interacts with aspirin and ibuprofen.",
            "context": [
                "Warfarin interacts with aspirin, ibuprofen, and many antibiotics.",
                "Always consult a healthcare provider for drug interactions.",
            ],
        },
    ]
)

print(f"Items analyzed: {result.summary.total_items}")
print(f"Items with failures: {result.summary.items_with_failures}")
print(f"Items all passed: {result.summary.items_all_passed}")
```
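When you analyze a batch, you will usually want to know which dataset entry each failure belongs to. The sketch below groups failing tags by item; note that `tag.item_index` is a hypothetical attribute name used only for illustration, since this quickstart documents just the tag fields used earlier (`decision`, `bucket_name`, `subcategory_name`, `severity`). Check your SDK's tag model for the actual attribute.

```python
from collections import defaultdict

# Group failing tags by dataset item so each failure maps back to an input.
# NOTE: `tag.item_index` is an assumed field name for illustration only;
# the real attribute may differ in your Valiqor SDK version.
failures_by_item = defaultdict(list)
for tag in result.failure_tags:
    if tag.decision == "fail":
        failures_by_item[tag.item_index].append(tag)

for index, tags in sorted(failures_by_item.items()):
    print(f"Item {index}: {len(tags)} failure(s)")
    for tag in tags:
        print(f"  - [{tag.bucket_name}] {tag.subcategory_name} (severity {tag.severity}/5)")
```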
Once you’re comfortable with dataset mode, you can add auto-tracing to capture LLM calls in production and run Failure Analysis on traces:
```python
import valiqor.auto  # Auto-instruments OpenAI, Anthropic, LangChain

# Your normal LLM calls are now traced automatically

# Then run FA on the captured trace:
result = client.failure_analysis.run(trace_id="tr_abc123")
```
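For orientation, a rough end-to-end sketch might look like the following. The OpenAI client usage is standard, the model name is an arbitrary example, and `"tr_abc123"` is still the placeholder from above; looking up the id of a captured trace is covered in the Failure Analysis from traces guide, not this quickstart.

```python
import valiqor.auto  # auto-instruments supported LLM clients, as above
from openai import OpenAI

openai_client = OpenAI()

# A normal LLM call; with valiqor.auto imported, it should be captured as a trace.
openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

# "tr_abc123" is a placeholder: substitute the id of the trace captured above
# (see "Failure Analysis from traces" for how to look it up).
result = client.failure_analysis.run(trace_id="tr_abc123")
print(f"Should alert: {result.summary.should_alert}")
```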