Failure Analysis Models

`FARunResult`

The full result of a failure analysis run.

Field	Type	Description
`run_id`	`str`	Run identifier.
`status`	`str`	`"completed"`, `"processing"`, `"failed"`.
`mode`	`str`	`"minimal"` or `"full"`.
`input_type`	`str`	`"trace"`, `"json"`, or `"csv"`.
`feature_kind`	`str`	`"rag"`, `"agent"`, `"agentic_rag"`, `"generic_llm"`.
`summary`	`FASummary`	Aggregated summary statistics.
`failure_tags`	`List[FATag]`	All detected failure tags.
`eval_metrics`	`Optional[Dict[str, float]]`	Evaluation metric values (if `run_eval=True`).
`eval_run_id`	`Optional[str]`	Linked evaluation run ID.
`security_flags`	`Optional[Dict[str, str]]`	Per-category security flags (if `run_security=True`). Values are `"fail"` or `"pass"` keyed by S-category code (e.g. `{"S1": "pass", "S9": "fail"}`).
`security_batch_id`	`Optional[str]`	Linked security batch ID.
`detectors_run`	`List[str]`	Which detectors were executed.
`detectors_skipped`	`List[str]`	Which detectors were skipped.
`duration_ms`	`int`	Total processing time in milliseconds.
`tokens_used`	`int`	Total tokens consumed.
`created_at`	`Optional[str]`	ISO timestamp.
`inputs`	`List[FARunInput]`	Input items with per-item stats.

Property: `tags` is an alias for `failure_tags`.

Methods:

`get_item_tags(item_index: int) -> List[FATag]`: tags for a specific input item
`get_failed_items() -> List[FARunInput]`: items with at least one failure
`get_clean_items() -> List[FARunInput]`: items with no failures
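
The accessors above can be approximated with plain filtering over `failure_tags`. The following is a minimal sketch, not the SDK's implementation: `Tag`, `tags_for_item`, and `failed_item_indexes` are hypothetical stand-ins that mirror only the documented `item_index` and `decision` fields, and the subcategory IDs are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tag:
    """Hypothetical stand-in for FATag (documented fields only)."""
    subcategory_id: str
    decision: str  # "pass", "fail", or "unsure"
    item_index: Optional[int] = None


def tags_for_item(tags: List[Tag], item_index: int) -> List[Tag]:
    """Assumed behaviour of get_item_tags: keep tags whose item_index matches."""
    return [t for t in tags if t.item_index == item_index]


def failed_item_indexes(tags: List[Tag]) -> List[int]:
    """Indexes of items that carry at least one tag with decision "fail"."""
    return sorted(
        {t.item_index for t in tags
         if t.decision == "fail" and t.item_index is not None}
    )


# Placeholder data; real tags come from FARunResult.failure_tags.
tags = [
    Tag("sub_a", "fail", 0),
    Tag("sub_b", "pass", 0),
    Tag("sub_a", "pass", 1),
    Tag("sub_c", "unsure", 2),
]
item0 = tags_for_item(tags, 0)
failed = failed_item_indexes(tags)
```

Items whose only non-passing tags are `"unsure"` are not counted as failed in this sketch; whether the SDK treats them the same way is not stated here.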

`FASummary`

Aggregated failure summary.

Field	Type	Default	Description
`total_failures_detected`	`int`	`0`	Number of failure tags with `decision="fail"`.
`total_passes`	`int`	`0`	Checks that passed.
`total_uncertain`	`int`	`0`	Checks with `decision="unsure"`.
`overall_severity`	`float`	`0.0`	Aggregate severity (0.0–5.0).
`overall_confidence`	`float`	`0.0`	Aggregate confidence (0.0–1.0).
`primary_failure`	`Optional[str]`	`None`	ID of the most severe failure subcategory.
`primary_failure_name`	`Optional[str]`	`None`	Name of the primary failure.
`buckets_affected`	`List[str]`	`[]`	Bucket IDs with failures.
`should_alert`	`bool`	`False`	Whether this warrants an alert.
`should_gate_ci`	`bool`	`False`	Whether this should block CI.
`needs_human_review`	`bool`	`False`	Whether human review is recommended.
`total_items`	`int`	`0`	Total input items analyzed.
`items_with_failures`	`int`	`0`	Items with at least one failure.
`items_all_passed`	`int`	`0`	Items with all checks passed.
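
The boolean flags on `FASummary` lend themselves to a simple pipeline gate. The sketch below assumes a stand-in `Summary` dataclass that copies a few documented fields and defaults; `ci_exit_code` and `describe` are hypothetical helpers, not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Hypothetical stand-in for FASummary (subset of documented fields)."""
    total_failures_detected: int = 0
    overall_severity: float = 0.0
    should_alert: bool = False
    should_gate_ci: bool = False
    needs_human_review: bool = False


def ci_exit_code(summary: Summary) -> int:
    """Return 1 when the summary asks to block CI, otherwise 0."""
    return 1 if summary.should_gate_ci else 0


def describe(summary: Summary) -> str:
    """Build a one-line status message from the summary flags."""
    notes = []
    if summary.should_alert:
        notes.append("alert")
    if summary.needs_human_review:
        notes.append("needs human review")
    suffix = f" ({', '.join(notes)})" if notes else ""
    return (f"{summary.total_failures_detected} failure(s), "
            f"severity {summary.overall_severity:.1f}{suffix}")


blocking = Summary(total_failures_detected=3, overall_severity=4.2,
                   should_alert=True, should_gate_ci=True)
clean = Summary()
```

A CI script could pass `ci_exit_code(summary)` to `sys.exit` so that the job fails only when `should_gate_ci` is set.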

`FATag`

A single failure detection result.

Field	Type	Description
`tag_id`	`str`	Unique tag identifier.
`bucket_id`	`str`	Parent failure bucket ID.
`bucket_name`	`str`	Parent failure bucket name.
`subcategory_id`	`str`	Failure subcategory ID.
`subcategory_name`	`str`	Failure subcategory name.
`decision`	`str`	`"pass"`, `"fail"`, or `"unsure"`.
`severity`	`float`	Severity score (0.0–5.0).
`confidence`	`float`	Confidence score (0.0–1.0).
`detector_type_used`	`str`	`"deterministic"`, `"llm_judge"`, or `"hybrid"`.
`judge_rationale`	`Optional[str]`	LLM judge’s explanation (if applicable).
`scoring_breakdown`	`Optional[FAScoringBreakdown]`	Detailed scoring components.
`eval_metric_values`	`Dict[str, float]`	Related eval metric values.
`evidence_items`	`List[FAEvidenceItem]`	Supporting evidence.
`item_index`	`Optional[int]`	Which input item this tag applies to.
`is_reviewed`	`bool`	Whether this tag has been marked as reviewed. Default `False`.
`reviewed_at`	`Optional[str]`	ISO timestamp of when the tag was marked reviewed.
`issue_url`	`Optional[str]`	URL linking this tag to an external issue tracker.
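
One way to use the `decision`, `severity`, `confidence`, and `is_reviewed` fields together is to build a triage queue. This is a sketch under stated assumptions: `Tag` is a hypothetical stand-in for `FATag`, `triage` and its `min_confidence` threshold are illustrative choices, and the subcategory names are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tag:
    """Hypothetical stand-in for FATag (subset of documented fields)."""
    subcategory_name: str
    decision: str      # "pass", "fail", or "unsure"
    severity: float    # 0.0 to 5.0
    confidence: float  # 0.0 to 1.0
    is_reviewed: bool = False


def triage(tags: List[Tag], min_confidence: float = 0.5) -> List[Tag]:
    """Unreviewed failures above the confidence threshold, most severe first."""
    open_failures = [
        t for t in tags
        if t.decision == "fail"
        and not t.is_reviewed
        and t.confidence >= min_confidence
    ]
    # Sort by severity, then confidence, both descending.
    return sorted(open_failures, key=lambda t: (-t.severity, -t.confidence))


tags = [
    Tag("example_low", "fail", 1.5, 0.9),
    Tag("example_high", "fail", 4.0, 0.8),
    Tag("example_reviewed", "fail", 5.0, 0.9, is_reviewed=True),
    Tag("example_unsure", "unsure", 3.0, 0.4),
    Tag("example_weak", "fail", 4.5, 0.2),
]
queue = triage(tags)
```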

`FARunInput`

An input item with per-item failure statistics.

Field	Type	Description
`item_index`	`int`	Position in the dataset.
`input_text`	`str`	The input text.
`output_text`	`str`	The output text.
`input_preview`	`str`	Truncated input preview.
`output_preview`	`str`	Truncated output preview.
`context`	`Optional[List[str]]`	Context documents.
`tool_calls`	`Optional[List[Dict]]`	Tool call data.
`trace_id`	`Optional[str]`	Linked trace ID.
`failure_count`	`int`	Number of failures for this item.
`pass_count`	`int`	Number of passed checks.
`unsure_count`	`int`	Number of uncertain checks.
`max_severity`	`float`	Highest severity among failures.

Properties: `total_checks -> int`, `has_failures -> bool`.
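
The two properties are not defined further on this page. A plausible reading, shown here purely as an assumption, is that `total_checks` sums the three per-item counters and `has_failures` is true when `failure_count` is positive. `RunInput` is a hypothetical stand-in for `FARunInput`.

```python
from dataclasses import dataclass


@dataclass
class RunInput:
    """Hypothetical stand-in for FARunInput (counter fields only)."""
    item_index: int
    failure_count: int = 0
    pass_count: int = 0
    unsure_count: int = 0

    @property
    def total_checks(self) -> int:
        # Assumption: every check ends as exactly one of fail, pass, or unsure.
        return self.failure_count + self.pass_count + self.unsure_count

    @property
    def has_failures(self) -> bool:
        # Assumption: "unsure" results do not count as failures.
        return self.failure_count > 0


item = RunInput(item_index=0, failure_count=1, pass_count=4, unsure_count=2)
clean_item = RunInput(item_index=1, pass_count=7)
```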

`FABucket`

A failure bucket in the taxonomy.

Field	Type	Description
`bucket_id`	`str`	Bucket identifier.
`bucket_name`	`str`	Display name.
`description`	`str`	What this bucket covers.
`subcategories`	`List[FASubcategory]`	Subcategories within this bucket.

`FASubcategory`

A failure subcategory within a bucket.

Field	Type	Description
`subcategory_id`	`str`	Subcategory identifier.
`subcategory_name`	`str`	Display name.
`description`	`str`	What this subcategory detects.
`detection_approach`	`str`	`"deterministic"`, `"llm_judge"`, or `"hybrid"`.
`applies_to`	`List[str]`	Feature kinds this applies to.
`requires_retrieval`	`bool`	Whether retrieval context is needed.
`requires_tools`	`bool`	Whether tool call data is needed.
`minimal_mode_compatible`	`bool`	Whether this works in minimal mode.
`required_inputs`	`List[str]`	Required input fields.
`related_eval_metrics`	`List[str]`	Related evaluation metrics.
`related_security_categories`	`List[str]`	Related security categories.
`impact_score`	`int`	Impact severity (1–5).
`risk_category`	`str`	`"low"`, `"medium"`, `"high"`, `"critical"`.
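
The `applies_to`, `requires_retrieval`, `requires_tools`, and `minimal_mode_compatible` fields suggest how a client could predict which subcategories are eligible for a given input. The sketch below is one possible client-side filter, not the service's actual selection logic; `Subcategory` and `is_applicable` are hypothetical, and the IDs are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subcategory:
    """Hypothetical stand-in for FASubcategory (eligibility fields only)."""
    subcategory_id: str
    applies_to: List[str]
    requires_retrieval: bool = False
    requires_tools: bool = False
    minimal_mode_compatible: bool = True


def is_applicable(sub: Subcategory, feature_kind: str, minimal_mode: bool,
                  has_context: bool, has_tool_calls: bool) -> bool:
    """Check each documented requirement against what the input provides."""
    if feature_kind not in sub.applies_to:
        return False
    if minimal_mode and not sub.minimal_mode_compatible:
        return False
    if sub.requires_retrieval and not has_context:
        return False
    if sub.requires_tools and not has_tool_calls:
        return False
    return True


taxonomy = [
    Subcategory("sub_retrieval", ["rag", "agentic_rag"], requires_retrieval=True),
    Subcategory("sub_tools", ["agent", "agentic_rag"], requires_tools=True,
                minimal_mode_compatible=False),
    Subcategory("sub_generic", ["rag", "agent", "agentic_rag", "generic_llm"]),
]
eligible = [
    s.subcategory_id for s in taxonomy
    if is_applicable(s, "rag", minimal_mode=True,
                     has_context=True, has_tool_calls=False)
]
```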

`FARunListPage`

Paginated list of FA runs. Supports `len()` and iteration.

Field	Type
`items`	`List[FARunListItem]`
`total`	`int`
`page`	`int`
`page_size`	`int`
`has_more`	`bool`
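
Support for `len()` and iteration can be pictured as the page delegating to its `items` list. The classes below are hypothetical stand-ins written under that assumption; the real `FARunListPage` may behave differently, for example by reporting `total` rather than the size of the current page.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class RunListItem:
    """Hypothetical stand-in for FARunListItem (two fields only)."""
    run_id: str
    status: str


@dataclass
class RunListPage:
    """Hypothetical stand-in for FARunListPage."""
    items: List[RunListItem]
    total: int
    page: int
    page_size: int
    has_more: bool

    def __len__(self) -> int:
        # Assumption: len() counts the items on this page, not `total`.
        return len(self.items)

    def __iter__(self) -> Iterator[RunListItem]:
        return iter(self.items)


page = RunListPage(
    items=[RunListItem("run-1", "completed"), RunListItem("run-2", "processing")],
    total=5, page=1, page_size=2, has_more=True,
)
run_ids = [item.run_id for item in page]
```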

`FARunListItem`

Summary of an FA run in a list.

Field	Type	Description
`run_id`	`str`	Run identifier.
`status`	`str`	Run status.
`input_type`	`str`	Input type.
`minimal_mode`	`bool`	Whether minimal mode was used.
`dataset_item_count`	`int`	Number of items.
`created_at`	`str`	ISO timestamp.
`feature_kind`	`Optional[str]`	Feature kind.
`total_failures_detected`	`Optional[int]`	Failure count.
`overall_severity`	`Optional[float]`	Overall severity.

`FAInsightsSummary`

Aggregated failure insights over a time period.

Field	Type	Description
`total_runs`	`int`	Total FA runs.
`total_items_analyzed`	`int`	Total items analyzed.
`overall_failure_rate`	`float`	Failure rate (0.0–1.0).
`average_severity`	`float`	Average severity.
`period_days`	`int`	Analysis period in days.
`top_recurring_failures`	`List[FARecurringFailure]`	Most common failures.
`bucket_distribution`	`List[FABucketDistribution]`	Failure distribution by bucket.

`FATrends`

Failure trend data over time.

Field	Type
`period_days`	`int`
`project_name`	`Optional[str]`
`trend_data`	`List[FATrendDataPoint]`

`FAPlaygroundResult`

Result from the playground endpoint.

Field	Type	Description
`run_id`	`str`	Run identifier.
`summary`	`FASummary`	Failure summary.
`failure_tags`	`List[FATag]`	Detected failures.
`checks_passed`	`int`	Number of passed checks.
`security_flags`	`Dict[str, str]`	Per-category security flags. Values are `"fail"` or `"pass"` keyed by S-category code.
`duration_ms`	`int`	Processing time.
`playground_runs_today`	`int`	Runs used today.
`playground_limit_per_day`	`int`	Daily limit (10).
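
Two small computations follow directly from these fields: the remaining daily quota and the list of failing security categories. The helpers below are hypothetical and operate on plain values rather than on an SDK object.

```python
from typing import Dict, List


def remaining_playground_runs(runs_today: int, limit_per_day: int) -> int:
    """Runs left today, clamped at zero."""
    return max(limit_per_day - runs_today, 0)


def failing_categories(security_flags: Dict[str, str]) -> List[str]:
    """S-category codes whose flag is "fail", sorted as strings."""
    return sorted(code for code, flag in security_flags.items() if flag == "fail")


flags = {"S1": "pass", "S9": "fail"}
left = remaining_playground_runs(runs_today=7, limit_per_day=10)
failing = failing_categories(flags)
```

Because the codes are sorted as strings, `"S10"` would sort before `"S9"`; a numeric sort key is needed if display order matters.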

`FAJobStatus`

Status of an async FA job.

Field	Type	Description
`job_id`	`str`	Job identifier.
`job_type`	`str`	`"failure_analysis"`.
`status`	`str`	`"queued"`, `"running"`, `"completed"`, `"failed"`, `"cancelled"`.
`progress_percent`	`float`	Progress (0–100).
`current_item`	`int`	Current item being processed.
`total_items`	`int`	Total items.
`estimated_remaining_seconds`	`Optional[int]`	Estimated time remaining.
`error`	`Optional[str]`	Error message if failed.

Properties: `is_running`, `is_completed`, `is_failed` (all `bool`).
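
An async job is typically polled until it reaches a terminal status. The loop below is a sketch only: `JobStatus` is a hypothetical stand-in for `FAJobStatus`, `fetch_status` represents whatever client call returns the current status, and treating `"completed"`, `"failed"`, and `"cancelled"` as terminal is an assumption drawn from the status values listed above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: these documented statuses mean the job will not change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


@dataclass
class JobStatus:
    """Hypothetical stand-in for FAJobStatus (subset of documented fields)."""
    job_id: str
    status: str
    progress_percent: float = 0.0
    error: Optional[str] = None


def wait_for_job(fetch_status: Callable[[], JobStatus],
                 interval_s: float = 2.0,
                 max_polls: int = 100) -> JobStatus:
    """Poll until the job reaches a terminal status or max_polls is exhausted."""
    for _ in range(max_polls):
        job = fetch_status()
        if job.status in TERMINAL_STATUSES:
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job did not finish within {max_polls} polls")


# Fake fetcher that replays a fixed sequence instead of calling an API.
_states = iter([
    JobStatus("job-1", "queued"),
    JobStatus("job-1", "running", progress_percent=50.0),
    JobStatus("job-1", "completed", progress_percent=100.0),
])
final = wait_for_job(lambda: next(_states), interval_s=0.0)
```

With a real client, the fake fetcher would be replaced by a call that returns the current `FAJobStatus`, and `estimated_remaining_seconds` could inform the polling interval.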

Related

FailureAnalysisClient: methods returning these models
Evaluation Models: eval data models
Security Models: security data models