What LLMs does Valiqor use for evaluation?
Valiqor uses a state-of-the-art LLM as the default judge for all LLM-based evaluation metrics (e.g. hallucination, coherence, factual_accuracy). You can also bring your own OpenAI API key so that judge calls use your own quota. Pass it at any of these levels (highest priority wins):

- Method parameter — `client.eval.evaluate(dataset=..., openai_api_key="sk-...")`
- Client constructor — `ValiqorClient(api_key="vq_...", openai_api_key="sk-...")`
- Environment variable — `VALIQOR_OPENAI_API_KEY`
- Config file — `openai_api_key` in `.valiqorrc`
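The precedence among these four levels can be illustrated with a small self-contained sketch. It mirrors the documented order only; it is not the SDK's actual resolution code:

```python
import os

def resolve_openai_key(method_param=None, ctor_param=None, config_file_value=None):
    """Return the OpenAI key from the highest-priority source that sets one.

    Documented order: method param > constructor param > env var > config file.
    """
    for candidate in (
        method_param,                              # client.eval.evaluate(..., openai_api_key=...)
        ctor_param,                                # ValiqorClient(..., openai_api_key=...)
        os.environ.get("VALIQOR_OPENAI_API_KEY"),  # environment variable
        config_file_value,                         # openai_api_key in .valiqorrc
    ):
        if candidate:
            return candidate
    return None
```

For example, a key passed directly to the method wins even when `VALIQOR_OPENAI_API_KEY` is also set in the environment.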
Is my data stored? For how long?
- Evaluation results, traces, and analysis data are stored in the Valiqor backend database for your team to review in the dashboard.
- OpenAI API keys provided via BYOK (Bring Your Own Key) are never stored — they are used only for the duration of the request and then discarded.
Can I use Valiqor in CI/CD pipelines?
Yes. Both the SDK and CLI support fully headless, non-interactive usage, via any of:

- CLI login with credentials
- Environment variables (no login needed)
- SDK with an explicit key

All configuration can be set non-interactively with `valiqor config set key=value` or through environment variables (`VALIQOR_API_KEY`, `VALIQOR_PROJECT_NAME`, `VALIQOR_TRACE_DIR`, `VALIQOR_SCAN_DIR`, `VALIQOR_BACKEND_URL`, `VALIQOR_ENVIRONMENT`).
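For the environment-variable route, the variables named above can be exported in the CI job's configuration. The values here are placeholders, not real keys:

```shell
# CI environment — no interactive login required.
export VALIQOR_API_KEY="vq_..."          # maps to config key api_key
export VALIQOR_PROJECT_NAME="my-project" # maps to config key project_name (placeholder value)
export VALIQOR_ENVIRONMENT="ci"          # maps to config key environment (placeholder value)
```

With these set, SDK and CLI commands in the pipeline pick up the configuration without any login step.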
What's the difference between Evaluation and Failure Analysis?
| | Evaluation | Failure Analysis |
|---|---|---|
| Purpose | Score LLM output quality on specific metrics | Find why your AI app fails — root causes, severity, evidence |
| Access | client.eval | client.failure_analysis |
| Output | Per-item metric scores (0–1) | Failure buckets, root-cause evidence, severity (0–5), confidence (0–1) |
| Metrics | coherence, factual_accuracy, hallucination, etc. | Automatic — failure taxonomy with 30+ subcategories |
| Approach | LLM-as-judge per metric | Multi-signal analysis |
| CLI | valiqor eval run | valiqor fa run |
How do I trace a multi-step agent?
Three approaches, from zero-config to fully manual:

- Zero-config auto-instrumentation
- Decorator-based
- Manual spans

See Tracing AI Apps for the full guide.
Which Python versions are supported?
Python 3.9+ — tested on Python 3.9, 3.10, 3.11, and 3.12.
Which LLM providers does auto-instrumentation support?
The autolog() function (or `import valiqor.auto`) auto-instruments these providers:

| Provider | Minimum Version |
|---|---|
| OpenAI | ≥ 1.0.0 |
| Anthropic | ≥ 0.18.0 |
| LangChain / LangGraph | ≥ 0.1.0 |
| Ollama | — |
| Agno | — |

You can also enable specific providers only. For providers without auto-instrumentation, use the @trace_workflow and @trace_function decorators.
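Conceptually, such a decorator wraps a function and records one span per call. A standalone sketch of the idea, deliberately not Valiqor's implementation or import path:

```python
import functools
import time

TRACE = []  # collected spans: (function_name, duration_seconds)

def trace_function(fn):
    """Record one span per call — conceptually what a tracing decorator does."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            TRACE.append((fn.__name__, time.perf_counter() - start))
    return wrapper

@trace_function
def summarize(text):
    return text[:10]

summarize("hello world, this is a long input")
print(TRACE[0][0])  # → summarize
```

The span is recorded in a `finally` block so that failing calls are still captured, which is typical behavior for tracing decorators.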
What are the core dependencies?
The base `valiqor` package requires:

- requests >= 2.31.0
- httpx >= 0.25.0
- gitingest >= 0.1.0
How does config resolution work?
The SDK and CLI resolve configuration values in this order (highest priority first):

- Constructor / method parameters — `ValiqorClient(api_key="vq_...")`
- Environment variables — `VALIQOR_API_KEY`, `VALIQOR_PROJECT_NAME`, etc.
- Local config file — `.valiqorrc` in the project root
- Global credentials (CLI only) — `~/.valiqor/credentials.json`
- Defaults — built-in defaults

Global credentials (`~/.valiqor/credentials.json`) are loaded by the CLI only. The SDK's get_config() function resolves from environment variables and `.valiqorrc` but does not read global credentials.

Supported environment variables:

| Variable | Config Key |
|---|---|
| VALIQOR_API_KEY | api_key |
| VALIQOR_PROJECT_NAME | project_name |
| VALIQOR_OPENAI_API_KEY | openai_api_key |
| VALIQOR_TRACE_DIR | trace_dir |
| VALIQOR_SCAN_DIR | scan_dir |
| VALIQOR_BACKEND_URL | backend_url |
| VALIQOR_ENVIRONMENT | environment |
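The variable-to-key mapping above is mechanical; a small sketch of how such a table translates environment variables into a config dict (an illustration, not the SDK's get_config()):

```python
import os

# Mapping taken directly from the table above.
ENV_TO_KEY = {
    "VALIQOR_API_KEY": "api_key",
    "VALIQOR_PROJECT_NAME": "project_name",
    "VALIQOR_OPENAI_API_KEY": "openai_api_key",
    "VALIQOR_TRACE_DIR": "trace_dir",
    "VALIQOR_SCAN_DIR": "scan_dir",
    "VALIQOR_BACKEND_URL": "backend_url",
    "VALIQOR_ENVIRONMENT": "environment",
}

def config_from_env(environ=os.environ):
    """Collect config keys only for the variables that are actually set."""
    return {key: environ[var] for var, key in ENV_TO_KEY.items() if var in environ}
```

For example, `config_from_env({"VALIQOR_API_KEY": "vq_..."})` yields `{"api_key": "vq_..."}`; unset variables simply produce no entry, so lower-priority sources can fill them in.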
What is the maximum dataset size?
1,000 items per request. This applies to `evaluate()`, `audit()`, and `failure_analysis.run()`.

If your dataset is larger, split it into batches or use the async API for better throughput.

Trial users (email not verified) are additionally limited to 25 rows per run and 3 total runs. Verify your email with `valiqor verify` to remove trial limits.
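Splitting an oversized dataset into request-sized batches is straightforward; a sketch using the documented 1,000-item cap (submitting each batch to the SDK is left out):

```python
MAX_ITEMS = 1_000  # per-request limit documented above

def batches(dataset, size=MAX_ITEMS):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(dataset), size):
        yield dataset[start:start + size]

# Each slice stays under the per-request cap:
dataset = list(range(2_500))
sizes = [len(b) for b in batches(dataset)]
print(sizes)  # → [1000, 1000, 500]
```

Each yielded batch can then be passed to a single `evaluate()` call, or to the async API if you want the batches in flight concurrently.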
How many API keys can I create?
Each user can have up to 5 API keys. Manage keys with: