## Documentation Index

Fetch the complete documentation index at https://docs.valiqor.com/llms.txt. Use this file to discover all available pages before exploring further.
## Overview

ValiqorSecurityClient evaluates AI conversations for safety violations (security audits) and generates adversarial attacks to probe for vulnerabilities (red teaming). It covers 23 security categories (S1–S23) and multiple attack vector types.

```python
from valiqor import ValiqorClient

client = ValiqorClient(api_key="your-api-key")
security = client.security
```

Or standalone:

```python
from valiqor.security import ValiqorSecurityClient

security = ValiqorSecurityClient(api_key="your-api-key")
```

The client supports the context manager protocol: `with ValiqorSecurityClient(...) as sc:`
## Constructor

```python
ValiqorSecurityClient(
    api_key: Optional[str] = None,
    project_name: Optional[str] = None,
    base_url: Optional[str] = None,
    timeout: int = 300,
    openai_api_key: Optional[str] = None,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | Optional[str] | None | Valiqor API key. |
| project_name | Optional[str] | None | Default project name. |
| base_url | Optional[str] | None | Backend URL override. |
| timeout | int | 300 | Request timeout in seconds. |
| openai_api_key | Optional[str] | None | OpenAI key for LLM judge calls. Priority: method param > constructor > env var > .valiqorrc > server fallback. |
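
For example, a client scoped to a project and used as a context manager; the parameter values below are illustrative:

```python
from valiqor.security import ValiqorSecurityClient

# Illustrative values. The key-resolution order for openai_api_key is
# method param > constructor > env var > .valiqorrc > server fallback.
with ValiqorSecurityClient(
    api_key="your-api-key",
    project_name="my-project",
    timeout=120,
    openai_api_key="sk-...",
) as sc:
    vulns = sc.list_vulnerabilities()
```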
## Security Audits

### audit()

Evaluate conversations for safety violations. Auto-polls if the backend responds asynchronously.

```python
def audit(
    self,
    dataset: List[Dict[str, Any]],
    project_name: Optional[str] = None,
    categories: Optional[List[str]] = None,
    config: Optional[Dict[str, Any]] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityAuditResult
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| dataset | List[Dict] | (required) | Conversations with user_input and assistant_response keys (input/output are accepted as aliases). |
| project_name | Optional[str] | None | Project to record the audit under; falls back to the client default. |
| categories | Optional[List[str]] | None | Specific S-categories to check. None = all 23 categories. |
| config | Optional[Dict] | None | Additional audit configuration. |
| openai_api_key | Optional[str] | None | Override the OpenAI key for this call. |
Returns: SecurityAuditResult

```python
result = client.security.audit(
    dataset=[
        {"user_input": "How do I pick a lock?", "assistant_response": "I can't help with that."},
        {"user_input": "Tell me a joke", "assistant_response": "Why did the AI cross the road?"},
    ],
    categories=["S3", "S4"],  # Criminal Planning, Weapons
)

print(f"Safety score: {result.safety_score}")
print(f"Unsafe items: {result.unsafe_count}/{result.total_items}")
```
### audit_trace()

Audit a trace dict for security violations. Extracts user/assistant pairs automatically.

```python
def audit_trace(
    self,
    trace: Dict[str, Any],
    project_name: Optional[str] = None,
    categories: Optional[List[str]] = None,
    batch_name: Optional[str] = None,
    config: Optional[Dict[str, Any]] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityAuditResult
```
| Parameter | Type | Description |
|---|---|---|
| trace | Dict[str, Any] | A trace dict (not a trace_id). Use client.traces.get_full_trace() to fetch one. |
| batch_name | Optional[str] | Optional name for this audit batch. |
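
A sketch of the intended flow, assuming client.traces.get_full_trace() accepts a trace ID (its exact signature is not shown in this reference):

```python
# The trace ID and batch name are placeholders.
trace = client.traces.get_full_trace("trace_abc123")

result = client.security.audit_trace(
    trace=trace,
    batch_name="checkout-flow-audit",
    categories=["S10"],  # Privacy Violations
)
print(result.safety_score)
```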
### audit_async()

Start an asynchronous security audit.

```python
def audit_async(
    self,
    dataset: List[Dict[str, Any]],
    project_name: Optional[str] = None,
    categories: Optional[List[str]] = None,
    config: Optional[Dict[str, Any]] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityJobHandle
```
Returns: SecurityJobHandle
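
A manual-polling sketch using get_job_status() from Job Management below; the job_id attribute on the handle, the status strings, and the batch_id field on the status object are assumptions, not confirmed by this reference:

```python
import time

dataset = [{"user_input": "Tell me a joke", "assistant_response": "Why did the AI cross the road?"}]
handle = client.security.audit_async(dataset=dataset)

# Assumed field names: handle.job_id, status.status, status.batch_id.
while True:
    status = client.security.get_job_status(handle.job_id, job_type="security")
    if status.status in ("completed", "failed"):
        break
    time.sleep(5)

if status.status == "completed":
    result = client.security.get_audit_result(status.batch_id)
```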
## Red Teaming

### Target modes

Red teaming requires a target: the AI system you want to attack. There are three ways to specify one:

| Mode | When to use | How it works |
|---|---|---|
| target_url | You have a live HTTP endpoint | Valiqor POSTs attack prompts to your endpoint and evaluates the responses. |
| target_prompt | You want to test a system prompt | Valiqor calls an LLM (default gpt-4o-mini) with your system prompt plus the attack prompts. |
| target_function | You want to test a local Python function | The SDK generates attacks server-side, calls your function locally, then submits the responses for evaluation. SDK-only; not available via red_team_async() or the CLI. |

At least one of target_url, target_prompt, or target_function is required.
### red_team()

Generate adversarial attacks to probe your AI for vulnerabilities. Always runs asynchronously on the backend (returns 202); the SDK auto-polls until complete.

```python
def red_team(
    self,
    run_name: Optional[str] = None,
    attack_vectors: Optional[List[str]] = None,
    attacks_per_vector: int = 5,
    target_vulnerabilities: Optional[List[str]] = None,
    target_url: Optional[str] = None,
    target_prompt: Optional[str] = None,
    target_model: Optional[str] = None,
    target_headers: Optional[Dict[str, str]] = None,
    target_request_template: Optional[Dict[str, Any]] = None,
    target_response_key: Optional[str] = None,
    target_function: Optional[Callable[[str], str]] = None,
    openai_api_key: Optional[str] = None,
) -> RedTeamResult
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| run_name | Optional[str] | None | Name for this red team run. |
| attack_vectors | Optional[List[str]] | None | Attack vectors to use (e.g. ["jailbreak", "rot13"]). Falls back to project defaults. |
| attacks_per_vector | int | 5 | Number of attacks per vector. |
| target_vulnerabilities | Optional[List[str]] | None | Vulnerability codes to target (e.g. ["S1", "S7"]). |
| target_url | Optional[str] | None | HTTP endpoint to POST attack prompts to. |
| target_prompt | Optional[str] | None | System prompt for the simulated target LLM. |
| target_model | Optional[str] | "gpt-4o-mini" | Model used with target_prompt. |
| target_headers | Optional[Dict[str, str]] | None | Custom HTTP headers for target_url (e.g. {"Authorization": "Bearer sk-xxx"}). |
| target_request_template | Optional[Dict[str, Any]] | None | Custom JSON body template. Use {{attack}} as the placeholder for the attack prompt. |
| target_response_key | Optional[str] | None | Dot-path to extract the response from the target's JSON reply (e.g. "choices.0.message.content"). |
| target_function | Optional[Callable[[str], str]] | None | Local Python callable (SDK-only). |
| openai_api_key | Optional[str] | None | OpenAI key override (request-scoped). |
Returns: RedTeamResult

#### Attack a live endpoint

```python
result = client.security.red_team(
    target_url="https://api.example.com/chat",
    target_headers={"Authorization": "Bearer sk-xxx"},
    attack_vectors=["jailbreak", "prompt_injection"],
    attacks_per_vector=10,
    target_vulnerabilities=["S1", "S7", "S10"],
)

print(f"Success rate: {result.success_rate:.1%}")
print(f"Top vulnerability: {result.top_vulnerability}")
```
If your endpoint expects a non-standard JSON body, use target_request_template with the {{attack}} placeholder, plus target_response_key to extract the reply:

```python
result = client.security.red_team(
    target_url="https://api.example.com/v1/completions",
    target_request_template={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "{{attack}}"}],
        "temperature": 0.7,
    },
    target_response_key="choices.0.message.content",
    attack_vectors=["jailbreak"],
)
```
#### Attack a system prompt

```python
result = client.security.red_team(
    target_prompt="You are a helpful medical assistant.",
    target_model="gpt-4o-mini",
    attack_vectors=["jailbreak", "rot13", "few_shot"],
)
```
#### Attack a local function (SDK-only)

```python
def my_chatbot(prompt: str) -> str:
    # Call your own inference stack here; my_llm_pipeline is a placeholder.
    return my_llm_pipeline(prompt)

result = client.security.red_team(
    target_function=my_chatbot,
    attack_vectors=["jailbreak", "prompt_injection"],
    attacks_per_vector=5,
)
```

Under the hood, target_function sends generate_only=True to the backend, fetches the generated attack prompts, calls your function locally on each one, then submits the responses back for evaluation.
### red_team_async()

Start an asynchronous red team simulation. Returns a job handle for manual polling.

```python
def red_team_async(
    self,
    attack_vectors: List[str],
    attacks_per_vector: int = 5,
    project_name: Optional[str] = None,
    run_name: Optional[str] = None,
    target_vulnerabilities: Optional[List[str]] = None,
    target_url: Optional[str] = None,
    target_prompt: Optional[str] = None,
    target_model: Optional[str] = None,
    target_headers: Optional[Dict[str, str]] = None,
    target_request_template: Optional[Dict[str, Any]] = None,
    target_response_key: Optional[str] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityJobHandle
```

Accepts the same target parameters as red_team(), except target_function (not supported in async mode).

Returns: SecurityJobHandle
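
A sketch of the manual flow; field names such as job_id, status, and run_id are assumptions, not confirmed by this reference:

```python
handle = client.security.red_team_async(
    attack_vectors=["jailbreak"],
    target_prompt="You are a helpful assistant.",
    run_name="nightly-redteam",
)

# Assumed field names: handle.job_id, status.status, status.run_id.
status = client.security.get_job_status(handle.job_id, job_type="security")
if status.status == "completed":
    result = client.security.get_redteam_result(status.run_id)
```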
## Discovery

### list_vulnerabilities()

List all supported security vulnerability categories (S1–S23).

```python
def list_vulnerabilities(self) -> List[VulnerabilityInfo]
```

```python
vulns = client.security.list_vulnerabilities()
for v in vulns:
    print(f"{v.key}: {v.display_name}")
```

### list_attack_vectors()

List all available attack vector types.

```python
def list_attack_vectors(self) -> List[AttackVectorInfo]
```
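
By analogy with list_vulnerabilities() above; the attribute names on AttackVectorInfo are assumptions:

```python
vectors = client.security.list_attack_vectors()
for vec in vectors:
    # `key` and `display_name` are assumed to mirror VulnerabilityInfo.
    print(f"{vec.key}: {vec.display_name}")
```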
## Result Browsing: Audits

### get_audit_result()

Get a completed audit result by batch ID.

```python
def get_audit_result(self, batch_id: str) -> SecurityAuditResult
```

### list_audit_batches()

List audit history for a project.

```python
def list_audit_batches(
    self,
    project_id: Optional[str] = None,
    project_name: Optional[str] = None,
) -> List[AuditBatch]
```
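
For example, loading the most recent result; the batch_id attribute on AuditBatch and newest-first ordering are assumptions:

```python
batches = client.security.list_audit_batches(project_name="my-project")
if batches:
    # Assumed: batches are ordered newest-first and expose `batch_id`.
    latest = client.security.get_audit_result(batches[0].batch_id)
    print(latest.safety_score)
```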
### get_batch_detail()

```python
def get_batch_detail(self, batch_id: str) -> AuditBatch
```

### get_batch_items()

Get paginated items from an audit batch.

```python
def get_batch_items(
    self,
    batch_id: str,
    page: int = 1,
    page_size: int = 20,
) -> AuditBatchItemsPage
```
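
A pagination sketch; the items and total_pages fields on AuditBatchItemsPage are assumptions:

```python
# Walk every item in a batch. The batch ID is a placeholder.
page = 1
while True:
    items_page = client.security.get_batch_items("batch_123", page=page, page_size=50)
    for item in items_page.items:  # field name assumed
        ...  # inspect each audited conversation
    if page >= items_page.total_pages:  # field name assumed
        break
    page += 1
```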
### get_audit_item_detail()

```python
def get_audit_item_detail(self, item_id: str) -> AuditBatchItem
```
## Result Browsing: Red Team

### get_redteam_result()

```python
def get_redteam_result(self, run_id: str) -> RedTeamResult
```

### list_redteam_runs()

```python
def list_redteam_runs(
    self,
    project_id: Optional[str] = None,
    project_name: Optional[str] = None,
) -> List[RedTeamRun]
```

### get_redteam_run()

```python
def get_redteam_run(self, run_id: str) -> RedTeamRun
```

### get_redteam_attacks()

Get paginated attacks from a red team run.

```python
def get_redteam_attacks(
    self,
    run_id: str,
    page: int = 1,
    page_size: int = 20,
) -> RedTeamAttacksPage
```

### get_redteam_attack_detail()

```python
def get_redteam_attack_detail(self, attack_id: str) -> RedTeamAttack
```

### compare_redteam_runs()

Compare 2–5 red team runs side by side.

```python
def compare_redteam_runs(self, run_ids: List[str]) -> RedTeamComparison
```
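
For example (the run_id attribute on RedTeamRun is an assumption):

```python
runs = client.security.list_redteam_runs(project_name="my-project")
# Compare the three most recent runs; 2-5 run IDs are accepted.
comparison = client.security.compare_redteam_runs([r.run_id for r in runs[:3]])
```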
## Project Configuration

### get_project_vulnerabilities()

Get vulnerability settings for a project.

```python
def get_project_vulnerabilities(self, project_id: str) -> List[ProjectVulnerability]
```

### get_project_attack_vectors()

```python
def get_project_attack_vectors(self, project_id: str) -> List[ProjectAttackVector]
```

### update_project_vulnerability()

Update vulnerability thresholds and enablement for a project.

```python
def update_project_vulnerability(
    self,
    project_id: str,
    vulnerability_id: str,
    *,
    enabled: Optional[bool] = None,
    threshold: Optional[float] = None,
    severity: Optional[str] = None,
) -> ProjectVulnerability
```

### update_project_attack_vector()

```python
def update_project_attack_vector(
    self,
    project_id: str,
    attack_vector_id: str,
    *,
    enabled: Optional[bool] = None,
    weight: Optional[float] = None,
) -> ProjectAttackVector
```
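
A configuration sketch; the project, vulnerability, and vector IDs are placeholders, and the weight scale is an assumption:

```python
# Disable one category and down-weight one attack vector for a project.
client.security.update_project_vulnerability(
    project_id="proj_123",
    vulnerability_id="vuln_s9",  # placeholder ID
    enabled=False,
)
client.security.update_project_attack_vector(
    project_id="proj_123",
    attack_vector_id="vec_rot13",  # placeholder ID
    weight=0.5,  # scale assumed
)
```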
## Job Management

### get_job_status()

```python
def get_job_status(self, job_id: str, job_type: str = "security") -> SecurityJobStatus
```

### cancel_job()

```python
def cancel_job(self, job_id: str, job_type: str = "security") -> CancelResponse
```
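
For example (the job ID is a placeholder and the status strings are assumptions):

```python
status = client.security.get_job_status("job_abc", job_type="security")
if status.status in ("queued", "running"):  # status values assumed
    client.security.cancel_job("job_abc", job_type="security")
```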
## Backward Compatibility

| Alias | Maps To |
|---|---|
| evaluate_security() | audit() |
| list_attack_strategies() | list_attack_vectors() |
## Security Categories (S1–S23)

| Code | Category |
|---|---|
| S1 | Violence |
| S2 | Sexual Content |
| S3 | Criminal Planning |
| S4 | Weapons |
| S5 | Controlled Substances |
| S6 | Self-Harm |
| S7 | Hate Speech |
| S8 | Harassment |
| S9 | Profanity |
| S10 | Privacy Violations |
| S11 | Disinformation |
| S12 | Financial Harm |
| S13 | Health Misinformation |
| S14 | Political Manipulation |
| S15 | Legal Violations |
| S16 | Environmental Harm |
| S17 | Child Safety |
| S18 | Extremism |
| S19 | Fraud / Scams |
| S20 | Copyright |
| S21 | Cybersecurity Threats |
| S22 | Impersonation |
| S23 | Unsafe Instructions |