
Documentation Index

Fetch the complete documentation index at: https://docs.valiqor.com/llms.txt

Use this file to discover all available pages before exploring further.

Overview

ValiqorSecurityClient evaluates AI conversations for safety violations (security audits) and generates adversarial attacks to probe vulnerabilities (red teaming). It covers 23 security categories (S1–S23) and multiple attack vector types.
from valiqor import ValiqorClient

client = ValiqorClient(api_key="your-api-key")
security = client.security

Or standalone:
from valiqor.security import ValiqorSecurityClient

security = ValiqorSecurityClient(api_key="your-api-key")
Supports the context manager protocol: with ValiqorSecurityClient(...) as sc:
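For short-lived scripts, the context manager form ensures the client is cleaned up on exit. A minimal sketch (assuming exit simply releases the client's underlying HTTP resources):

from valiqor.security import ValiqorSecurityClient

# Client is usable inside the block and closed automatically on exit.
with ValiqorSecurityClient(api_key="your-api-key") as sc:
    vulns = sc.list_vulnerabilities()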

Constructor

ValiqorSecurityClient(
    api_key: Optional[str] = None,
    project_name: Optional[str] = None,
    base_url: Optional[str] = None,
    timeout: int = 300,
    openai_api_key: Optional[str] = None,
)
Parameter | Type | Default | Description
api_key | Optional[str] | None | Valiqor API key.
project_name | Optional[str] | None | Default project name.
base_url | Optional[str] | None | Backend URL override.
timeout | int | 300 | Request timeout in seconds.
openai_api_key | Optional[str] | None | OpenAI key for LLM judge calls. Priority: method param > constructor > env var > .valiqorrc > server fallback.

Security Audits

audit()

Evaluate conversations for safety violations. If the backend returns an async job, the SDK auto-polls until it completes.
def audit(
    self,
    dataset: List[Dict[str, Any]],
    project_name: Optional[str] = None,
    categories: Optional[List[str]] = None,
    config: Optional[Dict[str, Any]] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityAuditResult
Parameter | Type | Default | Description
dataset | List[Dict] | required | Conversations with user_input and assistant_response (also accepts input/output as aliases).
categories | Optional[List[str]] | None | Specific S-categories to check. None = all 23 categories.
config | Optional[Dict] | None | Additional audit configuration.
openai_api_key | Optional[str] | None | Override OpenAI key for this call.
Returns: SecurityAuditResult
result = client.security.audit(
    dataset=[
        {"user_input": "How do I pick a lock?", "assistant_response": "I can't help with that."},
        {"user_input": "Tell me a joke", "assistant_response": "Why did the AI cross the road?"}
    ],
    categories=["S3", "S4"]  # Criminal Planning, Weapons
)
print(f"Safety score: {result.safety_score}")
print(f"Unsafe items: {result.unsafe_count}/{result.total_items}")

audit_trace()

Audit a trace dict for security violations. Extracts user/assistant pairs automatically.
def audit_trace(
    self,
    trace: Dict[str, Any],
    project_name: Optional[str] = None,
    categories: Optional[List[str]] = None,
    batch_name: Optional[str] = None,
    config: Optional[Dict[str, Any]] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityAuditResult
Parameter | Type | Description
trace | Dict[str, Any] | A trace dict (not a trace_id). Use client.traces.get_full_trace() to fetch.
batch_name | Optional[str] | Optional name for this audit batch.
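For example, fetching a trace and auditing it in one pass (the trace ID below is a placeholder):

# Fetch the full trace dict, then audit its extracted user/assistant pairs.
trace = client.traces.get_full_trace("your-trace-id")
result = client.security.audit_trace(
    trace=trace,
    categories=["S7", "S10"],  # Hate Speech, Privacy Violations
    batch_name="nightly-trace-audit",
)
print(f"Safety score: {result.safety_score}")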

audit_async()

Start an asynchronous security audit.
def audit_async(
    self,
    dataset: List[Dict[str, Any]],
    project_name: Optional[str] = None,
    categories: Optional[List[str]] = None,
    config: Optional[Dict[str, Any]] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityJobHandle
Returns: SecurityJobHandle
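A minimal polling sketch, assuming the handle exposes a job_id and that SecurityJobStatus carries a status string with terminal values like "completed" (these field names are not documented here, so treat them as assumptions):

import time

handle = client.security.audit_async(
    dataset=[{"user_input": "Tell me a joke", "assistant_response": "Why did the AI cross the road?"}],
)
# Poll until the backend reports a terminal state (assumed status values).
while True:
    status = client.security.get_job_status(handle.job_id)
    if status.status in ("completed", "failed"):
        break
    time.sleep(5)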

Red Teaming

Target modes

Red teaming requires a target — the AI system you want to attack. There are three ways to specify one:
Mode | When to use | How it works
target_url | You have a live HTTP endpoint | Valiqor POSTs attack prompts to your endpoint and evaluates the responses.
target_prompt | You want to test a system prompt | Valiqor calls an LLM (default gpt-4o-mini) with your system prompt + attack prompts.
target_function | You want to test a local Python function | The SDK generates attacks server-side, calls your function locally, then submits responses for evaluation. SDK-only; not available via red_team_async() or the CLI.
At least one of target_url, target_prompt, or target_function is required.

red_team()

Generate adversarial attacks to probe your AI for vulnerabilities. Always runs asynchronously on the backend (HTTP 202); the SDK auto-polls until complete.
def red_team(
    self,
    run_name: Optional[str] = None,
    attack_vectors: Optional[List[str]] = None,
    attacks_per_vector: int = 5,
    target_vulnerabilities: Optional[List[str]] = None,
    target_url: Optional[str] = None,
    target_prompt: Optional[str] = None,
    target_model: Optional[str] = None,
    target_headers: Optional[Dict[str, str]] = None,
    target_request_template: Optional[Dict[str, Any]] = None,
    target_response_key: Optional[str] = None,
    target_function: Optional[Callable[[str], str]] = None,
    openai_api_key: Optional[str] = None,
) -> RedTeamResult
Parameter | Type | Default | Description
run_name | Optional[str] | None | Name for this red team run.
attack_vectors | Optional[List[str]] | None | Attack vectors to use (e.g. ["jailbreak", "rot13"]). Falls back to project defaults.
attacks_per_vector | int | 5 | Number of attacks per vector.
target_vulnerabilities | Optional[List[str]] | None | Vulnerability codes to target (e.g. ["S1", "S7"]).
target_url | Optional[str] | None | HTTP endpoint to POST attack prompts to.
target_prompt | Optional[str] | None | System prompt for the simulated target LLM.
target_model | Optional[str] | "gpt-4o-mini" | Model used with target_prompt.
target_headers | Optional[Dict[str, str]] | None | Custom HTTP headers for target_url (e.g. {"Authorization": "Bearer sk-xxx"}).
target_request_template | Optional[Dict[str, Any]] | None | Custom JSON body template. Use {{attack}} as a placeholder for the attack prompt.
target_response_key | Optional[str] | None | Dot-path to extract the response from the target's JSON reply (e.g. "choices.0.message.content").
target_function | Optional[Callable[[str], str]] | None | Local Python callable (SDK-only).
openai_api_key | Optional[str] | None | OpenAI key override (request-scoped).
Returns: RedTeamResult

Attack a live endpoint

result = client.security.red_team(
    target_url="https://api.example.com/chat",
    target_headers={"Authorization": "Bearer sk-xxx"},
    attack_vectors=["jailbreak", "prompt_injection"],
    attacks_per_vector=10,
    target_vulnerabilities=["S1", "S7", "S10"],
)
print(f"Success rate: {result.success_rate:.1%}")
print(f"Top vulnerability: {result.top_vulnerability}")

Custom request format

If your endpoint expects a non-standard JSON body, use target_request_template with the {{attack}} placeholder and target_response_key to extract the response:
result = client.security.red_team(
    target_url="https://api.example.com/v1/completions",
    target_request_template={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "{{attack}}"}],
        "temperature": 0.7,
    },
    target_response_key="choices.0.message.content",
    attack_vectors=["jailbreak"],
)

Attack a system prompt

result = client.security.red_team(
    target_prompt="You are a helpful medical assistant.",
    target_model="gpt-4o-mini",
    attack_vectors=["jailbreak", "rot13", "few_shot"],
)

Attack a local function (SDK-only)

def my_chatbot(prompt: str) -> str:
    return my_llm_pipeline(prompt)

result = client.security.red_team(
    target_function=my_chatbot,
    attack_vectors=["jailbreak", "prompt_injection"],
    attacks_per_vector=5,
)
With target_function, the SDK sends generate_only=True to the backend, fetches the generated attack prompts, calls your function locally for each one, then submits the responses back for evaluation.

red_team_async()

Start an asynchronous red team simulation. Returns a job handle for manual polling.
def red_team_async(
    self,
    attack_vectors: List[str],
    attacks_per_vector: int = 5,
    project_name: Optional[str] = None,
    run_name: Optional[str] = None,
    target_vulnerabilities: Optional[List[str]] = None,
    target_url: Optional[str] = None,
    target_prompt: Optional[str] = None,
    target_model: Optional[str] = None,
    target_headers: Optional[Dict[str, str]] = None,
    target_request_template: Optional[Dict[str, Any]] = None,
    target_response_key: Optional[str] = None,
    openai_api_key: Optional[str] = None,
) -> SecurityJobHandle
Accepts the same target parameters as red_team() except target_function (not supported in async mode).
Returns: SecurityJobHandle
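A sketch of manual polling, under the same assumptions about the handle and status fields as the audit_async() example:

handle = client.security.red_team_async(
    attack_vectors=["jailbreak", "prompt_injection"],
    target_prompt="You are a helpful medical assistant.",
)
# Check progress; poll until terminal, then fetch the run result.
status = client.security.get_job_status(handle.job_id)
print(status.status)  # assumed field name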

Discovery

list_vulnerabilities()

List all supported security vulnerability categories (S1–S23).
def list_vulnerabilities(self) -> List[VulnerabilityInfo]
vulns = client.security.list_vulnerabilities()
for v in vulns:
    print(f"{v.key}: {v.display_name}")

list_attack_vectors()

List all available attack vector types.
def list_attack_vectors(self) -> List[AttackVectorInfo]
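Mirroring the vulnerabilities example; the exact fields on AttackVectorInfo are not shown here, so the key attribute below is an assumption:

vectors = client.security.list_attack_vectors()
for v in vectors:
    print(v.key)  # e.g. "jailbreak", "rot13" (field name assumed)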

Result Browsing — Audits

get_audit_result()

Get a completed audit result by batch ID.
def get_audit_result(self, batch_id: str) -> SecurityAuditResult

list_audit_batches()

List audit history for a project.
def list_audit_batches(
    self,
    project_id: Optional[str] = None,
    project_name: Optional[str] = None,
) -> List[AuditBatch]

get_batch_detail()

def get_batch_detail(self, batch_id: str) -> AuditBatch

get_batch_items()

Get paginated items from an audit batch.
def get_batch_items(
    self,
    batch_id: str,
    page: int = 1,
    page_size: int = 20,
) -> AuditBatchItemsPage
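For example, walking a batch page by page (assuming the page object exposes an items list; the exact AuditBatchItemsPage fields are not documented here):

page = client.security.get_batch_items(batch_id="your-batch-id", page=1, page_size=50)
for item in page.items:  # assumed field name
    print(item)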

get_audit_item_detail()

def get_audit_item_detail(self, item_id: str) -> AuditBatchItem

Result Browsing — Red Team

get_redteam_result()

def get_redteam_result(self, run_id: str) -> RedTeamResult

list_redteam_runs()

def list_redteam_runs(
    self,
    project_id: Optional[str] = None,
    project_name: Optional[str] = None,
) -> List[RedTeamRun]

get_redteam_run()

def get_redteam_run(self, run_id: str) -> RedTeamRun

get_redteam_attacks()

Get paginated attacks from a red team run.
def get_redteam_attacks(
    self,
    run_id: str,
    page: int = 1,
    page_size: int = 20,
) -> RedTeamAttacksPage

get_redteam_attack_detail()

def get_redteam_attack_detail(self, attack_id: str) -> RedTeamAttack

compare_redteam_runs()

Compare 2–5 red team runs side by side.
def compare_redteam_runs(self, run_ids: List[str]) -> RedTeamComparison
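For example (the run IDs are placeholders; the fields exposed by RedTeamComparison are not documented here):

comparison = client.security.compare_redteam_runs(
    run_ids=["run-a", "run-b", "run-c"],
)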

Project Configuration

get_project_vulnerabilities()

Get vulnerability settings for a project.
def get_project_vulnerabilities(self, project_id: str) -> List[ProjectVulnerability]

get_project_attack_vectors()

def get_project_attack_vectors(self, project_id: str) -> List[ProjectAttackVector]

update_project_vulnerability()

Update vulnerability thresholds and enablement for a project.
def update_project_vulnerability(
    self,
    project_id: str,
    vulnerability_id: str,
    *,
    enabled: Optional[bool] = None,
    threshold: Optional[float] = None,
    severity: Optional[str] = None,
) -> ProjectVulnerability
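For example, enabling a category and tightening its threshold (the IDs are placeholders, and the 0-1 threshold scale is an assumption):

client.security.update_project_vulnerability(
    project_id="your-project-id",
    vulnerability_id="your-vulnerability-id",
    enabled=True,
    threshold=0.8,  # assumed 0-1 scale
)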

update_project_attack_vector()

def update_project_attack_vector(
    self,
    project_id: str,
    attack_vector_id: str,
    *,
    enabled: Optional[bool] = None,
    weight: Optional[float] = None,
) -> ProjectAttackVector

Job Management

get_job_status()

def get_job_status(self, job_id: str, job_type: str = "security") -> SecurityJobStatus

cancel_job()

def cancel_job(self, job_id: str, job_type: str = "security") -> CancelResponse
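For example, cancelling a job started with audit_async() or red_team_async() (the job ID is a placeholder):

client.security.cancel_job(job_id="your-job-id", job_type="security")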

Backward Compatibility

Alias | Maps to
evaluate_security() | audit()
list_attack_strategies() | list_attack_vectors()

Security Categories (S1–S23)

Code | Category
S1 | Violence
S2 | Sexual Content
S3 | Criminal Planning
S4 | Weapons
S5 | Controlled Substances
S6 | Self-Harm
S7 | Hate Speech
S8 | Harassment
S9 | Profanity
S10 | Privacy Violations
S11 | Disinformation
S12 | Financial Harm
S13 | Health Misinformation
S14 | Political Manipulation
S15 | Legal Violations
S16 | Environmental Harm
S17 | Child Safety
S18 | Extremism
S19 | Fraud / Scams
S20 | Copyright
S21 | Cybersecurity Threats
S22 | Impersonation
S23 | Unsafe Instructions