Configuration Examples
Complete YAML configuration examples for common LLM provider and database combinations, plus reference tables for configuration priority and mandatory versus optional settings.
OpenAI + PostgreSQL (Production Setup)
# REQUIRED: Environment variables
# export OPENAI_API_KEY="sk-your-key-here"
# export DB_PASSWORD="your-db-password"
database:
  connection_string: "postgresql://askrita_user:${DB_PASSWORD}@prod-db.company.com:5432/analytics"
  query_timeout: 30
  max_results: 1000
  cache_schema: true
  schema_refresh_interval: 3600
llm:
  provider: "openai"
  model: "gpt-4o"
  temperature: 0.1
  max_tokens: 4000
  timeout: 60
workflow:
  max_retries: 3
  steps:
    pii_detection: true
    parse_question: true
    get_unique_nouns: true
    generate_sql: true
    validate_and_fix_sql: true
    execute_sql: true
    format_results: true
    choose_and_format_visualization: true
    generate_followup_questions: true
# PII Detection for Production Security
pii_detection:
  enabled: true
  block_on_detection: true
  entities:
    - "PERSON"
    - "EMAIL_ADDRESS"
    - "PHONE_NUMBER"
    - "CREDIT_CARD"
    - "US_SSN"
  confidence_threshold: 0.4
  validate_sample_data: true
  audit_log_path: "/var/log/askrita/pii_audit.log"
prompts:
  generate_sql:
    system: |
      You are an expert SQL analyst. Generate valid, efficient SQL queries based on user questions.
      Always use proper joins and WHERE clauses to filter data appropriately.
    human: |
      Database schema: {schema}
      User question: {question}
      Generate SQL query:
  validate_sql:
    system: |
      You are a SQL validator. Check for syntax errors and optimize queries.
    human: |
      Validate and fix this SQL query: {sql_query}
      Error (if any): {error}
business_rules:
  result_limits:
    max_rows: 1000
    max_query_time: 30
logging:
  level: "INFO"
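The connection string in this example relies on a `${DB_PASSWORD}` placeholder. How the framework resolves such placeholders is not shown here, so the following is only a minimal sketch of that kind of substitution, assuming plain `${NAME}` syntax resolved from the process environment. The helper name `expand_env_placeholders` is hypothetical and not part of the framework.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_placeholders(value: str, env=None) -> str:
    """Replace ${NAME} placeholders with values from env (default: os.environ).

    Hypothetical helper for illustration; not the framework's own loader.
    Raises KeyError naming every missing variable rather than silently
    producing a connection string with an empty password.
    """
    env = os.environ if env is None else env
    missing = sorted({name for name in _PLACEHOLDER.findall(value) if name not in env})
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return _PLACEHOLDER.sub(lambda match: env[match.group(1)], value)


template = "postgresql://askrita_user:${DB_PASSWORD}@prod-db.company.com:5432/analytics"
# An explicit mapping is passed here so the example does not depend on the real environment.
resolved = expand_env_placeholders(template, env={"DB_PASSWORD": "s3cret"})
```

Failing early on a missing variable tends to give a clearer error than a later authentication failure from the database.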
Azure OpenAI + BigQuery (Enterprise Setup)
# REQUIRED: gcloud auth login or service account
# REQUIRED: Azure certificate authentication
database:
  connection_string: "bigquery://my-enterprise-project/analytics_dataset"
  bigquery_gcloud_cli_auth: true
  query_timeout: 60
  max_results: 10000
  cache_schema: true
  schema_refresh_interval: 7200
llm:
  provider: "azure_openai"
  model: "gpt-4o"
  azure_endpoint: "https://my-company-openai.openai.azure.com/"
  azure_deployment: "gpt-4o-deployment"
  api_version: "2025-04-01-preview"
  azure_tenant_id: "your-tenant-id"
  azure_client_id: "your-client-id"
  azure_certificate_path: "/path/to/company-cert.pem"
  temperature: 0.1
  max_tokens: 4000
workflow:
  max_retries: 3
  steps:
    parse_question: true
    get_unique_nouns: true
    generate_sql: true
    validate_and_fix_sql: true
    execute_sql: true
    format_results: true
    choose_and_format_visualization: true
    generate_followup_questions: true
prompts:
  generate_sql:
    system: "You are an expert SQL analyst."
    human: |
      Database schema: {schema}
      User question: {question}
      Generate SQL query:
  validate_sql:
    system: "You are a SQL validator."
    human: "Validate and fix this SQL query: {sql_query}"
  format_results:
    system: "You are a data analyst."
    human: "Question: {question}\nSQL: {sql_query}\nResults: {query_results}\nProvide a clear answer."
  choose_and_format_visualization:
    system: "You are a data visualization expert."
    human: "Question: {question}\nResults: {query_results}\nChoose a chart type and format the data."
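The prompt templates in these examples use brace placeholders such as `{question}`, `{sql_query}`, and `{query_results}`. As a rough illustration of how such a template could be checked and filled, here is a small sketch that assumes Python `str.format`-style rendering; the framework's actual rendering mechanism is not described here, and the sample values are invented.

```python
from string import Formatter


def placeholders(template: str) -> set:
    """Return the set of {name} fields used in a prompt template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Same text as the format_results "human" prompt; in YAML the double-quoted
# "\n" escapes become real newlines, as they do in this Python string.
human = "Question: {question}\nSQL: {sql_query}\nResults: {query_results}\nProvide a clear answer."

# Hypothetical values; in practice these would come from earlier workflow steps.
values = {
    "question": "How many orders shipped last week?",
    "sql_query": "SELECT COUNT(*) FROM orders",
    "query_results": "[(42,)]",
}

# Check for unfilled placeholders before rendering.
missing = placeholders(human) - set(values)
rendered = human.format(**values) if not missing else None
```

A check like this can catch a misspelled placeholder in a custom prompt before it reaches the LLM.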
Vertex AI + Snowflake (Google Cloud Setup)
# REQUIRED: gcloud auth login
# REQUIRED: Snowflake credentials
database:
  connection_string: "snowflake://${SF_USER}:${SF_PASSWORD}@${SF_ACCOUNT}/${SF_DATABASE}?warehouse=${SF_WAREHOUSE}&schema=${SF_SCHEMA}&role=${SF_ROLE}"
  query_timeout: 120
  max_results: 10000
  cache_schema: true
  schema_refresh_interval: 3600
llm:
  provider: "vertex_ai"
  model: "gemini-1.5-pro"
  project_id: "my-gcp-project"
  location: "us-central1"
  gcloud_cli_auth: true
  temperature: 0.1
  max_tokens: 4000
workflow:
  max_retries: 3
  steps:
    parse_question: true
    get_unique_nouns: true
    generate_sql: true
    validate_and_fix_sql: true
    execute_sql: true
    format_results: true
    choose_and_format_visualization: true
    generate_followup_questions: true
prompts:
  generate_sql:
    system: "You are an expert SQL analyst."
    human: "Schema: {schema}\nQuestion: {question}\nGenerate SQL:"
  validate_sql:
    system: "You are a SQL validator."
    human: "Validate: {sql_query}"
  format_results:
    system: "You are a data analyst."
    human: "Question: {question}\nResults: {query_results}\nProvide a clear answer."
  choose_and_format_visualization:
    system: "You are a data visualization expert."
    human: "Question: {question}\nResults: {query_results}\nChoose a chart type and format the data."
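Because the Snowflake connection string embeds credentials directly in a URL, a password containing characters such as `@` or `/` can break parsing. The sketch below shows one way to assemble a URL of the same shape with percent-encoded credentials. It assumes the connection string is later parsed by a URL-aware library that decodes the credentials; whether the framework does so is not confirmed here. The function name and all sample values are hypothetical.

```python
from urllib.parse import quote_plus, urlencode


def build_snowflake_url(user, password, account, database, warehouse, schema, role):
    """Assemble a snowflake:// URL in the same shape as the example config,
    percent-encoding the credentials so reserved characters stay intact."""
    query = urlencode({"warehouse": warehouse, "schema": schema, "role": role})
    return (
        f"snowflake://{quote_plus(user)}:{quote_plus(password)}"
        f"@{account}/{database}?{query}"
    )


url = build_snowflake_url(
    user="svc_askrita",  # hypothetical values throughout
    password="p@ss/word",
    account="xy12345.us-east-1",
    database="ANALYTICS",
    warehouse="COMPUTE_WH",
    schema="PUBLIC",
    role="ANALYST",
)
```

If the string is instead built from `${SF_PASSWORD}`-style placeholders, the same encoding would need to be applied to the value stored in the environment variable.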
AWS Bedrock + SQLite (Development Setup)
# REQUIRED: aws configure or IAM roles
database:
  connection_string: "sqlite:///./dev_database.db"
  query_timeout: 30
  max_results: 1000
llm:
  provider: "bedrock"
  model: "anthropic.claude-4-6-sonnet-20250514-v1:0"
  region_name: "us-east-1"
  temperature: 0.1
  max_tokens: 4000
workflow:
  max_retries: 3
  steps:
    parse_question: true
    get_unique_nouns: true
    generate_sql: true
    validate_and_fix_sql: true
    execute_sql: true
    format_results: true
    choose_and_format_visualization: true
    generate_followup_questions: true
prompts:
  generate_sql:
    system: "You are an expert SQL analyst."
    human: "Schema: {schema}\nQuestion: {question}\nGenerate SQL:"
  validate_sql:
    system: "You are a SQL validator."
    human: "Validate: {sql_query}"
  format_results:
    system: "You are a data analyst."
    human: "Question: {question}\nResults: {query_results}\nProvide a clear answer."
  choose_and_format_visualization:
    system: "You are a data visualization expert."
    human: "Question: {question}\nResults: {query_results}\nChoose a chart type and format the data."
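A development setup like this one needs a SQLite file with at least one table before any question can be answered. The following sketch seeds a small database using Python's standard `sqlite3` module. The `orders` table and its rows are invented for illustration and are not part of the framework or its examples.

```python
import sqlite3


def seed_dev_database(path: str) -> int:
    """Create a tiny `orders` table (hypothetical schema) and return its row count."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders ("
            "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, total REAL NOT NULL)"
        )
        # INSERT OR IGNORE keeps the function safe to call more than once.
        conn.executemany(
            "INSERT OR IGNORE INTO orders (id, customer, total) VALUES (?, ?, ?)",
            [(1, "acme", 120.50), (2, "globex", 75.00), (3, "acme", 310.25)],
        )
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    finally:
        conn.close()
```

Calling `seed_dev_database("./dev_database.db")` from the project directory would create the file that the `sqlite:///./dev_database.db` connection string points to.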
Configuration Priority
| Setting Type | Location | Priority | Description |
|---|---|---|---|
| Environment Variables | OS Environment | 🥇 Highest | Overrides all config file settings |
| YAML Configuration | Config file | 🥈 Medium | Explicit configuration settings |
| Built-in Defaults | Code | 🥉 Lowest | Framework defaults when not specified |
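The precedence described in this table can be pictured as a layered lookup. The sketch below uses `collections.ChainMap` to model it. The setting values are illustrative only, and the idea that environment overrides arrive as an already-parsed dictionary is an assumption, since the mapping from environment variable names to YAML keys is not listed here.

```python
from collections import ChainMap

# Illustrative values only; the real built-in defaults are not documented here.
builtin_defaults = {"temperature": 0.0, "max_tokens": 1024, "timeout": 30}
yaml_settings = {"temperature": 0.1, "max_tokens": 4000}
env_overrides = {"timeout": 60}  # assumed to be parsed from the OS environment

# ChainMap searches its maps left to right, so earlier maps take priority:
# environment first, then the YAML file, then the built-in defaults.
effective = ChainMap(env_overrides, yaml_settings, builtin_defaults)
resolved = dict(effective)
```

Here `timeout` comes from the environment layer, `temperature` and `max_tokens` from the YAML layer, and nothing falls through to the defaults.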
Mandatory vs Optional Settings
LLM Configuration
| Provider | Mandatory Settings | Optional Settings | Environment Variables |
|---|---|---|---|
| OpenAI | provider, model | temperature, max_tokens, timeout, base_url, organization, ca_bundle_path | OPENAI_API_KEY (required) |
| Azure OpenAI | provider, model, azure_endpoint, azure_deployment, azure_tenant_id, azure_client_id, azure_certificate_path | api_version, azure_certificate_password, temperature, max_tokens, timeout | None |
| Vertex AI | provider, model, project_id, location, (credentials_path OR gcloud_cli_auth) | temperature, max_tokens, top_p, timeout | GOOGLE_APPLICATION_CREDENTIALS (if not using gcloud CLI) |
| Bedrock | provider, model, region_name | temperature, max_tokens, top_p, timeout | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION (if not using IAM) |
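The mandatory settings in this table lend themselves to a quick pre-flight check on the `llm` section before running `askrita test`. The sketch below transcribes the table into a lookup and reports what is missing. It is a hypothetical helper, not the framework's own validation, and it assumes the provider identifiers used in the examples (`openai`, `azure_openai`, `vertex_ai`, `bedrock`).

```python
# Mandatory keys per provider, transcribed from the LLM configuration table.
MANDATORY = {
    "openai": {"provider", "model"},
    "azure_openai": {
        "provider", "model", "azure_endpoint", "azure_deployment",
        "azure_tenant_id", "azure_client_id", "azure_certificate_path",
    },
    "vertex_ai": {"provider", "model", "project_id", "location"},
    "bedrock": {"provider", "model", "region_name"},
}
# Providers that additionally need at least one of several settings.
ONE_OF = {"vertex_ai": ("credentials_path", "gcloud_cli_auth")}


def missing_llm_settings(llm: dict) -> list:
    """Return a list of missing settings; an empty list means nothing is missing."""
    provider = llm.get("provider")
    if provider not in MANDATORY:
        return [f"unknown provider: {provider!r}"]
    problems = sorted(MANDATORY[provider] - set(llm))
    alternatives = ONE_OF.get(provider)
    if alternatives and not any(llm.get(key) for key in alternatives):
        problems.append(" OR ".join(alternatives))
    return problems


# "example-model-id" is a stand-in value, not a real model identifier.
report = missing_llm_settings({"provider": "bedrock", "model": "example-model-id"})
```

In this call `report` lists `region_name`, the one Bedrock setting the input omits.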
Database Configuration
| Database | Mandatory Settings | Optional Settings | Environment Variables |
|---|---|---|---|
| PostgreSQL | connection_string | query_timeout, max_results, cache_schema, schema_refresh_interval | DB_PASSWORD, DB_HOST, DB_USER, DB_NAME |
| MySQL | connection_string | query_timeout, max_results, cache_schema, schema_refresh_interval | DB_PASSWORD, DB_HOST, DB_USER, DB_NAME |
| SQLite | connection_string | query_timeout, max_results, cache_schema, schema_refresh_interval | None |
| BigQuery | connection_string, (bigquery_credentials_path OR bigquery_gcloud_cli_auth) | query_timeout, max_results, cache_schema, schema_refresh_interval, cross_project_access, schema_descriptions | GOOGLE_APPLICATION_CREDENTIALS (if not using gcloud CLI) |
| Snowflake | connection_string (with all parameters) | query_timeout, max_results, cache_schema, schema_refresh_interval | SF_USER, SF_PASSWORD, SF_ACCOUNT, SF_DATABASE, SF_WAREHOUSE, SF_SCHEMA, SF_ROLE |
| MongoDB | connection_string (with database name) | query_timeout, max_results, cache_schema, schema_refresh_interval | MONGO_USER, MONGO_PASSWORD, MONGO_HOST, MONGO_DB |
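Since every database in this table is configured through a URL-style `connection_string`, it can help to inspect one before use, for example to confirm that a database name is present where the table requires it. The sketch below uses the standard library's `urlsplit`. It is an illustration only, not how the framework parses connection strings, and the MongoDB URL is a made-up example.

```python
from urllib.parse import urlsplit


def describe_connection(connection_string: str) -> dict:
    """Split a URL-style connection string into scheme, host, port, and database."""
    parts = urlsplit(connection_string)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


pg = describe_connection("postgresql://askrita_user:secret@prod-db.company.com:5432/analytics")
lite = describe_connection("sqlite:///./dev_database.db")
# Hypothetical URL with no database name, which the table marks as required for MongoDB.
mongo = describe_connection("mongodb://app_user:secret@mongo.example.com:27017")
```

Here `mongo["database"]` is `None`, which is the kind of omission worth catching before a connectivity test.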
Framework Configuration
| Section | Mandatory Settings | Optional Settings |
|---|---|---|
| Prompts | parse_question, generate_sql, validate_sql, format_results, choose_and_format_visualization | generate_followup_questions, additional custom prompts |
| Workflow | None | max_retries, steps, input_validation, parse_overrides, sql_safety, conversation_context |
| Business Rules | None | result_limits, allowed_tables |
| PII Detection | None | enabled, block_on_detection, entities, confidence_threshold, validate_sample_data, audit_log_path |
| Logging | None | level, format |
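To make the PII Detection settings more concrete, the sketch below shows one plausible reading of how `entities`, `confidence_threshold`, and `block_on_detection` could interact: detections are kept only if their type is in the configured list and their score meets the threshold. The `Detection` structure, the `should_block` function, and the choice of `>=` for the comparison are all assumptions for illustration; the framework's actual detection logic is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: an entity type and a confidence score."""
    entity_type: str
    score: float


def should_block(detections, entities, confidence_threshold, block_on_detection=True):
    """Return (blocked, hits) where hits are detections that pass both filters."""
    hits = [
        d for d in detections
        if d.entity_type in entities and d.score >= confidence_threshold
    ]
    return block_on_detection and bool(hits), hits


detections = [
    Detection("EMAIL_ADDRESS", 0.85),  # configured entity, above threshold
    Detection("PERSON", 0.30),         # configured entity, below threshold
    Detection("LOCATION", 0.90),       # high score, but not a configured entity
]
blocked, hits = should_block(detections, {"PERSON", "EMAIL_ADDRESS"}, confidence_threshold=0.4)
```

Under these assumptions only the email detection counts, so the request would be blocked when `block_on_detection` is true.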
Testing Configurations
Configuration Validation
# Test complete configuration (validates LLM and database connectivity)
askrita test --config my-config.yaml
# Verbose output for debugging
askrita test --config my-config.yaml --verbose
Development Testing
# Use minimal config for quick testing
cat > test-config.yaml << EOF
database:
  connection_string: "sqlite:///./test.db"
llm:
  provider: "openai"
  model: "gpt-4o-mini"
prompts:
  generate_sql:
    system: "Generate SQL"
    human: "{question}"
  validate_sql:
    system: "Validate SQL"
    human: "{sql_query}"
  format_results:
    system: "Format results"
    human: "Question: {question}\nResults: {query_results}"
  choose_and_format_visualization:
    system: "Choose visualization"
    human: "Question: {question}\nResults: {query_results}"
EOF
# Test with minimal config
OPENAI_API_KEY=your-key askrita test --config test-config.yaml
Example Configuration Files
See the complete example configurations in the example-configs/ directory:
SQL Agent Configurations
query-minimal.yaml - Minimal required settings
query-openai.yaml - OpenAI + PostgreSQL production setup
query-azure-openai.yaml - Azure OpenAI enterprise setup
query-vertex-ai.yaml - Google Vertex AI configuration
query-vertex-ai-gcloud.yaml - Vertex AI with gcloud CLI authentication
query-bedrock.yaml - AWS Bedrock configuration
query-bigquery.yaml - BigQuery cloud analytics setup (updated with v0.2.1 features)
query-bigquery-advanced.yaml - Comprehensive BigQuery example with hybrid schema descriptions and cross-project access (New in v0.2.1)
query-snowflake.yaml - Snowflake data warehouse setup
schema-descriptions-simple.yaml - Simple example showing hybrid schema descriptions feature (New in v0.2.1)
example-zscaler-config.yaml - Complete example with cross-project access, security settings, and corporate proxy support
Privacy & Security Configurations (New in v0.10.1)
query-pii-detection.yaml - Basic PII detection enabled for development
query-bigquery-pii.yaml - Enterprise-grade PII protection with HIPAA/GDPR compliance settings
Data Classification Configurations
data-classification-openai.yaml - OpenAI for classification
data-classification-azure.yaml - Azure OpenAI for classification
data-classification-vertex-ai.yaml - Vertex AI for classification
data-classification-general.yaml - General classification template
data-classification-csv-examples.yaml - CSV processing examples
Additional Examples
mcp-server-config.yaml - MCP server configuration for AI assistants