# Performance Guide

Optimizing voria for speed and efficiency.

## Performance Targets
| Metric | Target | Notes |
|---|---|---|
| CLI startup | < 2s | Before Python engine ready |
| Command response | < 100ms (p95) | After initial request |
| Simple issue fix | < 30s | Via LLM |
| Full automation | < 5min | For simple issues (1-2 iterations) |
| Token rate | < 1000/min | To avoid rate limiting |
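A quick way to sanity-check the first target from the shell (a sketch; it assumes `voria --version` exercises only CLI startup, not the Python engine):

```bash
# Measure cold CLI startup against the < 2s target
time ./target/release/voria --version
```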
## Optimization Strategies

### 1. Disable Unnecessary Features

```bash
# Skip tests (if you know they pass)
voria issue 42 --skip-tests

# Dry run (don't modify files)
voria issue 42 --dry-run

# Single iteration (don't refine)
voria issue 42 --max-iterations 1
```
### 2. Use Fastest LLM Provider

**Speed Ranking (fastest to slowest):**

1. Modal - sub-second responses (free!)
2. Gemini - 1-2 seconds
3. OpenAI - 2-3 seconds
4. Claude - 3-5 seconds

```bash
# Use fastest
voria issue 42 --llm modal
```
### 3. Cache Models

voria caches model responses. To warm up the cache:

```bash
# Pre-download models
voria plan 1  # First time (slow)
voria plan 2  # Second time (faster - cached)
```
### 4. Parallel Operations

For batch processing:

```bash
# Process multiple issues in parallel
export voria_WORKERS=4
for issue in {1..20}; do
  voria issue $issue &
done
wait
```
### 5. Limit Context Size

Large codebases slow down analysis:

```bash
# Only analyze relevant files
voria issue 42 --include "src/api.py" --include "tests/test_api.py"

# Exclude large files
voria issue 42 --exclude "node_modules" --exclude "dist"
```
## Profiling

### Python Profiling

```bash
# Run with timing
time python3 -m voria.engine < test_command.json

# Detailed profiling
python3 -m cProfile -s cumtime -m voria.engine < test_command.json
```
### Rust Profiling

```bash
# Flamegraph (Linux)
cargo install flamegraph
cargo flamegraph -- plan 1

# View: flamegraph.svg
```
## Memory Usage

### Monitor Memory

```bash
# Watch memory usage
watch -n 1 'ps aux | grep voria'

# Detailed memory stats
/usr/bin/time -v ./target/release/voria plan 1
```
### Reduce Memory Footprint

```bash
# Smaller batch sizes
voria issue 42 --batch-size 100

# Disable caching
voria issue 42 --no-cache

# Limit context
voria issue 42 --max-files 10
```
## Network Optimization

### Connection Pooling

voria reuses HTTP connections via a built-in pool. The underlying httpx client is configured along these lines:

```python
import httpx

# Connection pool shared across requests
client = httpx.AsyncClient(limits=httpx.Limits(
    max_connections=5,            # Total connection pool size
    max_keepalive_connections=5,  # Idle connections kept open for reuse
))
```
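To confirm reuse from outside the process, you can count outbound `connect` syscalls, in the same spirit as the strace check in the DNS section below; with a working pool, the count stays well below the number of requests (Linux-only sketch):

```bash
# Count connect() calls during a run; a working pool keeps this small
strace -f -e trace=connect ./target/release/voria plan 1 2>&1 | grep -c connect
```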
### Rate Limiting

```bash
# Slow down token usage (avoid rate limits)
voria_TOKEN_RATE=100 voria issue 100  # 100 tokens/sec max

# Add delays between requests
voria_REQUEST_DELAY=2 voria issue 42  # 2 sec between requests
```
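When scripting many runs, a plain sleep between invocations approximates the same throttling without relying on environment variables (a minimal sketch):

```bash
# Manual throttling: pause between issues to stay under provider limits
for issue in 42 43 44; do
  voria issue $issue
  sleep 2  # comparable to voria_REQUEST_DELAY=2
done
```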
### DNS Caching

```bash
# voria caches DNS lookups automatically (via httpx)
# Verify with:
strace -e openat ./target/release/voria plan 1 | grep hosts
```
## Benchmarking

### Run Benchmarks

```bash
# Time different providers
for llm in modal openai gemini claude; do
  echo "Testing $llm..."
  time voria plan 1 --llm $llm
done

# Compare patch generation
for strategy in strict fuzzy; do
  echo "Testing $strategy..."
  time voria issue 1 --patch-strategy $strategy
done
```
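`time` reports a single run; for steadier numbers, a benchmarking tool such as hyperfine (assumed installed, e.g. via `cargo install hyperfine`) repeats each command and reports mean and spread:

```bash
# Repeat each provider benchmark for stable statistics
hyperfine --warmup 1 \
  'voria plan 1 --llm modal' \
  'voria plan 1 --llm openai'
```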
### Create Benchmark Suite

```bash
#!/bin/bash
# benchmark.sh
ISSUES=(1 2 3 10 42 100)

for issue in "${ISSUES[@]}"; do
  echo "Issue #$issue"
  /usr/bin/time -f "%e seconds, %M KB" \
    ./target/release/voria plan $issue
done
```
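Make the script executable and keep the output around for comparing runs over time:

```bash
chmod +x benchmark.sh
./benchmark.sh | tee benchmark-results.txt
```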
## LLM Token Optimization

### Use Smaller Models

```bash
# Smaller = faster + cheaper
voria issue 42 --llm modal                          # GLM-5.1-FP8 (fast)
voria issue 42 --llm openai --model gpt-4.5-mini    # mini (faster)
voria issue 42 --llm claude --model claude-3-haiku  # haiku (fastest)
```
### Prompt Engineering

Keep prompts concise:

```python
# ❌ Slow: Large context
prompt = f"""Analyze this entire codebase:
{entire_codebase}

Fix {issue_description}
"""

# ✅ Fast: Focused context
prompt = f"""Analyze these files for {issue_description}:
{relevant_files_only}
"""
```
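A handy rule of thumb when trimming context: one token is roughly four characters of English text, so you can estimate a file's token cost from the shell before including it (a heuristic, not an exact count):

```bash
# Approximate token count of a candidate context file (1 token ≈ 4 chars)
echo $(( $(wc -c < src/api.py) / 4 ))
```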
### Token Budget Awareness

```bash
# See token usage
voria plan 1 --verbose
# Output shows tokens consumed and cost

# Adjust budget if needed
python3 -m voria.core.setup  # Increase budget
```
## Configuration Tuning

### Development Mode (Faster)

```bash
# Use debug builds (no optimizations)
cargo build
# vs
cargo build --release

# Skip some validations
voria_SKIP_VALIDATION=1 ./target/debug/voria plan 1
```
### Production Mode (Slower but Robust)

```bash
# Use release builds
cargo build --release

# With optimizations
RUSTFLAGS="-C target-cpu=native" cargo build --release
```
## Bottleneck Analysis

### Common Bottlenecks

1. **LLM API latency** (usually 2-5 seconds)
   - Switch to a faster provider: `--llm modal`
   - Use a smaller model: `--model gpt-4-mini`
2. **File I/O** (usually < 1 second)
   - Use a faster disk (SSD vs HDD)
   - Exclude unnecessary files: `--exclude "*.log"`
3. **Test execution** (usually 10-30 seconds)
   - Skip tests: `--skip-tests`
   - Run a subset: `--test-pattern "quick_*"`
4. **Git operations** (usually < 5 seconds)
   - Use a shallow clone: `--shallow`
   - Skip fetch: `--no-fetch`
### Identify the Bottleneck

```bash
# Add timing breakpoints
voria_DEBUG_TIMING=1 ./target/release/voria issue 42

# Output:
# [TIMING] LLM plan: 3.2s
# [TIMING] Patch generation: 2.1s
# [TIMING] File apply: 0.3s
# [TIMING] Test execution: 15.2s
# [TIMING] Total: 20.8s
```
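To surface the slowest stage automatically (assuming the `[TIMING]` output format shown above), sort the timing lines numerically:

```bash
# Print the single slowest stage, excluding the total
voria_DEBUG_TIMING=1 ./target/release/voria issue 42 2>&1 \
  | grep '\[TIMING\]' | grep -v 'Total' \
  | sort -t: -k2 -rn | head -1
```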
## Scaling

### Processing Many Issues

```bash
# Serial (slow)
for i in {1..100}; do
  voria issue $i --create-pr
done

# Parallel (faster)
seq 1 100 | parallel --max-procs 4 "voria issue {} --create-pr"
```
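GNU parallel can also record a job log, so failed issues can be retried without redoing the ones that succeeded:

```bash
# Track jobs, then rerun only the failures
seq 1 100 | parallel --max-procs 4 --joblog voria.joblog "voria issue {} --create-pr"
seq 1 100 | parallel --retry-failed --joblog voria.joblog "voria issue {} --create-pr"
```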
### Load Balancing

```bash
# Distribute across LLM providers
LLMS=(modal openai gemini claude)
for i in {1..100}; do
  llm=$((i % 4))  # Rotate: 0=modal, 1=openai, 2=gemini, 3=claude
  voria issue $i --llm ${LLMS[$llm]}
done
```
## Performance Tips

- Use release builds in production
- Keep local iterations small (`--max-iterations 2-3`)
- Enable caching (default on, but verify)
- Use the appropriate LLM for the task
- Focus on important files (exclude boilerplate/tests)
- Monitor token spending (avoid surprises)
- Reuse providers (a persistent process is faster)
- Batch operations (parallel processing)
## Advanced Tuning

### Rust Build Optimization

```bash
# Profile-guided optimization (instrument, run a workload, then rebuild)
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release
# ...run representative commands, merge profiles with llvm-profdata, then:
# RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release

# High performance
RUSTFLAGS="-C opt-level=3 -C lto=yes -C codegen-units=1" cargo build --release
```
### Python Optimization

```bash
# Use PyPy (often faster than CPython)
pypy3 -m venv venv
source venv/bin/activate

# Pre-compile to bytecode
python3 -m compileall voria/
```
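Whether PyPy actually helps depends on the workload; time the engine under both interpreters with the same input used in the profiling section:

```bash
# Compare interpreters on the same command stream
time python3 -m voria.engine < test_command.json
time pypy3 -m voria.engine < test_command.json
```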
**See Also:**

- SECURITY.md - Security best practices
- TROUBLESHOOTING.md - Common issues
- ARCHITECTURE.md - System design