ClickHouse Integration¶
Genesis includes comprehensive ClickHouse integration for tracking evolution runs, LLM interactions, and system events in real-time. All data is automatically logged to ClickHouse for analysis, visualization, and debugging.
Quick Overview¶
7 Tables tracking different aspects of evolution:
| Table | Purpose | Row Count (typical) |
|---|---|---|
| `evolution_runs` | Each experiment run | 1 per run |
| `generations` | Per-generation stats | 10-100 per run |
| `individuals` | Every code variant | 100-1000s per run |
| `pareto_fronts` | Pareto frontier snapshots | 3-20 per generation |
| `code_lineages` | Parent-child relationships | Same as individuals |
| `llm_logs` | Every LLM API call | 1000s per run |
| `agent_actions` | System events | 100s per run |
Setup¶
1. Set Environment Variable - export `CLICKHOUSE_URL`, or add it to your `.env` file.
2. Install Client - `pip install clickhouse-connect`.
3. Verify Connection - run `python scripts/test_clickhouse.py`. You should see:
✅ ClickHouse connection successful!
Connected to database: default
📊 Tables: evolution_runs, generations, individuals, pareto_fronts, code_lineages, llm_logs, agent_actions
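The same three steps as a copy-paste shell sketch. The URL is a placeholder following the `https://user:password@host:port/database` format described under Troubleshooting; substitute your own credentials:

```bash
# 1. Point Genesis at your ClickHouse instance (or put this line in .env)
export CLICKHOUSE_URL="https://default:your-password@your-host:8443/default"

# 2. Install the Python client used by the logger
pip install clickhouse-connect

# 3. Verify connectivity and list the 7 tables
python scripts/test_clickhouse.py
```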
Tables¶
Genesis uses 7 ClickHouse tables to track everything from LLM API calls to evolutionary lineages. Tables are automatically created on first connection.
Core Logging Tables¶
These tables capture system-level events and LLM interactions.
llm_logs¶
Purpose: Track every LLM API call made during evolution, including prompts, responses, costs, and timing.
Use Cases:
- Debug prompt quality and LLM responses
- Analyze API costs per model
- Identify slow LLM calls
- Audit what the system asked and received
Schema:
| Column | Type | Description |
|--------|------|-------------|
| id | UUID | Unique identifier for this LLM call |
| timestamp | DateTime64(3) | Millisecond precision timestamp when call was made |
| model | String | LLM model name (e.g., "claude-3-5-sonnet-20241022", "gpt-4") |
| messages | String | JSON-encoded array of messages sent to LLM. Format: [{"role": "user", "content": "..."}] |
| response | String | Full text response from the LLM |
| cost | Float64 | API cost in USD for this call (calculated from token usage) |
| execution_time | Float64 | Time in seconds from request to response |
| metadata | String | JSON with additional context: task name, generation number, mutation type, etc. |
Example Row:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2024-01-15 14:23:45.123",
"model": "claude-3-5-sonnet-20241022",
"messages": "[{\"role\":\"user\",\"content\":\"Improve this code...\"}]",
"response": "Here's an improved version...",
"cost": 0.0234,
"execution_time": 2.456,
"metadata": "{\"task\":\"circle_packing\",\"generation\":5}"
}
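Unlike the evolution tables below, this section ships no Key Queries, so here is a sketch of a common one; it relies on the `task` key shown in the example metadata above:

```sql
-- 10 slowest LLM calls, with their cost and task context
SELECT
    timestamp,
    model,
    execution_time,
    cost,
    JSONExtractString(metadata, 'task') AS task
FROM llm_logs
ORDER BY execution_time DESC
LIMIT 10;
```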
agent_actions¶
Purpose: Generic event logging for system actions (job submissions, completions, errors, etc.)
Use Cases:
- Track job queue state over time
- Debug job failures
- Monitor system throughput
- Analyze which actions are most common
Schema:
| Column | Type | Description |
|--------|------|-------------|
| id | UUID | Unique identifier for this action |
| timestamp | DateTime64(3) | When the action occurred |
| action_type | String | Type of action: "job_submitted", "job_completed", "job_failed", "archive_updated", etc. |
| details | String | JSON with action-specific data (job_id, generation, parent_id, scores, etc.) |
| metadata | String | JSON with additional context (configuration, environment info, etc.) |
Example Row (job submission):
{
"id": "650e8400-e29b-41d4-a716-446655440001",
"timestamp": "2024-01-15 14:23:45.500",
"action_type": "job_submitted",
"details": "{\"job_id\":\"job_123\",\"generation\":5,\"parent_id\":\"ind_42\",\"exec_fname\":\"code_gen5_123.py\"}",
"metadata": "{\"cluster\":\"local\",\"task\":\"circle_packing\"}"
}
Example Row (job completion):
{
"action_type": "job_completed",
"details": "{\"job_id\":\"job_123\",\"generation\":5,\"correct\":true,\"combined_score\":0.95,\"api_costs\":0.05}",
"metadata": "{\"public_metrics\":{\"fitness\":0.95,\"size\":1024}}"
}
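A couple of query sketches against this table, using the `action_type` values and `details` keys from the examples above:

```sql
-- Event mix over the last 24 hours
SELECT action_type, count() AS events
FROM agent_actions
WHERE timestamp > now() - INTERVAL 1 DAY
GROUP BY action_type
ORDER BY events DESC;

-- Recent job failures with their details payload
SELECT timestamp, JSONExtractString(details, 'job_id') AS job_id, details
FROM agent_actions
WHERE action_type = 'job_failed'
ORDER BY timestamp DESC
LIMIT 20;
```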
Evolution Tracking Tables¶
These tables capture the evolutionary process: runs, generations, individuals, and their relationships.
evolution_runs¶
Purpose: High-level tracking of each complete evolutionary experiment. One row per genesis_launch invocation.
Use Cases:
- List all experiments ever run
- Compare configurations across runs
- Track which runs are still running vs completed
- Find the database file for a specific run
Schema:
| Column | Type | Description |
|--------|------|-------------|
| run_id | String | Unique identifier for this evolution run (e.g., timestamp-based ID or UUID) |
| start_time | DateTime64(3) | When the evolution run started |
| end_time | DateTime64(3) | When the run completed (defaults to epoch 0 until completion) |
| task_name | String | Name of the task being evolved (e.g., "circle_packing", "mask_to_seg") |
| config | String | Full Hydra configuration as JSON (evolution params, LLM settings, etc.) |
| status | String | Current status: "running", "completed", "failed" |
| total_generations | Int32 | Total number of generations completed (0 until run finishes) |
| population_size | Int32 | Population size per generation (from config) |
| cluster_type | String | Execution environment: "local", "slurm", "e2b" |
| database_path | String | Path to the SQLite database file storing results |
Example Row:
{
"run_id": "run_20240115_142345",
"start_time": "2024-01-15 14:23:45.000",
"end_time": "2024-01-15 16:45:12.000",
"task_name": "circle_packing",
"config": "{\"evolution\":{\"num_generations\":10,\"pop_size\":20},\"llm\":{\"model\":\"claude-3-5-sonnet\"}}",
"status": "completed",
"total_generations": 10,
"population_size": 20,
"cluster_type": "local",
"database_path": "/path/to/results/circle_packing_20240115.db"
}
Key Queries:
-- Active runs
SELECT run_id, task_name, start_time, total_generations
FROM evolution_runs
WHERE status = 'running';
-- Compare configurations
SELECT run_id, task_name, config
FROM evolution_runs
WHERE task_name = 'circle_packing'
ORDER BY start_time DESC;
generations¶
Purpose: Track aggregate statistics for each generation within an evolution run.
Use Cases:
- Plot fitness improvement over time
- Identify stagnant generations (no improvement)
- Track API cost burn rate per generation
- Monitor Pareto frontier growth
Schema:
| Column | Type | Description |
|--------|------|-------------|
| run_id | String | Links to evolution_runs.run_id |
| generation | Int32 | Generation number (0 = initial population, 1+ = evolved) |
| timestamp | DateTime64(3) | When this generation completed evaluation |
| num_individuals | Int32 | How many individuals were evaluated in this generation |
| best_score | Float64 | Highest fitness score achieved in this generation |
| avg_score | Float64 | Mean fitness score across all individuals |
| pareto_size | Int32 | Number of individuals on the Pareto frontier (for multi-objective) |
| total_cost | Float64 | Total API cost (USD) spent on this generation (LLM + embeddings + novelty) |
| metadata | String | JSON with additional stats: median score, std dev, elite count, etc. |
Example Row:
{
"run_id": "run_20240115_142345",
"generation": 5,
"timestamp": "2024-01-15 14:45:30.000",
"num_individuals": 20,
"best_score": 0.95,
"avg_score": 0.82,
"pareto_size": 3,
"total_cost": 1.25,
"metadata": "{\"median_score\":0.83,\"std_dev\":0.08,\"elite_count\":5}"
}
Key Queries:
-- Fitness trajectory
SELECT generation, best_score, avg_score, total_cost
FROM generations
WHERE run_id = 'run_20240115_142345'
ORDER BY generation;
-- Cost per generation
SELECT generation, total_cost, total_cost / num_individuals as cost_per_individual
FROM generations
WHERE run_id = 'run_20240115_142345';
-- Identify breakthrough generations (large improvement)
SELECT
generation,
best_score,
best_score - lagInFrame(best_score) OVER (ORDER BY generation) as improvement
FROM generations
WHERE run_id = 'run_20240115_142345'
ORDER BY improvement DESC;
individuals¶
Purpose: Store detailed information about every code variant evaluated during evolution. This is the most granular table.
Use Cases:
- Find the best individual ever created
- Track diversity in the population
- Analyze which mutation types produce the best offspring
- Debug why certain individuals scored poorly
- Deduplicate identical code variants
Schema:
| Column | Type | Description |
|--------|------|-------------|
| run_id | String | Links to evolution_runs.run_id |
| individual_id | String | Unique identifier for this code variant (e.g., "ind_gen5_123") |
| generation | Int32 | Which generation this individual belongs to |
| timestamp | DateTime64(3) | When this individual was evaluated |
| parent_id | String | ID of the parent individual (empty string for initial population) |
| mutation_type | String | How this individual was created: "init" (initial), "mutate" (single parent), "crossover" (two parents) |
| fitness_score | Float64 | Primary objective score (e.g., accuracy, solution quality) |
| combined_score | Float64 | Multi-objective combined score (if using Pareto optimization) |
| metrics | String | JSON of all evaluation metrics: {"accuracy": 0.95, "size": 1024, "speed": 2.5} |
| is_pareto | Bool | Whether this individual is on the Pareto frontier for this generation |
| api_cost | Float64 | Cost to generate this individual (LLM mutation/crossover cost) |
| embed_cost | Float64 | Cost to compute embeddings for this individual |
| novelty_cost | Float64 | Cost to compute novelty score (if using novelty search) |
| code_hash | String | SHA-256 hash of the code (for detecting duplicates) |
| code_size | Int32 | Size of code in bytes |
Example Row:
{
"run_id": "run_20240115_142345",
"individual_id": "ind_gen5_123",
"generation": 5,
"timestamp": "2024-01-15 14:45:30.123",
"parent_id": "ind_gen4_087",
"mutation_type": "mutate",
"fitness_score": 0.95,
"combined_score": 0.90,
"metrics": "{\"accuracy\":0.95,\"size\":1024,\"speed\":2.5,\"memory\":512}",
"is_pareto": true,
"api_cost": 0.05,
"embed_cost": 0.001,
"novelty_cost": 0.002,
"code_hash": "a3f5d8b9c1e2...",
"code_size": 1024
}
Key Queries:
-- Top 10 individuals ever created
SELECT individual_id, generation, fitness_score, mutation_type, metrics
FROM individuals
WHERE run_id = 'run_20240115_142345'
ORDER BY fitness_score DESC
LIMIT 10;
-- Duplicates (same code hash)
SELECT code_hash, count(*) as count
FROM individuals
WHERE run_id = 'run_20240115_142345'
GROUP BY code_hash
HAVING count > 1;
-- Mutation type effectiveness
SELECT
mutation_type,
count() as count,
avg(fitness_score) as avg_fitness,
max(fitness_score) as max_fitness
FROM individuals
WHERE run_id = 'run_20240115_142345' AND generation > 0
GROUP BY mutation_type
ORDER BY avg_fitness DESC;
-- Pareto frontier members across all generations
SELECT generation, individual_id, fitness_score, combined_score
FROM individuals
WHERE run_id = 'run_20240115_142345' AND is_pareto = true
ORDER BY generation, fitness_score DESC;
pareto_fronts¶
Purpose: Snapshot of the Pareto frontier at each generation. Stores which individuals were non-dominated for multi-objective optimization.
Use Cases:
- Track how the Pareto frontier evolves over time
- Visualize trade-offs between objectives
- Compare Pareto sets across different runs
- Identify when the frontier stops improving
Schema:
| Column | Type | Description |
|--------|------|-------------|
| run_id | String | Links to evolution_runs.run_id |
| generation | Int32 | Which generation this Pareto snapshot is from |
| timestamp | DateTime64(3) | When this Pareto frontier was computed |
| individual_id | String | ID of an individual on the Pareto frontier (links to individuals) |
| fitness_score | Float64 | Primary objective score |
| combined_score | Float64 | Combined multi-objective score |
| metrics | String | JSON of all metrics (for plotting trade-offs) |
Example Rows (Pareto frontier for generation 5):
[
{
"run_id": "run_20240115_142345",
"generation": 5,
"individual_id": "ind_gen5_001",
"fitness_score": 0.95,
"combined_score": 0.85,
"metrics": "{\"accuracy\":0.95,\"size\":2048}"
},
{
"generation": 5,
"individual_id": "ind_gen5_042",
"fitness_score": 0.90,
"combined_score": 0.88,
"metrics": "{\"accuracy\":0.90,\"size\":512}"
}
]
Key Queries:
-- Pareto frontier size over time
SELECT generation, count(*) as pareto_size
FROM pareto_fronts
WHERE run_id = 'run_20240115_142345'
GROUP BY generation
ORDER BY generation;
-- Latest Pareto frontier
SELECT individual_id, fitness_score, combined_score, metrics
FROM pareto_fronts
WHERE run_id = 'run_20240115_142345'
AND generation = (SELECT max(generation) FROM pareto_fronts WHERE run_id = 'run_20240115_142345')
ORDER BY fitness_score DESC;
-- Compare Pareto fronts across generations
SELECT generation, individual_id,
JSONExtractFloat(metrics, 'accuracy') as accuracy,
JSONExtractFloat(metrics, 'size') as size
FROM pareto_fronts
WHERE run_id = 'run_20240115_142345'
ORDER BY generation, accuracy DESC;
code_lineages¶
Purpose: Track parent-child relationships between code variants. Essential for phylogenetic analysis and understanding evolutionary paths.
Use Cases:
- Build evolutionary tree visualizations
- Trace lineage of best individuals back to initial population
- Identify prolific parents (many successful offspring)
- Analyze which mutations led to breakthroughs
- Detect evolutionary dead ends
Schema:
| Column | Type | Description |
|--------|------|-------------|
| run_id | String | Links to evolution_runs.run_id |
| child_id | String | ID of the child individual (links to individuals.individual_id) |
| parent_id | String | ID of the parent individual (empty for initial population) |
| generation | Int32 | Generation in which the child was created |
| mutation_type | String | How the child was created: "init", "mutate", "crossover" |
| timestamp | DateTime64(3) | When this lineage relationship was created |
| fitness_delta | Float64 | Change in fitness from parent to child (child_fitness - parent_fitness) |
| edit_summary | String | Human-readable description of what changed (from LLM summary) |
Example Row:
{
"run_id": "run_20240115_142345",
"child_id": "ind_gen5_123",
"parent_id": "ind_gen4_087",
"generation": 5,
"mutation_type": "mutate",
"timestamp": "2024-01-15 14:45:30.456",
"fitness_delta": 0.05,
"edit_summary": "Optimized loop to use vectorized operations, reducing time complexity from O(n²) to O(n)"
}
Key Queries:
-- Most successful mutations (biggest improvements)
SELECT child_id, parent_id, generation, mutation_type, fitness_delta, edit_summary
FROM code_lineages
WHERE run_id = 'run_20240115_142345' AND fitness_delta > 0
ORDER BY fitness_delta DESC
LIMIT 10;
-- Mutation success rate
SELECT
mutation_type,
count() as total,
countIf(fitness_delta > 0) as improvements,
countIf(fitness_delta > 0) / count() as success_rate,
avg(fitness_delta) as avg_delta
FROM code_lineages
WHERE run_id = 'run_20240115_142345' AND generation > 0
GROUP BY mutation_type;
-- Trace lineage of best individual
-- Note: recursive CTEs require a recent ClickHouse version
WITH RECURSIVE lineage AS (
    SELECT child_id AS id, parent_id, generation, mutation_type, edit_summary
    FROM code_lineages
    WHERE run_id = 'run_20240115_142345'
      AND child_id = 'ind_gen10_best'  -- Start with best individual
    UNION ALL
    SELECT c.child_id, c.parent_id, c.generation, c.mutation_type, c.edit_summary
    FROM code_lineages c
    INNER JOIN lineage l ON c.child_id = l.parent_id
    WHERE c.run_id = 'run_20240115_142345'
)
SELECT * FROM lineage ORDER BY generation;
-- Most prolific parents (many children)
SELECT
parent_id,
count() as num_children,
avg(fitness_delta) as avg_improvement,
countIf(fitness_delta > 0) as successful_children
FROM code_lineages
WHERE run_id = 'run_20240115_142345'
GROUP BY parent_id
ORDER BY num_children DESC
LIMIT 20;
Table Relationships¶
evolution_runs (1)
↓
├─→ generations (many)
│ ↓
│ └─→ individuals (many)
│ ↓
│ ├─→ pareto_fronts (subset)
│ └─→ code_lineages (parent-child edges)
│
└─→ llm_logs (many, via metadata)
└─→ agent_actions (many, via details)
Data Flow:
1. Run starts → Insert into evolution_runs
2. Generation completes → Insert into generations
3. Individual evaluated → Insert into individuals
4. Pareto computed → Insert into pareto_fronts
5. Child created → Insert into code_lineages
6. LLM called → Insert into llm_logs
7. Action occurs → Insert into agent_actions
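Note that `llm_logs` and `agent_actions` link to runs only through their JSON fields, not a real `run_id` column, so joining them to evolution data means extracting those fields. A sketch, assuming the logger populates `task` and `generation` in metadata as in the `llm_logs` example row:

```sql
-- LLM cost per generation, recovered from the JSON metadata field
SELECT
    JSONExtractInt(metadata, 'generation') AS generation,
    count() AS calls,
    sum(cost) AS llm_cost
FROM llm_logs
WHERE JSONExtractString(metadata, 'task') = 'circle_packing'
GROUP BY generation
ORDER BY generation;
```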
Usage in Code¶
When to Log What¶
Evolution Run Lifecycle:
┌─────────────────────────────────────────────────────────────┐
│ 1. Run Starts │
│ └─> log_evolution_run(run_id, task, config, ...) │
├─────────────────────────────────────────────────────────────┤
│ 2. For Each Generation: │
│ ├─> For Each Individual: │
│ │ ├─> LLM Call (auto-logged via llm_logs) │
│ │ ├─> Evaluation │
│ │ ├─> log_individual(...) │
│ │ └─> log_lineage(child, parent, ...) │
│ ├─> Compute Pareto Frontier │
│ ├─> log_pareto_front(...) │
│ └─> log_generation(gen, best_score, avg_score, ...) │
├─────────────────────────────────────────────────────────────┤
│ 3. Run Completes │
│ └─> update_evolution_run(run_id, "completed", gens) │
└─────────────────────────────────────────────────────────────┘
System Events (Logged Automatically):
• job_submitted → agent_actions
• job_completed → agent_actions
• LLM calls → llm_logs
Logging Evolution Data¶
from genesis.utils.clickhouse_logger import ch_logger
# Start of evolution run
ch_logger.log_evolution_run(
run_id="run_12345",
task_name="circle_packing",
config=cfg, # Hydra config dict
population_size=20,
cluster_type="local",
database_path="results/circle_packing.db",
)
# Each generation
ch_logger.log_generation(
run_id="run_12345",
generation=5,
num_individuals=20,
best_score=0.95,
avg_score=0.82,
pareto_size=3,
total_cost=1.25,
)
# Each individual evaluation
ch_logger.log_individual(
run_id="run_12345",
individual_id="ind_001",
generation=5,
parent_id="ind_parent",
mutation_type="mutate",
fitness_score=0.95,
combined_score=0.90,
metrics={"accuracy": 0.95, "size": 1024},
is_pareto=True,
api_cost=0.05,
)
# Track lineage
ch_logger.log_lineage(
run_id="run_12345",
child_id="ind_001",
parent_id="ind_parent",
generation=5,
mutation_type="mutate",
fitness_delta=0.05,
edit_summary="Optimized loop algorithm",
)
Logging LLM Calls (already integrated)¶
LLM calls are automatically logged by `genesis/llm/query.py`:
ch_logger.log_llm_interaction(
model="claude-3-5-sonnet-20241022",
messages=[{"role": "user", "content": "..."}],
response="...",
cost=0.01,
execution_time=2.5,
)
Useful Queries¶
Track Evolution Progress¶
-- Best score per generation
SELECT
generation,
best_score,
avg_score,
pareto_size,
total_cost
FROM generations
WHERE run_id = 'your_run_id'
ORDER BY generation;
-- Top 10 individuals across all generations
SELECT
individual_id,
generation,
fitness_score,
combined_score,
mutation_type
FROM individuals
WHERE run_id = 'your_run_id'
ORDER BY fitness_score DESC
LIMIT 10;
Cost Analysis¶
-- Total costs by run
SELECT
run_id,
task_name,
sum(total_cost) as total_evolution_cost,
count(DISTINCT generation) as generations
FROM generations
GROUP BY run_id, task_name
ORDER BY total_evolution_cost DESC;
-- LLM costs by model
SELECT
model,
count() as calls,
sum(cost) as total_cost,
avg(execution_time) as avg_time
FROM llm_logs
GROUP BY model
ORDER BY total_cost DESC;
Lineage Analysis¶
-- Most successful mutations (positive fitness delta)
SELECT
mutation_type,
count() as count,
avg(fitness_delta) as avg_improvement,
max(fitness_delta) as best_improvement
FROM code_lineages
WHERE fitness_delta > 0
GROUP BY mutation_type
ORDER BY avg_improvement DESC;
-- Build phylogenetic tree
SELECT
child_id,
parent_id,
generation,
mutation_type,
fitness_delta
FROM code_lineages
WHERE run_id = 'your_run_id'
ORDER BY generation, timestamp;
Pareto Frontier Evolution¶
-- Track how Pareto frontier changes over time
SELECT
generation,
count(DISTINCT individual_id) as pareto_size,
max(fitness_score) as max_fitness,
min(combined_score) as min_combined
FROM pareto_fronts
WHERE run_id = 'your_run_id'
GROUP BY generation
ORDER BY generation;
Schema Design Decisions¶
Why ClickHouse?¶
- High Write Throughput - Handle 1000s of individual evaluations per minute
- Fast Analytics - Aggregate across millions of rows in milliseconds
- Time-Series Optimized - Sorted by timestamp for efficient range queries
- Columnar Storage - Only read columns you need (efficient for wide tables)
- JSON Support - Flexible `metrics` and `config` fields without schema migrations
MergeTree Engine¶
All tables use ENGINE = MergeTree() which:
- Sorts data by ORDER BY key for fast queries
- Compresses data efficiently (10x+ compression typical)
- Supports updates and deletes (via asynchronous mutations)
- Handles high insert rates without blocking reads
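For orientation, a sketch of the shape such a definition takes, using the `generations` columns documented above. The real DDL lives in `genesis/utils/clickhouse_logger.py`; the exact ORDER BY key shown here is an assumption (the Troubleshooting section only confirms that `run_id` is part of it):

```sql
-- Illustrative only: the logger creates the actual tables on first connection
CREATE TABLE IF NOT EXISTS generations (
    run_id          String,
    generation      Int32,
    timestamp       DateTime64(3),
    num_individuals Int32,
    best_score      Float64,
    avg_score       Float64,
    pareto_size     Int32,
    total_cost      Float64,
    metadata        String
) ENGINE = MergeTree()
ORDER BY (run_id, generation, timestamp);  -- assumed key; run_id prefix per Troubleshooting
```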
Data Volume Estimates¶
For a typical run with 20 individuals × 10 generations:
| Table | Rows | Storage (uncompressed) | Storage (compressed) |
|---|---|---|---|
| `evolution_runs` | 1 | ~2 KB | ~500 bytes |
| `generations` | 10 | ~2 KB | ~500 bytes |
| `individuals` | 200 | ~100 KB | ~10 KB |
| `pareto_fronts` | ~50 | ~10 KB | ~2 KB |
| `code_lineages` | 200 | ~50 KB | ~5 KB |
| `llm_logs` | ~400 | ~2 MB | ~200 KB |
| `agent_actions` | ~500 | ~200 KB | ~20 KB |
| **Total** | ~1361 | ~2.4 MB | ~240 KB |
Scaling:
- 1,000 runs → ~240 MB (compressed)
- 10,000 runs → ~2.4 GB (compressed)
ClickHouse handles this easily. Even with millions of individuals, queries remain fast.
Data Retention¶
ClickHouse is designed for high-volume data. Consider setting TTL policies for log tables:
-- Keep LLM logs for 30 days (if you want to save space)
ALTER TABLE llm_logs
MODIFY TTL timestamp + INTERVAL 30 DAY;
-- Keep agent actions for 90 days
ALTER TABLE agent_actions
MODIFY TTL timestamp + INTERVAL 90 DAY;
-- Keep evolution data forever
-- No TTL on: evolution_runs, generations, individuals, pareto_fronts, code_lineages
Recommended: Keep evolution tables (individuals, pareto_fronts, code_lineages) forever for reproducibility. They compress well and are essential for analyzing evolutionary dynamics.
Visualization¶
You can connect ClickHouse to visualization tools:
- Grafana - Real-time dashboards
- Metabase - SQL-based exploration
- Jupyter - Python analysis with clickhouse-connect
Example Jupyter notebook:
from clickhouse_connect import get_client
import pandas as pd
client = get_client(
host='your-host',
port=8443,
username='default',
password='your-password',
secure=True
)
# Query as DataFrame
df = client.query_df("""
SELECT generation, best_score, avg_score
FROM generations
WHERE run_id = 'run_12345'
ORDER BY generation
""")
df.plot(x='generation', y=['best_score', 'avg_score'])
Common Queries Cheatsheet¶
-- Find best individual across all runs for a task
SELECT i.run_id, i.individual_id, i.generation, i.fitness_score, i.metrics
FROM individuals i
JOIN evolution_runs r ON i.run_id = r.run_id
WHERE r.task_name = 'circle_packing'
ORDER BY i.fitness_score DESC
LIMIT 1;
-- Evolution progress dashboard
SELECT
g.generation,
g.best_score,
g.avg_score,
g.pareto_size,
g.total_cost,
countIf(i.is_pareto) as pareto_actual
FROM generations g
LEFT JOIN individuals i ON g.run_id = i.run_id AND g.generation = i.generation
WHERE g.run_id = 'YOUR_RUN_ID'
GROUP BY g.generation, g.best_score, g.avg_score, g.pareto_size, g.total_cost
ORDER BY g.generation;
-- Cost breakdown by component
SELECT
sum(api_cost) as total_mutation_cost,
sum(embed_cost) as total_embedding_cost,
sum(novelty_cost) as total_novelty_cost,
sum(api_cost + embed_cost + novelty_cost) as total_cost
FROM individuals
WHERE run_id = 'YOUR_RUN_ID';
-- Mutation type effectiveness
SELECT
mutation_type,
count() as count,
avg(fitness_score) as avg_fitness,
quantile(0.5)(fitness_score) as median_fitness,
quantile(0.9)(fitness_score) as p90_fitness
FROM individuals
WHERE run_id = 'YOUR_RUN_ID' AND generation > 0
GROUP BY mutation_type;
-- Find generations with breakthroughs (>10% improvement)
SELECT
generation,
best_score,
best_score - lagInFrame(best_score) OVER (ORDER BY generation) as improvement,
improvement / lagInFrame(best_score) OVER (ORDER BY generation) as improvement_pct
FROM generations
WHERE run_id = 'YOUR_RUN_ID'
QUALIFY improvement_pct > 0.1
ORDER BY improvement_pct DESC;
-- LLM cost by model
SELECT
model,
count() as calls,
sum(cost) as total_cost,
avg(cost) as avg_cost_per_call,
sum(execution_time) as total_time_seconds
FROM llm_logs
GROUP BY model
ORDER BY total_cost DESC;
Troubleshooting¶
Connection Issues¶
"ClickHouse logging disabled"
- Check CLICKHOUSE_URL is set: echo $CLICKHOUSE_URL
- Verify format: https://user:password@host:port/database
- Test connection: python scripts/test_clickhouse.py
"clickhouse-connect not installed"
"Connection refused" - Check host/port are correct - Verify firewall allows outbound connections to ClickHouse port - For ClickHouse Cloud, ensure you're using HTTPS (port 8443) not HTTP (port 8123)
Data Issues¶
"No data in tables"
- Tables are created automatically but only populated when you integrate logging calls
- Check llm_logs and agent_actions - these are auto-populated by existing code
- Evolution tables (evolution_runs, individuals, etc.) require integration in runner.py
"Duplicate data"
- ClickHouse MergeTree allows duplicates by default
- Use code_hash to deduplicate individuals
- Set up a ReplacingMergeTree if you need automatic deduplication
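A sketch of the ad-hoc deduplication mentioned above, using ClickHouse's `LIMIT BY` clause to keep one row per `code_hash`:

```sql
-- One row per distinct code variant, keeping the earliest evaluation
SELECT individual_id, generation, fitness_score, code_hash
FROM individuals
WHERE run_id = 'YOUR_RUN_ID'
ORDER BY timestamp
LIMIT 1 BY code_hash;
```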
"Queries are slow"
- Always filter by run_id when possible (it's in the ORDER BY key)
- Use timestamp ranges for time-based queries
- Avoid SELECT * - specify only needed columns
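A query that follows all three rules (primary-key filter, time range, explicit columns):

```sql
-- Prunes by run_id, restricts the time range, and reads only needed columns
SELECT generation, best_score, total_cost
FROM generations
WHERE run_id = 'YOUR_RUN_ID'
  AND timestamp > now() - INTERVAL 7 DAY
ORDER BY generation;
```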
Next Steps¶
To integrate ClickHouse logging into your evolution runs:
- Update `runner.py` - Add calls to `log_evolution_run`, `log_generation`, `log_individual` (a wiring sketch follows this list)
- Update evaluation - Log individual metrics after each evaluation
- Track Pareto - Call `log_pareto_front` after each generation
- Set up dashboards - Create Grafana dashboards for real-time monitoring
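A sketch of how those calls could be wired into a generation loop. Only the `ch_logger` methods are the documented API; the runner scaffolding (`evaluate_generation` and the result dictionaries) is hypothetical:

```python
from genesis.utils.clickhouse_logger import ch_logger

def run_with_logging(cfg, run_id, evaluate_generation):
    # 1. Register the run before evolution starts
    ch_logger.log_evolution_run(
        run_id=run_id,
        task_name=cfg["task_name"],
        config=cfg,
        population_size=cfg["pop_size"],
        cluster_type=cfg["cluster_type"],
        database_path=cfg["database_path"],
    )
    for gen in range(cfg["num_generations"]):
        # 2. evaluate_generation is a hypothetical hook returning evaluated children
        results = evaluate_generation(gen)
        for r in results:
            ch_logger.log_individual(
                run_id=run_id,
                individual_id=r["id"],
                generation=gen,
                parent_id=r["parent_id"],
                mutation_type=r["mutation_type"],
                fitness_score=r["fitness"],
                combined_score=r["combined"],
                metrics=r["metrics"],
                is_pareto=r["is_pareto"],
                api_cost=r["api_cost"],
            )
            ch_logger.log_lineage(
                run_id=run_id,
                child_id=r["id"],
                parent_id=r["parent_id"],
                generation=gen,
                mutation_type=r["mutation_type"],
                fitness_delta=r["fitness"] - r["parent_fitness"],
                edit_summary=r.get("edit_summary", ""),
            )
        # 3. Aggregate stats once the whole generation is evaluated
        scores = [r["fitness"] for r in results]
        ch_logger.log_generation(
            run_id=run_id,
            generation=gen,
            num_individuals=len(results),
            best_score=max(scores),
            avg_score=sum(scores) / len(scores),
            pareto_size=sum(r["is_pareto"] for r in results),
            total_cost=sum(r["api_cost"] for r in results),
        )
```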
See genesis/utils/clickhouse_logger.py for the full API.