Genesis Configuration Guide ⚙️¶
This guide covers the comprehensive configuration system in Genesis, including all parameters, file structures, and advanced configuration patterns.
Note: Genesis is based on Shinka AI and maintains compatibility with the original implementation. The core package is named
genesisinternally.
Table of Contents¶
- Core Configuration Components
- Configuration Parameters
- Pre-configured Variants
- Configuration Structure
- Creating Custom Configurations
- Advanced Configuration Patterns
- Configuration Examples
- Configuration Best Practices
Core Configuration Components¶
1. Evolution Config (evo_config)¶
Controls the core evolutionary algorithm parameters:
evo_config:
_target_: genesis.core.EvolutionConfig
num_generations: 20 # Number of evolution generations
max_parallel_jobs: 1 # Maximum parallel evaluations
max_patch_attempts: 10 # Max attempts to generate valid patches
# LLM Configuration
llm_models: # List of LLM models for mutations
- "azure-gpt-4.1"
llm_dynamic_selection: null # Dynamic model selection strategy
embedding_model: "text-embedding-3-small"
# Patch Configuration
patch_types: # Types of code modifications
- "diff" # Diff-based patches
- "full" # Full code replacement
patch_type_probs: # Probabilities for each patch type
- 0.5
- 0.5
# Task Configuration
language: "python" # Programming language
init_program_path: "???" # Path to initial program
task_sys_msg: "???" # System message for LLM
job_type: "local" # Job execution type
results_dir: ${output_dir} # Results directory
# Web Search Configuration
web_search_enabled: false # Enable web search for agents
web_search_prob: 0.1 # Probability (0-1) of using search per attempt
2. Database Config (db_config)¶
Manages the evolutionary database and island topology:
db_config:
_target_: genesis.database.DatabaseConfig
db_path: "evolution_db.sqlite" # SQLite database path
# Island Configuration
num_islands: 2 # Number of evolutionary islands
island_elitism: true # Enable elite preservation per island
# Archive Configuration
archive_size: 20 # Size of elite solution archive
num_archive_inspirations: 4 # Solutions drawn from archive
num_top_k_inspirations: 2 # Solutions from current generation
# Selection and Migration
exploitation_ratio: 0.2 # Exploitation vs exploration balance
elite_selection_ratio: 0.3 # Fraction of elites for selection
migration_interval: 10 # Generations between migrations
migration_rate: 0.1 # Fraction of population migrated
3. Job Config (job_config)¶
Defines the execution environment and resource requirements:
Local Execution¶
Slurm Cluster Execution¶
job_config:
_target_: genesis.launch.SlurmCondaJobConfig
modules: # Environment modules
- "cuda/12.4"
- "cudnn/8.9.7"
- "hpcx/2.20"
eval_program_path: "genesis/utils/eval_hydra.py"
conda_env: "genesis" # Conda environment name
time: "01:00:00" # Maximum job runtime
cpus: 4 # CPU cores per job
gpus: 1 # GPUs per job
mem: "16G" # Memory per job
4. Task Config¶
Defines problem-specific settings and evaluation functions:
# Task-specific evaluation function
evaluate_function:
_target_: examples.my_task.evaluate.main
program_path: ??? # Filled by runner
results_dir: ??? # Filled by runner
# Job configuration for this task
distributed_job_config:
_target_: genesis.launch.SlurmCondaJobConfig
# ... resource requirements ...
# Evolution settings specific to this task
evo_config:
task_sys_msg: |
You are an expert in [domain].
Key insights: [domain knowledge]
language: "python"
init_program_path: "examples/my_task/initial.py"
job_type: "slurm_conda"
exp_name: "genesis_my_task"
Configuration Parameters¶
Evolution Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
num_generations |
int | 20 | Number of evolutionary generations |
max_parallel_jobs |
int | 1 | Maximum concurrent evaluations |
max_patch_attempts |
int | 10 | Maximum attempts to generate valid patches |
llm_models |
list | ["azure-gpt-4.1"] |
LLM models for mutations |
patch_types |
list | ["diff", "full"] |
Types of code modifications |
patch_type_probs |
list | [0.5, 0.5] |
Probabilities for patch types |
language |
str | "python" |
Programming language |
embedding_model |
str | "text-embedding-3-small" |
Model for code embeddings |
web_search_enabled |
bool | false |
Enable agents to search the web |
web_search_prob |
float | 0.1 |
Probability of using web search per attempt |
Database Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
num_islands |
int | 2 | Number of evolutionary islands |
archive_size |
int | 20 | Size of elite solution archive |
num_archive_inspirations |
int | 4 | Solutions drawn from archive |
num_top_k_inspirations |
int | 2 | Solutions from current generation |
exploitation_ratio |
float | 0.2 | Balance between exploitation/exploration |
elite_selection_ratio |
float | 0.3 | Fraction of elites for selection |
migration_interval |
int | 10 | Generations between island migrations |
migration_rate |
float | 0.1 | Fraction of population migrated |
island_elitism |
bool | true | Preserve elites per island |
Resource Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
time |
str | "01:00:00" |
Maximum job runtime (HH:MM:SS) |
cpus |
int | 4 | CPU cores per job |
gpus |
int | 0 | GPUs per job |
mem |
str | "8G" |
Memory per job |
conda_env |
str | "genesis" |
Conda environment name |
modules |
list | [] |
Environment modules to load |
Pre-configured Variants¶
Shinka uses Hydra for flexible, hierarchical configuration management. The system is designed around composable configuration files that can be mixed and matched to create different experimental setups.
Variants provide pre-configured combinations of settings for common use cases:
Circle Packing Example¶
# configs/variant/circle_packing_example.yaml
defaults:
- override /database@_global_: island_large
- override /evolution@_global_: large_budget
- override /task@_global_: circle_packing
- override /cluster@_global_: local
- _self_
variant_suffix: "_example"
Agent Design Example¶
# configs/variant/agent_design_example.yaml
defaults:
- override /database@_global_: island_medium
- override /evolution@_global_: medium_budget
- override /task@_global_: agent_design
- override /cluster@_global_: local
- _self_
evo_config:
num_generations: 15
variant_suffix: "_agent_example"
Configuration Structure¶
configs/
├── config.yaml # Main config file with defaults
├── cluster/ # Execution environments
│ ├── local.yaml # Local execution
│ ├── gcp.yaml # Google Cloud Platform
│ └── remote.yaml # Remote Slurm clusters
├── database/ # Evolution database settings
│ ├── island_small.yaml # Small-scale evolution (2 islands)
│ ├── island_medium.yaml# Medium-scale evolution (4 islands)
│ └── island_large.yaml # Large-scale evolution (8+ islands)
├── evolution/ # Evolution parameters
│ ├── small_budget.yaml # Few generations, quick runs
│ ├── medium_budget.yaml# Moderate computational budget
│ └── large_budget.yaml # Extensive evolution runs
├── task/ # Problem definitions
│ ├── circle_packing.yaml
│ ├── agent_design.yaml
│ ├── bbo_search.yaml
│ ├── cifar10.yaml
│ ├── cuda_optim.yaml
│ ├── mad_moe.yaml
│ └── novelty_generator.yaml
└── variant/ # Pre-configured combinations
├── circle_packing_example.yaml
├── agent_design_example.yaml
├── mad_moe_example.yaml
└── default.yaml
Creating Custom Configurations¶
Method 1: Custom Variant File¶
Create a new variant file combining existing components:
# configs/variant/my_custom_variant.yaml
defaults:
- override /database@_global_: island_small
- override /evolution@_global_: small_budget
- override /task@_global_: my_task
- override /cluster@_global_: local
- _self_
# Override specific parameters
evo_config:
num_generations: 25
max_parallel_jobs: 2
db_config:
archive_size: 30
migration_interval: 5
variant_suffix: "_custom"
Launch with:
Method 2: Command Line Overrides¶
Override parameters directly on the command line:
genesis_launch \
task=circle_packing \
database=island_large \
evolution=medium_budget \
cluster=local \
evo_config.num_generations=50 \
evo_config.max_parallel_jobs=4 \
db_config.num_islands=6 \
variant_suffix="_custom_run"
Method 3: Custom Task Configuration¶
Create a new task configuration:
# configs/task/my_optimization_task.yaml
evaluate_function:
_target_: examples.my_optimization.evaluate.main
program_path: ???
results_dir: ???
distributed_job_config:
_target_: genesis.launch.LocalJobConfig
eval_program_path: "genesis/utils/eval_hydra.py"
evo_config:
task_sys_msg: |
You are an expert optimization researcher working on [specific problem].
Key insights to consider:
1. [Domain-specific insight 1]
2. [Domain-specific insight 2]
3. [Domain-specific insight 3]
Focus on [specific optimization goals].
language: "python"
init_program_path: "examples/my_optimization/initial.py"
job_type: "local"
exp_name: "genesis_my_optimization"
Advanced Configuration Patterns¶
Multi-Model Evolution¶
Use multiple LLM models with different strengths:
evo_config:
llm_models:
- "azure-gpt-4.1" # Strong reasoning
- "claude-3-sonnet" # Good at code
- "azure-gpt-4o-mini" # Fast iterations
# Optional: Dynamic model selection
llm_dynamic_selection:
strategy: "performance_based"
window_size: 10
Web Search Integration¶
Enable agents to search the web for documentation, libraries, and coding patterns. This is particularly useful for tasks involving niche libraries or new APIs.
web_search_enabled: When set totrue, agents are equipped with a search tool.web_search_prob: Controls how frequently the search tool is made available to the agent. A lower probability (e.g.,0.1) encourages the agent to rely mostly on its internal knowledge but allows for occasional external lookups when stuck.
Configuration Examples¶
Quick Prototyping Setup¶
# Fast iteration for development
defaults:
- override /database@_global_: island_small
- override /evolution@_global_: small_budget
- override /cluster@_global_: local
evo_config:
num_generations: 5
max_parallel_jobs: 1
db_config:
num_islands: 1
archive_size: 10
variant_suffix: "_prototype"
Production Research Setup¶
# Large-scale research experiment
defaults:
- override /database@_global_: island_large
- override /evolution@_global_: large_budget
- override /cluster@_global_: remote
evo_config:
num_generations: 100
max_parallel_jobs: 8
db_config:
num_islands: 8
archive_size: 50
migration_interval: 5
variant_suffix: "_production"
Multi-Task Comparison¶
# Configuration for comparing across tasks
defaults:
- override /database@_global_: island_medium
- override /evolution@_global_: medium_budget
- override /cluster@_global_: local
# Standardized parameters for fair comparison
evo_config:
num_generations: 30
max_parallel_jobs: 2
llm_models: ["azure-gpt-4.1"]
db_config:
num_islands: 4
archive_size: 25
variant_suffix: "_comparison"
Configuration Best Practices¶
1. Start Small, Scale Up¶
- Begin with
island_smallandsmall_budgetconfigurations - Increase complexity as you understand the problem better
2. Use Meaningful Variant Suffixes¶
- Include key parameters in the suffix:
_gen50_islands4_gpt4 - This helps identify experiments in results directories
3. Document Custom Configurations¶
- Add comments explaining parameter choices
- Include expected runtime and resource usage
4. Version Control Configurations¶
- Keep variant files in version control
- Tag configurations used for important results
5. Monitor Resource Usage¶
- Start with conservative resource allocations
- Monitor actual usage and adjust accordingly
For more examples and detailed parameter explanations, see the configuration files in the configs/ directory and the Getting Started Guide.