Genesis Configuration Guide ⚙️

This guide covers the Genesis configuration system: its parameters, file layout, and advanced configuration patterns.

Note: Genesis is based on Shinka AI and remains compatible with the original implementation. The core Python package is named `genesis`.

Core Configuration Components

1. Evolution Config (`evo_config`)

Controls the core evolutionary algorithm parameters:

evo_config:
  _target_: genesis.core.EvolutionConfig
  num_generations: 20              # Number of evolution generations
  max_parallel_jobs: 1             # Maximum parallel evaluations
  max_patch_attempts: 10           # Max attempts to generate valid patches

  # LLM Configuration
  llm_models:                      # List of LLM models for mutations
    - "azure-gpt-4.1"
  llm_dynamic_selection: null      # Dynamic model selection strategy
  embedding_model: "text-embedding-3-small"

  # Patch Configuration
  patch_types:                     # Types of code modifications
    - "diff"                       # Diff-based patches
    - "full"                       # Full code replacement
  patch_type_probs:                # Probabilities for each patch type
    - 0.5
    - 0.5

  # Task Configuration
  language: "python"               # Programming language
  init_program_path: "???"         # Path to initial program
  task_sys_msg: "???"              # System message for LLM
  job_type: "local"                # Job execution type
  results_dir: ${output_dir}       # Results directory

  # Web Search Configuration
  web_search_enabled: false        # Enable web search for agents
  web_search_prob: 0.1             # Probability (0-1) of using search per attempt
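The `patch_types` and `patch_type_probs` lists pair up by position. The sketch below shows one way such a pair could be validated and sampled. It assumes the probabilities are meant to sum to 1 and that one patch type is drawn per attempt; the helper names are hypothetical and are not part of the Genesis API.

```python
import random


def validate_patch_config(patch_types, patch_type_probs, tol=1e-9):
    """Check that every patch type has a probability and that they sum to 1."""
    if len(patch_types) != len(patch_type_probs):
        raise ValueError("patch_types and patch_type_probs must have the same length")
    if any(p < 0 for p in patch_type_probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(patch_type_probs) - 1.0) > tol:
        raise ValueError("patch_type_probs must sum to 1 (assumed requirement)")


def sample_patch_type(patch_types, patch_type_probs, rng=random):
    """Draw one patch type; entries pair up by list position."""
    validate_patch_config(patch_types, patch_type_probs)
    return rng.choices(patch_types, weights=patch_type_probs, k=1)[0]


rng = random.Random(0)  # seeded only to make the demo repeatable
draws = [sample_patch_type(["diff", "full"], [0.5, 0.5], rng) for _ in range(1000)]
print(draws.count("diff"), draws.count("full"))
```

With equal probabilities, the two counts should come out roughly balanced.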

2. Database Config (`db_config`)

Manages the evolutionary database and island topology:

db_config:
  _target_: genesis.database.DatabaseConfig
  db_path: "evolution_db.sqlite"   # SQLite database path

  # Island Configuration
  num_islands: 2                   # Number of evolutionary islands
  island_elitism: true             # Enable elite preservation per island

  # Archive Configuration
  archive_size: 20                 # Size of elite solution archive
  num_archive_inspirations: 4      # Solutions drawn from archive
  num_top_k_inspirations: 2        # Solutions from current generation

  # Selection and Migration
  exploitation_ratio: 0.2          # Exploitation vs exploration balance
  elite_selection_ratio: 0.3       # Fraction of elites for selection
  migration_interval: 10           # Generations between migrations
  migration_rate: 0.1              # Fraction of population migrated
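To make `migration_interval` and `migration_rate` concrete, here is a small sketch of the arithmetic they imply. It assumes generations are numbered from 1, that migration happens on every multiple of the interval, and that the migrant count is rounded down; Genesis may handle these details differently.

```python
def is_migration_generation(generation, migration_interval):
    """True on every `migration_interval`-th generation (assumed schedule)."""
    return generation > 0 and generation % migration_interval == 0


def num_migrants(island_population, migration_rate):
    """Programs leaving an island per migration; rounding down is an assumption."""
    return int(island_population * migration_rate)


# With num_generations: 20 and migration_interval: 10
migration_gens = [g for g in range(1, 21) if is_migration_generation(g, 10)]
print(migration_gens)         # [10, 20]
print(num_migrants(30, 0.1))  # 3
```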

3. Job Config (`job_config`)

Defines the execution environment and resource requirements:

Local Execution

job_config:
  _target_: genesis.launch.LocalJobConfig
  eval_program_path: "genesis/evaluate.py"

Slurm Cluster Execution

job_config:
  _target_: genesis.launch.SlurmCondaJobConfig
  modules:                         # Environment modules
    - "cuda/12.4"
    - "cudnn/8.9.7"
    - "hpcx/2.20"
  eval_program_path: "genesis/utils/eval_hydra.py"
  conda_env: "genesis"             # Conda environment name
  time: "01:00:00"                 # Maximum job runtime
  cpus: 4                          # CPU cores per job
  gpus: 1                          # GPUs per job
  mem: "16G"                       # Memory per job
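The resource fields above correspond naturally to standard `sbatch` options and shell setup commands. The sketch below shows one plausible translation. The function names are hypothetical, and the way Genesis actually builds its submission script is not shown in this guide.

```python
def build_sbatch_args(job):
    """Translate resource fields into standard sbatch options."""
    args = [
        f"--time={job['time']}",
        f"--cpus-per-task={job['cpus']}",
        f"--mem={job['mem']}",
    ]
    if job.get("gpus", 0) > 0:
        # Generic-resource syntax for requesting GPUs
        args.append(f"--gres=gpu:{job['gpus']}")
    return args


def build_setup_lines(job):
    """Shell lines that prepare the environment before the evaluation runs."""
    lines = [f"module load {name}" for name in job.get("modules", [])]
    lines.append(f"conda activate {job['conda_env']}")
    return lines


job = {
    "modules": ["cuda/12.4", "cudnn/8.9.7", "hpcx/2.20"],
    "conda_env": "genesis",
    "time": "01:00:00",
    "cpus": 4,
    "gpus": 1,
    "mem": "16G",
}
print(" ".join(build_sbatch_args(job)))
print("\n".join(build_setup_lines(job)))
```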

4. Task Config

Defines problem-specific settings and evaluation functions:

# Task-specific evaluation function
evaluate_function:
  _target_: examples.my_task.evaluate.main
  program_path: ???               # Filled by runner
  results_dir: ???                # Filled by runner

# Job configuration for this task
distributed_job_config:
  _target_: genesis.launch.SlurmCondaJobConfig
  # ... resource requirements ...

# Evolution settings specific to this task
evo_config:
  task_sys_msg: |
    You are an expert in [domain].
    Key insights: [domain knowledge]
  language: "python"
  init_program_path: "examples/my_task/initial.py"
  job_type: "slurm_conda"

exp_name: "genesis_my_task"
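The `evaluate_function` target is called with `program_path` and `results_dir`. A minimal sketch of what such a `main` function could look like follows. The candidate's `solve()` entry point, the `metrics.json` file name, and the `score` key are all assumptions made for illustration; check the bundled examples for the contract Genesis actually expects.

```python
import importlib.util
import json
import tempfile
from pathlib import Path


def main(program_path: str, results_dir: str) -> dict:
    """Load the candidate program, score it, and write the result to results_dir."""
    spec = importlib.util.spec_from_file_location("candidate", program_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Hypothetical contract: the candidate exposes solve() returning a number.
    score = float(module.solve())

    out_dir = Path(results_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    metrics = {"score": score}
    # metrics.json is an assumed file name, not a documented Genesis output.
    (out_dir / "metrics.json").write_text(json.dumps(metrics))
    return metrics


# Demo with a throwaway candidate program
with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp) / "initial.py"
    candidate.write_text("def solve():\n    return 1.5\n")
    demo_metrics = main(str(candidate), str(Path(tmp) / "results"))
    saved = json.loads((Path(tmp) / "results" / "metrics.json").read_text())
print(demo_metrics)
```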

Configuration Parameters

Evolution Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `num_generations` | int | `20` | Number of evolutionary generations |
| `max_parallel_jobs` | int | `1` | Maximum concurrent evaluations |
| `max_patch_attempts` | int | `10` | Maximum attempts to generate valid patches |
| `llm_models` | list | `["azure-gpt-4.1"]` | LLM models for mutations |
| `patch_types` | list | `["diff", "full"]` | Types of code modifications |
| `patch_type_probs` | list | `[0.5, 0.5]` | Probabilities for patch types |
| `language` | str | `"python"` | Programming language |
| `embedding_model` | str | `"text-embedding-3-small"` | Model for code embeddings |
| `web_search_enabled` | bool | `false` | Enable agents to search the web |
| `web_search_prob` | float | `0.1` | Probability of using web search per attempt |

Database Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `num_islands` | int | `2` | Number of evolutionary islands |
| `archive_size` | int | `20` | Size of elite solution archive |
| `num_archive_inspirations` | int | `4` | Solutions drawn from archive |
| `num_top_k_inspirations` | int | `2` | Solutions from current generation |
| `exploitation_ratio` | float | `0.2` | Balance between exploitation/exploration |
| `elite_selection_ratio` | float | `0.3` | Fraction of elites for selection |
| `migration_interval` | int | `10` | Generations between island migrations |
| `migration_rate` | float | `0.1` | Fraction of population migrated |
| `island_elitism` | bool | `true` | Preserve elites per island |
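As an illustration of how `num_archive_inspirations` and `num_top_k_inspirations` could combine, the sketch below draws some programs from the archive and adds the best programs of the current generation. Uniform sampling from the archive and the `(program_id, score)` tuple layout are assumptions; the selection logic Genesis uses is not described here.

```python
import random


def pick_inspirations(archive, current_generation, num_archive, num_top_k, rng=random):
    """Both inputs are lists of (program_id, score) tuples (assumed layout)."""
    # Best programs of the current generation, highest score first
    top_k = sorted(current_generation, key=lambda p: p[1], reverse=True)[:num_top_k]
    # Uniform draw without replacement from the archive (an assumption)
    from_archive = rng.sample(archive, k=min(num_archive, len(archive)))
    return from_archive + top_k


archive = [(f"a{i}", i / 20) for i in range(20)]  # archive_size: 20
generation = [("g0", 0.1), ("g1", 0.9), ("g2", 0.5)]
picks = pick_inspirations(archive, generation, 4, 2, random.Random(0))
print(len(picks))  # 6
```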

Resource Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `time` | str | `"01:00:00"` | Maximum job runtime (HH:MM:SS) |
| `cpus` | int | `4` | CPU cores per job |
| `gpus` | int | `0` | GPUs per job |
| `mem` | str | `"8G"` | Memory per job |
| `conda_env` | str | `"genesis"` | Conda environment name |
| `modules` | list | `[]` | Environment modules to load |
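Because `time` and `mem` are strings, it can help to convert them to numbers when checking a configuration before submission. The helpers below are a hypothetical sketch of that conversion. They accept only the `HH:MM:SS` form listed in the table and the `M`, `G`, and `T` memory suffixes.

```python
def runtime_seconds(time_str):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in time_str.split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid runtime: {time_str!r}")
    return hours * 3600 + minutes * 60 + seconds


_MB_PER_UNIT = {"M": 1, "G": 1024, "T": 1024 * 1024}


def memory_mb(mem_str):
    """Convert a size such as '8G' to megabytes."""
    unit = mem_str[-1].upper()
    if unit not in _MB_PER_UNIT:
        raise ValueError(f"unsupported memory unit in {mem_str!r}")
    return int(mem_str[:-1]) * _MB_PER_UNIT[unit]


print(runtime_seconds("01:00:00"))  # 3600
print(memory_mb("8G"))              # 8192
```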

Pre-configured Variants

Genesis uses Hydra for flexible, hierarchical configuration management. Configuration is split into composable files that can be mixed and matched to create different experimental setups.

Variants provide pre-configured combinations of settings for common use cases:

Circle Packing Example

# configs/variant/circle_packing_example.yaml
defaults:
  - override /database@_global_: island_large
  - override /evolution@_global_: large_budget
  - override /task@_global_: circle_packing
  - override /cluster@_global_: local
  - _self_

variant_suffix: "_example"

Agent Design Example

# configs/variant/agent_design_example.yaml
defaults:
  - override /database@_global_: island_medium
  - override /evolution@_global_: medium_budget
  - override /task@_global_: agent_design
  - override /cluster@_global_: local
  - _self_

evo_config:
  num_generations: 15

variant_suffix: "_agent_example"
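In both variants, `_self_` comes last in the defaults list, so the keys written in the variant file itself are merged after the selected group files and take precedence. The sketch below is a simplified model of that ordering using plain dictionaries. It is not Hydra's implementation, and the contents shown for `medium_budget` are illustrative values rather than the real file.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return base updated with override; nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def compose(layers):
    """Merge config layers in order; later layers win."""
    result = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical contents of evolution/medium_budget.yaml (illustrative values)
medium_budget = {"evo_config": {"num_generations": 40, "max_parallel_jobs": 2}}
# Keys from the variant file itself, applied last because _self_ ends the list
variant_self = {"evo_config": {"num_generations": 15}, "variant_suffix": "_agent_example"}

config = compose([medium_budget, variant_self])
print(config["evo_config"])  # {'num_generations': 15, 'max_parallel_jobs': 2}
```

Only `num_generations` is replaced; keys that the variant does not mention keep the value from the group file.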

Configuration Structure

configs/
├── config.yaml           # Main config file with defaults
├── cluster/              # Execution environments
│   ├── local.yaml        # Local execution
│   ├── gcp.yaml          # Google Cloud Platform
│   └── remote.yaml       # Remote Slurm clusters
├── database/             # Evolution database settings
│   ├── island_small.yaml   # Small-scale evolution (2 islands)
│   ├── island_medium.yaml  # Medium-scale evolution (4 islands)
│   └── island_large.yaml   # Large-scale evolution (8+ islands)
├── evolution/            # Evolution parameters
│   ├── small_budget.yaml   # Few generations, quick runs
│   ├── medium_budget.yaml  # Moderate computational budget
│   └── large_budget.yaml   # Extensive evolution runs
├── task/                 # Problem definitions
│   ├── circle_packing.yaml
│   ├── agent_design.yaml
│   ├── bbo_search.yaml
│   ├── cifar10.yaml
│   ├── cuda_optim.yaml
│   ├── mad_moe.yaml
│   └── novelty_generator.yaml
└── variant/              # Pre-configured combinations
    ├── circle_packing_example.yaml
    ├── agent_design_example.yaml
    ├── mad_moe_example.yaml
    └── default.yaml
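Each subdirectory is a Hydra config group, and each YAML file in it is an option selectable by its file stem (for example `cluster=local` selects `cluster/local.yaml`). The helper below is a hypothetical convenience sketch that lists the options available under a config root. The demo builds a throwaway directory rather than reading the real `configs/` tree.

```python
import tempfile
from pathlib import Path


def list_config_groups(config_root):
    """Map each config group (subdirectory) to its selectable option names."""
    root = Path(config_root)
    groups = {}
    for group_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        groups[group_dir.name] = sorted(f.stem for f in group_dir.glob("*.yaml"))
    return groups


# Demo on a temporary layout that mirrors part of the tree above
with tempfile.TemporaryDirectory() as tmp:
    for rel in ["cluster/local.yaml", "cluster/remote.yaml", "database/island_small.yaml"]:
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    (Path(tmp) / "config.yaml").touch()  # top-level files are not groups
    groups = list_config_groups(tmp)
print(groups)
```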

Creating Custom Configurations

Method 1: Custom Variant File

Create a new variant file combining existing components:

# configs/variant/my_custom_variant.yaml
defaults:
  - override /database@_global_: island_small
  - override /evolution@_global_: small_budget
  - override /task@_global_: my_task
  - override /cluster@_global_: local
  - _self_

# Override specific parameters
evo_config:
  num_generations: 25
  max_parallel_jobs: 2

db_config:
  archive_size: 30
  migration_interval: 5

variant_suffix: "_custom"

Launch with:

genesis_launch variant=my_custom_variant

Method 2: Command Line Overrides

Override parameters directly on the command line:

genesis_launch \
    task=circle_packing \
    database=island_large \
    evolution=medium_budget \
    cluster=local \
    evo_config.num_generations=50 \
    evo_config.max_parallel_jobs=4 \
    db_config.num_islands=6 \
    variant_suffix="_custom_run"
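Dotted keys such as `evo_config.num_generations=50` address nested fields of the composed configuration. The sketch below models only that part: it splits each override on the first `=` and walks the dotted path. Group selections such as `task=circle_packing` work differently (they choose a file), and Hydra's real override grammar is richer than this simplified parser.

```python
import ast


def parse_value(text):
    """Interpret ints, floats, and quoted strings; fall back to the raw text."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config, overrides):
    """Apply key.path=value overrides to a nested dict in place."""
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return config


# The shell strips the quotes, so the suffix arrives as a bare string.
cfg = apply_overrides({}, [
    "evo_config.num_generations=50",
    "evo_config.max_parallel_jobs=4",
    "db_config.num_islands=6",
    "variant_suffix=_custom_run",
])
print(cfg)
```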

Method 3: Custom Task Configuration

Create a new task configuration:

# configs/task/my_optimization_task.yaml
evaluate_function:
  _target_: examples.my_optimization.evaluate.main
  program_path: ???
  results_dir: ???

distributed_job_config:
  _target_: genesis.launch.LocalJobConfig
  eval_program_path: "genesis/utils/eval_hydra.py"

evo_config:
  task_sys_msg: |
    You are an expert optimization researcher working on [specific problem].

    Key insights to consider:
    1. [Domain-specific insight 1]
    2. [Domain-specific insight 2]
    3. [Domain-specific insight 3]

    Focus on [specific optimization goals].
  language: "python"
  init_program_path: "examples/my_optimization/initial.py"
  job_type: "local"

exp_name: "genesis_my_optimization"

Advanced Configuration Patterns

Multi-Model Evolution

Use multiple LLM models with different strengths:

evo_config:
  llm_models:
    - "azure-gpt-4.1"      # Strong reasoning
    - "claude-3-sonnet"    # Good at code
    - "azure-gpt-4o-mini"  # Fast iterations

  # Optional: Dynamic model selection
  llm_dynamic_selection:
    strategy: "performance_based"
    window_size: 10
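This guide does not describe how the `performance_based` strategy works. One plausible reading, sketched below purely as an illustration, keeps the last `window_size` rewards per model and samples models in proportion to their recent mean reward. The class name, the reward range of 0 to 1, and the weighting rule are all assumptions.

```python
import random
from collections import defaultdict, deque


class WindowedModelSelector:
    """Sample models in proportion to their mean reward over a sliding window."""

    def __init__(self, models, window_size=10, rng=None):
        self.models = list(models)
        # deque(maxlen=...) silently drops the oldest reward once the window is full
        self.history = defaultdict(lambda: deque(maxlen=window_size))
        self.rng = rng or random.Random()

    def record(self, model, reward):
        """Store a reward in [0, 1] for a model (assumed reward range)."""
        self.history[model].append(reward)

    def choose(self):
        weights = []
        for model in self.models:
            rewards = self.history[model]
            # Untried models get weight 1.0 so each one is sampled early on
            weights.append(sum(rewards) / len(rewards) if rewards else 1.0)
        if sum(weights) <= 0:
            return self.rng.choice(self.models)  # no signal yet: uniform choice
        return self.rng.choices(self.models, weights=weights, k=1)[0]


selector = WindowedModelSelector(
    ["azure-gpt-4.1", "claude-3-sonnet", "azure-gpt-4o-mini"],
    window_size=10,
    rng=random.Random(0),
)
for _ in range(12):
    selector.record("azure-gpt-4.1", 1.0)
    selector.record("claude-3-sonnet", 0.0)
    selector.record("azure-gpt-4o-mini", 0.0)
print(selector.choose())  # azure-gpt-4.1
```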

Web Search Integration

Enable agents to search the web for documentation, libraries, and coding patterns. This is particularly useful for tasks involving niche libraries or new APIs.

evo_config:
  web_search_enabled: true
  web_search_prob: 0.1     # 10% chance per patch attempt

`web_search_enabled`: When set to `true`, agents are equipped with a search tool.
`web_search_prob`: Probability (0 to 1) that the search tool is made available to the agent on a given patch attempt. A low value such as 0.1 keeps the agent relying mostly on its internal knowledge while still allowing occasional external lookups when it is stuck.
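The two settings can be read as a per-attempt coin flip that is only taken when search is enabled. The sketch below illustrates that reading. The function name is hypothetical, and the gating inside Genesis may be implemented differently.

```python
import random


def search_tool_available(web_search_enabled, web_search_prob, rng=random):
    """Decide, for one patch attempt, whether the agent gets the search tool."""
    if not web_search_enabled:
        return False
    return rng.random() < web_search_prob


rng = random.Random(0)  # seeded only to make the demo repeatable
attempts = 10_000
hits = sum(search_tool_available(True, 0.1, rng) for _ in range(attempts))
print(hits / attempts)  # close to 0.1
```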

Configuration Examples

Quick Prototyping Setup

# Fast iteration for development
defaults:
  - override /database@_global_: island_small
  - override /evolution@_global_: small_budget
  - override /cluster@_global_: local
  - _self_

evo_config:
  num_generations: 5
  max_parallel_jobs: 1

db_config:
  num_islands: 1
  archive_size: 10

variant_suffix: "_prototype"

Production Research Setup

# Large-scale research experiment
defaults:
  - override /database@_global_: island_large
  - override /evolution@_global_: large_budget
  - override /cluster@_global_: remote
  - _self_

evo_config:
  num_generations: 100
  max_parallel_jobs: 8

db_config:
  num_islands: 8
  archive_size: 50
  migration_interval: 5

variant_suffix: "_production"

Multi-Task Comparison

# Configuration for comparing across tasks
defaults:
  - override /database@_global_: island_medium
  - override /evolution@_global_: medium_budget
  - override /cluster@_global_: local
  - _self_

# Standardized parameters for fair comparison
evo_config:
  num_generations: 30
  max_parallel_jobs: 2
  llm_models: ["azure-gpt-4.1"]

db_config:
  num_islands: 4
  archive_size: 25

variant_suffix: "_comparison"

Configuration Best Practices

1. Start Small, Scale Up

Begin with island_small and small_budget configurations
Increase complexity as you understand the problem better

2. Use Meaningful Variant Suffixes

Include key parameters in the suffix: _gen50_islands4_gpt4
This helps identify experiments in results directories
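A suffix in this style can be generated from the parameters themselves so that it never drifts from the actual settings. The helper below is a hypothetical sketch; `model_tag` is a short label you choose for the model.

```python
def make_variant_suffix(num_generations, num_islands, model_tag):
    """Build a suffix such as _gen50_islands4_gpt4 from key parameters."""
    return f"_gen{num_generations}_islands{num_islands}_{model_tag}"


print(make_variant_suffix(50, 4, "gpt4"))  # _gen50_islands4_gpt4
```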

3. Document Custom Configurations

Add comments explaining parameter choices
Include expected runtime and resource usage

4. Version Control Configurations

Keep variant files in version control
Tag configurations used for important results

5. Monitor Resource Usage

Start with conservative resource allocations
Monitor actual usage and adjust accordingly

For more examples and detailed parameter explanations, see the configuration files in the configs/ directory and the Getting Started Guide.