config.yaml
The master configuration file. Entry point for every simulation section — each top-level key either configures a subsystem directly or points to a subordinate YAML file.
Topic: Configuration Reference
Path: configs/2021/config.yaml (or any path passed via --config)
Overview
config.yaml is the single file passed to create_world.py. It wires together geography, population, venues, households, the timeline pipeline, relationship networks, and serialisation. Subsystems that need their own detailed schema (venues, households, distributors, etc.) are referenced by path and documented separately.
Keys
| Key | Description |
|---|---|
| geography | Geography hierarchy and spatial filter |
| population | Population data source and mode |
| venues | Venue data directory and type catalogue |
| households | Household allocation configuration |
| debug_outputs | Optional auxiliary CSV exports |
| timeline | Ordered pipeline of attribute, distributor, and child-creator steps |
| relationship_pipeline | Social network construction |
| romantic_relationships | Sexual orientation and partnership assignment |
| serialization | HDF5 output path and field selection |
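Because every top-level key maps to one subsystem, a misspelt key is easy to miss. The following is a minimal sketch of a pre-flight check on an already-parsed config dictionary. The helper name `unknown_keys` is hypothetical, and whether the engine itself rejects or ignores unknown keys is not stated here.

```python
# Top-level keys taken from the table above.
KNOWN_SECTIONS = [
    "geography",
    "population",
    "venues",
    "households",
    "debug_outputs",
    "timeline",
    "relationship_pipeline",
    "romantic_relationships",
    "serialization",
]


def unknown_keys(config: dict) -> list:
    """Return the top-level keys that are not documented sections (hypothetical helper)."""
    return sorted(key for key in config if key not in KNOWN_SECTIONS)


# "serialisation" (British spelling) is not the documented key name.
parsed = {"geography": {}, "population": {}, "serialisation": {}}
suspicious = unknown_keys(parsed)
```

Running a check like this before world creation would flag `serialisation` as a likely typo for `serialization`.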
geography
```yaml
geography:
  data_dir: "data/geography"
  levels: ["SGU", "MGU", "LGU"]
  load_all: false
  filter:
    level: LGU
    codes: ["Durham", "Gateshead"]
    file: "filters/my_codes.txt"
```
data_dir must contain hierarchy.csv, coord_sgu.csv, and coord_mgu.csv.
levels names the hierarchy from smallest to largest unit. Default is ["SGU", "MGU", "LGU"]; any number of levels with any names are accepted (e.g. ["SGU", "MGU", "LGU", "XLGU"] for a four-level hierarchy).
load_all: true loads every unit and ignores filter. When false, filter is applied.
filter.level must match one of the names in levels. codes is an inline list; file is a path to a text file with one code per line. When both are set, file takes precedence. If level is null or no codes are supplied, the filter is not applied and all units are loaded.
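The precedence rules above can be sketched as a single function over the parsed geography section. This is an illustration, not the engine's code: the name `resolve_filter` and the `base_dir` parameter are hypothetical, and treating a codes file with no entries as "no codes supplied" is an assumption.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def resolve_filter(geography: dict, base_dir: Path = Path(".")) -> Optional[Tuple[str, List[str]]]:
    """Return (level, codes) to filter on, or None when every unit is loaded."""
    if geography.get("load_all", False):
        return None  # load_all: true ignores filter entirely

    flt = geography.get("filter") or {}
    level = flt.get("level")
    if level is None:
        return None  # a null level means the filter is not applied

    levels = geography.get("levels", ["SGU", "MGU", "LGU"])
    if level not in levels:
        raise ValueError(f"filter.level {level!r} is not one of {levels}")

    codes_file = flt.get("file")
    if codes_file:
        # file takes precedence over the inline codes list: one code per line
        text = (base_dir / codes_file).read_text(encoding="utf-8")
        codes = [line.strip() for line in text.splitlines() if line.strip()]
    else:
        codes = list(flt.get("codes") or [])

    # Assumption: an empty result counts as "no codes supplied".
    return (level, codes) if codes else None


selected = resolve_filter(
    {"load_all": False, "filter": {"level": "LGU", "codes": ["Durham", "Gateshead"]}}
)
```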
population
Three modes are supported.
Matrix mode (default) — reads aggregated age–sex matrices:
```yaml
population:
  data_dir: "data/population"
  demographics_male_file: "demographics_male.csv"
  demographics_female_file: "demographics_female.csv"
```
Each CSV has rows indexed by geo unit and columns indexed by age (0–99).
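As an illustration of that layout, the sketch below parses a small matrix with the standard `csv` module. It assumes the first column holds the geo unit code and the remaining headers are integer ages. The header name `geo_unit`, the sample codes, and the helper `read_age_matrix` are all hypothetical.

```python
import csv
import io


def read_age_matrix(csv_text: str) -> dict:
    """Parse a geo-unit x age matrix into {geo_unit: {age: count}} (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    ages = [int(label) for label in header[1:]]  # assumed: first column is the geo unit code
    matrix = {}
    for row in reader:
        counts = [int(value) for value in row[1:]]
        matrix[row[0]] = dict(zip(ages, counts))
    return matrix


# Truncated to ages 0-2 for brevity; a real file would run to age 99.
sample = "geo_unit,0,1,2\nE001,5,4,6\nE002,2,3,1\n"
male = read_age_matrix(sample)
total_e001 = sum(male["E001"].values())
```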
Explicit mode — reads a pre-built individual-level CSV:
```yaml
population:
  type: "explicit"
  data_dir: "data/population"
  filename: "population.csv"
  column_mapping:
    age: "Age"
    sex: "Sex"
    Occode: "Occode"
```
Explicit batch mode — as above but reads multiple CSVs from data_dir rather than a single file:
```yaml
population:
  type: "explicit_batch"
  data_dir: "1911_data/population"
  column_mapping:
    age: "Age"
    sex: "Sex"
```
column_mapping maps engine attribute names to CSV column names. Any additional columns listed are loaded as per-person attributes. filename is required for type: "explicit" and ignored for type: "explicit_batch".
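To make the direction of the mapping concrete (engine attribute name on the left, CSV column on the right), here is a minimal sketch that applies a `column_mapping` to CSV rows. The helper `rows_to_people`, the sample data, and the `Unused` column are hypothetical. The real loader may also convert types, which this sketch does not attempt.

```python
import csv
import io


def rows_to_people(csv_text: str, column_mapping: dict) -> list:
    """Rename mapped CSV columns to engine attribute names; unmapped columns are dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    people = []
    for row in reader:
        people.append({attr: row[column] for attr, column in column_mapping.items()})
    return people


sample = "Age,Sex,Occode,Unused\n34,F,210,x\n7,M,,y\n"
mapping = {"age": "Age", "sex": "Sex", "Occode": "Occode"}
people = rows_to_people(sample, mapping)
```

Here `Occode` is carried through as an extra per-person attribute because it is listed in the mapping, while `Unused` is not loaded.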
venues
```yaml
venues:
  data_dir: "data/venues"
  config_file: "configs/2021/venues/venues_config.yaml"
  export_file: "venue_allocations.csv"
```
data_dir is the root directory for all venue CSVs. config_file is the venue type catalogue — see Venues Config. export_file is optional; when set, venue allocation results are written there.
households
```yaml
households:
  enabled: true
  data_dir: "data/households"
  data_file: "households.csv"
  config_file: "configs/2021/households/households_config.yaml"
  strategy_file: "configs/2021/households/allocation_strategy.yaml"
  export_file: "household_allocations.csv"
```
enabled: false skips household allocation entirely.
data_file is a CSV of household composition counts per geo unit. config_file defines age categories and demotion/promotion rules — see Households Config.
Three allocation modes are selected by which optional files are set:
| Mode | How to activate |
|---|---|
| Unified strategy (households + communal venues) | Set strategy_file; see Allocation Strategy |
| Household-only multi-round | Set rounds_file; set strategy_file: null |
| Single-pass (simple) | Set both strategy_file and rounds_file to null |
export_file is optional; when set, household allocation results are written there.
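The mode selection in the table above can be read as a small decision function. The sketch below is illustrative only: the function name and the returned labels are hypothetical. Two points are assumptions, not documented behaviour: `strategy_file` is given priority when both files are set, and `enabled` is treated as true when omitted.

```python
def allocation_mode(households: dict) -> str:
    """Pick the household allocation mode from which optional files are set."""
    if not households.get("enabled", True):  # assumed default when the key is omitted
        return "disabled"
    if households.get("strategy_file"):
        return "unified_strategy"  # households + communal venues
    if households.get("rounds_file"):
        return "multi_round"  # household-only, strategy_file is null
    return "single_pass"  # both files null


mode = allocation_mode(
    {
        "enabled": True,
        "strategy_file": "configs/2021/households/allocation_strategy.yaml",
    }
)
```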
debug_outputs
```yaml
debug_outputs:
  enabled: false
```
When enabled: true, the engine writes auxiliary CSVs during world creation: household_allocations.csv, venue_allocations.csv, residence_venues.csv, and unallocated_people.csv. These build large in-memory DataFrames — disable for country-scale runs.
timeline
```yaml
timeline:
  enabled: true
  steps:
    - type: attribute
      config: "configs/2021/attributes/attribute_assignment.yaml"
    - type: distributor
      config: "configs/2021/distributors/school_distributor.yaml"
    - type: child_creator
      config: "configs/2021/venue_child_creators/school_classrooms.yaml"
```
steps is an ordered list executed top to bottom. Step type is one of:
| Type | Effect |
|---|---|
| attribute | Assigns a property to each eligible person |
| distributor | Places people into venues; writes to activity_map |
| child_creator | Sub-divides a parent venue into child venues |
Earlier steps get first pick of the population pool. Order is critical: education distributors must run before workplace assignment so the primary-activity filter works correctly.
See Attribute Assignment, Distributors, and Venue Child Creators.
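Since list order is the execution order, it can help to validate the steps and print the resulting plan before a long run. A minimal sketch, assuming the parsed timeline section is a plain dictionary. The helper `plan_timeline` is hypothetical, and treating `enabled` as true when omitted is an assumption.

```python
VALID_STEP_TYPES = {"attribute", "distributor", "child_creator"}


def plan_timeline(timeline: dict) -> list:
    """Validate steps and return (position, type, config path) in execution order."""
    if not timeline.get("enabled", True):  # assumed default when the key is omitted
        return []
    plan = []
    for position, step in enumerate(timeline.get("steps", [])):
        step_type = step.get("type")
        if step_type not in VALID_STEP_TYPES:
            raise ValueError(f"step {position}: unknown type {step_type!r}")
        if "config" not in step:
            raise ValueError(f"step {position}: missing config path")
        plan.append((position, step_type, step["config"]))
    return plan


timeline = {
    "enabled": True,
    "steps": [
        {"type": "attribute", "config": "configs/2021/attributes/attribute_assignment.yaml"},
        {"type": "distributor", "config": "configs/2021/distributors/school_distributor.yaml"},
        {"type": "child_creator", "config": "configs/2021/venue_child_creators/school_classrooms.yaml"},
    ],
}
plan = plan_timeline(timeline)
```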
relationship_pipeline
```yaml
relationship_pipeline:
  enabled: true
  relationships:
    - config: "configs/2021/relationships/social_networks.yaml"
```
Runs after venue assignment. Each entry in relationships builds one social network; multiple entries produce multiple independent networks. See Social Networks.
romantic_relationships
```yaml
romantic_relationships:
  enabled: true
  config: "configs/2021/relationships/romantic_relationships.yaml"
```
Runs after household distribution and relationship_pipeline. Assigns sexual orientation and builds partnership networks. See Romantic Relationships.
serialization
```yaml
serialization:
  enabled: true
  config_file: "configs/2021/serialization_config.yaml"
  output_dir: "output/2021"
  filename: "world_state.h5"
  compression: "gzip"
  compression_level: 4
```
config_file controls which person and venue fields are written — see Serialization Config. output_dir is created if absent. filename is overridden by --filename at the CLI.
compression is an HDF5 codec ("gzip" by default); omit to disable compression. compression_level controls gzip effort (default 4).
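The sketch below shows how these settings might be resolved into an output path and a set of dataset keyword arguments. Both helper names are hypothetical. The keywords `compression` and `compression_opts` are the ones accepted by h5py's `create_dataset`, but whether the engine writes through h5py is an assumption, as is treating an absent or null `compression` key as "no compression".

```python
from pathlib import Path
from typing import Optional


def output_path(serialization: dict, cli_filename: Optional[str] = None) -> Path:
    """Join output_dir and filename; a --filename value replaces the configured name."""
    filename = cli_filename or serialization["filename"]
    return Path(serialization["output_dir"]) / filename


def compression_options(serialization: dict) -> dict:
    """Build dataset keyword arguments (h5py-style names, assumed)."""
    codec = serialization.get("compression")
    if not codec:
        return {}  # assumed: key omitted or null means no compression
    options = {"compression": codec}
    if codec == "gzip":
        options["compression_opts"] = serialization.get("compression_level", 4)
    return options


serialization = {
    "output_dir": "output/2021",
    "filename": "world_state.h5",
    "compression": "gzip",
    "compression_level": 4,
}
default_target = output_path(serialization)
overridden_target = output_path(serialization, cli_filename="durham_run.h5")
```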
CLI Overrides
| Flag | Overrides |
|---|---|
| --config PATH | Path to this config file (default: configs/2021/config.yaml) |
| --filename NAME | serialization.filename |
| --load-all | Sets geography.load_all: true |
| --lgu CODE[,CODE] | geography.filter at LGU level |
| --lgu-file PATH | geography.filter at LGU level, codes from file |
| --mgu CODE[,CODE] | geography.filter at MGU level |
| --mgu-file PATH | geography.filter at MGU level, codes from file |
| --sgu CODE[,CODE] | geography.filter at SGU level |
| --sgu-file PATH | geography.filter at SGU level, codes from file |
CLI flags take precedence over config-file settings. --lgu-file takes precedence over --lgu when both are supplied.
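A minimal `argparse` sketch of how such overrides could be layered onto the loaded config. The flag names come from the table above; everything else is illustrative, including the helper names `build_parser` and `apply_overrides`. The sketch assumes the default level names (LGU, MGU, SGU) and, when flags for several levels are given, simply uses the first one it finds, which is not documented behaviour.

```python
import argparse
import copy


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the documented CLI flags")
    parser.add_argument("--config", default="configs/2021/config.yaml")
    parser.add_argument("--filename")
    parser.add_argument("--load-all", action="store_true")
    for level in ("lgu", "mgu", "sgu"):
        parser.add_argument(f"--{level}")       # comma-separated codes
        parser.add_argument(f"--{level}-file")  # path to a file of codes
    return parser


def apply_overrides(config: dict, args: argparse.Namespace) -> dict:
    """Return a copy of config with CLI values taking precedence."""
    merged = copy.deepcopy(config)
    geography = merged.setdefault("geography", {})
    if args.filename:
        merged.setdefault("serialization", {})["filename"] = args.filename
    if args.load_all:
        geography["load_all"] = True
    for level in ("lgu", "mgu", "sgu"):
        codes = getattr(args, level)
        codes_file = getattr(args, f"{level}_file")
        if codes_file:
            # the file flag wins over the inline flag for the same level
            geography["filter"] = {"level": level.upper(), "file": codes_file}
            break
        if codes:
            geography["filter"] = {"level": level.upper(), "codes": codes.split(",")}
            break
    return merged


args = build_parser().parse_args(
    ["--lgu", "Durham,Gateshead", "--lgu-file", "codes.txt", "--filename", "run.h5"]
)
base = {"geography": {"load_all": False}, "serialization": {"filename": "world_state.h5"}}
effective = apply_overrides(base, args)
```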