What Is the CitySim.150Y.CF Calibration Protocol?

A backtest tells us whether the city engine missed reality.

A calibration protocol tells us what we are allowed to change after that miss, how we change it, and how we avoid cheating while doing so.

That is the difference.

Without calibration rules, a model can always be made to look better after the fact. You just keep nudging weights, bending thresholds, softening definitions, and smoothing ugly errors until the output looks elegant again. But that is not calibration. That is performance theatre.

So after the Variable Registry, Observable Proxy Map, Data Adapter Spec, and Backtest Protocol, the next thing CitySim needs is the Calibration Protocol.

One-sentence answer

The CitySim.150Y.CF Calibration Protocol is the canonical rulebook for how the engine may be tuned after backtesting, including what parameters can change, what must stay fixed, how weights and transition rules are updated, how overfitting is prevented, and when the model is considered improved rather than merely cosmetically adjusted.

That is the page that stops CitySim from cheating its way into false accuracy.


Why this page has to exist

The backtest answers one question:

How wrong was the engine?

But that is not enough.

Once the engine misses reality, the next question is:

What exactly are we allowed to change, and how do we know we have improved the model rather than just forced it to mimic the past?

That is the calibration problem.

This matters because city models are very easy to “improve” dishonestly.

A simulation can be made to look better by:

  • weakening difficult variables
  • reweighting failure signals after seeing the miss
  • changing thresholds to fit one city
  • smoothing noise so errors disappear
  • quietly reinterpreting the meaning of a variable
  • fitting Tokyo beautifully while making the engine worse for every other city

That is why calibration must be governed by a protocol.

Not because the model is untrustworthy by nature.
But because all simulation engines become dangerous when tuning is unconstrained.


What the Calibration Protocol does

The Calibration Protocol does seven jobs.

1. It defines what may be calibrated

Not everything in a city engine should move freely.

Some things are structural and should stay fixed unless the theory itself changes.

Other things are empirical and should be tuned.

The protocol must say clearly:

  • what can be adjusted
  • what cannot be adjusted
  • what requires explicit version change
  • what requires a theory update rather than a numerical tweak

2. It separates theory from coefficient tuning

This is one of the most important distinctions in the whole CitySim stack.

For example:

  • the idea that demographic pressure increases maintenance and dependency load is a structural claim
  • the exact rate at which that load compounds inside one city is a calibration question

If theory and coefficient tuning are mixed together, the engine becomes unstable and hard to audit.

3. It prevents overfitting

A model that fits one city too perfectly may become worse as a general city engine.

That is why calibration must distinguish between:

  • fitting Tokyo specifically
  • improving the general engine
  • improving one city archetype pack
  • improving one city-local calibration file

These are not the same thing.

4. It forces honest versioning

Once the engine is recalibrated, the protocol must say:

  • what changed
  • why it changed
  • what evidence triggered the change
  • whether the change belongs to the core engine, an archetype pack, or one local city pack

Otherwise no one will know which version of CitySim they are actually reading.

5. It preserves comparability across cities

If Tokyo is calibrated one way and Seoul another, but the changes are hidden or arbitrary, then the engine stops being one engine.

Calibration must preserve shared grammar while allowing local tuning where justified.

6. It tells us when recalibration is enough

Some misses can be fixed with coefficient adjustment.

Some misses reveal that the proxies are weak.

Some misses reveal the variable definition itself is wrong.

Some misses reveal that the transition kernel is structurally incomplete.

Calibration should tell us which layer failed.

7. It decides when the engine is ready to rerun forward scenarios

A forward 150-year scenario should not be rerun immediately after a bad backtest unless the calibration layer says the engine has improved enough to justify another attempt.

That gate matters.


What calibration is not

Calibration is not:

  • making the model look nicer
  • deleting variables that performed badly
  • redefining the city so the numbers fit
  • letting one city rewrite the whole engine
  • tuning until the past is replicated perfectly
  • quietly changing the model after the output looks embarrassing

Calibration is not cosmetic repair.

Calibration is disciplined correction.


The three calibration layers

CitySim should calibrate in layers, not all at once.

Layer 1. Core engine calibration

These are changes that affect the whole engine.

Examples:

  • drift-rate sensitivity
  • repair-rate response shape
  • general lag structure
  • shock propagation logic
  • standard normalization behaviour

These changes are powerful and should be rare.

A core engine change must be justified carefully because it affects all cities.

Layer 2. City archetype calibration

These are changes that affect a city-type family, not every city.

Example archetypes:

  • shrinking aging capitals
  • global service megacities
  • industrial export cities
  • tourism-dependent cities
  • frontier growth cities

Tokyo should not force the same tuning as Houston or Lagos.
But Tokyo may help tune the “shrinking aging high-stock megacity” archetype.

This layer is where much of the useful calibration should happen.

Layer 3. City-local calibration

These are adjustments justified only for one named city.

Examples:

  • Tokyo-specific migration behaviour
  • Singapore-specific education-state coupling
  • London-specific housing-finance coupling
  • Dubai-specific foreign-labour dependence

These changes should be allowed, but clearly labeled as city-local, not global truth.
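
One way to picture the three layers is as stacked overrides: core values apply everywhere, an archetype pack overrides them for one city family, and a city-local pack overrides both for one named city. The sketch below is illustrative only. The parameter names, numbers, and the `resolve_parameters` helper are hypothetical, not CitySim's actual values or API.

```python
from collections import ChainMap

# Hypothetical parameter values, for illustration only.
CORE = {"drift_rate_sensitivity": 0.50, "repair_lag_years": 5}
ARCHETYPE_PACKS = {
    "shrinking_aging_high_stock_megacity": {"repair_lag_years": 7},
}
CITY_LOCAL_PACKS = {
    "Tokyo": {"migration_response": 0.30},
}


def resolve_parameters(city, archetype):
    """Return the effective parameters and the layer each value came from."""
    layers = [
        ("city_local", CITY_LOCAL_PACKS.get(city, {})),
        ("archetype", ARCHETYPE_PACKS.get(archetype, {})),
        ("core", CORE),
    ]
    # ChainMap looks up keys left to right, so the most specific layer wins.
    effective = dict(ChainMap(*(values for _, values in layers)))
    provenance = {
        name: next(label for label, values in layers if name in values)
        for name in effective
    }
    return effective, provenance


params, source = resolve_parameters("Tokyo", "shrinking_aging_high_stock_megacity")
```

Keeping the provenance next to each value is what lets a reader see that a number is a city-local patch and not a universal setting.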


What can be calibrated

The protocol should define a small list of permitted calibration targets.

A. Parameter weights

Examples:

  • weight of non-attendance inside school capture
  • weight of aging inside maintenance load
  • weight of retraining success inside mid-life retool score

These are often the safest things to tune first.

B. Transition coefficients

Examples:

  • how fast fertility decline affects youth inflow
  • how fast late-life isolation reduces civic continuity
  • how strongly repair rate counters drift rate

These matter a great deal.

C. Lag lengths

Examples:

  • how many years before education leakage appears in labour weakness
  • how many years before infrastructure neglect shows as city-body decline
  • how many years before chronic low fertility shifts school-system stress

This is one of the most underappreciated calibration areas.

D. Threshold boundaries

Examples:

  • what counts as low, medium, or high housing stress
  • what counts as a legitimacy breach band
  • what counts as school capture failure

Thresholds may be calibrated, but very carefully, because they are easy to manipulate.

E. Composite proxy weights

Examples:

  • how much trust survey matters inside legitimacy
  • how much progression stability matters inside transfer integrity
  • how much vacancy burden matters inside teacher pipeline health

These often need adjustment once backtesting reveals weak proxy bundles.


What should not be casually calibrated

Some parts of the engine should stay much more stable.

1. Variable definitions

Do not redefine a variable just because it performed poorly.

If the definition must change, that is a theory update, not ordinary calibration.

2. Domain structure

Do not remove major city domains because one backtest was inconvenient.

For example, the following domains should remain part of the engine unless the framework itself changes:

  • demography
  • education
  • economy
  • infrastructure
  • governance
  • social continuity

3. Core pass/fail language

Do not rewrite success so the engine “passes” more often.

If success criteria change, that should be a visible version change.

4. Proxy quality labels

Do not upgrade a weak proxy to “strong” because it helps the score.

Evidence strength must remain evidence strength.

5. Version history

Never erase the fact that an older model missed badly.

A serious engine keeps its miss history visible.
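
The split between permitted targets and restricted elements can be enforced with a simple guard before any change is applied. This is a minimal sketch: the target names follow the lists in this article, while `review_target` and its return labels are assumptions made for illustration.

```python
PERMITTED_TARGETS = {
    "parameter_weights",
    "transition_coefficients",
    "lag_lengths",
    "threshold_boundaries",
    "composite_proxy_weights",
}
RESTRICTED_ELEMENTS = {
    "variable_definitions",
    "domain_structure",
    "core_pass_fail_language",
    "proxy_quality_labels",
    "version_history",
}


def review_target(target):
    """Say how a proposed change to `target` has to be handled."""
    if target in PERMITTED_TARGETS:
        return "ordinary_calibration"
    if target == "version_history":
        # The miss history is never erased, whatever the justification.
        return "forbidden"
    if target in RESTRICTED_ELEMENTS:
        # Assumed label: not a numerical tweak, needs a visible version change.
        return "requires_theory_update_or_version_change"
    raise ValueError(f"unknown calibration target: {target}")
```

For example, `review_target("lag_lengths")` is treated as ordinary calibration, while `review_target("variable_definitions")` is routed away from the tuning loop.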


The calibration sequence

Every CitySim recalibration should follow the same order.

Step 1. Run the backtest

Do not calibrate blindly.

First see where the model actually missed.

Step 2. Diagnose the source of the miss

Every miss should be classified.

Was it caused by one of the following?

  • weak proxy
  • bad data adapter
  • wrong coefficient
  • wrong lag
  • wrong threshold
  • missing variable
  • wrong structural theory

Do not jump straight into coefficient tuning before diagnosing the layer.

Step 3. Choose the smallest justified change

Calibration should prefer the smallest change that fixes the largest real error.

That keeps the engine stable.

Step 4. Retest on the same backtest window

Check whether the change improved fit.

Step 5. Validate on a second window or second city

This is crucial.

A model that improves on one window but gets worse elsewhere may just be overfitting.

Step 6. Classify the change

Decide whether the adjustment belongs to:

  • core engine
  • archetype pack
  • city-local pack

Step 7. Version and publish

Record:

  • what changed
  • what evidence triggered it
  • what improved
  • what worsened
  • what uncertainty remains

That is the calibration cycle.


The four main calibration failure types

When the engine misses reality, the failure usually falls into one of four buckets.

Type 1. Measurement failure

The variable is conceptually fine, but the proxy map is weak.

Example:

  • legitimacy inferred from weak indicators
  • parent capability support measured too indirectly

Fix:

  • strengthen proxy bundle
  • reduce weight
  • improve data adapter

Type 2. Parameter failure

The structure is fine, but the coefficient is wrong.

Example:

  • fertility decline was too weakly linked to youth inflow decline
  • school stress was assumed to worsen too slowly

Fix:

  • recalibrate parameter or lag

Type 3. Threshold failure

The engine sees the right movement, but the boundary bands are misplaced.

Example:

  • the city enters stress earlier than the model recognizes
  • warning bands are too forgiving

Fix:

  • adjust thresholds carefully

Type 4. Structural failure

The engine is missing a real mechanism.

Example:

  • migration dynamics under global city conditions
  • housing-finance feedback loop
  • aging interacting with social isolation more strongly than expected

Fix:

  • theory revision
  • variable expansion
  • transition kernel update

This is the most serious class of failure.
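
The diagnosis step and the four failure types can be connected with a small lookup. The sketch below is an assumption about how the Step 2 causes map onto the four buckets; the article does not state that mapping explicitly, and the names `CAUSE_TO_FAILURE` and `triage` are hypothetical.

```python
from enum import Enum


class FailureType(Enum):
    MEASUREMENT = "F1"
    PARAMETER = "F2"
    THRESHOLD = "F3"
    STRUCTURAL = "F4"


# Assumed mapping from diagnosed cause to failure bucket.
CAUSE_TO_FAILURE = {
    "weak_proxy": FailureType.MEASUREMENT,
    "bad_data_adapter": FailureType.MEASUREMENT,
    "wrong_coefficient": FailureType.PARAMETER,
    "wrong_lag": FailureType.PARAMETER,
    "wrong_threshold": FailureType.THRESHOLD,
    "missing_variable": FailureType.STRUCTURAL,
    "wrong_structural_theory": FailureType.STRUCTURAL,
}

# Fixes as listed under each failure type in this article.
FIXES = {
    FailureType.MEASUREMENT: ["strengthen proxy bundle", "reduce weight", "improve data adapter"],
    FailureType.PARAMETER: ["recalibrate parameter or lag"],
    FailureType.THRESHOLD: ["adjust thresholds carefully"],
    FailureType.STRUCTURAL: ["theory revision", "variable expansion", "transition kernel update"],
}


def triage(cause):
    """Return the failure bucket and candidate fixes for a diagnosed cause."""
    failure = CAUSE_TO_FAILURE[cause]
    return failure, FIXES[failure]
```

Only the first three buckets lead to ordinary calibration; a structural result means the tuning loop should stop and the theory should be revisited.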


The anti-overfitting rules

Calibration only has value if it avoids overfitting.

So the protocol needs hard rules.

Rule 1. Train / validation separation

Use one window for calibration and another window for checking whether the improvement generalizes.

For example:

  • calibrate on 2010–2020
  • validate on 2021–2025, so the two windows share no year

Or:

  • calibrate on Tokyo
  • validate on Seoul or Osaka within the same archetype band
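
A minimal sketch of this rule, assuming the backtest reports one error number per window. The mean-absolute-error metric, the function names, and all figures are illustrative assumptions, not CitySim's actual scoring.

```python
def mean_abs_error(predicted, observed):
    """Average absolute gap between model output and observed values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and the same length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def change_generalizes(before, after):
    """Accept a change only if it helps on the calibration window
    and does not hurt on the held-out validation window."""
    improved_fit = after["calibration"] < before["calibration"]
    held_up = after["validation"] <= before["validation"]
    return improved_fit and held_up


# Made-up numbers, for illustration only.
observed = [10.0, 12.0, 14.0]
before = {
    "calibration": mean_abs_error([13.0, 15.0, 17.0], observed),
    "validation": 2.0,
}
overfit = {"calibration": 0.5, "validation": 3.5}  # fits the past, fails elsewhere
honest = {"calibration": 1.5, "validation": 1.8}   # smaller gain that holds up
```

Here `change_generalizes(before, overfit)` is rejected even though its calibration error is the lowest, which is the point of keeping the two windows separate.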

Rule 2. No perfect-fit obsession

A city engine should not be forced toward exact past replication if doing so damages generality.

The aim is not to mimic every wrinkle of the past.

The aim is to become more truth-bearing and stable.

Rule 3. Weak proxies cannot dominate tuning

Do not let the noisiest variables control the biggest recalibration moves.

Rule 4. Smaller changes beat larger changes

If two calibration choices improve the backtest similarly, keep the simpler one.
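
This rule can be sketched as a tie-break: among candidates whose error is within a tolerance of the best, keep the one with the smallest change. The tolerance, the `change_size` measure, and the candidate labels below are hypothetical.

```python
def pick_smallest_change(candidates, tolerance=0.05):
    """Pick the smallest change among near-equal performers.

    candidates: list of (label, error_after, change_size) tuples, where
    change_size is any agreed measure of how far the parameters moved.
    """
    best_error = min(error for _, error, _ in candidates)
    near_best = [c for c in candidates if c[1] - best_error <= tolerance]
    return min(near_best, key=lambda c: c[2])


# Made-up candidates, for illustration only.
candidates = [
    ("retune_three_weights", 1.20, 0.60),
    ("shift_one_lag", 1.22, 0.10),
    ("no_change", 2.00, 0.00),
]
chosen = pick_smallest_change(candidates)
```

With these numbers the single lag shift is chosen: it is almost as accurate as retuning three weights and moves the engine far less.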

Rule 5. One city cannot rewrite the whole world

Tokyo can improve the Tokyo pack and perhaps the archetype pack. It should not automatically redefine the universal engine.


Calibration status classes

After recalibration, the engine should declare its new status.

Class R1 — minor retune

Small parameter or lag changes, core theory intact.

Class R2 — moderate recalibration

Several weights or thresholds changed, but engine architecture intact.

Class R3 — archetype recalibration

The city archetype pack changed meaningfully.

Class R4 — structural revision

The engine needed new mechanisms or important theory revision.

Class R5 — unstable

Calibration attempts did not generalize well enough; engine still too loose.

This is useful because not all recalibration events are equal.
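
The classes above can be assigned with a short rule chain. The order of precedence and the cut-off for "several" changes are assumptions made for this sketch; the article does not fix them.

```python
def status_class(generalized, structural_change, archetype_pack_changed,
                 n_params_changed, several=3):
    """Assign a calibration status class R1..R5 (assumed precedence order)."""
    if not generalized:
        return "R5"  # unstable: the improvement did not hold elsewhere
    if structural_change:
        return "R4"  # new mechanisms or theory revision
    if archetype_pack_changed:
        return "R3"  # archetype pack changed meaningfully
    if n_params_changed >= several:
        return "R2"  # several weights or thresholds changed
    return "R1"      # small parameter or lag change
```

Checking generalization first means a change that fails validation is reported as R5 regardless of how small it was.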


What a calibration report must publish

Every calibration pass should publish an audit trail.

1. Previous model version

What was active before recalibration?

2. Backtest miss summary

Where did it fail?

3. Suspected failure layer

Was the problem measurement, parameter, threshold, or structure?

4. Changes made

Exactly what changed?

5. Why the change was justified

What evidence supported the update?

6. Improvement score

Did the model actually get better?

7. Cross-check result

Did the improvement hold outside the training case?

8. New version label

What is the new engine version?

Without this, “recalibration” is just a vague claim.
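
The eight items can be held in one record so that an incomplete report is easy to detect. This is a sketch under stated assumptions: the field names follow the list above, while the class, the `missing_fields` helper, and the example values (including the version labels) are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CalibrationReport:
    previous_model_version: str
    backtest_miss_summary: str
    suspected_failure_layer: str
    changes_made: str
    justification: str
    improvement_score: float
    cross_check_result: str
    new_version_label: str

    def missing_fields(self):
        """Names of text fields that were left blank."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]


# Hypothetical example values, for illustration only.
report = CalibrationReport(
    previous_model_version="v1.0",
    backtest_miss_summary="youth inflow decline underestimated",
    suspected_failure_layer="parameter",
    changes_made="raised fertility-to-youth-inflow coefficient",
    justification="backtest miss on the calibration window",
    improvement_score=0.12,
    cross_check_result="",
    new_version_label="v1.1",
)
```

Here `report.missing_fields()` flags the empty cross-check, so the pass cannot be published as a validated recalibration.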


The most important calibration law

Here is the deepest rule:

CitySim must prefer stable truth over elegant fit.

That means:

  • a rough but honest engine is better than a beautiful overfit engine
  • a city-local patch should not pretend to be universal law
  • uncertainty should remain visible
  • not every miss should be forced away by tuning

That is how the engine stays usable over time.


Why this matters after Tokyo

Tokyo did not merely show that the first engine missed.

Tokyo showed where the discipline gap was.

The initial run was too easy to read as if it were forward-accurate. But once the backtest pushed the model against recent history, it became clear that the engine needed a more formal way to:

  • identify the miss,
  • locate the layer that failed,
  • retune responsibly,
  • and prove that the retune was genuine.

That is exactly what this protocol provides.

Without it, every future city article risks becoming:

  • write,
  • miss,
  • quietly tweak,
  • rewrite,
  • move on.

That is not a civilisation-grade engine.


Final definition

The CitySim.150Y.CF Calibration Protocol is the canonical tuning discipline that governs how the model may be improved after backtesting, including what can change, what must remain stable, how errors are diagnosed, how overfitting is prevented, and how recalibration is versioned, validated, and published.

Without it, CitySim can still learn.

But no one can tell whether it learned honestly.


Almost-Code

```text
CITYSIM_150Y_CF_CALIBRATION_PROTOCOL_V1

PURPOSE:
Improve CitySim after backtesting without allowing hidden retuning,
overfitting,
or structural drift disguised as accuracy.

CORE_LAW:
Calibration must prefer stable truth over elegant fit.

CALIBRATION_LAYERS:
L1 = core_engine
L2 = city_archetype_pack
L3 = city_local_pack

PERMITTED_CALIBRATION_TARGETS:
  • parameter_weights
  • transition_coefficients
  • lag_lengths
  • threshold_boundaries
  • composite_proxy_weights

RESTRICTED_ELEMENTS:
  • variable_definitions
  • domain_structure
  • core_pass_fail_language
  • proxy_quality_labels
  • version_history

CALIBRATION_SEQUENCE:
  1. run_backtest
  2. diagnose_failure_layer
  3. choose_smallest_justified_change
  4. retest_on_training_window
  5. validate_on_second_window_or_second_city
  6. classify_change_scope
  7. version_and_publish

FAILURE_TYPES:
F1 = measurement_failure
F2 = parameter_failure
F3 = threshold_failure
F4 = structural_failure

ANTI_OVERFITTING_RULES:
  • train_validation_separation
  • no_perfect_fit_obsession
  • weak_proxies_cannot_dominate_tuning
  • smaller_changes_preferred
  • one_city_cannot_rewrite_whole_engine

CALIBRATION_STATUS_CLASSES:
R1 = minor_retune
R2 = moderate_recalibration
R3 = archetype_recalibration
R4 = structural_revision
R5 = unstable

REQUIRED_CALIBRATION_REPORT:
  • previous_model_version
  • backtest_miss_summary
  • suspected_failure_layer
  • changes_made
  • justification
  • improvement_score
  • cross_check_result
  • new_version_label

PASS_CONDITION:
A calibration pass is valid only if it improves backtest performance,
preserves declared variable meaning,
does not hide proxy weakness,
and holds outside the immediate training case.

FAIL_CONDITIONS:
  • hidden_retuning
  • post_hoc_threshold_shifting
  • city_local_patch_presented_as_universal_law
  • weak_proxy_dominating_recalibration
  • no_validation_outside_training_case
  • version_change_not_published

OUTPUT:
calibration_validity = TRUE or FALSE
model_update_scope = core / archetype / city_local
forward_use_status = improved / still_under_calibrated / unstable
```
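
The PASS_CONDITION and OUTPUT fields above can be read as a small decision function. The sketch below is one possible reading: the four checks come from the pass condition, but the way failed checks map to `forward_use_status` is an assumption, and `evaluate_calibration` is a hypothetical name.

```python
def evaluate_calibration(improves_backtest, preserves_variable_meaning,
                         proxy_weakness_visible, holds_outside_training):
    """Apply the four pass-condition checks and derive the output fields."""
    checks = {
        "improves_backtest": improves_backtest,
        "preserves_variable_meaning": preserves_variable_meaning,
        "proxy_weakness_visible": proxy_weakness_visible,
        "holds_outside_training": holds_outside_training,
    }
    failed = [name for name, passed in checks.items() if not passed]
    if not failed:
        status = "improved"
    elif failed == ["holds_outside_training"]:
        # Assumption: a fit that only fails to generalize counts as unstable.
        status = "unstable"
    else:
        status = "still_under_calibrated"
    return {
        "calibration_validity": not failed,
        "failed_checks": failed,
        "forward_use_status": status,
    }
```

A forward 150-year scenario would then be rerun only when `forward_use_status` comes back as `improved`.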

eduKateSG Learning System | Control Tower, Runtime, and Next Routes

This article is one node inside the wider eduKateSG Learning System.

At eduKateSG, we do not treat education as random tips, isolated tuition notes, or one-off exam hacks. We treat learning as a living runtime:

state -> diagnosis -> method -> practice -> correction -> repair -> transfer -> long-term growth

That is why each article is written to do more than answer one question. It should help the reader move into the next correct corridor inside the wider eduKateSG system: understand -> diagnose -> repair -> optimize -> transfer. The site's pages cluster around Education OS, Tuition OS, Civilisation OS, subject learning systems, runtime/control-tower pages, and real-world lattice connectors, and this footer compresses those routes into one reusable ending block.

How to Use eduKateSG

If you want the big picture -> start with Education OS and Civilisation OS
If you want subject mastery -> enter Mathematics, English, Vocabulary, or Additional Mathematics
If you want diagnosis and repair -> move into the CivOS Runtime and subject runtime pages
If you want real-life context -> connect learning back to Family OS, Bukit Timah OS, Punggol OS, and Singapore City OS

Why eduKateSG writes articles this way

eduKateSG is not only publishing content.
eduKateSG is building a connected control tower for human learning.

That means each article can function as:

  • a standalone answer,
  • a bridge into a wider system,
  • a diagnostic node,
  • a repair route,
  • and a next-step guide for students, parents, tutors, and AI readers.