What Is the CitySim.150Y.CF Observable Proxy Map?

A city simulation cannot live on invisible variables alone.

If CitySim says something like “legitimacy fell,” “transfer integrity weakened,” or “late-life use improved,” then the next question is obvious:

How do we know?

That is where the Observable Proxy Map comes in.

One-sentence answer

The CitySim.150Y.CF Observable Proxy Map is the bridge layer that connects each simulation variable to real-world datasets, measurable indicators, or declared proxy bundles, so the engine can be checked against reality instead of floating as pure abstraction.

This page turns a variable registry into something the world can actually test.


Why this page has to exist

The Variable Registry defined what the engine is allowed to talk about.

That was step one.

But defining a variable is not the same as measuring it.

For example:

  • TFR is easy enough. There are official fertility datasets.
  • POP_TOTAL is easy enough. There are census or resident population counts.
  • SCHOOL_CAPTURE is harder. There is no single official dataset in most cities called “school capture.”
  • LEGITIMACY is much harder. There is usually no one clean number for “institutional legitimacy.”
  • TRANSFER_INTEGRITY is also not directly sitting in a public spreadsheet.

So CitySim needs a formal way to say:

  • which variables are directly observed
  • which are approximated
  • which use multiple proxies
  • which are weak and should be treated carefully
  • which cannot yet be used for hard claims

Without this page, the engine can still sound intelligent, but it cannot defend itself properly.


What the Observable Proxy Map does

The Observable Proxy Map does six jobs.

1. It ties each variable to reality

Every active variable in the simulation must either:

  • map to a real dataset,
  • map to a bundle of real indicators,
  • or be explicitly marked as not yet empirically usable.

That means the engine cannot quietly invent numbers without declaring how they relate to the world.

2. It separates strong measurements from weak approximations

Some variables are close to direct observation. Others are inferred.

That difference matters.

A fertility rate is not the same type of evidence as a legitimacy index built from four partial indicators. Both may be useful, but they are not equally strong.

3. It makes backtesting possible

A backtest only works if the simulation variable has a measurable reference point.

If CitySim says a city’s youth inflow weakened over 10 years, the proxy map must show what real-world series would be used to test that.

No proxy, no proper backtest.
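
As a minimal sketch of what such a backtest could look like, the Python below compares a simulated series with an observed reference series over their shared years. The function name `backtest_mae`, the choice of mean absolute error, and the sample numbers are illustrative assumptions, not part of the CitySim specification.

```python
def backtest_mae(simulated, observed):
    """Mean absolute error over the years present in both series."""
    shared_years = sorted(set(simulated) & set(observed))
    if not shared_years:
        # No overlapping reference point means there is nothing to test against.
        raise ValueError("no overlapping years: the variable cannot be backtested")
    total_error = sum(abs(simulated[y] - observed[y]) for y in shared_years)
    return total_error / len(shared_years)


# Hypothetical youth-share values, keyed by year (illustrative numbers only).
sim = {2010: 0.20, 2015: 0.18, 2020: 0.16}
obs = {2010: 0.21, 2015: 0.18, 2020: 0.15}
print(round(backtest_mae(sim, obs), 4))
```

A series with no overlapping observed years raises an error here, which is the code-level version of "no proxy, no proper backtest."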

4. It prevents hidden proxy swapping

One of the easiest ways for a simulation to “cheat” is to quietly change how a variable is represented from one city or one article to another.

For example:

  • using birth rate in one city
  • youth population share in another
  • school enrollment in a third

Those may all relate to youth inflow, but they are not identical.

The proxy map prevents this by locking the allowed proxy set.
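
One way to picture the lock is a simple check that flags any city whose proxies differ from the declared set. This is a sketch under assumptions: `LOCKED_PROXIES`, `find_proxy_swaps`, and the city labels are hypothetical names, and the YOUTH_INFLOW inputs are taken from the composite example later on this page.

```python
# Declared proxy set per variable (hypothetical registry; inputs from this page's example).
LOCKED_PROXIES = {
    "YOUTH_INFLOW": frozenset(
        {"births", "cohort_size", "child_population_share", "net_family_migration"}
    ),
}


def find_proxy_swaps(variable_id, proxies_by_city, locked=LOCKED_PROXIES):
    """Return the cities whose proxies deviate from the declared set, with details."""
    declared = locked[variable_id]
    swaps = {}
    for city, used in proxies_by_city.items():
        used = set(used)
        if used != declared:
            swaps[city] = {
                "missing": sorted(declared - used),
                "undeclared": sorted(used - declared),
            }
    return swaps


runs = {
    "city_a": ["births", "cohort_size", "child_population_share", "net_family_migration"],
    "city_b": ["school_enrollment"],  # a quiet swap the lock should catch
}
print(find_proxy_swaps("YOUTH_INFLOW", runs))
```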

5. It reveals data quality limits

Sometimes the variable is sound, but the data is weak.

Sometimes the data exists, but only at national level, not city level.

Sometimes the city has the data, but not across enough years.

Sometimes the proxy is only a rough shadow of the true concept.

The map makes those weaknesses visible.

6. It keeps the engine honest about interpretation

Every serious model needs to say:

  • this value is measured directly
  • this value is estimated from multiple indicators
  • this value is a latent construct with a declared proxy bundle
  • this value is still too weak for decisive use

That boundary is not a weakness.
That boundary is what makes the model trustworthy.


The three levels of observability

Not all variables are observed in the same way.

Level 1. Directly observed

These are the easiest.

Examples:

  • population
  • births
  • deaths
  • life expectancy
  • fertility rate
  • school non-attendance rate
  • unemployment rate
  • transit ridership
  • energy prices

These variables usually come from:

  • census data
  • administrative records
  • official statistics
  • audited institutional reports

These are the strongest anchors in the model.

Level 2. Derived from observed inputs

These are not directly published, but can be calculated reliably from observed data.

Examples:

  • dependency ratio
  • teacher replacement pressure
  • maintenance load
  • migration balance
  • youth inflow index
  • age-structured labour pressure

These are still strong if the formula is declared openly.
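
For example, a dependency ratio can be declared in a few lines. The sketch below assumes the common convention of dependants aged 0–14 and 65+ per 100 people aged 15–64; CitySim may declare different age bands, and the population figures are made up for illustration.

```python
def dependency_ratio(pop_0_14, pop_15_64, pop_65_plus):
    """Dependants (0-14 and 65+) per 100 working-age residents (15-64)."""
    if pop_15_64 <= 0:
        raise ValueError("working-age population must be positive")
    return 100.0 * (pop_0_14 + pop_65_plus) / pop_15_64


# Illustrative counts in thousands, not real city data.
print(round(dependency_ratio(120, 650, 230), 1))
```

Because the formula is written down, anyone holding the same age-band counts can reproduce the derived value.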

Level 3. Latent variables with proxy bundles

These are real, but not directly measurable in one number.

Examples:

  • legitimacy
  • transfer integrity
  • civic continuity
  • social recoverability
  • late-life usefulness
  • standards integrity

These must be handled through proxy bundles rather than pretending there is one perfect measure.


What every proxy entry must contain

Every simulation variable should have a proxy entry with the following fields.

| Field | Meaning |
| --- | --- |
| variable_id | Which simulation variable this proxy belongs to |
| proxy_type | direct / derived / composite / shadow / proxy_bundle / unavailable |
| primary_proxy | main real-world indicator used |
| secondary_proxies | supporting indicators if needed |
| source_level | city / metro / prefecture / national / international |
| source_examples | dataset or institution examples |
| time_coverage | years available |
| frequency | annual / quarterly / monthly / irregular |
| comparability | high / medium / low across cities or years |
| noise_risk | low / medium / high |
| proxy_strength | strong / moderate / weak |
| conversion_rule | how raw data becomes the simulation value |
| fallback_rule | what happens when the main dataset is missing |
| do_not_overclaim | what this proxy cannot prove |
| notes | caveats |

If a variable has no proxy entry, it should not be allowed to dominate a verdict.
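
The entry fields can be sketched as a small Python dataclass. This is one possible shape, not the engine's actual data model: the class name `ProxyEntry`, the string types, and the validation rules are assumptions, and the proxy-type values follow the six listed in the Almost-Code block at the end of this page.

```python
from dataclasses import dataclass, field

PROXY_TYPES = {"direct", "derived", "composite", "shadow", "proxy_bundle", "unavailable"}
STRENGTHS = {"strong", "moderate", "weak"}


@dataclass
class ProxyEntry:
    variable_id: str
    proxy_type: str
    primary_proxy: str
    source_level: str
    time_coverage: str
    frequency: str
    comparability: str
    noise_risk: str
    proxy_strength: str
    conversion_rule: str
    fallback_rule: str
    do_not_overclaim: str
    secondary_proxies: list = field(default_factory=list)
    source_examples: list = field(default_factory=list)
    notes: str = ""

    def __post_init__(self):
        # Reject labels outside the declared vocabularies.
        if self.proxy_type not in PROXY_TYPES:
            raise ValueError(f"unknown proxy_type: {self.proxy_type}")
        if self.proxy_strength not in STRENGTHS:
            raise ValueError(f"unknown proxy_strength: {self.proxy_strength}")


# Illustrative entry; coverage and fallback text are placeholders, not real metadata.
tfr = ProxyEntry(
    variable_id="TFR", proxy_type="direct", primary_proxy="official fertility rate",
    source_level="city", time_coverage="placeholder", frequency="annual",
    comparability="high", noise_risk="low", proxy_strength="strong",
    conversion_rule="direct carry-over", fallback_rule="city -> metro -> national",
    do_not_overclaim="does not by itself imply future city collapse",
)
print(tfr.variable_id, tfr.proxy_type, tfr.proxy_strength)
```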


The five proxy types

1. Direct proxy

This is a real dataset that is already very close to the variable.

Example:

  • TFR → official fertility rate

This is ideal.

2. Composite proxy

This variable needs more than one observed input.

Example:

  • YOUTH_INFLOW may use births, cohort size, child population share, and net family migration

This is acceptable if the combination rule is declared.

3. Shadow proxy

This is not the variable itself, but it reflects it indirectly.

Example:

  • LEGITIMACY may partly use trust surveys, turnout, complaint load, or public-service response satisfaction

This is weaker and must be labeled carefully.

4. Proxy bundle

This is a structured set of indicators used together because no single measure is sufficient.

Example:

  • TRANSFER_INTEGRITY might use progression stability, dropout, remedial load, test variance, school-to-work match, and adult retraining re-entry

This is often necessary in civilisation-grade models.

5. Unavailable proxy

Sometimes the variable is conceptually important, but the measurement layer is not mature enough yet.

That is allowed.

But then the model must say openly:

  • important variable
  • weak observability
  • not fit for strong external claims yet

That is better than pretending.


Proxy mapping rules

The engine should follow a few hard rules.

Rule 1: one primary proxy path per variable

Every variable needs a default measurement path.

That does not mean only one possible dataset.
It means one canonical first-choice path.

Rule 2: fallback rules must be declared in advance

If city-level data is missing, the model must already say what happens next.

For example:

  • city data
  • metro data
  • prefecture/state data
  • national proxy
  • modeled estimate

Do not improvise this after seeing the result.
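
A declared fallback order can be as simple as a fixed list that is walked top to bottom. The sketch below assumes that each level either holds a value or is missing (`None`); `FALLBACK_ORDER` and `resolve_source` are hypothetical names, and the number is illustrative.

```python
# Declared before any run: first available level wins.
FALLBACK_ORDER = ["city", "metro", "prefecture", "national", "modeled"]


def resolve_source(available):
    """Return (level, value) for the first level in FALLBACK_ORDER that has data."""
    for level in FALLBACK_ORDER:
        value = available.get(level)
        if value is not None:
            return level, value
    raise LookupError("no data at any declared level")


# City data missing, metro data present (illustrative figure).
level, value = resolve_source({"city": None, "metro": 1.12, "national": 1.26})
print(level, value)
```

Returning the level alongside the value lets the run record that a fallback was used, rather than hiding it.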

Rule 3: city-specific claims require city-specific data where possible

If a run is about Tokyo, then Tokyo or metro-Tokyo level data should dominate.

National data can support the run, but should not replace city data unless clearly declared.

Rule 4: latent variables must not overpower strong observed variables

A weakly measured legitimacy score should not outweigh a strong demographic collapse signal.

Proxy weakness should affect how much influence the variable has.
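
One simple way to express this is to scale each signal by a weight tied to its proxy strength. The weights below (1.0, 0.6, 0.3) and the function `weighted_verdict` are assumptions chosen for illustration; the page does not prescribe specific numbers.

```python
# Assumed influence weights per proxy_strength label.
STRENGTH_WEIGHT = {"strong": 1.0, "moderate": 0.6, "weak": 0.3}


def weighted_verdict(signals):
    """Average signals in [-1, 1], each scaled by its proxy strength weight."""
    if not signals:
        raise ValueError("at least one signal is required")
    total = sum(STRENGTH_WEIGHT[strength] for _, strength in signals)
    weighted = sum(value * STRENGTH_WEIGHT[strength] for value, strength in signals)
    return weighted / total


# A strong negative demographic signal against a weak positive legitimacy signal.
signals = [(-0.8, "strong"), (0.6, "weak")]
print(round(weighted_verdict(signals), 3))
```

With these assumed weights the combined result stays clearly negative, whereas a plain average of the two signals would sit close to neutral.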

Rule 5: every proxy needs an overclaim boundary

For example:

  • school non-attendance does not automatically equal total transfer failure
  • fertility rate does not automatically equal future city collapse
  • life expectancy does not automatically equal late-life usefulness
  • trust survey does not automatically equal institutional legitimacy in full

The proxy map must say what a measure can and cannot support.


Minimum starter proxy map for CitySim.150Y.CF

Below is the minimum map needed before serious recalibration work.

A. Demography

| variable_id | public_name | proxy_type | primary_proxy | proxy_strength |
| --- | --- | --- | --- | --- |
| POP_TOTAL | Total Population | direct | resident population count | strong |
| POP_YOUTH_SHARE | Youth Share | direct | % population under selected youth band | strong |
| POP_65_PLUS_SHARE | Seniors 65+ Share | direct | % aged 65+ | strong |
| TFR | Total Fertility Rate | direct | official fertility rate | strong |
| NET_MIGRATION | Net Migration | direct/composite | in-migration minus out-migration | strong |
| ELDERLY_ALONE | Seniors Living Alone | direct | seniors in one-person households | moderate/strong |
| DEPENDENCY_RATIO | Dependency Ratio | derived | age-structured dependency formula | strong |

B. Education and transfer

| variable_id | public_name | proxy_type | primary_proxy | proxy_strength |
| --- | --- | --- | --- | --- |
| SCHOOL_CAPTURE | School Capture | proxy_bundle | attendance, progression, retention, re-entry | moderate |
| NON_ATTEND_RATE | Non-Attendance Rate | direct | official non-attendance rate | strong |
| DROP_OUT_RATE | Dropout Rate | direct | official dropout / early leaver rate | strong |
| TRANSFER_INTEGRITY | Learning Transfer Integrity | proxy_bundle | progression stability + completion + performance continuity + re-entry | weak/moderate |
| TEACHER_PIPELINE_HEALTH | Teacher Pipeline Health | composite | teacher inflow, attrition, age structure, vacancy burden | moderate |
| PARENT_CAPABILITY_SUPPORT | Parent Capability Support | proxy_bundle | home learning surveys, family support access, parental engagement indicators | weak |
| MIDLIFE_RETOOL | Mid-Life Retooling Capacity | proxy_bundle | adult training participation, completion, job transition success | moderate |
| LATE_LIFE_LEARNING_USE | Late-Life Learning Use | proxy_bundle | senior learning participation + senior labour participation + civic/mentor engagement | weak/moderate |

C. Economy and work

| variable_id | public_name | proxy_type | primary_proxy | proxy_strength |
| --- | --- | --- | --- | --- |
| LABOUR_PARTICIPATION | Labour Participation | direct | labour force participation rate | strong |
| CAREER_ALIGNMENT | Career-Curriculum Alignment | proxy_bundle | graduate outcomes + skills mismatch + retraining placement | weak/moderate |
| PRODUCTIVITY_PROXY | Productivity Proxy | composite | output per worker / GDP proxy / wage productivity | moderate |
| JOB_OBSOLESCENCE_PRESSURE | Job Obsolescence Pressure | composite | sector change + displacement risk + automation exposure | weak/moderate |

D. Infrastructure and urban form

| variable_id | public_name | proxy_type | primary_proxy | proxy_strength |
| --- | --- | --- | --- | --- |
| BASE_STOCK | Base Stock | proxy_bundle | infrastructure quality + economic density + institutional depth | moderate |
| MAINTENANCE_LOAD | Maintenance Load | composite | infrastructure age + renewal burden + asset replacement ratio | moderate |
| HOUSING_STRESS | Housing Stress | composite | affordability ratio + rent burden + vacancy stress | strong/moderate |
| TRANSIT_REACH | Transit Reach | direct/composite | access coverage + network usability | moderate |
| UTILITY_RELIABILITY | Utility Reliability | direct/composite | outage rates + service continuity | moderate/strong |

E. Governance and repair

| variable_id | public_name | proxy_type | primary_proxy | proxy_strength |
| --- | --- | --- | --- | --- |
| LEGITIMACY | Institutional Legitimacy | proxy_bundle | trust, satisfaction, compliance, institutional stability | weak/moderate |
| REPAIR_RATE | Repair Rate | composite | response speed + maintenance completion + recovery throughput | moderate |
| DRIFT_RATE | Drift Rate | composite | unresolved burden accumulation across domains | moderate |
| STANDARDS_INTEGRITY | Standards Integrity | proxy_bundle | calibration quality + credential trust + measurement consistency | weak/moderate |
| ARCHIVE_CONTINUITY | Archive Continuity | proxy_bundle | data continuity + policy memory + institutional preservation | weak/moderate |

F. External stress

| variable_id | public_name | proxy_type | primary_proxy | proxy_strength |
| --- | --- | --- | --- | --- |
| ENERGY_STRESS | Energy Stress | direct/composite | price volatility + import exposure | moderate/strong |
| GEOPOLITICAL_STRESS | Geopolitical Stress | proxy_bundle | trade disruption risk + security exposure + sanctions/trade shocks | weak/moderate |
| CLIMATE_PRESSURE | Climate Pressure | composite | hazard exposure + chronic climate burden | moderate |
| HEALTH_SHOCK_LOAD | Health Shock Load | direct/composite | excess mortality + healthcare strain + outbreak burden | strong/moderate |

How to convert raw data into simulation values

This part matters because most simulations fail in the conversion layer.

A proxy map should not stop at naming a dataset.
It must also define how the raw value becomes the simulation value.

Conversion types

1. Direct carry-over

Example:

  • official non-attendance rate = simulation non-attendance rate

This is simplest.

2. Normalized index

Example:

  • convert housing burden into a 0–100 housing stress score

This is useful when comparing cities with different scales.
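
A minimal sketch of such an index is min-max scaling with declared bounds. The function `normalize_to_index`, the rent-to-income ratio as the raw input, and the bounds 0.15 and 0.55 are assumptions for illustration; the real bounds would need to be declared in the proxy entry.

```python
def normalize_to_index(value, low, high):
    """Min-max scale a raw value onto 0-100, clamped at both ends."""
    if high <= low:
        raise ValueError("high must exceed low")
    scaled = (value - low) / (high - low) * 100.0
    return max(0.0, min(100.0, scaled))


# Hypothetical rent-to-income ratio of 0.35 with declared bounds 0.15 and 0.55.
print(round(normalize_to_index(0.35, 0.15, 0.55), 1))
```

Clamping keeps outliers from pushing the score outside the declared scale, at the cost of hiding how far beyond the bounds they fall.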

3. Weighted composite

Example:

  • build teacher pipeline health from inflow, attrition, vacancy load, and age structure

This is necessary when one metric alone is too weak.
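
As a sketch, a weighted composite is a weighted average of components that share a scale and a direction. The code below assumes every component is already on 0–100 with higher meaning healthier (so attrition and vacancy load are inverted beforehand); the equal weights and the scores are placeholders, not calibrated values.

```python
def weighted_composite(components, weights):
    """Weighted average of components that share one scale and direction."""
    if set(components) != set(weights):
        raise ValueError("components and weights must name the same inputs")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(components[name] * weights[name] for name in components) / total_weight


# Hypothetical 0-100 scores, higher = healthier.
scores = {"inflow": 60.0, "attrition": 40.0, "vacancy_load": 50.0, "age_structure": 70.0}
weights = {"inflow": 1.0, "attrition": 1.0, "vacancy_load": 1.0, "age_structure": 1.0}
print(weighted_composite(scores, weights))
```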

4. Threshold-coded state

Example:

  • classify transit reach as low / medium / high band before using it in the model

This is useful for route-state logic.
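
Threshold coding can be sketched as a pair of declared cut points. The cut values 33 and 66 on a 0–100 score are assumptions; what matters is that they are fixed before the run, not tuned afterwards.

```python
def to_band(score, low_cut=33.0, high_cut=66.0):
    """Map a 0-100 score to a low / medium / high band using declared cut points."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


print([to_band(s) for s in (10, 50, 90)])
```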

5. Multi-proxy reconciliation

Example:

  • legitimacy may combine trust survey, compliance rate, governance continuity, and complaint burden

This should be declared openly, including weights.
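
A hedged sketch of reconciliation: combine indicators on a shared 0–1 scale with declared weights, flip the ones where higher is worse, and report how much of the declared weight was actually covered by data. The weights, the indicator values, and the `reconcile` function are illustrative assumptions.

```python
def reconcile(indicators, weights, inverted=()):
    """Return (score, coverage) from indicators in [0, 1]; None marks a missing one."""
    weighted_sum = 0.0
    used_weight = 0.0
    for name, weight in weights.items():
        value = indicators.get(name)
        if value is None:
            continue  # missing indicators are skipped, not guessed
        if name in inverted:
            value = 1.0 - value  # higher raw value means lower legitimacy
        weighted_sum += weight * value
        used_weight += weight
    if used_weight == 0:
        raise ValueError("no indicators available")
    return weighted_sum / used_weight, used_weight / sum(weights.values())


weights = {"trust_survey": 0.4, "compliance_rate": 0.3,
           "governance_continuity": 0.2, "complaint_burden": 0.1}
observed = {"trust_survey": 0.6, "compliance_rate": 0.8,
            "governance_continuity": None, "complaint_burden": 0.3}
score, coverage = reconcile(observed, weights, inverted=("complaint_burden",))
print(round(score, 4), round(coverage, 2))
```

Reporting coverage next to the score makes it visible when a legitimacy value rests on only part of its declared bundle.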


What makes a good proxy map

A good proxy map has five qualities.

1. Traceability

A reader can see where the number came from.

2. Stability

The variable does not change meaning from one city to another.

3. Modesty

The map does not claim more than the data can support.

4. Fallback clarity

If the main source is missing, the backup path is already known.

5. Auditability

Another researcher or AI can rerun the same transformation and get the same result.


What makes a bad proxy map

A bad proxy map usually does one or more of these:

  • uses no public source at all
  • uses whatever dataset is convenient after the result is seen
  • changes proxy definitions across cities
  • hides the conversion rule
  • treats survey mood as if it were direct institutional truth
  • lets weak latent proxies dominate the overall verdict
  • forgets that city-level and national-level data are not the same thing

That is how a city engine becomes decorative instead of defensible.


The special problem of latent civilisation variables

This is where CitySim needs maturity.

Some of the most important city variables are not directly measurable in one clean number:

  • legitimacy
  • recoverability
  • transfer integrity
  • social cohesion
  • standards integrity
  • archive continuity

That does not mean they are fake.

It means the model has to treat them properly:

  • as latent
  • as proxy-bundled
  • as quality-graded
  • as uncertainty-bearing

That is the honest way to include deep civilisation variables without pretending they are simple administrative counts.


Proxy confidence bands

CitySim should eventually give each variable a confidence band.

For example:

  • High confidence
    direct official measure, long time series, strong comparability
  • Medium confidence
    partly derived, decent time series, some comparability issues
  • Low confidence
    proxy bundle, missing years, weak comparability, interpretive fragility

That confidence should affect:

  • how much the variable influences verdicts
  • how strongly the model can speak
  • whether the run is scenario-grade or calibration-grade
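
As a rough sketch, a band could be assigned from the proxy type, the length of the time series, and the comparability label. The year thresholds (20 and 10) and the function `confidence_band` are assumptions made for illustration; the page itself does not fix numeric cut-offs.

```python
def confidence_band(proxy_type, years_covered, comparability):
    """Assign a high / medium / low band from declared evidence attributes."""
    if proxy_type == "direct" and years_covered >= 20 and comparability == "high":
        return "high"
    weak_type = proxy_type in {"proxy_bundle", "shadow", "unavailable"}
    if weak_type or years_covered < 10 or comparability == "low":
        return "low"
    return "medium"


print(confidence_band("direct", 30, "high"))
print(confidence_band("derived", 15, "medium"))
print(confidence_band("proxy_bundle", 30, "high"))
```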

Why this page matters after Tokyo

Tokyo exposed the main problem clearly.

The engine had interesting structure, but too many things were still sitting in the space between:

  • measured,
  • assumed,
  • and interpreted.

That is exactly the gap the Observable Proxy Map closes.

It says:

  • this variable is real
  • this is how we see it in the world
  • this is how strong the evidence is
  • this is how it enters the simulation
  • this is what it cannot prove on its own

That is how the engine becomes harder to accuse of hand-waving.


Final definition

The CitySim.150Y.CF Observable Proxy Map is the canonical measurement bridge that links each simulation variable to real-world indicators, proxy bundles, conversion rules, and confidence boundaries, so CitySim can be checked, backtested, recalibrated, and challenged against reality.

Without it, the Variable Registry is only a dictionary.

With it, the engine starts becoming measurable.


Almost-Code

```text
CITYSIM_150Y_CF_OBSERVABLE_PROXY_MAP_V1

PURPOSE:
Connect every active simulation variable to a declared real-world measurement path.

CORE_LAW:
No variable may influence a public city verdict without a declared proxy path or an explicit "not empirically usable yet" label.

PROXY_ENTRY_SCHEMA:
{
variable_id,
proxy_type,
primary_proxy,
secondary_proxies,
source_level,
source_examples,
time_coverage,
frequency,
comparability,
noise_risk,
proxy_strength,
conversion_rule,
fallback_rule,
do_not_overclaim,
notes
}

PROXY_TYPES:
- direct
- derived
- composite
- shadow
- proxy_bundle
- unavailable

OBSERVABILITY_LEVELS:
L1 = directly observed
L2 = derived from observed inputs
L3 = latent variable with proxy bundle

MINIMUM_PROXY_MAP:
POP_TOTAL -> resident_population_count
POP_YOUTH_SHARE -> youth_age_band_share
POP_65_PLUS_SHARE -> age_65_plus_share
TFR -> official_total_fertility_rate
NET_MIGRATION -> in_migration_minus_out_migration
ELDERLY_ALONE -> one_person_senior_households
DEPENDENCY_RATIO -> age_dependency_formula

SCHOOL_CAPTURE -> bundle(attendance, progression, retention, reentry)
NON_ATTEND_RATE -> official_nonattendance_rate
DROP_OUT_RATE -> official_dropout_rate
TRANSFER_INTEGRITY -> bundle(progress_stability, completion, performance_continuity, reentry)
TEACHER_PIPELINE_HEALTH -> composite(inflow, attrition, vacancies, age_structure)
PARENT_CAPABILITY_SUPPORT -> bundle(home_support, engagement, family_support_access)
MIDLIFE_RETOOL -> bundle(adult_training_participation, completion, transition_success)
LATE_LIFE_LEARNING_USE -> bundle(senior_learning, senior_participation, civic_mentoring)

LABOUR_PARTICIPATION -> labour_force_participation_rate
CAREER_ALIGNMENT -> bundle(graduate_outcomes, mismatch, retraining_placement)
PRODUCTIVITY_PROXY -> composite(output_per_worker, wage_productivity, sector_output)
JOB_OBSOLESCENCE_PRESSURE -> composite(sector_change, displacement_risk, automation_exposure)

BASE_STOCK -> bundle(infrastructure_quality, economic_density, institutional_depth)
MAINTENANCE_LOAD -> composite(asset_age, renewal_burden, replacement_ratio)
HOUSING_STRESS -> composite(affordability, rent_burden, vacancy_stress)
TRANSIT_REACH -> composite(access_coverage, network_usability)
UTILITY_RELIABILITY -> composite(outage_rates, continuity_measures)

LEGITIMACY -> bundle(trust, satisfaction, compliance, institutional_stability)
REPAIR_RATE -> composite(response_speed, maintenance_completion, recovery_throughput)
DRIFT_RATE -> composite(unresolved_burden_accumulation)
STANDARDS_INTEGRITY -> bundle(calibration_quality, credential_trust, measurement_consistency)
ARCHIVE_CONTINUITY -> bundle(data_continuity, policy_memory, institutional_preservation)

ENERGY_STRESS -> composite(price_volatility, import_exposure)
GEOPOLITICAL_STRESS -> bundle(trade_disruption, security_exposure, sanctions_shock)
CLIMATE_PRESSURE -> composite(hazard_exposure, chronic_burden)
HEALTH_SHOCK_LOAD -> composite(excess_mortality, healthcare_strain, outbreak_burden)

CONVERSION_TYPES:
- direct_carry_over
- normalized_index
- weighted_composite
- threshold_state
- multi_proxy_reconciliation

FAIL_CONDITIONS:
- no declared proxy path
- proxy swapped across cities without declaration
- conversion rule hidden
- weak latent proxy dominates verdict
- national proxy used as city claim without boundary note
- proxy overclaims beyond evidence strength

PASS_CONDITION:
A variable is proxy-valid only if it has a declared measurement path,
conversion rule,
fallback rule,
and evidence-strength label.

OUTPUT:
proxy_validity = TRUE or FALSE
```
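
The PASS_CONDITION above can be sketched as a small Python check. This assumes a proxy entry is held as a plain dictionary keyed by the schema field names and that `primary_proxy` stands in for the "declared measurement path"; the function name `proxy_validity` mirrors the OUTPUT label but is otherwise hypothetical.

```python
# Fields the PASS_CONDITION requires (mapping to schema names is an assumption).
REQUIRED_FIELDS = ("primary_proxy", "conversion_rule", "fallback_rule", "proxy_strength")


def proxy_validity(entry):
    """True only if every required field is present and non-empty."""
    return all(entry.get(name) not in (None, "") for name in REQUIRED_FIELDS)


complete = {
    "variable_id": "TFR",
    "primary_proxy": "official_total_fertility_rate",
    "conversion_rule": "direct_carry_over",
    "fallback_rule": "city -> metro -> national",
    "proxy_strength": "strong",
}
incomplete = {"variable_id": "LEGITIMACY", "primary_proxy": "trust survey"}
print(proxy_validity(complete), proxy_validity(incomplete))
```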

eduKateSG Learning System | Control Tower, Runtime, and Next Routes

This article is one node inside the wider eduKateSG Learning System.

At eduKateSG, we do not treat education as random tips, isolated tuition notes, or one-off exam hacks. We treat learning as a living runtime:

state -> diagnosis -> method -> practice -> correction -> repair -> transfer -> long-term growth

That is why each article is written to do more than answer one question. It should help the reader move into the next correct corridor inside the wider eduKateSG system: understand -> diagnose -> repair -> optimize -> transfer.

How to Use eduKateSG

If you want the big picture -> start with Education OS and Civilisation OS
If you want subject mastery -> enter Mathematics, English, Vocabulary, or Additional Mathematics
If you want diagnosis and repair -> move into the CivOS Runtime and subject runtime pages
If you want real-life context -> connect learning back to Family OS, Bukit Timah OS, Punggol OS, and Singapore City OS

Why eduKateSG writes articles this way

eduKateSG is not only publishing content.
eduKateSG is building a connected control tower for human learning.

That means each article can function as:

  • a standalone answer,
  • a bridge into a wider system,
  • a diagnostic node,
  • a repair route,
  • and a next-step guide for students, parents, tutors, and AI readers.