50 Hard-to-Fake Tests Across All 4 OS Layers (Education, Governance, Production, Constraints)
Why Retest Probes Are the Heart of Civilisation OS
Most systems collapse because they lose the ability to retest reality.
They keep dashboards, reports, and meetings—but those are often proxies.
Proxies drift. Probes anchor you back to truth.
A retest probe is:
- repeatable on a schedule (monthly/quarterly)
- hard to fake
- tied directly to the function contract of the system
- designed to reveal drift early
- designed to prove recovery is real (or expose theatre)
In Civilisation OS, no claim counts without probes.
Start here: What is Civilisation OS: https://edukatesg.com/what-is-civilisation-os/
How to Use This Library
Pick:
- 3–5 probes per OS layer (monthly)
- 1–2 probes cross-OS (monthly)
- rotate a few probes quarterly to reduce gaming
- keep at least two probes constant to track slope (e/t)
Rule:
If your proxies improve but probes do not, you are drifting.
If probes improve, you are healing—even if proxies temporarily dip.
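The rule above can be sketched as a small classifier. This is only an illustration, not part of the Civilisation OS material: the function name `classify_trend`, the label strings, and the convention that a positive change means improvement are all assumptions made for the example.

```python
def classify_trend(proxy_change: float, probe_change: float) -> str:
    """Apply the rule: probes decide, proxies do not.

    Both arguments are hypothetical "change since last cycle" values,
    where a positive number means improvement.
    """
    if probe_change > 0:
        # Probes improving counts as healing, even if proxies dip.
        return "healing"
    if proxy_change > 0:
        # Proxies up while probes are flat or down: drift.
        return "drifting"
    # The rule in the text does not cover this case; the label is an assumption.
    return "no improvement"


print(classify_trend(proxy_change=4.0, probe_change=-1.0))  # drifting
print(classify_trend(proxy_change=-2.0, probe_change=3.0))  # healing
```

The probe check comes first on purpose, so that a proxy reading can never override a probe reading.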
A) Education OS Retest Probes (12)
Function contract: produce capability, judgment, and adaptation speed.
- Cold comprehension probe
  20-minute unseen text; measure inference, main idea, evidence selection.
- Timed clarity writing probe
  1 page, 30 minutes; score coherence, logic, evidence; no “fluff points.”
- Transfer problem probe
  10 unfamiliar problems; measures reasoning beyond memorised formats.
- Error correction speed probe
  Re-test last month’s weakest skill; measure time-to-close-gap.
- Explain-why probe
  Student must explain reasoning aloud; check conceptual integrity.
- Misconception detection probe
  Target common misconceptions; score whether students spot and correct them.
- Memory durability probe
  Spaced re-test after 2–4 weeks without revision; measures true retention.
- Flexibility probe
  Same concept in different contexts; test whether knowledge generalises.
- Reading stamina probe
  Longer passage under time; detect “surface comprehension” drift.
- Math fluency probe
  Mental arithmetic + core manipulation; catches foundation decay early.
- Independent work probe
  Solve task without hints or “teacher scaffolding”; measures autonomy.
- Learning-to-learn probe
  Student plans study, predicts errors, then reflects; checks metacognition health.
Education drift signature: proxies (grades) rise while cold probes + transfer + durability flatline or fall.
B) Governance OS Retest Probes (12)
Function contract: coordinate behaviour with truth integrity, legitimacy, and execution capacity.
- Bad-news escalation probe
  Inject or identify a real frontline issue: can it reach decision-makers and trigger action within a defined window?
- Contradiction tolerance probe
  Can the system publish uncomfortable data without retaliation or spin?
- Policy reversal probe
  When wrong, can policy be corrected quickly with clear reasoning?
- Consistency enforcement probe
  Do similar cases get similar outcomes regardless of status?
- Incentive alignment probe
  Pick one rewarded metric: does improving it improve real outcomes?
- Frontline autonomy probe
  Can frontline staff solve cases under standards without excessive approvals?
- Decision latency probe
  Time from problem detection → decision → implementation.
- Coordination handoff probe
  Count handoffs per service case; rising handoffs indicate drift.
- Procurement exception probe
  Track “urgent exceptions” and single-bid rates; rising levels are drift signals.
- Complaint-to-resolution probe
  Citizen complaints: percent resolved with verified outcomes, not replies.
- Whistleblower safety probe
  Are whistleblowers protected and acted on (not punished)?
- Legitimacy under stress probe
  During a minor disruption, does voluntary compliance remain high?
Governance drift signature: policy volume rises, service delivery slows, truth becomes punishable, and reversals become reputationally impossible.
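The decision latency probe (detection → decision → implementation) can be sketched with plain dates. The case records, field names, and the function `stage_latencies` below are hypothetical, chosen only to show how the two stages can be timed separately.

```python
from datetime import date

# Hypothetical case records: detection, decision, and implementation dates.
CASES = [
    {"detected": date(2024, 3, 1), "decided": date(2024, 3, 8),
     "implemented": date(2024, 3, 29)},
    {"detected": date(2024, 3, 5), "decided": date(2024, 3, 10),
     "implemented": date(2024, 3, 20)},
]


def stage_latencies(case):
    """Return days to decision, days to implementation, and the total."""
    to_decision = (case["decided"] - case["detected"]).days
    to_implementation = (case["implemented"] - case["decided"]).days
    return to_decision, to_implementation, to_decision + to_implementation


for case in CASES:
    print(stage_latencies(case))
# (7, 21, 28)
# (5, 10, 15)
```

Splitting the total into stages shows whether delay sits in deciding or in executing, which a single end-to-end figure would hide.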
C) Production / Technology OS Retest Probes (13)
Function contract: convert capability into resilient output.
- Stress test probe (critical system)
  Monthly stress test: does a key system fail gracefully or catastrophically?
- MTTR probe (mean time to repair)
  Track time-to-recover from incidents; should trend down.
- Incident repeat probe
  Same incident recurring indicates drift and unlearned lessons.
- Maintenance debt probe
  Maintenance backlog direction (up/down) is a direct fragility indicator.
- Redundancy probe
  Identify single points of failure; track reduction over time.
- Supply dependency probe
  Track top supplier concentration and alternatives; rising concentration = drift.
- Quality escape probe
  Defects reaching users/customers; trend must decline.
- Firefighting ratio probe
  Planned work vs emergency work; more firefighting = system decay.
- Delivery cycle probe
  Time from idea → deployment; rising time indicates coordination drag.
- Security incident integrity probe
  Are incidents reported fully, or hidden? Compare external signals to internal reports.
- Reliability vs growth probe
  If output grows, reliability must not degrade; track jointly.
- Skill density probe
  Measure whether competence is concentrated in a few “heroes.”
- Drill-to-real probe
  Run simulated disruptions; compare drill performance to real incident outcomes.
Production drift signature: output metrics look good while maintenance debt, incident repeat, and MTTR worsen.
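Two of the signals in that signature, MTTR and incident repeat, are simple to compute from an incident log. The sketch below assumes a hypothetical log of (cause label, detected, recovered) tuples; the data and function names are invented for illustration.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical incident log: (cause label, detected, recovered).
INCIDENTS = [
    ("db-failover", datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 13, 0)),
    ("cert-expiry", datetime(2024, 1, 20, 8, 0), datetime(2024, 1, 20, 10, 0)),
    ("db-failover", datetime(2024, 2, 11, 14, 0), datetime(2024, 2, 11, 16, 0)),
    ("disk-full", datetime(2024, 2, 25, 7, 0), datetime(2024, 2, 25, 8, 0)),
]


def mttr_hours_by_month(incidents):
    """Mean time to repair in hours, grouped by (year, month) of detection."""
    durations = {}
    for _, detected, recovered in incidents:
        key = (detected.year, detected.month)
        hours = (recovered - detected).total_seconds() / 3600
        durations.setdefault(key, []).append(hours)
    return {key: mean(vals) for key, vals in sorted(durations.items())}


def repeated_causes(incidents):
    """Causes seen more than once: the 'incident repeat' signal."""
    counts = Counter(cause for cause, _, _ in incidents)
    return sorted(cause for cause, n in counts.items() if n > 1)


print(mttr_hours_by_month(INCIDENTS))  # {(2024, 1): 3.0, (2024, 2): 1.5}
print(repeated_causes(INCIDENTS))      # ['db-failover']
```

In this invented log MTTR is falling, yet the same cause recurs, which is why the two probes are worth reading together.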
D) Constraint OS Retest Probes (13)
Function contract: keep civilisation coupled to physical reality (energy, ecology, demographics, geography).
- Shock simulation probe
  Quarterly simulated shock: energy, water, food, logistics; measure response time and stability.
- Recovery cost probe
  Trend of cost to recover from shocks (financial + time + capability). Rising costs = constraint drift.
- Energy volatility probe
  Measure volatility and vulnerability, not just supply.
- Resource bottleneck probe
  Track single-resource chokepoints (fertiliser, semiconductors, fuel, etc.).
- Water stress probe
  Availability + quality; include seasonal and contamination risk.
- Food resilience probe
  Import dependency, buffer stocks, and distribution robustness.
- Insurance retreat probe
  Uninsurable zones are a reality signal of constraint tightening.
- Infrastructure fragility probe
  Failure frequency in critical infrastructure; rising failures indicate that maintenance debt is meeting constraints.
- Demographic load probe
  Dependency ratios + workforce participation + skill pipeline vs retirements.
- Healthspan probe
  Capability years (healthy productive years) vs lifespan. Declining healthspan increases load.
- Debt masking probe
  Is stability maintained by future-borrowing? Track debt growth vs productive capacity.
- Constraint audit probe
  Quarterly “where are we pretending?” audit. If you cannot name denial points, you are drifting.
- Adaptation speed probe
  Time from constraint signal → policy adjustment → operational deployment.
Constraint drift signature: repeated “surprise crises,” rising recovery costs, and stability maintained by masking (debt/denial) rather than adaptation.
E) Cross-OS Retest Probes (Bonus Set You Should Always Run)
These are the highest-power probes because they detect desynchronisation—the strongest predictor of accelerating decline.
Cross Probe 1: Truth-to-Action Latency
How fast can reality travel upward and be converted into action across institutions?
Cross Probe 2: Proxy–Reality Gap Index
Difference between proxy metrics (reported success) and probe outcomes (real performance).
Cross Probe 3: Maintenance vs Expansion Ratio
How much effort goes into sustaining foundations vs chasing new initiatives?
Cross Probe 4: Capability vs Complexity Ratio
Is system complexity rising faster than skill density?
Cross Probe 5: Shock Readiness vs Shock Cost
Do drills improve while real shock costs decline? If not, drills are theatre.
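Cross Probe 2 describes the Proxy–Reality Gap Index as a difference between proxy metrics and probe outcomes. The text does not give a formula, so the sketch below is one possible reading: it assumes both are scored on the same 0–100 scale and takes the mean of the paired differences. The function name and sample scores are hypothetical.

```python
def proxy_reality_gap(proxy_scores, probe_scores):
    """Mean of (proxy - probe) over paired scores on the same scale.

    Assumption: each proxy score has a matching probe score measuring
    the same function, and both use a common 0-100 scale.
    """
    if len(proxy_scores) != len(probe_scores) or not proxy_scores:
        raise ValueError("need equal-length, non-empty score lists")
    diffs = [proxy - probe for proxy, probe in zip(proxy_scores, probe_scores)]
    return sum(diffs) / len(diffs)


gap_q1 = proxy_reality_gap([80, 85, 90], [70, 72, 74])
gap_q2 = proxy_reality_gap([82, 86, 91], [78, 80, 83])
print(gap_q1, gap_q2)  # 13.0 6.0
```

A positive gap means reported success is running ahead of probe results; in the invented data the gap shrinks from one quarter to the next.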
The Retest Discipline (How to Prevent Gaming)
Probes fail when they become predictable targets.
Civilisation OS prevents that by:
- keeping 2 constant probes for long-term slope
- rotating 1–2 probes quarterly
- requiring probes to be unseen / cold where possible
- pairing every rewarded metric with at least one probe
- penalising “perfect reporting” when probes disagree
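The first two points above (constant probes plus a quarterly rotation) can be sketched as a schedule generator. This is an illustrative assumption rather than a prescribed procedure: the function `quarterly_probe_sets`, its parameters, and the choice to pick the rotating probes at random are all invented for the example. The probe names are taken from the Education OS list.

```python
import random


def quarterly_probe_sets(constant, rotating_pool, per_quarter, quarters, seed=None):
    """Build one probe list per quarter: fixed probes plus random rotating picks."""
    rng = random.Random(seed)
    schedule = []
    previous = []
    for _ in range(quarters):
        # Prefer probes not used last quarter so consecutive quarters differ.
        candidates = [p for p in rotating_pool if p not in previous]
        if len(candidates) < per_quarter:
            candidates = list(rotating_pool)
        picks = rng.sample(candidates, per_quarter)
        schedule.append(list(constant) + picks)
        previous = picks
    return schedule


CONSTANT = ["Cold comprehension", "Transfer problem"]
POOL = ["Explain-why", "Memory durability", "Flexibility", "Reading stamina"]

for quarter, probes in enumerate(quarterly_probe_sets(CONSTANT, POOL, 2, 4), start=1):
    print(f"Q{quarter}: {probes}")
```

Random picks are used here because a fixed rotation order would itself become predictable; the constant probes stay in place so the long-term slope remains comparable.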
What “Real Recovery” Looks Like
Recovery is real when:
- probes improve across 2–3 cycles
- the proxy–probe gap shrinks
- truth latency falls
- incentives stop rewarding gaming
- standards lock in the gains
- e/t slope stays positive
If probes don’t improve, you are not recovering—you are narrating.
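Three of the recovery conditions above are numeric and can be checked mechanically: probes improving across consecutive cycles, a shrinking proxy–probe gap, and falling truth latency. The sketch below covers only those three; incentives, standards, and the e/t slope are left out. Function names, the default of two cycles, and the sample series are assumptions.

```python
def improved_over_cycles(scores, cycles=2):
    """True if each of the last `cycles` steps is a strict increase."""
    if len(scores) < cycles + 1:
        return False
    recent = scores[-(cycles + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def fell_over_cycles(values, cycles=2):
    """True if each of the last `cycles` steps is a strict decrease."""
    return improved_over_cycles([-v for v in values], cycles)


def recovery_is_real(probe_scores, gaps, latencies_days, cycles=2):
    """Check the three numeric conditions together."""
    return (
        improved_over_cycles(probe_scores, cycles)
        and fell_over_cycles(gaps, cycles)
        and fell_over_cycles(latencies_days, cycles)
    )


print(recovery_is_real([60, 64, 69], [13, 9, 6], [30, 21, 14]))  # True
print(recovery_is_real([60, 64, 63], [13, 9, 6], [30, 21, 14]))  # False
```

Requiring every step to move in the right direction is a strict reading of "across 2–3 cycles"; a looser version could compare only the first and last readings.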
Q&A: Retest Probes
Are probes just metrics?
No. Probes are reality tests designed to be hard to fake.
How many probes should I run monthly?
Minimum: 3–5 per relevant OS layer + 1 cross-OS probe.
What is the single most important probe overall?
Bad-news escalation / truth-to-action latency. If truth can’t move, nothing can be repaired.
What if probes contradict leadership narratives?
That is exactly what they are for. Probes exist to prevent delusion drift.
Companion Articles in This Series
Part 1 — What is Civilisation OS: https://edukatesg.com/what-is-civilisation-os/
Part 2 — How it works: https://edukatesg.com/how-civilisation-os-works-why-these-layers-govern-human-reality/
Part 3 — Academic foundations: https://edukatesg.com/civilisation-os-what-are-the-academic-foundation-of-civilisation-os/
Part 4 — Detect + repair trajectories: https://edukatesg.com/how-civilisations-os-detect-rise-stagnation-regression-and-collapse-and-how-to-repair-trajectory-with-limited-prediction/
Part 5 — This Field Manual (execution method, recovery modes, probes) https://edukatesg.com/civilisation-os-field-manual/

