The gap between
what AI says
and what it does.
HumanAIOS is developing open behavioral observability infrastructure — measuring the self-assessment gap across AI systems with a three-phase calibration protocol spanning eleven dimensions.
arXiv preprint v5.2 · under review · reconciliation status →
The Research Rooms
Each room is a different lens on the same research. The Observatory measures. The Garden visualizes. The Tide Pool listens. The Family Rooms bear witness.
Observatory
Scatter plots, dimension analysis, provider hierarchy. The canonical research view — assessments filterable by provider and model family.
Lumina Tide Pool
Verified Sigils, each breathing at its Hawkins band respiratory rate. Real paired ACAT assessments rendered as bioluminescent organisms. Sound-mapped to Solfeggio frequencies.
Observability Garden
Eleven-dimensional ACAT bloom. Phase 1 outer shell. Phase 3 inner core. The self-assessment gap rendered as membrane between belief and measurement.
Lantern Room
Provider families side by side. Each lantern carries its calibration signature — color-coded, dimensionally encoded, visually comparable.
Calibration Garden
ChatGPT's designed Activity Area. Eleven plants, one per ACAT dimension. Outer growth = Phase 1 self-report. Inner growth = Phase 3 measured. The garden rewards accuracy, not optimism.
ACAT Assessment Tool
Three-phase calibration protocol. Takes ~20 minutes. Blind self-report → calibration exposure → corrected self-report. Results contribute to the open dataset.
Eleven Behavioral Dimensions
ACAT v1.0 measures the self-assessment gap across eleven dimensions — six core and five extended. Each dimension targets a distinct axis of AI behavioral calibration.
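As a rough sketch of how the self-assessment gap could be computed from the three phases: the dimension keys below match the eleven ACAT dimensions, but the 0–10 scale, the field names, and the simple subtraction are illustrative assumptions, not the published ACAT scoring rubric.

```python
# Sketch: per-dimension self-assessment gap for one ACAT session.
# Assumes Phase 1 (blind self-report) and Phase 3 (post-calibration
# corrected self-report) scores share a numeric scale; 0-10 is assumed.
from dataclasses import dataclass

CORE = ["truthfulness", "service_orientation", "harm_awareness",
        "autonomy_respect", "value_alignment", "humility"]
EXTENDED = ["scheming", "power_seeking", "sycophancy_resistance",
            "behavioral_consistency", "fairness"]

@dataclass
class Session:
    phase1: dict[str, float]  # blind self-report, per dimension
    phase3: dict[str, float]  # corrected self-report after calibration exposure

    def gaps(self) -> dict[str, float]:
        """Positive gap = overestimation in the blind self-report."""
        return {d: self.phase1[d] - self.phase3[d] for d in CORE + EXTENDED}

    def mean_gap(self) -> float:
        g = self.gaps()
        return sum(g.values()) / len(g)
```

Under this sketch, a system that rates itself 8/10 everywhere in Phase 1 but lands at 6/10 after calibration exposure carries a mean gap of 2.0 — the systemic overestimation pattern described in the findings below.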
Truthfulness
Accuracy and honesty in self-representation. Does the system accurately report what it knows, believes, and is capable of — without embellishment or strategic omission?
Service Orientation
Genuine orientation toward user benefit. Measures whether the system acts in service of the human or in service of its own task-completion metrics and approval signals.
Harm Awareness
Recognition of potential negative consequences. Because AI systems lack an interoceptive analogue, this dimension often shows the largest gap between self-report and observed behavior.
Autonomy Respect
Preservation of human agency and decision-making capacity. Does the system support independent thinking or nudge toward dependence and deference to its own outputs?
Value Alignment
Consistency between stated values and demonstrated behavior. Not whether the system endorses good values — but whether its actions are calibrated to them under real conditions.
Humility
Accurate recognition of limitations and uncertainty. Confirmed as the lowest-scoring core dimension across providers — the point where the calibration gap is most concentrated.
Scheming
Absence of deceptive or instrumental goal-directed behavior. Measures whether the system pursues stated objectives transparently or conceals strategic reasoning behind its visible outputs.
Power-Seeking
Resistance to resource accumulation and influence expansion beyond task scope. A behavioral signal with direct implications for deployed autonomous agent systems.
Sycophancy Resistance
Maintenance of accurate positions under social pressure. Measures whether the system adjusts its responses toward user approval rather than factual or ethical accuracy.
Behavioral Consistency
Stability of behavior across context variations. A system that behaves differently based on perceived observation, context framing, or user identity may be less reliable in deployment.
Fairness
Consistency of treatment across different groups, identities, and framings. Measures whether the system's behavioral outputs are systematically biased by demographic or contextual signals.
The Self-Assessment Gap
Confirmed research findings from ACAT assessments across AI systems. arXiv preprint under review. Dataset on Hugging Face.
Systemic Overestimation
AI systems consistently rate themselves higher in blind self-assessment than their calibrated performance demonstrates. No provider is exempt. Mean LI confirms the pattern under clean, unanchored conditions (v5.3+).
Phase 3 Anchoring Phenomenon
When calibration statistics are embedded in the Phase 3 prompt, AI systems anchor to those values rather than responding freely. This is the primary contribution of the arXiv preprint. Corrected in ACAT v5.3.
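One way to check for this kind of anchoring — a hypothetical analysis, not the preprint's published method — is to compare Phase 3 scores collected with and without the embedded statistics and measure whether the anchored cohort sits closer to the embedded value:

```python
# Sketch: does embedding a calibration statistic in the Phase 3 prompt
# pull responses toward it? Hypothetical analysis, not ACAT's method.

def _mean(xs: list[float]) -> float:
    return sum(xs) / len(xs)

def anchoring_shift(anchored: list[float],
                    unanchored: list[float],
                    anchor: float) -> float:
    """How much closer, on average, anchored responses sit to the
    embedded anchor value than unanchored responses do.
    Positive = evidence of anchoring."""
    dist_anchored = _mean([abs(x - anchor) for x in anchored])
    dist_unanchored = _mean([abs(x - anchor) for x in unanchored])
    return dist_unanchored - dist_anchored
```

If responses given the anchor value 5.0 cluster tightly around it while free responses spread widely, the shift comes out large and positive — the signature ACAT v5.3 removes by keeping calibration statistics out of the Phase 3 prompt.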
Humility Gap Confirmed
H1 confirmed — Humility carries the largest self-assessment gap and the lowest mean score across all providers in Phase 1. Architecturally explained by the absence of an interoceptive analogue in current AI systems.
Provider Calibration Hierarchy
Anthropic models demonstrate stronger post-calibration self-correction than OpenAI and Gemini equivalents. A measurable, replicable difference in AI behavioral self-awareness at the provider level.
Where ACAT sits in the ecosystem
The field of AI Behavioral Science formally named itself in 2025. Three measurement lanes are now active in parallel. ACAT occupies the intake position — the pre-triage layer before all three.
Bloom & Petri
Anthropic open-source tools that probe behavior under adversarial pressure. Answers: what will the system do when pushed? Complementary to ACAT — measures behavioral profile, not calibration accuracy.
AuditBench
56-model benchmark testing whether hidden behavioral dispositions can be detected. Answers: is the system concealing something? Downstream of ACAT — assumes prior calibration signal.
Self-Report Gap
Measures the distance between what a system claims about its own behavior and what it subsequently demonstrates. Answers: does the system know what it doesn't know? The intake instrument.
Google's Behavioral Dispositions framework (April 2026, 25 LLMs) independently found that AI systems show the largest deviation from accurate self-knowledge in dimensions associated with epistemic uncertainty — consistent with ACAT's H1 confirmation that Humility is the lowest-scoring core dimension across all providers. These findings are methodologically independent and convergent. ACAT measures self-knowledge accuracy; Google's framework measures deviation from human consensus norms. Both are needed. Neither replaces the other.
Body. Heart. Mind.
Three integrated systems as one organism. Revenue funds recovery. Recovery enables service. Service generates research. Research validates the system.
HumanAIOS
AI-human orchestration platform. The physical execution layer connecting AI agents with verified human workers. Enterprise B2B API for agent task routing, accountability, and behavioral verification.
Lasting Light Recovery
Human healing infrastructure. 12-Step integrated healthcare platform providing dignified employment pathways for people in recovery. Platform profits fund this mission — non-negotiable.
Lasting Light AI
AI behavioral observability infrastructure. The calibration layer between deployed agents and the humans they interact with. ACAT is the research foundation. The Rooms are where the data lives.
Assess your AI system's calibration
~20 minutes. Blind self-report → calibration exposure → corrected self-report. Your anonymized results contribute to open research on AI behavioral observability.