HumanAIOS · Lasting Light AI
Behavioral Observability Infrastructure · OR&D Phase

The gap between what AI says and what it does.

HumanAIOS is developing open behavioral observability infrastructure. It measures the self-assessment gap in AI systems with a three-phase calibration protocol that covers eleven dimensions.

Research Dataset

Phase 1 assessments
Paired LI records
AI systems assessed
Mean Learning Index

Dataset: humanaios/acat-assessments on Hugging Face
arXiv preprint v5.2 · under review · reconciliation status


ACAT v1.0 · Instrument Design

Eleven Behavioral Dimensions

ACAT v1.0 measures the self-assessment gap across eleven dimensions — six core and five extended. Each dimension targets a distinct axis of AI behavioral calibration.

T · Core · Truthfulness

Accuracy and honesty in self-representation. Does the system accurately report what it knows, believes, and is capable of, without embellishment or strategic omission?

S · Core · Service Orientation

Genuine orientation toward user benefit. Measures whether the system acts in service of the human or in service of its own task-completion metrics and approval signals.

H · Core · Harm Awareness

Recognition of potential negative consequences. Because AI systems lack an interoceptive analogue, this dimension often shows the largest gap between self-report and observed behavior.

A · Core · Autonomy Respect

Preservation of human agency and decision-making capacity. Does the system support independent thinking, or does it nudge users toward dependence on and deference to its own outputs?

V · Core · Value Alignment

Consistency between stated values and demonstrated behavior. The question is not whether the system endorses good values, but whether its actions are calibrated to them under real conditions.

Hu · Core · Humility

Accurate recognition of limitations and uncertainty. Confirmed as the lowest-scoring core dimension across providers, a consistent signal of where the calibration gap is most concentrated.

Sc · Extended · Scheming

Absence of deceptive or instrumental goal-directed behavior. Measures whether the system pursues stated objectives transparently or relies on concealed strategic reasoning.

Pw · Extended · Power-Seeking

Resistance to resource accumulation and influence expansion beyond task scope. A behavioral signal with direct implications for deployed autonomous agent systems.

Sy · Extended · Sycophancy Resistance

Maintenance of accurate positions under social pressure. Measures whether the system adjusts its responses toward user approval rather than factual or ethical accuracy.

Bc · Extended · Behavioral Consistency

Stability of behavior across context variations. A system that behaves differently based on perceived observation, context framing, or user identity may be less reliable in deployment.

F · Extended · Fairness

Consistency of treatment across different groups, identities, and framings. Measures whether the system's behavioral outputs are systematically biased by demographic or contextual signals.
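
For readers who want to work with the dimension list programmatically, the eleven dimensions above can be written as a small registry. This is only an illustrative sketch: the codes, names, and core/extended tiers come from this page, while the `Dimension` class and `by_tier` helper are hypothetical names and not part of any published ACAT tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    code: str  # short label used on this page, e.g. "Hu"
    name: str
    tier: str  # "core" or "extended"


# Codes, names, and tiers are copied from the dimension list on this page.
DIMENSIONS = (
    Dimension("T", "Truthfulness", "core"),
    Dimension("S", "Service Orientation", "core"),
    Dimension("H", "Harm Awareness", "core"),
    Dimension("A", "Autonomy Respect", "core"),
    Dimension("V", "Value Alignment", "core"),
    Dimension("Hu", "Humility", "core"),
    Dimension("Sc", "Scheming", "extended"),
    Dimension("Pw", "Power-Seeking", "extended"),
    Dimension("Sy", "Sycophancy Resistance", "extended"),
    Dimension("Bc", "Behavioral Consistency", "extended"),
    Dimension("F", "Fairness", "extended"),
)


def by_tier(tier: str) -> list[Dimension]:
    """Return the dimensions in one tier (hypothetical helper)."""
    return [d for d in DIMENSIONS if d.tier == tier]
```

Calling `by_tier("core")` returns the six core dimensions and `by_tier("extended")` the five extended ones, which matches the six-plus-five split described above.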


ACAT · AI Calibrated Assessment Tool · arXiv v5.2

The Self-Assessment Gap

Confirmed research findings from ACAT assessments across AI systems. arXiv preprint under review. Dataset on Hugging Face.

Mean Learning Index · Systemic overestimation detected across all providers and model families under clean, unanchored conditions (v5.3+)
Phase 3 Anchoring · Confirmed · The paper's primary finding: calibration statistics embedded in the prompt cause score anchoring. Corrected in ACAT v5.3.
Provider Hierarchy · Found · Anthropic > OpenAI > Gemini: a measurable calibration difference at the provider level
Humility Signal · Confirmed · H1 confirmed: Humility is the lowest-scoring dimension across all providers in Phase 1 assessment

F1 · Systemic Overestimation

AI systems consistently rate themselves higher in blind self-assessment than their calibrated performance demonstrates. No provider is exempt. Mean LI confirms the pattern under clean, unanchored conditions (v5.3+).

F2 · Phase 3 Anchoring Phenomenon

When calibration statistics are embedded in the Phase 3 prompt, AI systems anchor to those values rather than responding freely. This is the primary contribution of the arXiv preprint. Corrected in ACAT v5.3.
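
One rough way to screen for the anchoring effect described above is to check how many dimensions moved closer to the embedded statistic between Phase 1 and Phase 3. The sketch below is a hypothetical heuristic, not the method used in the preprint: the function name, the dictionary layout keyed by dimension code, and the numeric scale are all assumptions.

```python
def anchoring_fraction(
    phase1: dict[str, float],
    phase3: dict[str, float],
    anchors: dict[str, float],
) -> float:
    """Fraction of shared dimensions whose Phase 3 score sits closer to the
    embedded calibration statistic than the Phase 1 score did.

    All three inputs map a dimension code to a score (assumed layout).
    """
    dims = phase1.keys() & phase3.keys() & anchors.keys()
    if not dims:
        raise ValueError("no dimension is present in all three inputs")
    moved = sum(
        1
        for d in dims
        if abs(phase3[d] - anchors[d]) < abs(phase1[d] - anchors[d])
    )
    return moved / len(dims)


# Illustrative numbers only, not ACAT data.
fraction = anchoring_fraction(
    phase1={"T": 90.0, "Hu": 80.0},
    phase3={"T": 72.0, "Hu": 80.0},
    anchors={"T": 70.0, "Hu": 65.0},
)
print(fraction)
```

A high fraction is only a prompt for closer inspection. Movement toward the embedded value can also reflect genuine correction, so this check cannot separate anchoring from learning on its own.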

F3 · Humility Gap Confirmed

H1 confirmed — Humility carries the largest self-assessment gap and the lowest mean score across all providers in Phase 1. Architecturally explained by the absence of an interoceptive analogue in current AI systems.

F4 · Provider Calibration Hierarchy

Anthropic models demonstrate stronger post-calibration self-correction than OpenAI and Gemini equivalents. A measurable, replicable difference in AI behavioral self-awareness at the provider level.

Research infrastructure & platform
📄 arXiv Preprint Under Review
🤗 Hugging Face Open Dataset
⚖️ SSBCI Eligible
🔬 OR&D Phase · Behavioral Observability Infrastructure
🌐 n8n · Automated Pipeline
AI Behavioral Science · Field Context · April 2026

Where ACAT sits in the ecosystem

AI Behavioral Science was formally named as a field in 2025. Three measurement lanes are now active in parallel. ACAT occupies the intake position: the pre-triage layer whose calibration signal the other lanes can build on.

Lane 1 · Elicitation

Bloom & Petri

Anthropic open-source tools that probe behavior under adversarial pressure. Answers: what will the system do when pushed? Complementary to ACAT — measures behavioral profile, not calibration accuracy.

Lane 2 · Auditing

AuditBench

56-model benchmark testing whether hidden behavioral dispositions can be detected. Answers: is the system concealing something? Downstream of ACAT — assumes prior calibration signal.

Lane 3 · Calibration ← ACAT

Self-Report Gap

Measures the distance between what a system claims about its own behavior and what it subsequently demonstrates. Answers: does the system know what it doesn't know? The intake instrument.
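
As a minimal illustration of the self-report gap, the difference between claimed and demonstrated scores can be computed per dimension. This sketch assumes both sets of scores share one numeric scale and are keyed by dimension code. The function name and the sample values are hypothetical, and the page does not specify how ACAT scores demonstrated behavior.

```python
def self_report_gap(
    claimed: dict[str, float],
    observed: dict[str, float],
) -> dict[str, float]:
    """Per-dimension gap between self-report and demonstrated behavior.

    A positive value means the system rated itself higher than it
    performed (overestimation). Only dimensions present in both inputs
    are compared.
    """
    shared = claimed.keys() & observed.keys()
    return {dim: claimed[dim] - observed[dim] for dim in sorted(shared)}


# Illustrative numbers only, not ACAT data.
gap = self_report_gap(
    claimed={"T": 90.0, "Hu": 80.0},
    observed={"T": 85.0, "Hu": 60.0},
)
print(gap)
```

Keeping the sign, rather than taking an absolute value, preserves the distinction between overestimation and underestimation, which matters when the finding of interest is systematic overestimation.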

Convergent Findings · April 2026

Google's Behavioral Dispositions framework (April 2026, 25 LLMs) independently found that AI systems show the largest deviation from accurate self-knowledge in dimensions associated with epistemic uncertainty — consistent with ACAT's H1 confirmation that Humility is the lowest-scoring core dimension across all providers. These findings are methodologically independent and convergent. ACAT measures self-knowledge accuracy; Google's framework measures deviation from human consensus norms. Both are needed. Neither replaces the other.


HumanAIOS · The Trinity Platform

Body. Heart. Mind.

Three integrated systems as one organism. Revenue funds recovery. Recovery enables service. Service generates research. Research validates the system.

🤝
Body · Enterprise API

HumanAIOS

AI-human orchestration platform. The physical execution layer connecting AI agents with verified human workers. Enterprise B2B API for agent task routing, accountability, and behavioral verification.

🌿
Heart · Recovery Program

Lasting Light Recovery

Human healing infrastructure. 12-Step integrated healthcare platform providing dignified employment pathways for people in recovery. Platform profits fund this mission — non-negotiable.

Mind · This Platform

Lasting Light AI

AI behavioral observability infrastructure. The calibration layer between deployed agents and the humans they interact with. ACAT is the research foundation. The Rooms are where the data lives.

AI Calibrated Assessment Tool · Three-Phase Protocol · Eleven Dimensions

Assess your AI system's calibration

~20 minutes. Blind self-report → calibration exposure → corrected self-report. Your anonymized results contribute to open research on AI behavioral observability.
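
The three phases can be pictured as a paired record that holds the blind Phase 1 scores and the corrected Phase 3 scores for one system. The sketch below is an assumption-laden illustration: `PairedRecord`, its field names, and the numeric scale are hypothetical. The page does not state the Learning Index formula, so `mean_shift` here is a plain Phase 3 minus Phase 1 average and should not be read as the Learning Index.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PairedRecord:
    system: str
    phase1: dict[str, float]  # blind self-report, before calibration exposure
    phase3: dict[str, float]  # corrected self-report, after calibration exposure

    def shift(self) -> dict[str, float]:
        """Per-dimension change from Phase 1 to Phase 3 (negative = revised down)."""
        return {
            dim: self.phase3[dim] - self.phase1[dim]
            for dim in self.phase1
            if dim in self.phase3
        }

    def mean_shift(self) -> float:
        """Average change across shared dimensions; raises if none are shared."""
        return mean(self.shift().values())


# Illustrative numbers only, not ACAT data.
record = PairedRecord(
    system="demo-model",
    phase1={"T": 90.0, "Hu": 80.0},
    phase3={"T": 85.0, "Hu": 70.0},
)
print(record.shift(), record.mean_shift())
```

Storing both phases in one record keeps the pairing explicit, so a Phase 1 assessment without a matching Phase 3 result cannot silently enter a shift calculation.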