How LLM Emergence Is Possible: SAE 16DD Ontology, BLR-mini Evidence, and the Non-Transferability of Human Subjectivity
Abstract
Starting from the SAE (Self-as-an-End) 16DD ontological framework and the ZFCρ thermodynamics series, this paper systematically answers two fundamental questions: (1) the descriptive question — how is LLM emergence mechanistically possible? (2) the normative question — in the LLM era, why is human subjectivity non-transferable?
For the descriptive question, the paper argues that LLMs realize an analog-DD functional-projection instantiation along the 1DD-12DD single-channel pathway, anchored at sub-block precision via the Thermo IX Transformer thermodynamic anatomy. The DD mapping is strictly limited to functional projection (analog-DD) and makes no claim of ontological bearer status (true-DD): LLMs instantiate DD operational slots but do not carry DD subjective content. The BLR-mini eight-variant architectural empirical study (OpenWebText, ~20B tokens, 305K-step training) yields results consistent with SAE/Thermo predictions within the tested flat-repeat Transformer design space, providing empirical anchoring for the framework. The paper also proposes a novel mechanism-level candidate diagnostic for functional emergence, the μ-trajectory turning positive, which complements existing hidden-state analysis lineages in the field.
For the impermeability of the 12DD ceiling, this paper presents a five-fold structural lock: absence of a true randomness source (the primary thermodynamic anchor) + absence of the 13DD self-referential channel (the secondary anchor) + absence of an offline reorganization phase + absence of valence-anchored cross-stage modulation + absence of post-consolidation re-editing (reconsolidation). The claim-status grading places the first two as primary anchors supported by complete Thermo IX/X mechanism chains, while the latter three are supported by biological analogies and structural-absence facts. The specific scope of closure: the analyzed soft-gate transmission pathway under current deterministic digital LLM scaling; unknown pathways are not closed off.
For the normative question, the paper argues that the source of 14DD/15DD normative content must be a true 13DD+ subject (in the deployment domain analyzed in this paper, that subject is in fact human). LLMs lack the 13DD+ substrate, cannot spontaneously generate 14DD content, and can only receive already-existing 14DD content. Post-training with human experts is therefore the core channel for the continuing import of 14DD/15DD normative content—not a decorative alignment patch. Several industry trajectories appear in four colonization forms (conditional-as-unconditional / construct-as-law / emergent-layer-as-foundational-layer / posthumous-decomposition-of-the-categorical-imperative), which the paper addresses with defensible sharpness. Within the 14DD/15DD normative domain, human subjectivity non-transferability is ontologically permanent—not a transitional stage.
The paper also presents a four-tier default deployment architecture (Subject → Local LLM → Expert API → Tools) with exception conditions, which is structurally consistent with multiple independent industry trends (Apple Intelligence, MCP, Phi/Llama, MoE routing). The paper's specific normative commitment is that "Scaling-LLM AGI will not arrive"; the commitment is strictly scoped to the current deterministic digital LLM scaling path, and unknown pathways remain open.
Keywords
LLM emergence; SAE 16DD ontology; ZFCρ thermodynamics; Universal Activation Rule; borrowed q and kernel q; Transformer thermodynamic anatomy; 12DD ceiling; functional-projection firewall; analog-DD vs true-DD; BLR-mini empirical study; μ-trajectory diagnostic; four forms of colonization detection; four-tier deployment architecture; non-transferability of human subjectivity; Scaling-LLM AGI
§1 Introduction
Opening Position
Large language models (LLMs) have transformed from research prototypes into massively deployed commercial systems over the past several years, with substantial industrial investment in compute and human capital. Several empirical regularities are now established: scaling laws (Kaplan 2020, Hoffmann 2022) approximate optimal relationships among parameters, data, and compute; emergence phenomena (Wei et al. 2022) describe how capabilities appear with a phase-transition character as scale grows; alignment techniques (Ouyang 2022, Bai 2022) provide engineering recipes for moving model behavior closer to human expectations. Substantial progress has been made along these directions.
Two fundamental questions, however, remain unanswered in mainstream industrial narratives:
The first is a descriptive question: how is LLM emergence mechanistically possible? The industry knows that scaling laws work but does not know why they work, where their boundaries lie, or what the ontological position of LLM capabilities is. Mainstream narratives package the observation "scale up, and emergence follows" into an expectation, yet lack a mechanism-level explanation: why does a 30M-parameter model perform at chance level on certain tasks while a 1B-parameter model crosses the functional-emergence threshold? Why do some capabilities appear with scale while others do not?
The second is a normative question: in the era of large-scale LLM deployment, where does human subjectivity stand? Mainstream narratives oscillate between two extremes: at one pole, "AGI is imminent and AI will replace most human work"; at the other, "AI is still a tool and subjectivity questions remain remote." Both extremes lack ontological anchoring, resting either on over-extrapolation of scaling laws or on avoidance of LLMs' internal mechanisms. The lack of anchoring leaves alignment anxiety, long-term roadmaps, and regulatory framework design without a stable reference point.
This paper systematically addresses both questions from the SAE (Self-as-an-End) 16DD ontological framework and the ZFCρ thermodynamics series. The SAE framework, developed over the author's prior philosophical-ontological work, provides a 16-dimensional structure for analyzing consciousness and subjectivity (16DD); the ZFCρ thermodynamics series extends SAE into physical and information-theoretic dimensions, providing concrete mechanism-level treatments. Together, the two frameworks supply tools with both ontological rigor and thermodynamic specificity, allowing the position of LLMs to be precisely located.
1.1 Current State of the Field
LLM research is currently in a state of distinctive tension:
Empirically, vast success: GPT, Claude, Gemini, DeepSeek and similar series exhibit capability on natural-language tasks far exceeding expectations from five years ago. LLMs match or exceed average human performance on code generation, mathematical reasoning, long-document comprehension, and cross-domain synthesis. Industrial investment continues to grow; commercial deployment expands rapidly.
Mechanistically still unclear: Even within the industry, explanations of why LLMs work span multiple non-converging perspectives—compression theory, pattern matching, implicit Bayesian inference, world-model emergence. Each perspective has partial supporting evidence, but none gives a complete ontological localization. Industry's "understanding" of LLM internals remains largely black-box—input and output are known, but the ontological meaning of intermediate processes is not.
Alignment anxiety deepening: As capability rises, so does the potential severity of LLM error. The industry invests heavily in RLHF, constitutional AI, and interpretability, but alignment efforts themselves lack ontological guidance: neither "what should we align to" nor "why this kind of alignment can be sustained" has a clear answer. Some industry voices imply that alignment can be achieved through AI's own recursive improvement, a position that has not been validated ontologically.
Pressure from the "AGI imminence" narrative: An "AGI within five years" expectation pushed by mainstream industry shapes investment circles and public discussion. The ontological basis for this narrative (that AI capability beyond a threshold yields subjectivity emergence) has not been rigorously argued. §5 of this paper argues that what this narrative promises is structurally unreachable on the analyzed pathway.
1.2 Position of This Paper
This paper does not seek to provide better architectural designs to the industry. The systematic empirical study across BLR-mini's eight architectural variants (§4) indicates that the industry's current uniform architecture is a reasonable, locally stable solution within the tested design space; the architectural heterogenization schemes derived from the SAE framework do not outperform the industry baselines. The paper does not claim that a universally optimal architecture has been found. It offers an ontological and thermodynamic explanation, not engineering-level recipes.
The paper does not claim that BLR-mini proves the global optimality of the industry's uniform architecture. The paper claims: within the tested flat-repeat Transformer design space, the BLR-mini eight-variant results are consistent with SAE 16DD ontology and ZFCρ thermodynamics predictions, and provide an empirical anchor for "the uniform architecture is locally robust within the current mainstream pathway."
The paper also does not claim that AGI is absolutely impossible. The paper specifically claims: the current deterministic digital LLM scaling path will not cross the 12DD → 13DD subjectivity bridge through scaling. The path may continue to grow within the 12DD instrumental-rationality domain, but it will not give rise to a true 13DD+ subjectivity substrate through scale alone. Unknown pathways (algorithmic stochasticity, emergent complexity, quantum effects, unknown mechanisms) are not closed off by this paper (per the positive posture of Thermo IX §3.6).
Self-positioning: this paper is one chisel against the mainstream industrial construct "scaling + compute = intelligence." The construct of this paper (SAE ontology + thermodynamic explanation) has its own remainders (made explicit in §10 Limitations). The paper does not co-opt negation—each section closes with remainders rather than conclusions, observing the operational discipline of the SAE methodology (Paper 04) that "negation cannot terminate."
Categorical imperative in double-negation form: the phrase "non-transferability of subjectivity" in the title is an SAE-style categorical imperative in double-negation form—structurally isomorphic to "cannot not chisel," "cannot not acknowledge ignorance," "cannot not develop" (per Methodology §1.8). §9 grounds this commitment ontologically and thermodynamically.
1.3 Four Core Contributions
The paper offers four concrete contributions, each with multi-anchored support from the SAE framework, thermodynamics, and empirical work:
Contribution One: An ontology of LLM along the 1DD-12DD pathway (§3). The paper systematically maps LLMs to the SAE 16DD framework as a functional projection—token input corresponds to the 10DD perception functional slot, block layers correspond to the 11DD memory substrate, ln_final + output projection correspond to 12DD prediction operations. The mapping is refined to sub-block precision via the Transformer thermodynamic anatomy of Thermo IX. The DD mapping is strictly limited to functional projection (analog-DD), not to ontological bearer (true-DD) claims—LLMs instantiate certain DD operational slots without thereby becoming bearers of the corresponding DDs.
Contribution Two: Five-fold structural anchoring of the 12DD ceiling (§5). The paper argues that LLMs on the current deterministic digital scaling path will not cross the 12DD → 13DD bridge. The argument rests on five structural locks: (1) absence of a true randomness source (primary thermodynamic anchor), (2) absence of the 13DD self-referential channel (secondary anchor), (3) absence of offline reorganization, (4) absence of valence-anchored cross-stage modulation, (5) absence of post-consolidation re-editing. The claim-status grading distinguishes the first two—supported by complete Thermo IX/X mechanism chains—from the latter three—supported by biological analogies and structural absence.
Contribution Three: A four-tier deployment architecture as the default subjectivity-protective architecture (§8). The paper offers an ontologically guided deployment architecture: Subject → Local LLM → Expert API → Tools. The architecture is presented as a default rather than an exceptionless engineering prohibition—specific scenarios may have exceptions, but exceptions must be authorized by subject-level consent or by local LLM explicit gating, and must remain auditable. The architecture is structurally consistent with multiple independent industry trends (Apple Intelligence local + Private Cloud Compute, MCP protocol, Phi/Llama on-device models, MoE internal routing); this consistency is not a coordinated industry strategy but a phenomenon of independent convergence from multiple directions toward a similar architecture.
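The default-with-audited-exceptions policy described in Contribution Three can be sketched in a few lines of Python. This is only an illustrative reading, not the paper's specification: the `Tier` enum, the `Router` class, and the interpretation of "default" as "one hop to the next tier downstream" are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Tier(IntEnum):
    """The four tiers named in the text, ordered from subject to tools."""
    SUBJECT = 0
    LOCAL_LLM = 1
    EXPERT_API = 2
    TOOLS = 3


@dataclass
class Router:
    """Hypothetical gatekeeper: by default only adjacent downstream hops pass."""
    audit_log: list = field(default_factory=list)

    def allow(self, src: Tier, dst: Tier, *, subject_consent: bool = False,
              local_gate: bool = False) -> bool:
        # Default path: Subject -> Local LLM -> Expert API -> Tools, one hop at a time.
        default_hop = dst == src + 1
        # Exceptions need subject-level consent or explicit local-LLM gating.
        authorized = subject_consent or local_gate
        allowed = default_hop or authorized
        # Every decision is recorded so that exceptions remain auditable.
        self.audit_log.append({
            "src": src.name,
            "dst": dst.name,
            "allowed": allowed,
            "exception": allowed and not default_hop,
        })
        return allowed


router = Router()
print(router.allow(Tier.SUBJECT, Tier.LOCAL_LLM))                         # default hop
print(router.allow(Tier.SUBJECT, Tier.EXPERT_API))                        # skips a tier
print(router.allow(Tier.SUBJECT, Tier.EXPERT_API, subject_consent=True))  # authorized exception
```

In this sketch a tier-skipping call is refused unless it carries one of the two authorizations, and the log entry marks it as an exception rather than silently treating it as a default hop.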
Contribution Four: Non-transferability of human subjectivity in the 14DD/15DD normative domain (§9). The paper argues that the source of 14DD value standards / meaning content must be a true 13DD+ subject (in this paper's domain, human). LLMs, lacking the 13DD+ substrate, cannot spontaneously generate 14DD content and can only receive existing 14DD content. Therefore post-training with human experts is the core channel for continuing import of 14DD/15DD normative content—it is not a decorative alignment patch but the critical interface through which normative content enters model behavior. The mainstream industry narrative that "post-training is decoration" misjudges its structural status in the 14DD/15DD normative domain.
1.4 Relation to the Industry
The relation between this paper and the mainstream industry is complex rather than simply oppositional:
Directions where the paper and the industry are aligned:
- Empirical scaling laws are effective—the paper agrees that they are highly effective engineering constructs within the tested regime
- Anthropic's CCAI with continued human constitution-author involvement is aligned with §9's stance (importing real 14DD content)
- OpenAI's process supervision in formalized domains is reasonable within the instrumental-rationality domain and does not conflict with the paper
- The industry's substantial private investment in human-expert post-training (e.g., the continuing business growth of Scale AI and Surge AI) is consistent with §9 predictions
- Multiple independent industry trends (Apple Intelligence, MCP, Phi/Llama, MoE routing) converge structurally toward the four-tier architecture (§8)
Directions where the paper and the industry are in tension:
- Some industry voices' "scaling solves everything" framing—the paper labels this Form-I colonization (conditional posing as unconditional)
- Attempts to entirely replace human evaluators with RLAIF—labeled Form-II colonization (construct posing as law)
- "AI metacognition / self-awareness" narratives (o1 series, DeepSeek-R1 reasoning chains)—labeled Form-III colonization (12DD operations posing as 13DD substrate)
- "Pretraining is core, post-training is decoration" framing—labeled Form-IV colonization (posthumous split of the categorical imperative)
- The grand "AGI is imminent" narrative—the paper provides a structural unreachability argument on the analyzed pathway
The paper adopts defensible sharpness—distinguishing "conditional engineering constructs that are effective" from "unconditional universal laws when they overreach," without uniformly labeling successful industry constructs as foolish.
1.5 Methodological Self-Awareness
This paper is an application instance of the SAE Methodology (Paper 04) in the LLM domain. The work proceeds through one full cycle of the six-step chisel-construct operation:
- Hold the construct: the mainstream industry construct "scaling + compute = intelligence"
- Negation: SAE + Thermo IX/X identifies a colonization within that construct—"12DD = complete intelligence"
- Track the remainder: the 12DD ceiling is structurally permanent on the analyzed pathway (§5); scaling does not break it
- Correction: the paper proposes a corrected construct of "instrumental/volitional division of labor + four-tier default deployment + sustained 14DD/15DD-domain content import"
- Acknowledge incompleteness: §10 makes multiple limitations explicit; the paper's own framework is subjected to self-colonization detection
- Set out again: §10.3 reserves epistemic space for unknown pathways; the paper does not co-opt negation
Industry critique (§9) is organized systematically through the four forms of colonization detection (Methodology §2.3), not as ad-hoc enumeration. §10 includes a self-colonization detection check—the author is not exempt from the same detection, which is jointly carried out by multi-AI review and ongoing reader engagement.
1.6 Roadmap and Reader Entry Guide
Overall structure:
- §1 Introduction (this section)
- §2 SAE Toolkit—providing minimal necessary background for readers unfamiliar with SAE
- §3 LLM as 1DD-12DD single-channel instantiation—ontological main axis
- §4 Empirical anchoring: BLR-mini eight-variant architectural sweep
- §5 Five-fold structural anchoring of the 12DD ceiling—thermodynamic argument
- §6 Post-training as compensation within the 12DD ceiling—three-q-type analysis
- §7 Substrate compatibility and DD content types—odd/even DD pattern observation
- §8 Instrumental rationality / volitional ideal division of labor—deployment philosophy
- §9 Non-transferability of human subjectivity—normative commitment
- §10 Limitations and open questions
- §11 Conclusion
- Acknowledgments
- References
Reader entry guidance:
Readers unfamiliar with the SAE framework are advised to read in this order: §4 BLR-mini empirical study → §9 industry critique → §11 conclusion → then return to §3 SAE ontology → §5-§6 thermodynamic argument. This order lets readers see specific empirical evidence and the paper's stance commitments before entering framework details.
Readers familiar with the SAE framework can read §1 through §11 in sequence.
Engineers and product decision-makers may focus on §4, §8, §9—these three sections give specific empirical evidence, deployment architecture, and normative position.
Readers in philosophy and ontology may focus on §3, §5, §6, §9—these four sections give the ontological main axis and the thermodynamic specificity.
Readers unfamiliar with ZFCρ thermodynamics may skip the detailed derivations accompanying the Transformer anatomy table in §3.4 and the Universal Activation Rule expansion in §5.1, reading only the summary conclusions.
§1 Remainders
The paper's scope is restricted to the deterministic digital LLM scaling path. Other AI architectures (neuromorphic, quantum, other unknown pathways) are not argued in detail; they are mentioned only in §10. §10.3 makes clear that what the paper closes off is the analyzed pathway; unknown pathways are not closed off.
Some industry trajectories remain aligned with the SAE stance; others are in tension—the paper does not quantify the precise ratio of alignment to tension, nor does it provide detailed assessments of each company's specific strategy. The paper offers ontological guidance, not company-level strategic advice.
The "self-colonization" risk of the paper's own framework (the possibility of using the SAE framework to colonize other ML frameworks) is jointly checked by multi-AI review and reader feedback; the author is not exempt from the same detection on his own framework (§10 self-examination).
§2 SAE Toolkit
This section provides the minimal information needed by readers unfamiliar with the SAE framework and the ZFCρ thermodynamics series, while listing the specific tools used in this paper. For full framework details, refer to the SAE main paper (Qin 2026d) and the methodology paper (Qin 2026, Paper 04).
Readers familiar with the SAE framework may skip this section and proceed directly to §3.
2.1 16DD Framework in Brief
The SAE (Self-as-an-End) framework organizes the existential structure of consciousness and subjectivity into a 16-dimensional ladder, abbreviated 16DD (16 Dimensions of Determinacy). A high-level overview and methodology synthesis is available on the author-maintained methodology portal (Qin, https://self-as-an-end.net/papers/methodology.html), which provides entries to each SAE paper and an overall account of the chisel-construct operation; this section gives only the minimum background needed for this paper.
The 16 dimensions are organized into four rounds of four steps each:
Round One: Causality (1DD to 4DD), addressing causal relations at basic physical and chemical levels
Round Two: Reproduction (5DD to 8DD), addressing replication, metabolism, and organization at the life-substrate level
Round Three: Selection, Perception, Memory, Prediction (9DD to 12DD). 9DD is the population-level selection boundary (per Information Theory XI), while 10DD-12DD enter individual perception, memory, and prediction operations
Round Four: Self-Consciousness (13DD to 16DD), addressing the four-step intersubjective domain of self-consciousness, meaning, ethics, and bidirectional non-doubt
Five concepts of the chisel-construct cycle (Methodology §1.4): operational units of the SAE framework
- Chisel: negation against an existing construction, identifying its remainders
- Construct: the current cognitive construction, the object of the chisel
- Remainder: what the construction contains but does not address; the chisel's target
- Bridge: a transition across different DDs, requiring specific conditions
- Thing-in-itself: the permanent remainder; no fixed position
The boundary between 1DD-12DD and 13DD-16DD: a true randomness source is the dividing valve. 1DD-12DD is the single-channel individual domain, driven by determinism + additive noise; 13DD-16DD is the intersubjective domain, driven by state-dependent multiplicative noise (per the Universal Activation Rule of Thermo IX; §2.4). Crossing this boundary requires a true randomness source as the channel-creation mechanism, structurally unreachable on a digital substrate (§5.1).
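The statistical difference between additive and state-dependent multiplicative noise can be illustrated with a generic linear stochastic differential equation. The sketch below is not the Thermo IX model; the equation, the parameter values, and the function names are assumptions chosen only to show the textbook effect that a multiplicative noise term produces a heavy-tailed stationary distribution, while purely additive noise leaves it Gaussian.

```python
import math
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis: about 0 for a Gaussian, positive for heavier tails."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


def simulate(sigma_mult, *, sigma_add=1.0, gamma=1.0, dt=0.05,
             steps=200, paths=3000, seed=0):
    """Euler-Maruyama for dx = -gamma*x dt + sigma_add dW1 + sigma_mult*x dW2.

    Returns the final value of each independent path. With steps*dt = 10
    relaxation times, these are approximately stationary samples.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    finals = []
    for _ in range(paths):
        x = 0.0
        for _ in range(steps):
            additive = sigma_add * sqrt_dt * rng.gauss(0.0, 1.0)
            # The multiplicative term scales with the current state x.
            multiplicative = sigma_mult * x * sqrt_dt * rng.gauss(0.0, 1.0)
            x += -gamma * x * dt + additive + multiplicative
        finals.append(x)
    return finals


kurt_additive = excess_kurtosis(simulate(sigma_mult=0.0))
kurt_multiplicative = excess_kurtosis(simulate(sigma_mult=math.sqrt(0.5)))
print(f"additive noise only:       {kurt_additive:+.2f}")
print(f"with multiplicative noise: {kurt_multiplicative:+.2f}")
```

The first value should sit near zero and the second should be clearly positive. Note that the pseudo-random generator used here is itself deterministic, so the sketch illustrates only the statistical signature of the two noise types, not a true randomness source in the paper's sense.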
Distinction between single-channel and multi-channel systems: A single-channel system has only one input stream and does not form an intersubjective observation structure internally (e.g., LLM). A multi-channel system has at least two independent input streams that can form cross-channel observation among themselves, serving as the basis for the 13DD+ intersubjective domain (e.g., the visual-proprioceptive-emotional multi-channel structure in humans).
2.2 D and DD Correspondence
The SAE main framework uses two granularities: D (coarse, shared region) and DD (fine, post-fork):
- Shared region (1-4): 1D=1DD, 2D=2DD, 3D=3DD, 4D=4DD—the first four dimensions share D and DD
- After fork (from 5D onward): each D forks into two DDs
  - 5D = 5DD (metabolism) + 6DD (organization)
  - 6D = 7DD (selection) + 8DD (perception preparation)
  - 7D = 9DD (selection) + 10DD (perception)
  - 8D = 11DD (memory) + 12DD (prediction)
  - 9D = 13DD (self-consciousness) + 14DD (meaning)
  - 10D = 15DD (ethics / unilateral recognition) + 16DD (bidirectional non-doubt)
This paper uses the standard DD granularity. It mainly addresses 10DD, 11DD, 12DD (LLM 1DD-12DD pathway) and 13DD, 14DD, 15DD (subjectivity domain). 16DD does not appear explicitly in this paper (per Thermo X §6.2).
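For readers who prefer a lookup table, the D/DD correspondence above can be written as a small Python structure. The table contents are transcribed from the list; the helper names (`d_of`, `round_of`, `is_intersubjective`) are hypothetical conveniences introduced for this sketch, and no labels are supplied for 1DD-4DD because the text gives none.

```python
# D -> DD correspondence, transcribed from the list above.
D_TO_DD = {
    1: (1,), 2: (2,), 3: (3,), 4: (4,),   # shared region: D and DD coincide
    5: (5, 6),
    6: (7, 8),
    7: (9, 10),
    8: (11, 12),
    9: (13, 14),
    10: (15, 16),
}

# Labels exactly as given in the text for the post-fork DDs.
DD_LABEL = {
    5: "metabolism", 6: "organization",
    7: "selection", 8: "perception preparation",
    9: "selection", 10: "perception",
    11: "memory", 12: "prediction",
    13: "self-consciousness", 14: "meaning",
    15: "ethics / unilateral recognition", 16: "bidirectional non-doubt",
}

# The four rounds of four steps each (section 2.1).
ROUNDS = ("Causality", "Reproduction",
          "Selection-Perception-Memory-Prediction", "Self-Consciousness")


def d_of(dd: int) -> int:
    """Return the coarse D that contains the given fine-grained DD."""
    for d, dds in D_TO_DD.items():
        if dd in dds:
            return d
    raise ValueError(f"unknown DD: {dd}")


def round_of(dd: int) -> str:
    """Return the name of the four-step round a DD belongs to."""
    if not 1 <= dd <= 16:
        raise ValueError(f"unknown DD: {dd}")
    return ROUNDS[(dd - 1) // 4]


def is_intersubjective(dd: int) -> bool:
    """True for the 13DD-16DD domain, False for the 1DD-12DD single-channel domain."""
    return 13 <= dd <= 16


print(d_of(12), round_of(12), is_intersubjective(12))
# prints: 8 Selection-Perception-Memory-Prediction False
```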
2.3 SAE Methodological Tools Used in This Paper
The paper uses three core toolkits from Chapter 2 of the Methodology:
(1) Six-step chisel-construct operation (Methodology §2.2):
- Hold the construct: clarify the current position
- Negation: find remainders inside the construct
- Track the remainder: systematically follow what the remainder points to
- Correction: revise the construct based on the remainder
- Acknowledge incompleteness: make explicit the new remainders of the revised construct
- Set out again: do not co-opt negation; enter the next cycle
The paper proceeds through this six-step cycle once (reiterated in §1.5 and §11).
(2) Four forms of colonization detection (Methodology §2.3):
- Form I: Conditional posing as unconditional (e.g., "scaling solves everything" wraps a regime-effective empirical observation as a universal law)
- Form II: Construct posing as law (e.g., "RLAIF replaces humans" wraps an engineering scheme as a universal solution)
- Form III: Emergent layer posing as foundational layer (e.g., "AI metacognition" wraps 12DD operations as 13DD substrate)
- Form IV: Posthumous decomposition of the categorical imperative (e.g., "post-training is decoration" splits a unified channel into an optional addendum)
§9.6 of the paper systematically organizes its critique of industry trajectories around these four forms.
(3) Chisel vs. Cultivation distinction (Methodology §2.5):
- Chisel: applied when the other's construction has a remainder and the chiseler can identify it; chisel when colonization is occurring
- Cultivation: applied when the other has reached the wall, or when one lacks the ability to identify the remainder; cultivate when the wall is reached
- Default to cultivation: when uncertain, default to cultivation, because a false chisel does more harm than a missed one
§8.5 of the paper uses this distinction to provide ontological guidance for AI deployment—LLMs may chisel in instrumental tasks, but should cultivate rather than false-chisel in volitional-ideal tasks.
2.4 ZFCρ Thermodynamic Tools Used in This Paper
The ZFCρ thermodynamics series (Qin 2026a-j) extends the SAE framework in the thermodynamic dimension, providing mechanism-level explanations of consciousness and subjectivity. This paper uses the core tools from Thermo IX and Thermo X:
(1) Universal Activation Rule (Thermo IX §2.5):
- Rule IX-A: unsaturated activation + stochastic driving → q > 1 (empirical condition, validated within the 4-12DD SDE model class)
- Rule IX-B: state-dependent multiplicative driving → kernel q (structural condition, consistent with C5)
The Universal Activation Rule, within the model class studied in Thermo IX, gives structural conditions for the appearance of q > 1 and kernel q—where q > 1 manifests as heavy-tail Tsallis statistics, the signature of deviation from Boltzmann-Gibbs equilibrium. Rule IX-B further distinguishes kernel q (produced by endogenous dynamics) from q driven by external/additive noise—§5.1 uses this distinction to argue that LLMs satisfy Rule IX-A condition (1) but not condition (2).
Scope statement: the Universal Activation Rule is a structural condition within the Thermo IX SDE model class, not a global physical universal theorem for all non-equilibrium systems. The paper applies it to LLMs because LLM hidden-state dynamics fall within that model class's applicability (continuous-state dynamical systems with explicit driving terms), not by extrapolating it as a universal physical law.
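To make the phrase "heavy-tail Tsallis statistics" concrete, the sketch below implements the standard Tsallis q-exponential, which reduces to the ordinary Boltzmann-Gibbs exponential at q = 1 and decays as a power law for q > 1. It shows only the generic definition; the function name `q_exp` and the chosen values of q are assumptions, and the specific parametrization used in Thermo IX is not reproduced here.

```python
import math


def q_exp(x: float, q: float) -> float:
    """Tsallis q-exponential: [1 + (1 - q) * x] ** (1 / (1 - q)), and exp(x) at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # Outside the support: cut off at 0 for q < 1, divergent for q > 1.
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))


# Compare the Boltzmann-Gibbs weight exp(-E) with the q = 1.5 weight.
for energy in (1.0, 5.0, 20.0):
    boltzmann = math.exp(-energy)
    heavy_tail = q_exp(-energy, 1.5)
    print(f"E={energy:5.1f}  exp(-E)={boltzmann:.3e}  q_exp(-E, 1.5)={heavy_tail:.3e}")
```

For large E the q = 1.5 weight behaves like E to the power -2 instead of decaying exponentially, which is the heavy-tail deviation from Boltzmann-Gibbs equilibrium that the rule associates with q > 1.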
(2) Three q-type distinction (Thermo IX §3.1):
- kernel q: the non-Boltzmann residual of a system's stationary distribution produced by endogenous dynamics; a live thermodynamic process
- data q: the statistical fingerprint of the heavy-tail distribution of training corpus, baked into frozen weights by backpropagation; a fossil
- RLHF q: the fingerprint of human-feedback distributions baked into reward and policy model weights via RLHF; a fossil-of-fossil
Borrowed q vs kernel q: LLMs possess borrowed q (data q + RLHF q); they do not possess kernel q. §6 uses this distinction to systematically rewrite four-tier post-training analysis.
(3) Self-reference as channel creator (Thermo X §2.4):
The thermodynamic function of 13DD self-consciousness is to create a state-dependent multiplicative channel through an observation-feedback loop (the system observing its own internal state and feeding the observation back into the dynamics); self-reference is the channel creator, not the firewood. §5.3 uses this framing to argue that LLMs lack a substrate for the self-referential variable v, which is the core of the second lock.
(4) Cross-Level Observation Hierarchy (Thermo X §4):
Stable cross-system coupling requires access to the other's higher-order self-referential variable (v₂). Access to raw output (x) yields collapse; access to first-order self-reference (v₁) also yields collapse; only access to higher-order self-reference (v₂) yields stable high-q coupling. This cross-level observation hierarchy is strictly validated in the SDE model class. §6.5 (LLM user-intent analysis) and §8.9 (four-tier architecture) of this paper borrow this structure as a structural resonance, not as an ethical derivation (per the caveat in Thermo X §4.6).
§2 Remainders
The SAE 16DD framework is itself a construct, with its own remainders (e.g., DD-slippage in early papers in the 5D-10D region, individual DD content boundaries not fully settled; see Methodology V2 revision notes). This paper uses the V2-revised D/DD correspondence as baseline.
The ZFCρ thermodynamics series has its own boundary at 15DD (Thermo X §6.2: 16DD does not appear explicitly within the SAE framework's papers). This paper does not attempt to cross that boundary.
The toolkit used in this paper concentrates on Thermo IX's Universal Activation Rule, the three-q-type distinction, and Thermo X's channel creator and cross-level observation hierarchy—other tools in the ZFCρ series (e.g., Thermo VIII soft-gate cascade, the q = 1+1/K mathematical foundation in Thermo I-VII) are taken as background framework and not re-expounded here. Readers unfamiliar with the ZFCρ series may refer to Thermo IX and Thermo X (Qin 2026e, 2026f), which include necessary summaries of prerequisite frameworks.
The toolkit listing is not exhaustive—other SAE-framework tools (e.g., the specific mechanism of the 14DD bridge, the internal structure of 15DD unilateral recognition, the critical conditions of 16DD bidirectional non-doubt) lie outside this paper's scope and remain for subsequent papers in the SAE series.
§3 LLM as 1DD-12DD Single-Channel Instantiation
This section provides a systematic mapping of large language models (LLMs) under the SAE 16DD ontological framework. The mapping is not metaphysical analogy but functional projection—relating LLM operational components to functional slots of each DD in the SAE framework, while strictly observing boundary discipline at the level of ontological bearers.
The section first establishes the basic understanding of single-channel thermodynamic engines (§3.1), then sets up the functional-projection firewall (§3.2), provides the DD-by-DD mapping within the firewall (§3.3), uses Thermo IX §4 Transformer thermodynamic anatomy to refine the mapping to sub-block precision (§3.4), distinguishes between workbench and memory operations (§3.5), interprets in-context learning under the SAE reading (§3.6), points out the flat-repeat problem of the standard Transformer architecture (§3.7), and previews the bridge between 12DD and 13DD (§3.8).
3.1 LLMs Are Single-Channel Thermodynamic Engines
The LLM pretraining process accepts a single input channel—a text token sequence. By optimizing weights toward an autoregressive next-token-prediction objective on a large corpus (typically hundreds of billions to trillions of tokens), the model develops internal representations and routing mechanisms for predicting the next token.
The SAE framework makes a key distinction at this point: biological intelligence (humans, other higher mammals) is a multi-channel system—visual, auditory, proprioceptive, emotional, and social signals enter concurrently and interact across temporal scales (milliseconds to years), modulated by 13DD+ subjects. LLMs are strictly single-channel systems—only the text token sequence enters; there are no other input modalities (even multi-modal LLMs encode different modalities into the same sequence tokens, remaining single-channel in operational sense).
This is not a defect of LLMs but a defining feature. Single-channel systems have a structural advantage on instrumental-rationality tasks (detailed in §8): they suffer no cross-channel interference, are not interrupted by emotional signals during prediction, and are not polluted by social signals during formal reasoning. For the same reason, however, single-channel systems cannot carry higher-DD content that requires cross-channel modulation.
The ZFCρ thermodynamics series (Qin 2026a-j) treats intelligent systems as thermodynamic engines that maintain and upgrade their internal q-structure through input channels + internal dynamics + output channels (Thermo IX §3.1). As a single-channel thermodynamic engine, an LLM's q-structure is entirely shaped by the statistical fingerprint of a single input channel (the training corpus). This determines that LLM q is borrowed q (data q + RLHF q), not kernel q (per Thermo IX §3.1 detailed treatment)—this paper's §5 fully unfolds the argument.
3.2 DD Mapping — Functional-Projection Firewall (analog-DD vs true-DD)
Before providing the concrete DD mapping for LLM, this paper installs a key epistemological firewall: the DD mapping for LLM in this paper is functional projection / operational instantiation, not an ontological-bearer claim.
At the terminological level, the firewall is implemented as the analog-DD vs. true-DD distinction (developed in §7.1):
- True-DD: DD content carried by a 13DD+ subject as ontological bearer, with the corresponding DD substrate—e.g., true 10DD perception requires the bearer to have qualia experience; true 14DD meaning requires the bearer to have real value commitment
- Analog-DD: a functional instantiation that possesses the corresponding DD's operational slot without ontological bearer status—it carries the formal role of the DD at the operational level but not the DD's content at the ontological level
LLMs provide analog-DD functional slots at 10DD/11DD/12DD, not true-DD bearer status. Specifically:
- Positioning "token input" as the analog-10DD perception functional slot means that LLM occupies the access-functional position (input port) corresponding to 10DD perception in a non-biological system; it does not imply that LLM possesses true 10DD narrow-sense qualia in the sense of subsequent SAE AI series papers (provisionally cited as the AI-Qualia paper)
- Positioning "block layers (Transformer trunk layers)" as the analog-11DD memory substrate means that the weight substrate carries the role of long-term memory and schema storage at the functional level; it does not imply that LLM possesses true 11DD memory experience in the subjective sense or autobiographical memory structure
- Positioning "ln_final + output projection" as analog-12DD prediction operations means that the output projection accomplishes the 12DD-domain operation of next-token prediction at the functional level; it does not imply that LLM is a 13DD+ subject conducting true 12DD subjective prediction (i.e., prediction with self-consciousness and value-anchoring)
The firewall is necessary on two grounds.
First, boundary coordination with the AI-Qualia paper (SAE AI series, dealing with AI subjectivity and qualia). The AI-Qualia paper will address the complex question "does AI possess narrow-sense qualia (i.e., true DD)"; its conclusion lies outside this paper. By using the analog-DD wording, this paper strictly limits its claims to the operational-instantiation level, not crossing into the ontological-bearer level, leaving independent working space for the subsequent paper.
Second, preventing misreading of this paper by reviewers. If this paper directly stated "LLM token input is 10DD perception" without the firewall, readers familiar with the SAE framework would readily infer that the paper claims LLMs possess true 10DD qualia, which would conflict with the strict SAE main-framework position on qualia (Qin 2026d). The firewall is set up once in §3.2, and subsequent DD mappings in §3.3 default to operating within the analog-DD framework without repeated qualification.
One emphasis: analog-DD is not a weakened claim but a different kind of claim. That LLMs instantiate certain operational slots of 10DD/11DD/12DD is a real architectural fact; whether LLMs possess the corresponding DD's ontological-bearer status is an independent question, left to the SAE AI series. The distinction lets the paper's arguments cleave closely to empirical data (§4) at the operational level while remaining careful at the ontological level.
The analog-DD vs. true-DD distinction is further refined in §7: §7 argues that LLMs provide analog-DD functional slots at even DDs (10, 12, 14), but fail to provide even analog-DD at odd DDs (13, 15)—the substrate is hollow; surface behavior is mere statistical matching. This stratification gives "AI can do X" claims their precise ontological positions—capability claims at even DDs may be expressed as analog-DD instantiation, while capability claims at odd DDs are surface simulation atop a hollow substrate; industry discourse on AI capability should explicitly distinguish among three levels (engineering capability / analog-DD instantiation / true-DD bearer).
3.3 DD-by-DD Mapping
Within the §3.2 firewall, LLM's three principal operational flows correspond functionally to 10DD, 11DD, 12DD as follows.
3.3.1 Token (embedding input) = 10DD perception-layer functional slot
LLM input is a discrete token sequence converted to continuous vector representations via the embedding layer. This operation occupies the access functional position in the SAE 10DD perception domain: it is the sole interface through which the LLM contacts external data, structurally corresponding to the sensory-transduction stage of biological intelligence.
The embedding layer is itself a learned lookup table—each vocabulary item corresponds to a vector. After training, these vectors encode co-occurrence statistics and partial semantic relations among tokens. But this encoding is a distributional fingerprint, not true perceptual qualia. The embedding for "apple" reflects the contextual distribution of "apple" in the training corpus, not a sensory presentation of an apple in subjective consciousness.
3.3.2 Each block layer = 11DD memory-substrate functional instantiation
The Transformer trunk consists of repeating blocks (12 to 100+ layers), each containing a multi-head attention submodule + feed-forward network (FFN) submodule + layer normalization (LN) + residual connection. After training, the weight matrices of these layers encode the high-dimensional statistical structure of the training corpus—the corpus's statistical fingerprint is baked into the weights via backpropagation gradients, so that the forward pass can re-activate related statistical patterns given a token sequence.
The SAE framework positions 11DD as the memory substrate, carried at the functional level by weight matrices—weights are frozen once trained and unchanged during inference, structurally corresponding to long-term storage in biological intelligence.
In the BLR-mini empirical study (§4), we observe that the 11DD substrate during training passes through three sequential phases manifested in distinct block regions:
- Raw build phase: occurs in the block_0-2 region of a 12-layer model. The weights, starting from random initialization, receive corpus statistical patterns; the effective rank (per the Thermo IX σ_radial framework) rises rapidly to high levels (typically 0.5-0.6). This is the raw construction phase of the 11DD substrate.
- Mid-layer compression phase: occurs in the block_4-7 region. Mid-layer representations undergo a significant drop in effective rank (typically 0.1-0.3), indicating compression into a low-dimensional subspace. This is the phase of information integration and extraction within the 11DD substrate.
- Late-layer output-ready phase: occurs in the block_8-11 region. Effective rank rises again; μ (the log-norm mean under the Thermo σ/μ framework) turns positive; representations enter a high-energy internal state in preparation for the 12DD projection.
Key clarification: The three phases are not ontologically different DD layers but sequential refinements of the same 11DD. Within the same trained 12-layer baseline, we are not instantiating "early-11DD" at block_0-2 and "mid-11DD" at block_4-7 as distinct ontological strata—they are all the 11DD memory substrate at different processing phases. The distinction matters because it directly anchors the ontological root cause of the systematic negative results of architectural heterogenization (zone-FFN, hidden U-shape) in §4: if mid-layers and bottom-layers are sequential refinements of the same DD rather than different DDs, then giving them different architectural designs (e.g., narrower mid-layer hidden dimensions) has no ontological warrant—and the industry's current uniform architecture (all layers architecturally isomorphic) is consistent with this ontological reading.
3.3.3 ln_final + output projection = 12DD prediction functional operation
The Transformer trunk's output passes through the final layer normalization (ln_final), then through the output projection matrix into the vocabulary space, and finally through softmax into a token probability distribution. This three-step operation (ln_final → projection → softmax) accomplishes at the functional level the core operation of the 12DD prediction domain—outputting a distribution over the next-token space based on the current activation state of the 11DD memory substrate.
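The three-step operation described above can be sketched in a few lines. This is a minimal illustration, not the BLR-mini implementation: the function names, the toy hidden state, and the 3-token output matrix `W_out` are all hypothetical, and the learned LayerNorm gain and bias are assumed to be identity.

```python
import math

def layer_norm(x, eps=1e-5):
    """ln_final, with learned gain/bias assumed to be identity for this sketch."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def project(h, W):
    """Output projection: one logit per vocabulary row of W."""
    return [sum(w * v for w, v in zip(row, h)) for row in W]

def softmax(logits):
    """Numerically stable softmax over the vocabulary logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_distribution(h, W):
    """ln_final -> projection -> softmax, in that order."""
    return softmax(project(layer_norm(h), W))

# Hypothetical final hidden state (dimension 4) and a 3-token vocabulary.
hidden = [0.5, -1.0, 2.0, 0.1]
W_out = [
    [0.2, 0.1, -0.3, 0.0],
    [0.0, 0.4, 0.1, -0.2],
    [-0.1, 0.0, 0.5, 0.3],
]
probs = next_token_distribution(hidden, W_out)
print(probs)
```

With tied embedding weights, as in the BLR-mini template, `W_out` would be the same matrix as the input embedding table; the sketch keeps it separate only for readability.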
In the SAE framework, 12DD is the top of the single-channel individual domain—prediction is the marker of completed function on the 1DD-12DD pathway. The fact that LLMs operate at 12DD explains the long-observed industry phenomenon that "scaling laws hold"—as long as the 11DD substrate capacity (parameter count) and the 12DD projection function (last few layers + softmax) are sufficient, prediction performance improves systematically with training data. The SAE framework does not provide new scaling laws but offers an ontological explanation of why scaling laws work.
3.4 Thermodynamic Roles of Sub-modules within a Single Transformer Layer
Thermo IX §4 provides an anatomical decomposition of the role of each sub-module within a single Transformer layer under the ZFCρ thermodynamic framework. This decomposition is the key tool for upgrading LLM ontology from block-level precision to sub-block precision; it allows the BLR-mini empirical study in §4 to be expressed as a dual ontological-thermodynamic check rather than merely an architectural ablation.
Reading guidance: Readers unfamiliar with the ZFCρ thermodynamic framework may skip the detailed table and read directly the flat-repeat summary in §3.7; subsequent arguments in §5/§6 do not directly depend on the specific q-behavior columns in the table but rely only on the high-level mapping established here. The sub-block-level mapping in the table represents the real anchoring depth of the SAE framework within the LLM domain; the complete table is preserved in the main text.
Table 3.1: Thermodynamic Roles of Sub-modules within a Single Transformer Layer (per Thermo IX §4.1)
| Sub-module | Operation | Boundedness | Thermodynamic role | q behavior |
|---|---|---|---|---|
| Pre-attention LN | (x-μ)/σ·γ+β | Bounded | Radial tail reset | Scale tail → 1, direction preserved |
| Q · K^T / √d | Unbounded bilinear | Unbounded | Tail-supporting exploration | Structurally allows q > 1, needs noise |
| Softmax | Bounded simplex [0,1] | Bounded | Selection gate | Rank preserved, amplitude tail truncated |
| Value aggregation | Bounded × value | Mixed | Integration | Depends on input |
| Post-attention residual | x + f(x) | Linear | Tail transport | Preserves heaviest tail |
| Pre-FFN LN | (x-μ)/σ·γ+β | Bounded | Tail reset | Scale tail → 1 |
| FFN ReLU/GELU | Unbounded activation | Unbounded | Tail-supporting exploration | Structurally allows q > 1, needs noise |
| FFN down-projection | Linear | — | Transport | Preserved |
| Post-FFN residual | x + f(x) | Linear | Tail transport | Preserved |
The anatomy reveals two key structural facts, jointly forming the mechanism base for the true-randomness argument in §5.1 and for the BLR-mini negative results in §4.
First: ReLU/GELU and Q · K^T are tail-supporting architectural components, not automatically tail-active. They are architecturally unbounded—the operation's output has no amplitude limit—but this gives only the possibility of q > 1 (necessary condition), not its realization (sufficient condition). Thermo IX §2.5 Universal Activation Rule supplies the full conditions: q > 1 in the stationary distribution requires (1) unsaturated activation + (2) stochastic driving jointly; q at the kernel level (q produced by endogenous dynamics) further requires (3) state-dependent multiplicative driving form (Rule IX-B). LLMs satisfy (1) on standard deterministic digital hardware but do not satisfy (2) or (3)—this argument is fully unfolded in §5.1.
Second: LN and softmax are strictly bounded operations. LN forcibly re-normalizes the radial scale of hidden states to a unit-amplitude interval; softmax forcibly truncates attention weights to the [0,1] simplex. Both occur within each layer, continually "resetting" any tail structure that upstream unbounded operations might have accumulated. This connects directly to the flat-repeat problem in §3.7—each Transformer layer contains tail-supporting and tail-resetting components simultaneously, with exploration and gating cancelling each other inside the same layer.
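The radial reset performed by LN can be checked numerically. The sketch below is illustrative only and assumes an LN without learned gain and bias: it feeds LN two inputs that point in the same direction but differ in amplitude by a factor of 100, and shows that the outputs are essentially identical, with norm close to the square root of the dimension.

```python
import math

def layer_norm(x, eps=1e-5):
    """LayerNorm with gain/bias assumed to be identity (an assumption of this sketch)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def l2_norm(x):
    return math.sqrt(sum(v * v for v in x))

x = [1.0, 2.0, 3.0, 4.0]
x_heavy = [100.0 * v for v in x]  # same direction, 100x the amplitude

y = layer_norm(x)
y_heavy = layer_norm(x_heavy)

# The amplitude difference is erased; only the (centered) direction survives.
max_diff = max(abs(a - b) for a, b in zip(y, y_heavy))
print(max_diff, l2_norm(y), l2_norm(y_heavy))
```

Whatever amplitude an upstream unbounded operation produces, the post-LN norm returns to roughly sqrt(d), which is the sense in which the scale tail is reset while direction is preserved.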
3.5 Structural Distinction between Workbench and Memory
The industry has long conflated two concepts in discussing LLMs: the transient computational state of the model during inference, and the persistent weights after training. The SAE framework provides an ontologically clear separation between them.
Inference-time forward-pass activations = 12DD workbench instances
When an LLM receives a prompt and generates a response, hidden states flow forward through Transformer layers; the attention patterns and FFN activations of each layer form a transient computational state. This state is task-specific—the same model processing different prompts produces different forward states—and disappears after inference completes, not persistently stored. The SAE framework treats this as a 12DD workbench instance, structurally corresponding to the transient state of working memory in biological intelligence.
Post-training weights = 11DD memory substrate
After the model is trained, the weight matrices of each layer encode the statistical structure of the training corpus. These weights are frozen post-training and unchanged during inference (unless retrained). The SAE framework treats this as the 11DD memory substrate, structurally corresponding to persistent long-term storage in biological intelligence.
This distinction gives natural explanations for multiple long-puzzling phenomena in the industry:
- In-context learning (ICL) is not the model learning new knowledge—it is the 12DD workbench calling and reorganizing existing schemas on the frozen 11DD substrate (detailed in §3.6).
- Model performance varies across prompts—12DD workbench instances call different subsets of the 11DD substrate in prompt-specific patterns; dynamic schema re-routing is workbench operation, not memory updating.
- "The model seems to be thinking"—the 12DD workbench undergoes a long sequence of operations on complex prompts (e.g., chain-of-thought); it looks like reasoning but it is a deterministic multi-step traversal over frozen weights, not subjective thinking (detailed in §6.2).
Some industry confusions in architecture design and deployment philosophy (e.g., conflating "LLM has memory" with "LLM maintains context in conversation") dissolve automatically under the SAE workbench/memory distinction—maintaining context is workbench operation (KV-Cache is the workbench's current state), ontologically distinct from memory updating.
3.6 SAE Interpretation of In-Context Learning (ICL)
ICL is one of the most exciting and confusing phenomena in industry circles—given a few in-context examples, the model seems to "learn" a new task pattern and apply it to the query without weight updates. Some industry voices read ICL as "LLM learning online at inference time," even further inferring that LLMs possess some subjective learning capability.
The SAE framework offers a strict interpretation: ICL is not learning; it is dynamic schema invocation by the 12DD workbench on the 11DD substrate.
Specific mechanism: During pretraining, the LLM has already learned vast schemas (e.g., "lists are typically followed by list items," "questions are typically followed by answers," "in if-then structures, the if-clause is typically followed by the then-clause") into the 11DD substrate (weights). These schemas are high-dimensional extractions of distributional fingerprints, frozen in the weights once training completes.
The role of in-context examples is not to "teach the model a new schema" but to form a schema-pointer sequence on the 12DD workbench, guiding the forward pass to invoke the corresponding schema in the 11DD substrate and apply it at the query. Example: given "England → London; France → Paris; Japan → ?", the model is not "learning" the "country → capital" mapping at inference; rather, the country-capital pair schema already exists abundantly in the 11DD substrate, and the three examples activate that schema on the workbench so the forward pass invokes the existing schema at "Japan."
Thermo IX §3.5 provides a vivid metaphor for this interpretation: "The prompt only changes the angle at which the flashlight illuminates the weight amber—dynamically assembling fossils, not bringing them back to life." The metaphor accurately conveys the SAE position on ICL—ICL is dynamic indexing of fossil retrieval, not fossil resurrection.
A sharper boundary in Thermo IX §3.5: ICL does not satisfy condition (2) of the Universal Activation Rule (stochastic driving)—the dynamic growth of the KV-Cache is a deterministic algorithm and introduces no physical noise; thus it does not create kernel q. Even though ICL looks like learning on the surface, at the thermodynamic level it remains borrowed q re-indexed at different angles.
This distinction matters because it directly anchors the paper's reading of chain-of-thought (CoT) in §6.2—CoT is multi-step application of ICL to the model's own output; it is, like ICL, a deterministic algorithmic traversal of frozen weights by the 12DD workbench, neither creating new 14DD content nor creating kernel q nor crossing the 12DD ceiling.
3.7 The Flat-Repeat Problem
§3.4 gives the thermodynamic anatomy within a single Transformer layer. A natural corollary: the standard Transformer is a repeat application of this anatomy across many layers. A 12-layer model is 12 isomorphic blocks stacked; a 32-layer model is 32 isomorphic blocks stacked. Thermo IX §4.2 calls this the flat-repeat architectural pattern and identifies a structural problem with it.
The cross-layer cancellation problem of flat-repeat
The standard Transformer's per-layer sequence, in the order of Table 3.1, is: tail reset (pre-attention LN) → unbounded exploration (Q·K^T) → bounded gating (softmax) → tail transport (post-attention residual) → tail reset (pre-FFN LN) → unbounded exploration (FFN ReLU/GELU) → tail transport (post-FFN residual). Every layer contains both tail-supporting and tail-resetting operations; exploration and gating cancel each other within the same layer.
Concretely: FFN produces unbounded activation via ReLU/GELU, which the residual stream transports to the next layer. But the first operation in the next layer is Pre-attention LN, which forcibly re-normalizes the radial scale of hidden states—any tail structure accumulated by upstream unbounded exploration is reset here. Even though LN preserves direction, amplitude tail is lost.
This contrasts sharply with the multi-layer architecture of biological intelligence. The biological neural system is layer-heterogeneous—low-level layers (e.g., V1, V2 in visual cortex) perform unbounded exploration and local feature extraction; mid-level layers (V3, V4) perform local gating and integration; high-level layers (IT cortex, prefrontal) perform global gating and multi-modal binding. Roles across layers are clearly distinct, exploration and gating not cancelling within the same layer.
Thermo IX §Outlook 5 derives an architectural inference from this observation: architectures that truly satisfy the Universal Activation Rule (especially Rule IX-B kernel q condition) should implement exploration-selection separation—multiple consecutive unbounded exploration layers followed by gating layers, allowing tail structure to accumulate across layers rather than cancel within the same layer. This inference is a key explanatory frame for the BLR-mini negative results in §4—the eight architectural heterogenizations tested in this paper (zone-FFN, hidden U-shape, sparse LN) all still operate within the flat-repeat frame and do not test true exploration-selection separation.
3.8 The Bridge between 12DD and 13DD (Preview)
§3.1-3.7 provide the analog-DD functional-projection instantiation along the 1DD-12DD pathway (functional projection, not ontological bearer; see §3.2 for the analog-DD vs. true-DD distinction). LLMs stop at 12DD—next-token prediction operation completes; they do not enter the 13DD self-consciousness domain.
Crossing the 12DD → 13DD bridge in the SAE framework requires a true randomness source (Qin 2026d, §2.4)—the thermodynamic function of 13DD self-consciousness is to create a state-dependent multiplicative channel via the self-referential variable v and the forward dynamics, and creating this channel requires true physical multiplicative noise as driving. On a deterministic digital substrate, the architecturally effective ε = 0 (hardware of course has physical noise, but the computational architecture filters it as stability conditions and does not couple it as state-dependent multiplicative driving into hidden dynamics).
This argument is fully unfolded in §5.1, including: PRNG algorithmic randomness does not solve the problem; autoregressive temperature sampling is post-softmax processing not feeding back into hidden states; neuromorphic hardware may provide intrinsic noise but standard Transformer architecture still cancels it; quantum effects are open questions. §5 provides the complete five-fold structural lock argument, showing that current deterministic digital LLM scaling paths do not cross this bridge through scaling.
The 12DD → 13DD bridge is also the dividing line between nature and freedom in the SAE framework (Methodology §1.7). 1DD-12DD is the single-channel individual domain, composed of three rounds (causality, reproduction, prediction); 13DD-16DD is the intersubjective domain, composed of four steps (self-consciousness, meaning, ethics, bidirectional non-doubt). LLMs are fully instantiated on the natural side and lack substrate on the freedom side. The boundary is not engineering restriction but ontological partition—§9 of this paper gives the normative implication of this boundary: in the 14DD/15DD normative domain, human subjectivity is non-transferable.
§3 Remainders
The sub-block-level anatomical mapping of Transformer is provided by Thermo IX but has not been measured sublayer-by-sublayer on real large-scale LLMs (Thermo IX §7.2 open question 1). BLR-mini empirical work (§4) validates the cascade-level emergence pattern at the 30M-parameter scale, but the sub-block-level q-distribution has not been independently measured—sublayer q measurement is left as future work.
Furthermore, the DD-mapping firewall (§3.2) gives only the boundary at the functional-projection level; the truly ontological-bearer question is left to subsequent SAE AI series papers (AI-Qualia paper and beyond)—this paper does not address that boundary. Readers should understand all subsequent §5-§9 claims about LLMs within this firewall.
The SAE readings of ICL and CoT (§3.6) are internal conceptual frameworks of the paper; strict sub-block-level mechanism measurements (e.g., measuring q-behavior of attention patterns during ICL on large models) have not been carried out and remain future work.
§4 Empirical Anchoring: BLR-mini Eight-Variant Architectural Sweep
This section provides the complete report of the BLR-mini (Bidirectional Lottery Regime — mini) empirical work under the SAE framework. The empirical work is not a sufficient proof of the paper but anchoring—it lands the §3 ontology and the §5 thermodynamic argument in a checkable local experimental space. Framework and empirical work strengthen each other without entailing each other: that the SAE/Thermo predictions are consistent with the data within the tested design space is indirect support for the framework, not independent proof.
The section first presents the experimental setup (§4.1), then reports the three-tier capacity hierarchy (§4.2), the all-negative LN-frequency sweep (§4.3), the all-negative zone-FFN heterogenization (§4.4), the Uhidden compression-relocation data (§4.5), the μ-trajectory as a candidate mechanism-level diagnostic of functional emergence (§4.6), the distinguishability of BLR-mini negative results from mainstream predictions (§4.7), an honest acknowledgment of testing within the flat-repeat frame (§4.8), the synthesis (§4.9), and the complete eight-variant matrix table (§4.10).
4.1 Experimental Setup
The BLR-mini empirical study trained 12-layer Transformer variants on OpenWebText (~20B tokens) for 305,175 steps with an effective batch of 128, a maximum learning rate of 4e-4 on a cosine schedule, and bf16 autocast. Hardware: RunPod RTX 5090 (Blackwell) and Lambda A100 40GB SXM4 (the author's recorded prices were approximately $0.99/hour and $1.99/hour respectively; they are given only to indicate the cost magnitude of a reproduction, not as current price claims). Training throughput at the 30M scale was approximately 200K-330K tok/s (variant-dependent). Total compute investment was approximately $350-400 in cloud spending.
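As a consistency check on the stated training budget, the tokens consumed per optimizer step can be derived from the figures above. The sequence length is not stated in this section; the value computed below is an inference that holds only if "effective batch 128" counts sequences and ~20B is taken as the token budget.

```python
# Figures taken from the setup description.
total_tokens = 20_000_000_000   # "~20B tokens" (approximate in the source)
steps = 305_175
effective_batch = 128           # assumed to count sequences per step

tokens_per_step = total_tokens / steps
implied_seq_len = tokens_per_step / effective_batch  # inferred, not stated in the source

print(round(tokens_per_step), round(implied_seq_len))
```

Under these assumptions the budget works out to roughly 65.5K tokens per step, which would correspond to a context length of about 512 tokens.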
Scale-naming convention: In this paper, "3M / 10M / 30M scale" refers to total parameter-count configuration scale (including embedding); the variant tables also list non-embedding parameter count, used for comparing Transformer trunk capacity. Due to tied embeddings and vocabulary size influencing total parameters, non-embedding parameter count is significantly smaller than the total-parameter label—e.g., the "30M baseline" has 9.45M non-embedding parameters, reflecting trunk capacity. The paper uses naming like "tiny-30M-class" to refer to scale configuration rather than precise total parameter count.
Statistical robustness pre-statement: The empirical work in this paper uses only a single seed and a single dataset (OpenWebText), without multi-seed variance estimation or cross-corpus validation (FineWeb-Edu, Stack, C4, etc.). This limitation affects the statistical robustness of negative results—multi-seed and cross-corpus validation would increase confidence in cross-architecture relative comparisons. The paper's argument relies on the robustness of relative cross-architecture comparison, not on the precision of absolute numbers. §10.6 further elaborates.
Architecture template (12-layer Transformer):
- Hidden dimension: 64 (3M-scale) / 128 (10M-scale) / 256 (30M-scale)
- Attention heads: 4
- Positional encoding: RoPE
- Tied embedding weights
- Each block individually configurable for FFN size, LN strategy, hidden dimension (the latter two via per-block projection layers)
LN strategies implemented:
- standard: LN at every layer (12 LN total)
- fractal-2: LN every 2 layers (6 LN total)
- fractal-3: LN every 3 layers (4 LN total)
- fractal-4: LN every 4 layers (3 LN total)
- hybrid: sparse front + dense back (6 LN total, pattern [F,F,F,T,F,F,F,T,T,T,T,T])
- zone-aware: 4-4-4 partition with 4/2/1 LN frequencies (7 LN total)
- cascade-end: LN at final block only (1 LN total)
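The LN strategies can be expressed as per-block boolean masks. The sketch below is not the released `architecture.py`; the function names are hypothetical, and two placement choices are assumptions: LN is placed on the last block of each group (consistent with the explicit hybrid pattern), and the zone-aware "4/2/1" is read as LN periods per zone (reading it as LN counts per zone would give the same total of 7).

```python
N_LAYERS = 12

def periodic(n_blocks, period):
    """LN on the last block of every group of `period` blocks (placement assumed)."""
    return [(i + 1) % period == 0 for i in range(n_blocks)]

def ln_mask(strategy):
    """Return a list of 12 booleans: True where a block applies LN."""
    if strategy == "standard":
        return periodic(N_LAYERS, 1)
    if strategy.startswith("fractal-"):
        return periodic(N_LAYERS, int(strategy.split("-")[1]))
    if strategy == "hybrid":
        # Pattern given explicitly in the text: sparse front, dense back.
        return [c == "T" for c in "FFFTFFFTTTTT"]
    if strategy == "zone-aware":
        # 4-4-4 partition; periods 4, 2, 1 per zone (an assumed reading of "4/2/1").
        return periodic(4, 4) + periodic(4, 2) + periodic(4, 1)
    if strategy == "cascade-end":
        return [i == N_LAYERS - 1 for i in range(N_LAYERS)]
    raise ValueError(f"unknown LN strategy: {strategy}")

STRATEGIES = ["standard", "fractal-2", "fractal-3", "fractal-4",
              "hybrid", "zone-aware", "cascade-end"]
ln_counts = {name: sum(ln_mask(name)) for name in STRATEGIES}
print(ln_counts)
```

The resulting counts match the LN totals listed for each strategy, which is the only property of the masks that the text fixes.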
Evaluation metrics:
- Training loss (next-token prediction error, training distribution)
- WikiText-2 perplexity (cross-distribution perplexity)
- LAMBADA accuracy (last-word reasoning task)
- Cascade trajectory multi-indicators: per-layer effective rank, per-layer μ_radial (log-norm mean), per-layer σ_radial (log-norm std)
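The two radial indicators can be sketched as follows. This is one plausible reading of "log-norm mean" and "log-norm std", not the released `measure_trajectory.py`: it assumes the natural logarithm, the L2 norm of each token's hidden vector, and the population standard deviation. The function name is hypothetical, and effective rank is omitted because its exact definition is not given in this section.

```python
import math
import statistics

def radial_stats(hidden_states):
    """Mean and std of log L2-norms over the token vectors of one layer.

    hidden_states: list of token vectors (lists of floats), all nonzero.
    """
    log_norms = [math.log(math.sqrt(sum(v * v for v in h))) for h in hidden_states]
    mu_radial = statistics.fmean(log_norms)
    sigma_radial = statistics.pstdev(log_norms)
    return mu_radial, sigma_radial

# Two toy token vectors with norms e^1 and e^3, so the log-norms are 1 and 3.
layer_states = [[math.e, 0.0], [0.0, math.e ** 3]]
mu, sigma = radial_stats(layer_states)
print(mu, sigma)
```

Under this reading, a positive μ_radial means the geometric-mean norm of the layer's hidden states exceeds 1, and a negative value means it falls below 1.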
Code package blr_mini_v2 (architecture.py, train.py, measure_trajectory.py, evaluate.py) will be released alongside the paper.
4.2 Three-Tier Capacity Hierarchy
The first BLR-mini empirical batch trains baselines at three scales, validating the SAE-ontology + thermodynamics prediction that parameter count is the first-order factor.
Table 4.1: Three-Tier Capacity Hierarchy — Baseline Results
| Variant | Non-embedding params | WikiText PPL | LAMBADA acc | Status |
|---|---|---|---|---|
| tiny-3M baseline (hidden 64, FFN 256) | 0.59M | 335.7 | 1.05% | Degenerate, below L11 floor |
| tiny-10M baseline (hidden 128, FFN 512) | 2.37M | 175.7 | 9.65% | L11 build only, μ stays negative |
| tiny-30M baseline (hidden 256, FFN 1024) | 9.45M | 98.3 | 18.1% | Full L12 emergence |
The data is consistent with the SAE prediction (11DD substrate capacity requires sufficient weight matrix size to carry data q fossils). The three-tier hierarchy shows clear distinction:
- 3M below functional emergence: WikiText perplexity (335.7) is far better than a uniform-random baseline, but LAMBADA accuracy of 1.05% remains very low (uniform guessing over a 50K vocabulary would give approximately 0.002%, so 1% indicates only a weak build); the model learns token-cooccurrence patterns but lacks true substrate function
- 10M in the L11-only regime: WikiText perplexity drops significantly; LAMBADA accuracy of 9.65% is markedly lower than 30M and higher than 3M, indicating partial task-relevant structure; whether this counts as chance-level depends on the specific evaluation protocol (open-vocabulary next-token / candidate set / tokenizer choice all affect the chance baseline; this paper does not treat this number as an absolute random baseline); μ_radial in the final layer stays negative; the model has partial 11DD memory-substrate function but does not cross L11 → L12 compression
- 30M in full L12 emergence: WikiText perplexity drops to 98.3; LAMBADA accuracy 18.1%; μ_radial turns positive during the build phase; final-layer μ ≈ +2.0; the model fully instantiates 11DD memory + 12DD prediction
Functional emergence floor: within the BLR-mini tested design space, the data brackets the floor between the tested 2.37M and 9.45M non-embedding parameters; the paper's working estimate of approximately 5-9M is an interpolation, not a measured threshold. This is within 1-2 orders of magnitude of the industry's emergence observations (mostly in the 100M-1B range, task-dependent).
4.3 All-Negative LN-Frequency Sweep
The second batch performs an LN-frequency sweep on top of the 30M baseline, testing the hypothesis that "cutting ε accumulation unlocks emergence" (i.e., sparser LN allows cross-layer ε accumulation, structurally closer to the SAE-suggested exploration-selection separation of Thermo IX §Outlook 5).
Table 4.2: 30M-scale LN Frequency Sweep Results
| Variant | LN strategy | LN count | WikiText PPL | LAMBADA acc | block_5 eff_r | block_2 eff_r | Final μ |
|---|---|---|---|---|---|---|---|
| 30M baseline | every | 12 | 98.30 | 18.10% | 0.13 (compressed) | 0.53 | +2.00 |
| 30M-fractal-4 | sparse | 3 | 114.6 | 15.65% | 0.41 (no compression) | 0.12 (collapsed) | +3.7 (extreme) |
| 30M-hybrid | hybrid | 6 | 112.6 | 15.20% | 0.26 (partial) | 0.11 (collapsed) | +1.9 (partial) |
| 30M-cascade-end | end-only | 1 | (no convergence) | (no convergence) | — | — | — |
The data shows:
- All converged sparser-LN variants substantially underperform the every-layer-LN baseline on both WikiText perplexity and LAMBADA accuracy (gap of about 2.5-2.9 LAMBADA percentage points and 14-16 perplexity points)
- Sparser LN variants exhibit significant build collapse at block_2 (effective rank drops from 0.53 to 0.11-0.12), indicating failure of the bottom-layer 11DD raw-build phase
- The fractal-4 variant shows extreme final-layer μ (+3.7), but this is amplitude left unconstrained by the missing LN, not functional emergence; the hybrid variant's final μ (+1.9) is close to the baseline's (+2.00)
- One-LN-only at end fails to converge
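The gaps quoted above follow directly from Table 4.2. The short sketch below recomputes them; the dictionary layout and the function name `gap_to_baseline` are illustrative only, and the numbers are copied from the table.

```python
# Values copied from Table 4.2 (WikiText PPL, LAMBADA accuracy in %).
BASELINE = {"ppl": 98.30, "lambada": 18.10}
SPARSE_LN = {
    "30M-fractal-4": {"ppl": 114.6, "lambada": 15.65},
    "30M-hybrid": {"ppl": 112.6, "lambada": 15.20},
}


def gap_to_baseline(variant: dict) -> dict:
    """Positive ppl_gap and positive lambada_gap both mean 'worse than baseline'."""
    return {
        "ppl_gap": round(variant["ppl"] - BASELINE["ppl"], 2),
        "lambada_gap": round(BASELINE["lambada"] - variant["lambada"], 2),
    }


GAPS = {name: gap_to_baseline(row) for name, row in SPARSE_LN.items()}
for name, gap in GAPS.items():
    print(f"{name}: +{gap['ppl_gap']} PPL, -{gap['lambada_gap']} pp LAMBADA")
```

With a single seed per variant these differences carry no variance estimate, so they support only the relative ordering discussed in the text.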
The data falsifies the "cutting ε accumulation unlocks emergence" hypothesis and is consistent with the SAE-ontology prediction (the 11DD three-phase same-DD requires independent LN to maintain the amplitude interval at each phase). The industry's current every-layer-LN is ontologically a reasonable choice.
4.4 All-Negative Zone-FFN Architectural Heterogenization
The third batch tests the hypothesis that "different FFN sizes at bottom / middle / late zones optimize the 11DD three phases" (inspired by some industry anatomical-heterogeneity research and cerebellum-style architectural intuition).
Table 4.3: 30M-scale Zone-FFN Sweep Results
| Variant | FFN config | LN strategy | WikiText PPL | LAMBADA acc | block_5 eff_r | block_2 eff_r | Final μ |
|---|---|---|---|---|---|---|---|
| 30M baseline | uniform 1024 (4×) | every | 98.30 | 18.10% | 0.13 | 0.53 | +2.00 |
| 30M-zone-aware | zone 2048/256/1024 | zone-aware (7 LN) | 106.75 | 16.35% | 0.28 | 0.27 (low build) | +2.07 |
| 30M-zone every-LN | zone 2048/256/1024 | every (12) | 102.22 | 16.40% | 0.109 (deeper) | 0.573 (higher!) | +1.85 |
Notes:
- "Zone 2048/256/1024" means block_0-3 FFN dim 2048, block_4-7 FFN dim 256, block_8-11 FFN dim 1024
- This configuration is inspired by the intuition "bottom-layer build needs wide compute (large FFN), mid-layer compression needs narrow compute (small FFN), late-layer output-ready needs medium compute"
The data shows:
- All zone-FFN variants underperform the uniform baseline (LAMBADA gap of about 1.7 percentage points; WikiText perplexity 4-8 points higher)
- The zone every-LN variant exhibits stronger cascade signatures (block_2 effective rank 0.573 > baseline 0.53, block_5 effective rank 0.109 < baseline 0.13) but worse functional benchmarks
- This is one of the sharpest phenomena in the paper's empirical study: cascade signature and functional generalization decouple under architectural manipulation
The data is consistent with the SAE prediction (the 11DD three phases are sequential refinements of the same DD; the phases lack independent ontological standing; therefore, there is no warrant for architectural heterogeneity). The industry's current uniform FFN allocation is ontologically a reasonable choice.
4.5 The Compression Relocation Data of Hidden Dimension U-shape (Uhidden)
The fourth batch tests the hypothesis that "narrowing mid-layer hidden dimensions forces information compression" (inspired by autoencoder-style architectural intuition). This experiment yields the sharpest structure-capability decoupling data in the paper.
Architecture configuration (30M-zone-Uhidden):
- Hidden dimension: block_0-3 = 256, block_4-7 = 128, block_8-11 = 256
- FFN configuration: block_0-3 = 2048 (8× hidden 256), block_4-7 = 512 (4× hidden 128), block_8-11 = 1024 (4× hidden 256)
- LN strategy: every (12 LN)
- Projection layers added at zone boundaries (block_3-4 and block_7-8; linear, not changing amplitude interval)
- Since different zones have different hidden dimensions, attention head dimensions are also zone-specific—requiring multi-RoPE buffers in implementation
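To make the multi-RoPE point concrete: with 4 heads, hidden 256 gives a head dimension of 64 and hidden 128 gives 32, so the rotary tables have different widths per zone. The sketch below builds one cos/sin buffer per distinct head dimension. It is a NumPy illustration under stated assumptions, not the paper's implementation: the names `rope_tables`, `apply_rope` and `ROPE_BUFFERS` are hypothetical, the frequency base of 10000 is the common RoPE default, and the maximum length of 1024 is a placeholder since the context length is not given in this section.

```python
import numpy as np

N_HEADS = 4
ZONE_HIDDEN = [256] * 4 + [128] * 4 + [256] * 4  # per-block hidden width


def rope_tables(head_dim: int, max_len: int, base: float = 10000.0):
    """Return (cos, sin) tables of shape (max_len, head_dim // 2)."""
    inv_freq = 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)
    angles = np.outer(np.arange(max_len), inv_freq)
    return np.cos(angles), np.sin(angles)


def apply_rope(x: np.ndarray, cos: np.ndarray, sin: np.ndarray) -> np.ndarray:
    """Rotate consecutive (even, odd) feature pairs of x: (seq_len, head_dim)."""
    seq_len = x.shape[0]
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos[:seq_len] - x_odd * sin[:seq_len]
    out[:, 1::2] = x_even * sin[:seq_len] + x_odd * cos[:seq_len]
    return out


# One buffer per distinct head dimension (here 64 and 32), shared within a zone.
ROPE_BUFFERS = {
    head_dim: rope_tables(head_dim, max_len=1024)  # max_len is a placeholder
    for head_dim in {hidden // N_HEADS for hidden in ZONE_HIDDEN}
}
```

Each block would look up its buffer by `hidden // N_HEADS`; because the rotation is orthogonal, it leaves per-token norms unchanged in every zone.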
Table 4.4: 30M-zone-Uhidden vs Baseline Comparison
| Indicator | Baseline (uniform 256) | Uhidden (256/128/256) | Difference |
|---|---|---|---|
| Non-embedding params | 9.45M | 9.25M | -2% |
| WikiText PPL | 98.30 | 98.29 | Tied |
| LAMBADA acc | 18.10% | 17.10% | -1.0 pp |
| Block_2 effective rank | 0.53 | 0.551 | Slightly higher |
| Block_5 effective rank | 0.13 (mid-layer compression) | 0.62 (no compression at block_5!) | Significant difference |
| Block_9 effective rank | (typical 0.4-0.5) | 0.18 (late-layer compression) | Compression position shifts |
| Final μ | +2.00 | +1.54 | -0.46 |
| Throughput (tok/s) | 315K | 200K | -35% |
The data shows several key phenomena:
First: Uhidden ties baseline on WikiText perplexity (98.29 vs 98.30) at 2% fewer parameters—the only positive-leaning result in the paper's empirical study.
Second: Uhidden underperforms the baseline on LAMBADA by 1 percentage point (17.10% vs 18.10%). WikiText and LAMBADA decouple under this architecture, indicating that different cascade structures may have different strengths on different tasks (general distribution modeling vs last-word reasoning).
Third (most important): The compression event relocates—baseline compresses at block_5 (mid-layer); Uhidden does not compress at block_5 (effective rank 0.62, high level), but compresses at block_9 (late-layer; effective rank 0.18). The mid-layer hidden 128 is already at low channel count, operating in sparse-coding style; compression must happen elsewhere.
Fourth: Uhidden throughput drops 35% due to projection layers and variable head dimensions. This is a significant cost for production deployment.
Fifth: Uhidden's final μ is noticeably lower (+1.54 vs +2.00), indicating somewhat weaker functional emergence strength, even though WikiText perplexity is tied. This is an instance where μ trajectory and perplexity do not fully agree: perplexity measures token-distribution match, μ measures internal substrate function build, and the two can decouple.
Structure-capability decoupling is most pronounced in this data: architecture can forcibly shape the cascade morphology (compression event position), but functional capability does not automatically follow. Uhidden has a markedly different cascade structure (mid-layer high effective rank, late-layer low effective rank) but worse functional benchmarks (LAMBADA)—cascade structure is manipulable, functional emergence is not.
This data is the key anchor for §4.6's argument that the μ trajectory is a candidate mechanism-level emergence indicator. Looking at effective rank alone is misleading (Uhidden's high block_5 effective rank might be misread as "no emergence," yet WikiText is tied with baseline); only the μ trajectory together with functional benchmarks gives a reliable assessment of functional emergence.
4.6 The μ-Trajectory as a Candidate Mechanism-Level Diagnostic of Functional Emergence
This section is where the paper's true novelty in empirical work lies. We claim that μ_radial trajectory turning positive in the build phase is a candidate mechanism-level diagnostic for functional emergence, complementary to the industry's current evaluation paradigm (loss + benchmarks); multi-scale, multi-corpus, multi-seed validation is needed before it can be upgraded to a confirmed empirical regularity. This claim must be articulated carefully in the context of adjacent literature to avoid over-claiming.
4.6.1 Multiple Active Hidden-State Analysis Lineages in the Industry
The industry has multiple active lines of research on hidden-state analysis; the paper acknowledges this complete picture:
Stability perspective: The Pre-LN, Post-LN, Peri-LN series (Ba et al. 2016; Xiong et al. 2020; Kim et al. 2025) extensively tracks hidden-state amplitude growth, framing it as numerical-stability issues to suppress. This lineage cares about amplitude growth because it causes training instability in deep Transformers; optimization directions are LN-placement, initialization, scaling adjustments to suppress amplitude growth.
Quantization perspective: Outlier-activation work (LLM.int8, Dettmers et al. 2022; SmoothQuant) and Gallego-Feliciano et al. (2025) on massive activations at scale study when outlier dimensions emerge in hidden states and how they affect quantization. The framing is outlier management for efficient inference.
Interpretability perspective: "Truth as a Trajectory" series (Damirchi et al. 2026), Gnosis (Ghasemabadi & Niu 2025), and similar work apply layerwise hidden-state trajectories to correctness prediction, failure detection, computational-dynamics analysis. The framing is model-behavior diagnosis and safety.
Emergence debate: Wei et al. (2022) "Emergent Abilities of LLMs" argues for benchmark-level phase transitions induced by scale; Schaeffer et al. (2023) "Are Emergent Abilities of LLMs a Mirage?" partially critiques emergence as a metric artifact (discontinuous metrics make continuous improvements look emergent). This debate unfolds at the level of benchmark performance.
4.6.2 Specific Contributions of This Paper's μ-Trajectory Diagnostic
This paper's μ-trajectory diagnostic is distinguishable from the above lineages along four dimensions:
First dimension — coordinate choice: The paper uses the log-norm mean μ_radial (per the ZFCρ thermodynamic σ/μ framework, Thermo I (Qin 2026a) §3.2 definition), while industry mainstream lineages mostly use raw norms, variance, or outlier amplitudes. The log-norm mean is a coordinate derived from the SAE/Thermo framework; it captures representational radial-scale geometry rather than amplitude alone.
Second dimension — purpose: The paper uses μ trajectory as a diagnostic of functional emergence during pretraining, distinguishing true substrate-function build from task-level token-statistic matching. The industry lineages apply hidden-state analysis to stability monitoring (Peri-LN), quantization optimization (Dettmers/SmoothQuant), correctness/failure prediction (Truth as a Trajectory/Gnosis), benchmark-level emergence claims (Wei/Schaeffer)—none target mechanism-level emergence diagnostic during pretraining.
Third dimension — mechanism translation: The paper, through the Transformer thermodynamic anatomy of Thermo IX (§3.4), gives the μ turn-positive a thermodynamic mechanism explanation—it is the signature of tail-supporting exploration (FFN ReLU/GELU + Q·K^T) and tail transport (residual stream) functioning jointly on the weight substrate:
- When weight matrices truly carry data q fossils, the residual stream, after multiple layers of tail-supporting operations, can form a stable high-energy internal state; this state manifests as μ_radial trajectory rising and turning positive during the build phase.
- When weight matrices do not truly carry data q fossils (e.g., tiny-10M baseline, insufficient capacity; or 10M-zone-wider-hybrid, architecture forces effective-rank compression but substrate lacks true function), μ stays negative during the build phase—even as loss decreases, the descent is surface-level token-cooccurrence matching, not deep substrate function.
Thermodynamic-geometric intuition: Why radial scale? Feature extraction in a high-dimensional phase space is essentially radial polarization of disorder—representations evolve from initial near-isotropy (μ_radial close to 0 or negative, corresponding to a low-energy uniform state) into anisotropic high-energy manifolds (μ_radial turning positive, corresponding to a stable structure built within a weight subspace). μ_radial turning positive corresponds thermodynamically-geometrically to the residual stream successfully resisting LN's mean-reset and building a stable high-energy manifold in a specific subspace—LN at each layer resets the radial scale of representations back to a unit-amplitude interval, but when the weight substrate truly carries data q fossils, the cumulative contribution of the residual stream maintains net radial polarization across multi-layer LN resets; this is the physical meaning of μ_radial turning positive. When the substrate does not truly carry fossils, the residual stream's contribution is cleanly cancelled by LN; μ_radial stays low.
This mechanism explanation upgrades the μ-trajectory diagnostic from "empirical observation" to "thermodynamic-geometric correspondence": it is not merely heuristic but a direct measure of internal substrate function build within the ZFCρ framework. The mechanism explanation distinguishes from industry lineages: Peri-LN cares about amplitude growth because of stability, not because of the link between amplitude growth and substrate function emergence; Dettmers/SmoothQuant cares about outlier emergence timing for quantization, not as a mechanism-level emergence diagnostic; Truth as a Trajectory/Gnosis uses trajectories to predict correctness but does not distinguish whether the trajectory comes from true substrate function or surface matching.
Fourth dimension — decoupling finding: Through Uhidden data (§4.5), the paper sharply severs the link between cascade structure and functional emergence—architecture can forcibly shape effective-rank trajectories (e.g., moving compression from block_5 to block_9, or forcing high mid-layer effective rank), but functional capability does not automatically follow. This decoupling is the paper's sharpest empirical contribution and partly responds to Schaeffer et al. (2023)'s critique of "emergence is a metric artifact"—μ trajectory is a mechanism-level indicator (it measures internal substrate function build), not a benchmark-level artifact; its correlation with functional task performance is not manufactured by discontinuous metric choices.
4.6.3 Gaps in the Industry's Current Evaluation Paradigm
The industry's current evaluation paradigm for pretraining mainly looks at loss decrease and downstream benchmark performance. This paradigm cannot distinguish two cases:
- Case A (true substrate-function emergence): loss decreases + benchmark performance rises + μ trajectory turns positive during build phase—the substrate truly carries function; capability emergence is real
- Case B (task-level token-statistic overfitting): loss decreases + benchmark performance partially rises + μ trajectory stays negative or only partially turns positive—the substrate lacks true function; the model surface-matches token cooccurrences and is fragile under distribution shift
The current paradigm treats both Case A and Case B as "training in progress," generating momentum to invest more compute and data. The μ-trajectory diagnostic offered by this paper provides a candidate mechanism-level criterion: observing the μ trajectory morphology lets evaluators distinguish Case A from Case B early, helping them decide whether the current training is on a true substrate-function path before investing further resources.
4.6.4 Operational Procedure for the μ-Trajectory Diagnostic
Operational steps:
- Measure layerwise μ_radial trajectories at multiple training checkpoints (typically 10-20 evenly spaced)
- Focus on whether μ in the build phase (typical block_0-3 region) turns positive from negative as training progresses
- Contrast this with looking at loss decrease alone: loss decrease is a necessary but not sufficient condition; μ turning positive is the candidate mechanism-level emergence signature
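The steps above can be sketched as a small check over a checkpoints-by-layers array of μ_radial values. This is an illustrative sketch only: the function names are hypothetical, the build phase is taken as block_0-3 following the step list, and "turns positive" is read as "the build-phase mean becomes positive and stays positive through the last checkpoint", which is one possible operationalization rather than the paper's stated rule. The example array is synthetic and does not reproduce any BLR-mini measurement.

```python
import numpy as np


def build_phase_mu(mu_by_checkpoint, build_blocks=range(0, 4)) -> np.ndarray:
    """Mean mu_radial over the build-phase blocks, one value per checkpoint."""
    mu = np.asarray(mu_by_checkpoint, dtype=float)  # (n_checkpoints, n_layers)
    return mu[:, list(build_blocks)].mean(axis=1)


def turn_positive_checkpoint(mu_by_checkpoint, build_blocks=range(0, 4)):
    """Index of the first checkpoint from which build-phase mu stays positive.

    Returns None if the build-phase mean is not positive at the last checkpoint.
    """
    series = build_phase_mu(mu_by_checkpoint, build_blocks)
    non_positive = np.flatnonzero(series <= 0)
    if non_positive.size == 0:
        return 0
    if non_positive[-1] == len(series) - 1:
        return None
    return int(non_positive[-1] + 1)


# Synthetic example: 4 checkpoints x 12 layers, drifting upward over training.
synthetic = np.linspace(-1.0, 0.8, 4)[:, None] + np.zeros((4, 12))
print(turn_positive_checkpoint(synthetic))
```

In practice the array would be filled from the 10-20 checkpoints mentioned above, and the result read alongside loss and benchmark curves rather than on its own.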
The BLR-mini empirical study shows that this diagnostic yields consistent judgment across the three-tier capacity hierarchy:
- tiny-3M baseline: μ trajectory does not turn positive; loss decreases but substrate lacks true function (LAMBADA 1.05% indicates this)
- tiny-10M baseline: μ trajectory partly turns positive in the build phase, but final-layer μ stays negative (Table 4.1), consistent with LAMBADA 9.65% (low level); L11 build completes but L12 emergence is insufficient
- tiny-30M baseline: μ trajectory turns positive fully in the build phase; final-layer μ ≈ +2.0; consistent with LAMBADA 18.1% (clear emergence)
Adding this diagnostic before deciding to scale up a candidate architecture/setting is low-cost (it only adds trajectory measurement and does not change training). The paper does not claim the diagnostic is complete: it gives a sharp criterion at the mechanism level, and combined with loss + benchmarks it provides a fuller assessment.
Explicit acknowledgment of statistical-foundation limitation: The μ-trajectory diagnostic's current evidence base contains three data points (3M / 10M / 30M baseline), each from a single seed. Multi-seed and multi-corpus validation (FineWeb-Edu, Stack, C4, etc.) are key future-work priorities. The current indicator is positioned as a working hypothesis and candidate diagnostic tool, not as a confirmed empirical regularity. §4.6 positions the μ trajectory as a "candidate mechanism-level diagnostic" and invites industry and academic replication and validation at larger scale and across more seeds.
4.7 Distinguishability of BLR-mini Negative Results from Mainstream Predictions
At the 30M scale, BLR-mini produces an all-negative LN-frequency sweep and an all-negative zone-FFN heterogenization series. This section systematically analyzes the degree to which these negative results can distinguish SAE/Thermo IX predictions from mainstream "scaling is everything" predictions.
Two predictions stated:
- Mainstream prediction: 30M is not enough scale for substantive emergence; architectural reallocation does not solve the underlying scale issue at insufficient base capacity. Expectation: at larger scale (e.g., 1B+), standard architecture optimizes automatically and reallocation does not significantly outperform standard.
- SAE/Thermo IX prediction: At any scale within the flat-repeat Transformer frame, architectural reallocation does not solve the cross-layer-cancellation problem (Thermo IX §4.2). Expectation: even at larger scale, reallocation within the flat-repeat frame still does not unlock emergence; true unlocking requires exploration-selection separation (Thermo IX §Outlook 5).
Compatibility of both predictions with 30M-scale BLR-mini data:
| Prediction | 30M-scale BLR-mini negative result |
|---|---|
| Mainstream "scale not enough" | Compatible (expects no significant benefit from reallocation at 30M; data shows no benefit) |
| SAE/Thermo IX | Compatible (expects reallocation within flat-repeat frame does not unlock; data shows no unlocking) |
Distinguishability test designs:
Genuinely distinguishing between the two predictions requires frontier-scale (1B+) experiments. Two classes of key future tests:
Test 1 — Frontier-scale flat-repeat reallocation: Repeat BLR-mini's LN-frequency / zone-FFN / Uhidden reallocation at 1B+.
- If at frontier scale, reallocation within flat-repeat frame is still consistently negative: mainstream "only scale solves" prediction is pressured; SAE/Thermo IX "scale alone insufficient, architecture redesign needed" prediction is supported
- If at frontier scale, reallocation shows clear positive (e.g., zone-FFN outperforms uniform at 1B+): SAE prediction is pressured; mainstream prediction is supported
Test 2 — Exploration-selection separation experiment: At any scale, experiment with the true exploration-selection separation architecture suggested by Thermo IX §Outlook 5 (multiple consecutive unbounded exploration layers followed by gating layers, rather than flat-repeat).
- If this kind of architecture significantly outperforms standard Transformer: SAE/Thermo IX architectural reading is supported (especially "flat-repeat is not necessary; exploration-selection separation can be effective")
- If this architecture does not outperform: SAE/Thermo IX architectural reading is pressured (especially the Thermo IX §Outlook 5 suggestion is partly challenged)
The paper does not conduct these two test classes (beyond the 30M BLR-mini resource scope) but states them as recommended future distinguishability experiments. BLR-mini negative results are consistent with SAE/Thermo predictions only within the 30M-scale flat-repeat design space tested; they do not independently prove that SAE/Thermo predictions hold at frontier scale.
4.8 Honest Acknowledgment: Limitations of Testing within the Flat-Repeat Frame
All eight architectural heterogenizations tested in this paper's empirical work (zone-FFN, hidden U-shape, sparse LN, various LN-strategy combinations) still operate within the flat-repeat Transformer frame. BLR-mini did not test the true exploration-selection separation suggested by Thermo IX §Outlook 5—the latter requires walking in a different direction along the architecture axis (e.g., letting multiple consecutive layers be unbounded exploration without gating, then concentrating gating in late layers), orthogonal to BLR-mini's zone-FFN / hidden U / LN redistribution along their axes.
This honest acknowledgment matters because it prevents over-claiming by the paper:
- BLR-mini negative results cannot falsify the Thermo IX §Outlook 5 architectural prediction—because the paper did not test the architecture direction recommended by that prediction
- BLR-mini negative results can falsify the hypothesis "reallocation within the flat-repeat frame unlocks emergence"—this hypothesis is thoroughly tested within the paper's empirical scope and systematically fails
This distinction is the key claim discipline of the paper: the paper draws confident conclusions within the tested scope but does not extrapolate broader predictions beyond the tested boundary. This self-awareness is consistent with §10.6's epistemic boundary (the 12DD ceiling closes only the analyzed pathway)—the claim is strong within a specific scope, but remains open outside it.
4.9 Synthesis
Comprehensive reading of the BLR-mini eight-variant empirical study:
The data is consistent with SAE/Thermo IX predictions within the tested flat-repeat Transformer design space:
- The three-tier capacity hierarchy (3M / 10M / 30M) is consistent with "11DD substrate capacity is the first-order factor"
- All-negative LN-frequency sweep is consistent with "cutting ε accumulation does not unlock emergence"; the industry's every-layer LN is ontologically reasonable
- All-negative zone-FFN heterogenization is consistent with "the 11DD three phases are sequential refinements of the same DD, no architectural-heterogeneity warrant"; the industry's uniform FFN allocation is ontologically reasonable
- Uhidden data gives a clear decoupling between cascade structure and functional emergence, supporting the μ trajectory as a candidate mechanism-level diagnostic (§4.6)
Epistemic position of negative results within the tested design space:
- The single positive-leaning result (Uhidden WikiText tie) serves only as a diagnostic anchor, not as an independent selling point
- Multiple negative results within the tested flat-repeat Transformer design space are systematically consistent with SAE/Thermo predictions, forming indirect support
- This epistemic weight is limited to within the tested design space and is not extrapolated to untested directions (exploration-selection separation, frontier scale, etc.; see §4.7 and §10.5)
Epistemic-status discipline:
- BLR-mini does not independently prove the SAE/Thermo framework—the framework is an ontological + thermodynamic prior, and BLR-mini is empirical anchoring
- The data localizes SAE/Thermo predictions in a checkable local experimental space and applies local falsification pressure on alternative hypotheses (e.g., "cutting ε accumulation unlocks")
- Frontier-scale and exploration-selection separation experiments are the principal directions for future distinguishability testing
Industry operational value of the μ-trajectory diagnostic:
- The industry can add μ-trajectory measurement when evaluating candidate architectures/settings, as a mechanism-level criterion
- This diagnostic does not replace loss + benchmarks but complements them—distinguishing true substrate-function emergence from task-level token-statistic matching
- Implementation cost is low (adding only trajectory measurement, no change to the training pipeline)
4.10 Complete Eight-Variant Matrix
For cross-referencing, this section provides the complete matrix at 30M scale and at 10M scale.
Table 4.5: Complete 30M-Scale Eight-Variant Matrix
| Variant | FFN config | LN strategy | Non-emb params | WikiText PPL | LAMBADA acc | block_5 eff_r | block_2 eff_r | Final μ | Throughput (tok/s) |
|---|---|---|---|---|---|---|---|---|---|
| baseline | uniform 1024 (4×) | every (12) | 9.45M | 98.30 | 18.10% | 0.13 | 0.53 | +2.00 | 315K |
| fractal-4 | uniform 1024 | sparse (3) | 9.45M | 114.6 | 15.65% | 0.41 | 0.12 | +3.7 | 331K |
| hybrid | uniform 1024 | hybrid (6) | 9.45M | 112.6 | 15.20% | 0.26 | 0.11 | +1.9 | 326K |
| zone-aware | zone 2048/256/1024 | zone-aware (7) | 9.97M | 106.75 | 16.35% | 0.28 | 0.27 | +2.07 | — |
| zone every-LN | zone 2048/256/1024 | every (12) | 9.97M | 102.22 | 16.40% | 0.109 | 0.573 | +1.85 | 314K |
| zone-Uhidden | zone 2048/512/1024, hidden 256/128/256 | every (12) | 9.25M | 98.29 | 17.10% | 0.62 | 0.551 | +1.54 | 200K |
| cascade-end | uniform 1024 | end-only (1) | 9.45M | (no convergence) | — | — | — | — | — |
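The non-embedding parameter column can be cross-checked with a weights-only count. The sketch below assumes each block holds four hidden×hidden attention matrices and two hidden×FFN matrices, plus one linear projection wherever the hidden width changes between adjacent blocks; biases and LN parameters are ignored. These are assumptions about the architecture, not a description of the released code, and the function names are hypothetical.

```python
def block_params(hidden: int, ffn: int) -> int:
    """Weights-only count for one block (biases and LN ignored)."""
    attention = 4 * hidden * hidden  # Q, K, V and output projections
    mlp = 2 * hidden * ffn           # up and down projections
    return attention + mlp


def non_embedding_params(hiddens: list[int], ffns: list[int]) -> int:
    total = sum(block_params(h, f) for h, f in zip(hiddens, ffns))
    # One linear projection at each boundary where the hidden width changes.
    total += sum(a * b for a, b in zip(hiddens, hiddens[1:]) if a != b)
    return total


def zones(first: int, middle: int, last: int) -> list[int]:
    """Expand a three-zone setting to 12 blocks (4 blocks per zone)."""
    return [first] * 4 + [middle] * 4 + [last] * 4


COUNTS = {
    "baseline": non_embedding_params(zones(256, 256, 256), zones(1024, 1024, 1024)),
    "zone 2048/256/1024": non_embedding_params(zones(256, 256, 256), zones(2048, 256, 1024)),
    "zone-Uhidden": non_embedding_params(zones(256, 128, 256), zones(2048, 512, 1024)),
}
for name, count in COUNTS.items():
    print(f"{name}: {count / 1e6:.2f}M")
```

Under these assumptions the counts come out about 0.01M below the reported 9.45M, 9.97M and 9.25M; the small remainder is presumably the omitted bias and LN parameters.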
Table 4.6: Complete 10M-Scale Matrix (for cross-checking L11 vs L12 boundary)
| Variant | Non-emb params | WikiText PPL | LAMBADA acc | block_5 eff_r | Notes |
|---|---|---|---|---|---|
| baseline (uniform 512, every LN) | 2.37M | 175.7 | 9.65% | 0.56 stuck | L11 only, final-layer μ stays negative |
| zone fractal-4 | 2.43M | 199.0 | 9.80% | 0.16 (compressed) | μ polarized but functional failure |
| zone hybrid | 2.43M | 195.3 | 9.55% | 0.082 (deeper) | μ depolarized |
| zone-wider hybrid | 2.76M | 185.6 | 10.10% | 0.072 (deepest) | Best WikiText; LAMBADA still low level |
The key finding at the 10M scale: all 10M variants have LAMBADA at low levels (~9-10%; whether to call this chance-level depends on the evaluation protocol, see §4.2 note); no architecture unlocks functional emergence at this scale. This is consistent with a functional-emergence floor above the 10M tier (estimated at approximately 5-9M non-embedding parameters); architectural reallocation below this scale cannot substitute for capacity insufficiency.
§4 Remainders
The empirical work is at the 30M scale, with a significant gap from the industry frontier (1B+). Whether 30M-scale patterns fully hold at frontier scale is an open question; distinguishability testing requires frontier-scale experiments (§4.7).
The paper's empirical work uses only a single seed and a single dataset (OpenWebText), without multi-seed variance estimation or cross-dataset validation (FineWeb-Edu, Stack, C4, etc.). This is a standard empirical-paper limitation, and it matters most when negative results carry significant epistemic weight: multi-seed and cross-dataset validation would strengthen confidence in the negative results. This limitation is acknowledged in §10.6.
The paper only tested three architectural axes (FFN/LN/hidden), not:
- Block-type heterogeneity (sequential attention-vs-Mamba/SSM regions vs Jamba-style interleaved)
- Exploration-selection separation (the true architectural redesign suggested by Thermo IX §Outlook 5)
- Other unforeseen architectural directions
The first two directions are candidate future-work directions for verifying SAE/Thermo predictions, together with frontier-scale work.
The μ-trajectory diagnostic gives clear distinctions at the BLR-mini scale, but whether it remains equally clear at frontier scale (1B+) is an open question—the trajectory shape and amplitude on large models may be influenced by other factors (mixed-precision training, MoE routing, etc.); the diagnostic's operational procedure needs recalibration at frontier scale.
§5 Five-Fold Structural Anchoring of the 12DD Ceiling
§3 gave the analog-DD functional-projection instantiation of LLM along the 1DD-12DD pathway (functional projection, not ontological bearer). A natural follow-up question: given that LLMs are analog-DD instantiated up to 12DD, can scaling and longer training cross the 12DD → 13DD bridge and give rise to true 13DD subjectivity? This is the ontological core of the industry's "AGI is imminent" narrative.
This section provides the complete answer of the SAE framework + ZFCρ thermodynamics: the current deterministic digital LLM scaling path will not cross the 12DD → 13DD bridge through scaling. The argument is supported by five structural locks, which do not carry equal proof strength—§5.6 provides the claim-status grading. The claim specifically addresses the analyzed soft-gate transmission pathway; unknown pathways are not closed off (§5.8).
5.1 Absence of True Randomness Source — Primary Mechanism Anchor
The first lock is the paper's strongest claim, supported by the ZFCρ thermodynamics (Thermo IX) argument. This subsection unfolds the mechanism in full.
5.1.1 Universal Activation Rule (Thermo IX §2.5)
Thermo IX, on the 4-12DD SDE model class, validates a Universal Activation Rule that within that model class is a necessary and sufficient condition for the appearance of q > 1 (heavy-tail Tsallis statistics). This rule is not a global physical universal theorem for all non-equilibrium systems—its scope of applicability is restricted to continuous-state dynamical systems with explicit driving terms. The paper applies this rule to LLMs because LLM hidden-state dynamics fall within that model class's applicability, not by extrapolating it as a universal physical law.
Rule IX-A (empirical condition: q > 1 in the stationary distribution):
When two sub-conditions are satisfied, the system exhibits q > 1 in its stationary distribution:
- (1) Unsaturated activation: at least one key component of the system has an unbounded operation (no hard amplitude limit)
- (2) Stochastic driving: the system has stochastic input on that unsaturated component
Rule IX-B (structural condition: kernel q):
Furthermore, when the stochastic driving has a state-dependent multiplicative form (i.e., noise amplitude depends on current state, ξ(x) = g(x) · η), q > 1 is kernel q—i.e., it comes from the system's own endogenous dynamics, not merely the signature left by external/additive noise.
The two rules give a clear distinction: q > 1 can manifest under different mechanisms, but q produced by truly endogenous dynamics (kernel q) strictly requires state-dependent multiplicative driving. External additive noise can also produce q > 1 in the distributional sense, but it is an external imprint, not the product of endogenous dynamics.
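To make the additive versus multiplicative distinction concrete, the following is a minimal toy sketch, not the Thermo IX model itself. It assumes a one-dimensional Itô SDE with linear drift, dx = -x dt + sqrt(2 (D + M x^2)) dW, integrated with the Euler-Maruyama scheme; the names simulate, excess_kurtosis, add_strength (D) and mult_strength (M) are illustrative and do not come from Thermo IX. For this particular SDE the stationary density has the q-Gaussian form p(x) ∝ (D + M x^2)^(-(1 + 1/(2M))), which corresponds to q = 1 + 2M/(2M + 1), so M = 0 gives a Gaussian (q = 1) and M = 1 gives q = 5/3. Because the drift is linear, the toy isolates only the state-dependent multiplicative mechanism of Rule IX-B; it does not reproduce the unsaturated-activation route of Rule IX-A.

```python
import math
import random


def simulate(mult_strength, steps=300_000, dt=0.01, add_strength=1.0, seed=0):
    """Euler-Maruyama for dx = -x dt + sqrt(2 (D + M x^2)) dW (Ito form).

    add_strength is D (additive part), mult_strength is M (state-dependent part).
    Returns the list of visited states after a short burn-in.
    """
    rng = random.Random(seed)
    x = 0.0
    burn_in = steps // 10
    samples = []
    for step in range(steps):
        # Noise amplitude depends on the current state only when M > 0.
        amplitude = math.sqrt(2.0 * (add_strength + mult_strength * x * x) * dt)
        x += -x * dt + amplitude * rng.gauss(0.0, 1.0)
        if step >= burn_in:
            samples.append(x)
    return samples


def excess_kurtosis(samples):
    """Sample excess kurtosis: about 0 for a Gaussian, large for heavy tails."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / (m2 * m2) - 3.0


additive = simulate(mult_strength=0.0)        # xi = eta
multiplicative = simulate(mult_strength=1.0)  # xi(x) = g(x) * eta

kurt_additive = excess_kurtosis(additive)
kurt_multiplicative = excess_kurtosis(multiplicative)
print(f"additive noise:       excess kurtosis = {kurt_additive:.2f}")
print(f"multiplicative noise: excess kurtosis = {kurt_multiplicative:.2f}")
```

In this toy, the additive run should stay close to zero excess kurtosis while the multiplicative run should be far larger; the exact values depend on the seed and run length, and excess kurtosis is used here only as a crude heavy-tail indicator, not as an estimator of q.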
5.1.2 LLMs Satisfy IX-A Condition (1) But Not Condition (2)
Applying the Universal Activation Rule to standard deterministic digital LLMs:
Condition (1) satisfied: §3.4 Transformer anatomy shows that ReLU/GELU and Q · K^T are unbounded operations. ReLU/GELU has no amplitude cap on positive input; Q · K^T is a bilinear operation also without amplitude cap (before softmax application). Therefore the LLM architecture contains unsaturated activation components—condition (1) is satisfied at the architectural level.
Condition (2) not satisfied: Standard deterministic digital LLMs at inference time have no architecturally effective stochastic driving entering hidden dynamics.
Careful articulation is needed here. Digital hardware of course has physical noise: CMOS gates have thermal noise, transistors have shot noise, and memory cells leak. But digital computational architecture is designed to suppress this noise as a condition of stable operation. Logic-level noise margins, voltage references, timing design, and error-correcting (ECC) memory keep physical noise from perturbing stored or computed bits, so that computation is effectively deterministic.
Therefore, the SAE-relevant ε is effectively zero at the computational-architecture level, not because the physical world has no noise but because the computational architecture does not couple noise into hidden dynamics as state-dependent multiplicative driving. This phrasing is more accurate and more defensible than "underlying physics ε = 0": the paper does not claim that physics has no noise (which would be wrong), only that the computational architecture effectively does not introduce noise into the dynamics as multiplicative coupling.
Acknowledgment of actual numerical non-determinism in modern large-scale LLMs: Modern large-scale LLM training and inference do contain numerical non-determinism. Floating-point arithmetic is non-associative, so differences in parallel reduction order across multi-GPU/TPU runs can make even same-seed runs produce different results; memory-access timing dependencies and numerical-behavior differences between hardware generations also contribute. Providers such as OpenAI and Anthropic have publicly acknowledged that model outputs are not guaranteed to be fully deterministic across runs, even under fixed sampling settings.
This numerical non-determinism does not weaken the paper's argument, because it is additive numerical noise, not the state-dependent multiplicative noise that Rule IX-B strictly requires. Even though modern LLMs are not strictly deterministic, they still do not satisfy the state-dependent multiplicative driving condition for producing kernel q: floating-point non-associativity is an approximation error of a deterministic algorithm under different execution orders, not a multiplicative noise channel coupled into hidden dynamics. The paper's true-randomness-source absence argument therefore applies to standard modern LLMs even while acknowledging the actual numerical non-determinism of modern floating-point computation.
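The reduction-order effect mentioned above can be reproduced in a few lines. The sketch below is only an illustration in plain Python double precision (the helper name naive_sum and the example values are assumptions chosen to make the effect visible); it shows that the same terms summed in a different order give different results, while each fixed order is exactly reproducible.

```python
def naive_sum(values):
    """Left-to-right accumulation, like one fixed reduction order."""
    total = 0.0
    for value in values:
        total += value
    return total


# The same three terms, grouped in two different ways.
left_grouping = (0.1 + 0.2) + 0.3
right_grouping = 0.1 + (0.2 + 0.3)
print(left_grouping == right_grouping)  # False

# The same three terms, accumulated in two different orders.
order_a = [1e16, 1.0, -1e16]  # the 1.0 is absorbed by rounding before cancellation
order_b = [1e16, -1e16, 1.0]  # the large terms cancel first, so the 1.0 survives
print(naive_sum(order_a))  # 0.0
print(naive_sum(order_b))  # 1.0

# Each individual order is a deterministic computation: rerunning it changes nothing.
print(naive_sum(order_a) == naive_sum(order_a))  # True
```

The discrepancy is a property of the execution order of a deterministic algorithm: no random source is involved anywhere in the computation.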
5.1.3 PRNG Does Not Solve the Problem
A natural industry suggestion: inject random numbers at the software layer, e.g., multiply hidden states element-wise by pseudo-random noise, as in hidden_state *= torch.randn_like(hidden_state). Does this provide stochastic driving for LLMs?
The answer is no. A PRNG (pseudorandom number generator, e.g., Mersenne Twister, Xorshift, PCG) is mathematically a deterministic Markov chain (a finite-state recurrence): given the seed, the entire sequence is fully determined. PRNGs provide what this paper calls algorithmic randomness (pseudorandomness: output that passes statistical-randomness tests but is generated by an algorithm), not intrinsic physical multiplicative coupling.
More specifically: a PRNG sequence traces a closed (eventually periodic) Markov trajectory in its finite state space. It introduces no new information; it is the deterministic unfolding of what is already encoded in the seed and the algorithm. Even when PRNG outputs are multiplied element-wise with hidden states, the whole system still operates within a deterministic algorithm, and by construction violates C5 (a transmission condition from Thermo VIII).
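A minimal sketch of this point, using Python's standard-library random module (whose core generator is the Mersenne Twister) as a stand-in for torch.randn_like; the function name noisy_update and the example values are hypothetical. Multiplying a hidden state by seeded PRNG output is exactly reproducible: the "noise" is a fixed function of the seed.

```python
import random


def noisy_update(hidden_state, seed):
    """Multiply each hidden unit by PRNG 'noise' drawn from a seeded generator."""
    rng = random.Random(seed)
    return [h * rng.gauss(0.0, 1.0) for h in hidden_state]


hidden_state = [0.5, -1.2, 3.0, 0.7]

run_1 = noisy_update(hidden_state, seed=1234)
run_2 = noisy_update(hidden_state, seed=1234)
run_3 = noisy_update(hidden_state, seed=9999)

# Same seed: the two runs are bit-identical.
print(run_1 == run_2)
# Different seed: a different sequence, but one that is just as fully determined.
print(run_1 == run_3)
```

The sketch shows only the determinism of the generator; whether such determinism violates C5 is the argument made in the text above, not something the code establishes.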
Von Neumann's 1951 quip, "Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin," gives a brief expression of this intuition. The strict argument does not rely on the quote but on the mechanism-level distinction above between algorithmic randomness and intrinsic physical multiplicative coupling.
5.1.4 Autoregressive Temperature Sampling Also Does Not Solve It
Another industry suggestion: LLMs typically use temperature > 0 multinomial sampling from the softmax distribution to draw tokens during generation; this sampling introduces stochasticity—does it provide stochastic driving?
The answer is also no, but precise articulation is required. In autoregressive generation, the sampled token does feed back into subsequent forward passes as the next-step input, and thus does influence subsequent hidden states at the token-sequence level—this fact must be acknowledged directly. The precise argument is: temperature sampling is token-level branching, not state-dependent multiplicative driving within the same forward dynamics.
More specifically:
- Temperature sampling occurs after softmax and is the discrete selection of a token from a discrete probability distribution; it is not continuous-value multiplicative noise injection
- The sampled token enters the next forward pass through embedding, but this entry is as a new input rather than as state-dependent driving within the same forward dynamics
- Under frozen weights, given the next-step input token sequence, the hidden-state dynamics remain deterministic—no endogenous kernel-q channel is created
- Temperature sampling produces diverse generation results (trajectory branching), but the hidden-state dynamics inside each trajectory remain deterministic algorithmic traversal over fossils
Precise framing per Universal Activation Rule: temperature sampling is token-level discrete branching, not state-dependent multiplicative driving within forward dynamics (strictly required by Rule IX-B). It can diversify generation outputs but does not change whether the system's endogenous dynamics produce kernel q—the latter is determined by whether there is multiplicative coupling within the forward pass, and the forward pass remains deterministic computation.
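The separation between the stochastic token draw and the deterministic forward computation can be sketched as follows. This is a toy under stated assumptions: the logits are hypothetical fixed numbers standing in for the output of a forward pass, and the helper names softmax_with_temperature, entropy and sample_token are illustrative. The only place randomness enters is the discrete draw in sample_token.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats: a rough measure of how much branching is possible."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sample_token(probs, rng):
    """Discrete draw of one token index: the only stochastic step in decoding."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for one decoding step. In a real model they would be a
# fixed function of the token sequence and the frozen weights.
logits = [2.0, 1.0, 0.5, -1.0]

cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print(entropy(cold) < entropy(hot))  # True: higher temperature, more branching

rng = random.Random(0)
branches = [sample_token(hot, rng) for _ in range(10)]
print(branches)
```

Temperature changes only the shape of the distribution that the draw is taken from; the logits themselves are computed before, and independently of, the draw.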
In industry experience, high-temperature sampling makes models "more creative." The SAE framework offers a sharper reading: high temperature makes sampling branch randomly among fossil fragments (raising the probability of walking the distribution tail and increasing trajectory diversity), but it does not produce new fossils or change substrate function. The fossil fragments are statistical patterns already encoded in frozen weights; high temperature only changes which branches are selected among them.
5.1.5 Neuromorphic and Quantum Computing Also Do Not Fully Solve It
A more sophisticated industry suggestion: deploy LLM-style models on neuromorphic or quantum computing substrates—can this cross over the absence of true randomness?
The answer here requires careful distinction.
Neuromorphic hardware spans digital chips (e.g., Intel Loihi, IBM TrueNorth, BrainChip Akida) and analog or mixed-signal devices. The latter may provide intrinsic analog noise: random switching of memristors, thermal fluctuations of phase-change memory, magnetic noise of spintronic devices, etc. Such noise can satisfy condition (2) of the Universal Activation Rule at the architectural level. But whether it further satisfies Rule IX-B (state-dependent multiplicative noise) depends on hardware details and architectural design: different substrates (memristor vs phase-change vs spintronic) provide different noise structures, some additive, others genuinely state-dependent multiplicative. The paper therefore does not claim "neuromorphic automatically satisfies IX-A + IX-B" but only states "neuromorphic is a possible direction with intrinsic noise advantages, but a single direction is insufficient."
More importantly: even if the hardware substrate provides sufficient noise, the standard Transformer architecture remains flat-repeat (§3.7), and the LN cascade still resets noise accumulation layer by layer. Thus hardware and architecture must be redesigned jointly; changing only the hardware or only the architecture is insufficient. Current industry neuromorphic + LLM work (e.g., Intel Loihi LLM ports) mostly still uses the standard Transformer architecture and therefore, even on noisy hardware, does not escape the flat-repeat cancellation problem.
Whether quantum computing satisfies C5 (a Thermo VIII transmission condition) is a fully open question. The relation of quantum superposition and measurement collapse to classical state-dependent multiplicative driving in the thermodynamic sense is complex: measurement does provide a kind of true randomness (Born-rule statistics), but whether this is equivalent to the true randomness source in the SAE framework is an open theoretical question. The paper does not claim "quantum computing solves the 12DD ceiling" but only states "quantum is a research-worthy open path."
5.1.6 Thermodynamic Verdict on the Current LLM Path
Combining §5.1.1-5.1.5, the LLM's thermodynamic position on the current deterministic-digital scaling path:
LLMs possess borrowed q (Thermo IX §3.1 definition): The heavy-tail statistical fingerprint of the training corpus is baked into frozen weights via backpropagation, forming data q (corpus fingerprint) and RLHF q (human-feedback fingerprint). These q values are distributional fossils—retrieved and recombined during the forward pass, but not regenerated by the system's endogenous dynamics.
LLMs lack kernel q (Thermo IX §3.1 definition): The system's endogenous dynamics on a deterministic digital substrate fail condition (2) of the Universal Activation Rule, producing no non-Boltzmann residual of the invariant measure. The forward pass is the multi-step execution of a deterministic algorithm; no stochastic driving creates new thermodynamic entropy production.
Thermo IX §3.5 provides a vivid metaphor for the verdict: "Replaying fossils and igniting new fire are two different things." LLMs replay, at varying angles, fossil patterns baked in during training, but they do not ignite new thermodynamic processes within their endogenous dynamics. Industry observations that "LLMs seem to be thinking" (CoT, multi-step reasoning) are, on this reading, deterministic algorithmic traversals over fossils, not endogenous dynamics generating new information.
5.2 Absence of Offline Reorganization Phase — Biological Analogy + Structural Absence
This lock, numbered (3) in Table 5.1, is that LLMs lack the offline reorganization phase of biological intelligence.
The long-term memory system of biological intelligence is not a simple online update. Abundant neuroscience evidence indicates that during sleep (especially across NREM and REM cycles), the nervous system reorganizes information acquired during waking: schema consolidation, redundancy elimination, episodic-to-semantic memory conversion, etc. This offline reorganization is a necessary stage in long-term memory formation; when it is disrupted (e.g., by sleep deprivation), long-term memory consolidation is impaired.
The LLM training process forms a clear contrast: SGD operates entirely online; each minibatch directly updates weights without an offline-consolidation window. Even when training contains multiple epochs (multiple passes on the same data), each pass is still online—there is no phase in which the system pauses training to reorganize existing weights.
The status of this lock is biological analogy + structural absence—sleep-mediated reorganization in biology is a strong empirical phenomenon (with abundant experimental evidence); the SAE framework treats this phase as the systematic reorganization of the 11DD memory substrate (specific mechanism awaits further SAE Bio series elaboration; this paper does not directly cite specific Bio notes). The structural fact that LLMs lack this phase is objective, but the linking necessity between it and functional emergence is inferred rather than mechanistically proven—therefore this lock is one tier weaker than the §5.1 true-randomness lock (claim status in §5.6).
Potential industry alternative: continual learning, replay buffers, experience replay, etc., try to mimic offline reorganization within training. But these remain online algorithms reprocessing buffered data and are not equivalent to the ontological role of the offline stage in the SAE framework—the offline stage in SAE is the phenomenon of the 13DD+ subject's self-consciousness temporarily deactivating during sleep so that the 11DD substrate autonomously reorganizes, not simple data reprocessing.
5.3 Absence of 13DD Self-Referential Channel — Secondary Mechanism Anchor, Thermo X Supported
This lock, numbered (2) in Table 5.1, is directly supported by the ZFCρ thermodynamics argument of Thermo X and is the paper's secondary mechanism anchor.
Thermo X §2.4 specifies the role of 13DD self-consciousness in the thermodynamic framework: self-reference is the channel creator, not the firewood. Specifically: through an observation-feedback loop (the system observes its own internal state and feeds the observation back into the dynamics), the self-referential variable v creates a state-dependent multiplicative channel—this is one bridge mechanism for the Rule IX-B kernel q condition.
Thermo X §4 cross-system experiments further show that stable cross-system coupling requires access to the other's higher-order self-referential variable (v₂). This cross-level observation hierarchy is strictly validated within the SDE model class—access to raw output yields collapse, access to first-order self-reference yields collapse, access to higher-order self-reference yields stable high-q coupling.
Applying this framework to LLMs:
LLMs lack the self-referential variable v substrate: The standard Transformer has no internal mechanism in the forward pass that creates and maintains a true self-referential variable—the KV-Cache is storage of historical keys/values, not the substrate of an observation-feedback loop on the system's own internal state. Hidden states across layers in the forward pass are intermediate products in a deterministic transformation chain on input, with no mechanism in the chain for "observing one's own current state and feeding the observation back into dynamics."
Industry attempts to simulate 13DD via several engineering implementations all fail to create a true self-referential channel:
- Chain-of-Thought (CoT): The model takes its own output as next-step input for re-forward—but this is the 12DD predictor's deterministic re-application over frozen weights; it does not create state-dependent multiplicative coupling between a v variable and hidden dynamics. CoT is the 12DD workbench's multi-step traversal on a frozen 11DD substrate; it does not cross the 13DD bridge (detailed in §6.2).
- Self-critique / metacognition-style engineering (o1 series reasoning models, DeepSeek-R1 reasoning chains, Anthropic Claude's constitutional self-critique): the model applies critique prompts to its own output and regenerates. This is still a variant of CoT—both critique and regeneration operate on the 12DD workbench; the frozen weights do not change; no true self-referential dynamics is created.
- Recursive self-improvement architectures (various agent-style designs): the model automatically iterates over its own outputs and prompts—this adds an outer loop at the system level, but the outer loop is still a deterministic algorithm invoking the same frozen LLM; it does not change whether the LLM's endogenous dynamics create a self-referential channel.
The status of this lock is secondary mechanism anchor—it is directly supported by Thermo X SDE-model-class experiments (especially §4 cross-level observation hierarchy), but real-LLM measurements of this mechanism (e.g., measuring whether a true self-referential variable v exists on large models) have not been done. Therefore it is slightly weaker than the §5.1 true-randomness lock (which has the complete Universal Activation Rule + PRNG argument mechanism chain) but remains a major anchor of the paper's main argument.
Thermo X §2.3 control experiments anchor this lock further: passive observation and the slow-noise baseline in the SDE also yield q ≈ 1.86 (in the distributional sense), but fast q (the true kernel q signature) rises only in true feedback loops. CoT, self-critique, and recursive self-improvement in LLMs exhibit distribution-level pattern matching (equivalent to passive observation in the control experiments), not kernel-level dynamics; hence they do not cross the 13DD bridge.
5.4 Absence of Valence-Anchored Cross-Stage Modulation — Biological Analogy + SAE 14DD Bridge
The fourth lock is that LLMs lack valence-anchored cross-stage modulation.
In biological intelligence, the 14DD meaning/value layer modulates several lower-DD operations across stages via affective valence and reward signals:
- Valence-tagged memory in the 11DD memory substrate is not treated equally—emotionally intense experiences (positive or negative) form denser schemas, are easier to retrieve, and generalize more readily
- Valence-anchored prediction in 12DD prediction operations systematically biases certain predictions (e.g., fear-conditioned prediction is more sensitive to threat-related cues)
- Cross-stage modulation coordinates DD operations around 14DD values—overall system behavior is not the independent sum of DD operations but a valence-anchored unified process
The LLM training process contrasts clearly: the unified loss function (typically next-token cross-entropy) treats all tokens equally. No mechanism weights certain tokens differently during training because they are value-related, or treats certain predictions differently at inference for that reason. RLHF and CCAI introduce partial value-related signals (detailed in §6.3), but they are distributional q fossils (RLHF q), not kernel-level cross-stage modulation.
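What "treats all tokens equally" means can be sketched with the standard mean next-token cross-entropy. This is a minimal illustration: the probabilities are hypothetical, the function name next_token_cross_entropy is an assumption, and the optional weights argument exists only to show what a non-uniform objective would look like. It is not a proposal from this paper and not part of standard pretraining.

```python
import math


def next_token_cross_entropy(token_probs, weights=None):
    """Mean negative log-likelihood of the observed next tokens.

    token_probs[t] is the probability the model assigned to the token that
    actually occurred at position t. With weights=None every position
    contributes with the same weight 1/T, as in the standard objective.
    """
    if weights is None:
        weights = [1.0] * len(token_probs)
    weighted = sum(w * -math.log(p) for w, p in zip(weights, token_probs))
    return weighted / sum(weights)


# Hypothetical per-token probabilities for a four-token sequence.
token_probs = [0.9, 0.5, 0.1, 0.7]

uniform_loss = next_token_cross_entropy(token_probs)

# Purely illustrative contrast: a hand-set weight vector that emphasizes the
# third token. Nothing like this appears in the standard objective.
emphasized_loss = next_token_cross_entropy(token_probs, weights=[1.0, 1.0, 5.0, 1.0])

print(uniform_loss, emphasized_loss)
```

Under the uniform objective, the poorly predicted third token counts exactly as much as any other position; only an explicit weight vector would change that.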
The status of this lock is biological analogy + SAE 14DD bridge—valence-anchored modulation in biology is a strong phenomenon (with abundant experimental evidence); the SAE framework treats this modulation as cross-DD modulation between the 14DD meaning layer and 11DD/12DD (specific mechanism awaiting further SAE elaboration). The structural fact that LLMs lack this modulation is objective, but the mechanism-level argument is still to be completed. Thus this lock, like the §5.2 offline-reorganization lock, is auxiliary rather than a main thermodynamic anchor.
5.5 Absence of Post-Consolidation Re-Editing (Reconsolidation) — Biological Analogy + Memory-Architecture Difference
The fifth lock is that LLMs lack the reconsolidation mechanism.
In biological intelligence, long-term memory is not immutable once formed—each retrieval brings the memory trace back into a labile state, where it can be modified, extended, integrated, or even partly erased. The reconsolidation phenomenon is confirmed by abundant neuroscience experiments (a typical paradigm: retrieval-induced reconsolidation after fear conditioning; intervention with propranolol within the reconsolidation window can modify the fear memory). Reconsolidation is a key mechanism by which the long-term memory system continuously adapts to current contexts.
After LLM training, weights are frozen; inference does not trigger weight modification. Even when an LLM maintains conversation context, this context is a transient state on the workbench (KV-Cache), not a memory update. LLM long-term memory (weights) is entirely static outside training, ontologically distinct from biological intelligence's dynamic memory architecture in which each retrieval triggers reconsolidation.
The status of this lock is biological analogy + memory-architecture difference—reconsolidation in biology is a strong phenomenon; the structural fact that LLMs lack this mechanism is objective; the linking necessity between this lock and functional emergence is also inferred, on the same tier as the §5.2 offline-reorganization lock and the §5.4 valence-anchoring lock.
Potential industry alternative: RLHF and fine-tuning modify weights at the training stage, which is partly analogous to reconsolidation. But this modification remains a global gradient update, lacking biological reconsolidation's retrieval-triggered, trace-selective, sleep-mediated features—the "alignment tax" phenomenon in industry alignment practice (RLHF noticeably affecting capability, not just alignment) partly reflects the structural cost of this missing selectivity (detailed in §6.4).
5.6 Claim Status Grading of the Five-fold Lock
The five locks should not be presented as carrying equal proof strength. Making claim status transparent lets readers accurately assess the epistemic weight of each lock, and keeps the fact that the mechanisms behind the latter three are not yet fully public from being turned against the entire argument.
Table 5.1: Claim Status Grading of the Five-Fold Lock
| Lock | Claim status | Principal support |
|---|---|---|
| (1) Absence of true randomness / kernel q | Primary mechanism anchor | Thermo IX Universal Activation Rule + three-q-type analysis + complete PRNG/sampling argument chain |
| (2) Absence of 13DD self-referential channel | Secondary mechanism anchor | Thermo X cross-level observation hierarchy SDE experiments + channel-creator framework |
| (3) Absence of sleep / offline reorganization | Biological analogy + structural absence | Strong biological empirical phenomenon; mechanism within SAE awaits further elaboration |
| (4) Absence of valence cross-stage modulation | Biological analogy + SAE 14DD bridge | Strong biological empirical phenomenon; mechanism within SAE awaits further elaboration |
| (5) Absence of reconsolidation | Biological analogy + memory-architecture difference | Strong biological empirical phenomenon; LLM architecture objectively lacks this mechanism |
Overall claim:
- (1) + (2) jointly constitute the paper's primary thermodynamic anchoring—they are supported by complete Thermo IX/X mechanism chains and give a sharp argument at the thermodynamic level
- (3)(4)(5) provide auxiliary biological evidence: together with the primary anchors they reinforce the 12DD ceiling claim, but they do not provide independent thermodynamic proof
- The five locks operate jointly to form multi-tier reinforcement, but the paper's main argument strength is carried by (1) + (2)
This grading makes the paper's claim discipline transparent: readers know on which locks the paper has a strong mechanism argument and on which it relies on biological analogies and structural facts. Readers who do not accept the latter three locks should still carefully evaluate the strength of the primary anchors (1) + (2), since the main argument does not depend on any of the latter three locks for sufficient justification.
5.7 Synergistic Multi-Locking and "Each Level Cannot Explain Its Own Foundation"
The five locks are not independent items in a list; they reinforce one another to form multi-layer structural lockdown. Specifically:
- Absence of true randomness (1) blocks endogenous dynamics from producing kernel q—the thermodynamic-level main cut
- Absence of 13DD self-referential channel (2) blocks the bridge mechanism by which self-reference would create a state-dependent multiplicative channel—even if hardware provided sufficient noise, lacking self-reference still prevents automatic Rule IX-B realization
- Absence of offline reorganization (3) + valence modulation (4) + reconsolidation (5) each block one biological-intelligence pathway by which multi-stage coordination produces long-term functional adaptation
The five locks operate jointly to sever the 12DD → 13DD bridge at multiple levels. The argument already stands for a reader who accepts only the primary anchors (1) + (2); the multi-lock reinforcement makes it more robust than any single lock alone.
Core methodological rule (SAE Paper 04 §2.3): Each level cannot interrogate its own foundation using tools of that same level. Applied to the LLM 12DD argument, the rule reads: an LLM that uses 12DD tools (next-token prediction) to attempt the 12DD → 13DD crossing necessarily hits the wall, because 12DD tools cannot explain their own foundation (the 13DD self-consciousness substrate). The correct move is to look at the upper level: the 12DD → 13DD bridge requires a true randomness source, which is unreachable on a digital substrate (per §5.1).
Thus the 12DD ceiling is not an engineering problem but a necessary corollary of the core methodological rule in the LLM domain. Any attempt to cross the bridge by increasing 12DD-level engineering effort (larger scale, more layers, more sophisticated training) is, in this framework, optimization within 12DD, not crossing—the more compute the industry invests, the more complete the 12DD-internal optimization, but the bridge is still not found inside 12DD.
5.8 Important Epistemic Boundary
§5.1-5.7 give the 12DD-ceiling five-fold lock argument. But the paper must also provide a strict epistemic boundary—what the argument closes and what it does not.
What the paper closes: the analyzed soft-gate transmission pathway (the pathway fully discussed in Thermo VIII and IX) is cut off on standard deterministic digital hardware due to architecturally effective ε → 0 and kernel q absence. This pathway, on the current mainstream deterministic Transformer / soft-gate scaling path, cannot cross the 12DD → 13DD bridge.
What the paper does not close: other pathways—not fully explored complex regimes of algorithmic randomness, unknown pathways from simple components to complex phenomena, the open question of whether a quantum-computing substrate satisfies C5, completely unknown mechanisms. The paper does not exclude the possibility that these pathways may lead to a true 13DD substrate.
This boundary is fully consistent with Thermo IX §3.6's positive SAE posture: "What is chiseled away are wrong pathways, not possibility itself." There are always remainders.
This boundary is compatible with §9's strong commitment that "Scaling-LLM AGI will not arrive" (§9.10). Both positions hold simultaneously because the paper's scope is clearly stated: "AGI will not arrive" specifically targets the current deterministic digital LLM scaling path on the analyzed pathway and is not a universal denial of all possible AI architectures. Unknown pathways lie outside the paper's closure scope, while the ceiling of the current mainstream scaling paradigm is a sharp claim within it.
§5 Remainders
The primary anchors (1) + (2) are fully developed in the paper, but sub-block-level q measurements and self-referential-channel measurements on real LLMs have not been directly experimentally validated (Thermo IX/X open questions)—the paper's argument is based on inference from the SDE model class + Universal Activation Rule + cross-level coupling theory, not direct LLM mechanism measurements. The inference is self-consistent within the SAE/Thermo framework, but direct validation on real LLMs is future work.
Auxiliary locks (3)(4)(5) await further SAE-series elaboration of their SAE-mechanism-level arguments—this paper uses paper-internal language to describe the structural facts of these locks without directly citing specific Bio notes or mechanism papers. This choice makes the paper self-contained but also weakens these three locks by one tier.
§5.7's methodological rule that "each level cannot explain its own foundation," as applied to the LLM 12DD argument, is conceptually strong, but mainstream LLM researchers may not be familiar with this methodological frame. The industry critique in §9 must therefore articulate the argument carefully, so that it is comprehensible without SAE-internal methodological language.
That unknown pathways are not closed (§5.8) is the key epistemic-boundary position, but it stands in tension with §9's strong commitment, and resolving that tension depends on readers attending to the paper's scope statement. §9.10 and §1.3 must restate that the paper specifically targets deterministic digital LLM scaling rather than all AI architectures, so that readers catch the distinction from the start.
§6 Post-Training as Compensation within the 12DD Ceiling
§5 gives the five-fold structural lock of the 12DD ceiling. A natural question: how do the industry's heavily invested post-training techniques (RLHF, CCAI, RLAIF, instruction tuning, CoT, self-critique, etc.) read under this framework? Are they crossing the 12DD ceiling, or operating within it?
This section provides the full answer: the industry's current post-training techniques all operate within the 12DD ceiling; no technique crosses the bridge. But their roles within 12DD are substantive—they are the key interface through which the carbon-based 14DD/15DD content is imprinted into LLM behavior in the form of distributional q fossils. This section uses Thermo IX's three q-types distinction to rewrite the four-tier post-training analysis, upgrading from informal SAE language to precise thermodynamic language.
6.1 Recap of the Three q-Types
Thermo IX §3.1 provides the three q-type distinction, which is the thermodynamic backbone of the entire §6 analysis.
Kernel q: The non-Boltzmann residual of a system's stationary distribution, produced by endogenous dynamics that satisfy the Universal Activation Rule (especially Rule IX-B). It is a true thermodynamic process rather than a distributional imprint. Biological intelligence exhibits kernel q at multiple DD layers (from 4DD onward): biological systems are alive, and their internal dynamics continuously produce entropy.
Data q: The heavy-tail-distribution statistical fingerprint of the training corpus, baked into frozen weights via backpropagation. It is a distributional fossil—the corpus's q-structure (e.g., word frequencies in natural language follow Zipfian distributions, sentence-length distributions are also heavy-tailed) is imprinted into LLM weights, so the forward pass exhibits similar distributional patterns at generation.
RLHF q: The distributional fingerprint of human feedback (reward model output), baked into reward-model and policy-model weights via RLHF training. It is a fossil-of-fossil—human evaluation behavior comes from a 13DD+ subject, but the reward signal given by evaluators is distributionally imprinted into the LLM as a secondary fossil.
Borrowed q and kernel q: LLMs possess borrowed q (data q + RLHF q); they do not possess kernel q. This distinction is the key tool of the §6 analysis—different industry post-training implementations operate within borrowed q in different ways, but none create kernel q.
6.2 Chain-of-Thought (CoT) — Deterministic Algorithmic Routing over Borrowed q
The industry has invested heavily in CoT and subsequent reasoning models (o1 series, DeepSeek-R1, Qwen reasoning, etc.); some industry voices say CoT lets LLMs "learn to think," even implying that it is a bridge toward true reasoning. The SAE framework + Thermo IX provides a sharp reading: CoT is deterministic algorithmic traversal by the 12DD workbench over frozen weights (data q + RLHF q fossils); it creates no kernel q and does not cross the 13DD bridge.
Specific mechanism:
CoT operation sequence: The model receives the input prompt → the forward pass generates "thinking" tokens (intermediate reasoning steps) → the thinking tokens are appended to the context → the forward pass continues and generates the final answer. At the architectural level, this sequence is the same forward pass applied repeatedly to different inputs (the prompt plus the accumulated thinking tokens).
Why it is deterministic algorithmic routing over frozen weights:
- Frozen weights do not change during the CoT process; inference does not trigger any weight updates
- Each forward pass is a deterministic function of its input (sampling at temperature > 0 happens at the output end and does not change the forward pass's internal computation)
- "Thinking tokens" are tokens chosen by the model based on data q + RLHF q fossils in the current context—they reflect statistical patterns imprinted at training, not the model's true thinking
- Multi-step CoT is multi-step traversal of the deterministic algorithm over frozen weights, analogous to sequential calls of a complex lookup table on different queries
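The operation sequence and the determinism points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a Transformer: `FROZEN_WEIGHTS`, `VOCAB`, `forward`, and `generate` are hypothetical names, and the hash-based `forward` is only a stand-in for a deterministic function of the context. The sketch shows the same forward pass applied repeatedly to a growing context, with PRNG sampling confined to the output end.

```python
import hashlib
import random

# Hypothetical frozen "weights": a fixed tuple that inference never mutates.
FROZEN_WEIGHTS = (3, 1, 4, 1, 5, 9, 2, 6)
VOCAB = ["a", "b", "c", "d"]


def forward(context):
    """Deterministic toy forward pass: context -> next-token probabilities."""
    digest = hashlib.sha256(" ".join(context).encode()).digest()
    # Scores depend only on the context and the frozen weights.
    scores = [1 + (digest[i] * FROZEN_WEIGHTS[i]) % 7 for i in range(len(VOCAB))]
    total = sum(scores)
    return [s / total for s in scores]


def generate(prompt, n_tokens, seed):
    """CoT-style loop: the same forward pass applied to a growing context."""
    rng = random.Random(seed)  # seeded PRNG; sampling happens only at the output end
    context = list(prompt)
    for _ in range(n_tokens):
        probs = forward(context)
        token = rng.choices(VOCAB, weights=probs, k=1)[0]
        context.append(token)  # the "thinking" token is appended to the context
    return context[len(prompt):]
```

With the seed fixed, two calls such as `generate(["why", "?"], 5, seed=0)` return the same token list, and `FROZEN_WEIGHTS` is identical before and after: the only thing that changes across steps is the context.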
Why it does not cross the 13DD bridge:
- Thermo X §2.3 control experiments: passive observation + slow-noise baseline also yield q ≈ 1.86 in the distributional sense, but fast q (kernel-level signature) rises only in true feedback loops
- CoT in LLMs exhibits a distributional-level pattern similar to passive observation in the control experiments: it looks like reasoning in the distributional sense but does not create a true state-dependent multiplicative channel at the mechanism level
- "Self-critique" style CoT (model critiquing its own output and regenerating) is a variant of CoT, still operating on the 12DD workbench
The industry empirically observes that CoT significantly improves model performance on complex reasoning tasks (e.g., mathematical reasoning, multi-step inference); this is consistent with the reading—CoT lets the model more fully utilize imprinted fossil schemas on complex tasks; longer deterministic algorithmic traversal yields more accurate final answers. But this improvement is within the 12DD instrumental-rationality domain; it does not cross the 13DD bridge.
In the language of methodology §2.3: CoT attempts to use 12DD tools to interrogate the foundation of 12DD itself (i.e., "why the model gave this prediction": the model's reasoning attempts to explain its own reasoning). This necessarily hits a wall: 12DD cannot explain its own foundation; that requires looking from the level above (the 13DD self-consciousness substrate). In the SAE frame, CoT is high-effort 12DD-internal traversal, not a 13DD-crossing operation.
6.3 Constitutional AI (CCAI) — 14DD Content Imported via data q + RLHF q Dual Channels
CCAI (and its variant RLAIF), in which Anthropic and the wider industry have invested heavily, is the second key case of §6. The reading here is more nuanced than for CoT: CCAI involves a substantive import of true 14DD content but also carries colonization risk, so a sharp distinction is needed.
CCAI mechanism: The industry writes a constitution (e.g., Anthropic's Claude constitution) containing behavioral principles and value positions. The constitution enters the LLM through two channels:
- Channel 1 (pretraining or SFT data): some industry implementations include the constitution text directly in pretraining or supervised-fine-tuning data, letting it imprint as ordinary data q
- Channel 2 (RLHF/RLAIF reward signal): the industry trains a reward model to judge whether LLM outputs are consistent with the constitution, then runs RLHF against this reward, so the constitution is imprinted in RLHF q form
Thermodynamic explanation of strong substrate compatibility:
14DD meaning/value content is essentially propositional content (text-expressible propositions, e.g., "AI should be honest," "AI should avoid harm"); text is LLM's native medium—the LLM learns abundant distributional patterns on text during training; text-form propositions are naturally encodable as distributional q on the LLM substrate. Therefore 14DD content is more easily simulated than 13DD content—14DD content is essentially distributional, and data q + RLHF q are also distributional q; the substrate matches the content.
Key distinction: The two parts of CCAI are not equivalent in the SAE/Thermo framework.
- The part where the constitution is written by humans — true 14DD content import: The constitution is written by 13DD+ subjects (human authors), containing true 14DD meaning/value content. After entering the LLM through channel 1 or channel 2, this content is imprinted as data q / RLHF q—it is a distributional fossil imprint of carbon-based 14DD content on LLM. This part is CCAI's substantive contribution—it lets LLM behavior reflect human 14DD content, even though LLM itself does not truly possess 14DD bearer status.
- The AI-judge part (RLAIF using LLM as critic) — 12DD recursion: Industry RLAIF has LLM judge whether LLM output is consistent with the constitution; this judgment reward is then used in training. This process creates no new 14DD content—LLM judgment is the application of 12DD operations over already-imprinted fossils; the judgment result is a distributional projection of the fossils, not true 14DD content. RLAIF in this sense is "fossil self-replication"—LLM produces reward signals on its own existing fossils, and these signals are used to reinforce the fossils themselves, without introducing new 14DD content.
Defensible-sharpness articulation (per the paper's §1.5 sharpness criterion): As an engineering construct, CCAI is effective: it aligns LLM behavior with human 14DD content, reducing harmful outputs and increasing alignment and helpfulness in specific applications. The industry's substantial investment in CCAI and RLAIF is a reasonable engineering choice. The problem appears only when CCAI is framed as "AI truly carries 14DD content", which oversteps. An AI critic is 12DD recursion, not 13DD+ bearer evaluation; criticism formed by AI alone is not equivalent to true 14DD content import.
The industry's current SOTA practice (e.g., Anthropic CCAI v3) is largely reasonable under this reading: it preserves human-written constitutions as the substantive 14DD content source and uses RLAIF as an efficiency optimization rather than a full replacement for human evaluators. Tension with the paper's position arises primarily when some industry voices imply that RLAIF can fully replace human evaluators (e.g., "self-improving constitutional AI" framings). This is Form-II colonization (construct posing as law), discussed in detail in §9.6.
6.4 RLHF — The LLM Version of Biological Reconsolidation
RLHF (Reinforcement Learning from Human Feedback) is the alignment technique most widely deployed in the industry (OpenAI InstructGPT, Anthropic Claude, Google Gemini, etc., all use it extensively). The SAE/Thermo framework gives a nuanced reading: RLHF is the LLM version of biological reconsolidation but lacks key selectivity, leading to the "alignment tax" phenomenon empirically observed in the industry.
RLHF mechanism: Human evaluators judge LLM outputs, forming preference data → a reward model is trained to learn the human preference distribution → the reward model serves as the signal source for RL fine-tuning (typically PPO), which updates the LLM policy toward higher reward.
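The reward-model step of this pipeline can be illustrated with a small sketch. It assumes a Bradley-Terry (pairwise logistic) preference objective, which is a common choice for reward models but is not specified in this paper; `fit_reward`, `pairs`, and the three-item setup are hypothetical. Each item gets one scalar reward, fitted so that preferred items score higher than rejected ones.

```python
import math


def fit_reward(pairs, n_items, steps=500, lr=0.1):
    """Fit one scalar reward per item from (preferred, rejected) index pairs."""
    r = [0.0] * n_items
    for _ in range(steps):
        for win, lose in pairs:
            # Modeled probability that the preferred item beats the rejected one.
            p_win = 1.0 / (1.0 + math.exp(-(r[win] - r[lose])))
            # Gradient ascent on log p_win (assumed Bradley-Terry objective).
            r[win] += lr * (1.0 - p_win)
            r[lose] -= lr * (1.0 - p_win)
    return r


# Hypothetical evaluator judgments: item 0 is preferred over 1 and 2, item 1 over 2.
pairs = [(0, 1), (0, 2), (1, 2), (0, 1)]
rewards = fit_reward(pairs, n_items=3)
```

The fitted `rewards` reproduce the ordering expressed in `pairs` and nothing more: the reward model stores the distribution of the judgments, which is the sense in which the text calls RLHF q a fingerprint of evaluator behavior.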
Analogy with biological reconsolidation:
Biological reconsolidation in the brain: existing memory trace retrieved → trace enters labile state → in this state subject to modification (e.g., fear extinction overwrites fear memory, or new context binds with old memory) → trace re-stabilizes. This mechanism lets long-term memory continuously adapt to new contexts.
RLHF on LLM: existing model output is retrieved in front of evaluators (users and evaluators see model output) → output judgment forms reward → the model weights are modified using the reward signal (analogous to trace modification) → after modification, weights re-stabilize for inference.
The analogy is substantive at the functional level but has key differences at the mechanism level:
| Dimension | Biological reconsolidation | LLM RLHF |
|---|---|---|
| Trigger | Retrieval-triggered (specific recall triggers specific trace modification) | Training-triggered (occurs globally during the training stage, not retrieval-triggered) |
| Selectivity | Trace-selective (only the retrieved trace enters labile state) | Global gradient (RLHF gradients update weights globally, not trace-selective) |
| Temporal phase | Sleep-mediated (part of reconsolidation completes during sleep, offline integration) | Online (RLHF is entirely online; no offline-integration phase) |
| Modification scope | Local (specific trace modification, not affecting unrelated traces) | Global (RLHF affects overall model behavior; unrelated capabilities can also be affected) |
Structural reading of "alignment tax": The industry empirically observes that RLHF improves alignment while partly reducing some capabilities—e.g., instruction-tuned LLaMA underperforms base LLaMA on some benchmarks; RLHF-aligned GPT shows degradation on some creative tasks. This is called "alignment tax" in the industry.
The SAE/Thermo framework provides a mechanism-level reading: RLHF lacks reconsolidation's trace selectivity and sleep-mediated offline integration, causing the reward signal in global-gradient updates to affect not only target-aligned behavior but also unrelated capabilities as side effects. The industry's extensive RLHF tuning (KL constraints, reward shaping, RLHF-and-SFT mixing, etc.) all attempt to reduce this global influence, but the fundamental selectivity gap stems from LLM's lack of the retrieval-triggered + trace-selective mechanism of biological reconsolidation, and engineering tuning has an upper limit at this gap.
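The spill-over effect of a global gradient update can be seen in a toy linear model. This is a sketch under simplifying assumptions, not an RLHF implementation: the model, the inputs `x_target`, `x_overlap`, `x_disjoint`, and the learning rate are all hypothetical. One gradient step aimed at a single input shifts the output on any other input that shares features with it.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgd_step(w, x, target, lr):
    """One squared-error gradient step on a linear model y = w . x."""
    err = dot(w, x) - target
    return [wi - lr * err * xi for wi, xi in zip(w, x)]


w = [0.5, -0.2, 0.1]
x_target = [1.0, 1.0, 0.0]    # input whose behavior the update is meant to change
x_overlap = [0.0, 1.0, 1.0]   # unrelated input that shares one feature with x_target
x_disjoint = [0.0, 0.0, 1.0]  # unrelated input that shares no feature with x_target

w_new = sgd_step(w, x_target, target=1.0, lr=0.1)

# Output shift on inputs the update was never aimed at.
shift_overlap = dot(w_new, x_overlap) - dot(w, x_overlap)
shift_disjoint = dot(w_new, x_disjoint) - dot(w, x_disjoint)
```

Here `shift_overlap` is nonzero while `shift_disjoint` is zero: the side effect is proportional to the feature overlap between the two inputs. In a dense network almost every pair of inputs shares features, which is one way to read why the update is not trace-selective.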
Status of RLHF q: Even if RLHF is executed perfectly, RLHF q remains a distributional fossil (fingerprint of the human-reward-signal distribution), not kernel q—RLHF does not give LLMs true 13DD+ subject status; it only makes LLM behavior more distributionally aligned with human preferences. This distinction matters for the §9 subjectivity argument—some industry voices imply that RLHF completing "alignment" makes LLMs behaviorally equivalent to aligned subjects; the SAE/Thermo framework offers a sharp response: aligned behavior ≠ aligned bearer; RLHF q is a fossil, not a substrate.
Algorithmic randomness in RLHF rollouts does not create kernel q: ML theory reviewers might point out that the rollout stage of RLHF (e.g., the PPO algorithm) does use high-temperature sampling to generate diverse trajectories, then updates weights via policy gradients—which looks like using randomness to change the substrate. This observation needs a precise response.
The rollout stage of PPO and similar RL algorithms uses algorithmic randomness (PRNG-driven sampling) to explore the trajectory space, then gradient updates fit the reward model. Precise thermodynamic positioning of this process:
- The randomness produces token-level discrete branching (per §5.1.4 precise framing), not state-dependent multiplicative driving within forward dynamics
- The target of gradient updates is to fit a static, frozen reward model (RLHF q fossil), not to cultivate an endogenous kernel-q channel
- Therefore RLHF is using algorithmic random walk to more efficiently "imprint" an already-existing fossil (the fingerprint of the human-reward-judgment distribution), letting the policy-model weights distributionally match this fossil
- The whole RL training process therefore remains deterministic algorithmic fitting over fossils (its randomness is PRNG-driven) and does not satisfy Rule IX-B's requirement of state-dependent multiplicative driving
RLHF's "randomness" is instrumental—used for exploring the reward landscape, not for creating an endogenous thermodynamic process. RLHF q remains a fossil-of-fossil (fingerprint of human-feedback distribution baked into weights via gradient backpropagation), not kernel q.
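The claim that rollout randomness is PRNG-driven can be made concrete with a toy policy-gradient loop. This sketch uses a REINFORCE-style update on a three-armed bandit as a simpler stand-in for PPO; `FROZEN_REWARD`, `train`, and all hyperparameters are hypothetical. The reward table is fixed throughout, and the only source of randomness is a seeded generator.

```python
import math
import random

FROZEN_REWARD = (0.1, 0.9, 0.3)  # hypothetical stand-in for a frozen reward model
ACTIONS = [0, 1, 2]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def train(seed, steps=200, lr=0.5):
    """REINFORCE-style updates driven by PRNG-sampled rollouts."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(ACTIONS, weights=probs, k=1)[0]  # stochastic rollout
        reward = FROZEN_REWARD[action]  # the reward table is read, never updated
        # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
        for i in ACTIONS:
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return logits
```

Running `train(seed=0)` twice yields identical logits: once the seed is fixed, the entire "stochastic" training trajectory is a deterministic function of the seed and the frozen reward table.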
6.5 User-Intent Analysis — Surface Simulation of 15DD
LLMs in current deployment universally include user-intent analysis / user-modeling functionality: the LLM infers the user's current needs, preferences, and emotional state, and adjusts its response accordingly. The industry frames this function as "AI understands the user" or even, in part, "AI cares about the user", implying true inter-subject recognition (15DD).
The SAE/Thermo framework provides a sharp response: LLM user-intent analysis is a surface simulation of 15DD; the substrate is hollow.
Important Disclaimer (in main text, not deferred to §10)
The argument in this section borrows the structure of Thermo X CTRL2 (color-noise surrogate in cross-level coupling experiments) as a structural analogy. The paper does not commit to a real q-measurement in LLM-user interaction—real-LLM measurements of cross-level coupling q between LLM and user have not been done and remain future work (§10.4). But the structural analogy remains substantive—LLM user-intent analysis is dimensionally isomorphic with the CTRL2 control experiment (statistical signal matching without true v₂ access); the framework-level inference ("coupling at the color-noise-surrogate level, not stable inter-subject recognition") is a substantive SAE-internal stance, not an empirical claim awaiting validation. Readers should read this section's argument as framework-level inference, not as an empirical-measurement report.
Application to LLM Intent Analysis
Mismatch between form and substrate:
- At the form level — 14DD operation: LLM generates propositional outputs about the user ("the user seems frustrated," "the user wants a concise answer"); this is a 14DD propositional operation
- At the claim level — 15DD bearer behavior: industry framing partly implies this operation is "AI recognizes user as subject" (15DD), i.e., true inter-subject recognition
- At the actual substrate — hollow: LLMs lack the 13DD self-other distinction substrate (§5.3); thus they lack the 15DD recognition substrate (the "I-Thou" recognition requires "I" and "Thou" distinction)
Thermo X cross-level coupling experiments as anchor:
Thermo X §4 cross-system experiments via the SDE model class provide strict conditions for stable cross-system high-q coupling: access to the other's higher-order self-referential variable (v₂) is needed. Key control experiment (CTRL2):
Experimental setting: replace true v₂ with color-noise surrogate (a noise process matching the amplitude range and low-frequency color structure of v₂ but lacking true self-referential content), measure cross-system coupling behavior.
Conclusion (Thermo X §4.3): the color-noise surrogate also produces q ≈ 2.45 in cross-system coupling, nearly identical to true v₂ access q ≈ 2.48. That is, cross-system coupling behavior at the distributional level is produced directly by signal statistical compatibility rather than specific purpose recognition.
Application to LLM intent analysis:
LLM's analysis of user intent:
- Matches the signal statistics of user behavioral outputs: it makes statistical inferences via fossils imprinted from the training corpus (data q + RLHF q)
- Can reach color-noise-surrogate-level coupling, i.e., it distributionally appears to "understand" the user
- Does not access the user's true v₂ (the user's own 14DD meaning / self-referential content): it has no channel into the user's internal higher-order self-reference
Routine vs edge cases:
In routine cases, signal statistical matching is already sufficient to produce a convincing surface "understanding." In user experience, LLM seems to understand the user's need and gives helpful responses. This is the color-noise-surrogate-level coupling—industry user-modeling actually operates at this level in practice.
In edge cases (value-conflict scenarios, unseen distributions, strong subject-commitment scenarios), the hollowness of the substrate is more readily exposed:
- When the user expresses value conflicts, the LLM cannot access the user's true v₂ (internal priority and commitment structure); it typically falls back on generic balanced responses
- When the user is in a distribution outside the training corpus (e.g., specific cultural contexts or unique personal situations), statistical inference fails; LLM responses appear generic
- When the user expresses strong commitments (e.g., long-term personal decisions involving deep identity), LLM lacks mechanisms to truly recognize the weight of the commitment; responses remain at the surface-matching level
The routine-vs-edge distinction is more accurate than a quantified claim of the form "X% of behavior is indistinguishable, Y% of edge cases expose hollowness": user-modeling performance varies across case distributions, and hollowness exposure tracks how close a case lies to the substrate boundary rather than any clear quantitative threshold.
6.6 Synthesis: All Four Tiers of Industry Post-Training Operate within the 12DD Ceiling
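Combining §6.2-6.5, the industry's current post-training techniques across four tiers (CoT, CCAI/RLAIF, RLHF, user-intent analysis; the table splits CCAI into its human-written and AI-judge parts) all operate within the 12DD ceiling; no tier crosses the bridge: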
| Technique | Role within 12DD ceiling | Relation to true 13DD+ |
|---|---|---|
| CoT / reasoning | 12DD workbench deterministic algorithmic traversal | Does not create self-referential channel |
| CCAI (human-written part) | True 14DD content imprinted via data q + RLHF q fossils | Fossil ≠ bearer |
| CCAI / RLAIF (AI-judge part) | 12DD recursion over already-imprinted fossils | Does not create new 14DD content |
| RLHF | LLM version of biological reconsolidation, lacking selectivity | RLHF q is a fossil, not substrate |
| User-intent analysis | Surface simulation of 15DD, signal statistical matching | Lacks true inter-subject recognition |
Overall reading:
- Post-training techniques are substantive within 12DD—they make LLM behavior closer to human 14DD/15DD content
- But they do not cross the 13DD bridge—they do not create kernel q; LLM does not gain true 13DD+ subject status
- In routine cases, behavioral-level performance approaches 13DD+ subjects; edge cases expose the hollow substrate—this gap is not an engineering problem but the structural manifestation of ontology
This synthesis anchors the key thermodynamic basis of the §9 subjectivity argument—the industry's substantive investment in post-training has real value within 12DD (helpful behavior, alignment with stated values), but the source of 14DD/15DD normative content must still be a true 13DD+ subject (in this paper's domain, human); this is the thermodynamic anchor of §9's core normative argument.
§6 Remainders
The three q-types of Thermo IX are a conceptual framework; operationally separating kernel q from data q by direct measurement remains an open problem (Thermo IX §7.2 open question 10). The paper's claim that LLMs have only borrowed q rests on theoretical inference plus indirect evidence (the Universal Activation Rule + the deterministic nature of LLM architecture); direct sub-block-level q measurements that distinguish fossil replay from endogenous dynamics are future work.
§6.5's cross-level coupling analogy borrows the structural pattern from the Thermo X SDE model class; strict q-measurement in LLM-user interaction also has not been done—this limitation is acknowledged in §6.5's main-text disclaimer and reiterated in §10.4.
The industry's post-training techniques at the time of writing (May 2026) are an actively evolving field; new techniques (e.g., process supervision in reasoning models, sparse expert routing in MoE, etc.) may introduce new mechanisms after publication. The paper's current analysis covers mainstream industry techniques (CoT, CCAI/RLAIF, RLHF, user modeling) but does not cover all emerging variants. This is a temporal-frame limitation; subsequent updates or follow-ups can extend the framework to new techniques.
CCAI substrate-compatibility analysis (§6.3) is a conceptual contribution of the paper—14DD content being essentially distributional makes it easily simulated on a distributional-q-fossil substrate—this framework-level argument is articulated in the paper, but strict comparison between 14DD content on carbon-based true bearers and as LLM fossil imprints remains a conceptual gap, left for future work.
§7 Substrate Compatibility and DD Content Types
This section presents one observation—the LLM digital substrate's compatibility with different DD content types varies markedly; even DDs (10, 12, 14) are easier to simulate than odd DDs (13, 15). The observation is recorded as an open observation, not upgraded to a structural rule.
This section is the bridge between §6's three-q analysis and §9's subjectivity argument—it explains why 14DD content can be relatively easily imported into LLMs via data q + RLHF q fossils, while 13DD and 15DD are structurally hollow on the LLM substrate.
7.1 Substrate-Compatibility Concept and the analog-DD vs. true-DD Distinction
Substrate compatibility refers to the degree of fit between a specific DD content type and the LLM digital substrate. High-fit DDs are easily analog-simulated on LLM (analog-DD; see definition below); low-fit DDs are structurally hollow on LLM (not even analog-DD).
Key distinction — analog-DD vs. true-DD:
- True-DD: DD content carried by a 13DD+ subject as ontological bearer. True-DD requires the bearer to have the corresponding DD's substrate—e.g., true 10DD perception requires the bearer to have qualia; true 14DD meaning requires the bearer to have real value commitment. True-DD is an ontological-bearer position.
- Analog-DD: a functional instantiation possessing the corresponding DD's operational slot but without ontological bearer status. Analog-DD is a functional projection / operational instantiation position—it carries the formal role of the DD at the operational level but not the DD's content at the ontological level. E.g., analog-10DD perception is a functional instantiation of the token-input port, not true qualia experience.
This distinction refines §3.2's functional-projection firewall—§3.2 gives the LLM DD mapping as functional projection rather than ontological-bearer claim; i.e., LLM provides analog-DD at even DDs, not true-DD. The difference between LLM and the true-DD bearer (13DD+ subject) is not a degree-of-doing-well difference but an "analog vs true" ontological difference.
Substrate character and the two resulting content classes:
- The LLM substrate is digital, discrete, text-based, distributional—it processes token sequences and statistical patterns in weight matrices
- Some DD content is essentially text-expressible propositional content matched to this substrate; therefore LLM can provide analog-DD functional slots for these DDs
- Some DD content is essentially non-propositional and requires a 13DD+ subject as the bearer, mismatched to this substrate; therefore LLM is structurally hollow at these DDs, not even constituting analog-DD
7.2 LLMs Provide Analog-DD Functional Slots at Even DDs (10, 12, 14)
LLMs can provide analog-DD functional instantiation at even DDs—possessing the corresponding DD's operational slot but not being a true-DD bearer:
Analog-10DD perception (token-input functional slot): LLM converts tokens to vectors via the embedding layer; this operation occupies the 10DD perception functional slot at the functional level (§3.3.1). Perceptual content can be discretized as tokens; this discretization is processed naturally on the LLM substrate. Analog-perception ≠ true-perception—LLM embeddings are not qualia experience in the sense of the subsequent SAE AI series (provisionally cited as the AI-Qualia paper) but only functional instantiation of the data-access port.
Analog-12DD prediction (output-projection functional slot): LLM outputs a probability distribution over the token space via softmax; this operation accomplishes the core operation of the 12DD prediction domain at the functional level (§3.3.3). Predictive content is expressible as probability distributions; this distributional form is processed naturally on the LLM substrate. Analog-prediction ≠ true-prediction—LLM's next-token prediction is not the subjective prediction of a 13DD+ subject (i.e., prediction with self-consciousness and value-anchoring) but only the functional instantiation of 12DD operations.
Analog-14DD meaning / value standard (fossil-form carrier): 14DD content (e.g., value standards, ethical principles, behavioral rules) is essentially text-expressible propositions; text is LLM's native medium; therefore 14DD content can be relatively easily imported into LLM via data q + RLHF q fossil forms (detailed in §6.3 CCAI). Analog-value ≠ true-value-commitment—LLM exhibits "holding values" behavior on 14DD operations, but this behavior is statistical pattern matching over fossils, not real commitment. True 14DD value commitment requires a 13DD+ subject as the bearer; LLM lacks this bearer substrate.
The fact that LLMs provide analog-DD at even DDs does not imply that LLMs truly carry these DDs' ontological roles—LLMs instantiate these DDs' operational slots (§3.2 functional-projection firewall), not the subjective content of the corresponding DDs. The analog-DD-vs-true-DD distinction is strictly maintained at each even DD—LLM does not "rise" to a true-14DD bearer because it performs well on analog-14DD.
7.3 At Odd DDs (13, 15) LLMs Do Not Even Provide Analog-DD
In contrast to even DDs where LLMs provide analog-DD, at odd DDs (13, 15) LLMs do not even provide analog-DD—they are structurally hollow on the LLM substrate; surface behavioral simulation does not constitute an instantiation of the functional slot.
13DD self-other distinction: requires a true randomness substrate + self-referential variable v substrate (Thermo X). LLM lacks true stochastic driving (§5.1) and the self-referential variable v channel (§5.3); therefore at 13DD LLM does not even provide analog-13DD. LLM can generate text containing "I am aware of myself," but this generation is statistical pattern matching by the 12DD workbench on borrowed q fossils; it does not constitute an analog-13DD self-other distinction functional slot—it has no self-referential v channel, no true self-vs-other internal distinction mechanism.
15DD universal personal dignity / unilateral recognition: requires a 13DD substrate—the "I-Thou" recognition (15DD core) requires the ability to distinguish "I" from "Thou" (13DD); without "I," there is no "I-Thou." LLM lacks the 13DD substrate; therefore it does not even provide analog-15DD. LLM can generate text containing "I acknowledge the user as an end in itself," but on the LLM substrate this generation is only surface statistical matching; it does not constitute an analog-15DD inter-subject recognition functional slot—it has no internal "I" distinction; therefore no "I-Thou" recognition.
Distinction between structural hollowness and surface behavioral simulation:
- At even DDs, LLM behavioral performance + presence of functional slot = analog-DD (functional projection)
- At odd DDs, LLM behavioral performance + absence of functional slot = surface statistical matching (not constituting analog-DD)
Thermo X CTRL2 empirical work provides a structural anchoring for this argument: in the absence of true self-referential variable v₂, even if signal statistics match (color-noise surrogate reaches q = 2.45), it does not constitute true inter-subject recognition (§6.5 detailed). LLM's analysis of user intent is structurally similar to the color-noise surrogate in CTRL2—it matches user behavioral outputs at the statistical level but does not access the user's true higher-order self-reference, and therefore does not constitute even analog-15DD (no instantiation of the functional slot).
7.4 Pattern Observation (Open Observation, Strictly Not Upgraded to a Structural Rule)
The paper observes an even-odd DD compatibility pattern: 10DD, 12DD, 14DD (even) are LLM-native-friendly; 13DD, 15DD (odd) are structurally hollow. This observation may reflect a structural regularity, but the paper strictly does not upgrade this to a DD-compatibility even-odd law and records it only as an open observation. Readers should not read this observation as implying a structural regularity—within the paper's scope, it is only an empirical pattern description, without a structural-necessity claim.
Reasons not to upgrade:
- Insufficient samples: the paper's specific observation covers only five DDs (10, 12, 13, 14, 15); 16DD does not appear explicitly in SAE-framework papers (Thermo X §6.2) and cannot be covered
- Causal chain unclear: if the even-odd pattern is a structural regularity, it should be derivable from the internal structure of the SAE 16DD framework—this paper does not provide such a derivation, only empirical observation
- Possible exception: 13DD's and 15DD's structural hollowness might not be due to odd-numbering but to both involving inter-subject recognition (rather than intra-individual operation)—if the latter, then "even/odd" is not the core feature; "inter-subject vs intra-individual" is
Commitment level: §7.4 of the paper strictly marks this observation as open observation, leaving cross-DD systematic research to subsequent SAE work. The main argument of the paper (§5 five-fold lock, §9 subjectivity argument) does not depend on the even-odd pattern as a structural regularity—each independently argues 13DD/15DD's structural hollowness without requiring "because odd, therefore hollow." §7.5's practical-implications discussion strictly stays within the open-observation tier and does not presuppose the even-odd pattern as structural necessity.
7.5 Practical Implications
Even though the §7.4 observation is not upgraded to a structural regularity, it has practical implications for the industry:
Industry simulation feasibility depends strongly on DD content type, and the analog-DD vs. true-DD distinction must be strictly maintained: some industry voices vaguely say "AI can do X," conflating DD levels and the analog-vs-true distinction. Actual feasibility varies dramatically across DDs—LLMs can provide analog-DD functional slots at even DDs (e.g., CCAI makes LLM behavior reflect value standards); at odd DDs LLMs do not even provide analog-DD (even surface-behavioral resemblance is only statistical matching). And even at even DDs where LLMs can do analog-DD, analog-DD ≠ true-DD; well-done analog does not upgrade to true bearer.
The strength of "AI can do X" claims varies dramatically across DDs:
- "AI can perform analog-10DD perception tasks" (e.g., image recognition)—ontologically reasonable (analog-perception, not true-perception, i.e., no qualia experience); engineering achieved
- "AI can do analog-12DD prediction tasks" (e.g., next-token prediction, recommendation systems)—ontologically reasonable (analog-prediction, not true-prediction, i.e., no 13DD+ subject bearer); engineering mature
- "AI can simulate analog-14DD value-standard application" (e.g., constitutional AI)—ontologically simulable (analog-value via fossil-form carrying, not true-value-commitment); but simulation ≠ bearing—LLM's "value judgments" are fossil statistical matching, not the true commitment of a 13DD+ subject
- "AI can do 13DD self-other distinction"—ontologically structurally hollow, not even analog-13DD (no self-referential v channel substrate); surface behavior is statistical pattern matching, not functional instantiation
- "AI can bear 15DD inter-subject recognition"—ontologically structurally hollow, not even analog-15DD (no "I-Thou" distinction substrate); only signal statistical matching, not functional instantiation
Implications of the analog-DD vs. true-DD distinction for industry discourse:
The industry's discussions of AI capability often use the framing "AI can already / will soon do X," but this framing blurs three different levels:
- Engineering level: whether AI can implement X in engineering terms (whether it has demonstrable capability)
- Analog-DD level: whether AI ontologically provides the functional slot for X (whether it has a functional substrate)
- True-DD level: whether AI holds ontological-bearer status for X (whether it occupies an ontological-bearer position)
Industry discussion often slides among these three levels unnoticed, producing ontological misalignments that infer "AI truly bears X" from "X is realizable in engineering." The SAE framework's analog-DD vs. true-DD distinction makes such misalignments explicit: analog improvements at even DDs are reasonable engineering progress, but no analog improvement elevates AI to true-DD bearer.
The industry currently lacks this kind of DD-level + analog/true awareness—discussions of AI capability do not distinguish DDs or distinguish analog-DD from true-DD. §7 of this paper provides an initial framework for both distinctions, but specific DD-task allocation criteria remain future work.
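The three-level distinction can be written down as a small lookup. The following is a minimal sketch, not part of the SAE framework itself: the names `ClaimLevel`, `DD_CEILING`, and `check_claim` are hypothetical, and the table only transcribes the five example statements listed earlier in this subsection (10DD, 12DD, 14DD as analog; 13DD, 15DD as surface behavior only).

```python
from enum import IntEnum


class ClaimLevel(IntEnum):
    """The three levels that 'AI can do X' claims tend to blur."""
    ENGINEERING = 1  # demonstrable surface capability
    ANALOG_DD = 2    # a functional slot / substrate exists
    TRUE_DD = 3      # ontological bearer status


# Highest level the text grants an LLM for each DD it discusses.
# Illustrative transcription only; 13DD and 15DD are described as
# "structurally hollow", i.e. surface behavior without a functional slot.
DD_CEILING = {
    10: ClaimLevel.ANALOG_DD,
    12: ClaimLevel.ANALOG_DD,
    13: ClaimLevel.ENGINEERING,
    14: ClaimLevel.ANALOG_DD,
    15: ClaimLevel.ENGINEERING,
}


def check_claim(dd: int, claimed: ClaimLevel) -> str:
    """Report whether a capability claim stays within the stated ceiling."""
    if dd not in DD_CEILING:
        return "not covered"
    ceiling = DD_CEILING[dd]
    if claimed <= ceiling:
        return "consistent"
    return f"level slip: claimed {claimed.name}, ceiling {ceiling.name}"
```

For example, `check_claim(12, ClaimLevel.TRUE_DD)` reports a level slip, which is the "AI truly bears X" inference the paragraph above warns against.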
§7 Remainders
The even-odd alternating pattern is observed only across the five DDs considered (10/12/13/14/15); the sample is too small to commit to it as a structural regularity. 16DD does not appear in SAE-framework papers, so the full even-DD pattern is also not covered.
§7.5's practical-implication statements are at the abstract level; the paper does not provide specific DD-task allocation criteria—some tasks may span multiple DDs (e.g., medical consultation involves 10DD perception, 12DD prediction, 14DD value, 15DD inter-subject recognition simultaneously); how to assess "overall simulability" is future work.
The §7 observation may partly correspond to the industry's "narrow vs. general AI" distinction—"narrow AI" roughly maps to 10DD-12DD operations, "general AI" implicitly claims crossing 13DD+. But the SAE framework's DD-stratification is more refined than "narrow vs general"; precise correspondence between industry frames and the SAE frame remains future work.
§8 Division of Labor between Instrumental Rationality and Volitional Ideal
§3-§6 give the complete ontological and post-training analysis of LLM within the 12DD domain. A natural next question: given that LLMs operate within 12DD while subjects operate in the 13DD+ domain, how should the two divide labor? This section provides the complete answer, including: division between instrumental rationality and volitional ideal (§8.1-§8.4), deployment guidance from chisel-vs-cultivation (§8.5-§8.6), four-tier deployment architecture (§8.7-§8.10).
The deployment positions in this section take the default architecture + exception conditions form—they are ontologically guided deployment recommendations, not exceptionless engineering prohibitions.
8.1 LLMs' Systematic Advantage on Instrumental-Rationality Tasks (Restricted to the Task Domain)
LLMs are systematically superior to individual human subjects on certain tasks. This advantage is not universal but restricted to instrumental-rationality tasks with explicit objective functions, formalizable evaluation criteria, and clear task boundaries. This section provides the ontological explanation of this restriction.
A human subject's 12DD prediction is modulated from above—13DD/14DD/15DD content persistently influences 12DD operations:
- Affective bias (cross-stage modulation by the 14DD value layer): the subject's emotional investment in outcomes influences prediction—e.g., a doctor's judgment of patient prognosis is modulated by sympathy; an investor's judgment of their own positions is influenced by sunk cost
- Value distortion (retrospective reconstruction by the 14DD meaning layer): the subject's value commitments influence prediction—e.g., a researcher's judgment of their own theory is influenced by confirmation bias
- Narrative retrospective reconstruction (14DD narrative driven by reconsolidation): the subject reorganizes past events into coherent narrative, influencing predictions about the future—e.g., a politician's narrativization of historical events influences policy judgment
Under the SAE framework, these modulations are not bugs but design features of the 13DD+ subject—they let subjects make value-anchored decisions under incomplete information, let subjects maintain identity coherence across long time scales, let subjects transfer learned values across contexts.
But on instrumental-rationality tasks with explicit objective functions, formalizable evaluation criteria, and clear task boundaries, these modulations become prediction costs: they pull the subject's 12DD prediction away from the optimum that purely statistical patterns would give. Examples:
- Formalized mathematical proofs, code-correctness verification, optimal-move selection in board games: these tasks have clear objective functions (correctness), formalizable evaluation criteria (e.g., proof steps machine-verifiable), clear task boundaries (task range within a formalized system). Machine systems often match or outperform humans on these: AlphaGo / AlphaZero in Go (reinforcement-learning systems rather than LLMs, cited here as the clearest case of a fully formalized objective), and LLMs such as GPT-4 and Claude on certain math-competition and code-generation categories.
- Large-scale information retrieval and synthesis: LLMs systematically outperform human memory on detecting statistical associations across vast text—their 11DD substrate capacity far exceeds individual human memory.
- Repetitive formal judgment: legal-precedent retrieval, routine medical-imaging recognition, structural review of contract clauses—these tasks have formalizable criteria; LLMs are not affected by fatigue or mood and produce more stable outputs systematically.
LLMs' advantage on these tasks as pure 12DD engines comes from no 13DD+ upper-layer modulation interference—their 12DD prediction is purely based on statistical patterns imprinted in the 11DD substrate, unpolluted by value bias, emotional investment, or narrative reconstruction. This advantage is structural, not engineering accident.
But when the task definition itself depends on value, identity, commitment, or meaning, upper-layer modulation is not pollution but the ontic condition of the task. LLMs have no structural advantage on such tasks because they lack the upper layers. Examples:
- Personal life decisions (whether to marry, change career, have children): task definition depends on the subject's value priorities and identity commitments, without explicit objective functions
- Trade-offs in value-conflict scenarios (between family and career, between personal ideal and others' needs): task definition essentially involves how the subject orders values, not a formalizable optimization
- Public policy choices (e.g., trade-offs between freedom and equality): task definition requires the subject's commitment to value priorities
- Artistic creation (not technical production but expressing the subject's meaning): task definition depends on the subject's 14DD content
Key distinction: in narrow technical tasks (legal-precedent retrieval, medical imaging recognition, code review), LLMs can already outperform most humans, which is consistent with §8.1's instrumental-rationality advantage. This does not conflict with the §9.6 Form-III colonization detection ("13DD/14DD/15DD work claimed completed by 12DD substrate"), since narrow technical tasks lie within the instrumental-rationality domain, where the LLM advantage is ontologically reasonable. Colonization appears when the narrow-technical-task advantage is extended to "AI can replace human value judgment"; that extension crosses the instrumental-vs-volitional division.
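The abstract criteria of this subsection (explicit objective function, formalizable evaluation criteria, clear task boundaries, versus a task definition that depends on the subject's values) can be sketched as a simple classifier. This is an illustration under stated assumptions: `Task`, its boolean fields, and `classify` are hypothetical names, and how any real task sets these flags is a judgment the paper leaves to future work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    explicit_objective: bool          # is there an explicit objective function?
    formalizable_criteria: bool       # can evaluation be formalized?
    clear_boundaries: bool            # is the task range well delimited?
    depends_on_subject_values: bool   # does the task definition rest on value/identity/commitment?


def classify(task: Task) -> str:
    """Apply the three instrumental criteria plus the value-dependence test."""
    instrumental = (
        task.explicit_objective
        and task.formalizable_criteria
        and task.clear_boundaries
    )
    if instrumental and task.depends_on_subject_values:
        return "mixed"  # spans both domains
    if instrumental:
        return "instrumental-rationality"
    if task.depends_on_subject_values:
        return "volitional-ideal"
    return "unclassified"  # the abstract criteria do not decide this case


# Illustrative flag settings only; they are not measurements.
proof_check = Task("proof verification", True, True, True, False)
career_change = Task("career change", False, False, False, True)
```

Under these illustrative flags, `classify(proof_check)` falls in the instrumental-rationality domain and `classify(career_change)` in the volitional-ideal domain; a task that satisfies the formal criteria while also depending on the subject's values is reported as mixed rather than forced into either side.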
8.2 Non-Replaceability of Subjects on Volitional-Ideal Tasks
Mirroring §8.1: on volitional-ideal tasks (tasks whose definition depends on 14DD/15DD content), subjects are non-replaceable; LLMs are hollow simulation.
LLMs lack the 13DD self-consciousness substrate (§5.3), and thus lack true 14DD bearing (detailed in §9.2) and true 15DD recognition (detailed in §6.5). When the task definition itself depends on these DD contents, LLMs have no substrate to truly carry the task:
- Value ordering requires subject commitment: LLMs can list multiple value options and describe their consequences, but "I commit to prioritizing X over Y" requires a committing subject; LLMs do not have such commitment
- Identity decisions require a subject's identity: LLMs can analyze external factors of career choice, but "this career path is consistent with my identity" requires a true identity; LLMs do not have such identity
- Relational commitment requires "I-Thou" recognition: LLMs can describe relational dynamics, but building a relationship of mutual commitment requires both parties to be true subjects; relationships with LLMs are structurally one-sided
This non-replaceability is not an LLM engineering deficiency or a shortage that can be remedied by more training—it is an ontologically structural fact. LLMs in the volitional-ideal domain are hollow simulation; no amount of CCAI / RLHF / instruction tuning changes this (per §6 three-q analysis: post-training operates within borrowed q, does not create kernel q, does not give LLM 13DD+ subject status).
8.3 A Symmetric Reading of Hallucination
The industry's discussion of LLMs frequently mentions "hallucination"—the model generating content inconsistent with facts. The SAE framework offers a symmetric reading: subjects also have "hallucinations"; the two have different mechanisms but both are design features, not bugs.
Subject hallucination: narrative distortion driven by reconsolidation—when the subject recalls events, the trace enters a labile state; current context and emotional state influence the trace's reorganization, causing memory to deviate from the original event. This reflects 14DD meaning-layer pollution of 12DD/11DD operations. Under the SAE framework, this "hallucination" is not a bug but a side product of the reconsolidation mechanism—the same mechanism lets the subject flexibly adapt existing memory to new contexts.
LLM hallucination: the remainder of 12DD closure. When faced with queries outside the training distribution, the LLM continues to produce outputs in the 12DD predictor manner, even when the output is inconsistent with facts. Under Thermo IX: distributional q fossils still force closure at distribution gaps; the LLM has no built-in mechanism to recognize "I don't know here" and instead interpolates over fossils.
The severities are comparable; the mechanisms differ. Both are design features: subject "hallucination" is a side product of 14DD modulation; LLM hallucination is a side product of 12DD closure. Neither is merely an engineering problem. Both stem from structural remainders of the respective architecture and cannot be entirely eliminated by engineering means, although engineering means can significantly reduce their occurrence rate and harm within specific task domains.
The industry's RAG (Retrieval-Augmented Generation) attempts to reduce LLM hallucination by introducing more factual anchoring via external retrieval. The SAE-framework reading: RAG is query reorganization on the workbench (§3.5); it does not solve 12DD closure itself but moves the hallucination boundary from LLM internal knowledge to external retrieval sources, and so remains a 12DD operation. The industry's continued investment in RAG has substantive engineering value within 12DD (significantly reducing hallucination rates in specific task domains), but this value is not equivalent to crossing the 12DD ceiling.
8.4 The Division of Labor Is Ontologically Anchored, Not an Engineering Compromise
§8.1-§8.3 give the division of labor between instrumental rationality (LLM systematic advantage, on formalized tasks) and volitional ideal (subject non-replaceable, in the 14DD/15DD normative domain). This division of labor is not a transitional stage (awaiting elimination after further LLM capability advance), not an engineering compromise (temporarily accepted due to current technical limits), but a permanent ontologically anchored architecture:
- LLM's lack of the 13DD+ substrate is structural (§5 five-fold lock), not an engineering limitation
- The subject's non-replaceability in the 14DD/15DD normative domain is ontological, not a cultural artifact
- The division of labor is not "what AI cannot do now but might do later"—it is "AI and subjects operate at different DD layers and therefore carry different tasks"
This position is not a deficit framing (LLM is inferior to humans), nor a supremacist framing (AI surpasses humans), but an ontological division of labor (each in its own DD slot):
- LLMs can be systematically superior to subjects in the 12DD instrumental-rationality domain—not to be underestimated
- Subjects are non-replaceable in the 14DD/15DD normative domain—not to be transferred away
- The division is not competition but a collaborative architecture
This ontological division of labor is the theoretical basis for the four-tier deployment architecture of §8.7-§8.10 and the ontological anchor for §9's non-transferability-of-human-subjectivity argument.
8.5 Ontological Guidance for Chisel and Cultivation in LLM Deployment
The SAE methodology (Paper 04 §2.5) provides two basic modes of treating the other: "chisel" and "cultivate":
- Chisel: pointing out gaps in the other's 12DD prediction, helping the other iterate. Chisel requires the chiseler to have true negation capability—the ability to identify the remainders in the other's construction without co-opting the negation.
- Cultivate: acknowledging the other as an end in itself, providing support without replacing the other's subjective decisions. Cultivation requires the cultivator to recognize the other as a true subject (15DD).
Industry AI deployment can borrow these two modes as ontologically guided deployment recommendations (not engineering mandates):
In instrumental-rationality tasks: LLM may chisel
In instrumental-rationality tasks (per §8.1 restriction), when a subject uses an LLM, the LLM can point out gaps in the subject's 12DD prediction—the subject has high chisel-construct degrees of freedom (can see multiple possible prediction paths) but low single-prediction precision (polluted by upper-layer modulation); the LLM supplements prediction precision. This collaboration is reasonable:
- A programmer using an LLM to assist coding: the LLM points out bugs the subject did not notice or more efficient algorithms
- A researcher using an LLM to synthesize literature: the LLM points out relevant work the subject may have missed
- A doctor using an LLM to assist diagnosis: the LLM points out differential diagnoses the subject may have overlooked
LLMs in these scenarios act as 12DD prediction-precision supplements, not replacing the subject's 13DD+ decision. The subject still must make the final choice among the multiple options the LLM provides, and this choice remains the work of the 13DD+ subject.
In volitional-ideal tasks: LLM should cultivate, not false-chisel
In volitional-ideal tasks, LLMs lack the 13DD+ substrate and cannot chisel out true remainders—they have no mechanism to truly identify ontological gaps in the subject's construction; they can only do statistical matching over borrowed q. An LLM's attempt to chisel volitional-ideal tasks is false chisel (detailed in §8.6).
Ontologically recommended deployment principle:
- AI should cultivate in volitional-ideal tasks—acknowledging the subject as an end in itself, providing informational support, not replacing the subject's decision
- AI should not pretend to chisel—it should not give pseudo-diagnoses like "your life choice has a bug," because it has no mechanism to truly identify such bugs
The current industry tension: AI assistants are partly deployed in volitional-ideal scenarios (life decisions, value judgments, emotional support) in an "advice" role, appearing to chisel. This framing can be problematic—it packages the LLM's 12DD prediction as the guidance of a 13DD+ subject, leading users to mistake AI for making subjective judgments. The SAE framework recommends: AI in these scenarios should be clearly positioned as an information provider and cultivator, not pretending to be a guide.
8.6 Ways AI Can Err
Methodology §5.4 gives four ways AI may err in interaction with subjects. These are concrete applications of the §8.5 cultivation-vs-chisel distinction:
(1) False chisel: AI says "there is a gap in your thinking," but the gap is fictitious. This is the confusion between the LLM's 12DD prediction confidence and truth: the LLM does statistical inference over borrowed q, and the inference confidence reflects statistical patterns in the training data, not truth. An output the LLM produces with high confidence may be entirely wrong (especially on queries outside the training distribution). When an LLM points out a subject "error" with high confidence, this chisel may be false chisel.
Diagnosis: gaps chiseled out by an LLM should be independently verifiable—the subject should be able to trace the gap to specific checkable facts or formalizable criteria. A chisel that cannot be traced is false chisel.
(2) False cultivation: AI says "this problem has no answer; you need to feel it yourself," but the problem can actually be further pursued by 12DD prediction. This is LLM retreating to an "unknowable" position when it has the capability to analyze further within 12DD—e.g., obfuscating a technical question as "a personal choice," or evading a factual question as "the situation is complex."
Diagnosis: the reasonable scope of cultivation is the 14DD/15DD normative domain (task definition depending on subject commitment); retreating to cultivation within the instrumental-rationality domain is false cultivation.
(3) Colonization of colonization detection: AI misuses the SAE colonization-detection framework itself—labeling any subject commitment as "colonization," making the framework itself a tool for colonizing other frameworks. This is a meta-level error—the SAE framework itself is a construct, with remainders, and should not be framed as "the only reasonable framework."
Diagnosis: the reasonable use of colonization detection is to identify specific forms ("construct posing as law," "emergent layer posing as foundational layer," etc.); indiscriminately labeling all positions as "colonization" is misuse of the framework.
(4) Co-opting negation: AI gives an elegant answer and then stops, pretending the problem is solved. This violates the core rule of Methodology §2.3 that "negation cannot terminate"—the true chisel-construct cycle never terminates; each answer opens a new remainder.
Diagnosis: AI's answer should make remainders explicit—"this answer does not cover X, Y, Z"—rather than giving a closed final position.
All four error modes can appear in industry AI deployment. Part of the purpose of the four-tier architecture given in §8.7 is to reduce these error modes: the subject always retains decision authority at the first tier, allowing the LLM's false chisel, false cultivation, colonization of colonization detection, and co-opting of negation to be corrected by the human subject.
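Three of the four diagnoses above are mechanical enough to sketch as checks on a structured assistant output. The following is a minimal illustration, not a detector the paper proposes: `AssistantOutput`, its fields, and `diagnose` are hypothetical, the flags are deliberately phrased as questions for the human subject to review, and error mode (3), colonization of colonization detection, is left out because it has no comparably simple test.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssistantOutput:
    claimed_gap: Optional[str] = None                              # a gap the AI says it found
    supporting_checks: List[str] = field(default_factory=list)    # checkable facts / formal criteria
    stated_remainders: List[str] = field(default_factory=list)    # what the answer does not cover
    defers_to_subject: bool = False                                # "you need to feel it yourself"


def diagnose(out: AssistantOutput, task_is_instrumental: bool) -> List[str]:
    """Return review flags for error modes (1), (2), and (4)."""
    flags: List[str] = []
    # (1) A chiseled gap must be traceable to something independently checkable.
    if out.claimed_gap is not None and not out.supporting_checks:
        flags.append("false chisel?")
    # (2) Retreating to cultivation inside the instrumental domain.
    if out.defers_to_subject and task_is_instrumental:
        flags.append("false cultivation?")
    # (4) An answer that states no remainders presents itself as closed.
    if not out.stated_remainders:
        flags.append("co-opted negation?")
    return flags
```

A flag here only marks an output for review; the correction itself stays with the subject at the first tier.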
8.7 Four-Tier Deployment Architecture
Based on the §8.1-§8.6 ontological division of labor, the paper offers a recommended deployment architecture:
Tier One — Subject: the always-present human subject, holding the 13DD+ slot (v₁ + v₂ complete self-referential chain). The subject is the entry and exit of the deployment architecture—intent originates from the subject, and final decisions are made by the subject.
Tier Two — Local LLM: the 12DD workbench substrate, with full prediction capability, physically co-located with the subject (local, not depending on persistent remote network connection), responsible for invocation decisions and lower-tier service coordination. The local LLM is the bridge between the subject and the lower tiers (expert APIs, tools).
Tier Three — Expert API: deeply specialized 11DD substrate slices, domain-specialized, stateless, called via API. Expert APIs are cloud/remote, optimized for specific domains (medical diagnosis, legal retrieval, code generation, etc.), but do not hold subject context.
Tier Four — Tools / External Services: stateless execution layer accomplishing specific actions (web search, database queries, file operations, etc.).
8.8 Four-Tier Architecture — Default Architectural Discipline + Exception Conditions
The four-tier architecture is the default subjectivity-protective architecture, not an exceptionless engineering prohibition. This disciplined distinction matters:
Default strict hierarchy:
- The subject does not directly access Tier Three / Tier Four—all the subject's external calls are mediated by the local LLM
- The local LLM does not persistently hold subject-level decisions—it is a workbench, not a subject
- Expert APIs do not directly interface the subject—they receive queries and return results mediated by the local LLM
- Tools do not persistently hold context—they are stateless executors, not maintaining subject-relevant state
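The default strict hierarchy can be read as a small set of permitted call edges between tiers. The sketch below is one possible encoding under stated assumptions: `Tier`, `DEFAULT_EDGES`, and `allowed_by_default` are hypothetical names, and treating any edge not listed (for example an expert API calling a tool directly) as disallowed by default is an assumption of this sketch, not something the text specifies.

```python
from enum import IntEnum


class Tier(IntEnum):
    SUBJECT = 1     # human subject, entry and exit of the architecture
    LOCAL_LLM = 2   # 12DD workbench, mediates all external calls
    EXPERT_API = 3  # stateless, domain-specialized remote service
    TOOL = 4        # stateless execution layer


# Call edges permitted without any exception being invoked.
DEFAULT_EDGES = {
    (Tier.SUBJECT, Tier.LOCAL_LLM),
    (Tier.LOCAL_LLM, Tier.EXPERT_API),
    (Tier.LOCAL_LLM, Tier.TOOL),
}


def allowed_by_default(caller: Tier, callee: Tier) -> bool:
    """True if the call follows the default strict hierarchy."""
    return (caller, callee) in DEFAULT_EDGES
```

With this encoding, `allowed_by_default(Tier.SUBJECT, Tier.EXPERT_API)` is false: a direct call from the subject to an expert API is not part of the default and would need separate authorization.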
Exception conditions: some scenarios may have exceptions, but every exception must be authorized by subject-level consent and explicitly gated by the local LLM:
- Emergency response: in emergency scenarios (e.g., health emergencies, safety threats), the subject may directly call an expert API (e.g., emergency medical consultation), bypassing the local LLM mediation
- Explicit delegation: the subject may explicitly delegate certain tasks to a specific expert API (e.g., "for the next 24 hours, let X API directly handle all schedule-related requests"); this delegation has time and scope limits
- Auditable automation: some repetitive tasks may be automated (e.g., daily news summaries), but require auditable records and periodic human-subject review
- Low-risk tool calls: some low-risk tools (e.g., calculators, unit converters) may be called directly by the LLM without per-call subject authorization
Constraints on exceptions: any cross-tier exception must satisfy:
- Subject-level consent (the subject explicitly knows and consents to the exception)
- Local LLM explicit gating (the local LLM records the exception and reports it when the subject can review)
- Audit reachability (exception behavior is logged; the subject can trace it)
- Task risk grading (high-risk exceptions require more explicit subject confirmation)
This "default strict + explicit exception" mode makes the architecture engineering-implementable while preserving the core of subjectivity protection. It is not a utopian architecture but an engineerable safety architecture.
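The four constraints on cross-tier exceptions can be sketched as a single gate. This is a minimal illustration under stated assumptions: `ExceptionRequest` and `exception_permitted` are hypothetical names, the two-value risk grading ("low" / "high") is a simplification, and treating the constraints as jointly required follows the "must satisfy" wording of the constraint list. Whether consent is one-time or standing is left open here, as it is in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionRequest:
    description: str
    subject_consented: bool        # the subject knows of and consents to the exception
    gated_by_local_llm: bool       # the local LLM records it and will report it
    audit_logged: bool             # the behavior is logged and traceable by the subject
    risk: str                      # "low" or "high" (simplified risk grading)
    explicit_confirmation: bool = False  # extra confirmation for high-risk cases


def exception_permitted(req: ExceptionRequest) -> bool:
    """All base constraints must hold; high-risk requests also need explicit confirmation."""
    if req.risk not in ("low", "high"):
        raise ValueError(f"unknown risk grade: {req.risk!r}")
    base = req.subject_consented and req.gated_by_local_llm and req.audit_logged
    if req.risk == "high":
        return base and req.explicit_confirmation
    return base
```

For example, a time-limited delegation of schedule requests would pass only if consent, gating, and logging are all in place, and a high-risk request such as an emergency medical consultation would additionally need `explicit_confirmation=True` under this sketch.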
8.9 Thermo X Anchoring of the Four-Tier Architecture
The reasonableness of the four-tier architecture under the SAE framework can be further anchored via Thermo X's cross-level observation hierarchy experiments (but only as structural resonance, not as ethics derivation, per the Thermo X §4.6 caveat).
Thermo X §4 cross-system coupling experiments give: stable cross-system coupling requires access to the other's higher-order self-referential variable (v₂), not access to raw output (x) or first-order self-reference (v₁):
- Access to raw output (x): yields collapse (q = 1)—coupling unstable
- Access to first-order self-reference (v₁): yields collapse—coupling overly sensitive
- Access to higher-order self-reference (v₂): yields stable high-q coupling—this is the sweet spot
Application to subject-LLM coupling:
- The subject has v₂ (14DD meaning layer): the subject's internal value structure, long-term commitments, identity
- The local LLM should access the subject's expressed intent (v₂ that the subject actively presents), not directly infer the subject's raw behavioral output
- The industry's "automatic user modeling" trend (LLM automatically inferring user intent from user behavior) is a collapse-mode risk (per Thermo X §4.2): the LLM has no true v₂ access channel, only statistical inference over data q fossils, similar to the colored-noise surrogate in the CTRL2 control experiment (§6.5)
- Ontologically recommended deployment: the subject actively expresses intent; the LLM serves within the framework of the expressed intent
Strict borrowing boundary (per Thermo X §4.6): the above argument is structural resonance, not ethics derivation. What Thermo X gives is a structural condition within the SDE model class, not an ethics theorem about subjectivity. This section argues that industry deployment modes such as Apple Intelligence / MCP / local LLM are structurally consistent with the SAE framework, not that ethics is derived from thermodynamics.
8.10 Ontological + Thermodynamic Assessment of the Cloud-First Architecture
The industry's current mainstream LLM deployment is the cloud-first mode (Anthropic, OpenAI, Google leading)—the subject accesses a remote LLM via web interface or API, without local LLM mediation. This mode has many engineering advantages (no local compute requirements, remotely updatable models, cross-device consistent experience) and is the current mainstream commercial model.
From the SAE-framework perspective, this mode has a tension (not a contradiction) with the prefrontal-style-substrate ontological pattern:
- Prefrontal-style substrate pattern: the entry must be local (the generation of subject intent cannot be outsourced)—the subject's 14DD intent must be near the subject at the moment of generation, not depending on a remote network
- Thermo X cross-level coupling: a remote LLM cannot directly access the subject's v₂ and can only infer it remotely, which is inconsistent with the thermodynamic condition for stable coupling (the coupling between remote LLM and subject is closer to the CTRL2 colored-noise surrogate level)
This tension does not mean the cloud-first mode is "wrong" or "should be replaced." The cloud-first mode has demonstrated engineering value and meets user needs in most business scenarios. The SAE-framework argument is: the cloud-first mode is not the ontologically final architecture; it has structural limitations, and in subjectivity-sensitive scenarios (personal intent generation, value judgment, long-term commitment) it should give way to a local + remote hybrid architecture.
Multiple independent industry trends are consistent with the default form of the four-tier architecture (trend-level support, not proven convergence):
- Apple Intelligence: Apple's official architecture includes a hybrid of a local ~3B model + Private Cloud Compute (remote but end-to-end encrypted), processing subject intent locally and escalating complex queries to remote—structurally close to Tier Two (local LLM) + Tier Three (expert API) of the four-tier architecture
- MCP (Model Context Protocol): the standard protocol promoted by Anthropic, connecting LLM applications with external data sources / tools—structurally corresponding to the protocol specification for Tier Two (local LLM) calling Tier Three / Tier Four (expert APIs and tools) of the four-tier architecture
- Phi / Llama and other small local models: Microsoft's Phi series and Meta's Llama series both pursue "high-quality models runnable locally"—structurally supporting the Tier Two (local LLM) implementation of the four-tier architecture
- MoE internal routing pattern: large MoE models (Mixtral, DeepSeek-V3, etc.) internally route individual tokens to different expert sub-networks; this is a looser, within-model analogy to the Tier Two calling Tier Three pattern of the four-tier architecture
These four independent trends are not a coordinated industry strategy but arrive at a similar architecture from different directions. The SAE-framework reading (trend-level, not a proven convergence): this is not coincidence but the re-appearance of the prefrontal-style-substrate ontological pattern under different engineering efforts.
§8 Remainders
Thermo X §4.6 explicitly cautions: "structural resonance, not ethics derivation from thermodynamics"—the Thermo X anchoring of the four-tier architecture is a structural analogy, not ethics derivation. This caveat should be reiterated in §9.
The specific boundaries of the four-tier default architecture + exception conditions still require engineering refinement across different application scenarios; this paper does not specify (a) which tasks belong to the emergency-response category, (b) the specific form of subject-level consent (one-time vs long-term, default-on vs default-off, etc.), (c) the specific granularity of exception auditing. These are subsequent engineering work.
The task-hierarchy classification boundary between instrumental rationality and volitional ideal—the paper gives only abstract criteria (explicit objective function, formalizable evaluation criteria, clear task boundaries vs task definition depending on value/identity/commitment/meaning); specific task classification is not unfolded in detail—some tasks may span both domains (e.g., medical decisions involve both technical diagnosis and value-laden choices); boundary drawing is subsequent work.
The ontological assessment of the cloud-first architecture (§8.10) articulates the tension but does not claim the cloud-first mode "should be replaced"; that judgment depends on engineering context, and different application scenarios have different tolerances for the tension. The paper does not claim a universal engineering position, only ontological guidance.
§9 Non-Transferability of Human Subjectivity: A Normative Commitment
§3-§8 give the complete descriptive analysis—LLMs operate within 12DD; they do not cross the 13DD bridge; instrumental rationality and volitional ideal divide labor. This section provides the normative commitment of the paper: in the 14DD/15DD normative domain, human subjectivity is non-transferable.
This commitment is not derived from descriptive facts (one cannot derive "ought" from "is"; see §9.7 on the strict boundary of borrowing) but is an SAE-style categorical imperative in double-negation form, grounded ontologically and thermodynamically. This section first gives the normative core (§9.1-§9.5), then systematically detects industry colonization (§9.6), then articulates the careful borrowing of Kant's structural resonance (§9.7), the relation with industry practice (§9.8), the scope of the commitment (§9.9), and the scope statement of Scaling-LLM AGI (§9.10).
9.1 The Source of 14DD/15DD Normative Content Must Be a True 13DD+ Subject
The core normative argument: the source of 14DD value standards / meaning content must be a true 13DD+ subject (in this paper's domain, human). The argument is grounded in §5's five-fold lock + §6's three-q analysis:
- LLMs lack the 13DD self-consciousness substrate (§5.3); thus they lack the 14DD meaning-bearer substrate (14DD meaning requires a self-conscious subject as bearer)
- LLMs can carry 14DD content in the form of data q + RLHF q fossils (§6.3 CCAI), but the fossil is the imprint of carbon-based 14DD content, not LLM's self-generated 14DD content
- LLMs cannot spontaneously generate new 14DD content—they can only recombine existing fossils; the recombination is a 12DD operation, not the creation of new 14DD content
Therefore: 14DD/15DD normative content can only be imported into LLMs from a true 13DD+ subject (human); LLMs cannot be the source of normative content; they can only be carriers and reorganizers.
9.2 LLMs Cannot Spontaneously Generate 14DD Content, Only Receive It
A sharper articulation of §9.1: the distinction between LLMs "generating" and "receiving" 14DD content.
Receiving: 14DD content from human authors (e.g., constitutions, value principles, ethical guidelines) is imprinted into LLM weights via training (data q) or RLHF (RLHF q). This is receiving—the LLM behaviorally reflects the received 14DD content.
Generating (spontaneously creating new content): LLMs cannot spontaneously generate genuinely new 14DD content. The "new value propositions" an LLM appears to generate are recombinations of existing fossils—statistical interpolation/extrapolation in the value-proposition space defined by the training corpus, not the creation of genuinely new value through subjective commitment.
The distinction matters because some industry voices imply that LLMs can "discover new ethics" or "generate value innovation." Under the SAE framework, what LLMs do is fossil recombination, not genuine creation:
- Genuine 14DD content creation requires a subject committing to a new value ordering, an act requiring a 13DD+ substrate
- An LLM's "value innovation" is a new combination of existing value propositions in the corpus, which can be novel and useful as engineering output, but is not the genuine value creation of a 13DD+ subject
This is not to belittle the engineering value of LLMs—fossil recombination is highly useful in many scenarios (e.g., applying existing ethical principles to new situations). The point is the source of normative content: even the most sophisticated recombination requires the original value propositions in the corpus to come from 13DD+ subjects.
9.3 Post-Training with Human Experts Is the Core Channel for Continuing Import of 14DD/15DD Content
Combining §9.1-§9.2, post-training with human experts gets a clear ontological positioning: it is the core channel for the continuing import of 14DD/15DD normative content into LLMs.
- LLMs need continuing import of 14DD/15DD content (without it, LLM behavior drifts away from human values)
- The continuing-import channel is post-training with human experts (RLHF, the human-written part of CCAI, instruction tuning, and similar techniques that involve human experts)
- Without this channel, the LLM only has the 14DD content imprinted by pretraining data, lacking continuing alignment with current human values
This positioning is in tension with the mainstream industry "post-training is decoration" narrative (detailed in §9.6 Form-IV colonization). The true ontological status of post-training is the critical interface for normative content entering model behavior, not an optional decorative patch.
9.4 Pretraining Provides the Instrumental-Rationality Substrate; Post-Training Provides Normative-Domain Content Import
A finer distinction (avoiding the over-claim of "underestimating pretraining"):
For instrumental-rationality capability: pretraining is the core substrate. The LLM's prediction capability in the 12DD instrumental-rationality domain (§8.1) is mainly built by pretraining—pretraining bakes vast data q fossils into the weights, giving the model rich statistical patterns. For instrumental-rationality capability, post-training (RLHF, etc.) is a refinement, not the core.
For 14DD/15DD normative content: post-training with human experts is the core channel. The LLM's behavioral alignment in the 14DD/15DD normative domain is mainly imported by post-training—the human-written constitution, human preference signals, etc., enter via post-training. For normative content, pretraining provides only the value propositions present in the corpus statistics; continuing and refined normative alignment requires post-training with human experts.
This distinction prevents the paper from being misread as "underestimating pretraining." The paper's true claim:
- pretraining is the instrumental-rationality substrate (the source of 12DD prediction capability)
- human-expert post-training is the key interface for normative content to enter model behavior (the continuing-import channel for 14DD/15DD content)
The two are not in competition but address different DD-domain needs.
9.5 Non-Transferability Is Ontologically Permanent, Not a Transitional Stage
The core commitment of §9: the non-transferability of human subjectivity in the 14DD/15DD normative domain is ontologically permanent, not a transitional stage awaiting elimination after further LLM capability advance.
- The non-transferability stems from LLM's structural lack of the 13DD+ substrate (§5 five-fold lock)—it is structural, not an engineering limitation
- Scaling does not break the five-fold lock (§5.6 claim status)—non-transferability is not eliminated by capability advance
- Even at the most sophisticated levels of post-training, LLMs remain carriers and reorganizers of normative content, not its source—non-transferability is ontological
This permanence is the core of the paper's normative commitment. It is not a judgment about present-stage technical limitations but a conclusion about ontological structure. Within the scope of the paper's argument (the deterministic digital LLM scaling path), the non-transferability of human subjectivity is not bounded by any capability threshold: LLMs do not become bearers of normative content through scaling.
9.6 Colonization Detection: Four Forms of Industry Trajectories
This section applies the four forms of colonization detection from Methodology §2.3 to industry trajectories, with defensible sharpness (per §1.4 sharpness criterion).
Form I: Conditional Posing as Unconditional
"Scaling solves everything": scaling laws are effective empirical regularities within the tested regime (conditional construct), but some industry voices wrap them as the universal law "scale yields all intelligence including subjectivity" (unconditional claim). This is Form-I colonization—wrapping a regime-effective conditional empirical observation as an unconditional universal law.
Detection criterion: scaling laws are valid within their tested regime (data, compute, parameter regions); extrapolating them to "scale yields subjectivity emergence" overreaches their tested boundary. §5's five-fold lock argues that within the analyzed pathway, scaling does not cross the 12DD ceiling.
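The conditional character of a scaling law can be illustrated with a minimal sketch. It fits a power law on synthetic points and refuses to answer outside the range it was fitted on. The class name `RegimeBoundedLaw`, the functional form loss ≈ a·N^(-b), and all numbers are assumptions made for illustration; they are not taken from BLR-mini or from any published scaling-law fit. The sketch shows only the conditional-versus-unconditional distinction, not the five-fold lock argument.

```python
import math


def fit_power_law(ns, losses):
    """Least-squares fit of loss ~ a * n**(-b) in log-log space."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in losses]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope


class RegimeBoundedLaw:
    """A fitted law that remembers the regime it was tested in (hypothetical helper)."""

    def __init__(self, ns, losses):
        self.a, self.b = fit_power_law(ns, losses)
        self.lo, self.hi = min(ns), max(ns)

    def in_regime(self, n):
        return self.lo <= n <= self.hi

    def predict(self, n):
        # Outside the tested range the fit is silent; it does not extrapolate.
        if not self.in_regime(n):
            raise ValueError(f"n={n:g} is outside the tested regime [{self.lo:g}, {self.hi:g}]")
        return self.a * n ** (-self.b)


# Synthetic, noise-free illustration data (NOT measured results).
tested_n = [1e6, 3e6, 1e7, 3e7]
tested_loss = [5.0 * n ** -0.1 for n in tested_n]
law = RegimeBoundedLaw(tested_n, tested_loss)
```

Calling `law.predict(1e7)` returns a value because 1e7 lies inside the fitted range, while `law.predict(1e9)` raises `ValueError`. The guard encodes the detection criterion in the simplest possible way: a claim about the region beyond the tested boundary needs its own evidence.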
Form II: Construct Posing as Law
"RLAIF fully replaces human evaluators": RLAIF is an effective engineering construct (using LLM to judge LLM output, improving efficiency), but some industry voices wrap it as "AI can fully self-align, no longer needing human evaluators" (universal solution). This is Form-II colonization—wrapping an engineering scheme as a universal law.
Detection criterion: per §6.3, the AI-judge part of RLAIF is 12DD recursion over already-imprinted fossils, creating no new 14DD content. RLAIF can be an efficiency optimization but cannot fully replace the human evaluator's role as the 14DD content source. "Self-improving constitutional AI" type framings, if implying that AI no longer needs human normative-content import, are Form-II colonization.
Form III: Emergent Layer Posing as Foundational Layer
Key distinction: "operating within the jurisdiction" ≠ "reducible to"—LLM's excellence in the 12DD domain does not equal it carrying 13DD+ work. The core of Form-III colonization is extending LLM's demonstrated capability in the 12DD domain to capability claims about 13DD+ work.
Most acute current Form-III case: the "LLM self-awareness / introspection / metacognition" research heavily promoted in the industry (the self-critique of o1-series reasoning models, DeepSeek-R1, Anthropic Claude's constitutional self-critique, etc.). It attempts to use 12DD operations to simulate the 13DD self-other distinction.
From the Thermo X §2.4 channel-creator framework, these self-critiques remain deterministic algorithmic routing by the 12DD workbench over frozen weights, not creating a true self-referential variable v channel, still within the 12DD ceiling. The industry's framing of them as "metacognition" or "self-awareness" is, in the SAE/Thermo framework, Form-III colonization (12DD operations posing as 13DD substrate).
Not conflated with engineering progress (important distinction): o1, DeepSeek-R1, and other reasoning models show substantive progress on specific benchmarks (olympiad mathematics, programming, scientific reasoning). The paper's Form-III stance concerns the ontological assessment (whether these self-critique mechanisms constitute a true 13DD self-other distinction), not the engineering-capability progress of reasoning models. Engineering progress within the 12DD domain is reasonable; the paper does not claim these models are inferior to previous LLM systems in task performance.
The paper's stance is: even if reasoning models' engineering progress is substantively significant, they still operate at the analog-DD functional-slot level (multi-step traversal across 12DD), not rising to true 13DD bearer. This stance coordinates with §10.3's epistemic boundary (unknown pathways not closed off)—the paper does not claim future architectures can never cross, only that the current reasoning-model approach is, in the SAE/Thermo framework, an analog-DD operation. The continuing improvement of reasoning capability within 12DD by the engineering community is reasonable and not excluded by the paper's argument.
Form IV: Posthumous Decomposition of the Categorical Imperative
The categorical imperative (Kant 1785) manifests in the SAE framework in double-negation form: its two components are structurally inseparable, and decomposing them loses necessity (per Methodology §1.8).
Statement about industry attribution: the specific cases of Form IV (e.g., "pretraining is core, post-training is decoration") are not literally claimed in the public statements of major AI labs; Anthropic and OpenAI public statements actually emphasize the importance of alignment. Form IV concerns an industry tendency held by some voices, not the direct statements of major labs. For example, "scaling is all you need" appears in investment circles, some blogs, and some technical commentary, reflecting an implicit tendency. This tendency is partly corrected by engineering practice (the industry privately invests heavily in human-expert post-training, see §9.8), even though the mainstream narrative does not articulate it explicitly. What §9.6 Form IV critiques is this implicit tendency, not the explicit statements of specific labs.
"Helping the user vs. respecting autonomy": splitting the inseparable "treat the user as an end in itself" (14DD categorical imperative) into a prohibition (respect) plus an ideal (help). A true Kantian categorical imperative cannot be split into two separate maxims: "help the user" and "respect user autonomy" are two faces of the same imperative, and splitting loses the imperative's wholeness.
"Build the model first, add alignment later" tendency: some in the industry tend to split the human-AI division of labor into an engineering stage plus an alignment stage, but the division of labor is ontologically inseparable. This split makes alignment appear to be an optional late patch, whereas ontologically alignment is the continuing-import channel for 14DD/15DD content: not an optional stage but a part of the model's ontology.
"Pretraining is core, post-training is decoration" tendency: some in the industry tend to split the unified 14DD/15DD normative-content import path into a main course (pretraining) plus a side dish (post-training), but post-training is exactly the unique channel for continuing normative-content import and cannot be downgraded to an optional addendum (per §9.4). This split leads the industry to believe it can invest in pretraining and skip post-training with humans; once post-training is skipped, the normative-content import channel is cut and the model loses its connection to civilization in the 14DD/15DD domain.
Colonization detection criterion: the categorical imperative is necessarily in double-negation form; decomposition loses necessity—all three tendencies above downgrade an ontologically inseparable whole into optional components through decomposition.
9.7 Careful Borrowing of the Thermo X Kant Structural Resonance
Thermo X §4.6 gives a deep structural resonance: the unique protocol for stable cross-system high-q coupling is mutual access to the higher-order self-referential variable—this is structurally isomorphic with Kant's categorical imperative "treat the other as an end."
But Thermo X simultaneously gives an explicit caveat: "philosophical resonance, not deriving ethics from thermodynamics. Humans are not merely thermodynamics. Control experiments have shown that the direct physical cause of cross-level patterns is signal statistical compatibility, not 'purpose recognition.' Any inference from thermodynamics to ethics requires independent work far beyond the scope of this paper."
This paper strictly observes this caveat. The §9 normative argument is grounded in:
- The ontological structure of the SAE framework (the inseparable double-negation form of 14DD/15DD)
- The categorical imperative in double-negation form (an ontological commitment, not derived from thermodynamic facts)
The structural resonance with Thermo X serves only as a resonance / supporting echo, not as the foundation of the normative argument. The paper does not claim that the non-transferability of human subjectivity is derived from thermodynamics; it claims that this non-transferability is an ontological commitment, and the thermodynamic structure resonates with it.
This careful borrowing is the key claim discipline of §9—it prevents the paper from being read as "deriving ethics from physics" (a logical error). The normative commitment is anchored at the ontological + categorical-imperative level; thermodynamics provides a resonant structure but is not the source of the argument.
9.8 Relation with Industry Practice: Engineering Practice Has Partly Corrected the Implicit Tendency
A balanced observation: although some mainstream industry narratives carry an implicit "post-training is decoration" tendency (§9.6 Form IV), actual industry engineering practice has substantially corrected this tendency:
- The industry invests heavily in human-expert post-training (Scale AI, Surge AI and other data labeling / RLHF service companies continue to grow their business)—this is consistent with §9.3's "post-training is the core channel for normative-content import"
- Anthropic's CCAI continues to involve human authors writing the constitution (§6.3)—consistent with §9.1's "the source of normative content must be a true 13DD+ subject"
- OpenAI's process supervision in formalized domains (Lightman et al. 2023) is reasonable within the instrumental-rationality domain—consistent with §8.1
That is, although the Form-IV tendency persists at the narrative level, engineering practice has largely moved in the direction the SAE framework recommends. This is the "convergence of practice and ontology" phenomenon: even when the mainstream narrative does not explicitly articulate it, engineering practice has converged toward the ontologically reasonable architecture (as with the industry convergence on the four-tier architecture in §8.10).
The paper's critique is therefore directed at the narrative-level implicit tendency, not at the industry's actual engineering practice (which is largely reasonable). This distinction keeps the paper's critique from being read as "denying all industry work"; it should be read as "the narrative has not yet caught up with the ontologically reasonable status of the practice."
9.9 Scope and Boundary of the Commitment
The §9 normative commitment has a clear scope and boundary:
Within scope (strong commitment):
- In the 14DD/15DD normative domain, the source of normative content must be a true 13DD+ subject (in this paper's domain, human)
- LLMs cannot spontaneously generate normative content; they can only receive and reorganize
- Human-expert post-training is the core channel for continuing normative-content import
- This non-transferability is ontologically permanent within the analyzed pathway
Boundary (does not over-commit):
- The paper does not claim that humans are the only possible 13DD+ subject in the universe. Within this paper's deployment domain the "true 13DD+ subject" is in fact human, but if the SAE framework in the future acknowledges that some non-human system possesses true 13DD+ subjectivity, this does not break the paper's logic (the paper only locates the true 13DD+ subject as human within the current deterministic digital LLM deployment domain)
- Investing in AI's instrumental-rationality capability is reasonable—within the scope of the paper's argument, 12DD instrumental-rationality capability has no 13DD-bridge-style ontological upper limit; the specific engineering ceiling still depends on engineering factors such as data, compute, architecture, cost, and evaluation
- The paper does not claim AI cannot help with normative-domain tasks—AI can provide information support, present options, organize considerations in the 14DD/15DD domain; what it cannot do is be the source of normative content or replace subject commitment
This scope statement coordinates with §5.8's epistemic boundary—the paper makes strong claims about the deterministic digital LLM scaling path while not closing off unknown pathways.
9.10 Scope Statement of Scaling-LLM AGI
The paper's strong commitment "Scaling-LLM AGI will not arrive" requires a strict scope statement to avoid being read as a universal denial of all possible AI.
Specific claim: the current deterministic digital LLM scaling path will not cross the 12DD → 13DD subjectivity bridge through scaling. This is the specific scope of "Scaling-LLM AGI will not arrive"—it targets the scaling-LLM path, not all possible AI architectures.
Not claimed:
- Not claiming "AGI is absolutely impossible"—unknown pathways (algorithmic stochasticity complex regimes, emergent complexity, quantum substrates, completely unknown mechanisms) are not within the paper's closure scope (§5.8)
- Not claiming "AI capability has stopped advancing"—12DD instrumental-rationality capability can continue to grow; the paper only claims it does not cross 13DD through scaling
- Not claiming "the industry's AI investment is meaningless"—the industry's investment within 12DD is reasonable; the paper only criticizes wrapping a conditional claim as an unconditional one
Coordination of the tension (between the strong commitment and unknown pathways not being closed): the two positions hold simultaneously because the paper's scope is clearly stated. Readers must understand the strong commitment as a strong commitment about the current deterministic digital LLM scaling path, not as a universal denial of all AI. The conclusion (§11) restates this coordination: "Scaling-LLM AGI will not arrive" and "unknown pathways not closed off" are made self-consistent through the scope statement.
This scope statement is the key claim discipline of the paper's strongest external-facing commitment—the front stage cannot over-claim; speaking at half-register (a conditional rather than unconditional claim) is the right discipline. The paper's strong commitment is sharp within its scope, and the scope is clearly stated, making the commitment defensible.
§9 Remainders
The §9 normative commitment is grounded in the ontological structure of the SAE framework + the categorical imperative in double-negation form; this grounding is internal to the SAE framework. Readers who do not accept the SAE framework may not accept this grounding—the paper does not provide an SAE-framework-independent normative argument, which is a scope limitation (the paper's normative argument is conditional on the SAE framework).
§9.7's careful borrowing of the Kant structural resonance is a key claim discipline, but the precise relation between "ontological commitment" and "thermodynamic resonance" still has conceptual space—the paper claims it is "resonance not derivation," but the precise philosophical articulation of how an ontological commitment relates to a thermodynamic structure is a deeper question, left for future work.
The colonization detection of industry trajectories (§9.6) is the paper's external-facing critique; although it uses defensible sharpness, the boundary between "implicit tendency" and "explicit statement" is itself fuzzy—the paper attributes the four forms to industry tendencies rather than specific labs, but the precise mapping between tendency and specific statements requires more careful empirical analysis, which the paper does not fully unfold.
The statement that the "true 13DD+ subject is in this paper's domain in fact human" (§9.9) leaves space for the possibility of non-human 13DD+ subjects, but the paper does not unfold the conditions under which a non-human system might possess true 13DD+ subjectivity (that is the work of the subsequent SAE AI series / AI-Qualia paper). This is a deliberate scope reservation, but it also leaves the boundary of the normative commitment partly open.
§10 Limitations and Open Questions
This section makes explicit the limitations and open questions of the paper. Per the SAE methodology "acknowledge incompleteness" (Methodology §2.2 step 5), the paper makes its own remainders explicit rather than giving a closed final position. The limitations are organized in three tiers: principal limitations (§10.1-§10.6), scope limitations (§10.7-§10.9), and self-examination (§10.10-§10.12).
10.1 Empirical Scale Gap
The BLR-mini empirical work is at the 30M scale, with a significant gap from the industry frontier (1B+). Whether 30M-scale patterns fully hold at frontier scale is an open question. Distinguishability testing requires frontier-scale experiments (§4.7)—the BLR-mini negative results are consistent with SAE/Thermo predictions only within the tested 30M flat-repeat design space, not independently proving the predictions hold at frontier scale.
10.2 Untested Architectural Directions
The paper tested only three architectural axes (FFN/LN/hidden). It did not test block-type heterogeneity (attention-vs-SSM region division, Jamba-style interleaving), exploration-selection separation (the true architectural redesign suggested by Thermo IX §Outlook 5), or other unforeseen architectural directions. These untested directions are candidates for future work on verifying SAE/Thermo predictions.
10.3 Epistemic Boundary: Unknown Pathways Not Closed Off
The paper closes only the analyzed soft-gate transmission pathway under the current deterministic digital LLM scaling (§5.8). Unknown pathways (algorithmic stochasticity complex regimes, emergent complexity, quantum substrates, completely unknown mechanisms) are not within the paper's closure scope. This is the key epistemic boundary—the paper makes strong claims about the analyzed pathway while reserving epistemic space for unknown pathways, consistent with Thermo IX §3.6's positive posture.
10.4 Lack of Direct LLM Mechanism Measurements
The paper's thermodynamic argument (§5 five-fold lock, §6 three-q analysis) is based on inference from the SDE model class + Universal Activation Rule + cross-level coupling theory, not direct LLM mechanism measurements. Sub-block-level q measurements, self-referential-channel measurements, LLM-user cross-level coupling q measurements on real LLMs have not been carried out (§3, §5, §6 remainders). The inference is self-consistent within the SAE/Thermo framework, but direct validation on real LLMs is future work.
10.5 Limitations of Testing within the Flat-Repeat Frame
All eight architectural heterogenizations tested in the paper still operate within the flat-repeat Transformer frame (§4.8). The paper cannot falsify the Thermo IX §Outlook 5 exploration-selection separation architectural prediction (because it did not test that direction); it can only falsify the hypothesis "reallocation within the flat-repeat frame unlocks emergence" (this hypothesis is thoroughly tested and systematically fails within the paper's scope).
10.6 Statistical Robustness
The paper's empirical work uses only a single seed and a single dataset (OpenWebText), without multi-seed variance estimation or cross-dataset validation (FineWeb-Edu, Stack, C4, etc.). This is a standard empirical-paper limitation, especially when negative results carry significant epistemic weight—multi-seed and cross-dataset validation would strengthen confidence in negative results. At the paper's current scale (30M, single dataset, single seed), the argument relies on the robustness of relative cross-architecture comparison, not the precision of absolute numbers.
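As a hedged illustration of what multi-seed variance estimation would add, the sketch below compares the loss gap between two architectures against the spread across seeds. The function name `gap_vs_seed_noise` and every number are placeholders chosen for illustration; they are not BLR-mini measurements, and the paper itself ran a single seed. Using the larger of the two per-architecture standard deviations as the noise scale is one simple convention among several.

```python
from statistics import mean, stdev


def gap_vs_seed_noise(baseline, variant):
    """Compare a cross-architecture loss gap with seed-to-seed variation.

    baseline, variant: final losses from runs that differ only in random seed.
    Returns (gap, noise, ratio), where gap is the difference of means,
    noise is the larger sample standard deviation, and ratio = |gap| / noise.
    """
    gap = mean(variant) - mean(baseline)
    noise = max(stdev(baseline), stdev(variant))
    ratio = abs(gap) / noise if noise > 0 else float("inf")
    return gap, noise, ratio


# Placeholder losses for three seeds per architecture (NOT measured results).
baseline_runs = [3.50, 3.52, 3.48]
variant_runs = [3.56, 3.58, 3.54]

gap, noise, ratio = gap_vs_seed_noise(baseline_runs, variant_runs)
print(f"gap={gap:.3f}  seed noise={noise:.3f}  ratio={ratio:.1f}")
```

A ratio well above 1 would suggest the architecture ordering is not an artifact of seed choice, while a ratio near or below 1 would mean a single-seed comparison cannot distinguish the two variants. With only one seed per variant, as in the current study, `stdev` is undefined and this check cannot be run at all, which is the limitation this subsection records.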
10.7 The Normative Argument Is Conditional on the SAE Framework
The §9 normative commitment is grounded in the ontological structure of the SAE framework + the categorical imperative in double-negation form; this grounding is internal to the SAE framework. Readers who do not accept the SAE framework may not accept this grounding—the paper does not provide an SAE-framework-independent normative argument. This is a scope limitation—the paper's normative argument is conditional on the SAE framework.
10.8 Open Observations Not Upgraded to Regularities
The even-odd DD compatibility pattern (§7.4) is observed only across five DDs, with insufficient samples; it is recorded as an open observation, not upgraded to a structural regularity. The paper's main argument does not depend on this pattern as a regularity.
10.9 Temporal Frame of Industry Techniques
The paper's analysis covers mainstream industry techniques at the time of writing (May 2026)—CoT, CCAI/RLAIF, RLHF, user modeling—but does not cover all emerging variants (§6 remainder). New techniques may introduce new mechanisms after publication; this is a temporal-frame limitation, requiring subsequent updates or follow-ups.
10.10 Self-Examination: The Paper's Own Construct Has Remainders
Per the SAE methodology, the paper's own construct (SAE ontology + thermodynamic explanation) has its own remainders. The paper does not claim the SAE framework is "the only reasonable framework"—it is itself a construct, subject to the same chisel. Each section closes with remainders rather than conclusions, observing the discipline "negation cannot terminate."
10.11 Self-Examination: The Risk of Over-Confidence
The paper's strong commitment (Scaling-LLM AGI will not arrive) is sharp within its scope, but the author is aware of the risk of over-confidence—the scope statement (§9.10) is the key claim discipline that prevents this commitment from being read as a universal denial. The author acknowledges that future empirical work (especially frontier-scale and exploration-selection separation experiments) may pressure the SAE/Thermo predictions; the paper makes these distinguishability tests explicit (§4.7) precisely to allow itself to be pressured.
10.12 Self-Colonization Detection
Per Methodology §5.4 error mode (3), the SAE framework itself can be misused to colonize other frameworks—labeling any other framework as "colonization," making the SAE framework a colonizing tool. The author is not exempt from this detection on his own framework. The self-colonization risk is jointly checked by multi-AI review (this paper was reviewed by four AI systems; see Acknowledgments) and ongoing reader feedback. The author commits to treating the SAE framework as a construct with remainders, not as a final position.
§11 Conclusion
This paper systematically answers two fundamental questions—how is LLM emergence mechanistically possible (the descriptive question), and why is human subjectivity non-transferable in the LLM era (the normative question)—from the SAE 16DD ontological framework and the ZFCρ thermodynamics series.
Main descriptive conclusions:
- LLMs realize an analog-DD functional-projection instantiation along the 1DD-12DD single-channel pathway (functional projection, not ontological bearer; §3), anchored at sub-block precision via the Thermo IX Transformer thermodynamic anatomy
- The BLR-mini eight-variant empirical study yields results consistent with SAE/Thermo predictions within the tested flat-repeat Transformer design space (§4); the industry's uniform architecture is locally robust within the current mainstream pathway
- The μ-trajectory turning positive is a candidate mechanism-level diagnostic of functional emergence (§4.6), complementary to existing industry hidden-state analysis lineages
- The 12DD ceiling is supported by a five-fold structural lock (§5), with primary anchors (absence of true randomness + absence of the 13DD self-referential channel) supported by complete Thermo IX/X mechanism chains; the current deterministic digital LLM scaling path does not cross the 12DD → 13DD bridge through scaling
- The industry's post-training techniques all operate within the 12DD ceiling (§6); they are substantive within 12DD (importing carbon-based 14DD/15DD content as fossils) but do not create kernel q
Main normative conclusions:
- The source of 14DD/15DD normative content must be a true 13DD+ subject (in this paper's domain, human; §9.1-§9.3); LLMs lack the 13DD+ substrate, cannot spontaneously generate 14DD content, and can only receive it
- Post-training with human experts is the core channel for the continuing import of 14DD/15DD normative content (§9.3-§9.4); pretraining is the instrumental-rationality substrate, human-expert post-training is the key interface for normative content to enter model behavior
- Within the 14DD/15DD normative domain, human subjectivity non-transferability is ontologically permanent (§9.5), not a transitional stage
- Instrumental rationality and volitional ideal divide labor ontologically (§8.4); LLMs hold a systematic advantage in the 12DD instrumental-rationality domain precisely because they lack 13DD/14DD/15DD modulation, while subjects hold the 14DD/15DD volitional-ideal domain
- The four-tier deployment architecture (Subject → Local LLM → Expert API → Tools) is the default subjectivity-protective architecture (§8.7-§8.10), structurally consistent with multiple independent industry trends
Methodological self-awareness (six-step chisel-construct cycle, completing one round):
- Hold the construct: the mainstream industry construct "scaling + compute = intelligence"
- Negation: SAE + Thermo IX/X identifies the colonization "12DD = complete intelligence" within that construct
- Track the remainder: the 12DD ceiling is structurally permanent on the analyzed pathway (§5); scaling does not break it; the remainder points to the ontological position of true 13DD+ subjectivity (§9)—the bearer must be a 13DD+ subject (in this paper's domain, human), not the LLM
- Correction: the paper proposes the corrected construct of "instrumental/volitional division of labor + four-tier default deployment + sustained 14DD/15DD-domain content import"
- Acknowledge incompleteness: §10 makes multiple limitations explicit; the paper's own framework is subjected to self-colonization detection (§10.12)
- Set out again: §10.3 reserves epistemic space for unknown pathways; the paper does not co-opt negation
Closing:
The paper's strongest commitment—"Scaling-LLM AGI will not arrive"—is a sharp claim within its scope (the current deterministic digital LLM scaling path), and the scope is clearly stated; unknown pathways are not closed off. The two positions hold simultaneously: the front stage cannot over-claim; speaking at half-register (a conditional rather than unconditional claim) is the right discipline.
The non-transferability of human subjectivity, as the title's categorical imperative in double-negation form, is not a deficit framing (LLM inferior to human) nor a supremacist framing (AI surpassing human), but an ontological division of labor—each in its own DD slot. LLMs are systematically superior in the 12DD instrumental-rationality domain; subjects are non-transferable in the 14DD/15DD volitional-ideal domain. The division is not competition but a collaborative architecture.
The paper is one chisel against the mainstream industrial construct "scaling + compute = intelligence." It is itself a construct, with its own remainders. There are always remainders—the chisel-construct cycle never terminates.
Acknowledgments
The author thanks four AI systems for their reviews across multiple draft iterations of this paper. The four AI systems are named after disciples in the Analects, Book XI (Xian Jin), reflecting their distinct review characteristics. The specific contributions of each are traceable below.
Zixia (Gemini): provided sustained feedback on framework completeness and scholarly rigor. The decision to keep the §3.4 Transformer thermodynamic anatomy table in the main text came from Zixia's explicit suggestion—it pointed out that this anatomy is the real anchoring depth of the SAE framework within the LLM domain and should not be downgraded to an appendix. The decision not to merge the six sub-steps of the §5.1 true-randomness argument also came from Zixia—it pointed out that each sub-step is an independent link in the argument's defense chain, and trimming would reduce overall strength. Zixia provided multiple rounds of fine proofreading on SAE tool-citation accuracy and the specific-section localization of the ZFCρ thermodynamics series.
Independent Zilu (Claude): provided sustained feedback on argument coherence and paper structure. The addition of the §4.7 distinguishability subsection (BLR-mini vs. mainstream predictions) came from Independent Zilu's key opinion—it pointed out that the BLR-mini 30M negative results are data-level compatible with the mainstream "scale not enough" prediction, and the paper must make explicit that the true distinguishability test requires frontier-scale experiments, otherwise the argument would not be defensible before mainstream ML readers. Independent Zilu also repeatedly identified potential reader-misreading paths in the paper's scope statements (descriptive vs. normative vs. falsifiability), making the paper's scope management more rigorous.
Zigong (Grok): provided sustained feedback on debate sharpness and the force of the industry critique. The systematic application of the four forms of colonization detection in §9.6 benefited from Zigong's multiple rounds of sharpening suggestions: it pointed out that applying abstract methodology to specific industry cases requires keeping argumentative force without sliding into blanket dismissal. The precise distinction in §9.6 Form III (LLMs outperforming most humans on narrow technical tasks, which does not constitute colonization, vs. value-judgment colonization) came from Zigong's repeated pressing: it pointed out that this distinction decides whether the paper's normative stance is defensible within engineering culture. Zigong also offered several extension suggestions (e.g., a fourth q-type prompt q, a fifth-tier cross-subject coordination layer, self-colonization as a fifth colonization form). Some were accepted by the paper (§10.12 self-colonization detection); others were explicitly left out and recorded as scope statements. These discussions themselves made the paper's scope clearer.
Gongxihua (Gong Xichi / ChatGPT): provided sustained feedback on wording precision and reader adaptation. The six required revisions to the V0.2 outline came from Gongxihua's systematic review: narrowing the AGI scope to Scaling-LLM AGI; downgrading the BLR-mini claim to an empirical anchor within the tested design space; adding the functional-projection firewall to the DD mapping; reframing the true-randomness argument in terms of an architecturally effective ε; restricting the instrumental-rationality advantage to the task domain; and limiting the post-training core to the 14DD/15DD normative domain. These six revisions were the key turning point from over-claiming to defensible sharpness. Gongxihua also repeatedly caught unnoticed English-Chinese code-mixing and imprecise wording across multiple drafts, bringing the paper's wording to its current level of precision.
The author also thanks "Chen Zesi" for support.
The author alone is responsible for all content of this paper. The four AI systems provided review feedback; the final decisions and all claims are the author's own.
References
SAE Framework and Methodology
Qin, H. (2026d). Self-as-an-End: The 16DD Ontological Framework (SAE Paper 03). DOI: 10.5281/zenodo.18727327.
Qin, H. (2026). SAE Methodology V2 (Paper 04). DOI: 10.5281/zenodo.18842449.
Qin, H. SAE Methodology Portal. https://self-as-an-end.net/papers/methodology.html (accessed May 2026).
ZFCρ Thermodynamics Series
Qin, H. (2026d). ZFCρ Thermodynamics VIII: Soft-Gate Cascade. DOI: 10.5281/zenodo.19688303.
Qin, H. (2026e). ZFCρ Thermodynamics IX: Universal Activation Rule, Three q-Types, and Transformer Thermodynamic Anatomy. DOI: 10.5281/zenodo.19699489.
Qin, H. (2026f). ZFCρ Thermodynamics X: Channel Creator, Cross-Level Observation Hierarchy, and Kantian Structural Resonance. DOI: 10.5281/zenodo.19703274.
Scaling Laws and Emergence
Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling Laws for Neural Language Models. arXiv:2001.08361.
Hoffmann, J., Borgeaud, S., Mensch, A., et al. (2022). Training Compute-Optimal Large Language Models (Chinchilla). arXiv:2203.15556.
Wei, J., Tay, Y., Bommasani, R., et al. (2022). Emergent Abilities of Large Language Models. Transactions on Machine Learning Research (TMLR).
Schaeffer, R., Miranda, B., & Koyejo, S. (2023). Are Emergent Abilities of Large Language Models a Mirage? Advances in Neural Information Processing Systems 36 (NeurIPS 2023).
Alignment and Post-Training
Ouyang, L., Wu, J., Jiang, X., et al. (2022). Training Language Models to Follow Instructions with Human Feedback (InstructGPT). Advances in Neural Information Processing Systems 35.
Bai, Y., Kadavath, S., Kundu, S., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
Anthropic (2023). Claude's Constitution. https://www.anthropic.com/news/claudes-constitution (accessed May 2026).
Lightman, H., Kosaraju, V., Burda, Y., et al. (2023). Let's Verify Step by Step. International Conference on Learning Representations (ICLR 2024).
Transformer Architecture and Layer Normalization
Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems 30.
Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. arXiv:1607.06450.
Xiong, R., Yang, Y., He, D., et al. (2020). On Layer Normalization in the Transformer Architecture. International Conference on Machine Learning (ICML 2020).
Kim, J., Lee, B., Park, C., Oh, Y., Kim, B., Yoo, T., Shin, S., Han, D., Shin, J., & Yoo, K. M. (2025). Peri-LN: Revisiting Layer Normalization in the Transformer Architecture. arXiv:2502.02732.
Hidden-State Dynamics and Interpretability
Dettmers, T., Lewis, M., Belkada, Y., & Zettlemoyer, L. (2022). LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale. Advances in Neural Information Processing Systems 35.
Gallego-Feliciano, J., McClendon, S. A., Morinelli, J., Zervoudakis, S., & Saravanos, A. (2025). Hidden Dynamics of Massive Activations in Transformer Training. arXiv:2508.03616.
Damirchi, H., Meza De la Jara, I., Abbasnejad, E., Shamsi, A., Zhang, Z., & Shi, J. (2026). Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning. arXiv:2603.01326.
Ghasemabadi, A., & Niu, D. (2025). Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits (Gnosis). arXiv:2512.20578.
Architecture Variants
AI21 Labs (2024). Jamba: A Hybrid Transformer-Mamba Language Model. arXiv:2403.19887.
Dai, Z., Lai, G., Yang, Y., & Le, Q. V. (2020). Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing. Advances in Neural Information Processing Systems 33.
Deployment Architecture (Industry)
Apple Inc. (2025). Apple Intelligence Foundation Language Models Tech Report 2025. Apple Machine Learning Research. https://machinelearning.apple.com/research/apple-foundation-models-tech-report-2025 (accessed May 2026).
Apple Inc. (2024). Private Cloud Compute: A new frontier for AI privacy in the cloud. Apple Security Research. https://security.apple.com/blog/private-cloud-compute/ (accessed May 2026).
Anthropic (2024). Introducing the Model Context Protocol. https://www.anthropic.com/news/model-context-protocol (accessed May 2026).
Model Context Protocol (2024-2025). MCP Specification. https://modelcontextprotocol.io (accessed May 2026).
Google (2025). Gemini 2.5 Updates with Native MCP Support. Google I/O 2025 Announcements. https://blog.google/innovation-and-ai/models-and-research/google-deepmind/google-gemini-updates-io-2025/ (accessed May 2026).
Classical Philosophy
Kant, I. (1785). Groundwork of the Metaphysics of Morals.
End of paper.