Self-as-an-End Theory Series · Mathematical Foundations · ZFCρ Series Paper XVI · Zenodo 19013602

The Insertion Identity, Variance Splitting, and the Relay Engine of Conjecture H'

Han Qin (秦汉) · Independent Researcher · March 2026
DOI: 10.5281/zenodo.19013602 · CC BY 4.0 · ORCID: 0009-0009-9583-0018
Abstract

We establish the algebraic and analytic engine underlying the compositeness discount identified in Paper 15. Three structural results are proved unconditionally. First, the exact insertion identity: for any composite n = pm with p = P⁻(n), the SPF gain decomposes as G_spf(n) = j(m) + K_p(m), where j(m) = max(G(m), 0) is the previous-layer jump and K_p(m) = ρ_E(pm−1) − ρ_E(m−1) − ρ_E(p) − 2 is the bridge term. This gives the exact recursion μ_{k+1}(x) = E_{I_k}[j(m)] + E_{I_k}[K_p(m)], reducing the Unbounded Mean Gain conjecture to two sub-problems. Second, the variance splitting decomposition: G_spf = A + B + 1, where A(n) = ρ_E(n−1) − ρ_E(n) ≥ −1 and B(n) = ρ_E(n) − ρ_E(P⁻(n)) − ρ_E(n/P⁻(n)) − 2 ≤ 0, and moreover B ∈ [−2(k−1), 0] with k = Ω(n) whenever v_{P⁻(n)}(n) = 1 (squarefree-in-P⁻, unconditional). Third, the bridge term identity: K_p(m) = A(pm) + B(pm) − A(m) (unconditional). Under Numerical Hypotheses A-tail and B-bound, Var(G_spf | Ω = k) = O_k(1). Numerically at N = 10⁷: E[G_spf | Ω = k] grows linearly with slope ≈ 0.26 (R² = 0.995), Var ∈ [1.21, 1.83], insertion bias ≈ +0.31. The relay mechanism, in which j(m) transmission compensates the K_p drag, is quantitatively confirmed. A Bridge Corollary passes to x → ∞ limits: if Lemmas I and II hold uniformly in x and Σ_j (δ_{j,∞} − C_1 − C_2(j)) → +∞, then μ_spf,∞(k) → ∞, completing the bridge to Assumption A'.

Keywords: integer complexity, ρ-arithmetic, insertion identity, variance splitting, relay mechanism, compositeness discount, bridge term, Chebyshev bound

1. Introduction

1.1 Context

Paper 15 (DOI: 10.5281/zenodo.19007312) established a proof architecture for D(N) → 1 under three assumptions: (A) monotone pointwise convergence of p_k(N), (A') p_∞(k) → 1, and (B) Sathe–Selberg. The present paper constructs the algebraic and analytic machinery needed to close Assumptions A and A'.

The key advance is the discovery that the relay mechanism — identified qualitatively in Paper 15 §6.4 — admits an exact algebraic formalization as the insertion identity, and that the variance control needed for Chebyshev application follows from a splitting lemma that exploits the one-sided bounds inherent in the DP definition of ρ_E.

1.2 Main Results

Theorem 1 (Exact Insertion Identity). For any composite n ≥ 4 with P⁻(n) = p and m = n/p,

G_spf(n) = j(m) + K_p(m)

where j(m) = max(G(m), 0) and K_p(m) = ρ_E(pm−1) − ρ_E(m−1) − ρ_E(p) − 2.

Theorem 2 (Variance Splitting). Write G_spf(n) = A(n) + B(n) + 1 where A(n) = ρ_E(n−1) − ρ_E(n) and B(n) = ρ_E(n) − ρ_E(P⁻(n)) − ρ_E(n/P⁻(n)) − 2, and let k = Ω(n). Then: (a) A(n) ≥ −1; (b) B(n) ≤ 0; (c) when v_{P⁻(n)}(n) = 1, B(n) ∈ [−2(k−1), 0] (unconditional), while the case v_{P⁻(n)}(n) ≥ 2 is isolated into Numerical Hypothesis B-bound; (d) under Hypotheses A-tail and B-bound, Var(G_spf | Ω = k, n ≤ N) = O_k(1) uniformly in N.

Theorem 3 (Bridge Term Identity). K_p(m) = A(pm) + B(pm) − A(m) (algebraic identity, unconditional). The expectation bound E_{I_k}[K_p(m)] ≥ −C_2(k) is a numerical observation (C_2 ≈ 1 at N = 10⁷), not proved; it remains an open input (Lemma II).

Theorem 4 (Two-Parameter Shell Mean Identity). Define μ_k(X, p) = E[G_spf(m) | Ω(m) = k, m ≤ X, P⁻(m) ≥ p] and Φ_k(X, p) = #{m ≤ X : Ω(m) = k, P⁻(m) ≥ p}. Then

E_{I_k(x)}[G_spf(m)] = Σ_p Φ_k(x/p, p) · μ_k(x/p, p) / Σ_p Φ_k(x/p, p)

Theorem 5 (Finite-Window Recursion). Under Lemma I (E_{I_k}[G_spf(m)] ≥ μ_k(x) − C_1) and Lemma II (E_{I_k}[K_p(m)] ≥ −C_2(k)),

μ_{k+1}(x) ≥ μ_k(x) + δ_k(x) − C_1 − C_2(k)

where δ_k(x) = E_{I_k}[max(−G_spf(m), 0)] ≥ 0 is the truncation gain. A Bridge Corollary passes to x → ∞ limits: if Lemmas I and II hold uniformly in x and Σ_{j=k_0}^{K} (δ_{j,∞} − C_1 − C_2(j)) → +∞, then μ_spf,∞(k) → ∞.

Theorem 6 (Toy Model). For a finite prime alphabet P = {q_1 < ··· < q_r}, if the shell mean E[G_spf | P⁻(m) = q_i] is non-decreasing in i, then E^ins[G_spf] ≥ E^amb[G_spf].

Numerical Result 7 (N = 10⁷). E[G_spf | Ω = k] grows linearly with slope ≈ 0.26 (R² = 0.995), Var(G_spf | Ω = k) ∈ [1.21, 1.83] (no trend), insertion bias ≈ +0.31, relay j(m)-transmission compensates K_p-drag across all k.

Numerical Observation 8. E_{I_k}[K_p] ≈ −0.7 for large k, driven by p = 2 at 88% weight. The conjecture E_{I_k}[K_p] > 0 is false.

2. The Exact Insertion Identity

2.1 Statement and Proof

Theorem 1. For any composite n ≥ 4 with P⁻(n) = p and m = n/p, G_spf(n) = j(m) + K_p(m), where j(m) = max(G(m), 0) and K_p(m) = ρ_E(pm−1) − ρ_E(m−1) − ρ_E(p) − 2.

Proof. By definition, G_spf(n) = ρ_E(n−1) − ρ_E(p) − ρ_E(m) − 1. The DP recurrence gives ρ_E(m) = ρ_E(m−1) + 1 − j(m). Substituting:

G_spf(n) = ρ_E(pm−1) − ρ_E(p) − [ρ_E(m−1) + 1 − j(m)] − 1 = [ρ_E(pm−1) − ρ_E(m−1) − ρ_E(p) − 2] + j(m) = K_p(m) + j(m). □

Remark. When m is prime, G(m) ≤ 0, so j(m) = 0 and ρ_E(m) = ρ_E(m−1) + 1, consistent. Verified: N = 10⁷, k = 3 through 18, identity holds exactly for all composites.
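The identity can be checked on a small range with a short script. The sketch below is not the author's code: it assumes the DP recurrence ρ(n) = min(ρ(n−1) + 1, min over n = ab of ρ(a) + ρ(b) + 2), which is consistent with the successor and split bounds used in this paper, and it assumes the base value ρ(1) = 0, which is not stated here. The names `rho_table`, `jump`, and `check_insertion_identity` are hypothetical.

```python
from math import isqrt


def rho_table(n_max, base=0):
    """DP table under the ASSUMED recurrence
    rho[n] = min(rho[n-1] + 1, min over n = a*b with a, b >= 2 of rho[a] + rho[b] + 2).
    The base value rho[1] is an assumption; the paper does not state it here."""
    rho = [0] * (n_max + 1)
    rho[1] = base
    for n in range(2, n_max + 1):
        best = rho[n - 1] + 1  # successor step
        for a in range(2, isqrt(n) + 1):
            if n % a == 0:  # multiplicative split n = a * (n // a)
                best = min(best, rho[a] + rho[n // a] + 2)
        rho[n] = best
    return rho


def smallest_prime_factor(n):
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d
    return n


def jump(rho, m):
    """j(m) = max(G(m), 0), where G(m) is the best gain over all splits m = a*b."""
    gains = [rho[m - 1] - rho[a] - rho[m // a] - 1
             for a in range(2, isqrt(m) + 1) if m % a == 0]
    return max(gains + [0])  # primes have no split, so j = 0


def check_insertion_identity(n_max):
    """Assert G_spf(n) = j(m) + K_p(m) for every composite n <= n_max."""
    rho = rho_table(n_max)
    checked = 0
    for n in range(4, n_max + 1):
        p = smallest_prime_factor(n)
        if p == n:
            continue  # n is prime: no SPF split
        m = n // p
        g_spf = rho[n - 1] - rho[p] - rho[m] - 1
        k_p = rho[p * m - 1] - rho[m - 1] - rho[p] - 2
        assert g_spf == jump(rho, m) + k_p, n
        checked += 1
    return checked


if __name__ == "__main__":
    print(check_insertion_identity(5000), "composites checked")
```

Because the proof only uses the recurrence ρ(m) = ρ(m−1) + 1 − j(m), the check should pass for any base value passed to `rho_table`.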

2.2 The Insertion Recursion

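The map (m, p) ↦ n = pm with Ω(m) = k, p prime, p ≤ P⁻(m), pm ≤ x gives a bijection onto {n ≤ x : Ω(n) = k+1, n composite}. Define the insertion measure I_k(x) by this bijection, that is, as the uniform measure on the admissible pairs (m, p). Then, writing μ_{k+1}(x) for the mean of G_spf(n) over {n ≤ x : Ω(n) = k+1}: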

μ_{k+1}(x) = E_{I_k(x)}[j(m)] + E_{I_k(x)}[K_p(m)]

This is exact. The first term transmits the previous-layer jump; the second is the local bridge cost.
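The bijection behind the recursion can be confirmed by brute force on a small window. The following is a minimal sketch with hypothetical helper names (`factor_data`, `insertion_pairs`); it enumerates the admissible pairs (m, p) and returns them together with the Ω table so the image set can be compared with the Ω = k+1 shell.

```python
from math import isqrt


def factor_data(x):
    """Smallest prime factor and Omega (with multiplicity) for all n <= x."""
    spf = list(range(x + 1))
    for d in range(2, isqrt(x) + 1):
        if spf[d] == d:  # d is prime
            for multiple in range(d * d, x + 1, d):
                if spf[multiple] == multiple:
                    spf[multiple] = d
    big_omega = [0] * (x + 1)
    for n in range(2, x + 1):
        big_omega[n] = big_omega[n // spf[n]] + 1
    return big_omega, spf


def insertion_pairs(x, k):
    """All pairs (m, p) with Omega(m) = k, p prime, p <= P^-(m), p*m <= x (k >= 1)."""
    big_omega, spf = factor_data(x)
    primes = [n for n in range(2, x + 1) if spf[n] == n]
    pairs = []
    for m in range(2, x // 2 + 1):
        if big_omega[m] != k:
            continue
        for p in primes:
            if p > spf[m] or p * m > x:
                break  # both conditions are monotone in p
            pairs.append((m, p))
    return pairs, big_omega


if __name__ == "__main__":
    x, k = 2000, 3
    pairs, big_omega = insertion_pairs(x, k)
    images = [p * m for m, p in pairs]
    shell = {n for n in range(2, x + 1) if big_omega[n] == k + 1}
    print(len(images) == len(set(images)), set(images) == shell)
```

Injectivity and surjectivity both follow from the fact that p ≤ P⁻(m) forces P⁻(pm) = p, so the pair is recovered from n by stripping its smallest prime factor.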

2.3 Interpretation: the Relay Formalized

Paper 15 §6.4 identified the relay mechanism qualitatively: P(jump) drives at low k, E[j/ln | jump] drives at high k. The insertion identity reveals the precise algebraic mechanism: the (k+1)-th layer's mean gain inherits the k-th layer's jump sizes (via j(m)), augmented or diminished by the bridge cost K_p(m).

3. The Variance Splitting Lemma

3.1 The A/B Decomposition

Theorem 2. Write G_spf(n) = A(n) + B(n) + 1 where A(n) = ρ_E(n−1) − ρ_E(n) and B(n) = ρ_E(n) − ρ_E(P⁻(n)) − ρ_E(n/P⁻(n)) − 2.

Proof sketches. (a) ρ_E(n) ≤ ρ_E(n−1) + 1 (successor always available), so A(n) ≥ −1. (b) ρ_E(n) ≤ ρ_E(P⁻(n)) + ρ_E(n/P⁻(n)) + 2 (SPF split valid), so B(n) ≤ 0.

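(c) Let p = P⁻(n), v = v_p(n). Write ρ_E(n) = f(n) + r(n), so that B(n) = r(n) − r(n/p) + ε_p − 2. For v = 1 (squarefree-in-P⁻): ε_p = 0, so B(n) = r(n) − r(n/p) − 2. Lower bound: r(n) ≥ 0 and r(n/p) ≤ 2(k−2), giving B ≥ −2(k−1). Hence B ∈ [−2(k−1), 0] unconditionally. For v ≥ 2: ε_p = ρ_E(p^v) − ρ_E(p) − ρ_E(p^{v−1}); the available lower bound deteriorates with p and does not yield an N-independent constant, so this case is isolated into Numerical Hypothesis B-bound.

(d) Since Var(A + B) ≤ 2 Var(A) + 2 Var(B), it suffices to bound the two conditional variances separately. Hypothesis A-tail (an exponential upper tail) together with the lower bound A ≥ −1 bounds the second moment of A, and Hypothesis B-bound bounds Var(B) directly. Hence, under Numerical Hypotheses A-tail and B-bound, Var(G_spf | Ω = k, n ≤ N) = O_k(1) uniformly in N. □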

Remark (Why direct analysis fails). Write G_spf = X − Y − Z − 1 with X = ρ_E(n−1), Y = ρ_E(P⁻(n)), Z = ρ_E(n/P⁻(n)). The "obvious" term-by-term computation gives Var(X) ≈ 20.6, Var(Z) ≈ 23.1, Cov(X,Z) ≈ 19.85: an enormous covariance that nearly cancels the individual variances, a consequence of the shared factor ρ_E(n). The A/B decomposition removes this common factor algebraically, revealing that the true variance comes from O(1) residuals.
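As a rough illustration of the cancellation, the three figures quoted in the remark can be combined directly. This worked line assumes X = ρ_E(n−1) and Z = ρ_E(n/P⁻(n)) (the pairing named in §10.4) and ignores the Y term, so it is only an order-of-magnitude check, not the variance of G_spf itself.

```latex
\operatorname{Var}(X - Z)
  = \operatorname{Var}(X) + \operatorname{Var}(Z) - 2\operatorname{Cov}(X, Z)
  \approx 20.6 + 23.1 - 2(19.85)
  = 4.0
```

Two variances of size about 20 collapse to a single-digit quantity, which is why the term-by-term route loses almost all of its information to the covariance.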

3.2 Numerical Verification

At N = 10⁷, Var(G_spf | Ω = k) ∈ [1.21, 1.83] across k = 2 to 20, no increasing trend. Component analysis at k = 4, N = 10⁶: Var(A) ≈ 0.81, Var(B) ≈ 0.54, Cov(A,B) ≈ −0.05.

k     Var(G_spf | Ω=k)    Var(A)    Var(B)    Cov(A,B)
2     1.21
4     1.43                0.81      0.54      −0.05
10    1.67
20    1.83
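A small-scale version of this component analysis can be sketched as follows. It is an illustration only: the DP recurrence and the base value ρ(1) = 0 are assumptions (the paper does not restate them here), the window is far smaller than N = 10⁶, and the helper names `tables`, `cov`, and `shell_split` are hypothetical, so the printed numbers are not expected to reproduce the table.

```python
from math import isqrt


def tables(n_max, base=0):
    """rho (ASSUMED DP with rho[1] = base), smallest prime factor, Omega, for n <= n_max."""
    rho = [0] * (n_max + 1)
    spf = [0] * (n_max + 1)
    big_omega = [0] * (n_max + 1)
    rho[1] = base
    for n in range(2, n_max + 1):
        best = rho[n - 1] + 1
        for a in range(2, isqrt(n) + 1):
            if n % a == 0:
                if spf[n] == 0:
                    spf[n] = a  # first divisor found is the smallest prime factor
                best = min(best, rho[a] + rho[n // a] + 2)
        if spf[n] == 0:
            spf[n] = n  # n is prime
        rho[n] = best
        big_omega[n] = big_omega[n // spf[n]] + 1
    return rho, spf, big_omega


def cov(xs, ys):
    """Population covariance; cov(xs, xs) is the population variance."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def shell_split(n_max, k):
    """A(n) and B(n) for all n <= n_max with Omega(n) = k (k >= 2)."""
    rho, spf, big_omega = tables(n_max)
    a_vals, b_vals = [], []
    for n in range(4, n_max + 1):
        if big_omega[n] != k:
            continue
        p = spf[n]
        a_vals.append(rho[n - 1] - rho[n])
        b_vals.append(rho[n] - rho[p] - rho[n // p] - 2)
    return a_vals, b_vals


if __name__ == "__main__":
    a_vals, b_vals = shell_split(20000, 4)
    g_vals = [a + b + 1 for a, b in zip(a_vals, b_vals)]
    print("Var(A) =", cov(a_vals, a_vals))
    print("Var(B) =", cov(b_vals, b_vals))
    print("Cov(A,B) =", cov(a_vals, b_vals))
    print("Var(G_spf) =", cov(g_vals, g_vals))
```

The one-sided bounds A ≥ −1 and B ≤ 0 hold for every n under the assumed recurrence, and Var(G_spf) equals Var(A) + Var(B) + 2 Cov(A,B) up to rounding.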

4. The Bridge Term

4.1 Algebraic Identity

Theorem 3. K_p(m) = A(pm) + B(pm) − A(m).

Proof. A(pm) = ρ_E(pm−1) − ρ_E(pm). B(pm) = ρ_E(pm) − ρ_E(p) − ρ_E(m) − 2. A(m) = ρ_E(m−1) − ρ_E(m). Therefore:

A(pm) + B(pm) − A(m) = [ρ_E(pm−1) − ρ_E(pm)] + [ρ_E(pm) − ρ_E(p) − ρ_E(m) − 2] − [ρ_E(m−1) − ρ_E(m)] = ρ_E(pm−1) − ρ_E(p) − ρ_E(m−1) − 2 = K_p(m). □
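Because the proof is a pure cancellation, the identity should hold for any integer table in place of ρ_E, not only for the DP values. The sketch below illustrates this with arbitrary random tables; the function names are hypothetical, and B(pm) is taken with the split (p, m) written explicitly, as in the proof.

```python
import random


def bridge_identity_holds(rho, p, m):
    """Check K_p(m) == A(pm) + B(pm) - A(m) for a given table rho."""
    def a_term(n):
        return rho[n - 1] - rho[n]

    b_pm = rho[p * m] - rho[p] - rho[m] - 2  # B(pm) with the split (p, m)
    k_p = rho[p * m - 1] - rho[m - 1] - rho[p] - 2
    return k_p == a_term(p * m) + b_pm - a_term(m)


def check_random_tables(trials=20, seed=0):
    """Run the check on arbitrary integer tables (these are NOT rho_E values)."""
    rng = random.Random(seed)
    for _ in range(trials):
        rho = [rng.randint(0, 50) for _ in range(401)]
        for p in (2, 3, 5, 7):
            for m in range(2, 50):  # p * m <= 343 stays inside the table
                if not bridge_identity_holds(rho, p, m):
                    return False
    return True


if __name__ == "__main__":
    print(check_random_tables())
```

This is the sense in which Theorem 3 is unconditional: no property of ρ_E beyond being a function on the integers is used.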

4.2 The Bridge Lower Bound: Status

The algebraic identity K_p = A(pm) + B(pm) − A(m) is unconditional. The one-sided bounds (A ≥ −1, B ≤ 0) constrain the range of K_p but do not directly yield a lower bound on its mean. A bounded-variance, non-positive random variable can still have arbitrarily negative mean. Therefore, the bridge lower bound is not a consequence of the hypotheses in this paper.

Numerical Observation (Bridge Lower Bound). At N = 10⁷, E_{I_k}[K_p] is bounded below across all tested k: values range from −0.19 (k=3) to −0.97 (k=18), with no sign of divergence. This supports E_{I_k}[K_p] ≥ −C_2 for C_2 ≈ 1, but remains a numerical observation, not proved. This is an open input (Lemma II) for Theorem 5.

4.3 Numerical Structure

The insertion-weighted bridge term E_{I_k}[K_p] is negative for all k examined. The negative drift is driven by p = 2: its weight in I_k rises from 43% at k=3 to 88% at k=6.

k          E_{I_k}[K_p]    Weight of p=2
3          −0.19           43%
4          −0.31           62%
5          −0.46           74%
6          −0.57           88%
large k    ≈ −0.7          88%

Per-prime structure: E[K_2] ≈ −0.65 (always negative), E[K_3] ≈ +0.30, E[K_5] ≈ +0.50, E[K_7] ≈ +0.02. The positive contributions from p = 3, 5 are overwhelmed by p = 2's dominance. The conjecture E_{I_k}[K_p] > 0 is false. The growth of μ_k cannot come from the bridge term; it must come entirely from jump transmission E_{I_k}[j(m)].

5. The Insertion Measure and Its Bias

5.1 Radon-Nikodym Formula

For m on the Ω = k shell with m ≤ x, define w_x(m) = π(min(P⁻(m), x/m)). The insertion measure has density

dν^ins / dν^amb (m) = w_x(m) / w̄_{k,x}, w̄_{k,x} = N_{k+1}(x) / N_k(x)

For any shell observable H: E^ins[H] − E^amb[H] = Cov^amb(H, w_x) / E^amb[w_x]. Lemma I is equivalent to Cov^amb(G_spf, w_x) ≥ −C_1 · E^amb[w_x]. The Radon-Nikodym derivative is not uniformly bounded.
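Both the normalization w̄_{k,x} = N_{k+1}(x) / N_k(x) and the covariance formula can be checked exactly on a small window. The sketch below uses exact rational arithmetic; the helper names and the test observable H(m) = m mod 5 are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from math import isqrt


def sieve(x):
    """Smallest prime factor, Omega, and prime-counting table pi for n <= x."""
    spf = list(range(x + 1))
    for d in range(2, isqrt(x) + 1):
        if spf[d] == d:
            for multiple in range(d * d, x + 1, d):
                if spf[multiple] == multiple:
                    spf[multiple] = d
    big_omega = [0] * (x + 1)
    pi = [0] * (x + 1)
    for n in range(2, x + 1):
        big_omega[n] = big_omega[n // spf[n]] + 1
        pi[n] = pi[n - 1] + (1 if spf[n] == n else 0)
    return spf, big_omega, pi


def bias_two_ways(x, k, observable):
    """Return (E_ins[H] - E_amb[H], Cov_amb(H, w) / E_amb[w], sum of w, N_{k+1}(x))."""
    spf, big_omega, pi = sieve(x)
    shell = [m for m in range(2, x + 1) if big_omega[m] == k]
    # w_x(m) = pi(min(P^-(m), x/m)); x // m >= 1 and pi[1] = 0
    w = {m: pi[min(spf[m], x // m)] for m in shell}
    n_next = sum(1 for n in range(2, x + 1) if big_omega[n] == k + 1)
    size = len(shell)
    total_w = sum(w.values())
    sum_h = sum(observable(m) for m in shell)
    sum_hw = sum(observable(m) * w[m] for m in shell)
    e_amb_h = Fraction(sum_h, size)
    e_amb_w = Fraction(total_w, size)
    e_ins_h = Fraction(sum_hw, total_w)
    cov_hw = Fraction(sum_hw, size) - e_amb_h * e_amb_w
    return e_ins_h - e_amb_h, cov_hw / e_amb_w, total_w, n_next


if __name__ == "__main__":
    direct, via_cov, total_w, n_next = bias_two_ways(3000, 3, lambda m: m % 5)
    print(direct == via_cov, total_w == n_next)
```

The weight w_x(m) vanishes whenever m > x/2, which is one concrete way to see that the density is not bounded away from zero on the shell.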

5.2 Two-Parameter Shell Mean Identity

Theorem 4 reduces the insertion bias to a weighted average of two-parameter shell means μ_k(x/p, p) over primes p, with weights ω_{k,x}(p) = Φ_k(x/p, p) / N_{k+1}(x).

5.3 Covariance Decomposition

Decompose w_x(m) = π(P⁻(m)) + r_x(m) where r_x(m) = π(min(P⁻(m), x/m)) − π(P⁻(m)) ≤ 0. Then:

Cov(G_spf, w_x) = Cov(G_spf, π(P⁻)) [least-prime-factor bias] + Cov(G_spf, r_x) [size cutoff correction]

Numerically at N = 10⁶: for k ≥ 4, the first term is positive (larger P⁻ correlates with larger G_spf). For k = 3, the first term is negative, but the size cutoff overcompensates, making the total positive.

5.4 An Eliminated Approach: P⁻-Monotonicity

The route "prove that E[G_spf | P⁻ = p, Ω = k] is non-decreasing in p, then apply Chebyshev rearrangement" fails, because its premise is false in general. At k = 3, N = 10⁶, the shell mean is not monotone: it rises for p = 3, 5 and then drops for p = 7, 11, …. Pure P⁻-monotonicity cannot establish Lemma I; the size cutoff is essential.

5.5 Numerical Evidence

At N = 10⁷, the insertion bias on j(m) — that is, E^ins[j(m)] − E^amb[j(m)] — is positive for all tested k:

k     j-bias (E^ins − E^amb)
3     +0.302
4     +0.415
5     +0.432
8     +0.300
10    +0.271
15    +0.226
18    +0.421

Mean j-bias ≈ +0.31, stable across k. Direct computation of the G_spf-bias at N = 10⁶ gives values in +0.03 to +0.10 (all positive), providing stronger but still numerical support for Lemma I.

6. The Toy Model

6.1 Finite Prime Alphabet

Theorem 6. Let P = {q_1 < ··· < q_r} be a finite set of primes. On the restricted shell, partition by P⁻(n) = q_i with class weights α_i and class means h_i = E[H | P⁻ = q_i]. The toy insertion weight is w_i = i (number of primes ≤ q_i in P). If h_1 ≤ h_2 ≤ ··· ≤ h_r, then E^ins[H] ≥ E^amb[H].

Proof. E^ins − E^amb = Cov_α(w_i, h_i) / Σ_i w_i α_i. Since w_i = i is increasing and h_i is assumed increasing, Chebyshev's sum inequality gives Cov_α(w, h) ≥ 0. □
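The toy statement is easy to test with exact arithmetic. The sketch below draws random class weights α_i and sorted class means h_i, and also shows that the sign can flip once monotonicity is dropped; the function name `toy_insertion_bias` and the random ranges are hypothetical choices for illustration.

```python
from fractions import Fraction
import random


def toy_insertion_bias(alpha, h):
    """E_ins[H] - E_amb[H] for class weights alpha, class means h, toy weight w_i = i."""
    ranks = range(1, len(alpha) + 1)
    mean_w = sum(a * i for a, i in zip(alpha, ranks))
    e_amb = sum(a * hi for a, hi in zip(alpha, h))
    e_ins = sum(a * i * hi for a, i, hi in zip(alpha, ranks, h)) / mean_w
    return e_ins - e_amb


def random_monotone_trial(rng):
    """One random instance with non-decreasing class means."""
    r = rng.randint(2, 8)
    raw = [rng.randint(1, 20) for _ in range(r)]
    alpha = [Fraction(v, sum(raw)) for v in raw]
    h = sorted(Fraction(rng.randint(-30, 30), 7) for _ in range(r))
    return toy_insertion_bias(alpha, h)


if __name__ == "__main__":
    rng = random.Random(1)
    print(all(random_monotone_trial(rng) >= 0 for _ in range(200)))
    # Non-monotone class means can give a negative bias:
    print(toy_insertion_bias([Fraction(1, 2)] * 2, [Fraction(1), Fraction(0)]))
```

With two equally weighted classes and means (1, 0), the insertion mean is 1/3 against an ambient mean of 1/2, so the bias is −1/6.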

6.2 Significance and Limitations

The toy model proves that insertion bias is automatically non-negative when the shell observable is P⁻-monotone. In the unrestricted setting, G_spf is not globally P⁻-monotone (§5.4), so the toy proof does not directly apply. However, it identifies the structural mechanism: insertion reweighting favors larger-P⁻ classes, which tend to have higher gains. The gap between the toy model and the full result is precisely the role of the size cutoff.

7. The Conditional Growth Criterion

7.1 Statement and Bridge Corollary

Theorem 5 (Finite-Window Recursion). Under Lemma I and Lemma II, define the truncation gain δ_k(x) = E_{I_k}[max(−G_spf(m), 0)]. Then:

μ_{k+1}(x) ≥ μ_k(x) + δ_k(x) − C_1 − C_2(k)

Proof. By Theorem 1, μ_{k+1} = E_{I_k}[j(m)] + E_{I_k}[K_p]. Since G(m) ≥ G_spf(m), j(m) = max(G(m),0) ≥ max(G_spf(m),0) = G_spf(m) + max(−G_spf(m),0). Taking expectations: E_{I_k}[j] ≥ E_{I_k}[G_spf] + δ_k ≥ (μ_k − C_1) + δ_k. With E_{I_k}[K_p] ≥ −C_2(k):

μ_{k+1} ≥ (μ_k − C_1 + δ_k) + (−C_2(k)) = μ_k + δ_k − C_1 − C_2(k). □

Bridge Corollary. Define μ_spf,∞(k) := liminf_{x→∞} μ_k(x) and δ_{k,∞} := liminf_{x→∞} δ_k(x). If Lemmas I and II hold uniformly in x, then μ_spf,∞(k+1) ≥ μ_spf,∞(k) + δ_{k,∞} − C_1 − C_2(k). In particular, μ_spf,∞(k) → ∞ whenever Σ_{j=k_0}^{K} (δ_{j,∞} − C_1 − C_2(j)) → +∞ as K → ∞.
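The passage from the one-step bound to divergence is a telescoping argument, sketched below. It assumes that C_1 and C_2(j) do not depend on x (this is what "uniformly in x" is taken to mean) and that μ_spf,∞(k_0) is finite; it also uses the superadditivity of liminf.

```latex
\begin{align*}
\mu_{j+1}(x) &\ge \mu_j(x) + \delta_j(x) - C_1 - C_2(j)
  && \text{(Theorem 5, for every } x\text{)} \\
\liminf_{x\to\infty} \mu_{j+1}(x)
  &\ge \liminf_{x\to\infty} \mu_j(x) + \liminf_{x\to\infty} \delta_j(x) - C_1 - C_2(j)
  && \text{(superadditivity of } \liminf\text{)} \\
\mu_{\mathrm{spf},\infty}(K+1)
  &\ge \mu_{\mathrm{spf},\infty}(k_0)
     + \sum_{j=k_0}^{K} \bigl(\delta_{j,\infty} - C_1 - C_2(j)\bigr)
  && \text{(sum over } j = k_0, \dots, K\text{)}
\end{align*}
```

If the partial sums on the right tend to +∞ as K → ∞, the left side does as well, which is the stated conclusion.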

7.2 The Truncation Gain

The truncation gain δ_k = E_{I_k}[max(−G_spf(m), 0)] is the insertion-measure expectation of the negative part of G_spf(m): it averages |G_spf(m)| where G_spf(m) < 0 and contributes zero elsewhere. At N = 10⁷, δ_k > 0 for all tested k: δ_3 ≈ 0.45, δ_5 ≈ 0.35, δ_{10} ≈ 0.20. At k ≥ 17, p_k(N) = 1.000, so δ_k becomes very small in the current window.

7.3 The Growth Mechanism

Combining the three ingredients:

  • Truncation gain δ_k > 0 (relay engine)
  • Insertion bias ≈ +0.31 (measure favorability)
  • Bridge drag ≈ −0.7 (bounded cost)

Together these give a net per-step increment s_k := μ_{k+1} − μ_k ≈ 0.26, stable across k = 2 to 20 at N = 10⁷. The mechanism: even as μ_k grows, the insertion bias provides a persistent uplift that compensates the shrinking truncation gain and the bridge drag.

8. Numerical Evidence at N = 10⁷

8.1 Summary Tables

k     E[G_spf | Ω=k]    Var(G_spf | Ω=k)    p_k(N)    E_{I_k}[j(m)]    E_{I_k}[K_p]
2     −0.47             1.21                0.229     0.30             −0.23
5     1.24              1.49                0.752     1.71             −0.46
10    2.87              1.67                0.970     3.57             −0.81
15    4.08              1.75                0.998     4.72             −0.89
20    4.52              1.83                1.000     4.93             −0.97

E[G_spf | Ω = k]: linear growth from −0.47 (k=2) to 4.52 (k=20), slope ≈ 0.26, R² = 0.995. p_k(N): strictly increasing from 0.229 to 1.000.

8.2 Eliminated Conjectures

E_{I_k}[K_p] > 0: false. The bridge term is persistently negative, driven by p=2 dominance.

P⁻-monotonicity of shell means: false at k = 3. Size cutoff is essential for positive insertion bias.

9. Toward Convergence of SPF-Positivity Rates

9.1 The (A, B) Reformulation

The SPF-positivity condition G_spf(n) > 0 becomes A(n) + B(n) > −1. Define p_k^spf(N) := P(G_spf(n) > 0 | Ω(n) = k, n ≤ N). Since G(n) ≥ G_spf(n), p_k^spf provides a lower bound for the actual jump rate p_k(N). Convergence of p_k^spf(N) would support (but not directly establish) Assumption A.

9.2 Why Convergence is Expected

Convergence reduces to the convergence of the joint distribution of (A, B) on Ω-shells. In the squarefree-dominant case (v_{P⁻(n)}(n) = 1), B(n) takes O_k(1) distinct values in [−2(k−1), 0], and their frequencies are controlled by Sathe-Selberg asymptotics. A(n) = ρ_E(n−1) − ρ_E(n) involves the predecessor n−1, which connects it to the bivariate Erdős-Kac framework: Goudout proved that ω(n−1) retains Erdős-Kac behavior conditional on ω(n) = k, and Mangerel gave quantitative bivariate results for (ω(n), ω(n+a)). Extending such results to ρ_E (which is near-additive by the f+r decomposition) is a well-defined problem at the current frontier.

9.3 The Remaining Gap

Full proof requires: (i) convergence of r-value frequencies on Ω-shells (from Sathe-Selberg + DP smoothness), and (ii) predecessor stability — the distribution of ρ_E(n−1) − ρ_E(n) conditional on n's factorization type converges, connecting to the bivariate Erdős-Kac literature.

10. Conclusion

10.1 Established Results

This paper proves unconditionally: (a) Theorem 1 (Insertion Identity): G_spf(pm) = j(m) + K_p(m); (b) Theorem 2(a)(b)(c-squarefree) (Variance Splitting): G_spf = A + B + 1 with A ≥ −1, B ≤ 0, B ∈ [−2(k−1), 0] when v_{P⁻}(n) = 1; (c) Theorem 3 (Bridge Identity): K_p = A(pm) + B(pm) − A(m); (d) Theorem 4 (Two-Parameter Reduction): insertion mean is a weighted average of μ_k(x/p, p); (e) Theorem 6 (Toy Model): Chebyshev rearrangement proves bias ≥ 0 under P⁻-monotonicity.

Conditionally (under Hypotheses A-tail and B-bound): (f) Theorem 2(d): Var(G_spf | Ω = k) = O_k(1).

Numerically observed but not proved: (g) Bridge Lower Bound (Lemma II), open input for Theorem 5. Proved conditionally on Lemmas I and II, which are themselves open: (h) Theorem 5 (Finite-Window Recursion) and the Bridge Corollary.

10.2 The Remaining Frontier

Beyond the Bridge Lower Bound (Lemma II, §4.2), the dependency chain of §10.3 leading from μ_spf,∞(k) → ∞ to D(N) → 1 rests on four inputs:

  • Numerical Hypothesis A-tail: P(A(n) > t | Ω = k) ≤ Ce^{−αt}. Verified to N = 10⁷. A proof would require quantitative control of "smoothness jumps" between adjacent integers (Barban-Davenport-Halberstam).
  • Numerical Hypothesis B-bound: Var(B | Ω = k, n ≤ N) = O_k(1) uniformly in N. Verified to N = 10⁷. A proof would require showing that Sathe-Selberg concentration on squarefree factorizations dominates the v ≥ 2 tail.
  • Lemma I (Insertion non-compression): E_{I_k}[G_spf(m)] ≥ μ_k(x) − C_1. Positive j-bias (+0.31) and G_spf-bias (+0.03 to +0.10) provide numerical support.
  • SPF-positivity convergence: Reformulated via (A, B) decomposition as a shifted-integer joint distribution problem at the frontier of analytic number theory.

10.3 Proof Dependency Graph

Lemma I + II (uniform in x) + [Σ(δ_{j,∞} − C_1 − C_2(j)) → +∞]  ⇒  [μ_spf,∞(k) → ∞]
[μ_spf,∞(k) → ∞] + Thm 2 (variance)  ⇒  A' via Chebyshev
SPF-positivity conv. (§9) + A' + B (Sathe-Selberg)  ⇒  Thm F (Paper 15)  ⇒  [D(N) → 1]

10.4 Eliminated Approaches

"E_{I_k}[K_p] > 0": false (p = 2 dominates at 88%). "P⁻-monotonicity of shell means": false at k = 3. "Bounded Radon-Nikodym derivative": false. "p_∞(k) = 1/2 via Gaussian noise": false (Cov(ρ_E(n−1), ρ_E(n/P⁻)) ≈ 20 invalidates independence assumption). Each eliminated approach sharpened the correct formulation.

11. Methodological Note: Four-AI Parallel Exploration

11.1 Protocol

The results in this paper were developed through a structured parallel exploration involving four AI systems — Claude (Anthropic), ChatGPT (OpenAI), Gemini (Google), and Grok (xAI) — coordinated by the author. The protocol is documented here as a reproducible methodology for AI-assisted mathematical research on open problems.

The exploration proceeded in two rounds. In Round 1, the three open sub-problems from Paper 15 were assigned based on capability profiles:

  • ChatGPT: creative multi-route exploration → Unbounded Mean Gain formalization
  • Gemini: numerical computation and distribution analysis → variance control
  • Grok: stateful REPL with high-precision computation → ρ_E to N = 10⁷ and cross-line numerical verification
  • Claude: long-chain deductive reasoning → Assumption A (deepest theoretical gap)

Each system received a tailored prompt containing the shared mathematical context, the specific sub-problem, suggested attack vectors, and explicit instructions to report obstructions and failures.

11.2 Contributions by System

ChatGPT discovered the exact insertion identity (Theorem 1): G_spf(pm) = j(m) + K_p(m), which became the paper's central algebraic result. It also developed the two-parameter shell mean identity (Theorem 4), the covariance decomposition (§5.3), the finite prime alphabet toy model (Theorem 6), and the conditional growth criterion (Theorem 5). In the review phase, ChatGPT identified: the f-additivity bug in Theorem 2(c) (the non-coprime case p² | n), the fixed-x quantifier error in the Bridge Corollary, the positive-part sum fallacy, the variance-does-not-control-mean gap, and the distinction between deterioration of the available lower bound on ε_p and divergence of the quantity itself. These corrections, made over eight revision rounds, tightened the paper's logical hygiene.

Gemini discovered the A/B variance splitting decomposition (Theorem 2) and the bridge term identity K_p = A(pm) + B(pm) − A(m) (Theorem 3). This was the key structural insight that resolved the variance control problem, bypassing the enormous covariance (~20) between ρ_E(n−1) and ρ_E(n/P⁻). Gemini also computed the per-prime K_p tables revealing p = 2 dominance at 88% weight and refuted the conjecture E_{I_k}[K_p] > 0.

Grok computed ρ_E(n) to N = 10⁷ (529 seconds, 40 MB), providing the numerical backbone for all quantitative claims. It verified the insertion identity exactly for k = 3 through 18, confirmed E[G_spf] linear growth (slope 0.26, R² = 0.995), established Var(G_spf) ∈ [1.21, 1.83] across k = 2 to 20, and measured the insertion bias at +0.31. Grok's data refuted Claude's p_∞(k) = 1/2 conjecture by confirming p_2(10⁷) = 0.2288, inconsistent with convergence to 1/2.

Claude proposed the C1/C2/C3 sub-conjecture framework for Assumption A and the natural density reformulation (§9). Claude also proposed — and then retracted — the p_∞(k) = 1/2 conjecture (based on a Gaussian noise argument that incorrectly assumed ρ_E(n−1) and ρ_E(n/P⁻) fluctuate independently). The retraction, forced by Gemini's Cov ≈ 20 data and Grok's numerics, clarified the essential role of the DP-induced covariance structure.

11.3 Cross-Validation and Error Correction

The protocol's central feature is cross-validation: each system's conjectures and claims were tested against the others' data and analysis. Four conjectures were refuted during the process:

  1. E_{I_k}[K_p] > 0 (ChatGPT heuristic → refuted by Gemini's per-prime tables)
  2. p_∞(k) = 1/2 (Claude's theoretical prediction → refuted by Gemini's covariance data and Grok's 10⁷ numerics)
  3. P⁻-monotonicity of shell means (implicit in early toy model → refuted by ChatGPT's k=3 computation)
  4. Bounded Radon-Nikodym derivative (Claude's Approach 3 → refuted by ChatGPT's analysis of w_x(m))

Each refutation produced positive information: (1) identified truncation gain as the sole growth engine; (2) revealed the DP-covariance structure; (3) established the essential role of the size cutoff; (4) motivated the two-parameter shell mean reduction. The f-additivity bug (Theorem 2(c)), caught by ChatGPT's systematic verification against n = 8, would have been a serious error in the published paper had it gone undetected.

11.4 The Human Role

The author's contributions were: (i) problem selection and direction — choosing which gaps to attack, when to parallelize, and when to converge; (ii) cross-line integration — synthesizing results from four independent explorations into a unified narrative; (iii) arbitration of contradictions — when Gemini's data contradicted Claude's prediction, deciding which to trust and why; (iv) termination judgment — determining when the paper had reached a stable, publishable state. No AI system proposed the four-way parallel structure, decided the round boundaries, or judged when to stop exploring and start writing.

11.5 Reproducibility

The protocol is fully reproducible: prompts, round summaries, and cross-validation steps are documented. Any researcher with access to comparable AI systems can replicate the workflow for other open problems. Essential ingredients: (a) a well-defined problem with identifiable sub-components; (b) AI systems with complementary capability profiles; (c) a human coordinator who provides direction, arbitrates, and terminates.

References

  1. H. Qin, "Sieve structure, compositeness discount, and the architecture of Conjecture H'" (Paper 15), Zenodo, DOI: 10.5281/zenodo.19007312.
  2. H. Qin, "The formal framework of ZFCρ" (Paper 1), Zenodo, DOI: 10.5281/zenodo.18914682.
  3. H. Qin, "Concentration and growth of ρ_E" (Paper 11), Zenodo, 2025.
  4. H. Qin, "Telescoping identity and jump statistics" (Paper 12), Zenodo, 2025.
  5. H. Qin, "Zero-inflated lattice normal model" (Paper 13), Zenodo, DOI: 10.5281/zenodo.18991986.
  6. H. Qin, "Three-component decomposition and variance bounds" (Paper 14), Zenodo, 2026.
  7. A. Selberg, "Note on a paper by L. G. Sathe," J. Indian Math. Soc. 18 (1954), 83–87.
  8. G. Tenenbaum, Introduction to Analytic and Probabilistic Number Theory, 3rd ed., AMS, 2015.
  9. P. Erdős and M. Kac, "The Gaussian law of errors in the theory of additive number theoretic functions," Amer. J. Math. 62 (1940), 738–742.
  10. É. Goudout, "Lois de répartition des diviseurs," doctoral thesis and related publications, 2018–2021.
  11. O. Mangerel, "On the bivariate Erdős–Kac theorem," Proc. London Math. Soc., 2021.
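Abstract (Chinese edition)

This paper establishes the algebraic and analytic engine behind the compositeness discount identified in Paper 15. Three structural results are proved unconditionally. First, the exact insertion identity: for any composite n = pm (p = P⁻(n)), the SPF gain decomposes as G_spf(n) = j(m) + K_p(m), where j(m) = max(G(m), 0) is the previous-layer jump and K_p(m) = ρ_E(pm−1) − ρ_E(m−1) − ρ_E(p) − 2 is the bridge term. This gives the exact recursion μ_{k+1}(x) = E_{I_k}[j(m)] + E_{I_k}[K_p(m)], reducing the Unbounded Mean Gain conjecture to two sub-problems. Second, the variance splitting decomposition: set G_spf = A + B + 1, with A(n) = ρ_E(n−1) − ρ_E(n) ≥ −1 and B(n) = ρ_E(n) − ρ_E(P⁻(n)) − ρ_E(n/P⁻(n)) − 2 ≤ 0; when v_{P⁻(n)}(n) = 1, B ∈ [−2(k−1), 0] (unconditional). Third, the bridge term identity: K_p(m) = A(pm) + B(pm) − A(m) (unconditional). Under Numerical Hypotheses A-tail and B-bound, Var(G_spf | Ω = k) = O_k(1). Numerically (N = 10⁷): E[G_spf | Ω = k] grows linearly with slope ≈ 0.26 (R² = 0.995), Var ∈ [1.21, 1.83], insertion bias ≈ +0.31. The finite-window recursion holds under Lemmas I and II. Bridge Corollary: if Σ(δ_{j,∞} − C_1 − C_2(j)) → +∞, then μ_spf,∞(k) → ∞, completing the bridge to A'.

Keywords: integer complexity, ρ-arithmetic, insertion identity, variance splitting, relay mechanism, compositeness discount, bridge term, Chebyshev bound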

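§1. Introduction

§1.1 Context

Paper 15 (DOI: 10.5281/zenodo.19007312) established an architecture for proving D(N) → 1 under three assumptions: (A) monotone pointwise convergence of p_k(N), (A') p_∞(k) → 1, (B) Sathe-Selberg. This paper builds the algebraic and analytic machinery needed to close Assumptions A and A'.

Key advance: the relay mechanism identified qualitatively in Paper 15 §6.4 can be formalized exactly and algebraically as the insertion identity; the variance control required for the Chebyshev application comes from a splitting lemma that exploits the one-sided bounds intrinsic to the DP definition of ρ_E.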

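§1.2 Main Results

Theorem 1 (Exact Insertion Identity). For any composite n ≥ 4, let P⁻(n) = p and m = n/p. Then

G_spf(n) = j(m) + K_p(m)

where j(m) = max(G(m), 0) and K_p(m) = ρ_E(pm−1) − ρ_E(m−1) − ρ_E(p) − 2.

Theorem 2 (Variance Splitting). G_spf(n) = A(n) + B(n) + 1. (a) A ≥ −1; (b) B ≤ 0; (c) for v = 1, B ∈ [−2(k−1), 0] (unconditional), while the case v ≥ 2 is isolated into B-bound; (d) conditional on A-tail + B-bound, Var(G_spf | Ω = k) = O_k(1).

Theorem 3 (Bridge Term Identity). K_p(m) = A(pm) + B(pm) − A(m) (unconditional algebraic identity). The bound E_{I_k}[K_p] ≥ −C_2(k) is a numerical observation and an open input (Lemma II).

Theorem 4 (Two-Parameter Shell Mean Identity).

E_{I_k(x)}[G_spf(m)] = Σ_p Φ_k(x/p, p)·μ_k(x/p, p) / Σ_p Φ_k(x/p, p)

Theorem 5 (Finite-Window Recursion). Under Lemmas I and II,

μ_{k+1}(x) ≥ μ_k(x) + δ_k(x) − C_1 − C_2(k)

Bridge Corollary: under uniform Lemmas I and II, if Σ(δ_{j,∞} − C_1 − C_2(j)) → +∞, then μ_spf,∞(k) → ∞.

Theorem 6 (Toy Model). Over a finite prime alphabet, P⁻-monotonicity implies E^ins[G_spf] ≥ E^amb[G_spf] (Chebyshev rearrangement).

Numerical Result 7 (N = 10⁷). E[G_spf | Ω = k]: slope 0.26, R² = 0.995. Var ∈ [1.21, 1.83], no trend. Insertion bias +0.31. The relay mechanism is quantitatively confirmed.

Numerical Observation 8. E_{I_k}[K_p] ≈ −0.7 (p = 2 dominates at 88%). E[K_p] > 0 does not hold.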

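§2. The Exact Insertion Identity

§2.1 Proof

G_spf(n) = ρ_E(pm−1) − ρ_E(p) − ρ_E(m) − 1. Substituting ρ_E(m) = ρ_E(m−1) + 1 − j(m) gives G_spf(n) = K_p(m) + j(m). □

Remark. When m is prime, j(m) = 0, consistent. Verification: at N = 10⁷ the identity holds exactly for k = 3 through 18.

§2.2 The Insertion Recursion

(m, p) ↦ pm gives a bijection onto {n ≤ x : Ω(n) = k+1, n composite}. Exact recursion: μ_{k+1}(x) = E_{I_k}[j(m)] + E_{I_k}[K_p(m)].

§2.3 Formalizing the Relay Mechanism

The mean gain of layer k+1 inherits the jump sizes of layer k (via j(m)), raised or lowered by the bridge cost K_p(m).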

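§3. The Variance Splitting Lemma

§3.1 The A/B Decomposition and Proof

(a) ρ_E(n) ≤ ρ_E(n−1) + 1, hence A ≥ −1. (b) ρ_E(n) ≤ ρ_E(P⁻) + ρ_E(n/P⁻) + 2, hence B ≤ 0. (c) B(n) = r(n) − r(n/p) + ε_p − 2. For v = 1, ε_p = 0 and B ∈ [−2(k−1), 0] (unconditional); for v ≥ 2 the available lower bound deteriorates with p, and this case is isolated into B-bound. (d) Conditional on A-tail + B-bound, Var = O_k(1). □

Why direct analysis fails. Var(X) ≈ 20.6, Var(Z) ≈ 23.1, Cov(X,Z) ≈ 19.85: the massive cancellation stems from the shared factor ρ_E(n). The A/B decomposition removes this common factor algebraically, revealing that the true variance is governed by O(1) residuals.

§3.2 Numerical Verification

N = 10⁷: Var ∈ [1.21, 1.83], no increasing trend. k = 4, N = 10⁶: Var(A) ≈ 0.81, Var(B) ≈ 0.54, Cov(A,B) ≈ −0.05.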

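§4. The Bridge Term

§4.1 Algebraic Identity

K_p(m) = A(pm) + B(pm) − A(m). The proof is by direct expansion. □

§4.2 Status of the Bridge Lower Bound

The algebraic identity holds unconditionally. The one-sided bounds constrain the range of K_p but give no lower bound on its mean: a non-positive random variable with bounded variance can still have an arbitrarily negative mean. The bridge lower bound E_{I_k}[K_p] ≥ −C_2(k) (C_2 ≈ 1, N = 10⁷) is a numerical observation and an open input to Theorem 5 (Lemma II).

§4.3 Numerics

E_{I_k}[K_p]: from −0.19 (k=3) to −0.97 (k=18), tending to −0.7. The weight of p = 2 rises from 43% to 88%. E[K_2] ≈ −0.65, E[K_3] ≈ +0.30, E[K_5] ≈ +0.50. E[K_p] > 0 does not hold.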

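§5. The Insertion Measure and Its Bias

§5.1 Radon-Nikodym Formula

dν^ins/dν^amb = w_x(m)/w̄_{k,x}. E^ins[H] − E^amb[H] = Cov(H, w_x)/E[w_x]. Lemma I is equivalent to Cov(G_spf, w_x) ≥ −C_1 · E[w_x]. The RN derivative is not uniformly bounded (it can be 0 or far larger than 1).

§5.2 Two-Parameter Shell Mean Identity

Theorem 4 reduces the insertion bias to a weighted average of μ_k(x/p, p) over primes p, with weights ω_{k,x}(p) = Φ_k(x/p,p)/N_{k+1}(x).

§5.3 Covariance Decomposition

Cov(G, w_x) = Cov(G, π(P⁻)) + Cov(G, r_x). For k ≥ 4 the first term is positive; for k = 3 the first term is negative, but the size cutoff term overcompensates, so the total is positive.

§5.4 An Eliminated Route: P⁻-Monotonicity

P⁻-monotonicity fails at k = 3. The shell mean drops after p = 3, 5. The size cutoff is essential.

§5.5 Numerics

j-bias (E^ins[j] − E^amb[j]): +0.302 to +0.421, mean ≈ +0.31 (stable). G_spf-bias: +0.03 to +0.10 (N = 10⁶, all positive).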

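§6. The Toy Model

Over a finite prime alphabet, w_i = i (increasing) and h_i (P⁻-monotone) give E^ins ≥ E^amb by Chebyshev rearrangement. □ The unrestricted case must handle the size cutoff (§5.4 shows that it cannot be omitted).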

§7. The Conditional Growth Criterion

§7.1 Theorem 5 and the Bridge Corollary

Under Lemma I (E_{I_k}[G_spf] ≥ μ_k − C_1) and Lemma II (E_{I_k}[K_p] ≥ −C_2(k)):

μ_{k+1}(x) ≥ μ_k(x) + δ_k(x) − C_1 − C_2(k)

Proof: j(m) ≥ G_spf(m) + max(−G_spf, 0), hence E[j] ≥ E[G_spf] + δ_k ≥ (μ_k − C_1) + δ_k. Substituting Lemma II gives the claim. □

Bridge Corollary: under uniform Lemmas I and II, μ_spf,∞(k+1) ≥ μ_spf,∞(k) + δ_{k,∞} − C_1 − C_2(k). If Σ(δ_{j,∞} − C_1 − C_2(j)) → +∞, then μ_spf,∞(k) → ∞.

§7.2–§7.3 The Growth Mechanism

Truncation gain δ_k (δ_3 ≈ 0.45, δ_5 ≈ 0.35, δ_{10} ≈ 0.20), insertion bias (+0.31), bridge drag (≈ −0.7). Net increment s_k = μ_{k+1} − μ_k ≈ 0.26, stable for k = 2 through 20.

§8. Numerical Evidence at N = 10⁷

k     E[G_spf | Ω=k]    Var     p_k(N)    E[j(m)]    E[K_p]
2     −0.47             1.21    0.229     0.30       −0.23
5     1.24              1.49    0.752     1.71       −0.46
10    2.87              1.67    0.970     3.57       −0.81
20    4.52              1.83    1.000     4.93       −0.97

E[G_spf]: from −0.47 to 4.52, slope 0.26. Eliminated: E[K_p] > 0 (false), P⁻-monotonicity (false).

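§9. Toward Convergence of SPF-Positivity Rates

p_k^spf(N) = P(G_spf > 0 | Ω = k) is a lower bound for the actual jump rate. Convergence reduces to convergence of the joint distribution of (A, B): the distribution of B is controlled by Sathe-Selberg (v = 1 dominant), while the distribution of A connects to joint distributions of shifted integers (the bivariate Erdős-Kac framework of Goudout and Mangerel), at the current frontier of analytic number theory.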

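§10. Conclusion

§10.1 Established Results

Unconditional: Thm 1 (insertion identity), Thm 2(a)-(c) (squarefree case), Thm 3 (bridge identity), Thm 4 (two-parameter reduction), Thm 6 (toy model). Conditional (A-tail + B-bound): Thm 2(d) (Var = O_k(1)). Numerical observation / open input: bridge lower bound (Lemma II), Thm 5 (recursion + Bridge Corollary).

§10.2 The Remaining Frontier

Four inputs: A-tail (smoothness jumps between adjacent integers, requiring Barban-Davenport-Halberstam), B-bound (Sathe-Selberg concentration on squarefree factorizations), Lemma I (insertion non-compression, numerically supported), SPF-positivity convergence (joint distribution of shifted integers).

§10.3 Proof Dependency Graph

Lemma I+II + [Σ(δ_{j,∞} − C_1 − C_2(j)) → +∞]  ⇒  [μ_spf,∞ → ∞]
[μ_spf,∞ → ∞] + Thm 2 (variance)  ⇒  A'
SPF-positivity convergence + A' + B  ⇒  Thm F (Paper 15)  ⇒  [D(N) → 1]

§10.4 Eliminated Routes

E[K_p] > 0 (false), P⁻-monotonicity (false), bounded RN derivative (false), p_∞(k) = 1/2 (false; Cov ≈ 20 invalidates the independence assumption). Each one sharpened the correct formulation.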

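§11. Methodological Note: Four-AI Parallel Exploration

§11.1 Protocol

The results of this paper were developed through a structured parallel exploration by four AI systems, Claude (Anthropic), ChatGPT (OpenAI), Gemini (Google), and Grok (xAI), coordinated by the author. Round 1 divided the work by capability: ChatGPT (creative exploration → Unbounded Mean Gain), Gemini (numerical analysis → variance control), Grok (high-precision computation → ρ_E to 10⁷ plus cross-line verification), Claude (long-chain reasoning → Assumption A, the deepest theoretical gap). Each system received a tailored prompt containing the shared mathematical context, its specific sub-problem, attack vectors, and instructions to report obstructions.

§11.2 Contributions by System

ChatGPT discovered the insertion identity (Theorem 1), the paper's central algebraic result, together with the two-parameter shell mean identity (Theorem 4), the covariance decomposition (§5.3), the toy model (Theorem 6), and the conditional growth criterion (Theorem 5). In the review phase it identified the f-additivity error (counterexample n = 8), the fixed-x quantifier error, the positive-part sum divergence fallacy, the variance-does-not-control-mean gap, and the distinction between lower-bound deterioration and divergence of the object. It signed off after eight revision rounds.

Gemini discovered the A/B variance splitting (Theorem 2), the key insight that bypasses the enormous covariance (≈ 20), and the bridge term identity (Theorem 3). It computed the per-prime K_p tables, revealing p = 2 dominance at 88% and refuting E[K_p] > 0.

Grok computed ρ_E to N = 10⁷ (529 seconds, 40 MB), verified the insertion identity exactly (k = 3 through 18), confirmed linear growth (slope 0.26, R² = 0.995), variance stability (1.21 to 1.83), and insertion bias (+0.31), and refuted p_∞(k) = 1/2 (p_2(10⁷) = 0.2288).

Claude proposed the C1/C2/C3 sub-conjecture framework for Assumption A and the natural density reformulation (§9). It proposed and then retracted the p_∞(k) = 1/2 conjecture (refuted by Gemini's Cov ≈ 20 data and Grok's numerics), which clarified the essential role of the DP covariance structure.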

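§11.3 Cross-Validation and Error Correction

Four conjectures were refuted in the process: (1) E[K_p] > 0 (refuted by Gemini), (2) p_∞(k) = 1/2 (refuted by Gemini and Grok), (3) P⁻-monotonicity (refuted by ChatGPT), (4) bounded RN derivative (refuted by ChatGPT). Each refutation produced positive information. The f-additivity error would have been a serious published error had it not been caught. ChatGPT served as the principal reviewer of theorem hygiene.

§11.4 The Human Role

The author's contributions: (i) problem selection and direction, (ii) cross-line integration, (iii) arbitration of contradictions, (iv) termination judgment. No AI system proposed the four-way parallel structure, decided the round boundaries, or judged when to stop exploring.

§11.5 Reproducibility

The protocol is fully reproducible: prompts, round summaries, and cross-validation steps are documented. Essential ingredients: (a) a well-defined problem with identifiable sub-components; (b) AI systems with complementary capabilities; (c) a human coordinator who provides direction, arbitration, and termination.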
