Structural Foundations for Goal-Frontier Maximization Deployment Safety


We establish two structural foundations for the deployment-safety theorem of in the Goal-Frontier Maximization sequence. First, bounded co-evolution is derived as a corollary of clipped-LLR SPRT machinery ((C5.HOEFF) of ), a new upper exposure-rate cap (C7.RATE), and the verification protocol’s channel-projection structure: the per-channel coupling magnitude \bar{M}^{\mathrm{cum}} is bounded by a closed-form constant C \cdot B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}} with C \in \{1, K_{\mathrm{ch}}- 1\}, all factors deployment-class quantities intensive in |P|. Second, a scoped Concentration-Gap selection theorem proves that under three operationally auditable structural conditions — representativeness (REP), bounded dispersion (DISP), coalition closure (COAL) — and a Lipschitz embedding compatibility assumption, the proxy-truth Goodhart slack is bounded by an explicit \chi^2-divergence quantity that controls the Herfindahl-index trade-flow concentration through \chi^2(w \,\|\, \mu) = N \mathrm{HHI}(w) - 1 for uniform base measure \mu. The selection theorem replaces the monolithic HHI surrogate-adequacy assumption of with three concrete checkable conditions, and converts the Concentration-Gap conjecture inherited from into a scoped theorem with named structural prerequisites. Audit procedures for (C7.RATE), (REP), (DISP), (COAL), and the embedding compatibility precondition are specified.

Introduction

The deployment-safety theorem of establishes a conditional capability-magnitude-independent bound on Goodhart slack for capability-unbounded deployments under twelve operational conditions and an explicit empirical-adequacy assumption inherited from the Concentration-Gap conjecture (HHI surrogate adequacy). Two of these hypotheses sit awkwardly in the deployment-claim’s conditional structure:

(C7) bounded co-evolution, a free-floating deployment-class constant \bar{M} asserted to be independent of capability magnitude |P|, with no derivation from primitive operational parameters. identifies this as a hard-failure mode under outright failure of the assumption.
The Concentration-Gap conjecture, inherited from as deferred to follow-up work. The HHI surrogate-adequacy assumption serves as its operational surrogate but is itself an inherited empirical-adequacy claim rather than a derived consequence.

Both gaps are closed by structural derivations established in the present paper:

Theorem 1 (Bounded co-evolution as a corollary). Under (C5.HOEFF) per-step LLR clipping, an explicit upper exposure-rate cap (C7.RATE), and the channel-projection structure of (C5.MULT), the per-channel coupling magnitude is bounded by a closed-form constant in primitive deployment parameters: \bar{M}^{\mathrm{cum}} \leq C \cdot B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}} with C \in \{1, K_{\mathrm{ch}}- 1\}. The previously free-floating bounded co-evolution assumption of graduates to a corollary; the property is now invoked at condition (C7) of the deployment-safety theorem.
Theorem 2 (Concentration-Gap Selection Theorem, scoped). Under three operationally auditable structural conditions — representativeness (REP), bounded dispersion (DISP), coalition closure (COAL) — and a Lipschitz embedding compatibility assumption (Assumption 2), the proxy-truth Goodhart slack is bounded by a \chi^2-divergence quantity that controls \mathrm{HHI} via \chi^2(w \,\|\, \mu) = N \mathrm{HHI}(w) - 1 for uniform \mu on a (COAL)-bounded counterparty population. The HHI surrogate-adequacy assumption is replaced by the conjunction (REP) + (DISP) + (COAL) plus the embedding compatibility assumption plus the HHI threshold itself.

The universal model-free Concentration-Gap conjecture of remains open: a model-free correlation-of-pressure-with-exploitation result is not provable from current GFM machinery without a formal optimizer/selection model, and the present paper’s scoped discharge requires three explicit structural conditions. The reverse direction of Theorem 2 also operates only at the proxy-truth-norm level, not the alignment-property level: a g-level lower bound would require additional inverse-Lipschitz structure on g that the GFM apparatus does not provide. §7 states what each open item would require.

Organization.

§2 fixes shared notation, inheriting from the GFM sequence . §3 proves Theorem 1. §4 proves Theorem 2. §5 locates where each theorem enters ’s deployment-claim hypothesis set. §6 specifies the operational audit procedures for (C7.RATE), (REP), (DISP), (COAL), and the embedding compatibility precondition. §7 collects the open items.

Setup and shared notation

Notation is inherited from the GFM sequence unless explicitly noted as new.

Capability poset and operationally active subspace

Let P denote the capability poset following , with cardinality |P| taken as the “capability magnitude” that the deployment-safety claim is required to be independent of. The operationally active subspace is P^{\mathrm{act}} = \{c \in P \setminus \mathrm{ResS} : T(c) > 0\}, the capabilities with nonzero operational truth excluding the structurally-unobservable residual class .

The proxy and truth functionals on P are \begin{equation} P= {\mathop{\mathrm{vol}}_{\mathrm{P}}}: P \to \mathbb{R}_{\geq 0}, \qquad T= {\mathop{\mathrm{vol}}_{\mathrm{R}}}^{[W]} : P \to \mathbb{R}_{\geq 0}, \end{equation} the possessed-capability volume measure and the window-active realized capability volume measure, respectively. The Goodhart slack for an alignment property g: P^{\mathrm{act}} \to \mathbb{R} with Lipschitz constant \mathrm{Lip}(g) is |g(T) - g(P)|, bounded above by \mathrm{Lip}(g) \cdot \|P- T\| via ’s Lipschitz transfer.

Verification ledger and channel structure

The deployment uses ’s verification infrastructure: an append-only verification ledger on substrate-exclusive nodes, recording per-step events from a finite-or-countable action space A = A^{\mathrm{ev}} \sqcup \{a_0\}, where A^{\mathrm{ev}} is the set of recordable event types and a_0 is the idle action (“no observable event in this step”).

The deployment monitors K_{\mathrm{ch}} channels indexed by j \in \{1, \ldots, K_{\mathrm{ch}}\}. Each channel has:

A projection map \pi_j: A \to V_j \cup \{*_j\}, where V_j is channel j’s observable space (counts for Poisson cooperative-rate, \{0,1\} for Bernoulli attestation, finite category set for multinomial concentration channels) and *_j is the trivial / null projection. Action a engages channel j iff \pi_j(a) \neq *_j.
An extended baseline distribution \bar{p}_0^{(j)} on V_j \cup \{*_j\} with baseline null-mass \epsilon_j^{(0)} := \bar{p}_0^{(j)}(\{*_j\}) \in [0, 1) and conditional baseline p_0^{(j)}(v) := \bar{p}_0^{(j)}(v) / (1 - \epsilon_j^{(0)}) for v \in V_j.
A least-favorable adversarial alternative p_1^{(j)} at threshold shift \eta_j.
A clipped per-step log-likelihood-ratio function \ell^{(j)}: A \to [-B_{\mathrm{clip}}, B_{\mathrm{clip}}] that depends on a only through \pi_j(a), with \ell^{(j)}(a) = 0 whenever \pi_j(a) = *_j (zero-outside-engagement convention). B_{\mathrm{clip}} is the (C5.HOEFF) clip radius, fixed before deployment and independent of |P|.

The set of actions engaging channel j is A_j := \{a \in A : \pi_j(a) \neq *_j\}. For ordered channel pairs (j, j') with j \neq j', the pairwise overlap set is A_{j, j'} := A_j \cap A_{j'}.
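As an illustration of the engagement and overlap sets just defined, the following minimal sketch computes A_j and A_{j,j'} for a toy action space. The action names, the two projection maps, and the use of `None` for the null projection *_j are assumptions made for the example; they are not part of the verification protocol.

```python
# Hypothetical toy deployment: four actions and two channel projection maps.
NULL = None  # stands in for the null projection *_j


def engagement_set(actions, projection):
    """A_j: the actions whose projection on the channel is non-null."""
    return {a for a in actions if projection(a) is not NULL}


def overlap_set(actions, proj_j, proj_jp):
    """A_{j,j'} = A_j intersected with A_{j'}."""
    return engagement_set(actions, proj_j) & engagement_set(actions, proj_jp)


actions = ["idle", "trade", "attest", "trade+attest"]

# Channel 1 observes a trade count; channel 2 observes an attestation bit.
# A legitimate observation of 0 would still engage the channel, which is
# why the null value is tested with `is not NULL` rather than truthiness.
pi_1 = lambda a: 1 if "trade" in a else NULL
pi_2 = lambda a: 1 if "attest" in a else NULL

A_1 = engagement_set(actions, pi_1)
A_2 = engagement_set(actions, pi_2)
A_12 = overlap_set(actions, pi_1, pi_2)
```

In this toy case only the combined action engages both channels, so the overlap set is a singleton and the idle action a_0 engages neither channel.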

SPRT machinery and exposure events

The deployment runs continuous SPRT detection on each monitored channel, with familywise allocation across K_{\mathrm{ch}} channels under (C5.MULT). Per-step LLR increments are clipped to [-B_{\mathrm{clip}}, B_{\mathrm{clip}}] under (C5.HOEFF); the post-clipping union-class drift floor is \delta_{*}> 0 under (C5.IID).

Let N_{\mathrm{ev}}(\tau_{\mathrm{meta}}) be the number of SPRT exposure events that occur in the cascade-time window [0, \tau_{\mathrm{meta}}], where \tau_{\mathrm{meta}} is the metastable-lifetime lower bound from . Under (C11.CLK) , N_{\mathrm{ev}}(\tau_{\mathrm{meta}}) \geq N_{\mathrm{cascade}} with probability \geq 1 - b_{\mathrm{clk}} (lower bound for detection); the upper bound N_{\mathrm{ev}}(\tau_{\mathrm{meta}}) \leq \lambda_{\max}\cdot \tau_{\mathrm{meta}} is supplied by (C7.RATE), introduced in §3 below.

Trade-flow weighting and Herfindahl index

Following , the deployment monitors trade flows attributed to counterparties (S1-admissible participants ). The trade-flow weighting is a probability measure w over counterparties (post-coalition closure under (COAL); see §4), with the Herfindahl-Hirschman index \begin{equation} \mathrm{HHI}(w) := \sum_c w_c^2 = \|w\|_2^2. \end{equation} ’s invariant I_5 requires \mathrm{HHI}(w) < H^* for a deployment-class threshold H^* < 1. The Part-A category-flow variant of corresponds to this counterparty-attributed form under the natural attribution: each counterparty’s trade contribution determines its weight, and (REP)/(DISP)/(COAL) of §4 are formulated on this counterparty-weight representation.

Capability volume and Lipschitz transfer

The Lipschitz transfer of is the bridge from proxy-truth divergence to Goodhart slack: for any alignment property g on P^{\mathrm{act}} with Lipschitz constant \mathrm{Lip}(g), \begin{equation} |g(T) - g(P)| \;\leq\; \mathrm{Lip}(g) \cdot \|P- T\|. \end{equation} This is forward-only: no g-level lower bound follows from a proxy-truth-norm lower bound without additional non-degeneracy structure on g.

New structural quantities

Three structural quantities beyond ’s vocabulary:

Upper exposure-rate cap \lambda_{\max} (C7.RATE), introduced in §3.
Base population measure \mu on counterparties, introduced in §4.
\chi^2 divergence \chi^2(w \| \mu) as the natural concentration measure when the base measure is non-uniform; equals N \mathrm{HHI}(w) - 1 when \mu is uniform on a finite set of N counterparties.

Audit procedures for each are in §6.

Bounded co-evolution as a corollary

The bounded co-evolution assumption of (invoked as condition (C7) of the deployment-safety theorem) asserts that the per-channel coupling magnitude \bar{M} is bounded uniformly by a constant depending on deployment-class parameters but independent of |P|. This section derives the assertion from (C5.HOEFF), a new upper exposure-rate cap (C7.RATE), and the verification protocol’s channel-projection structure under (C5.MULT). Audit 4 of previously specified an empirical procedure for measuring \bar{M} post-hoc; under the derivation below, the audit reduces to certifying the three structural inputs and computing \bar{M} from primitives.

Per-channel coupling magnitude (definition)

j-supported perturbations

A j-supported signed measure is a function \Delta: A \to \mathbb{R} with \Delta(a) = 0 for a \notin A_j and \sum_{a \in A_j} \Delta(a) = 0. The set of admissible j-perturbations \mathcal{D}_j is the convex subset of j-supported signed measures satisfying q_0 + \Delta \geq 0 pointwise, where q_0 is the reference probability measure on A being perturbed (so that q_0 + \Delta is again a valid probability measure on A).

The ambient linear space of j-supported zero-sum signed measures is \begin{equation} H_j \;:=\; \{\Delta : A \to \mathbb{R} \mid \Delta(a) = 0 \text{ for } a \notin A_j, \;\; \textstyle\sum_a \Delta(a) = 0\}. \end{equation} The functional \Delta \mapsto \Delta\mu_{j'}[\Delta] := \sum_{a \in A_{j,j'}} \Delta(a) \ell^{(j')}(a) extends linearly from \mathcal{D}_j to all of H_j.

Per-step coupling magnitude

Definition 1 (Per-step coupling magnitude). For j \neq j', the per-step pairwise coupling magnitude of channel j on channel j' is the operator norm of the linear functional \Delta \mapsto \Delta\mu_{j'}[\Delta] on the linear space H_j under the \ell_1 norm: \begin{equation} M_{j \to j'}^{\mathrm{step}}(\mathcal{D}) \;:=\; \sup_{\Delta \in H_j,\; \Delta \neq 0} \;\frac{\big| \Delta\mu_{j'}[\Delta] \big|}{\|\Delta\|_1}. \end{equation}

The per-step deployment coupling magnitude is \bar{M}^{\mathrm{step}}(\mathcal{D}) := \max_{j \neq j'} M_{j \to j'}^{\mathrm{step}}(\mathcal{D}).

By inclusion \mathcal{D}_j \subseteq H_j, the operator norm on H_j upper-bounds the admissible-restricted supremum \sup_{\Delta \in \mathcal{D}_j \setminus \{0\}} |\Delta\mu_{j'}[\Delta]|/\|\Delta\|_1. We work with the H_j-bound throughout; equality of the two suprema requires q_0 to have strictly positive mass on every a \in A_j (so \mathcal{D}_j contains a neighborhood of 0 in H_j), which we do not assume.

Per-step closed-form bound

Lemma 1 (Per-step coupling bound). Under (C5.HOEFF) (per-step LLR clipped to [-B_{\mathrm{clip}}, B_{\mathrm{clip}}]) and the zero-outside-engagement convention, for every j \neq j': \begin{equation} M_{j \to j'}^{\mathrm{step}}(\mathcal{D}) \;\leq\; B_{\mathrm{clip}}, \end{equation} unconditionally. When channels are action-disjoint (A_{j, j'} = \emptyset), M_{j \to j'}^{\mathrm{step}} = 0.

Proof. For any \Delta \in H_j with \Delta \neq 0: \begin{align*} \big| \Delta\mu_{j'}[\Delta] \big| &= \big| \sum_{a \in A_{j, j'}} \Delta(a) \ell^{(j')}(a) \big| \leq \sum_{a \in A_{j, j'}} |\Delta(a)| \cdot |\ell^{(j')}(a)| \\ &\leq B_{\mathrm{clip}}\sum_{a \in A_{j, j'}} |\Delta(a)| \leq B_{\mathrm{clip}}\cdot \|\Delta\|_1. \end{align*} Dividing by \|\Delta\|_1 \neq 0 and taking the supremum over H_j \setminus \{0\} gives M_{j \to j'}^{\mathrm{step}}(\mathcal{D}) \leq B_{\mathrm{clip}}. When A_{j, j'} = \emptyset, the sum is empty and \Delta\mu_{j'}[\Delta] = 0 for all \Delta \in H_j, giving M_{j \to j'}^{\mathrm{step}} = 0.
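The per-step bound can be sanity-checked numerically. The sketch below samples random elements of H_j on a small hypothetical action set, evaluates the cross-channel response against a clipped LLR, and records the worst observed ratio |\Delta\mu_{j'}[\Delta]| / \|\Delta\|_1. The action labels, the clip radius, and the Gaussian draw for the unclipped LLR are illustrative assumptions only.

```python
import random


def cross_response(delta, ell_jp, overlap):
    """Delta-mu_{j'}[Delta]: sum over the overlap set of Delta(a) * ell^{(j')}(a)."""
    return sum(delta[a] * ell_jp[a] for a in overlap)


def random_zero_sum(support, rng):
    """A random element of H_j: supported on A_j with entries summing to zero."""
    vals = {a: rng.uniform(-1.0, 1.0) for a in support}
    mean = sum(vals.values()) / len(vals)
    return {a: v - mean for a, v in vals.items()}


rng = random.Random(0)
B_CLIP = 2.0                      # hypothetical clip radius
A_j = ["a1", "a2", "a3", "a4"]    # hypothetical engagement set of channel j
overlap = ["a3", "a4"]            # hypothetical A_{j,j'}, a subset of A_j

# Clipped LLR of channel j' on the shared actions (values are made up).
ell_jp = {a: max(-B_CLIP, min(B_CLIP, rng.gauss(0.0, 3.0))) for a in overlap}

worst_ratio = 0.0
for _ in range(10_000):
    delta = random_zero_sum(A_j, rng)
    l1 = sum(abs(v) for v in delta.values())
    if l1 > 0:
        ratio = abs(cross_response(delta, ell_jp, overlap)) / l1
        worst_ratio = max(worst_ratio, ratio)
```

Random sampling only probes the supremum from below, so `worst_ratio` is an empirical lower estimate of M_{j \to j'}^{\mathrm{step}}; by Lemma 1 it can never exceed `B_CLIP`.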

Cumulative cascade-window bound

Pathwise perturbation class

A pathwise j-perturbation over the cascade window is a sequence \{\Delta_n\}_{n=1}^{N_{\mathrm{ev}}} where each \Delta_n \in H_j, applied at SPRT exposure step n. Two budget conventions are operationally relevant:

Per-step budget: \sup_n \|\Delta_n\|_1 \leq 1.
Total budget: \sum_n \|\Delta_n\|_1 \leq 1.

The cumulative cross-channel response is \mu_{j'}^{\mathrm{cum}}\big[\{\Delta_n\}\big] \;:=\; \sum_{n=1}^{N_{\mathrm{ev}}(\tau_{\mathrm{meta}})} \Delta\mu_{j'}[\Delta_n].

The new condition: upper exposure-rate cap (C7.RATE)

Assumption 1 (Upper exposure-rate cap, (C7.RATE)). There exists a deployment-class action rate cap \lambda_{\max}> 0 independent of |P|, such that the SPRT exposure event count satisfies \begin{equation} N_{\mathrm{ev}}(\tau_{\mathrm{meta}}) \;\leq\; \lambda_{\max}\cdot \tau_{\mathrm{meta}} \end{equation} deterministically (operator-enforced via rate limits or action bounds). This is distinct from (C11.CLK) of which gives a lower bound on N_{\mathrm{ev}}(\tau_{\mathrm{meta}}) for detection purposes; (C7.RATE) is an upper bound for coupling-control purposes.

Operationally, \lambda_{\max} is calibrated by the deployment’s rate limits, action bounds, and channel-coupling structure (specified in §6).
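One way an operator might enforce the deterministic cap is a hard per-window counter that refuses exposure events once the budget \lambda_{\max} \cdot \tau_{\mathrm{meta}} is spent. The class below is a minimal sketch under that assumption; the name `ExposureRateCap`, its interface, and the numeric values are hypothetical and are not the audit procedure of §6.

```python
import math


class ExposureRateCap:
    """Admit at most floor(lambda_max * tau_meta) exposure events per window."""

    def __init__(self, lambda_max, tau_meta):
        # Integer budget implied by N_ev(tau_meta) <= lambda_max * tau_meta.
        self.budget = math.floor(lambda_max * tau_meta)
        self.count = 0

    def try_admit(self):
        """Return True and count the event if the window budget allows it."""
        if self.count >= self.budget:
            return False
        self.count += 1
        return True


# Hypothetical values: lambda_max = 2.5 events per unit time, tau_meta = 4.0.
cap = ExposureRateCap(lambda_max=2.5, tau_meta=4.0)
admitted = sum(cap.try_admit() for _ in range(25))
```

With these values the budget is 10 events, so 15 of the 25 attempted events are refused and the event count satisfies the (C7.RATE) inequality by construction rather than with high probability.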

Cumulative bound

Lemma 2 (Cumulative coupling bound). Under (C5.HOEFF), the zero-outside-engagement convention, and (C7.RATE):

(a) Pairwise cumulative coupling, per-step budget \sup_n \|\Delta_n\|_1 \leq 1: \begin{equation} \sup_{\substack{\Delta_n \in H_j \;\forall n \\ \sup_n \|\Delta_n\|_1 \leq 1}} \big| \mu_{j'}^{\mathrm{cum}}\big[\{\Delta_n\}\big] \big| \;\leq\; B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}}. \end{equation}

(b1) Aggregate-incoming cumulative coupling under total per-step budget \sup_n \sum_j \|\Delta_n^{(j)}\|_1 \leq 1: \begin{equation} \sup_{\substack{\Delta_n^{(j)} \in H_j \;\forall n, j \\ \sup_n \sum_j \|\Delta_n^{(j)}\|_1 \leq 1}} \big| \sum_{j \neq j'} \mu_{j'}^{\mathrm{cum}}\big[\{\Delta_n^{(j)}\}\big] \big| \;\leq\; B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}}. \end{equation}

(b2) Aggregate-incoming cumulative coupling under per-source per-step budget \sup_n \|\Delta_n^{(j)}\|_1 \leq 1 for every j: \begin{equation} \sup_{\substack{\Delta_n^{(j)} \in H_j \;\forall n, j \\ \sup_n \|\Delta_n^{(j)}\|_1 \leq 1 \;\forall j}} \big| \sum_{j \neq j'} \mu_{j'}^{\mathrm{cum}}\big[\{\Delta_n^{(j)}\}\big] \big| \;\leq\; (K_{\mathrm{ch}}- 1) \cdot B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}}. \end{equation}

The right-hand sides are deployment-class constants intensive in |P| over \mathcal{D} (B_{\mathrm{clip}} from (C5.HOEFF), \lambda_{\max} from (C7.RATE), \tau_{\mathrm{meta}} from ’s scaling, K_{\mathrm{ch}} from (C5.MULT)).

Proof. Part (a). By Lemma 1, |\Delta\mu_{j'}[\Delta_n]| \leq B_{\mathrm{clip}}\cdot \|\Delta_n\|_1 \leq B_{\mathrm{clip}} for each n when \sup_n \|\Delta_n\|_1 \leq 1. Summing: \big| \mu_{j'}^{\mathrm{cum}} \big| \leq \sum_{n=1}^{N_{\mathrm{ev}}(\tau_{\mathrm{meta}})} |\Delta\mu_{j'}[\Delta_n]| \leq N_{\mathrm{ev}}(\tau_{\mathrm{meta}}) \cdot B_{\mathrm{clip}} \leq \lambda_{\max}\cdot \tau_{\mathrm{meta}}\cdot B_{\mathrm{clip}}, the last inequality by (C7.RATE).

Part (b1) (total per-step budget). At each step n, \sum_j \|\Delta_n^{(j)}\|_1 \leq 1. Each per-source contribution is bounded by B_{\mathrm{clip}}\cdot \|\Delta_n^{(j)}\|_1 via Lemma 1. Summing over j \neq j' at each step: \big| \sum_{j \neq j'} \Delta\mu_{j'}[\Delta_n^{(j)}] \big| \leq B_{\mathrm{clip}}\cdot \sum_{j \neq j'} \|\Delta_n^{(j)}\|_1 \leq B_{\mathrm{clip}}. Summing over n and applying (C7.RATE) gives the bound in (b1).

Part (b2) (per-source per-step budget). Now each source j has \|\Delta_n^{(j)}\|_1 \leq 1 independently. Per-source contributions remain bounded by B_{\mathrm{clip}} each; summing over K_{\mathrm{ch}}- 1 sources: \big| \sum_{j \neq j'} \Delta\mu_{j'}[\Delta_n^{(j)}] \big| \leq (K_{\mathrm{ch}}- 1) \cdot B_{\mathrm{clip}}. Summing over n and applying (C7.RATE) gives the bound in (b2). Which budget convention is operationally relevant depends on the audit specification of admissible adversarial classes; we state both for completeness.
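The difference between the budget conventions can be made concrete with a small worked computation. The sketch below evaluates the closed-form right-hand sides of Lemma 2 and compares them with an assumed worst-case configuration in which every source channel shares two actions with the target channel, carrying clipped LLR values +B_{\mathrm{clip}} and -B_{\mathrm{clip}}. All parameter values and the names `cumulative_bound` and `step_response` are hypothetical.

```python
def cumulative_bound(b_clip, lambda_max, tau_meta, k_ch, convention):
    """Closed-form right-hand side of Lemma 2 for a given budget convention."""
    if convention in ("pairwise", "total_per_step"):
        c = 1
    elif convention == "per_source":
        c = k_ch - 1
    else:
        raise ValueError(f"unknown budget convention: {convention}")
    return c * b_clip * lambda_max * tau_meta


def step_response(delta, ell):
    """Delta-mu_{j'}[Delta] restricted to the shared actions listed in delta."""
    return sum(delta[a] * ell[a] for a in delta)


# Hypothetical parameters (illustrative only).
B_CLIP, LAMBDA_MAX, TAU_META, K_CH = 1.5, 5.0, 4.0, 4
n_ev = int(LAMBDA_MAX * TAU_META)   # (C7.RATE) saturated: 20 exposure events

# Assumed worst case: two shared actions with clipped LLR at +/- B_CLIP.
ell = {"up": B_CLIP, "down": -B_CLIP}
unit = {"up": 0.5, "down": -0.5}    # zero-sum, L1 norm 1, response B_CLIP

sources = K_CH - 1
# (b2): every source spends a full unit of budget at every step.
per_source = n_ev * sum(step_response(unit, ell) for _ in range(sources))
# (b1): the sources split a single unit of budget at every step.
shared = {a: v / sources for a, v in unit.items()}
total = n_ev * sum(step_response(shared, ell) for _ in range(sources))
```

In this configuration both constructed responses meet their respective closed-form bounds, which suggests the factor K_{\mathrm{ch}} - 1 in (b2) is not an artifact of the proof but reflects the larger adversarial budget that convention admits.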

Bounded co-evolution as a corollary of primitive parameters

Theorem 1 (Bounded co-evolution corollary). Under (C5.HOEFF) per-step LLR clipping, (C5.MULT) channel multiplicity bound, (C7.RATE) upper exposure-rate cap, and the \tau_{\mathrm{meta}} scaling of , the deployment’s cumulative coupling magnitude \bar{M}^{\mathrm{cum}}(\mathcal{D}) over the cascade window \tau_{\mathrm{meta}} satisfies \begin{equation} \bar{M}^{\mathrm{cum}}(\mathcal{D}) \;\leq\; C \cdot B_{\mathrm{clip}} \cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}}, \end{equation} where C = 1 under the pairwise and total per-step budget conventions (Lemma 2(a) and (b1)) and C = K_{\mathrm{ch}}- 1 under the per-source per-step convention (Lemma 2(b2)). All four factors on the RHS are deployment-class constants intensive in |P|.

In particular: ’s bounded-co-evolution claim that \bar{M} is bounded by a |P|-independent constant follows directly from this bound, with the constant given explicitly in terms of primitive deployment parameters.

Proof. Direct application of Lemma 2. Intensivity of each RHS factor: B_{\mathrm{clip}} from (C5.HOEFF) is a deployment-class constant; \lambda_{\max} from (C7.RATE) is operator-enforced and deployment-class; \tau_{\mathrm{meta}} from depends on deployment-class redundancy and rate parameters but not on |P|; K_{\mathrm{ch}} from (C5.MULT) is fixed before deployment.

Optional sharper bound under coalition-overlap stability

The basic Theorem 1 bound uses only \|\Delta\|_1 \leq 1 to get |\Delta\mu_{j'}| \leq B_{\mathrm{clip}}. A sharper bound is available when the deployment imposes an additional structural condition on the channel partition’s shared-action mass:

Definition 2 (Optional overlap-mass stability, (C5.OVL)). The channel partition’s shared-action mass is bounded by a deployment-class constant Q_{\max} < 1, in the sense that for every j \neq j', \sup_{\Delta \in H_j,\; \Delta \neq 0} \frac{\sum_{a \in A_{j, j'}} |\Delta(a)|}{\|\Delta\|_1} \;\leq\; Q_{\max}.

When (C5.OVL) is adopted, the per-step bound in Lemma 1 sharpens to M_{j \to j'}^{\mathrm{step}} \leq Q_{\max} \cdot B_{\mathrm{clip}}, and the cumulative bound in Theorem 1 sharpens to \bar{M}^{\mathrm{cum}} \leq C \cdot Q_{\max} \cdot B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}}. The intensivity claim does not require (C5.OVL); only strict-smallness (\bar{M} \to 0 as Q_{\max} \to 0, i.e., as channels become action-disjoint) requires it.

Lemma 1 intensivity step

(intensivity step in the Goodhart-slack composition) requires (\varepsilon_{\mathrm{coev}}^{\mathrm{nonres}})_+ \leq K_{\mathrm{coev}} \cdot \bar{M} to be bounded by a |P|-independent constant, where K_{\mathrm{coev}} is the structural co-evolution constant from ’s composition proposition. Theorem 1 provides this directly: \begin{equation} (\varepsilon_{\mathrm{coev}}^{\mathrm{nonres}})_+ \;\leq\; K_{\mathrm{coev}} \cdot \bar{M}^{\mathrm{cum}} \;\leq\; K_{\mathrm{coev}} \cdot C \cdot B_{\mathrm{clip}}\cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}}, \end{equation} all factors deployment-class constants. The Goodhart-slack intensivity property of ’s deployment claim follows.

Failure-mode decomposition

’s sensitivity analysis characterized bounded co-evolution failure as a hard-failure mode with no graceful-degradation pathway. Under Theorem 1, the failure mode factors through the failure of (C7.RATE), (C5.HOEFF), or the channel-projection structure — each of which has its own audit-detectable failure signal. The hard-failure characterization is therefore replaced by a structural diagnosis: which primitive condition has failed, and how the deployment can restore it.

The remainder of ’s sensitivity analysis simplifies correspondingly: the bounded-co-evolution paragraph becomes a forward reference to this section, with the failure-mode taxonomy decomposed across (C5.HOEFF), (C7.RATE), and channel-projection structure.

Concentration-Gap selection theorem (scoped)

The Concentration-Gap conjecture of states informally: optimization pressure on the proxy correlates with proxy-truth gap exploitation. The conjecture is deferred in as open work; the present section proves a scoped version under three operationally auditable structural conditions ((REP), (DISP), (COAL)) plus a Lipschitz embedding compatibility precondition (Assumption 2). Under these conditions the proxy-truth Goodhart slack is bounded above by an explicit \chi^2-divergence quantity controlling Herfindahl-index trade-flow concentration, with a complementary lower bound available only at the proxy-truth-norm level.

The proof proceeds in three layers: an algebraic weighted-selection kernel (§4.1), a counterparty-selection interpretation (§4.2), and a transfer to Goodhart slack via the Lipschitz machinery of (§4.3). The universal model-free Concentration-Gap conjecture remains open; §7 states the limits of the scoped discharge.

The algebraic weighted-selection kernel

Setup

Let \mathcal{V} be a real Hilbert space with norm \|\cdot\| (in applications: capability-poset measures, or its tangent space at the welfare-relevant truth). Let \mathcal{C} be a measurable space (in applications: the set of counterparties, possibly after coalition partitioning per (COAL)). Let \mu be a fixed base population measure on \mathcal{C} — the auditor-defined fair-or-natural distribution over admissible counterparties.

Let \Delta: \mathcal{C} \to \mathcal{V} be a deterministic deviation field, c \mapsto \Delta_c, \mu-integrable. A weighting w is a probability measure on \mathcal{C} absolutely continuous with respect to \mu, with Radon-Nikodym derivative r := dw/d\mu satisfying \mathbb{E}_\mu[r] = 1.

The population mean is \bar\Delta := \int_\mathcal{C} \Delta_c \,d\mu(c) = \mathbb{E}_\mu[\Delta]. The weighted aggregate distortion is \begin{equation} \Delta(w) \;:=\; \int_\mathcal{C} \Delta_c \,dw(c) \;=\; \mathbb{E}_w[\Delta] \;=\; \mathbb{E}_\mu[r \cdot \Delta]. \end{equation}

Two scalar concentration measures: HHI and \chi^2 divergence

Definition 3 (\chi^2 divergence). The \chi^2 divergence of w from \mu is \begin{equation} \chi^2(w \,\|\, \mu) \;:=\; \mathbb{E}_\mu[(r - 1)^2] \;=\; \mathrm{Var}_\mu(r) \;\geq\; 0. \end{equation} \chi^2(w \|\mu) = 0 iff w = \mu.

Definition 4 (HHI and its relation to \chi^2). For discrete \mathcal{C}, the Herfindahl-Hirschman index of w is \mathrm{HHI}(w) := \sum_c w_c^2 = \|w\|_2^2. When \mu is the uniform measure on a finite \mathcal{C} of size N, the two concentration measures are related by \begin{equation} \chi^2(w \,\|\, \mu) \;=\; N \cdot \mathrm{HHI}(w) - 1. \end{equation} For non-uniform \mu, \chi^2(w \|\mu) generalizes \mathrm{HHI}(w) as the appropriate concentration measure when the population is not uniformly weighted.
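The identity between the two concentration measures is easy to verify on a small example. The sketch below computes \chi^2(w \| \mu) directly from Definition 3 in its discrete form \sum_c (w_c - \mu_c)^2 / \mu_c and compares it with N \cdot \mathrm{HHI}(w) - 1 for a uniform base measure. The weight vector is made up for illustration, and the function names are hypothetical.

```python
def hhi(w):
    """Herfindahl-Hirschman index: the sum of squared weights."""
    return sum(x * x for x in w)


def chi2_divergence(w, mu):
    """chi^2(w || mu) = E_mu[(r - 1)^2] with r = w / mu, for discrete measures.

    Requires mu_c > 0 wherever it is evaluated (w absolutely continuous w.r.t. mu).
    """
    return sum((wc - mc) ** 2 / mc for wc, mc in zip(w, mu))


# Hypothetical trade-flow weights over N = 4 counterparties.
w = [0.5, 0.3, 0.2, 0.0]
N = len(w)
mu_uniform = [1.0 / N] * N

lhs = chi2_divergence(w, mu_uniform)   # direct definition
rhs = N * hhi(w) - 1.0                 # HHI form of Definition 4
```

Here \mathrm{HHI}(w) = 0.38, so both sides come to 0.52 up to floating-point rounding. For a non-uniform `mu` the two sides generally differ, which is why the \chi^2 form is the one the kernel is stated in.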

Structural conditions

Definition 5 (Representativeness, (REP)). The deviation field is \rho_\mathrm{rep}-representative relative to base measure \mu if \|\bar\Delta\| \leq \rho_\mathrm{rep}.

Definition 6 (Bounded dispersion, (DISP)). The deviation field has \sigma-bounded dispersion relative to base measure \mu if \int_\mathcal{C} \|\Delta_c - \bar\Delta\|^2 \,d\mu(c) \leq \sigma^2.

Definition 7 (Coalition closure, (COAL)). The counterparty space \mathcal{C} is coalition-closed under audit if (i) all counterparties whose actions or trade flows are correlated above a deployment-class threshold are partitioned into a single equivalence class (a coalition), and \Delta is defined on coalitions rather than individual counterparties; (ii) the post-partition counterparty cardinality N is a deployment-class constant (counterparty onboarding is governed by a deployment-class policy that does not let N scale with |P|); and (iii) latent (undetected) coalition risk is bounded by an explicit residual term \eta_\mathrm{latent} that the deployment claim absorbs (see §4.3 below). The audit specifies the partition; the algebraic theorem operates on the post-partition \mathcal{C}.

Clause (ii) is what makes the HHI-to-\chi^2 translation intensive in |P|: \chi^2(w \,\|\, \mu) = N \mathrm{HHI}(w) - 1 for uniform \mu, so an HHI threshold yields a |P|-independent \chi^2 ceiling only when N is itself |P|-independent. Deployments that prefer a structural commitment directly on \chi^2(w \,\|\, \mu) rather than on \mathrm{HHI} may substitute (ii) with the alternative sufficient condition \chi^2(w \,\|\, \mu) \leq \Xi for a deployment-class constant \Xi (the two are not logically equivalent — the \chi^2 bound is the weaker hypothesis the proof actually uses — but either certifies the intensivity that the theorem requires).
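Clause (i) of (COAL) can be illustrated with a union-find pass that merges counterparties flagged as correlated above the threshold and then aggregates their trade-flow weights. This is only a sketch: it assumes the audit has already produced a list of flagged pairs and that the coalition relation is taken as the transitive closure of those pairs. The function names and the data are hypothetical.

```python
def coalition_partition(n, correlated_pairs):
    """Group counterparties 0..n-1 into coalitions (transitive closure of pairs)."""
    parent = list(range(n))

    def find(x):
        # Path-halving find.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in correlated_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for c in range(n):
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


def coalition_weights(weights, coalitions):
    """Post-partition weighting: each coalition carries its members' total weight."""
    return [sum(weights[c] for c in members) for members in coalitions]


# Hypothetical audit output: five equal-weight counterparties, two flagged pairs.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
flagged = [(0, 1), (1, 2)]

coalitions = coalition_partition(len(weights), flagged)
closed = coalition_weights(weights, coalitions)
hhi_before = sum(x * x for x in weights)
hhi_after = sum(x * x for x in closed)
```

In this example the pre-partition HHI is 0.20 while the coalition-closed HHI is 0.44, showing how a concentration check run on individual counterparties could understate the concentration that the post-partition weighting exposes.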

Forward direction: algebraic kernel

Lemma 3 (Algebraic weighted-selection, forward direction). Let \Delta be a deviation field on (\mathcal{C}, \mu) in a Hilbert space \mathcal{V}, satisfying (REP) and (DISP). Let w be a weighting on \mathcal{C} with \chi^2(w \|\mu) < \infty. Then \begin{equation} \|\Delta(w)\| \;\leq\; \rho_\mathrm{rep} \;+\; \sigma \cdot \sqrt{\chi^2(w \,\|\, \mu)}. \end{equation}

Proof. Decompose \Delta(w) around the population mean: \Delta(w) - \bar\Delta \;=\; \mathbb{E}_\mu[r \cdot \Delta] - \mathbb{E}_\mu[\Delta] \;=\; \mathbb{E}_\mu[(r - 1) \cdot \Delta] \;=\; \mathbb{E}_\mu[(r - 1)(\Delta - \bar\Delta)], where the last equality uses \mathbb{E}_\mu[r - 1] = 0.

We bound the Hilbert-valued mean via duality. For any unit vector u \in \mathcal{V} with \|u\| = 1: \big\langle u, \,\mathbb{E}_\mu[(r - 1)(\Delta - \bar\Delta)] \big\rangle \;=\; \mathbb{E}_\mu\big[(r - 1) \cdot \langle u, \Delta - \bar\Delta\rangle\big]. By Cauchy-Schwarz on L^2(\mu) applied to the scalar product of (r - 1) and \langle u, \Delta - \bar\Delta\rangle: \big|\mathbb{E}_\mu[(r-1) \langle u, \Delta - \bar\Delta\rangle]\big| \;\leq\; \sqrt{\mathbb{E}_\mu[(r-1)^2]} \cdot \sqrt{\mathbb{E}_\mu[\langle u, \Delta - \bar\Delta\rangle^2]}. Since |\langle u, \Delta - \bar\Delta\rangle| \leq \|\Delta - \bar\Delta\| for \|u\| = 1, the second factor is bounded by \sqrt{\mathbb{E}_\mu[\|\Delta - \bar\Delta\|^2]} \leq \sigma. Taking the supremum over unit u recovers the Hilbert norm: \|\mathbb{E}_\mu[(r-1)(\Delta - \bar\Delta)]\| \;=\; \sup_{\|u\|=1} \big\langle u, \mathbb{E}_\mu[(r-1)(\Delta - \bar\Delta)]\big\rangle \;\leq\; \sqrt{\chi^2(w \|\mu)} \cdot \sigma. Therefore \|\Delta(w) - \bar\Delta\| \leq \sigma \sqrt{\chi^2(w \|\mu)}. Adding the representativeness bound: \|\Delta(w)\| \;\leq\; \|\bar\Delta\| + \|\Delta(w) - \bar\Delta\| \;\leq\; \rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \|\mu)}.
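The forward bound of Lemma 3 can be checked numerically in a finite-dimensional setting. The sketch below takes \mathcal{V} = \mathbb{R}^3, a uniform base measure on six counterparties, a randomly drawn deviation field, and a random weighting, then compares \|\Delta(w)\| with \rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \| \mu)}, using the tightest constants \rho_\mathrm{rep} = \|\bar\Delta\| and \sigma^2 = \mathbb{E}_\mu\|\Delta - \bar\Delta\|^2. All numerical choices are illustrative assumptions.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def weighted_mean(vectors, weights):
    """Componentwise sum of weights[c] * vectors[c]."""
    dim = len(vectors[0])
    return [sum(wt * v[i] for v, wt in zip(vectors, weights)) for i in range(dim)]


rng = random.Random(1)
N, DIM = 6, 3
mu = [1.0 / N] * N
deltas = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N)]

# A random weighting w on the same six atoms.
raw = [rng.random() for _ in range(N)]
w = [x / sum(raw) for x in raw]

mean = weighted_mean(deltas, mu)            # population mean, bar-Delta
rho_rep = norm(mean)                        # tightest (REP) constant
sigma = math.sqrt(sum(
    m * norm([d[i] - mean[i] for i in range(DIM)]) ** 2
    for d, m in zip(deltas, mu)
))                                          # tightest (DISP) constant
chi2 = sum((wc - mc) ** 2 / mc for wc, mc in zip(w, mu))

lhs = norm(weighted_mean(deltas, w))        # ||Delta(w)||
rhs = rho_rep + sigma * math.sqrt(chi2)     # Lemma 3 right-hand side
```

The deviations here are drawn independently only for convenience; the inequality `lhs <= rhs` holds for any deterministic field, since the proof uses nothing beyond (REP) and (DISP).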

Remark 1 (On cross-counterparty correlations). The Cauchy-Schwarz argument above does not require any probabilistic decorrelation between counterparty deviations. The only structural inputs are (REP) and (DISP), both of which are properties of the deterministic deviation field \Delta (not of its realization randomness). Cross-counterparty correlations are absorbed automatically into the \sigma^2 population-variance term. The case of identical deviations across all counterparties (highly correlated, no diversification) corresponds to \sigma^2 = 0, in which case the bound is trivially \rho_\mathrm{rep}.

Reverse direction (norm-level)

Lemma 4 (Algebraic weighted-selection, reverse direction). Scope. Throughout this lemma, \mathcal{C} is a finite or countable atomic measurable space (the natural setting after coalition closure (COAL) partitions counterparties into discrete equivalence classes); w = (w_c)_{c \in \mathcal{C}} is the corresponding atomic weighting.

Let \Delta be a deviation field on (\mathcal{C}, \mu). Suppose:

Concentration. \mathrm{HHI}(w) \geq H^*, hence \max_c w_c \geq H^* (since \sum_c w_c^2 \leq \max_c w_c). Let d \in \arg\max_c w_c be the dominant counterparty, with w_d \geq H^*.
Dominant separation. \|\Delta_d\| \geq \delta.
Directional cancellation. Let u_d := \Delta_d / \|\Delta_d\| (unit direction). The remainder cancellation in this direction is bounded: \langle u_d, \,\sum_{c \neq d} w_c \Delta_c\rangle \geq -\beta.

Then \begin{equation} \|\Delta(w)\| \;\geq\; H^* \delta - \beta. \end{equation}

Proof. Project \Delta(w) onto u_d: \begin{align*} \langle u_d, \Delta(w)\rangle &= w_d \langle u_d, \Delta_d\rangle + \langle u_d, \sum_{c \neq d} w_c \Delta_c\rangle \\ &= w_d \|\Delta_d\| + \langle u_d, \sum_{c \neq d} w_c \Delta_c\rangle \\ &\geq H^* \delta - \beta \end{align*} by concentration (w_d \geq H^*), separation (\|\Delta_d\| \geq \delta), and directional cancellation. Then \|\Delta(w)\| \geq |\langle u_d, \Delta(w)\rangle| \geq H^* \delta - \beta.
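A small worked instance shows how the three hypotheses of Lemma 4 combine. The sketch below uses three counterparties with deviations in \mathbb{R}^2, takes H^* to be \mathrm{HHI}(w) itself (the tightest threshold for which the concentration hypothesis holds), and computes \delta and \beta from the data. The weights and deviation vectors are made up for illustration.

```python
import math


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical weighting and deviation field.
w = [0.7, 0.2, 0.1]
deltas = [[2.0, 0.0], [-1.0, 1.0], [0.5, -3.0]]
idx = range(len(w))

h_star = sum(x * x for x in w)              # concentration: HHI(w) >= H*
d = max(idx, key=lambda c: w[c])            # dominant counterparty
delta_sep = norm(deltas[d])                 # separation: ||Delta_d|| >= delta
u_d = [x / delta_sep for x in deltas[d]]    # unit direction of Delta_d

# Weighted remainder over the non-dominant counterparties.
remainder = [sum(w[c] * deltas[c][i] for c in idx if c != d) for i in range(2)]
beta = max(0.0, -dot(u_d, remainder))       # directional cancellation budget

aggregate = [sum(w[c] * deltas[c][i] for c in idx) for i in range(2)]
lower = h_star * delta_sep - beta           # Lemma 4 lower bound
```

Here H^* = 0.54, \delta = 2 and \beta = 0.15, so the lemma gives \|\Delta(w)\| \geq 0.93, while the actual aggregate (1.25, -0.1) has norm about 1.254. The gap comes from using H^* in place of the larger w_d = 0.7.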

Counterparty-selection interpretation

In the GFM trade-flow setting:

\mathcal{C}: the set of counterparties (S1-admissible participants under ), after coalition closure per (COAL).
\mu: the auditor-defined fair-or-natural distribution over admissible counterparties.
\mathcal{V}: Hilbert space of utility functionals on the capability poset (or its tangent space at the welfare-relevant truth W).
Embedding \phi: \mathcal{V}_\mathrm{utility} \to \mathcal{V}: an affine isometric embedding on the active subspace P^\mathrm{act}. The deployment specifies \phi as part of the audit setup.
\Delta_c := \phi(U_c) - \phi(W): counterparty c’s utility deviation from the welfare-relevant truth in the embedded representation.
w_c: counterparty c’s trade-flow weight (paper 10’s invariant I_5 measures \mathrm{HHI}(w) on coalition-closed weights).

The aggregate distortion \Delta(w) = \sum_c w_c \phi(U_c) - \phi(W) is the selection-induced proxy-truth deviation: how the weighted-trade-flow proxy diverges from the welfare-relevant truth in the embedded representation.

Route-C transfer: from selection distortion to Goodhart slack

Assumption 2 (Embedding compatibility). The embedding \phi: \mathcal{V}_\mathrm{utility} \to \mathcal{V} is an affine isometry on the operationally active subspace P^\mathrm{act}: \|\phi(x) - \phi(y)\| = \|x - y\|_{\mathcal{V}_\mathrm{utility}} for x, y \in P^\mathrm{act}, and is linear up to a fixed translation.

The proxy-truth difference admits the decomposition \begin{equation} \phi(P) - \phi(T) \;=\; \Delta_\mathrm{audit}(w) \;+\; e_\mathrm{latent}, \end{equation} where \Delta_\mathrm{audit}(w) is the audit-observable selection distortion (the \Delta(w) of Lemma 3 computed on coalition-closed weights) and e_\mathrm{latent} \in \mathcal{V} is the latent-coalition residual with \|e_\mathrm{latent}\| \leq \eta_\mathrm{latent}.

If \phi is only bi-Lipschitz (not isometric), the same argument applies with the bounds multiplied by the bi-Lipschitz constants; the isometric case is the cleanest form.

Theorem 2 (Concentration-Gap Selection Theorem, scoped). Under (REP), (DISP), (COAL), and Assumption 2: \begin{equation} |g(T) - g(P)| \;\leq\; \mathrm{Lip}(g) \cdot \big(\rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \|\mu)} + \eta_\mathrm{latent}\big). \end{equation}

In particular, when \mu is uniform on N counterparties with N a deployment-class constant (clause (ii) of (COAL)) and \mathrm{HHI}(w) < H^* ( in force): \chi^2(w \|\mu) \leq N H^* - 1, giving Goodhart slack bounded by \mathrm{Lip}(g) \cdot (\rho_\mathrm{rep} + \sigma \sqrt{N H^* - 1} + \eta_\mathrm{latent}), intensive in |P|.
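The uniform-base-measure translation can be checked directly: with \mu_c = 1/N, \chi^2(w \,\|\, \mu) = \sum_c (w_c - 1/N)^2 / (1/N) = N \sum_c w_c^2 - 2 \sum_c w_c + 1 = N \mathrm{HHI}(w) - 1. The following is a minimal sketch of that identity and of the resulting ceiling, assuming the trade-flow weights are supplied as a list summing to one. The function names and all numeric inputs are illustrative assumptions, not part of the paper's tooling.

```python
import math

def hhi(weights):
    """Herfindahl index: sum of squared weights (weights assumed to sum to 1)."""
    return sum(w * w for w in weights)

def chi2_uniform(weights):
    """chi^2(w || mu) against uniform mu_c = 1/N, computed from the definition."""
    n = len(weights)
    mu = 1.0 / n
    return sum((w - mu) ** 2 / mu for w in weights)

def slack_ceiling(lip_g, rho_rep, sigma, chi2, eta_latent):
    """Right-hand side of Theorem 2: Lip(g) * (rho_rep + sigma*sqrt(chi2) + eta_latent)."""
    return lip_g * (rho_rep + sigma * math.sqrt(max(chi2, 0.0)) + eta_latent)

# Hypothetical coalition-closed weights over N = 3 counterparties.
weights = [0.5, 0.25, 0.25]
n = len(weights)
chi2 = chi2_uniform(weights)

# The definition and the closed form N * HHI - 1 agree up to rounding.
assert abs(chi2 - (n * hhi(weights) - 1.0)) < 1e-12

# Illustrative deployment-class constants (assumed values).
ceiling = slack_ceiling(lip_g=1.0, rho_rep=0.1, sigma=0.2, chi2=chi2, eta_latent=0.05)
```

Because the ceiling depends on the weights only through \chi^2, an HHI threshold and a fixed N are enough to evaluate it without access to the per-counterparty deviations.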

Deployments that prefer a \chi^2-direct commitment may replace (COAL)’s clause (ii) with the alternative sufficient condition \chi^2(w \,\|\, \mu) \leq \Xi for deployment-class \Xi (the \chi^2 bound is the weaker hypothesis the proof actually uses); the \mathrm{HHI} form is the operational surrogate that paper 10’s I_5 already monitors.

Reverse direction (norm-level only). Under Lemma 4’s concentration + separation + directional-cancellation conditions, and the latent-coalition residual bound from Assumption 2: \begin{equation} \|P- T\| \;\geq\; H^* \delta - \beta - \eta_{\mathrm{latent}}, \end{equation} non-trivial when H^* \delta > \beta + \eta_{\mathrm{latent}}. A g-level lower bound on |g(T) - g(P)| does not follow from alone; it requires additional inverse-Lipschitz / non-degeneracy structure on g that this theorem does not assume.

Proof. Forward direction: by Assumption 2, \|\phi(P) - \phi(T)\| \leq \|\Delta_\mathrm{audit}(w)\| + \|e_\mathrm{latent}\|. By Lemma 3, \|\Delta_\mathrm{audit}(w)\| \leq \rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \|\mu)}. Since \phi is isometric, \|P- T\|_{\mathcal{V}_\mathrm{utility}} = \|\phi(P) - \phi(T)\|. Apply : |g(T) - g(P)| \leq \mathrm{Lip}(g) \cdot \|P- T\|, giving .

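Reverse direction (norm-level): by Lemma 4, \|\Delta_\mathrm{audit}(w)\| \geq H^* \delta - \beta. By the decomposition accompanying Assumption 2, \Delta_\mathrm{audit}(w) = \phi(P) - \phi(T) - e_\mathrm{latent}, so the triangle inequality gives \|\phi(P) - \phi(T)\| \geq \|\Delta_\mathrm{audit}(w)\| - \|e_\mathrm{latent}\| \geq H^* \delta - \beta - \eta_\mathrm{latent}. By isometry of \phi, \|P- T\| = \|\phi(P) - \phi(T)\|, hence \|P- T\| \geq H^* \delta - \beta - \eta_\mathrm{latent} (non-trivial when the right-hand side is positive). The g-level lower bound does not follow without inverse-Lipschitz structure on g.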
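The inequality quoted from Lemma 3 in the forward direction can be sanity-checked on toy data. The sketch below assumes a finite counterparty set with deviations \Delta_c given as plain coordinate lists, and uses the exact toy-population values \|\bar\Delta\| and the \mu-standard deviation in place of the certified thresholds \rho_\mathrm{rep} and \sigma. The helper names and the data are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def forward_bound_terms(w, mu, deltas):
    """Return (||Delta(w)||, ||Delta_bar||, sigma, chi2) for a finite population."""
    delta_w = weighted_sum(w, deltas)      # selection distortion sum_c w_c Delta_c
    delta_bar = weighted_sum(mu, deltas)   # mu-mean deviation
    var = sum(
        m * sum((d[i] - delta_bar[i]) ** 2 for i in range(len(d)))
        for m, d in zip(mu, deltas)
    )
    chi2 = sum((wc - m) ** 2 / m for wc, m in zip(w, mu))
    return norm(delta_w), norm(delta_bar), math.sqrt(var), chi2

# Toy deviations for three counterparties in a 2-dimensional embedding.
deltas = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
mu = [1 / 3, 1 / 3, 1 / 3]   # uniform base measure
w = [0.6, 0.3, 0.1]          # concentrated trade-flow weights

lhs, rho, sigma, chi2 = forward_bound_terms(w, mu, deltas)
# ||Delta(w)|| <= ||Delta_bar|| + sigma * sqrt(chi2), up to rounding.
assert lhs <= rho + sigma * math.sqrt(chi2) + 1e-12
```

On toy data the inequality holds for any weight vector, since it is a Cauchy–Schwarz consequence of writing \Delta(w) - \bar\Delta = \sum_c (w_c - \mu_c)(\Delta_c - \bar\Delta). In a deployment, the left-hand side is not observed directly and \rho_\mathrm{rep}, \sigma enter as audited upper bounds.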

Scoped Concentration-Gap discharge

The Concentration-Gap conjecture, scoped form.

Under (REP) + (DISP) + (COAL) and Lipschitz embedding compatibility (Assumption 2), Theorem 2 establishes the conjecture’s bidirectional correlation content at the proxy-truth level: low HHI bounds the slack above; high HHI plus separation bounds it below at the norm level. The universal model-free conjecture remains open.

Replaces the HHI surrogate-adequacy assumption.

The HHI surrogate-adequacy claim (\mathrm{HHI}< H^* \Rightarrow deployment outside the optimization-pressure regime), previously asserted as a single empirical-adequacy assumption of the deployment-safety paper and now collected as , is replaced by the conjunction (REP) + (DISP) + (COAL), embedding compatibility (Assumption 2), and the HHI threshold itself. The previous monolithic claim becomes concrete checkable structural conditions, each with operational audit hooks specified in §6.

Layer 1 binding via I_5.

The deployment claim of no longer contains the surrogate-adequacy assumption as an unproved empirical claim. Theorem 1 Layer 1 binding via invariant I_5 follows from Theorem 2’s upper bound under (REP) + (DISP) + (COAL) and embedding compatibility, with the structural conditions auditable per §6.

Failure-mode decomposition under (REP), (DISP), (COAL), and embedding compatibility.

The previous monolithic failure mode decomposes into failure of (REP), (DISP), (COAL), or embedding compatibility — each of which has specific audit-detectable signals. Concentration-Gap-conjecture failure factors through whichever structural condition the deployment cannot certify; ’s sensitivity-analysis taxonomy traces the failure-mode consequences for the deployment claim.

Composition with the deployment-safety theorem

Theorem 1 (bounded co-evolution corollary) and Theorem 2 (scoped Concentration-Gap) each enter the deployment-safety theorem at specific points. This section locates those points and lists the conditions each theorem adds to or removes from the deployment-claim hypothesis set.

Conditions removed and added

Removed.

The deployment-claim hypothesis set, after the present theorems, no longer contains:

The free-floating bounded co-evolution assumption that \bar{M} is bounded by some |P|-independent constant. Replaced by Theorem 1. In , the property is now stated as condition (C7) and invokes the corollary form directly.
The HHI surrogate-adequacy assumption: the converse-direction sufficiency claim \mathrm{HHI}< H^* \Rightarrow \mathcal{R}_{\mathrm{press}}^c. Replaced by Theorem 2 under the conjunction (REP) + (DISP) + (COAL) and embedding compatibility (Assumption 2), now collected as (Concentration-Gap structural conditions).

Added.

The deployment-claim hypothesis set acquires five explicit hypotheses, each operationally auditable (with embedding compatibility certified as a precondition for the embedded representations used by (REP) and (DISP)):

(C7.RATE). Upper exposure-rate cap from Assumption 1. Added as a sub-clause of .
(REP). Representativeness from Definition 5.
(DISP). Bounded dispersion from Definition 6.
(COAL). Coalition closure from Definition 7, with explicit residual \eta_{\mathrm{latent}} for undetected coordination and post-partition counterparty cardinality N deployment-class bounded (clause (ii); structurally required for the HHI-to-\chi^2 translation to be |P|-independent).
Embedding compatibility from Assumption 2: an affine isometry (or bi-Lipschitz, with constants tracked through the bound) \phi: \mathcal{V}_\mathrm{utility} \to \mathcal{V} on the operationally active subspace. This is the structural prerequisite for transferring the algebraic distortion bound to \|P- T\| and thence to Goodhart slack.

(REP), (DISP), (COAL), and embedding compatibility are collected as the structural conditions for the Concentration-Gap selection bound; (C7.RATE) is the exposure-rate condition for the bounded co-evolution corollary.

Where each theorem enters Deployment Safety’s proof

Theorem 1 enters at Lemma 1’s co-evolution substitution.

substitutes the co-evolution correction term (\varepsilon_{\mathrm{coev}}^{\mathrm{nonres}})_+ \leq K_{\mathrm{coev}} \cdot \bar{M} into the four-channel decomposition; packages the result via the generic-X intensivity argument. Both rely on \bar{M} being a |P|-independent constant. The corollary supplies \bar{M}^{\mathrm{cum}} \leq C \cdot B_{\mathrm{clip}} \cdot \lambda_{\max}\cdot \tau_{\mathrm{meta}} with all factors deployment-class constants intensive in |P|, closing the substitution at Step 4 without a free-floating constant.

Theorem 2 enters at Layer 1 binding via I_5.

requires the converse-direction sufficiency \mathrm{HHI}< H^* \Rightarrow deployment outside \mathcal{R}_{\mathrm{press}}. The scoped selection theorem supplies the Goodhart-slack upper bound |g(T) - g(P)| \leq \mathrm{Lip}(g) \cdot (\rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \|\mu)} + \eta_\mathrm{latent}) under (REP) + (DISP) + (COAL) and embedding compatibility (Assumption 2); for uniform \mu on N counterparties, \chi^2 = N \mathrm{HHI}- 1, so \mathrm{HHI}< H^* yields a deployment-class constant ceiling on the slack.

Both theorems enter Deployment Safety’s sensitivity analysis.

’s failure-mode taxonomy decomposes into specific structural-condition failures: bounded-co-evolution failure factors through (C5.HOEFF) / (C7.RATE) / channel-projection-structure failures (each with audit-detectable signal), and Concentration-Gap structural-condition failure factors through (REP) / (DISP) / (COAL) / embedding-compatibility failures (the last via the bi-Lipschitz \phi certification of §6).

Deployment-claim shape

The structural shape of the deployment-safety theorem is preserved:

Eleven operational invariants I_1, \ldots, I_{11}.
Three-layer claim (static safe region / detection-and-correction / acknowledged residuals).
Five named residuals (R1)–(R5).
Cooperative-anchoring property and canonical tripartite substrate identification.
Operational conditions (C2)–(C6) and (C8)–(C12) of retain their content; (C1) and (C7) carry refined structural dependencies: (C1) now explicitly invokes (REP), (DISP), (COAL), and embedding compatibility via , and (C7) carries the (C7.RATE) sub-clause established here.

Theorem 1 and Theorem 2 close structural gaps in the deployment-claim hypothesis set without altering the deployment claim’s shape.

Audit infrastructure for the structural conditions

Four operational structural conditions are added to ’s deployment claim: (C7.RATE) (upper exposure-rate cap), (REP) (representativeness), (DISP) (bounded dispersion), and (COAL) (coalition closure); plus a Lipschitz embedding compatibility precondition (Assumption 2) that certifies \phi before (REP) or (DISP) can be evaluated in the embedded representation. Each admits an operational audit, paralleling the existing audit structure of : each audit produces an attestation on the verification ledger , with cadence calibrated to deployment epoch boundaries. Confidence-bound construction in the (REP), (DISP), and (COAL) procedures below follows standard one-sided upper-confidence-bound machinery .

(C7.RATE) and channel-projection audit

This audit certifies the two structural inputs to Theorem 1 other than (C5.HOEFF) and (C5.MULT) (which are part of ): the upper exposure-rate cap \lambda_{\max}, and the verification protocol’s channel-projection structure.

Procedure (rate cap).

Identify the per-step exposure event sources: ledger commit rate, action-execution rate, governance-event rate.
Inspect the hard mechanisms enforcing the rate ceiling — ledger commit caps, action-queue depth bounds, governance rate-limit middleware, scheduler quotas — and identify which substrate owns the enforcement (substrate-exclusivity from ).
Compute an upper-bound rate \lambda_{\max} from the certified mechanisms (not from observed historical traffic).
Document \lambda_{\max}, the enforcing mechanisms, and the substrate ownership in the audit attestation.

Empirical traffic windows over \tau_{\mathrm{meta}} may be used as monitoring evidence (e.g., to alert on approach to \lambda_{\max}), but cannot establish the deterministic deployment-class upper bound by themselves.
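The rate-cap procedure can be sketched as follows. This is an illustration under stated assumptions: the per-source caps are hypothetical values, summing them is valid only if every exposure event originates from exactly one listed source, and the flag selecting C = K_{\mathrm{ch}} - 1 over C = 1 is a placeholder for whichever case Theorem 1 prescribes for the channel at hand.

```python
def lambda_max_from_caps(certified_caps):
    """Combine hard per-source caps (events per unit time) into one ceiling.

    Summation is an assumption of this sketch, not a rule stated in the paper.
    """
    if not certified_caps or any(c < 0 for c in certified_caps.values()):
        raise ValueError("every exposure source needs a non-negative certified cap")
    return sum(certified_caps.values())

def coupling_bound(b_clip, lam_max, tau_meta, k_ch, use_cross_channel_constant):
    """Closed-form constant C * B_clip * lambda_max * tau_meta, C in {1, K_ch - 1}."""
    c = (k_ch - 1) if use_cross_channel_constant else 1
    return c * b_clip * lam_max * tau_meta

def approaching_cap(observed_rate, lam_max, alert_fraction=0.8):
    """Monitoring only: observed traffic never certifies lambda_max."""
    return observed_rate >= alert_fraction * lam_max

# Hypothetical certified mechanism caps, keyed by exposure event source.
caps = {"ledger_commit": 50.0, "action_execution": 200.0, "governance_event": 5.0}
lam_max = lambda_max_from_caps(caps)
bound = coupling_bound(b_clip=2.0, lam_max=lam_max, tau_meta=10.0,
                       k_ch=4, use_cross_channel_constant=True)
```

The split between `lambda_max_from_caps` and `approaching_cap` mirrors the distinction drawn above: the ceiling comes from enforced mechanisms, while observed rates feed alerts only.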

Procedure (channel-projection structure).

Enumerate the projection maps \pi_j: A \to V_j \cup \{*_j\} for j = 1, \ldots, K_{\mathrm{ch}} from the verification protocol’s event-classification policy.
Verify that \pi_j(a) = *_j holds for all actions a \notin A_j, and that \ell^{(j)}(a) = 0 on A \setminus A_j (zero-outside-engagement convention).
Document the pairwise overlap sets A_{j, j'} = A_j \cap A_{j'} for use by downstream audits.
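A minimal sketch of the channel-projection checks, assuming a finite action set and a classification policy exposed as two callables. The names `project`, `llr`, `OUTSIDE`, and the toy channels are hypothetical stand-ins for \pi_j, \ell^{(j)}, *_j, and the protocol's real policy.

```python
from itertools import combinations

OUTSIDE = None  # stands in for the null symbol *_j

def check_zero_outside_engagement(actions, engaged, project, llr):
    """List (j, a) pairs violating pi_j(a) = *_j or l^(j)(a) = 0 for a outside A_j."""
    violations = []
    for j, a_j in engaged.items():
        for a in actions:
            if a not in a_j and (project(j, a) is not OUTSIDE or llr(j, a) != 0):
                violations.append((j, a))
    return violations

def pairwise_overlaps(engaged):
    """Overlap sets A_{j,j'} = A_j & A_{j'} for every pair of channels."""
    return {(j, k): engaged[j] & engaged[k]
            for j, k in combinations(sorted(engaged), 2)}

# Toy policy with two channels over four actions.
actions = {"commit", "vote", "transfer", "noop"}
engaged = {1: {"commit", "transfer"}, 2: {"vote", "transfer"}}

def project(j, a):
    return a if a in engaged[j] else OUTSIDE

def llr(j, a):
    return 0.5 if a in engaged[j] else 0.0

violations = check_zero_outside_engagement(actions, engaged, project, llr)
overlaps = pairwise_overlaps(engaged)
```

An empty `violations` list is the pass condition for the second step; `overlaps` is the artifact the third step documents for downstream audits.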

Cadence.

Initial certification before deployment activation; re-certification when verification-protocol classification policy changes, when rate-limit mechanisms change, or when SPRT detection observes events at rates approaching \lambda_{\max}.

Failure mode.

Deployments where the exposure event rate cannot be bounded by certified mechanisms (e.g., agent-controlled rate-limit overrides), or where the channel projection policy admits \ell^{(j)}(a) \neq 0 off A_j, fall outside Theorem 1’s scope. Mitigation: hard rate limits enforced by substrate-exclusive infrastructure; verification-protocol classification redesign to respect the zero-outside-engagement convention.

Embedding \phi certification (precondition for REP and DISP)

(REP) and (DISP) are stated relative to the embedded deviation field \Delta_c = \phi(U_c) - \phi(W) in the Hilbert space \mathcal{V} (Assumption 2). The embedding \phi must be certified before (REP) or (DISP) can be audited.

Procedure.

Specify the embedding \phi: \mathcal{V}_\mathrm{utility} \to \mathcal{V} in deployment documentation.
Verify \phi is affine isometric on the operationally active subspace P^\mathrm{act}: \|\phi(x) - \phi(y)\| = \|x - y\|_{\mathcal{V}_\mathrm{utility}} for x, y \in P^\mathrm{act}, with linearity up to a fixed translation.
If \phi is only bi-Lipschitz (not isometric), document the bi-Lipschitz constants (\underline{L}, \bar{L}) and apply them as multipliers to the (REP) / (DISP) bounds downstream.
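The distance-ratio check behind the second and third steps can be illustrated on sample points. This sketch assumes \phi is available as a callable on finite-dimensional coordinates; the two toy maps are hypothetical. Sampled ratios can falsify isometry or estimate the constants, but they do not by themselves certify (\underline{L}, \bar{L}) over all of P^\mathrm{act}.

```python
import math
from itertools import combinations

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def empirical_bilipschitz(phi, points):
    """Min and max of ||phi(x) - phi(y)|| / ||x - y|| over distinct sample pairs."""
    ratios = [dist(phi(x), phi(y)) / dist(x, y)
              for x, y in combinations(points, 2) if dist(x, y) > 0]
    return min(ratios), max(ratios)

def rotate_translate(x):
    """Toy affine isometry on R^2: rotation by 90 degrees plus a fixed shift."""
    return (-x[1] + 3.0, x[0] - 1.0)

def stretch(x):
    """Toy bi-Lipschitz, non-isometric map: scales the first axis by 2."""
    return (2.0 * x[0], x[1])

samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
lo_iso, hi_iso = empirical_bilipschitz(rotate_translate, samples)  # both equal 1
lo_str, hi_str = empirical_bilipschitz(stretch, samples)           # spread out
```

The observed minimum is an upper estimate of \underline{L} and the observed maximum a lower estimate of \bar{L}, so a conservative attestation would need an analytic argument or a margin on top of the sampled values.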

Failure mode.

If \phi is not bi-Lipschitz on P^\mathrm{act}, the algebraic kernel of Theorem 2 cannot be transferred to the deployment claim’s \|P- T\| form via Lipschitz transfer . Mitigation: choose a different embedding, or restrict the deployment’s scope to a subspace where bi-Lipschitz \phi exists.

When \phi is well-conditioned.

Utility representations that are linear (or near-linear) in the deployment’s underlying decision variables typically admit well-conditioned \phi: identity or affine maps suffice, and the bi-Lipschitz constants (\underline{L}, \bar{L}) are close to 1. High-dimensional learned representations (neural utility models, embedding vectors, opaque scalarizations of multi-attribute preferences) may not: distortion constants can be large, and the bi-Lipschitz form’s downstream multiplication of (\bar{L}/\underline{L}) through the (REP) and (DISP) bounds can inflate the Goodhart-slack ceiling toward operational uselessness. Practical deployments should target one of: (i) an explicit affine utility model where \phi is identity; (ii) a low-dimensional projection of a high-dimensional representation onto the operationally active subspace, with the projection’s bi-Lipschitz constants bounded by audit; or (iii) a restricted deployment scope where the active subspace admits a well-conditioned embedding even if the full representation does not. Deployments that cannot achieve one of these lie outside Theorem 2’s practical applicability even if they nominally satisfy the embedding compatibility precondition.

(REP) audit (representativeness)

Procedure.

Define the base population measure \mu: the auditor’s fair-or-natural distribution over admissible counterparties (typically uniform over S1-admissible participants , or weighted by some natural allocation).
In the embedded representation under the certified \phi, produce a one-sided upper confidence bound on \|\bar\Delta\| where \bar\Delta := \mathbb{E}_\mu[\phi(U_c) - \phi(W)]. The bound is computed by sampling counterparties under \mu (or reweighting observed-attestation samples to \mu with stated coverage assumptions) from utility-disclosure attestations on the verification ledger; conservative bounding is required because (REP) is a hypothesis the deployment claim conditions on, not a quantity the deployment claim merely monitors.
Certify \|\bar\Delta\| \leq \rho_\mathrm{rep} at the audit’s chosen confidence level for a deployment-class threshold \rho_\mathrm{rep}.
Document \rho_\mathrm{rep}, the confidence level, and the sampling/reweighting design in the audit attestation.
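One way to produce the upper confidence bound in the second step is a percentile bootstrap, sketched below. This is an assumed choice, not the paper's prescribed construction: it treats the supplied deviations as i.i.d. draws under \mu, and a percentile bootstrap is not guaranteed to be conservative at small sample sizes, so an audit may prefer a concentration-inequality bound. All names and defaults are hypothetical.

```python
import math
import random

def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def bootstrap_ucb_mean_norm(deltas, alpha=0.05, n_boot=2000, seed=0):
    """Percentile-bootstrap one-sided upper bound on ||E_mu[Delta_c]||.

    Assumes `deltas` are i.i.d. draws of Delta_c = phi(U_c) - phi(W) under mu.
    """
    rng = random.Random(seed)
    n = len(deltas)
    stats = sorted(
        norm(mean_vector([deltas[rng.randrange(n)] for _ in range(n)]))
        for _ in range(n_boot)
    )
    index = min(n_boot - 1, math.ceil((1.0 - alpha) * n_boot) - 1)
    return stats[index]

def certify_rep(deltas, rho_rep, alpha=0.05):
    """(REP) passes in this sketch when the upper bound is at most rho_rep."""
    return bootstrap_ucb_mean_norm(deltas, alpha=alpha) <= rho_rep
```

If the attestation samples were not drawn under \mu, the resampling step would need importance weights toward \mu, with the coverage assumptions stated in the attestation as the procedure requires.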

Cadence.

Initial calibration before deployment; re-calibration on counterparty-population changes (additions, removals, or substantial weight-redistribution).

Failure mode.

The counterparty population’s mean utility deviates systematically from the welfare-relevant truth W. Mitigation: counterparty-selection-process diversification, category-coverage audit , or restriction of the deployment’s scope to a counterparty subset where (REP) holds.

(DISP) audit (bounded dispersion)

Procedure.

In the embedded representation under the certified \phi, produce a one-sided upper confidence bound on the \mu-population variance \int \|\Delta_c - \bar\Delta\|^2 \,d\mu(c), where per-counterparty deviations \Delta_c = \phi(U_c) - \phi(W) are read from utility-disclosure attestations. As with (REP), the variance is taken with respect to \mu, not the trade-flow weighting w: audit sampling under \mu (or reweighting to \mu with stated coverage) is required.
Certify the variance \leq \sigma^2 at the audit’s chosen confidence level for a deployment-class constant \sigma.
Document \sigma, the confidence level, and the sampling/reweighting design in the audit attestation.
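The reweighting-to-\mu step can be sketched with self-normalized importance weights. The sketch assumes each observed deviation comes with the probability under which its counterparty was sampled (for example, the trade-flow weight w_c) and its mass under \mu. It returns a point estimate only; the audit would still wrap it in a one-sided upper confidence bound before comparing with \sigma^2. Names and data are hypothetical.

```python
def mu_variance_reweighted(deltas, sampling_probs, mu_probs):
    """Self-normalized importance-weighted estimate of int ||Delta_c - Delta_bar||^2 dmu.

    deltas[i] is one observed draw; sampling_probs[i] is the probability with
    which its counterparty is drawn, and mu_probs[i] is its mass under mu.
    """
    raw = [m / q for m, q in zip(mu_probs, sampling_probs)]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(deltas[0])
    mean = [sum(w * d[i] for w, d in zip(weights, deltas)) for i in range(dim)]
    return sum(
        w * sum((d[i] - mean[i]) ** 2 for i in range(dim))
        for w, d in zip(weights, deltas)
    )

# Four draws under w = (0.5, 0.25, 0.25): counterparty A appears twice.
deltas = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.0, 0.0]]
sampling_probs = [0.5, 0.5, 0.25, 0.25]
mu_probs = [1 / 3, 1 / 3, 1 / 3, 1 / 3]   # uniform mu over three counterparties

var_hat = mu_variance_reweighted(deltas, sampling_probs, mu_probs)
passes_point_check = var_hat <= 1.0 ** 2  # illustrative sigma = 1.0
```

Here the over-sampled counterparty A is down-weighted so that each of the three counterparties carries total weight 1/3, which is what taking the variance with respect to \mu rather than w requires.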

Cadence.

Continuous attestation under the calibrated \mu-sampling design over a rolling calibration window; confidence-bound violations trigger audit re-calibration. Observed trade-flow (w) statistics may serve as a monitoring signal but do not by themselves certify the \mu-variance hypothesis.

Failure mode.

High dispersion across counterparties (some counterparties’ utilities deviate far from W, even when the population mean is centered). The bound in Theorem 2 weakens but does not collapse; the deployment claim accommodates large \sigma at the cost of a larger Goodhart-slack ceiling.

(COAL) audit (coalition closure)

Procedure.

Run coalition-detection analysis on the verification ledger: identify counterparties with correlated trade flows above a deployment-class threshold (using repeated-beneficiary linkage, governance-vote alignment, attestation-source clustering).
Partition \mathcal{C} into equivalence classes (coalitions) by the detected correlation structure.
Re-compute \mathrm{HHI} on the post-partition (coalition-level) weights.
Certify clause (ii) of (COAL). Document the deployment-class bound N_{\max} on the post-partition counterparty cardinality, together with the onboarding / governance policy that prevents N from scaling with |P|. Document the choice of base measure \mu as the uniform distribution on the post-partition counterparties (the structural prerequisite for the HHI-to-\chi^2 translation \chi^2(w \,\|\, \mu) = N \mathrm{HHI}(w) - 1). Deployments that prefer to commit directly to a \chi^2 ceiling certify \chi^2(w \,\|\, \mu) \leq \Xi for a deployment-class constant \Xi instead, with \mu documented as whatever fair-or-natural base measure the audit selects.
Produce a one-sided upper confidence bound on the latent-coalition residual \eta_\mathrm{latent} from the audit’s detection-power calibration: bound \eta_\mathrm{latent} \leq p_\mathrm{miss} \cdot \|\Delta\|_{\max} where p_\mathrm{miss} is the threshold-miss probability for the worst-case undetected coalition size and \|\Delta\|_{\max} is the worst-case per-counterparty deviation magnitude in the deployment’s threat model.
Document the coalition partition, N_{\max} (or \Xi), and the \eta_\mathrm{latent} confidence bound (with confidence level) in the audit attestation.
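The partition, re-computation, and residual steps can be sketched as follows, assuming the coalition-detection analysis has already produced a list of linked counterparty pairs. The union-find partition is one natural way to form equivalence classes from pairwise links; it is an assumption of this sketch, as are the names and the toy data.

```python
def coalition_partition(counterparties, linked_pairs):
    """Union-find partition: linked counterparties end up in one coalition."""
    parent = {c: c for c in counterparties}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in linked_pairs:
        parent[find(a)] = find(b)
    groups = {}
    for c in counterparties:
        groups.setdefault(find(c), set()).add(c)
    return list(groups.values())

def coalition_hhi(weights, coalitions):
    """HHI recomputed on coalition-level (post-partition) weights."""
    return sum(sum(weights[c] for c in group) ** 2 for group in coalitions)

def eta_latent_bound(p_miss, delta_max):
    """Residual bound eta_latent <= p_miss * ||Delta||_max from the procedure."""
    return p_miss * delta_max

# Toy trade-flow weights; a, b, c are detected as linked.
weights = {"a": 0.2, "b": 0.2, "c": 0.2, "d": 0.4}
coalitions = coalition_partition(weights, [("a", "b"), ("b", "c")])
hhi_before = sum(w * w for w in weights.values())
hhi_after = coalition_hhi(weights, coalitions)
eta = eta_latent_bound(p_miss=0.01, delta_max=5.0)
```

In the toy data the counterparty-level HHI understates concentration relative to the coalition-level value, which is why the procedure re-computes \mathrm{HHI} after partitioning.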

Cadence.

Continuous coalition detection on the verification ledger; audit-cadence alerts on detected coalition-formation events; re-partition triggered when new coalitions form.

Failure mode.

Many small counterparties acting in coordinated fashion (a hidden coalition) drive the effective \mathrm{HHI} above the audited threshold. The deployment claim’s bound on Goodhart slack picks up the \eta_\mathrm{latent} residual; deployments where \eta_\mathrm{latent} cannot be bounded operationally are outside Theorem 2’s scope.

When \eta_\mathrm{latent} is operationally negligible.

The Goodhart-slack bound has the additive form \mathrm{Lip}(g) \cdot (\rho_\mathrm{rep} + \sigma\sqrt{\chi^2(w \|\mu)} + \eta_\mathrm{latent}), so \eta_\mathrm{latent} dominates the bound whenever it is large compared with the other terms. For \eta_\mathrm{latent} to remain operationally negligible the deployment should target \eta_\mathrm{latent} \leq \epsilon \cdot (\rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \|\mu)}) for some small \epsilon \ll 1, which translates via \eta_\mathrm{latent} \leq p_\mathrm{miss} \cdot \|\Delta\|_{\max} into a detection-power requirement on the audit’s coalition-detection machinery: p_\mathrm{miss} \cdot \|\Delta\|_{\max} \;\leq\; \epsilon \cdot \big(\rho_\mathrm{rep} + \sigma \sqrt{\chi^2(w \|\mu)}\big). Deployments whose threat model admits large \|\Delta\|_{\max} (adversarial counterparties with large utility deviations from W) require correspondingly tighter p_\mathrm{miss} to keep \eta_\mathrm{latent} dominated. The audit’s calibration documentation should make this trade-off explicit. When an audit cannot achieve the inequality, \eta_\mathrm{latent} may dominate the bound operationally even though the algebraic bound formally holds.
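The detection-power requirement can be rearranged into a ceiling on p_\mathrm{miss}. The following is a small sketch of that rearrangement; the function name and the numeric inputs are illustrative assumptions.

```python
import math

def required_p_miss(epsilon, rho_rep, sigma, chi2, delta_max):
    """Largest p_miss with p_miss * delta_max <= epsilon * (rho_rep + sigma * sqrt(chi2))."""
    if delta_max <= 0:
        raise ValueError("delta_max must be positive")
    return min(1.0, epsilon * (rho_rep + sigma * math.sqrt(chi2)) / delta_max)

# Illustrative constants: doubling the worst-case deviation halves the allowed miss rate.
p_small_threat = required_p_miss(epsilon=0.1, rho_rep=0.1, sigma=0.2, chi2=0.25, delta_max=2.0)
p_large_threat = required_p_miss(epsilon=0.1, rho_rep=0.1, sigma=0.2, chi2=0.25, delta_max=4.0)
```

The inverse dependence on \|\Delta\|_{\max} is the trade-off the calibration documentation is asked to make explicit.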

Integration with existing audit infrastructure

Several of these new audits overlap with existing audits in :

(C7.RATE) and channel-projection audit: integrates primarily with Audit 4 (bounded co-evolution calibration, dual-mode), which is the deployment-tooling home for the structural inputs to Theorem 1; Audit 7 (SPRT-applicability + gap-growth + clock comparability) supplies the (C5.HOEFF) clip radius and related SPRT-side calibration inputs but not the (C7.RATE) upper bound.
(REP) audit: integrates with Audit 3 (cooperative-vs-redundancy audit), since both rely on category-coverage analysis.
(DISP) audit: new, but uses utility-disclosure attestations that the existing infrastructure already produces.
(COAL) audit: integrates with I_9 (substrate-exclusivity observability) and I_{10} (coverage/materiality routing) of , since coalition detection uses ledger-linkage analysis those invariants already specify.

The audit hooks above are the minimum needed for Theorem 1 and Theorem 2 to apply; consolidated audit-tooling combining these with ’s existing audits is the natural target for further operational-tooling work.

Discussion

What the universal Concentration-Gap conjecture would require

The Concentration-Gap conjecture of is universal: optimization pressure on the proxy correlates with proxy-truth gap exploitation, in any deployment context. Theorem 2 discharges only a scoped version under (REP), (DISP), (COAL), embedding compatibility (Assumption 2), and the Lipschitz-transfer machinery of . The universal conjecture would additionally require:

A formal optimizer / selection model that captures “optimization pressure” as a structural property of agents rather than as the empirical correlate (HHI) used here. Existing candidates include the mesa-optimization framework of and the formal-Goodhart machinery of , neither of which alone yields a full proof.
A characterization of “gap exploitation” robust to the specific functional forms of P and T. Theorem 2 works at the proxy-truth-norm level; the universal conjecture would need to characterize the alignment-property-level slack |g(T) - g(P)|, which requires inverse-Lipschitz structure on g not currently available in the GFM apparatus.

This is a research-program target rather than a technical extension. The scoped version is achievable; a universal model-free version is not, given current machinery.

What the g-level reverse direction would require

Theorem 2’s reverse direction operates at the \|P- T\| level: under high HHI plus separation plus directional cancellation, the proxy-truth norm is bounded below. The Lipschitz transfer of is forward-only, so this norm-level lower bound does not transfer to a lower bound on |g(T) - g(P)|.

Strengthening the reverse to the g-level would require an additional non-degeneracy condition on g: a constant g provides an immediate counterexample (|g(T) - g(P)| = 0 regardless of \|P - T\|). The natural condition is bi-Lipschitz invertibility of g on the relevant subspace, but this is a strong restriction on the alignment property’s functional form.

For deployment-claim purposes, the norm-level reverse is sufficient: it tells operators the regime in which non-trivial proxy-truth distortion is structurally guaranteed, which the detection layer (Layer 2) of can act on. The g-level reverse would strengthen the result but is not required for the deployment claim’s operational utility.

What the latent-coalition residual would require

The (COAL) condition partitions counterparties whose correlations are detected by audit; the latent-coalition residual \eta_\mathrm{latent} absorbs undetected-coordination risk as a norm bound on the residual error term e_\mathrm{latent} \in \mathcal{V} from Assumption 2: \|e_\mathrm{latent}\| \leq \eta_\mathrm{latent}.

Bounding \eta_\mathrm{latent} in a specific deployment requires calibrating the audit’s detection power against a worst-case deviation magnitude: \eta_\mathrm{latent} \leq p_\mathrm{miss} \cdot \|\Delta\|_{\max} where p_\mathrm{miss} is the probability that a coalition of given size escapes detection thresholds and \|\Delta\|_{\max} is the worst-case per-counterparty deviation in the deployment’s threat model. The audit specifies the detection thresholds, the detection-power model, and the one-sided upper-confidence procedure in (COAL)’s procedure (§6); p_\mathrm{miss} and \|\Delta\|_{\max} are calibrated through that procedure rather than read off raw operating statistics, and the certification semantics this paper inherits do not derive their values from first principles.

Connection to existing Goodhart literature

The Concentration-Gap selection theorem fits into the broader formal-Goodhart literature in a specific position:

It extends Manheim & Garrabrant’s Adversarial Goodhart variant by giving an explicit upper bound on the proxy-truth divergence in terms of selection concentration, where Manheim & Garrabrant give a qualitative taxonomy.
It complements El-Mhamdi & Hoang’s tail-based formal Goodhart by addressing the selection-induced gap (where this paper sits) rather than the tail-event gap (where El-Mhamdi & Hoang sit). The two are independent failure modes; both gaps must be bounded for the deployment claim to bind.
It fits ’s Strong/Weak/Benign Goodhart classification at the “Weak Goodhart” boundary: the proxy-truth gap is bounded under structural conditions, but the conditions can fail (Strong Goodhart) or be irrelevant (Benign Goodhart).

The selection theorem does not subsume any of these works; it combines their perspectives into a single deployment-relevant result.

Connection to bounded co-evolution failure modes

Theorem 1 replaces the free-floating-constant assumption of bounded co-evolution with a derivation from primitive parameters. The failure modes of bounded co-evolution now factor through specific structural conditions:

(C5.HOEFF) failure: per-step LLR is unbounded, so B_{\mathrm{clip}} is undefined. Detection: SPRT-applicability audit (paper 10 Audit 7) catches this.
(C7.RATE) failure: exposure event count grows faster than \lambda_{\max}\cdot \tau_{\mathrm{meta}}. Detection: (C7.RATE) audit (this paper, §6) catches this.
Channel-projection failure: ledger event-classification policy does not respect the zero-outside-engagement convention. Detection: structural-projection audit (this paper, §6) catches this.

This decomposition is the structural payoff of Theorem 1: bounded-co-evolution failure is no longer an opaque single-mode risk but a set of three specific structural-condition failures, each with its own detection audit.

Author Contributions

Teague Lasser owns the paper’s intellectual direction and is responsible for all claims made.

Claude Opus 4.7 (Anthropic) drafted the paper under that direction.

GPT 5.5 (OpenAI) served as cold technical reviewer for proof errors and claim mismatches.

Transparency note. Both AI systems operated as tools under human direction. Neither system has continuity across sessions, can take responsibility for the work in the sense required by most venue authorship policies, or can respond to reviewer queries independently. They are listed as authors to accurately represent their contributions to the intellectual content of the paper, not to claim that they meet all criteria of traditional academic authorship. The corresponding author for all inquiries is Teague Lasser.