Corrective: Change the metric, and behavior changes with it. Gaming is a governance problem.
Both Cell II and Cell IV sit within the performativity tradition. They differ on what participants can see. Cell II names the cynical-maintenance variant. Practitioners understand that the ritual is performative, they continue to perform it, and they navigate the gap between performance and substance through tacit competence. Cabantous & Gond (2011) call this rational decision-making as performative praxis. Cell IV names the Barnesian variant (MacKenzie, 2006; Callon, 2007) operating in self-concealing conditions. The engagement constitutes the criteria by which the engagement is judged to be working, and the practitioner who would notice the constitution has been transformed by the engagement that constituted it. Sincere belief replaces knowing maintenance.
PSF is not outside performativity. PSF is the variant in which the framework has made itself true, and the actor cannot detect the circularity because the same engagement constitutes both the framework and the actor.
Each of the first three modes leaves something the actor retains. In Goodhart, the actor knows the metric is a proxy. In liturgy, the actor knows the ritual is performative. In manipulation, the actor on one side knows the metric is engineered. Knowledge is unevenly distributed across the three, but somewhere in the system, knowledge persists, and the accountable criterion stays in view.
Proxy seduction is the mode where the knowledge itself is what the engagement has reshaped. Nobody is gaming, because the gap is not visible to game against. The proxy metric reads as the accountable criterion. The judgment that would catch the displacement is the judgment the same engagement can erode, and that vulnerability is what makes the mode self-concealing.
Failure trips an alarm. Proxy seduction disarms it.
| Dimension | Goodhart's Law | Corporate liturgy | Manipulation | Proxy seduction |
|---|---|---|---|---|
| What people believe | Know the proxy is a proxy | See through the ritual | Trust the metric in good faith | Sincerely believe the metric |
| Source of the gap | Strategic optimization | Embedded tools and conventions | Deliberate engineering by another party | Constituted by the engagement itself |
| What happens to judgment | Unchanged | Unchanged (cynical but competent) | Unchanged (deceived but intact) | Transformed by the engagement |
| Self-correction | Change the metric | Navigate tacitly | Expose the manipulation | The process prevents recognition |
| Trajectory | Stable | Stable | Stable (until exposed) | Degrading (capacity erodes) |
Organizations adopt formal structures not because they work but because they conform to institutionalized expectations. To manage the resulting gap between ceremony and practice, organizations decouple: they maintain the formal structure for legitimacy while actual work proceeds on its own terms. The critical feature is that organizational actors know the gap exists. Decoupling is a conscious management strategy. The front-stage conforms to institutional myths while the back-stage operates according to technical demands. Both sides are visible to insiders.
Organizational identity labels persist while the meaning underneath reconstitutes. A team still calls itself "craftspeople" while "craft" has quietly come to mean "shipping fast." Nobody decided to decouple. The label changed its referent through a process Gioia et al. call adaptive instability. PSF extends this to metrics: the metric label is stable, the evaluative content has shifted, and nobody notices because there is nothing dramatic to notice. The ceremony has become the operation, and nobody registered the switch.
Organizations face multiple institutional logics that prescribe incompatible courses of action. Greenwood et al. mapped how field position, organizational structure, and identity shape whether organizations compartmentalize, compromise, or resist. Pache & Santos deepened this with "selective coupling": hybrid organizations (like social enterprises) do not just ceremonially decouple. They strategically adopt intact practices from each competing logic, assembling a working composite. Both accounts assume that the competing logics are recognizable to organizational actors. The logics may conflict in values and prescriptions, but they rely on broadly intelligible forms of evidence, argumentation, and professional reasoning. Disputes can be negotiated precisely because the standards for evaluating claims remain legible across logics.
AI engagement does not arrive as a competing logic that announces itself as different. It arrives as an improvement within the existing evaluative framework. Speed, output volume, coverage, consistency: these are already valued. The engagement makes them more visible and more achievable. The organization does not experience a clash between two recognizable logics. It experiences confirmation that its existing logic is working better than before. The rival logic never presents itself as a rival. It constitutes the confirming evidence for the logic already in place. By the time the shift becomes visible (if it ever does), evaluative capacity has already eroded.
AI introduces a rival "epistemic regime" into organizations: a historically situated, institutionally stabilized configuration that governs how knowledge is produced, evaluated, and authorized. The computational regime (grounded in statistical inference, large-scale data, predictive performance) competes with the professional regime (grounded in experiential judgment, disciplinary training, situated reasoning). Sergeeva et al. argue that organizations are arenas where these regimes contend for legitimacy. The contest is political, visible, and consequential. Actors resist, champion, compartmentalize, or capitulate. The framework draws on institutional logics, Foucault, STS, and sensemaking to describe what happens when the grounds of knowing are themselves in dispute.
PSF describes what happens before the contest Sergeeva et al. theorize becomes visible. Proxy seduction operates in the period when the computational regime has not yet announced itself as a rival. Organizations are not contending with a competing way of knowing. They are experiencing productivity gains that register as improvements within their existing evaluative framework. The epistemic shift is real, but it is self-concealing: the proxy metrics generated by AI engagement look like evidence that the incumbent regime is working better than ever. By the time the contest becomes legible (the moment Sergeeva et al. begin their analysis), evaluative capacity has already degraded. PSF explains why so many organizations arrive at the regime contest Sergeeva et al. describe with diminished capacity to navigate it.
Rational decision-making persists in organizations not because it works but because it is continuously performed into existence through the coordinated deployment of tools, theories, and experts. Rationality is a "performative praxis": it requires active production through three mechanisms (conventionalization, engineering, and commodification). The key insight is that participants in this production are not naive. They may recognize the ritual quality of the process. The production succeeds because rationality serves real organizational functions (legitimation, coordination, accountability) regardless of whether it produces optimal decisions. The actors can see through the performance and choose to maintain it.
PSF identifies the version where nobody sees through the performance because the performance has constituted the evaluative frame by which the performance is judged. The AI tool produces speed gains. Speed becomes the primary measure of success. The tool is then judged successful because speed went up. This is Barnesian performativity (MacKenzie, 2006): the framework made itself true. Cabantous & Gond describe a ritual that participants maintain knowingly. PSF describes a circularity that participants cannot detect, because the very criteria they would use to detect it have been constituted by the process under evaluation.
A substantial literature presents AI as a remedy for bounded rationality (Simon, 1955). Algorithms process more information, compute complex trade-offs, and synthesize knowledge at scale, freeing humans from the constraints of satisficing and cognitive bias. Shrestha et al. propose principles for deploying deep learning to augment organizational decisions. Laamanen et al. use the attention-based view to argue that AI broadens organizational attention, democratizes strategic processes, and accelerates feedback. Both accounts treat the human evaluator as a stable reference point: someone who can assess whether AI improved the decision and redirect resources accordingly.
PSF asks what happens when the "remedy" constitutes the evaluative frame by which the remedy is judged to be working. If AI engagement produces the metrics that organizations use to assess decision quality (speed of resolution, volume of output, consistency of recommendations), then the conclusion "AI improved our decisions" is circular. The evaluator is not stable. The evaluator has been transformed by the very engagement whose effects the evaluator is trying to assess. As Sergeeva et al. (2026) note, the bounded rationality framing implicitly privileges "the ideals of optimization, quantification, and computational inference as naturally more desirable ways of knowing," but PSF goes further: the framing does not just privilege computational knowing, it prevents the organization from noticing the privileging.
Paradox theory, as Raisch & Krakowski apply it to AI, frames automation and augmentation as interdependent rather than opposed. Organizations that overemphasize automation lose human capability. Organizations that overemphasize augmentation underutilize AI capacity. The productive path integrates both, creating complementarities across time and space. The framework operates at the task level: which tasks should be automated, which augmented, and how should the balance shift over time? The implicit assumption is that the organization can observe the consequences of its choices and adjust. If automation degrades quality, someone notices. If augmentation overwhelms practitioners, someone intervenes. The organization retains the evaluative capacity to manage the paradox.
PSF operates beneath the task level. The question is not whether a task is automated or augmented but whether the organization can still tell the difference between genuine augmentation and scaffolded dependency. When AI produces outputs that look like competent work product, how does the organization distinguish augmentation (practitioner skill enhanced) from substitution (practitioner skill bypassed)? The proxy metrics generated by engagement do not distinguish these cases. Both produce the same visible outputs. The erosion of evaluative capacity means the organization progressively loses the ability to navigate the very paradox Raisch & Krakowski identify.
Six parallels, one pattern. In every case, the existing framework assumes a stable evaluator: an organizational actor (or collective) that can see what is happening, assess the consequences, and respond. Goodhart assumes the gamer can see the gap. Institutional complexity assumes actors can identify the competing logics. The automation-augmentation paradox assumes someone can tell whether AI is helping or hurting. Epistemic regime theory assumes the contest is legible.
PSF identifies the constitutive version of each: the case where the engagement has already transformed the evaluator's capacity to see. The gap is invisible not because information is missing but because the evaluative frame through which information would be interpreted has been reconstituted by the very process under evaluation.
This is not one alternative to one framework. It is a systematic blind spot across an entire family of organizational theories. They all assume evaluative continuity. PSF is the theory of what happens when that assumption fails.
These frameworks were not careless. They were built on empirical settings where the assumption of evaluative continuity held, because those settings had physical substrates that generated self-announcing feedback when something went wrong. Robotic surgery (Sergeeva & Faraj, 2012): the surgeon loses haptic feedback and feels the absence immediately. Pharmacy automation (Barrett et al., 2012): medication errors produce visible patient outcomes. ERP implementations: broken workflows halt production. In each case, the disruption of practiced knowing was observable because the domain's material properties forced it into view.
AI engagement in knowledge work is structurally different. The outputs are linguistic, aesthetic, strategic, evaluative. They exist in registers where quality is judgment-dependent, not physically verifiable. Code that compiles is not code that solves the right problem. A strategy that is coherent is not a strategy that is sound. A diagnosis that matches the benchmark is not a diagnosis that captures the clinical picture. There is no equivalent of tissue resistance going missing. The proxy metrics fill the perceptual space where the felt absence would have been.
AI is not just another technology that mediates practiced knowing. It is the first technology that mediates practiced knowing in domains where the loss of that knowing cannot be detected from the outputs alone. That is what makes it the revealing case for organizational theory, not merely a new case.
The strategic-vs-constitutive flow diagram that sat at the foot of this page is being rebuilt separately. It still illustrated the earlier "judgment stock depletes" framing and predates the METR / colonoscopy evidence split, so it was removed rather than patched. The replacement should show the self-concealing path, with METR carrying the proxy-criterion gap and its concealment, and the colonoscopy withdrawal trial carrying the capacity erosion.