Four Modes of Proxy-Criterion Decoupling

How metrics separate from what they claim to measure, and whether anyone notices. Two axes: what creates the gap, and can the people involved see it?
WHAT CREATES THE GAP
Intentional | Structural
I
Goodhart's Law
Pattern: Teaching to the test. Juking crime stats. Optimizing click-through rates that do not track purchase intent.
Corrective: Change the metric, and behavior changes with it. Gaming is a governance problem.
A teacher graded on test scores teaches to the test, knowing the test does not measure what learning is for. A precinct graded on the crime rate reclassifies felonies as misdemeanors, knowing the paperwork does not change what happens on the street. The metric and the meaning have come apart, and the actor is the one pulling them apart. Goodhart's Law captures the pattern: any measure routinely used as a target loses its informational value, because the people held to it know it is a proxy and act accordingly. The accountable criterion stays in view throughout. The fix is incentive design, audit, and monitoring. The actor still knows what the metric was for.
Goodhart 1975, Strathern 1997
II
Corporate liturgy
Pattern: Strategic planning cycles that run for months and produce alignment more than decisions. Strategy decks that lay out the case for a decision already made. Capital-approval models produced because the process requires a model, not to drive the choice.
Corrective: Experienced practitioners navigate the gap tacitly. Removing the ritual costs more than maintaining it.
The strategic planning cycle runs for months, generating draft after draft through round after round of review, and everyone treats completing the process as the point rather than any decision it reaches. The strategy deck lays out the case for a decision the senior team has already taken. The capital-approval model gets built because the investment process requires a model before any project can proceed, and everyone treats producing it as a step to be completed rather than a question to be answered. Experienced practitioners are not deceived. They know the work is part performance, and they maintain the performance because it serves real organizational functions: alignment, governance, legitimation, audit trail. Cabantous and Gond (2011) call this rational decision-making as performative praxis, where rationality is continuously produced through tools, theories, and experts. What separates liturgy from gaming is that nobody is steering a number toward a target. The number is produced because the ritual calls for a number, not to hit a mark someone has in mind. The competent actor can see through the production and choose to maintain it. The ritual stays because removing it is harder than performing it, and because the ritual does work the organization needs done, even if not the work it claims to do.
Cabantous & Gond 2011 (performative praxis)
III
Manipulation
Pattern: Algorithmic management that obscures its decision criteria. Dark patterns that steer user behavior. Asymmetric pricing.
Corrective: Transparency, regulation, audit, whistleblowing. The fix targets the asymmetry, not the metric.
A gig platform shows estimated earnings that bear no consistent relationship to the actual payout schedule. A pricing engine charges different shoppers different prices for the same product, where the variable is the shopper rather than the product. Someone has engineered the gap deliberately, and the people on the receiving end do not know it is there. What makes this manipulation rather than seduction is that the victim's own criterion stays intact. Show a driver the real payout schedule and sound judgment returns at once, with nothing to rebuild. The damage sits in the channel, not in the evaluator. Kellogg, Valentine, and Christin (2020) describe the pattern as algorithmic management that obscures its own decision criteria. Knowledge of the gap is real, but it lives with the party that benefits from the gap rather than the party absorbing the cost. The corrective is structural: transparency, regulation, audit, whistleblowing. What needs to change is who knows what, not what gets measured.
Kellogg, Valentine & Christin 2020
IV
Proxy seduction (PSF)
Pattern: Colonoscopy withdrawal trial: unaided detection fell once the AI was removed. METR developers felt 20% faster but measured 19% slower. BCG consultants stayed confident in wrong answers beyond the AI frontier.
Corrective: Engineered braking: verification loops independent of the engagement, evaluation calibrated to accountable criteria, and boundary activity.
METR's developers were experienced engineers working on open source projects they knew well. They predicted AI assistance would speed them up by 24%. After using it on real tasks, they perceived a 20% speedup. Measured against the clock, they were 19% slower. The gap between what they perceived and what the clock recorded was 39 percentage points. Nobody was gaming. Nobody was lying. The engagement made speed feel responsive to effort in real time, and the developers' subjective sense of speed, which had been a reliable signal across their careers, registered the new feeling as evidence of the old criterion. Proxy seduction is the pattern. AI engagement constitutes a set of attractive proxy metrics (speed, volume, apparent certainty) that displace the criteria the work is accountable to: code that holds under stress, contracts that hold under challenge, diagnoses that match the clinical picture. The displacement operates through sincere belief, not strategic gaming. The same engagement that produces the proxies also erodes the judgment that would notice they are not the criteria. There is no actor pulling the metric away from the meaning. The engagement does the pulling, and the engagement also produces the practitioner who reads the new metric as the old meaning.
PSF (Bapat et al., in development)
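The METR figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names and the sign convention (positive for a speedup, negative for a slowdown) are assumptions made here, and the forecast gap is derived from the quoted numbers rather than reported in the text.

```python
# Figures as quoted in the text above. The sign convention is an assumption
# of this sketch: positive = speedup, negative = slowdown.
predicted_speedup_pct = 24   # developers' forecast before using AI
perceived_speedup_pct = 20   # developers' self-report after the tasks
measured_speedup_pct = -19   # measured against the clock: 19% slower


def gap_points(believed: float, measured: float) -> float:
    """Difference between a believed and a measured change, in percentage points."""
    return believed - measured


# Gap between what developers perceived and what the clock recorded.
perception_gap = gap_points(perceived_speedup_pct, measured_speedup_pct)

# Gap between the up-front forecast and the clock (derived here, not quoted).
forecast_gap = gap_points(predicted_speedup_pct, measured_speedup_pct)

print(f"perception gap: {perception_gap} points")
print(f"forecast gap:   {forecast_gap} points")
```

The perception gap reproduces the 39 points cited in the text. Like the text, the sketch treats the perceived and measured percentages as directly comparable.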
A note on Cells II and IV

Both Cell II and Cell IV sit within the performativity tradition. They differ on what participants can see. Cell II names the cynical-maintenance variant. Practitioners understand that the ritual is performative, they continue to perform it, and they navigate the gap between performance and substance through tacit competence. Cabantous and Gond (2011) call this rational decision-making as performative praxis. Cell IV names the Barnesian variant (MacKenzie 2006, Callon 2007) operating in self-concealing conditions. The engagement constitutes the criteria by which the engagement is judged to be working, and the practitioner who would notice the constitution has been transformed by the engagement that constituted it. Sincere belief replaces knowing maintenance.

PSF is not outside performativity. PSF is the variant in which the framework has made itself true, and the actor cannot detect the circularity because the same engagement constitutes both the framework and the actor.

The distinctive claim

In each of the first three modes, some actor retains knowledge of the gap. In Goodhart, the actor knows the metric is a proxy. In liturgy, the actor knows the ritual is performative. In manipulation, the actor on one side knows the metric is engineered. Knowledge is unevenly distributed across the three, but somewhere in the system, knowledge persists, and the accountable criterion stays in view.

Proxy seduction is the mode where the knowledge itself is what the engagement has reshaped. Nobody is gaming, because the gap is not visible to game against. The proxy metric reads as the accountable criterion. The judgment that would catch the displacement is the judgment the same engagement can erode, and that vulnerability is what makes the mode self-concealing.

Failure trips an alarm. Proxy seduction disarms it.

Dimensions of differentiation

Dimension | Goodhart's Law | Corporate liturgy | Manipulation | Proxy seduction
What people believe | Know the proxy is a proxy | See through the ritual | Trust the metric in good faith | Sincerely believe the metric
Source of the gap | Strategic optimization | Embedded tools and conventions | Deliberate engineering by another party | Constituted by the engagement itself
What happens to judgment | Unchanged | Unchanged (cynical but competent) | Unchanged (deceived but intact) | Transformed by the engagement
Self-correction | Change the metric | Navigate tacitly | Expose the manipulation | The process prevents recognition
Trajectory | Stable | Stable | Stable (until exposed) | Degrading (capacity erodes)
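One way to read the two axes together with the table above is as a simple lookup from (source of the gap, visibility) to a mode and its self-correction path. The sketch below is a reading aid under stated assumptions, not part of the framework: the names `GapSource`, `Visibility`, `MODES`, and `classify` are hypothetical, and the placement of each mode on the visibility axis is inferred from the cell descriptions.

```python
from enum import Enum


class GapSource(Enum):
    """First axis: what creates the gap."""
    INTENTIONAL = "intentional"
    STRUCTURAL = "structural"


class Visibility(Enum):
    """Second axis (assumed labels): can the people affected see the gap?"""
    VISIBLE = "visible"
    HIDDEN = "hidden"


# Cell assignments inferred from the four cell descriptions.
# Each value is (mode, self-correction path), as given in the table's rows.
MODES = {
    (GapSource.INTENTIONAL, Visibility.VISIBLE): ("Goodhart's Law", "Change the metric"),
    (GapSource.STRUCTURAL, Visibility.VISIBLE): ("Corporate liturgy", "Navigate tacitly"),
    (GapSource.INTENTIONAL, Visibility.HIDDEN): ("Manipulation", "Expose the manipulation"),
    (GapSource.STRUCTURAL, Visibility.HIDDEN): ("Proxy seduction", "The process prevents recognition"),
}


def classify(source: GapSource, visibility: Visibility) -> tuple[str, str]:
    """Return the (mode, self-correction) pair for one cell of the 2x2."""
    return MODES[(source, visibility)]


mode, correction = classify(GapSource.STRUCTURAL, Visibility.HIDDEN)
print(f"{mode}: {correction}")
```

The lookup makes one feature of the table explicit: only the structural, hidden cell has no self-correction path that the affected actors can initiate themselves.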

The broader pattern: strategic vs. constitutive

The 2x2 above maps proxy-criterion decoupling. The same structural distinction (a knowing agent who can see what is happening vs. a transformed agent who cannot) recurs across organizational theory. In each case, the existing framework assumes evaluative capacity remains intact. PSF identifies the version where engagement has already transformed the evaluator.
Parallel 1
Decoupling
Strategic gap management vs. invisible referent shift
Meyer & Rowan (1977)

Organizations adopt formal structures not because they work but because they conform to institutionalized expectations. To manage the resulting gap between ceremony and practice, organizations decouple: they maintain the formal structure for legitimacy while actual work proceeds on its own terms. The critical feature is that organizational actors know the gap exists. Decoupling is a conscious management strategy. The front-stage conforms to institutional myths while the back-stage operates according to technical demands. Both sides are visible to insiders.

Gioia, Schultz & Corley (2000) → PSF

Organizational identity labels persist while the meaning underneath reconstitutes. A team still calls itself "craftspeople" while "craft" has quietly come to mean "shipping fast." Nobody decided to decouple. The label changed its referent through a process Gioia et al. call adaptive instability. PSF extends this to metrics: the metric label is stable, the evaluative content has shifted, and nobody notices because there is nothing dramatic to notice. The ceremony has become the operation, and nobody registered the switch.

PSF's move: Meyer & Rowan's decoupling requires a knowing actor who manages the gap strategically. PSF identifies decoupling that happens to the actor without awareness. The gap is not managed because it is not perceived.
Meyer, J.W. & Rowan, B. (1977). Institutionalized Organizations: Formal Structure as Myth and Ceremony. American Journal of Sociology, 83(2), 340-363. DOI: 10.1086/226550
Gioia, D.A., Schultz, M. & Corley, K.G. (2000). Organizational Identity, Image, and Adaptive Instability. Academy of Management Review, 25(1), 63-81. DOI: 10.5465/amr.2000.2791603
Parallel 2
Institutional complexity
Negotiating visible competing logics vs. a rival arriving as improvement
Greenwood et al. (2011) / Pache & Santos (2013)

Organizations face multiple institutional logics that prescribe incompatible courses of action. Greenwood et al. mapped how field position, organizational structure, and identity shape whether organizations compartmentalize, compromise, or resist. Pache & Santos deepened this with "selective coupling": hybrid organizations (like social enterprises) do not just ceremonially decouple. They strategically adopt intact practices from each competing logic, assembling a working composite. Both accounts assume that the competing logics are recognizable to organizational actors. The logics may conflict in values and prescriptions, but they rely on broadly intelligible forms of evidence, argumentation, and professional reasoning. Disputes can be negotiated precisely because the standards for evaluating claims remain legible across logics.

PSF

AI engagement does not arrive as a competing logic that announces itself as different. It arrives as an improvement within the existing evaluative framework. Speed, output volume, coverage, consistency: these are already valued. The engagement makes them more visible and more achievable. The organization does not experience a clash between two recognizable logics. It experiences confirmation that its existing logic is working better than before. The rival logic never presents itself as a rival. It constitutes the confirming evidence for the logic already in place. By the time the shift becomes visible (if it ever does), evaluative capacity has already eroded.

PSF's move: Institutional complexity theory assumes competing logics are identifiable and navigable. PSF identifies a form of complexity that is structurally invisible because the competing logic disguises itself as confirmation of the incumbent one.
Greenwood, R., Raynard, M., Kodeih, F., Micelotta, E.R. & Lounsbury, M. (2011). Institutional Complexity and Organizational Responses. Academy of Management Annals, 5(1), 317-371. DOI: 10.5465/19416520.2011.590299
Pache, A.-C. & Santos, F. (2013). Inside the Hybrid Organization: Selective Coupling as a Response to Competing Institutional Logics. Academy of Management Journal, 56(4), 972-1001. DOI: 10.5465/amj.2011.0405
Parallel 3
Epistemic contestation
Visible regime contest vs. pre-contest erosion
Sergeeva, Leonardi & Faraj (2026)

AI introduces a rival "epistemic regime" into organizations: a historically situated, institutionally stabilized configuration that governs how knowledge is produced, evaluated, and authorized. The computational regime (grounded in statistical inference, large-scale data, predictive performance) competes with the professional regime (grounded in experiential judgment, disciplinary training, situated reasoning). Sergeeva et al. argue that organizations are arenas where these regimes contend for legitimacy. The contest is political, visible, and consequential. Actors resist, champion, compartmentalize, or capitulate. The framework draws on institutional logics, Foucault, STS, and sensemaking to describe what happens when the grounds of knowing are themselves in dispute.

PSF

PSF describes what happens before the contest Sergeeva et al. theorize becomes visible. Proxy seduction operates in the period when the computational regime has not yet announced itself as a rival. Organizations are not contending with a competing way of knowing. They are experiencing productivity gains that register as improvements within their existing evaluative framework. The epistemic shift is real, but it is self-concealing: the proxy metrics generated by AI engagement look like evidence that the incumbent regime is working better than ever. By the time the contest becomes legible (the moment Sergeeva et al. begin their analysis), evaluative capacity has already degraded. PSF explains why so many organizations arrive at the regime contest Sergeeva et al. describe with diminished capacity to navigate it.

PSF's move: Sergeeva et al. describe what the institutional landscape looks like when the contest becomes legible. PSF describes why it takes so long to become legible in the first place.
Sergeeva, A.V., Leonardi, P.M. & Faraj, S. (2026). Beyond the tool view of AI: Intelligent technologies and the emergence of new epistemic regimes. Strategic Organization. DOI: 10.1177/14761270261436479
Parallel 4
Performativity of rationality
Conscious ritual production vs. constitutive circularity
Cabantous & Gond (2011)

Rational decision-making persists in organizations not because it works but because it is continuously performed into existence through the coordinated deployment of tools, theories, and experts. Rationality is a "performative praxis": it requires active production through three mechanisms (conventionalization, engineering, and commodification). The key insight is that participants in this production are not naive. They may recognize the ritual quality of the process. The production succeeds because rationality serves real organizational functions (legitimation, coordination, accountability) regardless of whether it produces optimal decisions. The actors can see through the performance and choose to maintain it.

PSF

PSF identifies the version where nobody sees through the performance because the performance has constituted the evaluative frame by which the performance is judged. The AI tool produces speed gains. Speed becomes the primary measure of success. The tool is then judged successful because speed went up. This is Barnesian performativity (MacKenzie, 2006): the framework made itself true. Cabantous & Gond describe a ritual that participants maintain knowingly. PSF describes a circularity that participants cannot detect, because the very criteria they would use to detect it have been constituted by the process under evaluation.

PSF's move: Cabantous & Gond's performative praxis requires a cynical-but-competent actor who maintains the ritual. PSF identifies performativity that is self-validating: the framework constitutes the evaluative criteria by which the framework is judged to be working.
Cabantous, L. & Gond, J.-P. (2011). Rational Decision Making as Performative Praxis: Explaining Rationality's Éternel Retour. Organization Science, 22(3), 573-586. DOI: 10.1287/orsc.1100.0534
Parallel 5
AI as cognitive remedy
Overcoming bounded rationality vs. constituting the evaluative frame
Shrestha et al. (2021) / Laamanen et al. (2025)

A substantial literature presents AI as a remedy for bounded rationality (Simon, 1955). Algorithms process more information, compute complex trade-offs, and synthesize knowledge at scale, freeing humans from the constraints of satisficing and cognitive bias. Shrestha et al. propose principles for deploying deep learning to augment organizational decisions. Laamanen et al. use the attention-based view to argue that AI broadens organizational attention, democratizes strategic processes, and accelerates feedback. Both accounts treat the human evaluator as a stable reference point: someone who can assess whether AI improved the decision and redirect resources accordingly.

PSF

PSF asks what happens when the "remedy" constitutes the evaluative frame by which the remedy is judged to be working. If AI engagement produces the metrics that organizations use to assess decision quality (speed of resolution, volume of output, consistency of recommendations), then the conclusion "AI improved our decisions" is circular. The evaluator is not stable. The evaluator has been transformed by the very engagement whose effects the evaluator is trying to assess. As Sergeeva et al. (2026) note, the bounded rationality framing implicitly privileges "the ideals of optimization, quantification, and computational inference as naturally more desirable ways of knowing," but PSF goes further: the framing does not just privilege computational knowing, it prevents the organization from noticing the privileging.

PSF's move: The bounded rationality literature assumes a stable evaluator who can assess whether AI helped. PSF identifies the mechanism by which AI engagement transforms the evaluator, making that assessment structurally unreliable.
Shrestha, Y.R., Krishna, V. & von Krogh, G. (2021). Augmenting organizational decision-making with deep learning algorithms. Journal of Business Research, 123, 588-603. DOI: 10.1016/j.jbusres.2020.09.068
Laamanen, T., Weiser, A.-K., von Krogh, G. & Ocasio, W. (2025). Artificial intelligence in adaptive strategy creation and implementation. Long Range Planning, 58(4), 102561. DOI: 10.1016/j.lrp.2025.102561
Parallel 6
Automation and augmentation
Task-level adjustment vs. evaluative capacity erosion
Raisch & Krakowski (2021)

Paradox theory frames automation and augmentation as interdependent rather than opposed. Organizations that overemphasize automation lose human capability. Organizations that overemphasize augmentation underutilize AI capacity. The productive path integrates both, creating complementarities across time and space. The framework operates at the task level: which tasks should be automated, which augmented, how should the balance shift over time? The implicit assumption is that the organization can observe the consequences of its choices and adjust. If automation degrades quality, someone notices. If augmentation overwhelms practitioners, someone intervenes. The organization retains the evaluative capacity to manage the paradox.

PSF

PSF operates beneath the task level. The question is not whether a task is automated or augmented but whether the organization can still tell the difference between genuine augmentation and scaffolded dependency. When AI produces outputs that look like competent work product, how does the organization distinguish augmentation (practitioner skill enhanced) from substitution (practitioner skill bypassed)? The proxy metrics generated by engagement do not distinguish these cases. Both produce the same visible outputs. The erosion of evaluative capacity means the organization progressively loses the ability to navigate the very paradox Raisch & Krakowski identify.

PSF's move: The automation-augmentation paradox assumes the organization can observe consequences and adjust. PSF identifies the mechanism by which AI engagement erodes the observational capacity on which the paradox management depends.
Raisch, S. & Krakowski, S. (2021). Artificial Intelligence and Management: The Automation-Augmentation Paradox. Academy of Management Review, 46(1), 192-210. DOI: 10.5465/amr.2018.0072

The systematic blind spot

Six parallels, one pattern. In every case, the existing framework assumes a stable evaluator: an organizational actor (or collective) that can see what is happening, assess the consequences, and respond. Goodhart assumes the gamer can see the gap. Institutional complexity assumes actors can identify the competing logics. The automation-augmentation paradox assumes someone can tell whether AI is helping or hurting. Epistemic regime theory assumes the contest is legible.

PSF identifies the constitutive version of each: the case where the engagement has already transformed the evaluator's capacity to see. The gap is invisible not because information is missing but because the evaluative frame through which information would be interpreted has been reconstituted by the very process under evaluation.

This is not one alternative to one framework. It is a systematic blind spot across an entire family of organizational theories. They all assume evaluative continuity. PSF is the theory of what happens when that assumption fails.

Why the blind spot exists: material braking

These frameworks were not careless. They were built on empirical settings where the assumption of evaluative continuity held, because those settings had physical substrates that generated self-announcing feedback when something went wrong. Robotic surgery (Sergeeva & Faraj, 2012): the surgeon loses haptic feedback and feels the absence immediately. Pharmacy automation (Barrett et al., 2012): medication errors produce visible patient outcomes. ERP implementations: broken workflows halt production. In each case, the disruption of practiced knowing was observable because the domain's material properties forced it into view.

AI engagement in knowledge work is structurally different. The outputs are linguistic, aesthetic, strategic, evaluative. They exist in registers where quality is judgment-dependent, not physically verifiable. Code that compiles is not code that solves the right problem. A strategy that is coherent is not a strategy that is sound. A diagnosis that matches the benchmark is not a diagnosis that captures the clinical picture. There is no equivalent of tissue resistance going missing. The proxy metrics fill the perceptual space where the felt absence would have been.

AI is not just another technology that mediates practiced knowing. It is the first technology that mediates practiced knowing in domains where the loss of that knowing cannot be detected by the outputs alone. That is what makes it the revealing case for organizational theory, not merely a new case.

To add · action item

The strategic-vs-constitutive flow diagram that sat at the foot of this page is being rebuilt separately. It still illustrated the earlier "judgment stock depletes" framing and predates the METR / colonoscopy evidence split, so it was removed rather than patched. The replacement should show the self-concealing path, with METR carrying the proxy-criterion gap and its concealment, and the colonoscopy withdrawal trial carrying the capacity erosion.