The Panopticon of Planning in AGI Risk and Safety Initiatives

Why "Theories of Change" Will Not Provide a Safe Path to AGI

Sep 22, 2025

The rush to govern Artificial General Intelligence is relying on outdated tools. We must reject the illusion of rational control and confront the raw operations of power.

In the boardrooms of Silicon Valley, the halls of the UN, and the strategy sessions of major NGOs, a consensus is forming: we must coordinate globally to ensure the safe development of AGI. The stakes are existential, the timelines are shrinking, and the response has been a predictable turn toward the comforting embrace of managerialism.

The tool of choice? The "Theory of Change" (ToC).

A Theory of Change is seductive. It promises clarity in the face of chaos. It functions as a cognitive tool: specifically, a Causal Logic Model embedded within a Strategic Conceptual Framework. It demands that we map out our inputs, define our desired outcomes, and draw neat, linear arrows connecting them. The logic runs like this: if we establish international auditing standards, we achieve alignment, which in turn leads to global safety.

It is a rational, modernist approach to a profoundly non-rational, emergent phenomenon. And it is, to put it bluntly, woefully inadequate for the task at hand. Worse, it is dangerous, because it offers the illusion of control while obscuring the very dynamics that drive the risk.

To understand why these frameworks fail, we must return to the work of Michel Foucault, the preeminent historian of how power utilizes knowledge to discipline and control. From a Foucauldian perspective, a Theory of Change for AGI safety is not a roadmap to salvation; it is a blueprint for a new kind of panopticon.

My undergraduate thesis centered on the work of Michel Foucault, and over the last 20 years, I’ve returned to his work often. For those of you familiar with his work, you’ll understand why this Substack publication is named Heterotopia AI.

The Illusion of the Causal Model

The foundation of any Theory of Change is causality: the belief that actions lead to predictable outcomes in a stable environment. Foucault’s historical method, genealogy, rejects this linear view of history. History is not a rational progression; it is a series of contingent conflicts, messy accidents, and sudden ruptures.

AGI is not simply a new variable to plug into an existing model. It represents a potential epistemic rupture: a fundamental break in the underlying structure of knowledge and power that defines our era. A Causal Logic Model attempts to predict the future using the rules of the present. But AGI threatens to rewrite the rules of causality themselves.

When we rely on linear models, we are preparing for a future that looks like the present. We are utterly unprepared for the radical contingency of AGI emergence. The map is useless when the territory is transforming beneath our feet.

The Framework as a Technology of Power

The more insidious aspect of these strategic frameworks is how they function as what Foucault called "technologies of power." They are not neutral aids to cognition; they are instruments for enforcing a specific vision of the world.

A global framework requires consensus on core concepts, primarily "AGI Safety." But who defines safety?

Foucault argued that "truth" is not objective; it is produced by power. Every society establishes a "Regime of Truth": the discourses it accepts and makes function as true, which in turn determine what can count as truth. Applied to our current pursuit of AGI, this means the definition of safety is already being captured.

Is "safety" defined by the leading AI labs, focused on technical alignment that ensures market viability? Is it defined by dominant nation-states, concerned primarily with maintaining a strategic military advantage? Or is it defined by marginalized communities, prioritizing protection from algorithmic bias, surveillance, and economic displacement?

A Theory of Change cannot reconcile these conflicts; it can only mask them. By forcing a consensus, the framework operationalizes the definition of safety enforced by the most powerful actors. The profound political question of what kind of future we want is translated into a technical question of how to execute the framework.

Governmentality and the Conduct of Conduct

The implementation of a global AGI safety framework is an exercise in what Foucault termed governmentality: the techniques and strategies used to render a population or system governable. It is the "conduct of conduct."

By establishing international standards, auditing regimes, and monitoring protocols, the framework aims to make the complex landscape of AGI development more legible and manageable. This process disciplines participants. It establishes norms for what constitutes "responsible" development. Nations or corporations that deviate are labeled as rogue.

This normalization is a profound exercise of power. It determines who is authorized to speak (the technical alignment "experts") and silences "subjugated knowledges"—the perspectives that do not fit the rationalist mold. The goal shifts from ensuring genuine safety to ensuring governability and the stability of existing power structures.

The Alternative: Practices of Critique and Refusal

If the dominant frameworks are inadequate, what is the alternative?

Foucault was famously reluctant to offer solutions, because any prescribed solution risks installing a new regime of power. The alternative to a Theory of Change is not a better framework, but a fundamentally different ethos. We must move from planning to practice, from consensus to contestation.

1. Problematization over Problem-Solving:

A ToC is a problem-solving tool. We must instead engage in problematization. This means analyzing how and why something became a problem in the first place. Instead of asking, "How do we achieve AGI safety?", we must ask, "Why is AGI development proceeding in a way that makes 'safety' such a crisis? What economic and political structures are driving the risk?"

2. Cartography of Power over Strategic Planning:

We do not need static roadmaps. We need dynamic cartographies of the AGI landscape. This means constantly mapping the shifting alliances, conflicts, and power relations, not just in international treaties, but in coding practices, hardware design, and the allocation of capital.

3. Agonistic Pluralism over Consensus:

Consensus is often a dangerous normalization that suppresses dissent. We must embrace agonistic pluralism. Governance structures should be designed to facilitate structured conflict and continuous contestation, rather than to eliminate them. Safety is not achieved by universal agreement, but by a dynamic equilibrium maintained by constant vigilance and mutual checking.

4. Fostering Counter-Conduct:

We must actively support resistance against dominant modes of governance. This means championing decentralized alternatives, adversarial auditing, and "practices of refusal." We need parrhesia: courageous truth-telling by whistleblowers and critical civil society that can interrupt dangerous trajectories.

The challenge of AGI demands that we abandon the comforting illusion of the master plan. Here, at Heterotopia AI, the goal is to build spaces of otherness where dominant narratives are challenged. We must engage in the messy, continuous practice of analyzing power, questioning established truths, and remaining perpetually vigilant about how technology is reshaping the human condition.

That vigilance matters even, and especially, in discussions about the last significant human-driven problem to be solved: how to thrive as a species in a superintelligence explosion.
