Dimensions of Machine Usefulness in Science

Acemoglu and Johnson on Machine Usefulness

In Power and Progress: Our Thousand-Year Struggle Over Technology and Prosperity (2023), economists Daron Acemoglu and Simon Johnson argue that technological progress does not automatically produce broad-based prosperity. What matters is how technologies are deployed. They distinguish between “machine usefulness” and “so-so automation.” The former refers to genuinely useful technologies which augment human capabilities, create new tasks for humans to perform, and generate productivity gains that flow broadly. The latter merely displaces workers without creating commensurate new value. Self-checkout kiosks are the canonical example of so-so automation: they eliminate cashier jobs but don’t dramatically improve the shopping experience, primarily shifting labor costs from corporations to customers while degrading service quality. The question for any new technology is not “can it replace human labor?” but “does it expand what humans can do, and who benefits?”

The Problem with Autonomy Frameworks

Existing frameworks for self-driving laboratories measure progress by how much human involvement can be removed. The highest levels describe labs where humans “merely serve the needs” of the machine or “set a research direction” and walk away.

These frameworks share two questionable assumptions:

First, they conflate technical achievement with value creation. “Can the machine do this without human intervention?” is a different question from “Does this advance science, create new human capabilities, or generate broad-based benefit?” History suggests technologies that substitute for human labor at roughly equivalent performance redistribute value rather than creating it.

Second, they impose false linearity. Autonomy frameworks present progress as a single journey from Level 0 to Level 5, where each stage subsumes the previous. But the actual landscape of useful scientific tools is multidimensional. I’ve always been a huge fan of the software SnapGene because it externalizes and disciplines molecular biologists’ reasoning about DNA constructs. It is enormously valuable without being “higher” on any autonomy scale. It’s not “Level 2” waiting to become “Level 4.” It’s excellent along dimensions that autonomy frameworks don’t measure at all.

This framework proposes an alternative: measuring AI-enabled science by the new human capabilities it creates, not the human tasks it eliminates—and recognizing that these capabilities vary along multiple independent dimensions, not a single axis.

The Six Dimensions of Machine Usefulness in Science

A tool may score high on some dimensions and low on others. There is no implied progression—excellence along one dimension does not require or lead to excellence along others.

1. Friction Reduction

Can we do the same work with less effort?

Characteristics:

Scientists do the same work, but with less friction
Primary value: time savings, reduced drudgery
Human judgment remains central to all decisions

Examples:

Automated pipetting that executes human-designed protocols
Data visualization tools that format results for human interpretation
Literature search that surfaces relevant papers faster

Key question: Would removing this tool change what science gets done, or just how fast?

Risk: Can become so-so automation if it primarily displaces workers without enabling new science.

2. Reach Extension

Can we access experimental territory that was previously impractical?

Characteristics:

Scientists can now do things that were previously impractical
Opens new experimental territory rather than accelerating existing work
Human judgment guides direction; machine extends reach

Examples:

High-throughput screening that makes combinatorial exploration feasible
Robotic manipulation of hazardous or extreme-condition samples
Continuous monitoring that captures dynamics humans would miss

Key question: Are scientists asking questions they wouldn’t have asked before?

Value creation: New experimental territory creates new opportunities for insight, new specializations, new translational pathways—new human tasks.

3. Pattern Surfacing

Can we perceive structures we couldn’t see before?

Characteristics:

Identifies patterns, correlations, or anomalies across scales humans can’t process
Outputs require human interpretation to become knowledge
Creates new objects for human reasoning

Examples:

Dimensionality reduction revealing clusters in high-dimensional data
Anomaly detection flagging unexpected results for human investigation
Cross-dataset integration connecting disparate findings

Key question: Does this generate hypotheses that surprise domain experts?

Value creation: New patterns create new questions, new subfields, new interpretive expertise—humans become specialists in making sense of machine-surfaced structure.

4. Repertoire Expansion

Can we access expertise that was previously siloed or tacit?

Characteristics:

Aggregates expertise that no single human possesses
Makes implicit knowledge explicit and executable
Enables non-experts to leverage expert-level protocols

Examples:

LLM-assisted protocol generation drawing on literature-wide best practices
Cross-domain suggestion systems (e.g., recombinase biochemistry → cloning optimization)
Troubleshooting assistants encoding accumulated lab wisdom

Key question: Can a competent scientist now do what previously required rare specialized expertise?

Value creation: Democratizes capability, lowers barriers to entry, creates demand for new integrative roles (people who can combine newly-accessible techniques in novel ways). The human task shifts from possessing rare expertise to combining newly-accessible capabilities.

5. Judgment Amplification

Can we make better decisions under complexity?

Characteristics:

Handles complexity, uncertainty, or scale beyond human cognitive limits
Human values, priorities, and risk tolerance remain upstream
Machine provides decision support, not decision replacement

Examples:

Experimental design optimization under complex constraints
Uncertainty quantification that makes honest confidence intervals tractable
Scenario modeling that reveals consequences of strategic choices

Key question: Are scientists making better decisions, or just faster ones?

Value creation: Creates demand for humans skilled in specifying values, interpreting tradeoffs, and exercising judgment at higher levels of abstraction. The hard problems remain human problems.

6. Cognitive Scaffolding

Can we reason more reliably and share that reasoning?

Characteristics:

Externalizes and disciplines human reasoning
Makes thinking visible, revisable, and transmissible
Prevents errors by structuring cognition, not by replacing it

Examples:

SnapGene: visualization and planning tools that ensure scientists know the full properties of the DNA they’re working with
Electronic lab notebooks that create records as a byproduct of planning
Version control systems that make the history of reasoning accessible

Key question: Does this make individual reasoning more robust and make that reasoning shareable across people and time?

Value creation: Knowledge accumulates rather than dissipating. Errors get caught earlier. Expertise becomes teachable. The human task shifts from remembering to reasoning.

How This Framework Differs

Dimension	Autonomy Frameworks	Machine Usefulness Framework
Structure	Linear levels (0→5)	Independent dimensions
Measures progress by	Human removal	Human capability expansion
Ideal end state	Machine “in charge”	Humans doing things previously impossible
Treats human involvement as	Limitation to minimize	Source of value to cultivate
Success criterion	Can it run without us?	Can we do more with it?
Economic model	Substitution	Complementarity
Evaluates tools by	Position on single axis	Profile across multiple dimensions

Implications for Investment and Policy

Favor complementarity over substitution

Prioritize systems where human and machine capabilities are genuinely different and mutually enabling, not where machines do what humans do slightly better.

Beware throughput metrics

“10x more experiments” is only valuable if the experiments are worth running. Throughput divorced from insight is infrastructure, not progress.

Invest in translation layers

The specification bottleneck—translating human intent into machine-executable instructions—is where the most leverage likely lies. Natural language interfaces, better protocol languages, and intent-to-execution pipelines create capability broadly.

Fund the interpretive infrastructure

As machines surface more patterns, the bottleneck shifts to human sense-making. Invest in visualization, explanation, and education alongside automation.

Ask distributional questions

Who benefits? If primarily capital owners and elite institutions, the case for public investment weakens. If early-career scientists, under-resourced labs, and patients, the case strengthens.

Conclusion

The autonomy framing asks: “How much can we remove the human?” and measures progress along a single axis from full human involvement to full machine control.

The machine usefulness framing asks: “How much more can the human do?” and recognizes that capability expansion happens along multiple independent dimensions. A tool like SnapGene can be transformative without being “autonomous” at all. An LLM conversation can provide Repertoire Expansion and Cognitive Scaffolding without Reach Extension or Judgment Amplification.

This reframing matters because it changes what we optimize for. The autonomy framing risks creating sophisticated infrastructure that concentrates benefit and displaces workers without commensurate productivity gains—so-so automation dressed up as progress. The machine usefulness framing aims for technologies that expand human capability, create new tasks, and generate broad-based benefit.

The best AI-enabled science will be measured not by how autonomous the lab becomes, but by what scientists—and ultimately, all of us—can do that we couldn’t do before.