📊 Full opportunity report: Engineering Is Automated. Research Is the Residual. on ThorstenMeyerAI.com — validation score, market gap, and execution plan.
TL;DR
AI systems have achieved near-complete automation of core engineering tasks, with research still partly reliant on human insight. This shift impacts AI development strategies and institutional responses.
Recent empirical evidence confirms that AI systems are now capable of automating the majority of core engineering tasks, while AI’s research capabilities are approaching similar levels but are not yet fully automated, according to recent analyses by Thorsten Meyer.
Multiple benchmarks, including CORE-Bench and MLE-Bench, show AI systems reaching near-complete performance on tasks fundamental to AI engineering, such as reproducing research results and competing in Kaggle competitions. For example, CORE-Bench performance has improved from 21.5% in September 2024 to 95.5% in December 2025, with some experts declaring the task ‘solved.’ Similarly, AI’s performance on Kaggle competitions has increased from 16.9% to 64.4% over sixteen months, placing it within the range of mid-tier human practitioners.
These advances indicate that the bottleneck for engineering is shifting from capability to deployment, with AI systems handling frictions traditionally managed by human engineers. However, research tasks—such as generating novel hypotheses, designing experiments, or solving complex scientific problems—are less advanced, with evidence suggesting they are approaching saturation but not yet fully automated. Thorsten Meyer notes that the structural difference may be that research is increasingly becoming a form of large-scale engineering, which could accelerate the residual gap closing.
Engineering is automated.
Research is the residual.
Six skill benchmarks. Edison’s framing. The question Clark leaves open is whether research is just engineering at scale.
Jack Clark’s Import AI #455 catalogs six benchmarks measuring AI capability on AI R&D tasks and concludes “AI can today automate vast swatches, perhaps the entirety, of AI engineering.” The residual question is research. The structural read on the residual: it may not be a permanent moat.
Six skills. One trajectory.
Clark catalogs six benchmarks measuring AI capability on AI R&D-relevant tasks. Each individual benchmark could be noise. Six benchmarks moving together is a curve. The pattern is the cascade observed across the broader Clark series — visible here in the specific R&D-skill domain.
![Claude AI for Beginners Bible: [5 in 1] The Ultimate Guide to Automate Your Work, Save Hours Every Week, and Use AI for Real-World Results](https://m.media-amazon.com/images/I/415+fSJacsL._SL500_.jpg)
Claude AI for Beginners Bible: [5 in 1] The Ultimate Guide to Automate Your Work, Save Hours Every Week, and Use AI for Real-World Results
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
Three data points. Mixed signal.
Clark provides three data points on the creative-spark question. Yes-evidence: Erdős-1051, centaur math discovery, sporadic Move-37-style moments. No-evidence: low yield, framing dependence, absence of acceleration. The mixed signal is the honest read.
The data supports two readings. Pessimistic: rare moments suggest creative insight is qualitatively distinct from engineering work. Optimistic: rare moments are an artifact of low-volume exploration; more shots on goal yields more discoveries. Both readings are consistent with Clark’s “vast swatches, perhaps the entirety” claim. They differ on the residual.

CLAUDE AI UNLEASHED From First Prompts to Pro: The Complete Guide to Claude AI for Writing, Research, Coding, and Business (The Claude AI Mastery Series)
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
Five dimensions Clark gestures at but leaves underdeveloped.
Clark’s section is rigorous on the empirical evidence. Five strategic dimensions matter for the institutional response that the Clark series synthesis argues is structurally inadequate.

Computational Visual Media: 13th International Conference, CVM 2025, Hong Kong SAR, China, April 19–21, 2025, Proceedings, Part II (Lecture Notes in Computer Science)
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
Two readings. Different equilibria.
The structural question Clark leaves open: is research a permanent moat that bounds automated AI R&D, or is it engineering at scale that dissolves with more shots on goal? Both readings are consistent with the current data. They differ by orders of magnitude in consequences.
Productivity multiplier years
Recursive loop operational

AI Workflow Tools for Researchers & Analysts: Automating Literature Reviews, Summaries, and Hypothesis Generation with ChatGPT, Claude, and Perplexity
As an affiliate, we earn on qualifying purchases.
As an affiliate, we earn on qualifying purchases.
Five audiences. Asymmetric cost of being wrong.
The institutional response should not bet on inspiration being a permanent moat. If the distinction holds, capacity built is still useful. If it closes, capacity is necessary. Asymmetric cost-of-being-wrong points toward building now.
IN INDUSTRY
IN ACADEMIA
POLICYMAKERS
INVESTORS
EVERYONE ELSE
Engineering is automated. The residual is the question. The institutional response should not bet on inspiration being a permanent moat.
Implications for AI Development and Strategic Focus
The automation of engineering tasks suggests that AI could soon handle the bulk of technical implementation and deployment, reducing the need for human engineers in routine aspects of AI development. This shifts strategic emphasis towards research and innovation, which remain less automated and may become the next bottleneck. The potential for AI to self-improve through recursive engineering raises questions about the pace of progress and the future role of human researchers in AI evolution.
Recent Benchmarks and Progress Milestones
Over the past 18 months, multiple independent benchmarks have shown rapid progress in AI capabilities relevant to AI R&D. CORE-Bench, measuring research reproduction, improved from 21.5% to 95.5%, with some experts calling it ‘solved.’ Similarly, Kaggle competition performance increased significantly, with AI now reaching levels comparable to mid-tier human practitioners. These trends reflect a broader pattern of saturation across different technical skills, suggesting that AI’s engineering capabilities are reaching a near-peak state.
Researchers like Thorsten Meyer interpret these developments as evidence that AI’s engineering skills are effectively automated, with research remaining the residual challenge. The benchmarks’ convergence indicates that progress in AI R&D skills is accelerating, potentially reshaping the landscape of AI innovation and the roles of human researchers.
“AI can today automate vast swatches, perhaps the entirety, of AI engineering. It is not yet clear how much of AI research it can automate.”
— Thorsten Meyer
Remaining Uncertainties in AI Research Automation
While engineering tasks are nearing full automation, the extent to which AI can autonomously generate novel scientific insights remains less clear. The structural question of whether research is inherently a form of large-scale engineering at scale is still open. Additionally, the pace of future progress depends on institutional responses, hardware developments, and the evolution of AI architectures, which are not yet fully predictable.
Next Steps in AI Capability Development
Researchers and institutions are likely to focus on further refining AI’s research abilities, testing the limits of automation in hypothesis generation, experimental design, and scientific discovery. Monitoring benchmark progress and real-world applications will be critical in assessing whether research automation can catch up with engineering. Additionally, discussions around governance and strategic deployment will shape how AI’s evolving capabilities are integrated into scientific and industrial contexts.
Key Questions
What does it mean that AI has automated most engineering tasks?
It means AI systems can now handle tasks like reproducing research, optimizing kernels, and managing technical frictions without human intervention, reducing the need for human engineers in routine technical work.
Is AI capable of conducting scientific research independently?
Currently, AI is approaching saturation in some research-related tasks, but fully autonomous scientific discovery remains an open challenge. It is not yet clear how much of research can be fully automated.
Why is the distinction between engineering and research important?
Engineering involves implementing and deploying AI systems, which AI can now automate extensively. Research involves generating new knowledge, which remains less automated and is critical for future AI progress.
What are the implications for human researchers?
As engineering becomes automated, human researchers may shift focus toward innovation, hypothesis generation, and guiding AI research, potentially reducing routine technical work but increasing strategic and creative roles.
Source: ThorstenMeyerAI.com