Monday, April 20, 2026

Using AI to improve (not automate away) academic research

Everyone seems to be consumed with AI anxiety. Graduate students are wondering if they will be replaced by AI assistants, or if they themselves are using AI enough or using it "right". Researchers are wondering what it means to produce research if agents can write whole papers. Everyone is wondering how we will keep up with a literature that is moving ever faster.

Everyone is feeling the pressure to do *more*: do more projects, produce more papers, review more papers. This pressure has already harmed the research ecosystem; witness the trouble conferences have in securing quality, non-automated reviews for the huge volume of submissions they receive.

We should think about what we can do that is *different.* We should try to use automation to be more efficient at the annoying parts of our jobs while leaving more time for discovering new knowledge. The key (fast-evolving, unresolved) issue is how AI models will change the frontier of what is scientifically possible. This varies from field to field and changes day by day, but my sense is that the rise of semi-autonomous agents will be very interesting for scaling up social and behavioral science.

Don’t give away the good part

The first use-case for AI has been generating text. While this function can be useful, especially for useless administrative prose that no one needs to read, it is at odds with a fundamental feature of scholarship: writing is critical to thinking. 

A small handful of scientists I've encountered seem to move directly from clearly posed questions to elegant experimental designs and theoretical interpretations. I envy them because I can't do that. Instead, I have to slowly and laboriously externalize my argument into a paper or a talk and stare at it to realize why it doesn't make sense. It's inconvenient that this process sometimes happens after years of research effort!

If you value thinking, then jumping directly to text generation using AI doesn't make sense. That's one reason why so many academics are so negative on AI, and deeply distrust its use as a tool: generating some generic text gives away the opportunity to figure something out.

But it's shortsighted to give up on AI altogether because of this argument. Instead, we need to focus on ways that AI can help us have more time to spend on the good, hard work of writing and thinking. I'm still working on this, but here are some cases where it's been useful for me: reformatting research documents for IRB; planning conference travel itineraries; drafting documentation for a software package; organizing a code repository; rearranging and reformatting authorship and credit info for a manuscript; adding DOIs to a reference section. Not all of these worked perfectly – I'm looking at you, DOI falsification – but overall they saved me time that I could use for more meaningful work. 
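To make the DOI example concrete: this kind of chore can be automated with a small verifiable script rather than trusting an agent's free-form output. Here is a minimal sketch, assuming the public Crossref REST API (`/works` with a `query.bibliographic` search); the function names and the idea of checking the matched title against your reference are my own choices, not anything from this post.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_API = "https://api.crossref.org/works"

def crossref_query_url(reference_text, rows=1):
    """Build a Crossref bibliographic-search URL for one free-text reference."""
    params = urllib.parse.urlencode(
        {"query.bibliographic": reference_text, "rows": rows}
    )
    return f"{CROSSREF_API}?{params}"

def lookup_doi(reference_text):
    """Return (doi, matched_title) for the top Crossref hit, or (None, None).

    The top hit can be wrong, so always compare matched_title against
    your reference entry before accepting the DOI.
    """
    with urllib.request.urlopen(crossref_query_url(reference_text)) as resp:
        data = json.load(resp)
    items = data.get("message", {}).get("items", [])
    if not items:
        return None, None
    top = items[0]
    return top.get("DOI"), " ".join(top.get("title", [""]))
```

The title-comparison step is the point: it is exactly the check that catches the made-up-DOI failure mode mentioned above.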

Don’t focus on making the same junk

Here's the other thing: we're not knocking it out of the park right now. The standard paper-shaped package of social science research is not something that we just want to make more of! We have a long way to go to ensure that the work we publish is reproducible, replicable, and robust. Using AI to make more standard papers faster will not be a win.*

The bigger problem is that there's no direct route from more papers to precise, generalizable theories of how the human mind works, or how social systems function, or how to improve school achievement. Making more papers might be a *side effect* of making progress on those topics, but it's not the right causal lever to pull. 

What we need are more precise measurements and more precise theories. (As a side note, this is what I've written about again and again, e.g. in my experimental methods textbook or a recent review on cognitive modeling). AI is not the royal road to these, but the work to come is figuring out how it can help.**

What does "better" look like?

Unless you're actually studying AI, standard text generation à la GPT-4 is not that helpful for making more precise measurements or better theories. How do you go from a chat window to better science? I tried a bunch of ideas early on in this era and they all failed. With increases in the capacity of base models and the rise of agent-based AI tools like Cursor, Claude Code, and Cowork, however, I am seeing some fascinating ways that these tools can be helpful for increasing the scale and robustness of our work. Here are a few.

First, coding agents are remarkably useful for research outputs, as long as you keep a tight leash on how they are used. Anything you do with vibe coding tools or agents needs to be clearly verifiable, or else you run the risk that the model is lying to you. They are astonishingly good at quickly creating user interfaces and visualizations. For social science researchers, this means the possibility of creating more flexible and customized experiments and not being bound to tools like Qualtrics. It also means we can think about doing more with our data. Even if you don't have the skills or time to set up a datapage for your paper, you can have Claude do it.

Second, the rise of LLM agents makes it easy to automate challenging research tasks. For example, I have been using Claude Cowork to extract information on data analysis parameters from a large folder of PDFs. Just like hand annotation by a research assistant, these judgments need to be double checked if you are going to rely on them, but they are really quite good and save a huge amount of time. 
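One pattern for making that double-checking systematic is to ask the agent for strictly structured output and then validate it in code before it enters your analysis: the programmatic analogue of auditing a research assistant's annotations. A minimal sketch below; the field names and per-field checks are hypothetical placeholders, not the actual parameters extracted in the project described here.

```python
import json

# Hypothetical schema: fields we ask the agent to extract from each PDF,
# each paired with a simple sanity check on its value.
EXPECTED_FIELDS = {
    "n_participants": lambda v: isinstance(v, int) and v > 0,
    "alpha_level": lambda v: isinstance(v, float) and 0 < v < 1,
    "analysis_method": lambda v: isinstance(v, str) and len(v) > 0,
}

def validate_extraction(raw_json):
    """Parse one agent response; return (record, problems).

    record is the parsed dict (or None if the JSON is malformed);
    problems lists every field that is missing or fails its check,
    flagging that paper for human review.
    """
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError as err:
        return None, [f"unparseable JSON: {err}"]
    problems = [
        name
        for name, check in EXPECTED_FIELDS.items()
        if name not in record or not check(record.get(name))
    ]
    return record, problems
```

Anything that lands in `problems` goes back to a human, so the time savings come from the clean majority of extractions, not from skipping verification.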

Third, coding agents reduce barriers for formalization and theory testing. For example, re-implementing computational models from the literature can be challenging, especially for students who don't specialize in this kind of work. There are some "gotchas" here – you need to understand the models you are fitting! – but coding agents can really help with managing some of the complexity of getting up and running with different kinds of models. 

Fourth, optimization. I just got a message from a collaborator I worked with on a complex modeling project (this paper, which was one of my favorite projects in the last few years). One of the big bottlenecks of the work was a probabilistic inference problem that made the code hard to implement and slow to run even on compute clusters. He was able to optimize the inference and make it something like a thousand times faster using some analytic math suggested by an AI agent.

Finally, a sometimes-overlooked but very cool function of AI tools is for providing critique. Agents need not write code; they can also check that your code does what you think it does. One of my students used Claude Code to find a pretty significant bug in their (entirely hand coded) research pipeline. This kind of robustness checking is far too rare in research. I also find AI critiques of research materials (designs, stimuli, even writing) to be very helpful in pointing out flaws I've overlooked. Maybe models aren't as insightful as a really smart friend or mentor, but these folks tend to be really busy with their own work! 

Conclusion

I hear a lot of people saying they oppose AI-generated prose and so they don't think AI should be used for research. My response is that this is just the wrong way to use AI as an academic. Don't use it to decrease the quality of the hard thing you do. Instead, try to find ways to use it for making the boring parts of your job easier and increasing the quality and scope of the best parts!

[Note: I drafted this post and then got feedback from Claude Cowork. The feedback was pretty good (though a bit sycophantic), and I think that my trying to implement it myself improved the post.]

* My colleague Russ Poldrack has an amazing new book on coding practices for good science, which includes substantial chunks on how to use AI effectively for scientific coding. 
** I'm very inspired by what my colleagues Judy Fan and Dan Yamins call "bids for scale": attempts to go beyond business as usual in our work and reach for projects that are larger – not just in size but in ambition as well. These kinds of projects try to push research forward by synthesizing theories, broadening samples, or measuring real-world impacts. AI tools seem like they will be very helpful for these kinds of mega projects – especially in doing some of the very hard work of data and materials wrangling that comes with large-scale projects.
