Large language models (LLMs) have demonstrated the potential to revolutionize the field of software engineering. In particular, LLM agents are rapidly gaining momentum in software development, with practitioners claiming a multifold productivity increase after adoption. Yet, empirical evidence around these claims is lacking. In this paper, we estimate the causal effect of adopting a widely popular LLM agent assistant, namely Cursor, on development velocity and software quality. The estimation is enabled by a state-of-the-art difference-in-differences design comparing Cursor-adopting GitHub projects with a matched control group of similar GitHub projects that do not use Cursor. We find that the adoption of Cursor leads to a significant, large, but transient increase in project-level development velocity, along with a significant and persistent increase in static analysis warnings and code complexity. Further panel generalized method of moments estimation reveals that the increase in static analysis warnings and code complexity acts as a major factor causing long-term velocity slowdown. Our study carries implications for software engineering practitioners, LLM agent assistant designers, and researchers.
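The core of the abstract's difference-in-differences design can be illustrated with a toy 2x2 sketch: the effect of adoption is the change in the adopting group's outcome minus the change in the matched control group's outcome over the same window. All numbers below are hypothetical and not drawn from the paper's dataset; the function is a minimal illustration, not the paper's actual estimator (which uses a full panel design and GMM).

```python
# Minimal 2x2 difference-in-differences sketch (hypothetical data).
# The estimate nets out the trend shared with the matched control group.

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Return the difference-in-differences estimate from group means."""
    def mean(xs):
        return sum(xs) / len(xs)
    return (mean(treated_post) - mean(treated_pre)) - (
        mean(control_post) - mean(control_pre)
    )

# Hypothetical weekly commit counts for adopting vs. matched projects.
treated_pre = [10, 12, 11]   # adopters, before Cursor adoption
treated_post = [18, 20, 19]  # adopters, after adoption
control_pre = [10, 11, 12]   # matched non-adopters, same pre window
control_post = [12, 13, 11]  # matched non-adopters, same post window

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)  # 7.0: an 8-commit jump for adopters minus a 1-commit drift
```

The key assumption, as in any such design, is parallel trends: absent adoption, the treated projects would have drifted like the controls.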


I’m a software developer and I fucking hate coding with AI.
In the beginning I was curious, even excited, but reality soon caught up.
Unnecessarily complex, bloated code (at best)
Wrong SQL statements
Code that won’t compile
Insecure code
And for heaven’s sake, never let it near legacy code, it will fuck up everything.
I’m not a software developer, and I haven’t used a dedicated programming AI inside an IDE or anything. Just inconsistent, occasionally scripting-heavy IT-type stuff. My organization offered me whatever AI Google makes available because they had more licenses than they were using, so I figured whatever, sure. I think 80% of the time (when I’ve been desperate enough to look at its summary of a technical search, or used it for help even with Google products like a Sheets or AppSheet function) it has hallucinated useless answers that occasionally sent me on wild goose chases when they looked convincing. I’ve been pointed to non-existent PowerShell commands and multiple non-existent Python libraries. To be fair, when I searched for one of the PowerShell commands I found AI slop articles about it. So at least it had some excuse… or maybe someone found the hallucinated results were trending and tried to capitalize on that.
I was having a rough day and figured I’d ask the chat version geared towards programming to give me a complicated Sheets formula. I saw a flaw in its formula, explained it, and ended up being gaslit as it proceeded to defend the accuracy of its results step by step, and literally used a different value on the one problematic step to demonstrate that its logic was sound.
My favorite was when I searched to see if AppSheet supported tooltips in forms. It listed the features of AppSheet, including the ability to display tooltips in forms, and it offered a reference link. The link was to a forum with the following, admittedly paraphrased, exchange: