Our first outage from LLM-written code

skip0110@lemmy.zip · 6 months ago

deegeese@lemmy.dbzer0.com · edited 6 months ago

I gasped when I saw this:

A bit of discussion indicated that the trigger for the CPU spikes both times was our CEO logging in. We re-deployed to get a clean start, permanently banned him from the service, and moved on.

This is like finding a live grenade under your bed and putting it under the rug.

They found a way to reproduce a system-killing bug and, instead of taking the time to understand it, threw away their test case.

BlazeDaley@lemmy.world · 6 months ago

They contained the impact. Root causing or “understanding” should come after impact mitigation. If needed, find a safe way to reproduce the bug without customer impact.

We reverted the refactoring, deployed, un-banned the CEO, and set about analysis.

FizzyOrange@programming.dev · 6 months ago

Yeah, me too, but if you keep reading, they didn’t actually “move on” in the way it sounds.

Irdial@lemmy.sdf.org · 6 months ago

Well done. More and more companies are deploying LLM-written code in production environments. Might as well be honest about the results so we can learn what does and doesn’t work.

skip0110@lemmy.zip · 6 months ago

Why are we using refactoring tools that can’t parse the comments and code by their syntax?

spartanatreyu@programming.dev · 6 months ago

The first problem is they’re letting AI touch their code.

The second problem is they’re relying on a human to pick up changes in moved code while using git’s built-in diff tools. There’s a whole bunch of studies that show how git’s diff algorithms are terrible, and how swapping to newer diff algos improves things considerably.

TL;DR on the studies:

Only supporting add/remove/move operations is really bad.
Adding syntax awareness, so the tool can tell whether a difference in indentation should be brought to a reviewer’s attention, improves code and makes code reviews more accurate. (But this is hard because it’s language-dependent.)
Adding extra operations (indent/deindent/move/rename-symbol/comment/un-comment/etc.) makes code review easier, faster and more accurate. (But again, most of this requires syntax awareness.)
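
To make the moved-code point concrete, here is a rough sketch using only Python's standard difflib. It is not how git or any of the tools in those studies work; the function name `moved_blocks_with_edits`, the 0.6 similarity threshold, and the sample code are all made up for illustration. The idea: a plain line diff shows a moved block as one big delete plus one big add, so a one-word edit inside it is easy to miss. Pairing the deleted and added blocks and diffing them against each other surfaces only the lines that actually changed.

```python
import difflib


def moved_blocks_with_edits(old, new, threshold=0.6):
    """Pair deleted and inserted runs that look like the same code moved,
    and return only the lines that differ inside each pair.

    Hypothetical helper: the name and the threshold are assumptions.
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    deleted, inserted = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append(old[i1:i2])
        if tag in ("insert", "replace"):
            inserted.append(new[j1:j2])

    findings = []
    for gone in deleted:
        for added in inserted:
            # How similar is the removed block to the added block?
            ratio = difflib.SequenceMatcher(a=gone, b=added, autojunk=False).ratio()
            if ratio >= threshold and gone != added:
                # Keep only real changes, dropping unchanged and hint lines.
                changed = [
                    line for line in difflib.ndiff(gone, added)
                    if line[:2] in ("- ", "+ ")
                ]
                findings.append(changed)
    return findings


# Made-up sample: a() is moved below b() and c(), and one word changes on the way.
OLD = [
    "def a():",
    "    for x in xs:",
    "        if bad(x):",
    "            break",
    "        use(x)",
    "def b():",
    "    pass",
    "def c():",
    "    return 1",
]
NEW = [
    "def b():",
    "    pass",
    "def c():",
    "    return 1",
    "def a():",
    "    for x in xs:",
    "        if bad(x):",
    "            continue",
    "        use(x)",
]

for finding in moved_blocks_with_edits(OLD, NEW):
    print("\n".join(finding))
```

On this sample the reviewer would be shown just the `break` and `continue` lines instead of five red lines and five green ones. It is still purely line-based, so it knows nothing about syntax, which is the part the studies say matters most.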

There’s also a bunch of alternative diff algos you can use, but the best ones are paid, and the free ones have fewer features. See: