Time to go around commenting stuff yourself!
I just inherited my first codebase a few months ago. It’s like this everywhere and the original developer was fired, so what should be a simple fix sometimes turns into a full day of finding what needs to change. Any recommendations on fixing/maintaining code like this, or should I just make it the next person’s problem?
-
If it’s not in git or another VCS, add it as is. Create a “refactor” branch, and commit liberally
-
Treat it like a decompilation
Figure out what something does, and rename it (with a stupidly verbose name, if you have to). Use the IDE refactor tools to rename all instances of that identifier
-
Take a function, figure out what it does, and refactor it in a way that makes sense to you
-
Use the editor’s diff mode to compare duplicate code, extract anything that differs into a variable or callback, and combine the code into a single function call. VS Code’s “Select for Compare” and “Compare with Selected” are useful for this
-
Track what you’re doing / keep notes in something like Obsidian. You can use [[Wikilinks]] syntax to link between notes, which lets you build a graph structure using your notes as nodes
-
Be cognizant of “Side Effects”
For example, a function or property, or class might be invoked using Reflection, via a string literal (or even worse, a constructed string). And renaming it can cause a reflective invocation somewhere else random to fail
Or function or operator overloading/overriding doing something bizarre
Or two tightly coupled objects that mutate each other, and expect certain unstated invariants to be held (like, foo() can only be called once, or thingyA.len() must equal thingyB.len())
-
Write tests if you can, either using a testing framework or custom Python scripts
You can use these to more thoroughly compare behavior between the original and a refactor
- if something feels too “heavy”, like it’s doing xml formatting, file manips, a db insert, and making coffee, all in a single class or function
Separate out those “concerns”, into their own object/interface, and pass them into the class / function at invocation (Dependency Injection)
- use “if guards” and early returns to bail from a function, instead of wrapping the func body with an if
    public Value? Func(String arg) {
        if (arg.IsEmpty()) { return null; }
        if (this.Bar == null) { return null; }
        // ...
        return new Value();
    }

    // instead of

    public Value? Func(String arg) {
        if (!arg.IsEmpty()) {
            if (this.Bar != null) {
                // ...
                return new Value();
            }
        }
        return null;
    }
-
Add comments as you go
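To make the “write tests” and “compare behavior between the original and a refactor” points concrete, here is a minimal Python sketch. Everything in it is made up for illustration: `legacy_total`, `refactored_total`, `find_mismatches` and the sample inputs are hypothetical stand-ins for whatever you are actually untangling. The idea is only to run both versions on the same inputs and list where they disagree.

```python
def legacy_total(items):
    # Hypothetical "before" code: nested ifs wrapped around the real work.
    total = 0
    if items is not None:
        if len(items) > 0:
            for item in items:
                if item is not None:
                    total = total + item
    return total


def refactored_total(items):
    # Hypothetical "after" code: guard clause first, then the real work.
    if not items:
        return 0
    return sum(item for item in items if item is not None)


def find_mismatches(original, refactor, cases):
    """Run both versions on every case and collect any differing results."""
    mismatches = []
    for case in cases:
        expected = original(case)
        actual = refactor(case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


# Sample inputs are assumptions; in practice, collect real ones from logs or data.
CASES = [None, [], [1, 2, 3], [None, 5], [0], [-1, 1]]

if __name__ == "__main__":
    for case, expected, actual in find_mismatches(legacy_total, refactored_total, CASES):
        print(f"MISMATCH for {case!r}: legacy={expected!r} refactor={actual!r}")
```

An empty result only says the two versions agree on the cases you thought of, so keep adding inputs as you learn more about what the old code was relied on to do.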
longest file I have ever maintained contained 50,000 lines of code.
fifty THOUSAND.
forgive me for not weeping for 2000 lines.
my advice, don’t fucking touch it. pull out as much functionality out of it into other things over time.
there will come a day when you can throw it away. maybe not today, maybe not tomorrow… but some day.
Yeah, been there. The codebase I worked on also had a single method with 10k lines.
The database IDs were strings including the hostname of the machine that wrote to the DB. Since it was a centralized server, all IDs had the same hostname. The ID also included date and time accurate to the millisecond, and the table name itself.
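A rough Python sketch of why that scheme is shaky. The exact ID format here is my guess (the separator, field order and the `legacy_style_id` name are all assumptions, not the real code): with a single hostname and a fixed table name, the only thing that varies is the millisecond timestamp, so two writes in the same millisecond get the same ID.

```python
import uuid
from datetime import datetime


def legacy_style_id(hostname, table, when):
    """Hypothetical reconstruction: hostname + millisecond timestamp + table name."""
    stamp = when.strftime("%Y%m%d%H%M%S") + f"{when.microsecond // 1000:03d}"
    return f"{hostname}-{stamp}-{table}"


# Two inserts into the same table that land in the same millisecond.
same_instant = datetime(2024, 1, 1, 12, 0, 0, 123000)
first = legacy_style_id("dbhost", "orders", same_instant)
second = legacy_style_id("dbhost", "orders", same_instant)

print(first)                          # dbhost-20240101120000123-orders
print(first == second)                # True: the two rows would collide
print(uuid.uuid4() == uuid.uuid4())   # random UUIDs do not depend on time or host
```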
Me: Mom, can we have UUIDs?
Mom: We have UUIDs at home
UUIDs at home: that shit
You should add the local weather forecast, a random fun fact and the canteen menu of the day to the key to make it more interesting to read.
I literally told my boss that I was just going to rebuild the entire pipeline from the ground up when I took over the codebase. The legacy code is a massive pile of patchwork spaghetti that takes days just to track down where things are happening because someone, in their infinite wisdom, decided to just pass a dictionary around and add/remove shit from it so there is no actual way to find where or when anything is done.
I did this once
I was generating a large fake dataset that had to make sense in certain ways. I created a neat thing in C# where you could index a hashmap by the type of model it stored, and it would give you the collection storing that data.
This made obtaining resources for generation trivial
However, it made figuring out the order I needed to generate things in an effing nightmare
Of note, a lot of these resource “Pools” depended on other resource Pools, and oftentimes adding a new Pool dependency to a generator meant more time fiddling with the Pool standup code
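For what it’s worth, one way to tame the ordering problem is to declare the Pool dependencies explicitly and let a topological sort pick the generation order. This is not what the original C# did; it is a Python sketch of a different technique, and the model classes, `DEPENDS_ON` and `generation_order` are hypothetical names invented for the example. `graphlib.TopologicalSorter` is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical model types standing in for the real ones.
class Customer: ...
class Product: ...
class Order: ...


# Assumed declaration: each model type maps to the types whose pools
# must already be populated before it can be generated.
DEPENDS_ON = {
    Customer: set(),
    Product: set(),
    Order: {Customer, Product},
}


def generation_order(depends_on):
    """Return the model types ordered so that dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


# Pools indexed by model type, filled in dependency order.
pools = {}
for model in generation_order(DEPENDS_ON):
    pools[model] = [model()]  # placeholder for the real generator

print([model.__name__ for model in generation_order(DEPENDS_ON)])
```

Adding a new Pool dependency then means editing one entry in `DEPENDS_ON`, and a circular dependency raises `graphlib.CycleError` instead of failing somewhere deep in the standup code.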
FUCK. Triggers me. Just got let go from a place that had this problem and wouldn’t let me make any changes whatsoever. I didn’t even push hard.
The language is COBOL.

your paycheck is $5000 because you know COBOL
Pretty sure that knowing COBOL isn’t the hard part. It has relatively few language concepts.
This lack of language concepts just makes it difficult to reason about it, so that’s what you’re getting a paycheck for. Well, and possibly also because it might take months to have a new dev figure out your legacy codebase, so it’s cheaper to keep the current dev by paying them competitive prices.
Per day.
Not quite. More like per 40 hour week with no overtime, but my father insists on having up to 20 hours a week of overtime he’s allowed to burn, so it’s kinda like $7,500 a week. He generally gets paid biweekly or monthly. Subcontractor and all that BS
subcontractor
that’s why it’s $7500.
COBOL. That’s why he’s a subcontractor. Not like they taught COBOL in my CS courses in university in 1998-2002.
Even worse than there being no comments: the code is extensively commented, but its function has drifted from what the comments describe to the point where they are actively misleading.
The good old “signal left when switching to right lane.”
Ngl that’s like baby levels of nasty code. The real nasty shit is the stuff with pointless abstractions and call chains that make you question your sanity. Stuff that looks like its only purpose was to burn the clock and show off a niche language feature. Or even worse is when the project you inherit has decade-old dependencies that have all been forked and patched by the old team
If all I had to worry about was organization and naming I’d be over the moon
Former coworkers: “oh, these two lines are the same in function x and function y. TIME TO ABSTRACT”
Git commits with message saying “pushing changes” and there are over 50 files with unrelated code in it.
“fixed issue”
“Fix for critical issue.”
Followed by an equally large set of files in a commit with just the message:
“Fixup”
And then the actual fix turns out to be mixed in with “Start sprint 57 - AutoConfiguration Refactor” which follows “Fixup”
My favorite was an abstract class that called 3 levels in to other classes that then called another implementation of said abstract class.
And people wonder why no one on our team ever got shit done.
You got 3 letters?! Luck!
I worked at a Japanese company whose engineers were former NTT developers. Copypasta (i.e. not using functions), inefficient algos, single-letter var names, remote code execution from code as root, etc. Good times!
single-letter var names
in kanji?
The only experience I have like this is when I wanted to see how the ARMA Life mod was doing certain things, but it was programmed by like 20 different people in 3 different languages. Most of it was in German and French.
It was easier to just find my own way of doing what I wanted to do.
I was part of a project that scoffed at the idea of documenting code. Comments were also few and far between. In retrospect, it really seemed like they wanted to give off that elitist feel, because everything reeked of wanting to keep things under wraps despite everything being done out in the freakin’ open.
Hey! This was my first real job. It was Matlab code written by physicists who had just recently learned programming.
My first thought immediately was of academia also.
This was me when I started working with my current full time job.
What a nightmare.
Honest question: would an LLM be able to write useful comments in code like this?
use the LLM to generate regression tests for the large file, then start refactoring it
It would probably struggle to see the larger picture. I can see it being used to add comments in self-contained functions though without too much difficulty.
Honest question: would an LLM be able to write useful comments in code like this?
It can be better than nothing, but not really. The LLM faces the same challenge that any competent coder does: neither was present to learn the human, business and organizational context when the code was first written.
The next row would be “boss fires you thinking Claude can maintain the codebase.”
At least there’s a kind of happy ending when we walk past the old boss and don’t toss a dollar into his pan-handling hat.
Well, I’m the only maintainer for my project, so ha! (I only have myself to blame.)

That just means my boss will have to do all the work. Ha, what an idiot. Wait… aw. 🙁













