I’ll give an example. At my previous company there was a program where you basically select a start date, select an end date, select the system and press a button, and it reaches out to a database and pulls all the data that matches those parameters. The horrors of this were:
1. The queries were hard coded.
2. They were stored in a configuration file, in XML format.
3. The queries were not 1 entry. It was 4: a start, the part between start date and end date, the part between end date and system, and then the end part. All of these were then concatenated in the program, intermixed with variables.
4. This was then sent to the server as pure SQL, no ORM.
5. Here’s my favorite part. You obviously don’t want anyone modifying the configuration file, so they encrypted it. Now I know what you’re thinking: at some point you probably will need to modify or add to the configuration, so you store an unencrypted version in a secure location. Nope! The program had the ability to encrypt and decrypt, but there were no visible buttons to access those functions. The program was written in WinForms. You had to open the program in Visual Studio and manually expand the size of the window (locked size in regular use), and that shows the buttons. Now run the program in debug. Press the decrypt button. DO NOT EXIT THE PROGRAM! Edit the file in a text editor. Save file. Press the encrypt button. Copy the encrypted file to any other location on your computer. Close the program. Manually email the encrypted file to anybody using the file.
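For contrast, here is a minimal sketch of the same date-range lookup as one parameterized query instead of four concatenated fragments. The `readings` table, its columns, and the `fetch_rows` helper are made up for illustration; the original program's schema and database are not known.

```python
import sqlite3

# Hypothetical schema: the original program's tables are unknown.
QUERY = """
    SELECT system, recorded_on, value
    FROM readings
    WHERE recorded_on BETWEEN ? AND ?
      AND system = ?
    ORDER BY recorded_on
"""


def fetch_rows(conn, start_date, end_date, system):
    """Run one parameterized query; the driver binds the three values."""
    return conn.execute(QUERY, (start_date, end_date, system)).fetchall()


def demo():
    """Build a tiny in-memory database and run the lookup against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (system TEXT, recorded_on TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [
            ("alpha", "2024-01-05", 1.0),
            ("alpha", "2024-02-10", 2.0),
            ("beta", "2024-01-20", 3.0),
        ],
    )
    return fetch_rows(conn, "2024-01-01", "2024-01-31", "alpha")
```

The query lives in one piece, and the start date, end date, and system are passed as bound values, so nothing has to be spliced into the SQL text.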
A data ingestion service processing ~15 billion logs each day was duplicating each of those logs 2-4 times in memory as part of the filtering logic. There was no particular reason or need to do it. When I profiled the system, it was BY FAR the largest hog of CPU and memory.
The engineer who wrote it once argued with me about writing comparisons a == b vs b == a because one was technically more efficient … in a language we weren’t using.
Our CFO’s social security number, contact info, and just about everything you’d need to impersonate them inside a random shell script that was being passed around like drugs at a party for anyone to use. Oh and it had an API key to our payments processor hard coded into it.
That was the tip of the iceberg of how bad the systems were at the company. All of these are from the same company:
- A fintech based company with no billing team
- An event system that didn’t event
- A permissions system that didn’t administer permissions
- A local cache for authentication sessions. Which means that requests would intermittently fail auth because the session was only on one replica. If you hit any of the other ones, you’d get an unauthenticated error
- A metrics collection system that silently lost 90% of its data
- Constant outages due to poorly designed and implemented systems (and lack of metrics… hmmm)
- Everything when I joined was a single gigantic monolith that was so poorly implemented they had to run at least 3 different versions of it in different modes to serve different use cases (why the fuck did you make it a monolith then?!)
- The subscriptions system was something like 20 or 30 database tables. And they were polymorphic. No one could touch the system without it breaking or that person declaring failure, which leads me to …
- A database schema with over 350 tables, many of which were join tables that should have been on the original table (fuck you scala/java for the limitations to the number of fields you can have in a case class). Yes you read that right. Table A joined to table B just to fill in some extra data that was 1:1 with table A. Repeat that a few dozen times
- History tables. Not separate from the original table, but a table that contained the entire history of a given piece of data. The worst example was with those extraneous join tables I just mentioned. If you went and changed a toggle from true to false to true to false, you’d have 4 records in the same table. One for each of those small changes. You’d have to constantly try to figure out what the ‘latest’ version of the data was. Now try joining 5 tables together, all of them in this pattern.
- Scala… I could go on a tirade about how bad scala is but needless to say, how many different error handling mechanisms are there? Scala decided to mix all of them together in a blender and use them all together. Scala is just two white paper languages in a trenchcoat. Never use it in a production system
- A dashboard for “specialists” that was so easy to overwhelm that you could do it by breathing on it due to the LACK of events that it needed
- Passwords stored in plain text (admittedly this was in the systems of the company we acquired while I was there). Doesn’t matter if they were actually <insert algorithm here>, they were visible in a dashboard accessible by employees. Might as well have been plain text
- A payments system that leaked its state into a huge part of the rest of the system. The system ended up being bifurcated across two systems, and I was brought in to try to clean up some of the mess after only a couple of months. I desperately tried to get some help because I couldn’t do it solo. They ended up giving me the worst engineer I’ve ever worked with in my 15-year career, and I’ve seen some bad engineers. Looking back, I’m reasonably confident he was shoving our codebase into an AI system (before it was approved/secured, so who knows who had access) and not capable of making changes himself. I could make several posts about this system on its own
- I could go on but I’ll cut it off there
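To make the history-table pain concrete, here is a rough sketch of what "find the latest version" looks like for just one such table. The `settings_history` table and its columns are invented for the example, not taken from the real schema.

```python
import sqlite3


def latest_rows(conn):
    """Return the newest row per setting_id from a hypothetical history table."""
    return conn.execute(
        """
        SELECT h.setting_id, h.version, h.enabled
        FROM settings_history AS h
        JOIN (
            SELECT setting_id, MAX(version) AS version
            FROM settings_history
            GROUP BY setting_id
        ) AS newest
          ON newest.setting_id = h.setting_id
         AND newest.version = h.version
        ORDER BY h.setting_id
        """
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE settings_history "
        "(setting_id INTEGER, version INTEGER, enabled INTEGER)"
    )
    # One toggle flipped true -> false -> true -> false leaves four rows.
    conn.executemany(
        "INSERT INTO settings_history VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 0), (1, 3, 1), (1, 4, 0), (2, 1, 1)],
    )
    return latest_rows(conn)
```

Every table in this pattern needs its own "newest row" subquery before it can be joined to anything, which is why a five-table join gets ugly fast.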
Not mine, but svn-based JDSL is the best related story that’s always worth sharing.
Floats for currency in a payments platform.
The system will happily take a transaction for $121.765, and every so often there’s a dispute because one report ran it through round() and another through floor().
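A small sketch of how that dispute comes about, and one common way to avoid it: parse amounts as decimals, reject sub-cent precision at the door, and store integer cents. The helper names `to_cents` and `rounded` are made up for this example.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_cents(amount: str) -> int:
    """Parse a currency string into integer cents, rejecting sub-cent precision."""
    value = Decimal(amount)
    if value != value.quantize(CENT):
        raise ValueError(f"{amount} has more than two decimal places")
    return int(value * 100)


def rounded(amount: str, mode) -> Decimal:
    """Round an amount to whole cents using an explicit rounding mode."""
    return Decimal(amount).quantize(CENT, rounding=mode)


# The two reports disagree by a cent on the same transaction:
half_up = rounded("121.765", ROUND_HALF_UP)  # Decimal("121.77")
floored = rounded("121.765", ROUND_FLOOR)    # Decimal("121.76")
```

If the amount is validated once when the transaction is accepted, no report ever has to pick a rounding mode.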
This might require a bit of background knowledge about Power Query in Excel and Power BI, specifically the concept of Query Folding.
Power Query is a tool to define and run queries against a host of data sources and spit out tabular data for use in Excel (as tables) or Power BI (as a Tabular Data Model). The selling point of it is the low-code graphical presentation: You transform the data by adding steps to the query, mostly through the menu ribbon. Change a column type? Click the column header > Data Type > select the new type. Perform a join? Click “Merge Queries”, select the second query, select the respective key column(s) to join on and the join type – no typing needed. You get a nested table column from which you can then select which columns to expand or aggregate.
Each step provides you with a preview of the results, and you can look at, edit, delete or insert earlier steps at will. You can also edit individual steps or the whole query through a code editor, but the appeal is obviously that even non-programmers can use it without needing to code.
Of course, it’s most efficient to have SQL transformations done by the database server already. But Power Query can do that too: “Query Folding” is the feature that automatically turns a sequence of Power Query steps into native SQL. A sequence like “Source, Select Columns, Filter Rows, Rename Columns” will quite neatly be converted into the SQL equivalent you’d expect. Merges will become Join, appending tables becomes Union, converting a text to uppercase becomes UPPER and so on.
If at some point there is a step it can’t fold, it will use a native query to load the data up to that point, then do the rest in-memory. Even if later steps were foldable, they’ll have to be done in-memory. You can guess that this creates a lot of potential for optimising longer queries by ensuring as much of it as possible is folded and that the result is as “small” as possible – as few rows and columns as feasible etc.
Now, when I tell you that there is a table in one of our sources with a few large text columns you almost never need, you may be able to smell the smoke already. A colleague of mine needed help with his queries being slow to load. He had copied some code from Stack Overflow or what have you that joins a query with itself multiple times to resolve hierarchies. In theory, it was supposed to be foldable, provided the step it runs off of is. The general schema of my colleague’s query went Data Source -> non-foldable type conversion -> copied code -> filtering (ultimately keeping about 20% of rows) -> renaming columns -> removing columns. Want to guess which columns were loaded, processed with each join, explicitly renamed and only then finally understood to be useless and discarded?
“I always do the filtering last, don’t want to miss anything.”
This is your regularly scheduled reminder that MS (and our corporate BI team) can present Power Query as a self-service data transformation tool all it wants, that still doesn’t mean it’s actually designed for use by non-data techies.
Disclaimer: this is not really about code, but about using IT in my non-IT workplace and I realized this just yesterday. A rant.
I work in the social sector. Our boss seems to have slipped into position sideways (they did not do our work for a significant amount of time before).
I got zero onboarding when I started working there; everything I know about the organisational ins and outs I learned by asking my colleagues.
The boss seems to actively want to not inform me of things, i.e. even if I ask about something they reply in the most cursory manner or immediately refer me to somebody else. I have no idea why they do it, my guess is that they sense that they’re woefully inadequate for the job, plus me being much older triggers insecurities?
For example, when I could not log into an app to see my future shifts, I asked the boss about it first but they immediately referred me to tech support. Calling them, after a while we found out that the boss had mistyped my name. Then I could log in.
Last week I was sick and waited until Sunday noon to check this week’s shifts - but again I couldn’t log in. The boss answered neither phone nor email. Fair enough I guess, on a Sunday. Thankfully tech support was working and after a long while we found out that the app for checking my shifts only allows log-ins from within the workplace network, not the open web.
I almost missed my Monday shift because of that. Boss calls me, enraged. I explained the situation. They clearly did not know that the app only allows log-ins from within the workplace network.
All my colleagues tentatively/silently agree that this boss is useless. How do we keep the workplace running, and why is it me who is left in the dark? Turns out they have a Whatsapp group. I don’t use Whatsapp. They asked me repeatedly and urgently to join.
tl;dr: this workplace would fall apart if people didn’t communicate through Whatsapp instead of official channels
Shadow IT shit lol
A page that handled call requests. It was a table showing some information about the person, the case it’s related to and some other fields. It fetched everything from any table it touched. So the call was fetching all the information about the person. The case it was related to. The person who was assigned to the case, and since the case was linked a couple of layers in, all of that data as well.
I created a simple view that only fetched the data it needed. It went from over A GIGABYTE of data to less than 25 MB transferred to the web UI.
One time, I had to request firewall access for a machine we were deploying to, and they had an Excel sheet to fill in your request. Not great, I figured, but whatever.
Then I asked who to send the Excel file to and they told me to open a pull request against a Git repo.
And then, with full pride, the guy tells me that they have an Ansible script, which reads the Excel files during deployment and rolls out the firewall rules as specified.
In effect, this meant:
- Of course, I had specified the values in the wrong format. It was just plaintext fields in that Excel, with no hint as to how to format them.
- We did have to go back and forth a few times, because their deployment would fail from the wrong format.
- Every time I changed something, they had to check that I’m not giving myself overly broad access. And because it’s an Excel, they can’t really look at the diff. Every time, they have to open it and then maybe use the Excel version history to know what changed? I have no idea how they actually made that workable.
Yeah, the whole time I was thinking, please just let me edit an Ansible inventory file instead. I get that they have non-technical users, but believe it or not, it does not actually make it simpler, if you expose the same technical fields in a spreadsheet and then still use a pull request workflow and everything…
The corporate world runs on Excel, never the best option, but everyone knows it so…
Yep; I’ve seen Excel files that are like 10MB because it’s a database in Excel
Try a few gigabytes. I worked on-site IT support for a year, and we had to max out memory on a workstation because the company database was an Excel file of about 3GB. It took minutes to open and barely worked, crashing frequently.
So, this is completely off topic, but some of the comments here reminded me of it:
An elderly family friend was spending a lot of her time using Photoshop to make whimsy collages and stuff to give as gifts to friends and family.
I discovered that when she wanted to add text to an image, she would type it out in Microsoft Word, print it, scan the printed page, then overlay the resulting image over the background with a 50% opacity.
I showed her the type tool in Photoshop and it blew her mind.
I am simultaneously horrified that she didn’t do any research to see if she could insert text into the image and incredibly impressed at her problem solving skills. Honestly, the more I think about it, the more I lean towards impressed; good on her!
Haha that’s so dumb. She could’ve just taken a screenshot!
> I showed her the type tool in Photoshop and it blew her mind.
Or well. That.
Photoshop is amazing. That said you kinda need to take a course in it to use 80% of the functionality.
Aw really wholesome actually. Some libraries in my area have senior friendly editing classes, I think it’s becoming more popular. Good looking out for them!
Had a coding firm costing 1k+ euros, which was unfamiliar with Django, select all() from the DB just to cast that into a list each time a user opened the tool. That got real funny real fast when the customer started adding the announced 50k objects per day. They did that buried in about 50-60 API endpoints, conveniently coded by hand instead of using the generic API endpoints available from Django REST framework.
When the loading times hit 50s per click, the company took the money and ran. My colleagues and I spent 2 years and half that to fix that shit.
All about PTC’s God awful piece of shit PLM/PDM systems IntraLink and PDMlink. I cannot believe the amount of trash code that company uses. And they get paid millions to basically screw the customers over. The customer’s CAD gets intertwined in a huge heap of automated HTML garbage. This leads to a total disaster.
1. Take from index 10 of the buffer, AND it with some hard-coded hex value.
2. Bit shift it by a hard-coded amount of 2.
3. Do the first two steps, but with a different hard-coded index, hex value, and bit shift.
4. OR the two results.
5. Shove the result back into a buffer.
All of this is one line with no commenting or references to what the fuck this process comes from or why it is applicable. Then there was a second copy of the line, but with different hard-coded values.
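For what it's worth, the same kind of operation stops being scary once the magic numbers get names. This is only a sketch: every index, mask, and shift below is an invented placeholder (the description only pins down index 10 and a shift of 2), and the shift directions are assumptions.

```python
# All indexes, masks, and shifts are invented placeholders; the original
# values and what they meant are unknown.
FLAGS_INDEX = 10    # byte holding the first field
FLAGS_MASK = 0x0F   # keep the low four bits
FLAGS_SHIFT = 2     # move them up two positions (direction assumed)

MODE_INDEX = 11     # byte holding the second field
MODE_MASK = 0xC0    # keep the top two bits
MODE_SHIFT = 6      # move them down to the bottom (direction assumed)

OUTPUT_INDEX = 12   # where the combined byte is stored


def pack_fields(buffer: bytearray) -> None:
    """Mask and shift two fields, OR them together, store the result in place."""
    flags = (buffer[FLAGS_INDEX] & FLAGS_MASK) << FLAGS_SHIFT
    mode = (buffer[MODE_INDEX] & MODE_MASK) >> MODE_SHIFT
    buffer[OUTPUT_INDEX] = (flags | mode) & 0xFF


example = bytearray(16)
example[FLAGS_INDEX] = 0xAB
example[MODE_INDEX] = 0xC5
pack_fields(example)
```

Same five steps, but a reader can at least tell which byte is which and where to look when a value changes.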
    // Here be dragons
    // Call Darren before changing
    // Darren quit 2 years ago good luck
    // - PJ 2015
Ok that is truly horrid…
Can you say what the point of it actually was?
Oh, I’ve seen some doozies… The one I remember the most, and I’ve seen this twice, is this:
    myClass.TheProperty = myClass.TheProperty;
When I asked about it, the developer said that, well yes, because it reads from one place and sets in another! Not at all difficult to read!
I found code that calculated a single column in an HTML table. It was “last record created on”.
The algorithm was basically:
    foreach account group
        foreach account in each account group
            foreach record in account.records
                if record.date > maxdate
                    maxdate = record.date
It basically loaded every database record (the basic unit of record in this DATA COLLECTION SYSTEM) to find the newest one.
Customers couldn’t understand why the page took a minute to load.
It was easily replaced with a SQL query to get the max and it dropped down to a few ms.
The code was so hilariously stupid I left it commented out in the code so future developers could understand who built what they are maintaining.
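A minimal sketch of the kind of replacement described, letting the database compute the maximum instead of looping over every record. The `records` table and `created_on` column are assumptions for the example; the real schema is not given.

```python
import sqlite3


def last_record_created_on(conn):
    """Ask the database for the newest date instead of loading every row."""
    (newest,) = conn.execute("SELECT MAX(created_on) FROM records").fetchone()
    return newest


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (account_id INTEGER, created_on TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [(1, "2023-11-02"), (1, "2024-03-15"), (2, "2024-01-09")],
    )
    return last_record_created_on(conn)
```

If the column was meant to be per account or per account group, the same idea works with a `GROUP BY` on the grouping column; either way only one small result crosses the wire.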
My current favorite is in ruby with the unless keyword:
    tax = 0.00
    unless not_taxed(billing)
      tax = billing.zipcode.blank? ? estimated_tax_from_ip(account) : billing.tax
      tax = (tax.nil? ? 0.00 : tax)
    end
To me, anything payments related you want to be really super clear as to what you’re doing because the consequences of getting it wrong are your income. Instead we have this abomination of a double negative, several turnaries, and no comments.
Hm. Needs to be unrolled into early returns and have some unit tests strapped tight around it
FYI, an operator with three arguments (such as ?:) is called ternary. The word is related to tertiary, if that helps remembering it.
Correct, and since there are multiple instances I’m using a plural form, and fighting autocorrect at the same time.
I know you were using multiple instances, but I wasn’t sure if that was a typo, auto cow wrecked or genuinely not knowing.
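Back on the tax snippet: here is roughly what the early-return version suggested above could look like, sketched in Python rather than Ruby. The `Billing` dataclass, the `taxed` flag (standing in for the double-negative `not_taxed`), and passing the IP estimate in as a callable are all assumptions for illustration, and `Decimal` is swapped in for the float `0.00`.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Optional

ZERO = Decimal("0.00")


@dataclass
class Billing:
    # Hypothetical stand-in for the Ruby billing object.
    zipcode: Optional[str]
    tax: Optional[Decimal]
    taxed: bool = True  # positive flag instead of not_taxed(billing)


def compute_tax(
    billing: Billing,
    estimate_from_ip: Callable[[], Optional[Decimal]],
) -> Decimal:
    """Return the tax to charge, one decision per early return."""
    if not billing.taxed:
        return ZERO
    if not (billing.zipcode or "").strip():
        # No usable zipcode on file: fall back to the IP-based estimate.
        estimate = estimate_from_ip()
        return ZERO if estimate is None else estimate
    return ZERO if billing.tax is None else billing.tax
```

Each branch is now small enough to pin down with a unit test of its own: untaxed, no zipcode with and without an estimate, and zipcode with and without a stored tax.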



