Decompiling Binary Code with Large Language Models

cm0002@lemdro.id · 2 months ago

sobchak@programming.dev · 2 months ago

If I understand the results tables in the repo correctly, their largest model achieves a ~68% re-executability rate on code compiled with the -O0 optimization flag. I'm unsure whether that only tests that the decompiled code can be recompiled and executed, or whether the programs also need to produce the same results on some test cases. If the model is used to refine Ghidra's output (I'm guessing this is some well-known decompilation framework), it can achieve a ~80% re-executability rate, which is better than Ghidra's baseline of ~34%.
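To make the two readings of the metric concrete, here is a minimal sketch of how they would be computed differently. It assumes the stricter reading, where a sample counts as re-executable only if it recompiles and also passes its test cases. The `Outcome` class and `rates` function are hypothetical names for illustration, not anything taken from the repo.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Result for one decompiled function (hypothetical structure)."""
    recompiled: bool    # the decompiled source compiled without errors
    tests_passed: bool  # the recompiled program passed all of its test cases


def rates(outcomes):
    """Return (recompilability, re-executability) as fractions of all samples."""
    total = len(outcomes)
    if total == 0:
        return 0.0, 0.0
    recompilable = sum(o.recompiled for o in outcomes)
    # Assumption: a sample must both recompile and pass its tests to count here.
    reexecutable = sum(o.recompiled and o.tests_passed for o in outcomes)
    return recompilable / total, reexecutable / total


sample = [
    Outcome(True, True),
    Outcome(True, False),   # compiles, but behaves differently from the original
    Outcome(False, False),  # does not compile at all
    Outcome(True, True),
]
recompilability, reexecutability = rates(sample)
```

Under the looser reading (it only has to compile and run), the second sample would also count, so the two numbers can differ quite a bit.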

GitHub - albertan017/LLM4Decompile: Reverse Engineering: Decompiling Binary Code with Large Language Models