If I understand the results tables on repo correctly, their largest model achieves ~68% re-executability rate on code compiled with the q0 optimization flag. I’m unsure if that just tests if the decompiled code can be recompiled and executed, or if the programs need to produce the same result on some test cases. If the model is used to refine Ghidra outputs (I’m guessing this is some well-known decompilation framework) it can be used to achieve ~80% re-executability rate, which is better than Ghidra’s baseline of ~34%.
If I understand the results tables on repo correctly, their largest model achieves ~68% re-executability rate on code compiled with the q0 optimization flag. I’m unsure if that just tests if the decompiled code can be recompiled and executed, or if the programs need to produce the same result on some test cases. If the model is used to refine Ghidra outputs (I’m guessing this is some well-known decompilation framework) it can be used to achieve ~80% re-executability rate, which is better than Ghidra’s baseline of ~34%.