RSS Bot to Lobste.rs · English · 6 days ago

Towards Real-World Industrial-Scale Verification: LLM-Driven Theorem Proving on seL4

arxiv.org

Formal methods (FM) are reliable but costly to apply, often requiring years of expert effort in industrial-scale projects such as seL4, especially for theorem proving. Recent advances in large language models (LLMs) have made automated theorem proving increasingly feasible. However, most prior work focuses on mathematics-oriented benchmarks such as miniF2F, with limited evaluation on real-world verification projects. The few studies that consider industrial-scale verification mostly rely on closed-source models with hundreds of billions of parameters, which cannot be locally deployed and incur substantial usage costs. In this paper, we propose AutoReal, an LLM-driven theorem proving method for real-world industrial-scale systems with support for lightweight local deployment. We evaluate AutoReal on the seL4-Isabelle verification project as a representative and challenging case study. AutoReal incorporates two key improvements: (1) chain-of-thought (CoT)-based proof training, which teaches the LLM the reasoning behind proof steps and enables step-wise explanations alongside proofs, and (2) context augmentation, which leverages proof context from the project to enhance LLM-driven proving. Based on the AutoReal methodology, we fine-tune a base model to obtain AutoReal-Prover, a compact 7B-scale prover for industrial-scale theorem proving. AutoReal-Prover achieves a 51.67% proof success rate on 660 theorems from seL4-designated Important Theories across all 10 seL4 proof categories, substantially outperforming prior attempts on seL4 (27.06%). To evaluate generalization, we further apply AutoReal-Prover to three security-related projects from the Archive of Formal Proofs (AFP), covering all 451 theorems and achieving a proof success rate of 53.88%. Overall, this work advances the application of LLM-driven theorem proving in real-world industrial-scale verification.

