Trustworthy Formal Natural Language Specifications

Gordon, Colin S., Matskevich, Sergey

Onward! Papers, October 2023, doi: 10.1145/3622758.3622890

Abstract

Interactive proof assistants are computer programs carefully constructed to check a human-designed proof of a mathematical claim with high confidence in the implementation. However, this only validates truth of a formal claim, which may have been mistranslated from a claim made in natural language. This is especially problematic when using proof assistants to formally verify the correctness of software with respect to a natural language specification. The translation from informal to formal remains a challenging, time-consuming process that is difficult to audit for correctness.

This paper shows that it is possible to build support for natural language specifications within existing proof assistants, in a way that complements the principles used to establish trust and auditability in proof assistants themselves. We implement a means to provide specifications in English, and have them automatically translated into formal claims, entirely within the Lean proof assistant. Our approach is extensible (placing no artificial restrictions on grammatical structure), modular (allowing information about new words to be distributed alongside libraries), and produces proof certificates explaining how each word was interpreted and how the sentence’s structure was used to compute the meaning.

We apply our prototype to the translation of various English descriptions of formal specifications from a popular textbook into Lean formalizations; all can be translated correctly with a modest lexicon with only minor modifications related to lexicon size.

Bibtex

@inproceedings{onward23,
  author = {Gordon, Colin S. and Matskevich, Sergey},
  title = {Trustworthy Formal Natural Language Specifications},
  abbr = {Onward!},
  booktitle = {Onward! Papers},
  year = {2023},
  month = {October},
  youtube = {https://www.youtube.com/watch?v=wXruK8xD1ZE},
  arxiv = {2310.03885},
  url = {https://arxiv.org/abs/2310.03885},
  doi = {10.1145/3622758.3622890},
  acm = {https://dl.acm.org/doi/10.1145/3622758.3622890},
  bibtex_show = {true},
  abstract = {
  Interactive proof assistants are computer programs carefully constructed to check a human-designed proof of a mathematical claim with high confidence in the implementation. However, this only validates truth of a formal claim, which may have been mistranslated from a claim made in natural language. This is especially problematic when using proof assistants to formally verify the correctness of software with respect to a natural language specification. The translation from informal to formal remains a challenging, time-consuming process that is difficult to audit for correctness.

This paper shows that it is possible to build support for natural language specifications within existing proof assistants, in a way that complements the principles used to establish trust and auditability in proof assistants themselves. We implement a means to provide specifications in English, and have them automatically translated into formal claims, entirely within the Lean proof assistant. Our approach is extensible (placing no artificial restrictions on grammatical structure), modular (allowing information about new words to be distributed alongside libraries), and produces proof certificates explaining how each word was interpreted and how the sentence’s structure was used to compute the meaning.

We apply our prototype to the translation of various English descriptions of formal specifications from a popular textbook into Lean formalizations; all can be translated correctly with a modest lexicon with only minor modifications related to lexicon size.
  },
}