When we speak about a verification system we mean software that can check arguments for correctness against specific sets of axioms and inference rules. The system can also produce justification showing what fact depends on what others. A basic criterion for such a system is that there is a simple program (about 300 lines) that can check the primitive proofs, i.e., those proofs that invoke only the axioms and inference rules but no decision procedures. This simple program can be written in many languages and proved correct by hand. (Sometimes this is part of the definition of a foundational system.) This defines the highest standard of rigor. There is the opportunity for foundational provers to verify the code that implements the basic checking for other such provers.
There are other logical tools such as decision procedures and model checkers and type checkers that are not foundational. They are pieces of code that perform logical operations, but they might be difficult to understand and verify. It is important to know that these methods ``conform to a logical theory.''
As an illustration, consider the well-known SupInf algorithm that has been widely used in theorem provers or the Arith procedure used in Nuprl. These procedures are used in theorem provers. Decision procedures are complex and a potential source of unsoundness. When they are used, the prover is not foundational unless there is a reduction. Surprizingly there are no formal proofs of correctness for these procedures and worse, all of the published informal proofs are either incorrect or grossly incomplete -- critical cases are missing.
There are two basic ways to use decision procedures soundly. We say that a procedure is run in safe mode if it generates primitive proofs (those that do not invoke decision procedures). It is run in unsafe mode if it just decides whether or not there is a proof, but does not exhibit one. One method of using these procedures is (a) to first run them in unsafe mode, which is fast, and later, off-line run them in safe mode. HOL does this. Another approach, (b) is to formally verify the procedure (without using it in the verification) and possibly bind it by reflection [1].
We propose to explore capability (b) more thoroughly. It serves a dual
purpose of being a test case of verification by techniques of the community.
It might allow the establishment of a new standard. We need to know that it can be done.