|
|||
|
|
|||
|
Why is a formal digital library of algorithmic knowledge important to
critical infrastructure protection?
Here is one of many justifications suggested by the BAA. This presentation of it is based on four categories of assertion: A, statements that are self evident; B, basic facts that have been discovered some time ago in computer science; AB, more modern discoveries in computer science; C, conclusions from these facts. A: Algorithms, programs, processes and protocols are all examples of formal procedural knowledge. Because computers execute these procedures, systems of this kind of knowledge have become indispensable to modern society --- to defense, health, learning, and discovery. The nation's critical infrastructure includes a number of software systems, and elements of the physical infrastructure are controlled by software. Software systems are structured collections of algorithms. To promote reliability, the research community has long supported the creation of libraries of code and code modules (or classes) which can be carefully scrutinized, preserved and reused. At the level of individual algorithms, this practice connects well to the practice of presenting reference algorithms in the open literature. B1: Computer scientists and mathematicians since the Greeks have known that procedural knowledge is incomplete without corresponding factual knowledge. We know more about an algorithm than that it ran with certain results. Algorithms are produced in concert with factual and analytical knowledge that determines their design and justifies claims of the designers and programmers that the algorithms accomplish the tasks for which they are intended. This knowledge is closely related to computational and constructive mathematics. Not all of this critical knowledge is written or saved in any form, and even the elements which are carefully written can easily become lost or disconnected from the algorithms as they evolve. Consequently, further development of such algorithms is more likely to be erroneous. Collecting declarative knowledge along with the algorithms and linking it to the code is considered to be excellent professional practice by both the research community and industry. When code is collected into libraries, associated declarative knowledge should be included as well so that it can be scrutinized, criticized, improved, checked, and so that it will evolve with the code and be preserved with it. This practice is known to be effective. Moreover this declarative knowledge is frequently critical to understanding the algorithm, as we can tell from the way textbooks present algorithms. However, code libraries do not typically include the amount of detail given by textbooks, when in fact we believe that even more such knowledge is necessary to justify the code and explain it. AB2: Significant parts of the declarative knowledge documenting algorithms can now be formalized. Because computers can check and process this formal declarative knowledge, it has become included as a part of the code verification process. As computer security and software reliability become more important to government and industry, formal code documentation and its verification will increase. Libraries containing formal documentation and explanation will then become indispensable to society -- to defense, health, learning and discovery. The classification of knowledge into computer checked (formal) and noncomputer checked (informal) aids those who are responsible for locating flaws in reasoning that justifies algorithms in critical software systems. It is vastly less likely that errors occur in the computer checked knowledge. For subtle algorithms or tedious ones with many cases, experience has shown that computer assistance in the form of extended type checking, proof checking and model checking is essential to finding errors in reasoning. In the case of concurrent and distributed algorithms, even simple protocols can be so subtle and complex that it is very difficult to program them correctly without computer assistance in checking for errors or automatically creating arguments for correctness along with the code. AB3: Libraries of declarative knowledge require trustworthy mechanisms that account for long chains of logically connected evidence. Surprisingly little is known about these accounting mechanisms for any feasible means of providing sufficiently large and scalable collections of formal declarative knowledge. Even less is known about mechanisms that can account for the formal evidence produced by different verifiers with incompatible logics. Nevertheless, there is good reason to believe that there will always be several verifiers in use because there are several incomparable logics with which to carry out verification tasks. Just as there are many different programming languages and dialects, each one suited to a certain class of problems, there are also many different logical systems for verifying declarative knowledge, each well suited to a certain class of problems. Furthermore, research develops new approaches that are not simple modifications to old ones, and can be expected to continue that way. For example, linear logic might become a practical formalism, various mixtures of types, sets, and domains might fit into place in unforeseen ways, and we might use relevance logic to contain inconsistencies so that they do not spread beyond local effects. Formal logics are extremely precise; minor changes in a single rule can render the entire logic inconsistent. There are many points at which all the modern logics for theorem proving differ. Some allow empty types, some do not. Some depend on decidable type systems, others do not. Some allow dependent types, others do not. Some use the axiom of choice some do not, and among those that do, some use the Hilbert epsilon operator to state it, and others do not. Some logics allow full recursive types, others only restricted recursion. In some logics types are ordinary objects, in other logics they have a special limited status. C1: This ONR/MURI project is providing the required technical understanding needed to build large libraries of formal declarative knowledge in digital form and account for evidence archived in them. We are applying this understanding to building small sustainable examples, especially examples that support multiple distinct verifiers. C2: In addition to mechanisms to account for long chains of justifications, these libraries require standard services --- organizing, archiving, and searching. The formal character of this knowledge opens new possibilities for computer assistance in these tasks. We are exploring those possibilities and will test them on our examples. B4: Using computers to check and even generate declarative knowledge remains a difficult task. The rate at which formal knowledge can be generated depends on the power of the logical reasoning tools, called provers, and on the amount of formal knowledge available to the provers. It also depends on the number of trained personnel. This crude equation illustrates the relationship. verification_rate = prover_strength x size(knowledge base). Researchers have spent 30 years building powerful provers and sharing algorithms used in them, and they have spent no time making large collections of formal knowledge that can be shared. Government and industrial funding has helped create the powerful provers but there has been essentially no funding for the the knowledge base. ONR/OSD is a leader in this regard. As formal knowledge is shared and made available to provers, the verification task becomes faster and easier. Collecting this knowledge into libraries facilitates sharing, criticism, and improvement. The libraries will also contribute to the education of professionals able to use verification tools and produce formal content. C3: Our project creates mechanisms that allow logically sound sharing of formal knowledge. We are providing a basis for creating a vast collection of formal facts. It is plausible that once a critical mass of formal knowledge is assembled in a form allowing automatic sharing, a singularity will occur in the capacity of the research community to produce more. C4: If this singularity results in the rapid creation of formal knowledge about algorithms, the benefits to society will be enormous because that capability will enable a dramatic increase in the reliability and security of software, and it will free a large number of highly trained people to focus their efforts more productively. It is possible that a technology for the routine verification of formal knowledge will also dramatically alter the means of verifying and communicating precise knowledge of many kinds. The stability, accessibility and extensibility of libraries of formal knowledge, is key to harnessing the power of a community of developers who use results of the formal knowledge providers. Libraries of formalized knowledge form a basis for long-term collaboration between parties with differing interests and skills in the development of such knowledge. |
|