AsteroidMeta: mmj2ProofAssistantConsiderations

A few untutored, possibly false-to-fact thoughts about the hypothetical new mmj2 Proof Assistant, planned for completion by September 2006.

Of interest:

Dick Grune and Ceriel J.H. Jacobs, "Parsing Techniques - A Practical Guide"

http://www.cs.vu.nl/~dick/PTAPG.html

See pages 60-61 "2.10 A METAPHORICAL COMPARISON OF GRAMMAR TYPES"

"The set of all Pascal programs that would terminate in finite time when run with a given input can be generated by an unrestricted phrase structure grammar. Such a grammar would, however, be very complicated, even in van Wijngaarden form, since it would incorporate detailed descriptions of the Pascal library routines and the Pascal run-time system.

The set of all Pascal programs that solve a given problem (for instance, play chess) cannot be generated by a grammar (although the description of the set is finite)."

A Metamath proof is in actuality, a program, that when fed into the stack-based, push-down automaton Proof Verification Engine, generates the sequence of symbols that is the object of the proof.

Given that the final label in a Metamath proof must yield a single formula on the Proof Work Stack that matches the $p statement's formula, it is known in advance of Proof Verification exactly what must be on the Proof Work Stack after the next to last label is processed by the Proof Verification Engine: "4. If the label refers to an assertion with hypotheses, a (unique) substitution must exist that, when made to the mandatory hypotheses of the referenced assertion, causes them to match the most recent entries of the Proof Work Stack." and "6. The same substitutions are made to the referenced assertion, and the result is pushed onto the Proof Work Stack." (from C:/mmj2/doc/mmjProofVerification.html in mmj2.zip).

We also know that if the grammar is unambiguous, the final label must refer to a formula whose grammatical parse tree root node matches the parse tree root node of the formula being proved – OR that the formula can provide such a root node via substitution into a variable hypothesis (ex. if the statement being proved is of the form "a → b" then either the final label's formula must be the form "a → b" or of the form "a".) This assumes that every statement can be grammatically parsed, which is not a requirement of the Metamath specification (see miu.mm!)

Now, unless the "proof" is a trivial one-liner that "proves" its hypothesis, the final statement label must refer to an Assertion, it cannot refer to an Hypothesis. Thus, deriving the non-trivial proof of Statement "X" using backwards reasoning means using abductive reasoning (or enlightened guessing) to figure out which label is the final label. Once that is determined, what remains is a simple exercise in deriving the necessary stack contents that will await the final proof step. And, since we can generate the final step of a proof, generating these earlier stack entries is also doable :) Ha.

Last year Norm Megill applied his "improve" command to the contents of set.mm, and surprise(!), Metamath.exe generated shorter proofs than the hand-coded entries for quite a few of the theorems! But, if memory serves, those were all in the propositional logic section of the set.mm database.

As a practical matter, generating an arbitrary missing proof step of an arbitrary proof involving predicate logic is not doable (the program might get lucky, but mathematically, the number of possible RPN label strings to try out is excessive.)

What does seem reasonable is saving the person doing the proof work the trouble of having to memorize the many thousands of Metamath labels (that's just in set.mm). If the Proof Assistant could handle a column of formulas with step justifications consisting of only the previous step numbers to be used as hypotheses for the current step's assertion (and nothing if the step is an hypothesis), then the Proof Assistant ought to be able to figure out which label is required – and how to arrange this information into an RPN sequence of labels constituting a valid proof step (a Metamath proof is like assembler, with lots of syntax and variable hypothesis labels interspersed with the "real" labels, the logical hypotheses and assertions.)

The reasoning behind this is that a proof can be represented as a tree with 'n' children, where n is the number of hypotheses in the statement's mandatory frame (variable and logical hypotheses). Each child node is a sub-proof, with recursive considerations that apply. The structure of a Metamath proof tree is identical to the structure of a Metamath statement parse tree, except that the proof is a parse using some extremely high order grammar that generates grammatically correct and true statements in the logical system being studied :)

--ocat

mmj2ProofAssistantUnification draft! --ocat 8-Nov-2005