Metamath's set.mm database encodes a rich slice of mathematical knowledge, yet weighs in at under 4 megabytes uncompressed. There are over 6000 theorems, so the average proof size is less than a kilobyte. Even without the "compressed proof format," the average proof is only about 25 steps, not counting the "mandatory hypotheses." (How to count the mandatory hypotheses is a fair question; some Metamath variants, including Norm's "solitaire" Java app, use unification to infer them. Ghilbert uses a more straightforward matching algorithm to avoid most, but not all, of the mandatory hypotheses, resulting in an expansion of about 2.5x in the total number of steps.)
Why are the proofs so short? Especially when comparing Metamath's design to that of other proof systems, the answer is far from clear. Metamath forces proofs to be explicit and detailed down to the tiniest primitive, while most other proof systems have considerable infrastructure to automate the more tedious tasks.
HOL, and its many related systems, bring an extraordinarily powerful tool to the task: a full programming language. In fact, ML (named for its "metalanguage" role) began life as the dedicated language within LCF (the ancestor of HOL), but has since become popular in its own right. An embedded Turing-complete language opens the door to approaching the Kolmogorov complexity of proofs - any data compression technique you can imagine can be coded up. The theoretical limit is almost nothing, since the code can simply search the space of all possible proofs shortest-first. Of course, in practice proofs do have to be more explicit, otherwise the time complexity of executing the code would be wholly impractical. But this theoretical argument does illustrate how HOL can trade CPU cycles for shorter proofs.
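(A minimal sketch of that theoretical argument, in Python. The verify callback is a hypothetical stand-in for a real proof checker; whatever the enumeration returns is by construction a shortest proof, found at an utterly impractical running time.)

    import itertools

    def shortest_proof(labels, verify, max_len=6):
        # Try every sequence of step labels in order of increasing length
        # and return the first one the (hypothetical) checker accepts.
        for length in range(1, max_len + 1):
            for candidate in itertools.product(labels, repeat=length):
                if verify(candidate):
                    return list(candidate)
        return None  # nothing found within the length bound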
Mizar, and quite a few related systems, use a different approach - the verifier embeds a small but fairly powerful theorem prover. Thus, a proof only need contain the "major steps," assuming that this verifier can automatically prove all the tedious details in between the major steps. For example, for a simple induction you'd need to state the invariant, but the prover may well be able to prove the base case and the induction step by itself.
Yet Metamath, with no such sophistication, manages proof sizes that are barely larger than the statements of the theorems. Metamath doesn't have built-in primitives even for such things as substitution, well-typedness, or proving that a variable is not free in a term; you need to prove these step by step.
To make matters worse, Metamath does not explicitly manage the assumptions in deduction-style proofs. You have to carry these through the proof yourself, with the result of each proof step explicitly represented as an implication from the assumptions to the conclusion. Norm, in his discussion of the deduction theorem, states that applying the standard deduction theorem to a Metamath proof results in a worst-case exponential explosion in proof size.
I've been thinking recently about tools for optimizing proofs - in other words, a translation that takes a proof file as input and produces a new proof file that proves exactly the same theorems, but is (hopefully) shorter. I have an idea for a general technique, similar in principle to zlib compression, which I believe explains in part how Metamath can achieve such short proofs.
To review zlib very briefly: the compressed stream basically contains two kinds of commands. The first just emits a sequence of literal (uncompressed) bytes. The second "plays back" a sequence of bytes that has already occurred in the uncompressed stream, specified by an offset (how many bytes back to start) and a length. For natural-language text, it's easy to see why zlib is so effective - these "play back" commands can capture words and even short phrases. While there are slightly better compression algorithms (such as bzip2), zlib is still plenty good enough to have become an industry standard.
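(A toy illustration of the "play back" idea - not the real zlib/DEFLATE format - showing a decoder for a stream of literal and copy commands.)

    def lz_decode(commands):
        # Each command is either ('literal', bytes) or ('copy', offset, length),
        # where a copy replays 'length' bytes starting 'offset' bytes back
        # in the output produced so far.
        out = bytearray()
        for cmd in commands:
            if cmd[0] == 'literal':
                out += cmd[1]
            else:
                _, offset, length = cmd
                start = len(out) - offset
                for i in range(length):      # byte by byte, so copies may overlap
                    out.append(out[start + i])
        return bytes(out)

    # Decodes to b'the cat and the hat'; the second "the " and the "at"
    # are played back rather than stored literally.
    print(lz_decode([('literal', b'the cat and '),
                     ('copy', 12, 4),
                     ('literal', b'h'),
                     ('copy', 12, 2)]))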
I believe a similar basic technique can be applied to Metamath (or Ghilbert) proofs. Whenever a sequence of proof steps is repeated more than once in a proof file, you can usually factor it out into its own theorem, then replace each occurrence with a reference to that theorem. I also believe that such a technique can be automated, using methods similar to zlib's for recognizing repeated sequences, and that this could prove especially useful in automated translations from other systems such as HOL.
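(A rough sketch of the detection half of this, in Python. The label sequence below is made up for illustration; a real optimizer would also have to verify that a repeated run is stack-balanced, i.e. forms a self-contained subproof, before extracting it into its own theorem.)

    from collections import Counter

    def repeated_runs(proof, min_len=3, min_count=2):
        # Count every contiguous run of step labels of length >= min_len
        # and report those occurring at least min_count times.
        counts = Counter()
        for length in range(min_len, len(proof) + 1):
            for i in range(len(proof) - length + 1):
                counts[tuple(proof[i:i + length])] += 1
        return {run: n for run, n in counts.items() if n >= min_count}

    # Toy label sequence (not a real proof): the run ('ax-mp', 'syl', 'imp')
    # occurs twice and is a candidate for extraction into a new theorem.
    proof = ['ax-1', 'ax-mp', 'syl', 'imp', 'ax-2', 'ax-mp', 'syl', 'imp']
    print(repeated_runs(proof))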
(A more detailed discussion could go here, as well as several examples, but I wanted to get the basic idea "on paper".)
--raph 20-Dec-2005
Answer: Metamath proofs are so short because it uses Condensed Detachment? The RPN-formatted proofs feed the stack-based proof verification engine, so that each step of a proof consists of just a reference (a label) to a hypothesis or assertion. The stack is arranged so that all the operands needed by each proof step - which are like function calls - are present on the stack in exactly the order the engine needs them. Each proof step generates a formula on the top of the stack, so a proof need consist of only labels, which renders each proof maximally concise; basically, the justification-reference column of a normal hand-written proof is the only thing needed internally in a Metamath proof. That is my idea, fwiw.
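(A stripped-down illustration of that stack discipline, in Python. The toy "database" below is hypothetical and skips substitution entirely, which is where a real verifier does most of its work; the point is only that the proof itself is nothing but a list of labels.)

    # Each label maps to (number of stack operands consumed, formula produced).
    database = {
        'hyp1':  (0, 'ph'),
        'hyp2':  (0, '( ph -> ps )'),
        'ax-mp': (2, 'ps'),          # modus ponens consumes two stack entries
    }

    def replay(proof_labels):
        stack = []
        for label in proof_labels:
            argc, result = database[label]
            del stack[len(stack) - argc:]   # pop the operands
            stack.append(result)            # push the conclusion
            print(label, '->', stack)
        return stack

    replay(['hyp1', 'hyp2', 'ax-mp'])       # the proof is just the labels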
Regarding your compression idea, I think it is valid. I have coded the LZW algorithm and experimented with it, and it seems very similar to what you are proposing. However, architecturally speaking, I am not wildly enthusiastic about the idea, for the same reason that I am not enthusiastic about Metamath's compressed proof format: the additional complications need to be justified and weighed. For example, I have considered adding to mmj2 the capability to read "zip"-compressed .mm files. Zip I/O subroutines are commonly available in libraries, so no complicated hand-coding is necessary, and most OSes can process zipped files, so pretty much everyone already has the tools to decompress a file and inspect its contents visually. A zip-compressed .mm file achieves compression levels similar to Metamath's compressed proof format and eliminates the added complexity inside the application code.
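(For instance, reading a hypothetical zip-compressed database takes only a few lines against a standard library - mmj2, being Java, would presumably use java.util.zip; the Python equivalent below just shows how little code is involved. The file names are made up.)

    import zipfile

    # e.g. an archive created beforehand with:  zip set.mm.zip set.mm
    with zipfile.ZipFile('set.mm.zip') as archive:
        with archive.open('set.mm') as f:
            text = f.read().decode('ascii')   # .mm databases are plain ASCII
    print(len(text), 'characters of uncompressed database')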
--ocat 20-Dec-2005
Just to be clear: I am not advocating any additional infrastructure within Metamath or friends. I am, instead, saying that techniques similar to zlib compression can be used with the existing Metamath-family languages to make proofs short - especially Ghilbert, in which definitions can be made freely and safely. You needn't bother actually coding such a compression/optimization algorithm as long as you're only dealing with hand-written proofs, but as soon as you start automatically generating proofs, you will probably need some such technique to keep the size of the proofs from blowing up.
The two reasons you cite - the power of condensed detachment and the fact that each proof step is encoded by a single identifier - also help keep proofs short, but the conciseness of Metamath proofs is a remarkable enough property that I feel it needs to be well understood.
--raph 20-Dec-2005
One thing I have found surprising is that set.mm is able to get as far as it does without the standard deduction (meta)theorem. (Well, maybe I shouldn't be surprised, because Principia Mathematica was written before the deduction theorem was discovered and therefore also doesn't make use of it.) In addition to the weak deduction theorem, there are a number of other special techniques scattered throughout set.mm to convert deductions to theorems when the w.d.t. doesn't apply. Some day I should collect them. Most recently, I added abidhb, which provides a neat trick for converting bound-variable hypothesis builders from inference form to theorem form, as in the conversion of hbfv to hbfvd.
Possibly what has happened historically is that once the deduction theorem became popular, people stopped looking at alternate techniques. They either became lost art or remained undiscovered because no one cared anymore. For me, that's part of what has made developing set.mm fun - mathematicians of the past never bothered to look for these techniques, so they are out there waiting to be discovered.
Perhaps the theory of calculational proofs could provide some clues. Here is an excerpt from Lifschitz's On Calculational Proofs <http://www.cs.utexas.edu/users/vl/papers/calc.ps> (mentioned on the metamathCalculationalProofs page):
In set.mm, some rules related to those used by calculational proofs, most notably the bitr* and (or,an,im,…)bi* stuff, arose naturally as part of its development. Curiously, the associative law for the biconditional, biass, seems to serve no purpose other than to provide completeness, and it doesn't even appear in Principia Mathematica.
--norm 21-Dec-2005