AsteroidMeta: GhilbertVsMetamathPart1

The Ghilbert design, though somewhat sketchy, is interesting both as it relates to Metamath, and as a fount of ideas for PlanetMath's HDM project.

What follows is a rough, unproven draft copy with only cursory review by the Ghilbert designer. Conclusions drawn may be erroneous. The objective is to work through the design document: "Design of the Ghilbert proof format", dated 21 Oct 2003 from www.ghilbert.org.

This is only Part I. Part II is now complete, but parts III and IV may never be written. A lot depends on fate :) And frankly, I like Metamath. I like the idea of students and amateur, non- academics being able to create their own quirky, crufty, ambiguous grammars, making grand mistakes and learning from them. Metamath really just needs a GUI to make it more approachable. Even just a more powerful Proof Assistant might do the job!

But maybe Ghilbert is the way to go? After all these years, will we get rid of our faithful pickup truck with 200,000 miles that still always starts the first time, trading it for a fancy new sports car? Hmmm.

Metamath vs. Ghilbert - PART I

*1. SYNTAX*

Ghilbert statements are written using a command name followed by an "sexp" (s-expression). Ghilbert tokens are 7-bit ASCII codes between 33 and 126 inclusive, separated by whitespace. (Note: the Ghilbert design document states that commands are written entirely on a single line, but that is out of date).

Metamath statements are keyword based, can range across several lines, and are also composed of 7-bit ASCII codes.

    Ghilbert
    ========
    # comment: A blank line is also considered a comment:
    
    stmt ( ax-1 ( ) ( ) ( -> ph ( -> ps ph ) ) )
 
    Metamath
    ========
    $( comment: blah-blah.
       A Metamath comment can be embedded between
       any two tokens, except inside a comment. $)
    ax-1 $a |- ( ph -> ( ps -> ph ) ) $.

NOTE: From the standpoint of readability, it is an essential feature that Ghilbert statements are allowed to be broken up and extend across multiple lines, hence this change in the Ghilbert design. Embedded comments are nice too. Maximum line lengths and line delimiters may vary from one operating system to the next, so an argument can be made that line terminators should not be a feature of the syntax.

*2. GRAMMATICAL TYPES*

Grammatical Type Codes are explicitly declared in Ghilbert using the "kind" statement. Each Kind must be unique within a file's Kind Namespace, which is used to validate Kinds used on other Ghilbert statements.

(Note from raph - I am making a distinction between "type" and "kind", which is fairly subtle. The concept of "kind" is entirely syntactic, and every term and variable has its kind immediately apparent. The concept of "type" is more relevant in typed axiomatic frameworks such as HOL than in essentially untyped axiomatic frameworks such as ZFC set theory and second order arithmetic. The fact that a value has a particular type is a wff. For example, all values in my HOL axiomatization have kind "val" but may have different types. The constant "0" could have type "nat", for example, meaning that (: (0) (nat)) is a theorem, while "+" has type "nat * nat → nat". Both "0" and "+" have kind "val". In any given universe, the number of kinds will be small (a handful), but in a typed logic such as HOL the space of types is rich and large.)

[Reply by ocat – I think I understand. What I am referring to as a "Type" or "Grammatical Type" (I changed the heading above), is what Ghilbert refers to as a "kind", and that you chose the term "kind" so that you could avoid muddying the semantic waters of "mathematical types" in the object languages to be expressed in Ghilbert versus the syntactic/grammatical kinds used to define those items in Ghilbert source statements. So…henceforth, the term "type" will be avoided like the plague unless we're talking about HOL or category theory or whatever…it is too late to go back and change mmj2 now though :) So in that context I will try to use "grammatical type" or "Type Code" in some cases, and avoid more confusion.]

In Metamath a "kind" is merely a declared constant that appears in the first symbol position of a formula. There is no Metamath restriction on using that constant in other symbol positions as a plain old constant. Though mmj2's Grammar checker does include a "dual-use" Grammatical Type/constant prohibition, that prohibition appears to be unnecessary in Ghilbert because of the syntax of Ghilbert's "term" statement.

    Ghilbert
    ========
    kind ( wff )
    
    Metamath
    ========
    $c wff $.

(Note from raph - see also Metamath's restrictions on the $f statement. While you could overload a constant as both a "kind" and for other uses, an $f statement accepts only the former use. The "var" statement in Ghilbert corresponds very closely to $f in Metamath, so the concept of "kind" may be more similar than immediately apparent.)

[Replay by ocat – mmj2 doesn't handle meta-metamath statements or grammars such as "$a |- |- x → y $." or "$p |- ph → wff", where "|-" and "wff" are used as grammatical Types. The mmj2 restriction has to do with the parsing algorithm more than anything else – Ghilbert makes the issue disappear with "kind" and "var", so bravo! Yay.]

*3. USER TERMS/NOTATIONS/GRAMMAR RULES*

Ghilbert defines grammatical "terms" using the "term" statement. It makes no distinction between a "wff" and a "term", they are both "terms".

The "term" statement defines a grammatical replacement rule for a "kind". While it differs from the Metamath "$a" command's use as a grammar builder in several important ways but is very similar overall:

- A Ghilbert term is uniquely identified by its Term Name which is unique within the file's term/variable joint namespace. Interestingly, a Ghilbert term does not have a "label". Term name serves as the label in proof steps in addition to its normal use in formulas (sexp's).

- Another difference between Ghilbert and Metamath notation definitions is that Ghilbert doesn't bother with variable names in term definitions, and simply specifies a sequence of Kinds. This tidies up a loose end in Metamath, which does not explicitly require its syntax axiom variables to have different names even though it relies upon that condition for ordering its proof work stack (mmj2 does check for different variable names in syntax axioms).

- "arity" can be 0, 1, …, n in Ghilbert as in Metamath, but Null terms are (apparently) not supported in Ghilbert. And, it appears that operator overloading is unsupported (the list of kinds associated with a unique term name is fixed). Thus, automatic Ghilbert/Metamath conversions will not always be perfect because Metamath supports a larger class of grammars.

(Note from raph - any overloading or variable-arity Metamath grammar items will need to be converted into distinct Ghilbert term names. The class of grammars is exactly the same, it's just that in Ghilbert the disambiguation is always explicit in the term names, rather than implicit.)

[Reply by ocat - Metamath allows ambiguous grammars, which I don't think is true for Ghilbert.]

    Ghilbert
    ========
    # a 0-ary term, term name = "E"
    term ( class ( E ) )
 
    # 1-ary term, term name = "-."
    term ( wff ( -. wff ) )
     
    # 2-ary term, term name "->" 
    term ( wff ( -> wff wff ) )
     
    # Type conversion, from kind "set" to "class"
    term ( class ( cv set ) )
 
    # nulls permitted error?! yes, no term name.
    term ( wff ( ) ) 
     
    # operator overloading error? yes, dup term name.
    term ( N ( + N N ) )
    term ( I ( + I I ) )
        
    Metamath
    ========
    $( 0-ary -- AKA "named, typed-constant" -- "E" $)
    cE $a class E $. 
 
    $( 1-ary, label = wn $)
    wn $a wff -. ph $. 
 
    $( 2-ary, label = "wi" $)
    wi $a wff ( ph -> ps ) $.
 
    $( Type conversion from type "set" to class
       where "x" is a variable with type "set": $)
    cv $a class x $.
     
    $( operator overloading in Metamath $)
    plusN $a N ( x + y ) $.
    plusC $a I ( i + j ) $.
     
    $( nulls permitted declaration ok in Metamath$)
    wNull $a wff $.

(Note from raph - Metamath permits waaaay too much stuff. It relies entirely on the discipline and good sense of the theorem author to avoid ambiguity, inconsistent definitions, etc. The pitfalls are not always obvious. For example, the notation $a class A ! $. for factorial would create a grammatical ambiguity, which is why the notation ( ! ` A ) is used instead.)

[Reply by ocat - I have documented numerous ambiguous grammars (see mmj2.zip) and hope to write code in mmj2 to detect those. I think Ghilbert is wise to steer clear of ambiguity, right from the start, no matter how interesting and useful ambiguities may be in certain contexts (politicians wouldn't be able to talk if we disallowed them ambiguity.]

*4. VARIABLES*

Variables are declared using the "var" statement in Ghilbert, which is like Metamath's "$v" statement except that it specifies a Kind in addition to the variable name.

The "var" statement is a like combination of Metamath's $v and $f statements, but unlike Metamath:

- a variable's Type/Kind is fixed.

- a variable is always "active" (in scope) within the file in which it is defined (from the standpoint of grammatical parsing, this difference really simplifies things!)

- variable name is (must be) unique within the file's term/variable joint namespace.

Another interesting difference between Ghilbert and Metamath is that there is no such thing as a variable hypothesis, or its label. A variable's name is used to refer to the variable both in subsequent formulas and in proofs.

    Ghilbert
    ========
    var ( wff ph ph ) # multiple vars in single var are ok
     
    Metamath
    ========
    $v ph ps $. $( multiple vars in single $v are ok $)
    wph $f wff ph $.
    wps $f wff ps $.

*5. The "Truth" Type Code*

Metamath employs a Type Code on every formula, including Logical Hypotheses and Logical Assertions, Axiomatic and Provable. Typically, the Type Code used is "|-" Type Code, which when applied to a "wff"-ish expression (a logical expression), is construed to mean "It is provable that…" or "It is true that…".

The "|-" Type Code distinguishes "logical" statements from notation statements ("term" statements in Ghilbert terminlogy.)

In Ghilbert the "|-" Type Code is implied by the location of a formula. In the Ghilbert Proof Verifier (gh_verify.py), as logical expressions (logical hypotheses, plus "stat", "ax" and "thm") are pushed onto the stack, type code "|-" (which is hard-coded) is prefixed to the expression. Thus, "( → ph ph )" is loaded onto the stack as "|- ( → ph ps )".

While it is true that the Ghilbert Proof Verifier does not double-check to ensure that "|-" is applied only to "wff"-ish expressions, neither does Metamath.exe. (Bluntly, the user is free to code non-sensical logic statements, such as "It is provable that DOG".)

[In mmj2 the user specifies the specific logic Type Codes corresponding to "|-" and "wff" so that grammatical parsing can proceed rigorously; Ghilbert's formulas are pre-parsed sexp's and the question does not arise.]

(Note from raph - I consider |- to be an internal artifact of the proof language, not a fundamental property. Metamath uses the same concept (represented by the use of $a) for both syntactic well-formedness and logical provability, hence the need to distinguish the two. In Ghilbert, there are two different mechanisms to express these two concepts. It is true that gh_verify uses the string "|-" as a pseudo kind name, but that is strictly an implementation artifact and could easily be removed.)

{Note from ocat – "|-" is an artifact, of course, but its genesis is relevant, especially with respect to Ghilbert's decision to treat wff's and terms as "term"s. Metamath makes no distinction – is agnostic – about symbols, whether they are logical symbols or "functions". The kind (or Grammatical Type Code" of a formula in a theorem, axiom, or logical hypothesis) ought to be "wff" (or else the wff-like kind that maps to "|-" or the equivalent, where both "wff" and "|-" are arbitrary, perhaps set by the user.) In other words, it would not make sense to have a "theorem", "3 + 5", because "3 + 5" is not a logical assertion. Enderton makes the distinction in "A Mathematical Introduction to Logic" in 2.1, p. 72: "An expression is any finite sequence of symbols. Of course most expressions are nonsensical, but there are certain interesting expressions: the terms and the wffs." The reason "|-" is handy on the proof stack is that it establishes the provenance of an expression, and acts as a guarantee that the prover is not (incorrectly) substituting a grammatical production where an axiom, theorem or logical hypothesis is required.}

*6. Begin/End Scope Statements*

Metamath's Begin Scope statement, "${" and End Scope statement, "$}" have no equivalent in Ghilbert.

In Ghilbert, every Assertion (via "stat" or "thm") has its own personalized lists, possibly empty, of Disjoint Variable Restrictions and Logical Hypotheses. This simplifies things quite a bit, especially since every Ghilbert variable is defined in a global namespace with an associated Kind, and there is no need to re-declare variable hypotheses in an Assertion's Mandatory or Optional Frames.

Also, since Ghilbert has "modules" (described later), there is no urgent need to use scope declarations so that the variable namespace can be reused within a file.

In practice, for example in Metamath's set.mm, 99% of variables are defined globally anyway, which actually makes the file more readable and consistent (no guessing about what "x" means, for example.) And, guesstimating again, 99% of the time in set.mm, a scope is defined for an Assertion if and only if the Assertion has Disjoint Variable restrictions and/or Logical Hypotheses. So taking away the Begin/End Scope statements appears to be an elegant simplification.

Here is a prototypical Metamath scope from set.mm (expanded proof format version, "expset.mm") followed by the same mathematical object expressed in Ghilbert:

    Metamath
    ========
    ${ 
        $d x ps $.
        a4b.1 $e |- ( x = y -> ( ph -> ps ) ) $.
        $( A weaker version of ~ a4a . $)
        a4b $p |- ( A. x ph -> ps ) $=
          wph wps vx vy wps vx ax-17 a4b.1 a4a $.
          $( [5-Aug-93] $)
    $}
 
    Ghilbert
    ========
    thm (a4b ((ps x)) 
      ((a4b.1 (-> (= (cv x) (cv y)) (-> ph ps))))
      (-> (A. x ph) ps)
      (ps x ax-17 a4b.1 a4a
    ))
 Notes:

a) As you can see, Ghilbert does away with scope blocks and puts the $d's and $e's inside the Assertion!

b) The Ghilbert example's expressions "cv x" and "cv y" invoke type conversion from "set" to "class" because Ghilbert does not allow operator overloading – the term "= class class" is defined, but "= set set" is not.

c) The Ghilbert proof is much shorter. Curious? The mystery will be explained later!

--ocat

Continue to GhilbertVsMetamathPart2