public class Grammar extends java.lang.Object implements SyntaxVerifier
The first thing to note is that Grammar builds a "grammar" -- a set of replacement rules -- from a Metamath ".mm" file/database. The distinction made between "building" and extracting what is already there is important because Metamath's set.mm database uses "overloaded functions", such as "weq" and "wceq", which are technically "ambiguous" (two valid parse trees exist.)
These "non-essential" ambiguities are resolved during grammar generation by a process of "combinatorial explosion", which creates a Notation Rule for every permutation of input parameters, and then priorities those rules -- first come, first served, dropping any duplicates. Assuming that "weq" precedes "wceq" in the database, "set = set" will always parse to "weq" and the other variations such as "set = class", "class = set" will parse to "wceq".
The second important topic is that Grammar imposes new validation rules on a Metamath .mm file/database. But these rules never come into force unless a user desires to use Grammar to perform parsing (aka "syntactical analysis".) A user with a non-Grammar compliant database can still use mmj to perform Proof Verification and to print reports.
The additional mmj validation rules are not extremely restrictive, and in fact, the entire mmj system is designed to operate on set.mm with no changes to any default parameters or settings. Let me give examples.
The Metamath symbol "|-" termed "It is provable that..." is the default in mmj, and it is given a special name, the "provableLogicStmtTyp". If your .mm file uses a different symbol, you must tell mmj (via mmj.util.RunParmFile or a constructor parameter.) Moreover, if your .mm file encodes a new logical theory involving TWO provableLogicStmtTyp codes, you're going to be hit with an error, and Grammar will refuse to continue. In addition, your provableLogicStmtTyp code cannot be the Type Code of any variables in your system -- you cannot define a Syntax Axiom for it, and you thus, cannot do a Meta-Metamath theory using that approach (not directly anyway.) There may be an important need for two provableLogicStmtTyp codes in a single .mm file, but my imagination is limited, and so was my time. Another restriction is using a grammatical Type Code as a constant in an expression, for example, like this: "xyz $p |- ( wff -> ph ) $.", where "wff" is a Type Code for "well-formed formula" variables. See ConsolidatedListOfGrammarValidations.html for more info.
One excuse I must make is that the Grammar Rule Trees and Forests (mmj.verify.GRForest.java and mmj.verify.GRNode.java) were part of the BottomUpParser.java debacle. They are still used for detecting duplicate NotationRules and for printing out the rules, but if I were starting from scratch I would not use them.
Finally, mmj.verify.GrammarAmbiguity.java is mostly a stub -- incomplete. There is a lot of work needed to detect and report on grammatical ambiguities, we've just scratched the surface. In fact, according to a famous parsing techniques expert, developing a generalized program able to process any context-free grammar and decide whether it is unambiguous is mathematically impossible. So that is definitely an opportunity for achievement, right there. It will take a bit longer than initially expected :)
Modifier and Type | Field and Description |
---|---|
java.util.Map<java.lang.String,Stmt> |
stmtTbl |
java.util.Map<java.lang.String,Sym> |
symTbl |
Constructor and Description |
---|
Grammar()
Constructor using all default values for doCompleteStmtAmbiguityEdits,
doCompleteGrammarAmbiguityEdits, provableLogicStmtTyp Code (ie "|-") and
logicStmtTyp Code (ie "wff").
|
Grammar(java.lang.String[] provableLogicStmtTypCodes,
java.lang.String[] logicStmtTypCodes)
Constructor using default doCompleteStmtAmbiguityEdits and
doCompleteGrammarAmbiguityEdits values.
|
Grammar(java.lang.String[] provableLogicStmtTypCodes,
java.lang.String[] logicStmtTypCodes,
boolean doCompleteGrammarAmbiguityEdits)
Constructor using default doCompleteStmtAmbiguityEdits value.
|
Grammar(java.lang.String[] provableLogicStmtTypCodes,
java.lang.String[] logicStmtTypCodes,
boolean doCompleteGrammarAmbiguityEdits,
boolean doCompleteStmtAmbiguityEdits,
java.lang.Class<? extends GrammaticalParser> parserPrototype)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
int |
assignNextGrammarRuleNbr() |
void |
derivedRuleQueueAdd(GrammarRule gR)
Add rule to derivedRuleQueue.
|
void |
derivedRuleQueueInit()
Initialize derivedRuleQueue.
|
GrammarRule |
derivedRuleQueueRead()
Read GrammarRule from derivedRuleQueue.
|
boolean |
getGrammarInitialized()
Return grammarInitialized flag.
|
Cnst[] |
getLogicStmtTypArray()
Return Array of Logic Stmt Type Codes (like "wff").
|
Messages |
getMessages() |
int |
getNotationGRGimmeMatchCnt()
Get Count of NotationRules deemed to be "gimme matches".
|
java.util.Set<NotationRule> |
getNotationGRSet()
Return NotationRule Set.
|
java.util.List<NullsPermittedRule> |
getNullsPermittedGRList()
Return NullsPermittedRule List.
|
java.util.Set<Cnst> |
getNullsPermittedTypSet()
Return NullsPermittedRule Type Code Set.
|
Cnst[] |
getProvableLogicStmtTypArray()
Return Array of Provable Logic Stmt Type Codes (like "|-").
|
int |
getSymTblSize()
Return size of Symbol Table.
|
java.util.Set<Cnst> |
getSyntaxAxiomTypSet()
Return Syntax Axiom Type Code Set.
|
java.util.List<TypeConversionRule> |
getTypeConversionGRList()
Return TypeConversionRule List.
|
java.util.Set<Cnst> |
getVarHypTypSet()
Return VarHyp Type Code Set.
|
void |
incNotationGRGimmeMatchCnt()
Increment (add 1) to Count to NotationRules deemed to be "gimme matches".
|
boolean |
initializeGrammar(Messages messages,
java.util.Map<java.lang.String,Sym> symTblParam,
java.util.Map<java.lang.String,Stmt> stmtTblParam)
Initializes the grammar.
|
void |
notationGRSetAdd(NotationRule notationRule)
Add NotationRule to Notation Rule Set.
|
int |
nullsPermittedGRListAdd(NullsPermittedRule r)
Add NullsPermittedRule to List.
|
void |
parseAllFormulas(Messages messages,
java.util.Map<java.lang.String,Sym> symTblParam,
java.util.Map<java.lang.String,Stmt> stmtTblParam)
Parse all Statement Formulas and update stmtTbl with results.
|
ParseTree |
parseFormula(Messages messages,
java.util.Map<java.lang.String,Sym> symTbl,
java.util.Map<java.lang.String,Stmt> stmtTbl,
Formula formula,
VarHyp[] varHypArray,
int highestSeq,
Stmt defaultStmt)
Parse a single formula.
|
ParseTree |
parseFormulaWithoutSafetyNet(Formula formula,
Hyp[] hypArray,
int highestSeq)
Alternate access to parse algorithm for use in Proof Assistant with no
pre-check of Grammar initialization or error messages.
|
ParseTree |
parseOneStmt(Messages messages,
java.util.Map<java.lang.String,Sym> symTbl,
java.util.Map<java.lang.String,Stmt> stmtTbl,
Stmt stmt)
Parse a single Statement.
|
void |
setDerivedRuleQueue(java.util.PriorityQueue<GrammarRule> p)
Set derivedRuleQueue.
|
void |
setGrammarInitializedFalse()
Set grammarInitialized flag to false.
|
void |
setNotationGRGimmeMatchCnt(int cnt)
Set Count of NotationRules deemed to be "gimme matches".
|
void |
setParserPrototype(java.lang.Class<? extends GrammaticalParser> proto) |
void |
setStore(SessionStore store) |
int |
typeConversionGRListAdd(TypeConversionRule typeConversionRule)
Add TypeConversionRule to List.
|
public java.util.Map<java.lang.String,Sym> symTbl
public java.util.Map<java.lang.String,Stmt> stmtTbl
public Grammar(java.lang.String[] provableLogicStmtTypCodes, java.lang.String[] logicStmtTypCodes, boolean doCompleteGrammarAmbiguityEdits, boolean doCompleteStmtAmbiguityEdits, java.lang.Class<? extends GrammaticalParser> parserPrototype) throws VerifyException
provableLogicStmtTypCodes
- lists Type Codes used on Theorems
involving Logical Statements thus identifying all other Type
Codes as Syntax Type Codes -- if array is empty or null, the
default is "|-".logicStmtTypCodes
- - lists Type Codes used on statements of logic
(such as "wff" but not "set", etc) -- if array is empty or
null, the default is "wff".doCompleteGrammarAmbiguityEdits
- Thorough (attempt to) prove
grammar is not Ambiguous. If Ambiguity is found, an error
message is generated (hopefully listing a formula with two
different grammatical parsings in RPN format.) These edits are
redundant unless the grammar or software have changed since
the last edit; and they may be long and tedious, which is why
this option is provided. The default is True.doCompleteStmtAmbiguityEdits
- Check each non-Grammar Stmt being
parsed to see if there are two different, valid parse trees --
meaning that the Stmt is ambiguous, an error. Note that this
option will often require much more time due to generation of
many duplicate parse trees (not an error).parserPrototype
- A pluggable parser implementationVerifyException
- If validation of provableLogicStmtTypCodes and
logicStmtTypCodes failspublic Grammar(java.lang.String[] provableLogicStmtTypCodes, java.lang.String[] logicStmtTypCodes, boolean doCompleteGrammarAmbiguityEdits) throws VerifyException
provableLogicStmtTypCodes
- lists Type Codes used on Theorems
involving Logical Statements thus identifying all other Type
Codes as Syntax Type Codes -- if array is empty or null, the
default is "|-".logicStmtTypCodes
- - lists Type Codes used on statements of logic
(such as "wff" but not "set", etc) -- if array is empty or
null, the default is "wff".doCompleteGrammarAmbiguityEdits
- Thorough (attempt to) prove
grammar is not Ambiguous. If Ambiguity is found, an error
message is generated (hopefully listing a formula with two
different grammatical parsings in RPN format.) These edits are
redundant unless the grammar or software have changed since
the last edit; and they may be long and tedious, which is why
this option is provided. The default is True.VerifyException
- If validation of provableLogicStmtTypCodes and
logicStmtTypCodes failspublic Grammar(java.lang.String[] provableLogicStmtTypCodes, java.lang.String[] logicStmtTypCodes) throws VerifyException
provableLogicStmtTypCodes
- lists Type Codes used on Theorems
involving Logical Statements thus identifying all other Type
Codes as Syntax Type Codes -- if array is empty or null, the
default is "|-".logicStmtTypCodes
- - lists Type Codes used on statements of logic
(such as "wff" but not "set", etc) -- if array is empty or
null, the default is "wff".VerifyException
- If validation of provableLogicStmtTypCodes and
logicStmtTypCodes failspublic Grammar() throws VerifyException
VerifyException
- If validation of provableLogicStmtTypCodes and
logicStmtTypCodes failspublic int assignNextGrammarRuleNbr()
public void setStore(SessionStore store)
public boolean getGrammarInitialized()
public void setGrammarInitializedFalse()
public void derivedRuleQueueInit()
public void setDerivedRuleQueue(java.util.PriorityQueue<GrammarRule> p)
p
- PriorityQueue.public void derivedRuleQueueAdd(GrammarRule gR)
gR
- GrammarRule to add.public GrammarRule derivedRuleQueueRead()
public int getSymTblSize()
public void notationGRSetAdd(NotationRule notationRule)
notationRule
- to add to set.public java.util.Set<NotationRule> getNotationGRSet()
public void incNotationGRGimmeMatchCnt()
public void setNotationGRGimmeMatchCnt(int cnt)
cnt
- Count of NotationRules deemed to be "gimme matches".public int getNotationGRGimmeMatchCnt()
public int typeConversionGRListAdd(TypeConversionRule typeConversionRule)
typeConversionRule
- to be added.public java.util.List<TypeConversionRule> getTypeConversionGRList()
public int nullsPermittedGRListAdd(NullsPermittedRule r)
r
- NullsPermittedRule to be added.public java.util.List<NullsPermittedRule> getNullsPermittedGRList()
public java.util.Set<Cnst> getNullsPermittedTypSet()
This is the set of Type Codes for which Nulls are permitted.
public java.util.Set<Cnst> getVarHypTypSet()
public java.util.Set<Cnst> getSyntaxAxiomTypSet()
This is the set of Type Codes for Syntax Axioms are defined.
public Cnst[] getLogicStmtTypArray()
At this time, the array will have exactly one element.
public Cnst[] getProvableLogicStmtTypArray()
public Messages getMessages()
public ParseTree parseOneStmt(Messages messages, java.util.Map<java.lang.String,Sym> symTbl, java.util.Map<java.lang.String,Stmt> stmtTbl, Stmt stmt)
If used with VarHyp or SyntaxAxiom, simply returns stmt.getExprParseTree().
Note: access to symTbl and stmtTbl is required in case the grammar needs to be re-initialized.
parseOneStmt
in interface SyntaxVerifier
messages
- Messages object for error/info messages.symTbl
- Symbol Table (Map).stmtTbl
- Statement Table (Map).stmt
- Stmt in stmtTbl to parse.public ParseTree parseFormula(Messages messages, java.util.Map<java.lang.String,Sym> symTbl, java.util.Map<java.lang.String,Stmt> stmtTbl, Formula formula, VarHyp[] varHypArray, int highestSeq, Stmt defaultStmt)
Note: access to symTbl and stmtTbl is required in case the grammar needs to be re-initialized.
Note: highestSeq is similar to the restriction in proof verification: a proof can only refer to previous statements. However, setting this to java.lang.Integer.MAX_VALUE says, parse this formula with the entire grammar -- which ought to result in the same parse, unless the grammar is ambiguous (new Grammar Rules should be "disjoint" from previous ones.)
parseFormula
in interface SyntaxVerifier
messages
- Messages object for error/info messages.symTbl
- Symbol Table (Map).stmtTbl
- Statement Table (Map).formula
- Formula to parse.varHypArray
- VarHyp's for the Formula's Var's.highestSeq
- Max MObj.seq that can be referenced.defaultStmt
- Default Stmt for output RPN/message.public void parseAllFormulas(Messages messages, java.util.Map<java.lang.String,Sym> symTblParam, java.util.Map<java.lang.String,Stmt> stmtTblParam)
parseAllFormulas
in interface SyntaxVerifier
messages
- Messages object for error/info messages.symTblParam
- Symbol Table (Map).stmtTblParam
- Statement Table (Map).public boolean initializeGrammar(Messages messages, java.util.Map<java.lang.String,Sym> symTblParam, java.util.Map<java.lang.String,Stmt> stmtTblParam)
Normally this is handled automatically. It initializes "the grammar" but does not parse every Stmt in stmtTbl.
The intended use of this function would be to initialize the grammar without parsing every statement (parsing every statement in set.mm requires 8ish seconds!)
initializeGrammar
in interface SyntaxVerifier
messages
- Messages object for error/info messages.symTblParam
- Symbol Table (Map).stmtTblParam
- Statement Table (Map).public ParseTree parseFormulaWithoutSafetyNet(Formula formula, Hyp[] hypArray, int highestSeq)
This method assumes that we're being invoked from someplace like BatchMMJ2 that has already confirmed that the Grammar is initialized. In theory this could cause a problem if a non-syntax error comes out and is taken as a syntax error by the user. However, with the current parser that should not be a problem (as seen in the following method).
formula
- formula to be parsed.hypArray
- Hyp's for Formula's Var's.highestSeq
- Maximum Stmt.seq that may be used to parse the
expression.public void setParserPrototype(java.lang.Class<? extends GrammaticalParser> proto)