public class GrammarAmbiguity
extends java.lang.Object
At present, there are two levels of Grammar Ambiguity validation: basic and complete (aka "full"). The distinction between the two levels is somewhat arbitrary, but has to do with speed: a user does not need to validate her grammar every time the program is run. So "basic" is just the basics: are any grammar rules parseable using other grammar rules? The "complete" ambiguity checking process for a grammar may well prove to be quite time-consuming, involving many heuristics and passes through the grammar!
As of August, 2005, the "complete" level is stubbed out, except for a couple of little things. It will be a high priority project in the near future.
Constructor and Description |
---|
GrammarAmbiguity(Grammar grammar,
boolean doCompleteGrammarAmbiguityEdits) |
Modifier and Type | Method and Description |
---|---|
boolean |
basicAmbiguityEdits()
The Primary Objective of basicAmbiguityEdits is to make a pass through
Grammar's NotationRule Set and attempt to parse each Notation Grammar
Rule using a Max Sequence number of Integer(dot)MAX_VALUE to absolutely
positively confirm that none of the NotationRules is parseable using
other NotationRules.
|
boolean |
fullAmbiguityEdits()
Right now this has just two edits, a warning about undefined
non-terminals and an informational message that a grammar like peano.mm
is unambiguous because every NotationRule is a "gimme match".
|
public GrammarAmbiguity(Grammar grammar, boolean doCompleteGrammarAmbiguityEdits)
public boolean fullAmbiguityEdits()
public boolean basicAmbiguityEdits()
During the initial pass through Grammar's NotationRule Set, useful information related to the rules is collected for subsequent use in ambiguity checking. This information is quite interesting and goes to the heart of the question of what makes a grammar (un)ambiguous -- embedding and overlapping notation rules (there are also deep mysteries to be pursued involving all-constant Notation Axioms which can embody characteristics of both constant "literals" and grammatical types, but those are not dealt with here.)
Secondary Objective: set each NotationRule.isGimmeMatchNbr to -1 if rule is *not* a "gimme match" or +1 if it is (zero indicates that the question has not yet been decided.) HOWEVER, there must be no intrinsic ambiguity in the rules themselves -- meaning that no grammar rule can be parsed using other grammar rules (and no duplicates.)
Tertiary Objective: if doCompleteGrammarAmbiguityEdits is set to true then intermediate results from this routine should be stored and made available to fullEdits(). Specifically, the non-gimme match Notation Rules will require deep investigations of their potential ambiguity and it is desireable that these results from this routine not be wasted:
(these are not mutually exclusive since Rule A may be very short and be repeated more than once in Rule B.)
Overlap:
Embedding:
On Recursion:
More on Gimme Matches:
For the purposes of identifying a gimme we need to show that a rule has no overlaps and is neither embedded in nor contains an embedded rule, and is not recursive. We already know that it is not a duplicate rule and cannot itself be grammatically parsed into other rules.)
More on Recursion -- cannot be Gimme Matches!:
The fact that a rule is recursive DOES mean that it is *not* a gimme match. For example, if we have rules #1: A -> AB and #2: A -> ABB, then #2 is already in error because it is parseable (into #1(#1(a, b), b)))! Or consider these other examples:
#1: A -> (A) and #2: A -> (A)B .... = not gimmes!
#2: A -> A * and #2: A -> A * B ... = not gimmes!
#3: A -> * A and #2: A -> * A B ... = not gimmes!
#4: A -> * A and #2: A -> * * A ... = error!
#5: A -> * A and #2: A -> + * A ... = not gimmes!
So these *direct* recursion rules are either ambiguous or errors, but theye are *not* gimme matches in any case.
Note: *Indirect* recursions are possible. For example:
#1: E -> A *
#2: A -> E * B
which, depending on the set of other rules could mean both #1 and #2 are
"gimme" matches. If there is Type Conversion #3: E -> A
then #1
would turn out to be a non-gimme because a variant of #2 would have been
generated, #2.1: A -> A * B
.