NOTE: use Username "ocat" at bottom of page to update the wiki if the wikilord has not 'whitelisted' you (due to wiki bot infestation). You are authorized by me to use "ocat" for comment/input about mmj2, but please input an id of somesort next to your comments so that we know who is saying what. Thx. ocat
ALL SYSTEMS GO!
The first set of regression tests is complete. Success! I added a new volume test, "2e", and compared its output to volume test 2d. The 2e test uses the "DeriveFormulas?" and "AsciiRetest?" options on the ProofAsstBatchTest? RunParm? – and uses these additional RunParms?:
ProofAsstDjVarsSoftErrors,GenerateReplacements and RecheckProofAsstUsingProofVerifier,yes ProofAsstAutoReformat,no
"DeriveFormulas?" causes the batch test logic to build a ProofWorksheet? with missing formulas (except for the hypothesis and 'qed' steps.) The formulas are derived during the test by the Proof Assistant.
"AsciiRetest?" is a special test option which, after a Proof Worksheet has been unified, takes the updated Proof Worksheet and generates a new ASCII text area which it then re-processes. This confirms that the output of DeriveFormulas? is actually outputting valid formulas.
The new RunParm? "ProofAsstAutoReformat?" with "no" causes the Proof Assistant to update the text area of the Proof Worksheet "in place" instead of just updating the parse trees and recreating formulas when Work Variables are resolved. This 2e volume test validates this new functionality for every theorem in set.mm, and when combined with AsciiRetest? gives us a very confident outlook on the correctness of the code!
The "RecheckProofAsstUsingProofVerifier?" RunParm? is another test feature (but can be used by paranoid users if desired :-) What it does is re-check the RPN proof generated by the Proof Assistant using the VerifyProofs? logic (which implements the Metamath Proof Verification specification).
An example of how this 2e test is working may be seen with theorem "ixpssmap" in set.mm. Pull up the proof in the mmj2 GUI using File/Get Proof and then manually erase the formulas of the proof steps (except the hyps and the 'qed' step.) Then press Ctrl-U to unify the proof. Voila, the formulas are derived…but there is a minor issue: mmj2 chooses optional variable 'y' whereas the Metamath proof uses variable 'f' (for semiotic reasons.) This results in a change to the $d specifications for the theorem – and the default $d processing is, "GenerateReplacements?", which is non-optimal in this case because mmj2 adds new $d statements to the existing $d statements; it is better to use GenerateNew? in this case.
So…there is quite a lot of processing going on "under the hood". It is easy to forget that (I've been asked, "Why is the code so big?" Well…there is a lot of stuff happening :-) 361KB of object code isn't that big, I guess, considering the many features of mmj2 :-)
—
Hi again, fl. Literate Programming is a good thing, to a degree… Better than nothing, which is what I have seen at nearly all corporations doing Big Iron :-) It would do me no good though if you wrote Francais documentation – and for that matter, non-English program source code tends to be indecipherable to me (and let's face it, no one likes anyone else's code – put that in your predicate logic pipe and smoke it!)
So, the most important documentation, aside from in-code descriptions and naming of components is, in my opinion, pictorial. I recently downloaded "Dia", which is a Visio type diagramming tool. If I had had that when I began mmj2 you would have some nice mmj2 documentation. The reason diagrams are so valuable is that most programmers hate to read documentation, but nearly everyone likes looking at pictures :-)
Another great idea of mine, which I never implemented, is to create digital movies of the system creators – designers, programmers and managers – during the development process. Then, hyperlinks to specific portions of these interviews and speeches would be added to the diagrams. So, if you click on a particular module's icon you get to enjoy seeing the original programmer sitting at his/her workstation typing in the code and describing what in the world was going through his/her pointy little head as he/she wrote the code :-) And the greatest treat would be to click on the high-level icon for the system as a whole and see video of the management team selling the product to obtain budget money :-) I think you'd also want video linked to individual test cases, not to mention users actually using the system to "do stuff"…
The problem with the video idea is that programmers are not paid to be (comedic/tragedy) video stars. Also, we know that the main users of the video would be subsequent programmers debugging the production code, ridiculing the original coders for being morons… And frankly, taking videos of programmers writing code is a bit like filming Jane Goodall giving bananas to chimpanzees.
--ocat 4-Jan-2008
—
I notice that Norm has added big-unifier.mm to the official Metamath download (see 2-Jan-2008 news).
The big-unifier.mm file contains one theorem with proof step formulas that are huge. Norm created the file for mmj2 testing of the Work Variables enhancement's new unification algorithm. One of the fun activities with the mmj2 Proof Assistant GUI is to pull up big-unifier.mm's theorem1 and then manually delete the derivation step formulas, leaving just the step/hyp/ref fields and the theorem's 'qed' and hypothesis steps – this invokes the mmj2 "Derive" feature. And mmj2 cuts through theorem1 like a hot knife through butter!
Right now I am running the mmj2 "test suite". big-unifier.mm is called big.mm right now in mmj2's UnitTest?4 set of tests, and is just a small part of nearly 6 gazillion separate tests run as part of each new software release (very tiresome…) One of the reasons why mmj2 has been relatively clean and bugfree is that set.mm and ql.mm provide extensive test sets. And as much as possible the mmj2 test suite includes "batch" tests that exercise each new release's functionality – not just for one test case, but for every theorem in set.mm as part of the mmj2 "volume tests". These tests are run in parallel with the previous release's code, and then file comparisons of the output are performed to ensure that no new bugs have been introduced. THEN, just to be safe, the most recent set.mm is downloaded and the parallel/regression tests are re-run! Surprisingly – or not – these extensive tests have actually triggered a bug or two, and therefore must be regarded as essential quality control steps (my only regret is not having automated more of the process…)
Speaking of Q.C. if anyone is going to test the new "beta" version of mmj2, please go ahead and do so. Bug reports are greatly appreciated before the final release is uploaded :-) ocat 2-Jan-2008
Enhanced version of the "beta" mmj2Release20080201:
http://us2.metamath.org:8888/ocat/mmj2/mmj2.jar http://us2.metamath.org:8888/ocat/mmj2/mmj2chgdsrc.zip
This version includes the "ProofAsstAutoReformat?" RunParm? processing described below.
It has not been fully tested…
"mmj2chgdsrc.zip" contains the revised ".java" source files (may be updated further, but just in case I get hit by a truck, the good stuff is safe :0-)
--ocat 2007-12-26
I added a feature which had been requested but which I forgot to code. It is controlled by a new RunParm?,
ProofAsstAutoReformat,no
The default is 'yes'. If set to 'no', then when proof step work variables are resolved by the Proof Assistant, the updates are made "in place" without reformatting the formulas. This is intended to benefit a person who wants to manually enter and format Proof Worksheet formulas (because, after all, it is impossible to format formulas so that everyone is happy.) In conjunction with the new (alt) right-cursor options to reformat a single proof step, the new release of mmj2 empowers the user with much more control over the way Proof Worksheets look (but automatic formatting always occurs when proofs are retrieved from the input .mm file – that is not easily fixable, either.)
re: Scripting… One reason why a (some kind of) scripting facility could work with mmj3(hypothetically) is that the average mmj2 user IQ is so enormous (well obviously yes, I won't dispute about this but is it enough ? --fl) . The situation is perhaps similar to that of Sage Math, except in that case the "scripting" is done in a dialect of Python
Another reason for providing such a feature is that there is no such thing as an optimal universal search – this is called the No Free Lunch Theorem. Now, in a Finite Axiom system like Metamath with, say, set.mm, if a proof step is provable then the justifying assertion must be one of the previous assertions in the database; so, proving is a search problem. Heretofore I have avoided hardcoding "intelligence" about .mm file contents into mmj2, except to provide default settings such as "|-" and "wff". But there is a gigantic knowledgebase concerning proof techniques in propositional and predicate logic. So to the extent that mmj2/mmj3 cannnot provide "universal" heuristics facilitating proof searches, it may be that a given user in a specific situation may provide the "IQ boost" needed.
--ocat
An unrelated enhancement, which is technical, and implements the Java "generic" container coding features will be deferred. The reason is that a great many lines of code are affected and none of the changes directly benefit the user. So the risk/reward ratio is not satisfactory – better to wait until a major mmj2 upgrade rolls out to justify the odds of bug introduction.
P.S. I will upload a mmj2.jar file for function testing soon. I need to run a few more tests first…
OK, fl's request uploaded to new version of
http://us2.metamath.org:8888/ocat/mmj2/mmj2.jar
(Note: Help screen does not reflect latest enhancements but otherwise the GUI code is as it will be.)
I am almost ready to upload a mmj2jar.zip for user "function testing". This is a tiny release that focuses on comments and small requests provided near the end of the last release "process" :-) I have included something for everyone, including:
And that is about it for the 1-Feb-2008 release date unless an easily doable request is provided. (I may do some work to add instrumentation for testing of cursor positioning because I have no automated regression testing capability for cursor position, and the cursor location is a big part of the "conversation".)
I plan to quickly perform the regression testing and prepare the "beta" release so that if am hit by a truck, and if there are no bugs, all that will be needed is a rename of the beta mmj2.zip file.
P.S. I am getting pretty interested in the "Super Step Unification Search" (aka Limited Prover), to add 1 or 2 proof levels to connect step "n" to previous proof steps. The limit is intended to make the task reasonable, in terms of performance and usability – we don't want to search 10*9 possibilities each time the user presses the "go" button, eh? My latest idea is to build the feature as a "patch" which would be triggered after normal unification processing when the user right-mouses "super search, n levels". I think the Proof Worksheet is too unwieldy for searching the forest of possibilities, and also that I may be able to multi-thread concurrent searches so that during the coming years the new code can take advantage of multi-core processors (and vast quantities of memory.) Writing it as a "patch" is good too because it should reduce the odds of adding bugs to the existing code, which is nearing its maximum level of maintainable entropy. The way it would work is, after the initial unification, if no errors, then the Super Search is performed, and if the search finds something, it splices the results back into the proof worksheet and re-unifies the proof. Ka-ching!
(Let me try this explanation again…)
To set the "AsIs?" cursor behavior you use a RunParm? (not a GUI menu item because it is not something to change frequently):
ProofAsstIncompleteStepCursor,AsIs
This applies after unification if there are no "errors" (even if the proof is complete!)
P.S. We might need to tweak this to make it do what you want. I uploaded mmj2.jar (just the jar file). Let me know if it exceeds with your expectations :-)
http://us2.metamath.org:8888/ocat/mmj2/mmj2.jar
--ocat
1)Minor tweak adding Ctrl+ and Ctrl- for Increase/Decrease Font size respectively (note: it is actually Ctrl and "=" because the shift key is not required – which matches the functionality in the Mozilla browser, by the way.)
2) Swapped order of messages: Error Messages appear before Info Messages, which matches the priority given in output cursor positioning.
3)Code added and tested to obtain the input cursor location when the user requests something like Unification or Reformat.
The feature is available for other uses but is only invoked when a request requiring parsing of the Proof Worksheet is invoked.
The code locates the input cursor to a particular statement (e.g. proof step) but does not record the location or symbol within the statement -- which is pretty much all we need for now.
Happily, I was able to integrate the location finding process within the "loadWorksheet()" process so that an extra pass through the Proof Worksheet input is unnecessary, so there is little extra overhead. And I think the cursor positioning on the output side will be improved for Reformat requests, which is a bonus!
—
As a "proof of concept" for using the input-cursor location finding feature, I am planning to add -- initially – the ability to "Reformat" and "Reformat - Alt Swap" a single proof step, instead of reformatting the entire proof.
And as a useful tweak I plan to eliminate the automatic reformatting of the entire proof when the user changes the Format Number (via "Edit/Set Format Nbr"). This will enable the users who desire more control over the layout of the Proof Worksheet to adjust formatting for each Proof Step, and not destroy any hand-formatted proof steps!
By the way, I will also be adding a RunParm? option to allow the user to Disable automatic reformatting of proof steps for which Work Variables are "resolved". Reformatting of proof steps happens every time a Work Variable substitution is made, which is unhappy for some users (it is also a little bit inefficient, and code will be tweaked to eliminate any re-reformatting within a single invocation of the Unification module!)
The new single step Reformat and Reformat: Alt Swap features will be invoked (I propose) via the right hand (alternate) mouse button. This will eliminate further clutter on the Edit menu, and it will operate in a natural fashion: right-click the mouse on the proof step you desire reformatted, and select the function! (Note: mouse clicking moves the input cursor – called a "caret" by Java -- so a single right or left mouse click is all that is needed to choose the proof step…but of course the input cursor can be moved the old fashioned way, using the keyboard.)
One of the mmj2 users made a suggestion that I think may be useful in conjunction with the new input-cursor location feature and the new ability to perform an operation on a single proof step.
What I envision is a "super" unification search process for a single proof step. Heretofore I refused to add elaborations such as this because I couldn't see a way to do anything reasonable that wasn't impossibly lame. Not that we have the Work Variables feature in place, and input-cursor location it makes sense to attempt something that has the potential the hammer the CPU pretty hard!
Key Features of "Super Step Unification Search":
1. Unification Search and "Derive" features on a single proof step using not just existing database Assertions, but also previous proof steps...even if Work Variables are present in the proof step formulas! 2. Will attempt to fill in *one* missing proof step level if the search fails for a solution not requiring an added proof step. For example, if used on the 'qed' step it will find two-proof step proofs. 3. If the "Super Step Unification Search" is successful, the normal Unification process will be automatically invoked afterwards, and, bada-bing, bada-boom, we're done. If the search fails, an informational message set will provide valuable "Hints" and perhaps a "Best Guess" (based on some not-yet-figured-out heuristic.) Note: checking of "hard" Distinct Variable restrictions will be performed prior to proclaiming "Success". 4. A RunParm/Menu Option will be provided to allow the user to set a maximum search time limit. And, as always, the "Cancel- I-Think-It-Is-Looping" menu option will be available... How This Might Be Done ====================== PHASE ONE For discussion, assume that the user has not entered either a Ref or any Hyp's. Then, we look for Assertions with zero hypotheses -- and if we find one that is unifiable we declare Success. [Note: if the proof step formula contains Work Variables there may be more than one match, so as a general heuristic, the Assertions are "super searched" in descending order of parse tree depth and formula length.] If no unifying zero-hypothesis assertions are found, we look through the one-hypothesis assertions, unifying them first against the proof step formula itself, and then if the proof step formula unifies with the Assertion conclusion, we search the previous proof steps for the Assertion's hypothesis. EXAMPLE: "one hypothesis" scenario involving step 'qed' in "super search", PHASE ONE. Here step 5 is determined to match the hypothesis of Assertion A (shown prior to substitution for &W1): (before) h1::xyz.1 |- blah blah 5:1 |- blah &W1 blah qed:: |- blah blah blah (after) h1::xyz.1 |- blah blah 5:1 |- blah &W1 blah qed:5:A |- blah blah blah Then if the one-hypothesis Assertion "super search" fails, we try Assertions with two hypotheses, etc., and so on. Finally, if all of the previous searches failed we conclude that there may be a one step gap in the proof just prior to the selected proof step! Now the hard part begins... PHASE TWO Now, possibly using "memoed" results of the previous "super search" which tell us which Assertions with > 0 hypotheses have conclusions that are unifiable with just the selected proof step's formula (disregarding hypotheses), we begin PHASE TWO, which may require __?___ times longer than PHASE ONE. We begin with the one-hypothesis Assertions that unify with the proof step's formula, then if necessary, the two-hypothesis Assertions, etc., and so on, just as in PHASE ONE. For each candidate Assertion, "A", we attempt to "prove" each of its hypotheses using previous proof steps. To do this we search the database of Assertions and use the previous proof steps as hypotheses which, when fed into Assertion "B", generate a conclusion (formula) which can be unified with Assertion A's hypotheses. For example, "one hypothesis" scenario involving step 'qed' in "super search", PHASE TWO: EXAMPLE: "one hypothesis" scenario involving step 'qed' in "super search", PHASE TWO. Here it is determined that if step 5 is applied to Assertion B -- generating step 5001 -- then Assertion A's hypothesis is also satisfied! (before) h1::xyz.1 |- blah blah 5:1 |- blah qed:: |- blah blah blah (after) h1::xyz.1 |- blah blah 5:1 |- blah 5001:5:B |- blah blah qed:5001:A |- blah blah blah NOTE: The very astute observer will recognize that PHASE ONE and TWO in this proposed "Super Step Unification Search" can be handled with a single set of code (they are described separately for narrative purposes.) This proposal is merely a bottom-up proof search for a single step, which uses a breadth-first search. PHASE ONE is just depth "zero" and handles what the current (standard) unification search does not: work variables and the matching of derived hypotheses to previous proof step formulas. PHASE TWO is a depth one search which logically incorporates PHASE ONE. In theory there is no reason why there could not be PHASE THREE, etc., except for the fact that the search space grows as the power of a power (or more :-)
NOTE 2: We'll have to see how this gets spec'd and coded, but it seems clear now that more RAM is needed. I can run mmj2 very nicely with just 256MB now, though 2GB is not uncommon for users running Windows Vista. It may be helpful to provide the "Super Step Unification Search" feature as a RunParm? option, meaning that if it is disabled then the extra RAM for searching is not used (and the feature doesn't appear on the popup menu.)