HomeHome Metamath Proof Explorer  
Real and Complex Numbers

Contents of this page
  • Axioms for Complex Numbers
  • The Extended Real Number System
  • The Real Numbers are Uncountable
  •   ...my daughters have been studying (chemistry) for several semesters, think they have learned differential and integral calculus in school, and yet even today don't know why x · y = y · x is true.
    —Edmund Landau

    Axioms for Complex Numbers   In the Metamath Proof Explorer we derive the postulates or axioms of complex arithmetic as theorems of ZFC set theory. This section collects in one place these results, providing a complete specification of the properties of complex numbers.

    For convenience, after deriving the postulates we re-introduce them as actual new axioms on top of set theory. This lets us easily identify which axioms are needed for a particular complex number proof, without the obfuscation of the set theory used to derive them. To do this, we introduce two classes CC (set of complex numbers) and RR (set of real numbers); three constants 0 (zero), 1 (one), and i (imaginary unit); two binary operations + (plus) and x. (times); and one binary relation < (less than). We then assert that these eight objects have the properties specified by the axioms in the table below, as extensions to ZFC set theory.

    To derive the postulates as theorems, we define each of these objects as specific sets, then derive their properties from the axioms of set theory. The derivation is not easy, but the fact that it works is quite remarkable and lends support to the idea that ZFC set theory is all we need to provide a foundation for essentially all of mathematics.

    Note that subtraction and division are introduced later as definitions. We later define natural numbers, integers, and rational numbers as specific subsets of CC, leading to the nice relationships NN (. ZZ (. QQ (. RR (. CC.

    Most analysis texts construct complex numbers as ordered pairs of reals, leading to construction-dependent properties that satisfy these axioms but are not usually stated as explicit axioms. Other texts state simply that "RR is a complete ordered subfield of CC," leading to redundant axioms when this phrase is completely expanded out. Apparently,* the complete set of axioms in the explicit form given here has never been published, and an open problem is whether any axiom on this list is redundant (can be derived from the others). Eric Schmidt has found several redundancies, indicated in the table, in the original axiom system posted here.

    [*Note added 1-Aug-2007: Slawomir Kolodynski found a reference with 27 complex number axioms, similar in spirit to Metamath's 22 axioms. See Edward V. Huntington, "The fundamental propositions of algebra," Monographs on Topics of Modern Mathematics Relevant to the Elementary Field, edited by I. W. A. Young, New York, 1911; see also Trans. Amer. Math. Soc., vol. VI, 1905, pp. 209—229.]

    [Note added 19-Jun-2012: Eric Schmidt has discovered that in the axioms shown below, axaddcom (now addcom) and ax0id (now addid1) are redundant. He has written up these and some other nice results, including some independence results for the axioms, in schmidt-cnaxioms.pdf (schmidt-cnaxioms.tex). Added 3-Jan-2013: Scott Fenton has formalized Eric's results, which are now in set.mm.]

    [Note added 18-Jun-2012: Mario Carneiro has done a major revision and cleanup of the construction. In particular, rather than using equivalence classes as is customary for the construction of the temporary rationals, he used only "reduced fractions", so that the use of the axiom of infinity is avoided until it becomes necessary for the construction of the temporary reals.]

    To construct the complex numbers, we start with the finite ordinals (natural numbers) of set theory and successively build temporary positive integers, temporary positive rationals, temporary positive reals (based on Dedekind cuts), temporary signed reals, and finally the actual complex numbers. ("Temporary" means they would not normally be used outside of the construction.) Together with the arithmetic operations and other auxilliary sets needed at each level, there are a total of 38 temporary definitions involved. You can look at the Statement List starting at cnpi (click on its "Related theorems" link) to see the theorems used for the construction, which total 53 (including the 22 that result in the final axioms). These theorems are, in essence, the complete formalization of an entire 136-page book, Landau's Foundations of Analysis (a.k.a. Grundlagen der Analysis), although the approach we use is somewhat more modern.

    The construction is "portable" in the sense that the final axioms below hide how they are constructed, and another construction that develops the same axioms could be plugged in in place of it. For example, another common construction is to define real numbers as Cauchy sequences of rationals instead of the Dedekind cuts that we use.

    Interestingly, we could eliminate 0 as an axiomatic object by defining it as as ((i x. i) + 1) and replacing it with this expression throughout the axioms. If this is done, axiom axi2m1 becomes redundant. However, the remaining axioms would become longer and less intuitive.

    The 22 Axioms for Complex Numbers
    Ref Expression
    axcnex |- CC e. V     Proved redundant by Mario Carneiro on 18-Feb-2014; see theorem cnex
    axresscn |- RR (_ CC
    ax0re |- 0 e. RR     Proved redundant on 19-Feb-2005; see theorem 0re
    ax1cn |- 1 e. CC     Weakened from earlier 1 e. RR by Eric Schmidt, 11-Apr-2007; see theorem 1re.
    axicn |- i e. CC
    axaddcl |- ((A e. CC /\ B e. CC) -> (A + B) e. CC)
    axaddrcl |- ((A e. RR /\ B e. RR) -> (A + B) e. RR)
    axmulcl |- ((A e. CC /\ B e. CC) -> (A x. B) e. CC)
    axmulrcl |- ((A e. RR /\ B e. RR) -> (A x. B) e. RR)
    axaddcom |- ((A e. CC /\ B e. CC) -> (A + B) = (B + A))     Proved redundant by Eric Schmidt, 19-Jun-2012, and formalized by Scott Fenton on 4-Jan-2013; see theorem addcom
    axmulcom |- ((A e. CC /\ B e. CC) -> (A x. B) = (B x. A))
    axaddass |- ((A e. CC /\ B e. CC /\ C e. CC) -> ((A + B) + C) = (A + (B + C)))
    axmulass |- ((A e. CC /\ B e. CC /\ C e. CC) -> ((A x. B) x. C) = (A x. (B x. C)))
    axdistr |- ((A e. CC /\ B e. CC /\ C e. CC) -> (A x. (B + C)) = ((A x. B) + (A x. C)))
    axi2m1 |- ((i x. i) + 1) = 0
    ax1ne0 |- 1 =/= 0
    ax0id |- (A e. CC -> (A + 0) = A)     Proved redundant by Eric Schmidt, 19-Jun-2012, and formalized by Scott Fenton on 4-Jan-2013; see theorem addid1
    ax1rid |- (A e. RR -> (A x. 1) = A)     Weakened from earlier ax1id (now mulid1) by Eric Schmidt, 19-Jun-2012, with formalization by Scott Fenton, 4-Jan-2013
    axnegex |- (A e. CC -> E.x e. CC (A + x) = 0)     Proved redundant by Eric Schmidt, 21-May-2007; see theorem cnegex
    axrecex |- ((A e. CC /\ A =/= 0) -> E.x e. CC (A x. x) = 1)     Proved redundant by Eric Schmidt, 21-May-2007; see theorem recex
    axrnegex |- (A e. RR -> E.x e. RR (A + x) = 0)
    axrrecex |- ((A e. RR /\ A =/= 0) -> E.x e. RR (A x. x) = 1)
    axcnre |- (A e. CC -> E.x e. RR E.y e. RR A = (x + (y x. i)))
    axlttri |- ((A e. RR /\ B e. RR) -> (A < B <-> -. (A = B \/ B < A)))
    axlttrn |- ((A e. RR /\ B e. RR /\ C e. RR) -> ((A < B /\ B < C) -> A < C))
    axltadd |- ((A e. RR /\ B e. RR /\ C e. RR) -> (A < B -> (C + A) < (C + B)))
    axmulgt0 |- ((A e. RR /\ B e. RR) -> ((0 < A /\ 0 < B) -> 0 < (A x. B)))
    axsup |- ((A (_ RR /\ A =/= (/) /\ E.x e. RR A.y e. A y < x) -> E.x e. RR (A.y e. A -. x < y /\ A.y e. RR (y < x -> E.z e. A y < z)))
    Colors of variables: wff set class

    In case you are wondering: Why do we use (purple) class variables for most postulates instead of more conventional (red) set variables? In fact, we can equivalently state these with set variables only (which would be more in keeping with textbook notation) then prove these class-variable versions from them. You may want to do that if you want a nice conventional presentation of these axioms for your classroom, book, etc. However, in our formalism, the advantage of a class variable is that we can directly substitute it with another class expression such as the number "2" or the term "((i x. x) + B)". A set variable would require some logic manipulations to do this, using theorems such as vtoclg (as used, for example, to convert elirrv to elirr). So for convenience we use class variables outright. On the other hand, a class variable cannot be quantified with A. (for all) and E. (there exists), so for axioms with quantifiers, the set variables are unavoidable.


    The Extended Real Number System   A convenient device used in most analysis texts is an extension RR* of the real number system (see df-xr and related theorems) that includes +oo and -oo.

    Analysis textbooks rarely if ever actually define +oo and -oo as concrete objects. Instead, they just say that they are two "new" distinguished elements. But we must choose specific sets in order to use set theory to work with them. So I picked P~ U.CC (the power set of the union of the set of complex numbers) for +oo (df-pnf) and P~ +oo for -oo (df-mnf), which we can prove are different from each other and to not belong to RR (see theorems pnfnemnf, pnfnre, and mnfnre). Many other choices are possible too. In spite of the seemingly awesome "size" of these two new members, they are really nothing deeper than notational devices to make some proofs shorter. Any theorems making use of them can be stated equivalently without them. They provide a shorthand for saying, for example, "no matter how large a number we pick, this series will eventually grow larger than that number." These "infinities" have nothing to do with the infinite cardinals of set theory—in fact one possible choice for -oo is {CC}, which is a singleton with cardinality 1! (We didn't use this simpler choice because its justification requires the Axiom of Regularity, whose use we prefer to avoid when possible.)


    The Real Numbers are Uncountable     Initially my plan was to formalize Cantor's diagonal argument [external] using decimal (or possibly quaternary) expansions. However, I ran into some technical complications, such as the fact that some numbers have two equivalent decimal representations (e.g. 0.999... = 1.000...), that would have made a formal proof somewhat messy. So I chose another proof that is simpler to formalize but which I believe is in the same spirit as Cantor's diagonal proof, in the sense that it constructs a real number different from all the numbers in a purported list of all real numbers.

    Even so, the "simpler" proof is still daunting when worked out in complete formal detail, involving some 39 lemmas. Therefore I will first give an informal description of the proof, then describe the key formal lemmas that lead to the final result, which is theorem ruc (or equivalently ruclem39).

    Related results: Unlike the real numbers, the rational numbers are countable. See theorem qnnen. Moreover, there are at least aleph-one reals (aleph1re) and irrationals (aleph1irr). For another very different proof that the reals are uncountable, see rucALT, which follows from the exact computation of the cardinality of reals, rpnnen

    The informal argument    We will start by claiming that we can list all the reals, i.e. that there exists a function f from NN (natural numbers) onto RR (the reals). We will then proceed to construct a real number that is not in this purported list f(1), f(2), f(3),... of all reals, thereby falsifying this claim.

    Here is how we construct this number. We construct, in parallel, two auxiliary sequences g(1), g(2), g(3),... and h(1), h(2), h(3),... derived from f(1), f(2), f(3),...

    We start off by assigning:

    g(1) = f(1) + 1
    h(1) = f(1) + 2
    Given g(i) and h(i), we construct the next value g(i+1) and h(i+1) as follows:
    If g(i) < f(i+1) < h(i), then assign
    g(i+1) = (2x.f(i+1) + h(i)) / 3
    h(i+1) = (f(i+1) + 2x.h(i)) / 3
    Otherwise, assign
    g(i+1) = (2x.g(i) + h(i)) / 3
    h(i+1) = (g(i) + 2x.h(i)) / 3
    In either case, f(i+1) will not fall between g(i+1) and h(i+1).

    Now, using elementary algebra, you can check that g(1) < g(2) < g(3) < ... < h(3) < h(2) < h(1). In addition, for each i, the interval between g(i) and h(i) does not contain any of the numbers f(1), f(2), f(3), ..., f(i). This interval keeps getting smaller and smaller as i grows, avoiding all the f(i)'s up to that point. In effect we are constructing an interval of real numbers that are different from all f(i)'s. This is very much like the diagonal argument, but it is much better suited to a formal proof. Just as happens when we add more decimals in Cantor's proof, the sequence g(1), g(2), g(3),... gets closer and closer (converges to) to a real number that is not in the claimed complete listing.

    As i goes to infinity, the interval shrinks down to almost nothing. What is left in this shrinking interval? Well, g(i) and h(i) will converge towards each other, from opposite directions, so the interval will shrink down either to a single point or to an empty interval. It is tempting to think that the interval will shrink down to exactly nothing, but in fact it will not be empty, and the formal proof shows this. Specifically, it will contain exactly one real number, which is the supremum (meaning "least upper bound") of all values g(i). How do we know this? Well, it results from one of the axioms for real numbers. This axiom, shown as axsup above, says that the supremum of any bounded-above set of real numbers is a real number. And the set of values g(i) is certainly bounded from above; in fact all of them are less than h(1) in particular.

    Contrast this to the rational numbers, where the supremum axiom does not hold. That is why this proof fails for the rational numbers, as it should, since we can make a countable list of all rational numbers. To give you a concrete feel for why the supremum axiom fails for rationals, consider the infinite set of numbers 3, 3.1, 3.14, 3.141, 3.1415... Each of these numbers is a rational number. But the supremum is pi, which is not a rational number.

    In Cantor's original diagonal proof, the supremum comes into play as follows. As you go diagonally down the list of decimal expansions of real numbers, mismatching the list digit by digit, at each point you will have a new rational number with one more digit (that is different from all numbers in the list up to that point). The supremum of this list of new rational numbers is a real number (with an infinite number of digits after the decimal point) that is not on the list. The statement "all decimal expansions represent real numbers," which is often stated casually and taken for granted, is actually somewhat deep and needs the supremum axiom for its derivation.

    The formal proof    The formal proof consists of 39 lemmas, ruclem1 through ruclem39, and the final theorem ruc. In the formal proof, the functions f, g, and h above are called F, G, and H. A function value such as f(1) is expressed (F` 1). A natural number index i is usually expressed as A; you can tell by the hypothesis A e. NN such as ruclem18.a in ruclem26.

    In the hypotheses for many of the lemmas, for example those for ruclem14, we let F be any function such that F:NN-->RR, meaning F can be any arbitrary mapping from NN into RR. We then derive additional properties that F must have. In particular, we will conclude that no matter what the function F is, it is impossible for it to map onto the set of all reals.

    We also have to construct the functions G and H, and doing this formally is rather involved. The purpose of the majority of the lemmas, in fact, is simply to work out that the hypotheses of ruclem14 indeed result in functions G and H that have the properties described in our informal argument section above.

    What makes it complicated is the fact that G and H depend not only on F but on each other as well. So, we have to construct them in parallel, and we do this using a "sequence builder" that you can see defined in df-seq1. The sequence builder takes in the functions C and D defined in hypotheses ruclem.1 and ruclem.2 of ruclem14, and uses them to construct a function on NN whose values are ordered pairs. The first member of each ordered pair is the value of G and the second is the value of H. By using ordered pairs, the sequence builder can make use of the previous values of both G and H simultaneously for its internal recursive construction.

    The function C, which provides the second argument or "input sequence" for the sequence builder seq1, is constructed from F as follows. Its first value (value at 1) is the ordered pair <.((F` 1) + 1), ((F` 1) + 2)>.. This provides the "seed" for the sequence builder, corresponding to the initial assignment to g(1) and h(1) in the informal argument section above. The subsequent values at 2, 3,... are just the values of F, and these feed the recursion part of the sequence builder to generate new ordered pairs for the values of the sequence builder at 2, 3,....

    The two functions called 1st and 2nd in hypothesis ruclem.2 of ruclem14 return the first and second members, respectively, of an ordered pair. The if operation, defined by df-if, can be thought of as a conditional expression operator analogous to those used by computer languages. The expression if(ph, A, B) can be read "if ph is true then the expression equals A else the expression equals B".

    Hypotheses ruclem.3 and ruclem.4 of ruclem14 extract the first and second members from the ordered pair values of the sequence builder finally giving us our desired auxiliary functions G, and H

    Armed with this information, you should now compare the arithmetic expressions in the hypothesis ruclem.2 of ruclem14 to the ones in the informal argument section above. Hopefully you will see a resemblance, if only a very rough one, that will provide you with a clue should you want to study the formal proof in more depth. By the way, the assertion (conclusion) of ruclem14 shows the first value of the sequence builder. You can see that the ordered pair entries match g(1) and h(1) from the informal argument section above.

    Let us now look at the key lemmas for the uncountability proof. Lemma ruclem26 shows that G has an ever-increasing set of values, and ruclem27 shows that H has an ever-decreasing set of values. In spite of this, the twain shall never meet, as shown by ruclem32. Lemma ruclem34 defines the supremum S of the values of G (i.e. the supremum of its range) and shows that the supremum is a real number. Lemma ruclem35 shows that the supremum S is always sandwiched between G and H, whereas ruclem29 shows that this is not true for any value of F. Lemma ruclem36 uses these last two facts to show that the supremum is not equal to any value of F and therefore not in the list of real numbers provided by F. This means, as shown by ruclem37, that F cannot map onto the set of all reals.

    We are now in a position to get rid of most of the hypotheses (since their variables are no longer referenced in the assertion). In ruclem38 we eliminate all but one hypotheses of ruclem37 by using instances of eqid. In ruclem39 we get rid of the final hypothesis (using the weak deduction theorem dedth, involving a quite different application of the if operator) to result in "there is no function mapping NN onto the reals," and finally ruc converts this to the notation for a strict dominance relation.


      This page was last updated on 18-Feb-2014.
    Copyright terms: Public domain
    W3C HTML validation [external]