In practice, even in quite rich foundational systems, let alone in pure set theory, representations of mathematical objects become unwieldy. For example, [bourbaki-sets] estimates that even the number 1 if fully expanded would occupy thousands of symbols in their system. So some kind of definitional mechanism is needed, at least for human consumption, and possibly to stop overflow even for a machine! The fundamental property of definitions is that it should be possible in principle to do without them, though we need to say carefully precisely what this means.
The simplest kinds of definitions are just shorthands for more complicated
expressions, for example 2 rather than {0, 1}, (x, y) rather than {{x}, {x, y}},
and ∃!x. P[x] rather than ∃x. P[x] ∧ ∀y. P[y] ⇒ y = x. We can understand such
definitions in two ways, either as metalogical abbreviations used to make
formal texts more palatable to the reader, or as a method of sanctioning the
addition (in the object logic) of new equational axioms. Generally, the
latter is more efficient, since it keeps the underlying terms small, but the
former is more flexible: the defined notion need not correspond to any kind
of logical object, nor respect any underlying type system. In fact the Nuprl
system allows both kinds of definition.
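The second reading can be illustrated in a system like Lean, whose ∃! notation is exactly such a shorthand. The following minimal sketch (using a hypothetical name ExUnique to avoid clashing with the library's own ExistsUnique) restates the abbreviation as an object-level definition and checks that it unfolds to the longer formula:

```lean
-- A hypothetical restatement of the ∃! shorthand as an ordinary definition.
def ExUnique (P : Nat → Prop) : Prop :=
  ∃ x, P x ∧ ∀ y, P y → y = x

-- Expanding the definition recovers the longer formula: the two sides are
-- definitionally equal, so reflexivity suffices.
example (P : Nat → Prop) :
    ExUnique P ↔ (∃ x, P x ∧ ∀ y, P y → y = x) :=
  Iff.rfl
```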
Assuming that the object language has facilities for variable abstraction
(something like lambda-calculus), then any definitional constructs of this kind
can be reduced to new abbreviations or axioms of the form c = t, where c
is a new constant name and t any closed term. For example, instead of writing
f(x) = E[x], we use f = λx. E[x]. In any reasonable logical system the
addition of such axioms makes no difference: in principle we could do without
them (replace c by its expansion everywhere), and we do not derive any new
facts not mentioning c. They thus fulfil all the criteria we could demand of
definitions.
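As a concrete sketch in Lean 4 (the body x * x + 1 is an arbitrary illustrative choice for E[x]), turning the schematic f(x) = E[x] into a single equation f = λx. E[x] means the pointwise equation holds by pure expansion rather than by a new axiom:

```lean
-- Equational definition c = t with t a closed term: here f = λx. x * x + 1.
def f : Nat → Nat := fun x => x * x + 1

-- The schematic form f(x) = E[x] is then a theorem by unfolding, not an axiom.
example (x : Nat) : f x = x * x + 1 := rfl
```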
There is a more general means of extending the logic with new constant
symbols: using names for things which have been proved to exist. If we can
prove ∃x. P[x], then adding a new constant c and an axiom P[c] is a
conservative extension of most logical systems, in the sense that anything
not involving c could equally well be deduced without using c or its axiom.
(Simply because something like the natural deduction ∃-elim rule is usually
valid.) It's not quite conservative in the sense that c can be written away
from formulas, though any formula Q[c] can be read as
(∃x. P[x]) ∧ ∀x. P[x] ⇒ Q[x], or something like that. Given a theorem
∀x. ∃!y. P[x, y], it is usually conservative to add a new function symbol f
and an axiom ∀x. P[x, f(x)]. However, if unique existence is weakened to
plain existence, this can smuggle in the Axiom of Choice. The situation for
constructive logics is even more delicate.
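Lean 4's Classical.choose operator gives a minimal sketch of this mechanism: from a proof of ∃x. P[x] it produces a constant, and choose_spec supplies the axiom P[c]; applying it pointwise to ∀x. ∃y. P[x, y] yields a Skolem function, which is precisely where choice enters (the names exists_big, c, and skolem below are illustrative):

```lean
-- A proof of ∃x. P[x] for a concrete P.
theorem exists_big : ∃ n : Nat, n > 3 := ⟨4, by decide⟩

-- Name a witness c and obtain the axiom P[c]; nothing about the numeral 4
-- is recoverable from c, only the property n > 3.
noncomputable def c : Nat := Classical.choose exists_big
theorem c_spec : c > 3 := Classical.choose_spec exists_big

-- From ∀x. ∃y. P[x, y], extract f with ∀x. P[x, f x]. With mere existence
-- (not ∃!) this is exactly an application of the Axiom of Choice.
noncomputable def skolem {P : Nat → Nat → Prop} (h : ∀ x, ∃ y, P x y) :
    { f : Nat → Nat // ∀ x, P x (f x) } :=
  ⟨fun x => Classical.choose (h x), fun x => Classical.choose_spec (h x)⟩
```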
A useful feature in a practical logic is a descriptor. For
example, we may allow a new term constructor ιx. P[x], read 'the (unique) x
such that P[x]', together with an axiom (∃!x. P[x]) ⇒ P[ιx. P[x]], or
εx. P[x], read 'some x such that P[x]', together with an axiom
(∃x. P[x]) ⇒ P[εx. P[x]]. These are called definite and indefinite
descriptors, respectively. Note that the latter is something like the Axiom
of Global Choice if we allow P[x] to contain other free variables, and even
the former is weakly nonconstructive because it effectively allows us to
introduce a total function even though ∃!x. P[x] may be conditional on some
other property. A descriptor is very useful for rendering informal notions
of 'the x such that ...' (it's much used by Bourbaki), and can even be used
to implement lambda-binding, set abstraction and so on, e.g. define
λx. E[x] to be εf. ∀x. f(x) = E[x].
If we can prove ∃x. P[x] and we have an indefinite descriptor, then defining
c = εx. P[x] gives us P[c]. Conversely, for any term t we can prove
∃x. x = t, so the existential kind of definition always sanctions the
equational kind. There is thus a close relationship between the two kinds of
definition, given a descriptor.
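Both directions can be checked in Lean, where Classical.choose acts as the descriptor (c' is an illustrative name): the existence proof ∃x. x = t is trivial, and naming its witness recovers the equational axiom c = t:

```lean
-- For any term t, the existence statement ∃x. x = t is trivially provable.
example (t : Nat) : ∃ x : Nat, x = t := ⟨t, rfl⟩

-- Defining c' from that existence proof yields the equational axiom c' = t,
-- so the existential kind of definition subsumes the equational kind.
noncomputable def c' (t : Nat) : Nat :=
  Classical.choose (⟨t, rfl⟩ : ∃ x : Nat, x = t)

theorem c'_eq (t : Nat) : c' t = t :=
  Classical.choose_spec (⟨t, rfl⟩ : ∃ x : Nat, x = t)
```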
There are more complex definitional mechanisms which cannot be expanded
away in such a simpleminded manner, but can nevertheless be understood as
inessential abbreviations which could in principle be done without. These
are usually referred to as contextual definitions. A good example is
the use of ZF proper classes, where statements about 'set' membership are
really an encoding of a logical fact. Another example: cardinal numbers can
in principle be done without, since all statements about arithmetic and
equality of cardinals may be regarded as statements about equinumerosity
etc. of sets. For example, instead of |X| = |Y|² |Z| we can write
X ~ (Y × Y) × Z. The most sophisticated example of all is using
category-theoretic language like 'the category of all sets' purely as
syntactic sugar on top of quite different statements. One of the interesting
questions in formalized mathematics is the extent to which this is possible.
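The cardinal example can be made precise as a contextual definition: a statement mentioning |X| is read as a statement about the sets alone. A minimal Lean sketch (Equinumerous and Two are illustrative names; Mathlib's Equiv is the real counterpart) defines the underlying relation directly:

```lean
-- |X| = |Y| read contextually: there is a bijection between X and Y.
def Equinumerous (X Y : Type) : Prop :=
  ∃ f : X → Y, ∃ g : Y → X, (∀ x, g (f x) = x) ∧ (∀ y, f (g y) = y)

inductive Two where | a | b

-- A statement 'about cardinals' proved purely about the sets themselves.
example : Equinumerous Bool Two :=
  ⟨fun b => match b with | true => .a | false => .b,
   fun t => match t with | .a => true | .b => false,
   fun b => by cases b <;> rfl,
   fun t => by cases t <;> rfl⟩
```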
In any case, we shouldn't lay too much stress on the theoretical eliminability of definitions. It's mainly a way of assuring ourselves that we aren't 'really' extending the logic. Life without definitions would be unbearably complicated, and anyway the whole significance of a theorem may subsist in the definitions it depends on.
There are other forms of definition, for example definition by recursion and definition of inductive sets or relations. However, given a mathematical theorem justifying the existence of such a function, it can then be defined using one of the mechanisms given above. In programmable systems, the tedious matter of deriving the required case of this theorem can be automated completely.
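Both mechanisms appear in a Lean 4 sketch like the following: a recursive function, justified behind the scenes by a recursion theorem that the system derives automatically, and an inductively defined predicate, the least one closed under its rules:

```lean
-- Definition by recursion: the equations below are compiled down to an
-- application of the recursion theorem for Nat automatically.
def fact : Nat → Nat
  | 0 => 1
  | n + 1 => (n + 1) * fact n

-- An inductively defined predicate: the least relation closed under the rules.
inductive Even : Nat → Prop where
  | zero : Even 0
  | step : ∀ n, Even n → Even (n + 2)

example : fact 4 = 24 := rfl
example : Even 4 := .step 2 (.step 0 .zero)
```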