Unification Pattern Matching
The Unify Function
Unify( False, ?num / ?den , expr)
Number of Arguments: 3
Argument types: <Boolean> <Query> <Target>
<Boolean> is True or False
<Query> and <Target> are arbitrary
Function description:
Unify is a Unification Pattern Matcher as described in
Charniak, Riesbeck & McDermott: Artificial Intelligence Programming or
Wilensky: Lispcraft.
It can be used for "deductive information retrieval" and Prolog-style
clausal-form logic programming as described for example in
Kowalski: Logic for Problem Solving.
For the following discussion, we recommend that you first read the
discussion of List Operations
Unify uses special match-variables which are Variables that begin with ?
(such as ?this , ?that, ?x , ?y etc.). These variables are in no way treated
differently from other variables outside of Unify or Replace. One way to think of Unify is as a generalization of the test for equality: ==.
Let us define a term as a list such as (COLOR BLOCK (OR BLUE RED)). A term may contain
match-variables or not. The same match-variable may occur several times in the same
term, and in this case will in each instance refer to the same item.
Now suppose value[Term1] = trm1 and value[Term2] = trm2. If trm1 and trm2
contain no match variables and are equal, that is Term1 == Term2 returns True, then
Unify( False, Term1, Term2) will return True.
The "flag" False, the first argument to Unify, will be explained momentarily.
Now suppose trm1 or trm2 contains match-variables. These terms will
be said to be "unified" if there is given a set of substitutions for each of the
match-variables in trm1, and an
independent set of substitutions for the match-variables in trm2 such that the resulting
expressions are equal. When we say the substitutions are independent we mean that if
match-variables in trm1 happen to have the same name as match-variables in
trm2 they are unrelated, that is, needn't stand for the same item.
Match-variables may be substituted by either
other match-variables or items and these may come from either the term
containing them or the other term.
When Unify is called, it attempts to unify its second and third
argument. If the items are equal, and they contain no match-variables it returns, as
we said above, True.
This means they are unified, and no substitutions need be made.
If at least one of them contains match-variables and substitutions can
be made to unify the items, then Unify returns an Association List of substitutions that do the job. If the terms cannot be unified then Unify
returns False.
The first argument to Unify is a Boolean Variable: True or False. For
the moment, let us consider that argument to be False. This option is
somewhat faster than the True option, but allows certain matches which, while they
are pathological and arise infrequently, should be disallowed on semantic
grounds. More on this later.
Now suppose
Unify( False, Term1, Term2)
is evaluated with value[Term1] =trm1 and value[Term2] = trm2 as above.
Then trm1 is considered the Source or Query and trm2 is considered the Target. If the terms can be unified and contain match-variables then the list of
substitutions will be a list of pairs of dotted pairs such as the one below:
( (Label : Variable) (Label : Item) ).
We will treat the return value as an Association List.
Label is ? if the following variable or item is in the SOURCE, and
Label is T if the following variable or item is in the TARGET. In this way for
example variables with the same name in different terms are distinguished. (T : ?X) is different from (? : ?X).
Variable is a match-variable, and Item is some item in one of the terms, or it is a match-variable.
A pair such as the one above indicates a possible binding or a substitution of the second item for the first variable. The "bound"
variable is not actually given a value by Unify; the list returned is the only
effect of Unify.
Unify will only bind a variable once in the list it returns. And of course if
Unify encounters an inconsistency in its efforts to find bindings it will
return False.
Also notice that if item is a list such as (this list) and label is ?, then
the dotted pair will have the form (? this list) and not (? : (this list)). That is because, by definition,
(? : (this list)) is (? this list). See List Operations for more information on dotted pairs.
While Lisp has no trouble with dotted pairs and list expressions like these,
they are awkward in MathScript. Therefore, we will provide a few utility
functions for examining the output of Unify.
First, we write a function that takes a single pair of dotted pairs
( (Label : Variable) (Label : Item) ).
and replaces it with a pair
(item1 item2)
where item1 is the item associated with the source, and item2 is the one
associated with the target.
make variable ?this;
make variable ?that;
make variable ?;
program arrange (li dottedpairs) {
let a be first(dottedpairs);
let b be first(rest(dottedpairs));
if first(a) == ? then {return list(rest(a), rest(b))}
else {return list(rest(b), rest(a))}
}
Next, we create from the output of Unify, the list of these pairs. If all
match variables appear in the Source (which is usual) then this can be used as an
association list of bindings. The Lookup function can retrieve the bindings
when it wants to.
Program Unification (ex source) (ex target) {
let res be unify(false, source, target);
if res == true then {return True;} else {return map(arrange, res);}
}
Let us consider some simple scenarios.
make u unification(?this+?that, x+y)
returns ((?THIS X) (?THAT Y))
make u unification(?this+?that, x+y+z)
returns ((?THIS (+ X Y)) (?THAT Z))
So that
first(rest(Lookup(?that, u)))
returns Z
Now, assume in what follows that all variables have been declared. If we
create a term:
Make R list(Color, Block, ?What)
Make A list(Color, Block, Chartreuse)
Unification(R, A) returns
( (?WHAT CHARTREUSE) )
the variable ?WHAT in the source is bound to CHARTREUSE in the target.
Unification(A,R) returns
( (CHARTREUSE ?WHAT) )
the variable ?WHAT is now in the target, and CHARTREUSE is in the source.
Next, if we try:
Make B list(Color, ?Object, Pink)
Unification( A , B)
returns NIL, the Empty List.
no match is possible between (Color Block Chartreuse) and (Color ?Object
Pink)
But, Make B list(Color, ?Object, Chartreuse) causes
Unification( R , B)
to return:
(( BLOCK ?OBJECT) (?WHAT CHARTREUSE) )
two substitutions are needed for the unification, one for ?OBJECT in the
target, and one for ?WHAT in the source.
Unification( A , B) returns
(( BLOCK ?OBJECT))
Now to consider a more complex example, consider:
make R list( loves, ?x, ?y);
make A list ( loves, list( father, ?x), list( mother, ?x) );
Unification( R , A) returns
( ( ?X (FATHER ?X)) ( ?Y (MOTHER ?X)) )
Note that in the first pair the instances of ?X are distinct. Now we try
something a bit more ambitious:
Make R list( loves, ?x, ?x);
The match-variables refer to the same item. Does ?X love itself?
make A list ( loves, list( father, ?s), list( mother, ?s) );
Unification( R , A) returns
NIL
?x is bound to (father ?s) and then it is attempted to bind the same ?x to
(mother ?s). this can be done if (father ?s) can be unified with (mother ?s) by
some binding of ?s
consistent with all current bindings. such a unification is impossible, so
unify gives up!
We adjust things slightly, and look at the return from Unify rather than
from Unification. The reason will be clear.
make A list ( loves, list( father, ?s), list( ?p, son) );
Unify(False, R , A) returns
( ((? : ?X) (T FATHER ?S)) ((T : ?P) (T : FATHER)) ((T : ?S) (T : SON)) )
As before ?X is bound to (FATHER ?S). Next the attempt is made
to bind ?X to (?P SON). This means that (?P SON) must be unified with
(FATHER ?S)
in the current binding environment. This is accomplished by binding ?P in
the target with FATHER in the target, and then ?S in the target with SON in the
target! So in this instance we must bind variables and items from the same
term.
To test this, enter
make A list ( loves, list( father, ?s), list( ?s, son) );
Unify( False, R, A)
returns NIL
This time unification is impossible because (?S SON) cannot be unified with
(FATHER ?S) since both occurrences of ?S are in the Target and must therefore
have consistent bindings. All of our calls to Unify have been with the flag
False. Let us now consider the reason for this flag.
make A list ( loves, list( father, ?s), list(father, list( father, ?s)) );
Unify( False, R, A) returns
( ((? : ?X) (T FATHER ?S)) ((T : ?S) (T FATHER ?S)) )
After ?X is bound to (FATHER ?S) in the target, (FATHER ?S) is "unified"
with
(FATHER (FATHER ?S)) in the target by the binding of ?S to (FATHER ?S).
Except in exotic applications where one wishes to represent infinite lists defined
recursively:
(FATHER (FATHER (FATHER (FATHER (......))) this is incorrect! Binding ?S to
(FATHER ?S) does not yield expressions which can be equated meaningfully by
equal. We thus want to rule
out such bindings when they might arise.
Now the flag True or False which is the first argument to Unify is
used for this purpose.
Unify( True, R, A) returns
NIL
The flag True causes the list of bindings to be checked for "circularities"
and reports False if it happens that a variable is bound to something, which is
bound to something, which is bound to something, ..., which is bound to the
original variable. The flag False does not check for such circularities.
In principle, Unify is faster with the False option than with the T, and there
are many circumstances where one knows a priori that no circularity can arise
because of the choice of knowledge representation. In such cases, Unify False
may be used. On the other hand, the final check uses an idea that appears in Knuth's:
Art of Computer Programming a "topological"algorithm for embedding trees
in linear graphs. The algorithm employed is quite fast, it checks a certain directed
graph for cycles and reports False if it finds one. Differences in time between
the flags will appear if the terms Unified should contain very large numbers of
match-variables and these should be nested in intricate deep chains of bindings.
So there is little expense in using the True option generally with Unify.