Unification Pattern Matching

The Unify Function

Unify( False, ?num / ?den , expr)

Number of Arguments: 3

Argument types: <Boolean> <Query> <Target>

<Boolean> is True or False

<Query> and <Target> are arbitrary

Function description:

Unify is a Unification Pattern Matcher as described in

Charniak, Riesbeck & McDermott: Artificial Intelligence Programming or

Wilensky: Lispcraft.

It can be used for "deductive information retrieval" and Prolog-style clausal-form logic programming as described for example in

Kowalski: Logic for Problem Solving.

For the following discussion, we recommend that you first read the discussion of List
Operations

Unify uses special match-variables which are Variables that begin with ?

(such as ?this , ?that, ?x , ?y etc.). These variables are in no way treated differently from other variables outside of
Unify or Replace. One way to think of Unify is as a generalization of the test for equality: ==.

Let us define a term as a list such as (COLOR BLOCK (OR BLUE RED)). A term may contain match-variables or not. The same match-variable may occur several times in the same term, and in this case will in each instance refer to the same item.

Now suppose value[Term1] = trm1 and value[Term2] = trm2. If trm1 and trm2 contain no match variables and are equal, that is Term1 == Term2 returns True, then

Unify( False, Term1, Term2)
will return True.

The "flag" False, the first argument to Unify, will be explained momentarily.

Now suppose trm1 or trm2 contains match-variables. These terms will be said to be "unified" if there is given a set of substitutions for each of the match-variables in trm1, and an

independent set of substitutions for the match-variables in trm2 such that the resulting expressions are equal. When we say the substitutions are independent we mean that if

match-variables in trm1 happen to have the same name as match-variables in trm2 they are unrelated, that is, needn't stand for the same item. Match-variables may be substituted by either

other match-variables or items and these may come from either the term containing them or the other term.

When Unify is called, it attempts to unify its second and third argument. If the items are equal, and they contain no match-variables it returns, as we said above, True.

This means they are unified, and no substitutions need be made.

If at least one of them contains match-variables and substitutions can be made to unify the items, then Unify returns an
Association List of substitutions that do the job. If the terms cannot be unified then Unify returns False.

The first argument to Unify is a Boolean Variable: True or False. For the moment, let us consider that argument to be False. This option is somewhat faster than the True option, but allows certain matches which, while they are pathological and arise infrequently, should be disallowed on semantic grounds. More on this later.

Now suppose

Unify( False, Term1, Term2)

is evaluated with value[Term1] =trm1 and value[Term2] = trm2 as above.

Then trm1 is considered the Source or Query and trm2 is considered the Target. If the terms can be unified and contain match-variables then the list of substitutions will be a list of pairs of dotted pairs such as the one below:

( (Label : Variable) (Label : Item) ).

We will treat the return value as an Association List.

Label is ? if the following variable or item is in the SOURCE, and

Label is T if the following variable or item is in the TARGET. In this way for example variables with the same name in different terms are distinguished. (T : ?X) is different from (? : ?X).

Variable is a match-variable, and Item is some item in one of the terms, or it is a match-variable.

A pair such as the one above indicates a possible binding or a substitution of the second item for the first variable. The "bound" variable is not actually given a value by Unify; the list returned is the only effect of Unify.

Unify will only bind a variable once in the list it returns. And of course if Unify encounters an inconsistency in its efforts to find bindings it will return False.

Also notice that if item is a list such as (this list) and label is ?, then the dotted pair will have the form (? this list) and not (? : (this list)). That is because, by definition,

(? : (this list)) is (? this list). See
List Operations for more information on dotted pairs.

While Lisp has no trouble with dotted pairs and list expressions like these, they are awkward in MathScript. Therefore, we will provide a few utility functions for examining the output of Unify.

First, we write a function that takes a single pair of dotted pairs

( (Label : Variable) (Label : Item) ).

and replaces it with a pair

(item1 item2)

where item1 is the item associated with the source, and item2 is the one associated with the target.

make variable ?this;

make variable ?that;

make variable ?;

program arrange (li dottedpairs) {

let a be first(dottedpairs);

let b be first(rest(dottedpairs));

if first(a) == ? then {return list(rest(a), rest(b))}

else {return list(rest(b), rest(a))}

}

Next, we create from the output of Unify, the list of these pairs. If all match variables appear in the Source (which is usual) then this can be used as an association list of bindings. The Lookup function can retrieve the bindings when it wants to.

Program Unification (ex source) (ex target) {

let res be unify(false, source, target);

if res == true then {return True;} else {return map(arrange, res);}

}

Let us consider some simple scenarios.

make u unification(?this+?that, x+y)

returns ((?THIS X) (?THAT Y))

make u unification(?this+?that, x+y+z)

returns ((?THIS (+ X Y)) (?THAT Z))

So that

first(rest(Lookup(?that, u)))

returns Z

Now, assume in what follows that all variables have been declared. If we create a term:

Make R list(Color, Block, ?What)

Make A list(Color, Block, Chartreuse)

Unification(R, A) returns

( (?WHAT CHARTREUSE) )

the variable ?WHAT in the source is bound to CHARTREUSE in the target.

Unification(A,R) returns

( (CHARTREUSE ?WHAT) )

the variable ?WHAT is now in the target, and CHARTREUSE is in the source.

Next, if we try:

Make B list(Color, ?Object, Pink)

Unification( A , B)

returns NIL, the Empty List.

no match is possible between (Color Block Chartreuse) and (Color ?Object Pink)

But, Make B list(Color, ?Object, Chartreuse) causes

Unification( R , B)

to return:

(( BLOCK ?OBJECT) (?WHAT CHARTREUSE) )

two substitutions are needed for the unification, one for ?OBJECT in the target, and one for ?WHAT in the source.

Unification( A , B) returns

(( BLOCK ?OBJECT))

Now to consider a more complex example, consider:

make R list( loves, ?x, ?y);

make A list ( loves, list( father, ?x), list( mother, ?x) );

Unification( R , A) returns

( ( ?X (FATHER ?X)) ( ?Y (MOTHER ?X)) )

Note that in the first pair the instances of ?X are distinct. Now we try something a bit more ambitious:

Make R list( loves, ?x, ?x);

The match-variables refer to the same item. Does ?X love itself?

make A list ( loves, list( father, ?s), list( mother, ?s) );

Unification( R , A) returns

NIL

?x is bound to (father ?s) and then it is attempted to bind the same ?x to (mother ?s). this can be done if (father ?s) can be unified with (mother ?s) by some binding of ?s

consistent with all current bindings. such a unification is impossible, so unify gives up!

We adjust things slightly, and look at the return from Unify rather than from Unification. The reason will be clear.

make A list ( loves, list( father, ?s), list( ?p, son) );

Unify(False, R , A) returns

( ((? : ?X) (T FATHER ?S)) ((T : ?P) (T : FATHER)) ((T : ?S) (T : SON)) )

As before ?X is bound to (FATHER ?S). Next the attempt is made

to bind ?X to (?P SON). This means that (?P SON) must be unified with (FATHER ?S)

in the current binding environment. This is accomplished by binding ?P in the target with FATHER in the target, and then ?S in the target with SON in the target! So in this instance we must bind variables and items from the same term.

To test this, enter

make A list ( loves, list( father, ?s), list( ?s, son) );

Unify( False, R, A)

returns NIL

This time unification is impossible because (?S SON) cannot be unified with

(FATHER ?S) since both occurrences of ?S are in the Target and must therefore have consistent bindings. All of our calls to Unify have been with the flag False. Let us now consider the reason for this flag.

make A list ( loves, list( father, ?s), list(father, list( father, ?s)) );

Unify( False, R, A) returns

( ((? : ?X) (T FATHER ?S)) ((T : ?S) (T FATHER ?S)) )

After ?X is bound to (FATHER ?S) in the target, (FATHER ?S) is "unified" with

(FATHER (FATHER ?S)) in the target by the binding of ?S to (FATHER ?S). Except in exotic applications where one wishes to represent infinite lists defined recursively:

(FATHER (FATHER (FATHER (FATHER (......))) this is incorrect! Binding ?S to (FATHER ?S) does not yield expressions which can be equated meaningfully by equal. We thus want to rule

out such bindings when they might arise.

Now the flag True or False which is the first argument to Unify is

used for this purpose.

Unify( True, R, A) returns

NIL

The flag True causes the list of bindings to be checked for "circularities" and reports False if it happens that a variable is bound to something, which is bound to something, which is bound to something, ..., which is bound to the original variable. The flag False does not check for such circularities.

In principle, Unify is faster with the False option than with the T, and there are many circumstances where one knows a priori that no circularity can arise because of the choice of knowledge representation. In such cases, Unify False may be used. On the other hand, the final check uses an idea that appears in Knuth's: Art of Computer Programming a "topological"algorithm for embedding trees in linear graphs. The algorithm employed is quite fast, it checks a certain directed graph for cycles and reports False if it finds one. Differences in time between the flags will appear if the terms Unified should contain very large numbers of match-variables and these should be nested in intricate deep chains of bindings. So there is little expense in using the True option generally with Unify.