Oh, wherefore art Y...

January 24, 2006

One of the most profound ideas in lambda calculus, is the Y-operator. I’ve learned the Y-operator at least three times, and every time, I found it extremely hard to understand. This blog is therefore an example of a quixotic undertaking: I want to see whether it is possible for me to explain the Y-operator so that you, gentle reader, can understand it with minimal background. In this case, “minimal background” means at least a few years of college level mathematics.

Background: Why Y?

The Y-operator is part of lambda calculus, a field of logic that appeals to people who just couldn’t get enough of solving equations in high-school mathematics. Lambda calculus is named after the “lambda” operator, which stands for an “anonymous function”. It can be used to syntactically (that is, mechanically) transform expressions. A short example: “(lambda (x) (* x x))” is a method that takes one argument (called x) and returns the square of the argument. Here is how you use it:

((lambda (x) (* x x)) 8) ;; apply the abovementioned function to 8
=>
(* 8 8)                   ;; MECHANICALLY substiture 8 for x
=>
64                         ;; Whee!

The Y-operator is used to create recursive functions using lambdas. That is, functions that call themselves. That is its purpose. Y is defined as follows:

(Y F) = (F (Y F))

For this to work, F better be a function. Read it as follows: “Y of a function F equals F applied to Y of F”.

An example: Factorial

As I said: This allows us to define recursive stuff using just lambdas. To demonstrate, I will need a volunteer recursive function. You over there: Factorial! could you step over here, please?

The factorial function is defined thus: The factorial of a number n is the product of all natural numbers up to n. A way of writing this in a fairly conventional syntax is: fact(n) = (n == 1) ? 1 : (n * fact(n - 1)). Now, lets do it without function definitions. Lets say we want to find the factorial of 8. In lambda calculus we can write:

((Y (lambda (f) 
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x)))))))
     8)

Holy lambda, Batman! That didn’t help much, did it? Well: here’s the good thing: You don’t have to understand it. Using what we learned about lambda, we can just blindly calculate it. Lets see how it works.

Doing the math

Calculating the factorial for 8 will be a really long exercise, so let’s settle for 2 instead. Here we go:

((Y (lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x)))))))
     2)

Substitute the definition of (Y F) = (F (Y F)):

(((lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x))))))
   (Y (lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x))))))))
     2)

Whoa, those parenteses are dizzying. Well, we can substitute for the f in the first lambda:

((lambda (x)
         (if (= x 1) 
             1 
             (* x ((Y (lambda (f)
                               (lambda (x)
                                       (if (= x 1) 1 (* x (f (-1 x)))))))
                    (-1 x)))))
   
 2)

Keep substituting for the outer lambda at all times. This time x = 2 (only for the outer lambda)

(if (= 2 1) 
     1 
     (* 2 ((Y (lambda (f)
                       (lambda (x)
                               (if (= x 1) 1 (* x (f (-1 x)))))))
            (-1 2))))

We now know the answer to the if (and I’ll expand (-1 2) => 1 as well):

(* 2 ((Y (lambda (f)
                  (lambda (x)
                          (if (= x 1) 1 (* x (f (-1 x)))))))
       1))

We need to do another of those crazy Y expansions. It is not hard, just exchange the f (I’ll skip one step this time):

(* 2 ((lambda (x)
               (if (= x 1) 
                   1
                   (* x ((Y (lambda (f)
                                     (lambda (x)
                                             (if (= x 1) 1 (* x (f (-1 x))))))) 
                         (-1 x)))))
       1))

Substitute the 1 for x in the outermost lambda:

(* 2 (if (= 1 1) 
          1
          (* 1 ((Y (lambda (f)
                            (lambda (x)
                                    (if (= x 1) 1 (* x (f (-1 x))))))) 
                (-1 x)))))

If we had started with a larger number, we would’ve gotten a string of (* 4 (* 3 (* 2…. built up from these transformations. But we’re almost done: The if is true this time, so we can substitute:

(* 2 1)

The whole Y and lambda mess just evaporated. w00t! And the answer is of course 2 which is correct.

What happened?

What the magic (Y F) = (F (Y F)) does is to let us embed a function within itself an inifinitely number of times. So, for factorial:

(Y fact) = (fact (fact (fact (fact (fact (Y fact))))))

In this case fact is defined as (pay attention now): A function that takes a function as an argument and returns another function. The returned function takes a natural number as an argument, and if the number is one, it returns it, otherwise: it calls the function that was given as argument to the first function. Y keeps injecting fact into itself, so it can always go deeper in the recursive call. But as soon as the number reaches 1, we don’t have to go any further, and the “final” (Y fact) collapses into nothingness.

Y generates as many function applications as we need, so it will let us create recursiveness, without naming a single function. (Y F) is then the value of F when F is applied to itself an infinitely number of times. This is called the fixed point for the function F.

Conclusion: Why Y?

The Y operator lets us use purely syntactic (mechanical) steps to evaluate recursive functions. It gives the mathematical construct of lambda calculus as much computing power as a programming language, without needing to define names for functions. It is a mathematical marvel.

Can it be used for anything? I don’t really think so. But just as Y expands a lambda expression, it can expand your mind. I hope that you feel a little smarter going away from this blog than entering it.

You can read more about the Y-operator at wikipedia. Of course.

And if you want to go for the whole nine yards, check out the Structure and Interpretation of Computer Programs online video lectures! 20 hours of Lisp!

Postscript

Did you manage to read through the whole article? Did you understand it? Let me know what you think: Write a comment, or send me an email.

Comments:

[K-Y Chan] - Jan 1, 2007

Wow! This introduction to Y-operator is really good! Before, I could only understand the theory, the Y-operator in pure Lambda Calculus but not the implementation in Lisp… After reading this article, I can understand it finally!(yeah) Thank you very much! :D

My next target is to sort out the Turing Y-operator in Lisp… haha

Uncategorized