One of a series of tutorials about Scheme in general
and the Wraith Scheme interpreter in particular.
Copyright © 2011
Jay Reynolds Freeman,
all rights reserved.
Personal Web Site: http://JayReynoldsFreeman.com
In a previous tutorial, I mentioned that the fancy way of using "define" to create procedures, with a list containing the name of the procedure and the names of all its arguments, was merely syntactic sugar for the simpler use of define, that just binds a Scheme object to a name. Let's review that example.
(define (increment n) (+ n 1))
is syntactic sugar for
(define increment (lambda (n) (+ n 1)))
Look at the second define expression. Even though we haven't discussed lambda expressions yet, you know enough about "define" to see that it is obviously binding some kind of Scheme object -- represented by "(lambda (n) (+ n 1))" -- to the symbol "increment". In everyday terms, it is giving a name to "(lambda (n) (+ n 1))". (Actually, it is giving a name to whatever results from the evaluation of "(lambda (n) (+ n 1))".)
Now look at the first define expression. You know that the result of evaluating that "define" expression will be the creation of a procedure whose name is "increment". Furthermore, if the first define expression is syntactic sugar for the second, they must both do the same thing.
Therefore, on comparing the two expressions, you might begin to suspect that "(lambda (n) (+ n 1))" is a procedure that doesn't have a name, and that all "define" does is give it one. That is in fact precisely what is going on. A lambda expression evaluates to an unnamed procedure. Let's type one into the Wraith Scheme interpreter, and press "return", and see what happens.
(lambda (n) (+ n 1)) ;; ==> #<Interpreted lambda expression, possibly named "<no name>">
The result is a new kind of object, something we have not talked about in previous tutorials, called an interpreted lambda expression, and Wraith Scheme has helpfully printed out some additional information about it.
What is going on here is that "interpreted lambda expression" is what evaluated lambda expressions are called in Wraith Scheme (there might be other kinds of lambda expressions in Wraith Scheme at some future time). Furthermore, Wraith Scheme has an enhancement, a mechanism to keep track of when you bind a lambda expression to a name, so that it can tell you the most recent name you used for a particular lambda expression, as an aid to debugging. We did not bind this particular lambda expression to a name, so that mechanism had nothing useful to tell us at present.
With the syntactic sugar in mind, you will find that the previous tutorial, "Creating Procedures", has shown you about all you need to know about the syntax of lambda expressions; "lambda" itself is a keyword to tell Scheme what is coming up, and the argument list and procedure body work pretty much as they did with the syntactic-sugared version, except that there is no procedure name preceding the arguments. For example, here is a lambda expression with two arguments -- I hope you recognize the body of the procedure we used to find the hypotenuse of a right triangle.
(lambda (x y) (sqrt (+ (* x x) (* y y))))
Here is one that takes at least one argument, and returns a list of all its arguments.
(lambda (first . rest) (append (list first) rest))
Here is one that takes any number of arguments, including none at all. It also returns a list of its arguments.
(lambda args args)
One way to think of a lambda expression is that it is how you tell a Scheme interpreter to convert data into code. If no one had told you that "lambda" was a keyword, then you might think that
(lambda (n) (+ n 1))
was just a list, whose first item was the symbol "lambda", whose second item was the list "(n)", and whose third was the list "(+ n 1)". That is in fact correct -- that expression is just a list, but when a Scheme interpreter sees the keyword "lambda", it converts that list into a procedure. The evaluation of a lambda expression creates a procedure.
It is important to distinguish between the evaluation of a lambda expression and the evaluation of a procedure application that uses a lambda expression. If you type this into a Scheme interpreter
(lambda (n) (+ n 1)) ;; ==> #<Interpreted lambda expression, possibly named "<no name>">
the interpreter will evaluate the lambda expression, and will return a procedure of the kind that Wraith Scheme calls an "interpreted lambda expression". There is no procedure application involved; nothing was done with the procedure thereby created.
On the other hand, here is a procedure application involving a lambda expression. I have used white space to make it easier to distinguish between the procedure and its argument. Note that the entire lambda expression is the first element of two-element list, whose second element is the number 41. That two-element list is a procedure application.
( (lambda (n) (+ n 1)) 41 ) ;; ==> 42
When Scheme sees that example, it first evaluates the lambda expression itself, returning a procedure that takes one argument, then applies that procedure to 41, and gets 42 as the result. Recall that Scheme always evaluates the first item in a procedure application, though in most of the examples we have seen the evaluation was simply discovering the value of a variable -- looking up its binding -- such as finding that "+" is bound to a procedure that does addition.
Lambda expressions are of great theoretical interest in the Scheme community, because it turns out that nearly everything that can be done in Scheme can be described and even implemented in terms of lambda expressions. Yet if they were only of theoretical interest, I would not be discussing them here. Lambda expressions are very practical, and are handy tools for the working programmer in several circumstances.
To begin with, lambda expressions can make source code easier to read, easier to maintain, and easier to understand. For example, suppose you needed to add up the squares of three numbers, and also suppose that each number you needed to square was itself the result of a lengthy calculation. If you tried a straightforward but brute-force approach to writing the code to do that, you might end up with something like this:
(+ (* (+ (sin 3) (exp 1.5)) (+ (sin 3) (exp 1.5))) (* (modulo (* 6 9) 42) (modulo (* 6 9) 42)) (* (* 5 4 3 2 1) (* 5 4 3 2 1))) ;; ==> 14565.370363775319
Note the problems with that expression. To begin with, it is hard to read. Someone not familiar with your code might miss the overall simple fact that you were adding up three squares, or you yourself might forget if you put the code aside for a while. Second, you have had to type each of the expressions to be squared twice; for example, "(* 5 4 3 2 1)" appears two separate times in the expression. The repeats make the code messier to look at, they double the number of opportunities for typographical error, and they mean that the Scheme interpreter may well have to waste time calculating each repeated expression twice.
Instead, let's write it this way.
((lambda (a b c) (+ (* a a) (* b b) (* c c))) (+ (sin 3) (exp 1.5)) (modulo (* 6 9) 42) (* 5 4 3 2 1) ) ;; ==> 14565.370363775319
That expression is yet another example of a procedure application, and we already know how those work. The procedure is the lambda expression itself, "(lambda (a b c) (+ (* a a) (* b b) (* c c)))". It is obviously a procedure of three arguments, and it obviously returns the sum of their squares. Each argument is written out only once, which makes things easier to read and less subject to typographical error, and each argument is evaluated only once, which may make the program run a little faster.
This example demonstrates that lambda expressions can be used as unnamed auxiliary procedures. You can use auxiliary procedures in essentially any programming language, but most languages require you to write them out and give them a name, even if you are only going to use them once. That requires extra typing, and moves the body of the auxiliary procedure away from where it is going to be used, which might mean that you will have to do lots of scrolling back and forth from place to place to keep track of what is going on. Let me show you the named-auxiliary-procedure style, just so you can compare the two approaches, and I say again that I do not personally recommend this approach as good Scheme programming style.
(define (sum-three-squares a b c) (+ (* a a) (* b b) (* c c))) ;; ==> sum-three-squares (sum-three-squares (+ (sin 3) (exp 1.5)) (modulo (* 6 9) 42) (* 5 4 3 2 1) ) ;; ==> 14565.370363775319
I hope you get the feeling that it is rather needless work to keep writing and naming auxiliary procedures that are going to be used only once. That feeling will probably become stronger when you have some complicated routine that needs a dozen of them.
One place where unnamed auxiliary procedures can be especially useful is in the body of other procedures. Let's take the familiar example of the "hypotenuse" procedure as an example. It was
(define (hypotenuse side-a side-b) (sqrt (+ (* side-a side-a) (* side-b side-b)))) ;; ==> hypotenuse
That is simple enough that we don't really need an auxiliary procedure, but let's put one in to get used to the idea in simple circumstances. We can write an equivalent procedure as
(define (hypotenuse side-a side-b) (sqrt ((lambda (x y) (+ (* x x) (* y y))) side-a side-b))) ;; ==> hypotenuse (hypotenuse 3 4) ;; ==> 5
Using the auxiliary procedure perhaps makes it a little clearer on reading that what we are doing is summing two squares. In this example it doesn't make a whole lot of difference, but with more complicated operations it would. We will see more examples in subsequent tutorials.
Now that we have seen lambda expressions used inside of procedures, I can make another point about lexical scope, which we discussed in the tutorial, "Environments and Scope". Here is some code involving identically typed lambda expressions. I am not going to show what it does -- no ";; ==>"s for the moment. Just look at it, and think about some questions.
(define x 42) ((lambda () (display x) (newline))) (define (foo x) ((lambda () (display x) (newline)))) (foo 137)
The preceding example contains four expressions. When we evaluate the first one, we will simply be creating a variable named x, that has 42 bound to it as its value in the top-level environment. Here is the first question:
(define x 42) ;; ==> x ((lambda () (display x) (newline))) ;; ==> 42 ;; #t (define (foo x) ((lambda () (display x) (newline)))) ;; ==> foo (foo 137) ;; ==> 137 ;; #t
The explanation for that has to do with the details of how lexical scope works. Remember the basic idea: Languages in which you can figure out the scope of a variable binding just by looking at the source code are said to have lexical scope. And remember also, that the scope of the argument variables of a procedure is the procedure body; any such body has its own separate environment.
A lambda expression is a procedure of no names; therefore the first place Scheme looks for the bindings of symbols used in the body of a lambda expression is the environment associated with the lambda expression itself. The lambda expressions used in the preceding example do not have an argument variable named "x", so Scheme finds no binding for x in their own environments. Scheme must look elsewhere.
Let's investigate what happens in "(foo 137)" first. The lambda expression used in "foo" is lexically enclosed within the body of "foo" -- that is a fancy way of saying that the text of that lambda expression is contained within the text of the body of "foo". Therefore, for that lambda expression, the next place where Scheme looks for a binding of "x" is in the environment of "foo" -- and it finds one. Procedure "foo" has an argument variable named "x", so when Scheme is looking up the "x" for the lambda expression within "foo", it will find that argument variable in "foo"'s own environment, and use its value. So "(foo 137)" prints out the value that is bound to x in that particular procedure application, which is 137. (It then prints a newline, and the Wraith Scheme interpreter prints out the value returned from "foo", which is the value returned by the last expression evaluated within "foo", which is #t.)
For the evaluation of "((lambda () (display x) (newline)))" at top level, Scheme again finds no binding of x within the body of the lambda expression, so again turns to the lexically enclosing environment, which is the top-level environment. There is a binding of "x" there, to 42. So the top-level evaluation of our lambda expression prints out 42. Note carefully:
Lambda expressions that have the same source code can act differently depending on the environment in which they are used.
Consider "(foo 137)" again. The lookup procedure for the "x" used within its lambda expression did not "see" the binding of x at top level because it encountered the binding of x within the body of "foo" first. The terminology that the Scheme community uses for this situation indeed suggests vision being blocked: We say that the binding of x in "foo" shadows the binding of x at top level.
If you like, you can think of the parentheses that bound the definition of a procedure as walls. The scope of the procedure's argument variables -- where their environment is valid -- is between those walls, and in the case of nested procedures -- like the lambda expression within foo -- there are nested scopes. The fact that you can see the walls just by looking at the text of the procedures is why we call this kind of scoping "lexical scope".
Here are the procedures from the last example, with the "wall" parentheses shown in boldface type, and set off by extra white space. Note the nested scopes in "foo"; the one for the lambda expression is entirely contained within the one for "foo" itself. You can tell where that embedded lambda expression is going to get its value of x just by looking at the source code surrounding the lambda expression; you do not need to keep track of what else has been happening in the Scheme interpreter, like what programs have run or what things have been evaluated recently.
( (lambda () (display x) (newline)) ) ( define (foo x) ( (lambda () (display x) (newline)) ) )
As a final example, here is a procedure containing a slightly more complicated lambda expression, that prints out two variables. Given the top-level variable bindings of x and y shown in the example, what does "(bar 137)" print out?
(define x 42) (define y 73) (define z 23) (define (bar x) ((lambda (z) (display x) (newline) (display y) (newline) (display z) (newline) ) (- x 49) ) ) (bar 137)
The answer is, that it prints out 137, then 73, then 88. This lambda expression takes one argument, that gets bound to its own argument variable "z". In the example, the lambda expression is applied to (- x 49), and since "bar" is called with x bound to 137, the lambda expression finds that z is bound to 88 (137 minus 49) in its own environment. The lambda expression takes the value of x from the environment of "bar", where it is 137, and takes the value of y from the top-level environment, where it is 73. The top-level definitions of x and z are shadowed when seen from within the lambda expression.
Here is an even odder example of how lexical scope works.
(define (make-procedure x) (lambda () x) ) ;; ==> make-procedure
Look very closely at this example. The lambda expression within the body of "make-procedure" is not applied to anything; instead, it is the value returned by the procedure. What we have here is a procedure that creates another procedure. The return value from "make-procedure" is itself a procedure, one that takes no arguments. We can take the result of a procedure application of "make-procedure", and use it like any other procedure.
The question is, suppose we do the following.
(define x 42) ;; ==> x (define (make-procedure x) (lambda () x) ) ;; ==> make-procedure (define new-procedure (make-procedure 99)) ;; ==> new-procedure. (set! x 137) ;; ==> (new-procedure) ;; ==> <Guess what?!?>
What value do you think "(new-procedure)" will return?
The answer has to do with how the lexical-scope environment lookup mechanism works. How does Scheme know what environments to use in looking up variables, and in what sequence? It turns out that every evaluated lambda expression contains a list of pointers to the environments in which it is supposed to look things up, in the correct order. I say "pointers" to make it clear that the environments are not copied; for example, the end of every such pointer list is a pointer to the top-level environment, and there is only one top-level environment. Yet every time any given procedure is used in a procedure application, Scheme creates a new local environment for it, that contains the particular values of its argument variables during that particular procedure application.
Thus when the procedure application "(make-procedure 99)" was evaluated, Scheme created a local environment for "make-procedure" in which the variable x was bound to 99, and included a pointer to that environment in the list of environments in which "(lambda () x)" is supposed to look up things. Even after the procedure application "(make-procedure 99)" has ended, that environment is still around, and the procedure "new-procedure" will use it every time it needs to look up variable bindings. So in the procedure application "(new-procedure)", Scheme will use the value of x that obtained during procedure application "(make-procedure 99)", which was 99: Therefore
(new-procedure) ;; ==> 99
Let me say that again, in somewhat more melodramatic terms. Even after "(make-procedure 99)" has finished, and that procedure application has gone wherever old procedure applications go when they die, its environment yet survives, and can be used by the value the deceased procedure evaluation returned -- by "new-procedure" -- whenever necessary. The procedure application "(make-procedure 99)" is gone, but it is not forgotten!
There are some more new words here. This kind of persisting saved structure of lexically scoped variable bindings is called a closure. The function with which it is associated is said to be closed over all of the variables that it contains, whether those variables are its own argument variables or not.
The word "closure" in this context is not quite used in the same way as in any of its more common meanings, it is just a new and slightly suggestive term for a new concept. If you like, you can think of the function as putting all the variable bindings it is ever going to need in a closed box somewhere, where it can get at them when it needs them. Yet be careful -- some of the saved bindings are actually where other procedures can get at them and change them, like those in the top-level environment.
Once more, for clarity: In Scheme, every evaluated lambda expression is closed over all of the variables that it uses.
By the way, if you are thinking that this stuff about lexical scope and environments and closures is complicated and difficult, you have lots of company; I think so myself. Yet it is all very useful, as we will see in later tutorials.
-- Jay Reynolds Freeman (Jay_Reynolds_Freeman@mac.com)