Tail Recursion

One of a series of tutorials about Scheme in general
and the Wraith Scheme interpreter in particular.

Copyright © 2011 Jay Reynolds Freeman, all rights reserved.
Personal Web Site: http://JayReynoldsFreeman.com
EMail: Jay_Reynolds_Freeman@mac.com.

A student seeking wisdom found a venerable scholar who was reputed to know the answer to many questions. "Tell me, oh wise one," the student said, "The world is floating in infinite space. What keeps it from falling down?"

The scholar replied, "Our world is supported on the broad back of a great elephant, who is more than strong enough to bear the terrible weight."

The student thought for a moment, then asked with a puzzled expression, "Oh wise one, what keeps the elephant itself from falling down?"

"That elephant stands on the back of a second great elephant, who is more than strong enough to bear the weight," replied the scholar.

The student continued, "Oh wise one, what keeps the second elephant from falling down?"

"The second elephant stands upon a third, who is strong enough to bear the weight," replied the scholar.

"Oh wise one," the student began, "what keeps the third elephant from --" but the scholar raised a hand and smiled.

"Not to worry, young acolyte! It's elephants all the way down."

Many programming languages have an elephant problem that is closely related to the one in the joke. Scheme doesn't, and understanding what the problem is, and when, how and why Scheme can avoid it, are important things for a Scheme programmer to know. Unhappily, the issues here are moderately technical, so it will take a little explanation to get to the heart of the matter.

Furthermore, I expect that many readers of this tutorial will be programmers who are experienced with other languages that do have the "elephant problem", and that many of them are so used to it that it will take extra explaining to make the points that Scheme does something different, and that the difference confers great advantages and sometimes requires new programming styles. So please bear with me.

For those of you who have previous programming experience, I will name the problem in case you have seen it before -- but if you are a newcomer and have never heard of these terms, don't worry, we will talk about what they mean soon enough. The problem is called stack overflow, and sometimes people mention a specialize version of it called recursive stack overflow.

In these tutorials, we have seen many instances when a procedure or other form that is executing must call some other procedure or form. Here is a simple example with a silly, made-up procedure.

When procedure "foo" executes, it uses several other procedures and forms: It uses the "set!" special form, which in turn needs the "+" procedure to evaluate its second argument. Then it calls procedure "bar", which we have not yet defined.

I appeal to your intuition, that there is an issue here of Scheme needing some way to remember what procedure "foo" was doing, while that procedure is momentarily stopped to let some other procedure -- like "bar" -- execute. It is like when you are working in the kitchen, making something fancy from a recipe, and you come to a place in the cookbook where the instructions say "Make a mushroom mole (see page 42) and pour it on top of the marinated mangoes." What you have to do then is put all the ingredients and dishes and utensils for the main dish aside somewhere, and turn to page 42, and get out the stuff to make mushroom mole, and make it, and when it is done put all the stuff you used to make it away, and then get back to whatever you were about to do to those helpless mangoes.

That's a pretty big deal, but it might get worse. Suppose that on page 42, in the instructions for making mushroom mole, you find the instructions "Saute the mushrooms (see page 137)." Now you have to put most of the stuff for making mushroom mole aside for a while -- if your kitchen is small, perhaps you have to pile the dishes on top of the ones you were using for the marinated mangoes -- and set up for sauteing mushrooms.

It might get worse still, if on page 137 you find "Mushrooms are best sauteed in clarified butter (see page 73)." By now your stack of dishes in use is getting pretty deep, and there is not necessarily an end in sight. The cookbook may go on giving you cross-references to other pages forever. Sometimes I think the people who write cookbooks just love to do that.

To see what that has to do with programming, imagine that you have a robot to do your kitchen work (I wish!), and that it is programmed in Scheme with a lot of procedures for cooking. They might contain code like this.

Do you see what happens when those procedures run? The robot will have to interrupt what it is doing to do something else, in just the same pattern that occurred when you were doing the cooking.

Now, the point is not that the robot is going to have to stack dishes on your kitchen counter to get the job done, though perhaps it will have to do so. The point is that somehow, the robot is going to have to remember that it was in the middle of procedure "marinated-mangoes" while it has stopped that procedure temporarily to execute "mushroom-mole", and then remember that it was in the middle of "mushroom-mole" while it stops temporarily to execute "sautee-mushrooms", and so on.

The way programming languages usually deal with that kind of situation also involves a kind of stack, but one that involves computer memory rather than a kitchen counter. There is a big piece of memory that starts out empty; the idea is to put stuff into it starting at one end, in consecutive chunks, and to take stuff off in the reverse order. When you take stuff off the stack, you of course have emptied out the stack space that was being used by that particular batch of stuff, and can use the space again the next time you need to save something on the stack. The whole thing works just like the stack of dishes in the "kitchen" example we just talked about, or perhaps like the stack of papers on your desk when you have too many things to do and keep piling things on top of each other.

Thus when a procedure has to stop working, to call another procedure and let it execute, Scheme saves the details of what the first procedure was doing on the stack. After the second procedure has finished, Scheme can recover those details from the stack so that the first procedure can pick up where it left off. If the second procedure calls a third, its own details get put on the stack on top of the details of the first procedure, and so on. The stack gets bigger and smaller as procedures call other procedures, or finish up and return a result.

If you are a programmer, you may have heard of this kind of stack before, so I will mention some of the names commonly used to refer to it. Once again, if you are a newcomer, don't worry if the terms are new to you. This kind of stack is sometimes called a procedure call stack, or an application record stack (the kind of application meant is a procedure application, not an application in the sense of a complete program or "app"), or a frame stack. Sometimes it is just called the stack, when it is clear from context that you are talking about the inner workings of a running computer program.

Incidentally, this notion of a stack as a way of handling data in a computer program has many other uses. Many programs will create and use stacks that have nothing to do with saving data during procedure applications.

The summary point so far is that the stack of application records is like the stack of elephants in the joke: They take up space. They stand on top of each other. The "world" -- what is actually happening at the moment in the Scheme program -- relies on the stack to keep it from falling on its face (that is, to provide the knowledge of what to do next when the current procedure returns). And sometimes there are a whole lot of elephants -- the stack can get very deep. Our Scheme examples so far have only talked about stacks that were three or four deep, but that is by no means the limit.

In fact, there is no limit. It is possible to write a program in which the procedure call stack keeps growing forever. There is a problem with that, of course: Sooner or later the stack will run out of memory; the chunk of memory reserved for it will fill up. At that point, something bad will happen -- just what depends on the details of how the stack operates and what your computer's operating system does when a program attempts to access some memory that it is not supposed to use -- but for sure, your program will not be able to continue operating the way you expected.

The condition of a stack running out of memory is called stack overflow, and it is most commonly associated with recursive functions. A recursive function is one that calls itself; for example, consider a function to calculate the factorial of a positive integer n, which is written n! and which is the product of all the integers from one up through and including n. Thus

We might write a procedure to calculate the factorial of n as follows.

When you call this procedure with 4, it first checks to see whether 4 is less than or equal to 1, which it isn't. It then makes use of the fact that 4! = 4 * 3!, and tries to return (* 4 (factorial 3)). For that to work, procedure "factorial" must be called again, with an argument of 3. It in turn wants to return (* 3 (factorial 2)), and (factorial 2) wants to return (* 2 (factorial of 1)). At that point, however, the recursion stops, because (factorial 1) can return 1 with no further ado. That gets multiplied by 2 in (factorial 2), which thereby returns 2 to (factorial 3), which in turn multiplies by 3 and returns 6 to (factorial 4), which returns 24 as the entire result.

In this example, "factorial" only called itself recursively three times, but it is easy to write a procedure that recurses forever, and that will cause Wraith Scheme's stack to overflow in short order. Here is a contrived example.

All "foo" does is just call itself recursively and then return 42, except the returned value is a little bogus: Procedure "foo" never returns anything at all. Now let's evaluate "(foo)" and see what happens, but before you do make sure you know how to use Wraith Scheme's "Reset to Top-Level Loop" command. You are going to need it.

The program failed in less than a second when I ran it. How long it takes to fail on your computer will depend on how much memory you allow Wraith Scheme to use and how powerful a processor you have. All those lines of "foo" were efforts on the part of Wraith Scheme's simple-minded built-in debugger to tell you how the program got into its predicament: In more complicated circumstances it might be useful to know the exact sequence of procedure calls that led up to the failure, but in this case all that happened was that "foo" called itself recursively too many times.

Now let me show you how too modify "foo" so that it can recurse forever and yet will not cause stack overflow. We only have to remove one line.

Now let's try evaluating "(foo)".

Wraith Scheme just sits there, but if you look at the memory-use displays and indicator lights in the Wraith Scheme Instrument Panel you will see that Scheme is running code. The procedure is hard at work, calling itself recursively, and none of those procedure calls ever return. Yet the stack does not overflow! If you like, you can go away and take a few weeks' vacation and leave your computer running all the while, and the stack still will not have overflowed when you get back. Something new is going on. I have already given a hint of what it is, earlier in this tutorial, but the hint was subtle. Can you figure it out?

I said that the purpose of the procedure call stack was so that a procedure that had been temporarily stopped while another procedure was running, could recover the details of what it had been doing when that other procedure finished and returned a value. With that in mind, look carefully at the latest version of "foo", and notice that "foo" has nothing left to do after the recursive call to itself has returned: The recursive call, "(foo)", is the last thing that happens in the body of "foo", when "(foo)" is executed.

Let me restate. There is no need for Scheme to save a temporary record of what "foo" is doing at the time it makes its recursive call, because at that moment it is all done! All it has to do is return the value of "(foo)" -- when and if "(foo)" ever returns. Every procedure call has to return the value of the last thing it evaluates anyway, so there is no need to keep a special reminder that that is the plan. So in circumstances like these, Scheme does not need to put another application record on the stack -- there is no need to add another elephant -- and Scheme does not do so. In effect, it reuses the previous application record instead of providing a new one; thus the stack does not grow.

A procedure in which a call to another procedure occurs as the very last thing that the first procedure does is said to be making a tail call, and the call itself is said to be in tail position. The use of the word "tail" is quite conventional: The call takes place at the "tail end" of the procedure; it is the last thing that happens as the procedure goes by, so to speak. If the call in question is recursive, it is said to be tail recursive. The business of reusing the same application record is called tail-call elimination or tail-call optimization. Instances of recursive calls which can be tail-call optimized are sometimes said to be properly tail recursive, and sometimes that term is also applied to the entire procedure in which such calls occur. (Be careful, though: It is possible for a procedure to contain some recursive calls that are not properly tail recursive and others that are.)

Few programming language implementations provide tail-call elimination, but Scheme does. Tail-call elimination is a fundamental feature of Scheme. Scheme was designed with tail-call elimination in mind, even from its very beginnings.

More examples may make things clearer. Let's modify "foo" so that it actually does something before the recursive call.

When you paste in this "foo" and evaluate "(foo)", the process will print "foo " all over the screen -- once for each recursive call -- but it still won't overflow the stack. If you would like to see what the stack depth is, Wraith Scheme has, as an enhancement, a procedure called "e::stack-depth" that returns that value. Try this version of "foo". (Incidentally, the number displayed is not the number of application records on the stack, it is the total number of individual items on the stack -- each application record contains at least six.)

Make sure you understand that what counts is not what happens in the last line of text of the procedure body, but what happens in the last expression of the procedure that is actually evaluated. Here is a procedure that can end in either of two ways, depending on the argument it is called with. You call it with a boolean value, which it passes on in recursive calls. If the boolean is true, the recursive calls are tail calls, but if the boolean is #f, they are not tail calls, because something else happens after the recursive call. I included a print-out of the stack depth before the recursion, so that you can watch what happens when you evaluate the procedure.

Try evaluating "(two-ways #t)" and then "(two-ways #f)". In the first case, the stack depth will remain the same, but in the second, it will grow with every recursive call. That is true even though none of the calls to "(two-ways #f)" ever returns; the point is not whether these calls return or not, the point is that the second recursive call, in the "alternative" branch of the "if", is not a tail-call -- something else happens after it.

An extremely smart Scheme implementation might in principle be able to analyze procedure "two-ways", determine that the second recursive call would never return, and then indeed reuse the application record. To do so would not be tail-call optimization, however, simply because that second call is not a tail call. It would be a much more sophisticated form of optimization, which as far as I know does not presently have a name.

Here are two procedures which make tail calls to each other. Since no procedure calls itself, we do not have recursion in the precise technical sense, but Scheme can perform tail call optimization even so. Try calling "(bar)" or "(frob)", and see what happens.

So what good is tail recursion? All of the examples so far have been pretty contrived. The answer is part simplicity and part programming style.

First, there are certainly situations in which a program needs to do the same thing many times in a row -- situations in which the body of a recursive procedure like "foo" would be doing something other than contributing to a made-up example. A procedure might be monitoring some software status, or perhaps a piece of hardware that was connected to the computer, and might need to do so often. In that case, a tail-recursive procedure is very natural, something like this.

Procedure "e::usleep" is a Wraith Scheme enhancement that uses the Unix "usleep" system function to sleep for the given number of microseconds. Procedure "check-gizmo" would thereby run about a thousand times per second, and would rapidly overflow the stack if it were not tail recursive and tail-call optimized.

Other programming languages would use some kind of "loop" construct to achieve the same effect. Scheme doesn't have a "for" or "while" loop, but suppose it did. We could then write equivalent code in at least two ways. The first one would be to strip the recursive call and the sleep procedure out of "check-gizmo" and write a simple looping procedure that called "check-gizmo" repeatedly.

The second would be to write one bigger procedure that included the loop.

Which of these approaches looks simpler to you? The single procedure with tail-call optimization has six lines with 13 Scheme objects (counting the string as one object). The two-procedure approach has nine lines and 19 Scheme objects. The one bigger procedure has seven lines and 15 objects. I won't try to tell you what should look simpler to you, or that simplicity is a virtue in its own right, or what kind of code to write, but I will tell you that many people who are familiar with both kinds of code much prefer the style with optimized tail calls. (I am one of them -- I have written hundreds of thousands of lines of C and C++.) I will also tell you that the vast majority of programmers have little or no experience with the use of tail recursion in this manner, and will thereby suggest that you should at least try it.

The difference between looping constructs and optimized tail recursion is even more prominent when you have to run many procedures in sequence, and that sequence contains a loop. The model for the next example is a bunch of people sitting around a table, all thinking about an idea, or trying to solve a problem, and passing the results of their cogitation from one to another in turn, for many loops around the table.

Written this way, it is very clear what happens, but try writing it in a loop and not only does it get messy, but also you have in effect created a manager of thinkers -- the system of iterating over each thinker inside the loop -- that micromanages: You have put the control over what each thinker does in a central place, and reduced the freedom of each thinker to decide what is appropriate to do next. Suppose that the right thing for thinker-3 to do were to decide which thinker gets the results next.

Try writing a loop with that in it. Worse yet, suppose all of the thinkers were making decisions like that. In such circumstances, I think there is a decent case that the tail-call optimized code is much easier to understand and to maintain than is code written using a loop.

This style of computation, in which one of a number of procedures does something and then invokes some other procedure to carry on the work, is sometimes called agent-based computing. The individual procedures are then called either agents or actors.

By the way, if you are one of those geeky people like me, who think the formal theory of computation is not only neat but also useful, you might appreciate that it is particularly easy to write code for a deterministic finite automaton in Scheme. If you have never heard of a deterministic finite automaton, don't apologize, we geeks are very forgiving -- condescending, but forgiving.

I hope I have not lost you completely in this introduction to tail recursion and how it is handled in Scheme, and I hope you get a sense that tail-call optimization provides programmers with the opportunity to think about computation in new, powerful ways. I hope you also remember that it doesn't have to be elephants for a long way down -- with good coding practice and tail-call optimization, the world of Scheme can stay very nearly at the bottom end of infinite space, and most of those hapless elephants can at last be set free.

-- Jay Reynolds Freeman (Jay_Reynolds_Freeman@mac.com)

Wraith Face