Environments and Scope

One of a series of tutorials about Scheme in general
and the Wraith Scheme interpreter in particular.

Copyright © 2011 Jay Reynolds Freeman, all rights reserved.
Personal Web Site: http://JayReynoldsFreeman.com
EMail: Jay_Reynolds_Freeman@mac.com.

Let's try a couple of trick questions, to get you thinking. Suppose you typed the following expressions into Scheme, in the order given. I am deliberately not showing what happens, to make my point.
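The listing itself does not appear in this copy of the tutorial. Judging from the discussion that follows, it was probably close to the sketch below. The body of "list-up-two" and the order of the two definitions are assumptions, not the original text. Results are left out on purpose.

```scheme
; Reconstructed example (assumed, not the original listing).

(define x 100)               ; a top-level binding of x

(define (list-up-two x y)    ; x and y are argument names
  (list x y))

(list-up-two 1 2)            ; first trick question: which x is used?

x                            ; second trick question: 100 or 1?
```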

The difficulty here is that we have used the variable name "x" in two different places. It is an argument name in the definition of the procedure "list-up-two", and it is also a variable defined at top level, whose value is 100.

The first trick question concerns the procedure application "(list-up-two 1 2)". Does the procedure use the value of "x" that was passed as its first argument, namely "1"? Or does it use the value of "x" that was created by the top-level definition, namely "100"?

The second trick question is: what is returned by the very last line in the example, where "x" is typed in at top level? Is it the value "100" that was bound to "x" at top level? Or is it the chronologically more recent binding of the value "1" that took place during the procedure application?

I hope you said that the procedure uses "1", because I have already described many times how the arguments of procedures are bound to variable names and passed to the procedure. Perhaps you also suspected -- though I have not said so yet -- that the result of typing "x" at top level was indeed "100", notwithstanding the more recent binding of "1" within the procedure application, and that is indeed the case. Let's try it to make sure.
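Here is a sketch of that session with the results written as comments. The definitions are an assumed reconstruction of the example (the body of "list-up-two" is a guess), and the way results are printed will differ from one Scheme implementation to another.

```scheme
; Assumed reconstruction of the example, with results shown.

(define x 100)

(define (list-up-two x y)
  (list x y))

(list-up-two 1 2)   ; → (1 2)   the argument binding of x is used
x                   ; → 100     the top-level binding is untouched
```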

The point is not that these things happen, or that it makes sense that they happen (and I think it does), but that Scheme had to specify that they happen. There are in fact two different bindings of the variable "x" in the preceding example, and there must be rules for determining which binding is used where. The rules in this case are pretty simple, but the example just given was pretty simple as well. More complicated situations exist, and we will see them later.

The idea that a variable may have different values depending on where it is used isn't really very surprising. After all, in everyday life there are plenty of examples of how a question can have different correct answers depending on the context in which it is asked. Thus if I ask "Is it raining?", I may get a different answer from someone who just came into my house than if I am talking on the telephone to a person in another city, and what is more, those answers will both change as time goes on. Scheme frequently has to ask itself what value is bound to a particular variable right now, and that is a question, just like "Is it raining?" So it is completely normal that the answer may depend on the time and on the context in which the question is asked.

Let me introduce a couple of words that Scheme uses for discussing these matters. I am not going to be completely precise about their use and meaning yet. I merely want you to have them in mind as the tutorials progress.

I used the word "context", informally, in describing how a question might sometimes have different answers. The formal technical term that Scheme programmers use when addressing the fact that variables may sometimes have different values is not "context" but "environment". We say that variables may have different values bound to them in different environments.

In the example just given, there were two environments. One was for just the variables used in the body of procedure "list-up-two", and the other was for everything defined at top level.

A Scheme environment is a moderately complex thing in its own right, but for purposes of figuring out what value is associated with a variable we can think of it as a little dictionary, or perhaps a list of key-value pairs. If I hand you an environment and ask "What value is bound to 'q' in this environment?", you just go through the dictionary looking for "q", and tell me what is written down next to it. It is that simple.
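The dictionary picture can be tried out with an ordinary association list. This is only a toy model for building intuition, not how any particular Scheme stores its environments; the name "toy-environment" and the values in it are made up for illustration.

```scheme
; A toy environment: an association list of (name . value) pairs.
(define toy-environment '((p . 10) (q . 42) (r . 7)))

; Find the pair whose key is q, then read off the value next to it.
(cdr (assq 'q toy-environment))   ; → 42
```

The standard procedure "assq" returns the whole matching pair, so "cdr" is needed to pick out the value.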

Well, not quite. There is actually one problem. What if the environment I just gave you does not contain a binding of "q"? What then?

(Incidentally, the opposite problem is never going to occur: No valid Scheme environment is ever going to contain more than one binding of any given variable name. Such a thing would be a serious bug in the implementation of Scheme where it happened.)

The problem may be insoluble. Perhaps there is no definition of "q" anywhere in the whole Scheme system. In that case, the best we can hope for is that Scheme will print an informative error message and then forget the whole thing so that we can go on and do something else.

Yet perhaps there is some other environment to which we might turn, if we cannot find what we are looking for in the first environment. An example might help. Look at the following procedure.
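The procedure listing is also missing from this copy. Based on the description that follows, it was presumably something like this sketch (an assumption, not the original text):

```scheme
; Assumed reconstruction: x and y are arguments, z is not bound here.
(define (list-up-two-and-z x y)
  (list x y z))
```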

This procedure makes a list of the values of x, y, and z. The first two are the variable names for its arguments, but what about z? No binding of "z" is created in the environment for the body of "list-up-two-and-z". If we try to use this procedure in a procedure application, we will get an error message:
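A sketch of the failing call follows, using an assumed definition of the procedure. The error text is not reproduced, because its wording differs from one Scheme implementation to another.

```scheme
; Assumed definition; there is no binding of z anywhere yet.
(define (list-up-two-and-z x y)
  (list x y z))

; Applying the procedure signals an error, because z is unbound.
; The exact message depends on the implementation.
(list-up-two-and-z 1 2)
```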

Here is one possible fix. We create a binding for z in the top-level environment.

Now let's rerun the procedure: There is no error message, and the top-level value of "z" shows up in the result.
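As a sketch of the fix and the rerun, with an assumed definition of the procedure and an arbitrarily chosen value of 3 for "z":

```scheme
; Assumed definition of the procedure from the example.
(define (list-up-two-and-z x y)
  (list x y z))

; The fix: bind z at top level. The value 3 is an arbitrary choice.
(define z 3)

(list-up-two-and-z 1 2)   ; → (1 2 3)
```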

During the procedure application, Scheme looked for a binding of z in the environment associated with the body of "list-up-two-and-z", and did not find one. It then looked in the top-level environment, and did find a binding of z there, so it used that value in the list it was building. Scheme has a set of rules for how to look up variable bindings in a sequence of environments, starting with the environment that is as local as possible to the place where the value was needed, and moving progressively through environments that are less and less local. I won't go into all those rules quite yet, but bear in mind that they are there, and remember that the value of a variable depends on which environment you look it up in: Many different environments may exist at the same time, possibly containing different values all bound to the same variable.
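The search through a sequence of environments can be imitated with a list of association lists, most local first. This is a toy sketch of the idea only, not Scheme's actual mechanism or its full rules. The names "lookup", "local-env", and "top-env" are hypothetical, and the sketch assumes your Scheme provides an "error" procedure, as most do.

```scheme
; Toy model: each environment is an association list, and envs is a
; list of environments ordered from most local to least local.
(define (lookup name envs)
  (cond ((null? envs) (error "Unbound variable:" name))
        ((assq name (car envs)) => cdr)     ; found: return the value
        (else (lookup name (cdr envs)))))   ; try the next environment

(define local-env '((x . 1) (y . 2)))       ; the procedure body
(define top-env   '((x . 100) (z . 3)))     ; top level

(lookup 'x (list local-env top-env))   ; → 1, the local binding wins
(lookup 'z (list local-env top-env))   ; → 3, found at top level
```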

We need another new word -- scope -- to talk about the places where particular variable bindings are supposed to be used. The domain in which a variable binding is valid is called its scope. Thus in the examples we have been working with, the scope of every variable name used for a procedure argument has been the body of the procedure in which it occurs. Sometimes two different procedures happen to use the same variable names for their arguments: In the last two examples, "list-up-two" and "list-up-two-and-z" both used "x" (and also both used "y"). In that case there were two different bindings of "x", each with its own scope, and there were two different environments, one for each procedure body, each of which listed its own variable bindings.

One thing that makes environments particularly useful is that it is common for many bindings to have the same scope: for example, all the argument variable names of a procedure. Scheme would be a lot more complicated if no two variables ever had the same scope.

The scope of a variable binding made in the top-level environment is the entire top-level environment after the time the binding is created; that is, after the "define" statement that created it. Some bindings are built into Scheme itself, like the one that associates the symbol "+" with the procedure that does addition. Those bindings have already been defined when you start Scheme running.

One interesting feature of Scheme is that you can always tell what the scope of a variable binding is by looking at the source code where it is created and used. Running a procedure can change the value that is bound to a variable name, but it cannot move the key-value pair that represents the entire binding from one environment to another. Languages in which you can figure out the scope of a variable binding just by looking at the source code are said to have lexical scope. If running procedures did move variable bindings around, it would make it even more difficult to understand what a program did on the basis of source code. You would have to know what procedures had run, and in what order, and what they all did. It is already plenty difficult to understand computer source code just by reading it, so lexical scope is a valuable aid to simplicity.
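A small sketch may make lexical scope concrete. The procedure names "show-x" and "call-with-local-x" are made up for illustration.

```scheme
(define x 100)

; The x in this body is not an argument, so by reading the source we
; can tell that it refers to the top-level binding.
(define (show-x) x)

; This procedure has its own argument named x, but that binding is
; only in scope inside this body, not inside show-x.
(define (call-with-local-x x)
  (show-x))

(call-with-local-x 1)   ; → 100
```

The fact that "show-x" happened to be called from a place where "x" was bound to 1 makes no difference to which binding it uses.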

Thus if you use "set!" to change the value of a variable named "x", Scheme will use the look-up mechanism I mentioned to find out which binding to use, and will change only that one binding. The changed binding will still be part of the same environment it was in before: The variable has a new value in that environment, and all subsequent uses of it in the same environment will get the new value. All Scheme has done is change what is written down in a particular one of those dictionaries that represent environments; the other dictionaries are still the same.
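Here is a short sketch of "set!" changing one binding while leaving another alone. The procedure name "bump-local" is hypothetical.

```scheme
(define x 100)

; set! here changes the argument binding of x, which is the nearest
; one, and leaves the top-level binding alone.
(define (bump-local x)
  (set! x (+ x 1))
  x)

(bump-local 1)   ; → 2
x                ; → 100, the top-level binding is unchanged

; At top level, set! changes the top-level binding.
(set! x 200)
x                ; → 200
```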

You will hear a great deal more about environments and scope in subsequent tutorials. For the moment, just try to be aware of them in any programs that you write.

-- Jay Reynolds Freeman (Jay_Reynolds_Freeman@mac.com)
