One of a series of tutorials about Scheme in general
and the Wraith Scheme interpreter in particular.
Copyright © 2011
Jay Reynolds Freeman,
all rights reserved.
Personal Web Site: http://JayReynoldsFreeman.com
You might think that when you want to use a character in Scheme, you can just type it out the way you do in a word processor. For example, if you want to use the "a" character, why can't you just type the letter by itself?
a
Try it and see. If you enter "a" that way and then press "return", you will get an error message:
a
a
Problem: No lexically visible binding of this symbol.
(Resetting)
Top-level loop ...
The problem is that Scheme doesn't know you meant to type a character; it thought you wanted the value of a variable named "a", but when it tried to look up that value, it found that you hadn't even defined a variable named "a" in the first place.
Scheme needs a way to distinguish variable names that are one character long from characters. The standard that the designers of Scheme settled on was to prefix the character itself with "#\" -- a sharp character followed by a backslash character -- like this:
#\a
If you type that in and press "return", you will find that characters evaluate to themselves, so that there is no need to quote them.
#\a ;; ==> #\a
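If you want to check that "#\a" really is a character and that a bare "a" is not, the standard predicate "char?" will tell you. This is only a small sketch; the symbol is quoted so that Scheme does not try to look up a variable named "a".

```scheme
;; char? is the standard type predicate for characters.
(char? #\a)     ;; ==> #t
(char? 'a)      ;; ==> #f   ;; a quoted symbol, not a character
(char? '#\a)    ;; ==> #t   ;; quoting a character is allowed, just unnecessary
```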
There are a few characters for which the sharp-backslash syntax doesn't work very well: It might be confusing to type in a "newline" character that way, or a blank space, or a tab. So there is another way to type in those characters.
#\newline ;; ==> #\newline
#\space ;; ==> #\space
Those alternatives are Scheme standards. Wraith Scheme adds a few more, as enhancements.
#\return ;; ==> #\return
#\tab ;; ==> #\tab
(Incidentally, the distinction between "return" and "newline" dates from the days of typewriters and teletype machines that had a roller and a carriage to move paper around. The "return" operation made the carriage slide horizontally from wherever it was to the beginning of the same line, and the "newline" operation made the roller move to bring up a fresh line on the paper that did not have any typing on it already. Mechanical typewriters had a single lever that did both of those things, but there was usually some way to make the two operations happen independently.)
By the way, you actually can enter the "space" and "tab" characters by typing the literal characters. The problem is that when you look back at your work, it is hard to tell what you have done.
#\  ;; ==> #\space ;; I typed a space after the "\"
#\  ;; ==> #\tab ;; I typed a tab after the "\"
You don't need fancy procedures to create characters. Mostly you just type them out. Furthermore, there is no issue of modifying them or of retrieving their content.
Well, actually, there is one issue of content. The way characters work on computers in general is that what the computer understands to be a character is just a number. The software that does the calligraphy -- that actually draws the character -- uses the number to tell what character to draw. Wraith Scheme uses the character numbers that are associated with the ASCII character set. "ASCII" stands for "American Standard Code for Information Interchange", and you can read all about it on the Internet if you so choose. The important thing to realize is that in every implementation of Scheme there is a number for each character. You can use procedures "char->integer" and "integer->char" to translate back and forth between characters and their corresponding numbers.
(char->integer #\A) ;; ==> 65
(char->integer #\a) ;; ==> 97
(integer->char 98) ;; ==> #\b
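Since the two procedures are inverses of each other, you can go from a character to its number, do arithmetic, and come back. The sketch below assumes ASCII numbering, as Wraith Scheme uses; the specific numbers could differ in a Scheme built on another character set.

```scheme
;; Converting a character to its number and back gives the original character.
(integer->char (char->integer #\q))          ;; ==> #\q

;; Adding one to the number gives the next character in the set.
;; This relies on ASCII numbering, where #\a is 97 and #\b is 98.
(integer->char (+ (char->integer #\a) 1))    ;; ==> #\b

;; Named characters have numbers too; in ASCII the space character is 32.
(char->integer #\space)                      ;; ==> 32
```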
I don't think you need to remember the numbers for the whole ASCII character set, or even part of it. Yet it may be useful to know a little about how those numbers are ordered.
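As an illustration of the kind of fact that is worth having in mind, here are the ASCII numbers at the two ends of each run of digits and letters. These values are properties of ASCII, which is an assumption that holds for Wraith Scheme; the R5 report by itself promises less than this.

```scheme
;; In ASCII the digits come first, then the upper-case letters,
;; then the lower-case letters, each as one unbroken ascending run.
(char->integer #\0)    ;; ==> 48
(char->integer #\9)    ;; ==> 57
(char->integer #\A)    ;; ==> 65
(char->integer #\Z)    ;; ==> 90
(char->integer #\a)    ;; ==> 97
(char->integer #\z)    ;; ==> 122

;; A lower-case letter is always 32 more than its upper-case twin.
(- (char->integer #\a) (char->integer #\A))    ;; ==> 32
```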
The reason why you might need to know a little about the underlying numbers is that those numbers are used for comparison of characters. There are ten predicates to compare characters. Five of them are case-independent; specifically, any argument that is a letter is converted to lower case before the comparison is made. Each case-independent predicate has "ci" (for "case-independent") somewhere near the end of its name.
Each character-comparison predicate takes exactly two arguments, both characters. This restriction on the number of arguments differs from the predicates that compare numbers, all of which can take two or more arguments. The R5 report gives Scheme implementations a choice of whether or not to let the character-comparison predicates take more than two arguments; I chose to allow only two in Wraith Scheme. The character-comparison predicates are
char=? char<? char>? char<=? char>=?
char-ci=? char-ci<? char-ci>? char-ci<=? char-ci>=?
Here are some examples of their use.
(char=? #\a #\A) ;; ==> #f
(char-ci=? #\a #\A) ;; ==> #t
(char<=? #\[ #\a) ;; ==> #t
(char<=? #\[ #\A) ;; ==> #f
(char-ci<=? #\[ #\a) ;; ==> #t
(char-ci<=? #\[ #\A) ;; ==> #t
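Because these predicates take only two arguments in Wraith Scheme, comparing a longer run of characters means chaining two-argument calls. Here is one way that might be sketched. The procedure "chars-ascending?" is a hypothetical helper written for this example; it is not part of Wraith Scheme or of the R5 report.

```scheme
;; chars-ascending? is a hypothetical helper, not a standard procedure.
;; It returns #t when every character in the list is char<? the one after it,
;; using only two-argument calls to char<?.
(define (chars-ascending? chars)
  (cond ((null? chars) #t)                  ;; an empty list is trivially in order
        ((null? (cdr chars)) #t)            ;; so is a list of one character
        ((char<? (car chars) (cadr chars))  ;; compare the first two ...
         (chars-ascending? (cdr chars)))    ;; ... then check the rest
        (else #f)))

(chars-ascending? (list #\a #\b #\c))    ;; ==> #t
(chars-ascending? (list #\a #\c #\b))    ;; ==> #f
(chars-ascending? (list #\A #\a))        ;; ==> #t   ;; 65 < 97 in ASCII
```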
Several predicates let you ask for information about characters.
(char-alphabetic? #\a) ;; ==> #t
(char-alphabetic? #\space) ;; ==> #f
(char-numeric? #\a) ;; ==> #f
(char-numeric? #\tab) ;; ==> #f
(char-numeric? #\0) ;; ==> #t
(char-whitespace? #\a) ;; ==> #f
(char-whitespace? #\space) ;; ==> #t
(char-upper-case? #\C) ;; ==> #t
(char-upper-case? #\c) ;; ==> #f
(char-lower-case? #\C) ;; ==> #f
(char-lower-case? #\c) ;; ==> #t
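These predicates combine naturally when you want to sort characters into categories. The following is a minimal sketch; "char-kind" and the symbols it returns are hypothetical names made up for this example, not anything Scheme provides.

```scheme
;; char-kind is a hypothetical helper, not a standard procedure.
;; It returns a symbol naming the broad category of the character c.
(define (char-kind c)
  (cond ((char-alphabetic? c)
         ;; Only ask about case once we know c is a letter.
         (if (char-upper-case? c) 'upper-case-letter 'lower-case-letter))
        ((char-numeric? c)    'digit)
        ((char-whitespace? c) 'whitespace)
        (else                 'other)))

(eq? (char-kind #\C) 'upper-case-letter)    ;; ==> #t
(eq? (char-kind #\c) 'lower-case-letter)    ;; ==> #t
(eq? (char-kind #\7) 'digit)                ;; ==> #t
(eq? (char-kind #\space) 'whitespace)       ;; ==> #t
(eq? (char-kind #\[) 'other)                ;; ==> #t
```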
Finally, two procedures allow you to obtain upper- and lower-case versions of characters that are letters. They return their arguments unchanged when the arguments do not represent letters of the alphabet.
(char-upcase #\a) ;; ==> #\A
(char-upcase #\A) ;; ==> #\A
(char-upcase #\space) ;; ==> #\space
(char-downcase #\a) ;; ==> #\a
(char-downcase #\A) ;; ==> #\a
(char-downcase #\space) ;; ==> #\space
Remember, Wraith Scheme's character data type is just a thin veneer over the numbers used to represent ASCII characters.
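To see how thin that veneer is, you can imitate "char-upcase" with nothing but number arithmetic. The procedure "ascii-upcase" below is a hypothetical stand-in written for illustration, and it works only under the assumption of ASCII numbering; I am not suggesting that Wraith Scheme implements "char-upcase" this way.

```scheme
;; ascii-upcase is a hypothetical stand-in for char-upcase, for illustration only.
;; It assumes ASCII numbering, where each lower-case letter is 32 more than
;; the corresponding upper-case letter.
(define (ascii-upcase c)
  (if (and (char>=? c #\a) (char<=? c #\z))       ;; is c a lower-case letter?
      (integer->char (- (char->integer c) 32))    ;; shift down to upper case
      c))                                         ;; otherwise leave c alone

(ascii-upcase #\a)        ;; ==> #\A
(ascii-upcase #\A)        ;; ==> #\A
(ascii-upcase #\space)    ;; ==> #\space
(char=? (ascii-upcase #\q) (char-upcase #\q))    ;; ==> #t
```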
-- Jay Reynolds Freeman (Jay_Reynolds_Freeman@mac.com)