This chapter describes those of Guile's simple data types which are primarily used for their role as items of generic data. By simple we mean data types that are not primarily used as containers to hold other data -- i.e. pairs, lists, vectors and so on. For the documentation of such compound data types, see section Compound Data Types.
One of the great strengths of Scheme is that there is no straightforward distinction between "data" and "functionality". For example, Guile's support for dynamic linking could be described
The contents of this chapter are, therefore, a matter of judgment. By generic, we mean to select those data types whose typical use as data in a wide variety of programming contexts is more important than their use in the implementation of a particular piece of functionality. The last section of this chapter provides references for all the data types that are documented not here but in a "functionality-centric" way elsewhere in the manual.
The two boolean values are #t for true and #f for false.
Boolean values are returned by predicate procedures, such as the general
equality predicates eq?, eqv? and equal?
(see section Equality) and numerical and string comparison operators like
string=? (see section String Comparison) and <=
(see section Comparison Predicates).
(<= 3 8) => #t
(<= 3 -3) => #f
(equal? "house" "houses") => #f
(eq? #f #f) => #t
In test condition contexts like if and cond (see section Simple Conditional Evaluation), where a group of subexpressions will be evaluated only if a
condition expression evaluates to "true", "true" means any
value at all except #f.
(if #t "yes" "no") => "yes"
(if 0 "yes" "no") => "yes"
(if #f "yes" "no") => "no"
A result of this asymmetry is that typical Scheme source code more often
uses #f explicitly than #t: #f is necessary to
represent an if or cond false value, whereas #t is
not necessary to represent an if or cond true value.
It is important to note that #f is not equivalent to any
other Scheme value. In particular, #f is not the same as the
number 0 (like in C and C++), and not the same as the "empty list"
(like in some Lisp dialects).
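The rule can be checked directly. The following is a minimal sketch; the helper truthy? is a name introduced here for illustration and is not part of Guile.

```scheme
;; truthy? (hypothetical helper) reports how a value behaves
;; when it is used as the condition of an if or cond.
(define (truthy? v)
  (if v #t #f))

(truthy? #f)   ; => #f, the only false value
(truthy? 0)    ; => #t, unlike C and C++
(truthy? '())  ; => #t, unlike some Lisp dialects
(truthy? "")   ; => #t
```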
The not procedure returns the boolean inverse of its argument:
#t iff x is #f, else return #f.
The boolean? procedure is a predicate that returns #t if
its argument is one of the boolean values, otherwise #f.
#t iff obj is either #t or #f.
Guile supports a rich "tower" of numerical types -- integer, rational, real and complex -- and provides an extensive set of mathematical and scientific functions for operating on numerical data. This section of the manual documents those types and functions.
You may also find it illuminating to read R5RS's presentation of numbers in Scheme, which is particularly clear and accessible: see section Numerical data types.
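Scheme's numerical "tower" consists of the following categories of numbers:
integers (whole numbers)
rationals (numbers that can be written as fractions P/Q, where P and Q are integers)
real numbers (all possible points along a one-dimensional line)
complex numbers (all possible points in a two-dimensional space)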
It is called a tower because each category "sits on" the one that follows it, in the sense that every integer is also a rational, every rational is also real, and every real number is also a complex number (but with zero imaginary part).
Of these, Guile implements integers, reals and complex numbers as distinct types. Rationals are supported only to the extent that Guile accepts the read syntax for rational numbers specified by R5RS; Guile immediately converts such a number to the corresponding real number.
The number? predicate may be applied to any Scheme value to
discover whether the value is any of the supported numerical types.
#t if obj is any kind of number, #f else.
For example:
(number? 3) => #t
(number? "hello there!") => #f
(define pi 3.141592654)
(number? pi) => #t
The next few subsections document each of Guile's numerical data types in detail.
Integers are whole numbers, that is numbers with no fractional part, such as 2, 83 and -3789.
Integers in Guile can be arbitrarily big, as shown by the following example.
(define (factorial n)
  (let loop ((n n) (product 1))
    (if (= n 0)
        product
        (loop (- n 1) (* product n)))))
(factorial 3)
=>
6
(factorial 20)
=>
2432902008176640000
(- (factorial 45))
=>
-119622220865480194561963161495657715064383733760000000000
Readers whose background is in programming languages where integers are limited by the need to fit into just 4 or 8 bytes of memory may find this surprising, or suspect that Guile's representation of integers is inefficient. In fact, Guile achieves a near optimal balance of convenience and efficiency by using the host computer's native representation of integers where possible, and a more general representation where the required number does not fit in the native form. Conversion between these two representations is automatic and completely invisible to the Scheme level programmer.
#t if x is an integer number, #f else.
(integer? 487) => #t
(integer? -3.4) => #f
Mathematically, the real numbers are the set of numbers that describe all possible points along a continuous, infinite, one-dimensional line. The rational numbers are the set of all numbers that can be written as fractions P/Q, where P and Q are integers. All rational numbers are also real, but there are real numbers that are not rational, for example the square root of 2, and pi.
Guile represents both real and rational numbers approximately using a floating point encoding with limited precision. Even though the actual encoding is in binary, it may be helpful to think of it as a decimal number with a limited number of significant figures and a decimal point somewhere, since this corresponds to the standard notation for non-whole numbers. For example:
0.34 -0.00000142857931198 -5648394822220000000000.0 4.0
The limited precision of Guile's encoding means that any "real" number
in Guile can be written in a rational form, by multiplying and then dividing
by sufficient powers of 10 (or in fact, 2). For example,
-0.00000142857931198 is the same as -142857931198 divided by
100000000000000000. In Guile's current incarnation, therefore,
the rational? and real? predicates are equivalent.
Another aspect of this equivalence is that Guile currently does not preserve the exactness that is possible with rational arithmetic. If such exactness is needed, it is of course possible to implement exact rational arithmetic at the Scheme level using Guile's arbitrary size integers.
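As an illustration of the last point, here is a minimal sketch of exact rational arithmetic built on exact integers. The representation (a pair of numerator and denominator) and the names make-rat, rat+ and rat* are assumptions made for this example; they are not procedures provided by Guile.

```scheme
;; An exact rational is represented as a pair (numerator . denominator),
;; reduced to lowest terms and with a positive denominator.
(define (make-rat n d)
  (if (= d 0)
      (error "make-rat: zero denominator"))
  (let ((g (gcd n d))              ; gcd is never negative
        (sign (if (< d 0) -1 1)))  ; move any minus sign to the numerator
    (cons (* sign (quotient n g))
          (* sign (quotient d g)))))

;; a/b + c/d = (a*d + c*b) / (b*d)
(define (rat+ x y)
  (make-rat (+ (* (car x) (cdr y)) (* (car y) (cdr x)))
            (* (cdr x) (cdr y))))

;; a/b * c/d = (a*c) / (b*d)
(define (rat* x y)
  (make-rat (* (car x) (car y))
            (* (cdr x) (cdr y))))

(rat+ (make-rat 1 3) (make-rat 1 6))   ; => (1 . 2)
(rat* (make-rat 2 3) (make-rat 3 4))   ; => (1 . 2)
(make-rat 6 -8)                        ; => (-3 . 4)
```

Because every intermediate value is an exact integer, no precision is lost however large the numerators and denominators become.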
A planned future revision of Guile's numerical tower will make it possible to implement exact representations and arithmetic for both rational numbers and real irrational numbers such as square roots, and in such a way that the new kinds of number integrate seamlessly with those that are already implemented.
#t if obj is a real number, #f else.
Note that the sets of integer and rational values form subsets
of the set of real numbers, so the predicate will also be fulfilled
if obj is an integer number or a rational number.
#t if x is a rational number, #f
otherwise. Note that the set of integer values forms a subset of
the set of rational numbers, i.e. the predicate will also be
fulfilled if x is an integer number. Real numbers
will also satisfy this predicate, because of their limited
precision.
Complex numbers are the set of numbers that describe all possible points in a two-dimensional space. The two coordinates of a particular point in this space are known as the real and imaginary parts of the complex number that describes that point.
In Guile, complex numbers are written in rectangular form as the sum of
their real and imaginary parts, using the symbol i to indicate
the imaginary part.
3+4i => 3.0+4.0i
(* 3-8i 2.3+0.3i) => 9.3-17.5i
Guile represents a complex number as a pair of numbers both of which are real, so the real and imaginary parts of a complex number have the same properties of inexactness and limited precision as single real numbers.
#t if x is a complex number, #f
otherwise. Note that the sets of real, rational and integer
values form subsets of the set of complex numbers, i.e. the
predicate will also be fulfilled if x is a real,
rational or integer number.
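The subset relationships described for the type predicates can be observed directly. The following short sketch uses only the standard predicates; the particular values are arbitrary.

```scheme
;; An integer satisfies every predicate in the tower.
(integer? 7)    ; => #t
(rational? 7)   ; => #t
(real? 7)       ; => #t
(complex? 7)    ; => #t

;; A non-whole real is real and complex, but not an integer.
(integer? 2.5)  ; => #f
(real? 2.5)     ; => #t
(complex? 2.5)  ; => #t

;; A number with a non-zero imaginary part is complex only.
(real? 1+2i)    ; => #f
(complex? 1+2i) ; => #t
(number? 1+2i)  ; => #t
```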
R5RS requires that a calculation involving inexact numbers always
produces an inexact result. To meet this requirement, Guile
distinguishes between an exact integer value such as 5 and the
corresponding inexact real value which, to the limited precision
available, has no fractional part, and is printed as 5.0. Guile
will only convert the latter value to the former when forced to do so by
an invocation of the inexact->exact procedure.
#t if x is an exact number, #f
otherwise.
#t if x is an inexact number, #f
else.
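The distinction between 5 and 5.0 can be sketched as follows. The example uses only standard R5RS procedures; the values are arbitrary.

```scheme
(exact? 5)             ; => #t
(inexact? 5.0)         ; => #t

;; One inexact operand makes the whole result inexact.
(exact? (* 2 5.0))     ; => #f

;; Conversion in either direction must be requested explicitly.
(inexact->exact 5.0)   ; => 5
(exact->inexact 5)     ; => 5.0

;; The two values are numerically equal but are not the same object.
(= 5 5.0)              ; => #t
(eqv? 5 5.0)           ; => #f
```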
The read syntax for integers is a string of digits, optionally preceded by a minus or plus character, a code indicating the base in which the integer is encoded, and a code indicating whether the number is exact or inexact. The supported base codes are:
#b, #B -- the integer is written in binary (base 2)
#o, #O -- the integer is written in octal (base 8)
#d, #D -- the integer is written in decimal (base 10)
#x, #X -- the integer is written in hexadecimal (base 16).
If the base code is omitted, the integer is assumed to be decimal. The following examples show how these base codes are used.
-13 => -13
#d-13 => -13
#x-13 => -19
#b+1101 => 13
#o377 => 255
The codes for indicating exactness (which can, incidentally, be applied to all numerical values) are:
#e, #E -- the number is exact
#i, #I -- the number is inexact.
If the exactness indicator is omitted, the integer is assumed to be exact,
since Guile's internal representation for integers is always exact.
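A brief sketch of the exactness codes, on their own and combined with a base code. It assumes, as R5RS specifies, that the two kinds of prefix may appear in either order.

```scheme
#i5             ; => 5.0   inexact version of the integer 5
#e5.0           ; => 5     exact version of 5.0
#x#i10          ; => 16.0  hexadecimal 10, read as inexact
#i#b101         ; => 5.0   binary 101, read as inexact
(exact? #x1F)   ; => #t    no exactness code, so the integer is exact
```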
Real numbers have limited precision similar to the precision of the
double type in C. A consequence of the limited precision is that
all real numbers in Guile are also rational, since any number R with a
limited number of decimal places, say N, can be made into an integer by
multiplying by 10^N.
#t if n is an odd number, #f
otherwise.
#t if n is an even number, #f
otherwise.
(remainder 13 4) => 1
(remainder -13 4) => -1
(modulo 13 4) => 1
(modulo -13 4) => 3
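The examples above differ only when the operands have different signs: the result of remainder takes the sign of the dividend, while the result of modulo takes the sign of the divisor. The sketch below extends the examples to a negative divisor; division-identity? is a hypothetical helper introduced only to state the relation between quotient and remainder.

```scheme
(remainder 13 -4)   ; => 1    sign follows the dividend 13
(modulo 13 -4)      ; => -3   sign follows the divisor -4

;; quotient and remainder always satisfy n = d * q + r.
(define (division-identity? n d)
  (= n (+ (* d (quotient n d))
          (remainder n d))))

(division-identity? -13 4)   ; => #t
(division-identity? 13 -4)   ; => #t
```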
#t if all parameters are numerically equal.
#t if the list of parameters is monotonically
increasing.
#t if the list of parameters is monotonically
decreasing.
#t if the list of parameters is monotonically
non-decreasing.
#t if the list of parameters is monotonically
non-increasing.
#t if z is an exact or inexact number equal to
zero.
#t if x is an exact or inexact number greater than
zero.
#t if x is an exact or inexact number less than
zero.
string->number returns #f.
abs for real arguments, but also allows complex numbers.
x must be a number with zero imaginary part. To calculate the
magnitude of a complex number, use magnitude instead.
For the truncate and round procedures, the Guile library
exports equivalent C functions, but taking and returning arguments of
type double rather than the usual SCM.
For floor and ceiling, the equivalent C functions are
floor and ceil from the standard mathematics library
(which also take and return double arguments).
The following procedures accept any kind of number as arguments, including complex numbers.
Many of Guile's numeric procedures which accept any kind of number as arguments, including complex numbers, are implemented as Scheme procedures that use the following real number-based primitives. These primitives signal an error if they are called with complex arguments.
For the hyperbolic arc-functions, the Guile library exports C functions
corresponding to these Scheme procedures, but taking and returning
arguments of type double rather than the usual SCM.
For all the other Scheme procedures above, except expt and
atan2 (whose entries specifically mention an equivalent C
function), the equivalent C functions are those provided by the standard
mathematics library. The mapping is as follows.
$abs fabs
$sqrt sqrt
$sin sin
$cos cos
$tan tan
$asin asin
$acos acos
$atan atan
$exp exp
$log log
$sinh sinh
$cosh cosh
$tanh tanh
Naturally, these C functions expect and return double arguments.
(logand) => -1
(logand 7) => 7
(logand #b111 #b011 #b001) => 1

(logior) => 0
(logior 7) => 7
(logior #b000 #b001 #b011) => 3

(logxor) => 0
(logxor 7) => 7
(logxor #b000 #b001 #b011) => 2
(logxor #b000 #b001 #b011 #b011) => 1

(number->string (lognot #b10000000) 2) => "-10000001"
(number->string (lognot #b0) 2) => "-1"

(logtest j k) == (not (zero? (logand j k)))
(logtest #b0100 #b1011) => #f
(logtest #b0100 #b0111) => #t

(logbit? index j) == (logtest (integer-expt 2 index) j)
(logbit? 0 #b1101) => #t
(logbit? 1 #b1101) => #f
(logbit? 2 #b1101) => #t
(logbit? 3 #b1101) => #t
(logbit? 4 #b1101) => #f
Formally, the function returns an integer equivalent to
(inexact->exact (floor (* n (expt 2 cnt)))).
(number->string (ash #b1 3) 2) => "1000"
(number->string (ash #b1010 -1) 2) => "101"
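The formal definition given above can be compared with ash itself. In this sketch, ash-by-definition is a hypothetical name for a direct transcription of that formula; it is only meant for small illustrative values.

```scheme
;; Direct transcription of the formula: floor(n * 2^cnt), as an exact integer.
(define (ash-by-definition n cnt)
  (inexact->exact (floor (* n (expt 2 cnt)))))

(ash 5 3)                    ; => 40
(ash-by-definition 5 3)      ; => 40

;; A negative count shifts right; flooring rounds toward minus infinity.
(ash -5 -1)                  ; => -3
(ash-by-definition -5 -1)    ; => -3
```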
(logcount #b10101010) => 4
(logcount 0) => 0
(logcount -2) => 1

(integer-length #b10101010) => 8
(integer-length 0) => 0
(integer-length #b1111) => 4

(integer-expt 2 5) => 32
(integer-expt -3 3) => -27

(number->string (bit-extract #b1101101010 0 4) 2) => "1010"
(number->string (bit-extract #b1101101010 4 9) 2) => "10110"
Accepts a positive integer or real n and returns a number of the same type between zero (inclusive) and n (exclusive). The values returned have a uniform distribution.
The optional argument state must be of the type produced
by seed->random-state. It defaults to the value of the
variable *random-state*. This object is used to maintain
the state of the pseudo-random-number generator and is altered
as a side effect of the random operation.
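A minimal sketch of using an explicit state object rather than the default *random-state*. The seed 42 and the variable names are arbitrary choices made for this example.

```scheme
;; A private generator state, so that this code does not disturb
;; (and is not disturbed by) other users of *random-state*.
(define state (seed->random-state 42))

(define n (random 10 state))    ; exact integer, 0 <= n < 10
(define x (random 1.0 state))   ; inexact real, 0 <= x < 1

(and (exact? n) (integer? n) (<= 0 n) (< n 10))   ; => #t
(and (inexact? x) (<= 0 x) (< x 1))               ; => #t

;; Two states built from the same seed start the same sequence.
(= (random 1000 (seed->random-state 42))
   (random 1000 (seed->random-state 42)))         ; => #t
```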
(+ m (* d (random:normal))).
Most of the characters in the ASCII character set may be referred to by
name: for example, #\tab, #\esc, #\stx, and so on.
The following table describes the ASCII names for each character.
 0 = #\nul | 1 = #\soh | 2 = #\stx | 3 = #\etx
 4 = #\eot | 5 = #\enq | 6 = #\ack | 7 = #\bel
 8 = #\bs | 9 = #\ht | 10 = #\nl | 11 = #\vt
12 = #\np | 13 = #\cr | 14 = #\so | 15 = #\si
16 = #\dle | 17 = #\dc1 | 18 = #\dc2 | 19 = #\dc3
20 = #\dc4 | 21 = #\nak | 22 = #\syn | 23 = #\etb
24 = #\can | 25 = #\em | 26 = #\sub | 27 = #\esc
28 = #\fs | 29 = #\gs | 30 = #\rs | 31 = #\us
32 = #\sp
The delete character (octal 177) may be referred to with the name
#\del.
Several characters have more than one name:
#\space, #\sp
#\newline, #\nl
#\tab, #\ht
#\backspace, #\bs
#\return, #\cr
#\page, #\np
#\null, #\nul
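The correspondence between names and character codes can be checked with char->integer. This is a short sketch that assumes the alternative names listed above are accepted by the reader of the Guile version in use.

```scheme
(char->integer #\nul)       ; => 0
(char->integer #\newline)   ; => 10
(char->integer #\space)     ; => 32
(char->integer #\del)       ; => 127

;; Alternative names denote the same character.
(eqv? #\tab #\ht)           ; => #t
(eqv? #\return #\cr)        ; => #t
(eqv? #\space #\sp)         ; => #t
```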
#t iff x is less than or equal to y in the
ASCII sequence, else #f.
#t iff x is greater than or equal to y in the
ASCII sequence, else #f.
#t iff x is less than y in the ASCII sequence
ignoring case, else #f.
#t iff x is less than or equal to y in the
ASCII sequence ignoring case, else #f.
#t iff x is greater than y in the ASCII
sequence ignoring case, else #f.
#t iff x is greater than or equal to y in the
ASCII sequence ignoring case, else #f.
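A few illustrative comparisons follow. They use the standard character ordering predicates; the characters are arbitrary.

```scheme
(char<=? #\a #\b)      ; => #t
(char>=? #\a #\a)      ; => #t

;; Upper-case letters precede lower-case letters in ASCII
;; (#\B is 66, #\a is 97), so case matters here ...
(char<? #\a #\B)       ; => #f

;; ... but not for the case-insensitive variants.
(char-ci<? #\a #\B)    ; => #t
(char-ci>=? #\Z #\z)   ; => #t
```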
#t iff chr is alphabetic, else #f.
Alphabetic means the same thing as the isalpha C library function.
#t iff chr is numeric, else #f.
Numeric means the same thing as the isdigit C library function.
#t iff chr is whitespace, else #f.
Whitespace means the same thing as the isspace C library function.
#t iff chr is uppercase, else #f.
Uppercase means the same thing as the isupper C library function.
#t iff chr is lowercase, else #f.
Lowercase means the same thing as the islower C library function.
#t iff chr is either uppercase or lowercase, else #f.
Uppercase and lowercase are as defined by the isupper and islower
C library functions.
Strings are fixed-length sequences of characters. They can be created by calling constructor procedures, but they can also be entered literally at the REPL or in Scheme source files.
Guile provides a rich set of string processing procedures, because text handling is very important when Guile is used as a scripting language.
Strings always carry the information about how many characters they are
composed of with them, so there is no special end-of-string character,
like in C. That means that Scheme strings can contain any character,
even the NUL character '\0'. But note: Since most operating
system calls dealing with strings (such as for file operations) expect
strings to be zero-terminated, they might do unexpected things when
called with strings containing unusual characters.
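The following sketch shows that a NUL character is an ordinary element of a Scheme string and does not end it. The variable name s is arbitrary.

```scheme
;; A three-character string whose middle character is NUL.
(define s (string #\a #\nul #\b))

(string-length s)                   ; => 3
(string-ref s 1)                    ; => #\nul
(char->integer (string-ref s 1))    ; => 0
(string-ref s 2)                    ; => #\b
```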
The read syntax for strings is an arbitrarily long sequence of
characters enclosed in double quotes ("). If you want to insert a
double quote character into a string literal, it must be prefixed
with a backslash \ character (called an escape character).
The following are examples of string literals:
"foo"
"bar plonk"
"Hello World"
"\"Hi\", he said."
The following procedures can be used to check whether a given string fulfills some specified property.
#t if str's length is zero, and
#f otherwise.
(string-null? "") => #t
y => "foo"
(string-null? y) => #f
The string constructor procedures create new string objects, possibly initializing them with some specified character data.
When processing strings, it is often convenient to first convert them
into a list representation by using the procedure string->list,
work with the resulting list, and then convert it back into a string.
These procedures are useful for similar tasks.
string->list and
list->string are inverses as far as `equal?' is
concerned.
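The convert, process, convert back pattern described above might look like this. The procedure name string-reverse-via-list is hypothetical and chosen only for this example.

```scheme
;; Reverse a string by going through its list-of-characters form.
(define (string-reverse-via-list str)
  (list->string (reverse (string->list str))))

(string->list "abc")                  ; => (#\a #\b #\c)
(string-reverse-via-list "hello")     ; => "olleh"

;; The round trip yields an equal? (though not eq?) string.
(equal? "hello" (list->string (string->list "hello")))   ; => #t
```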
(string-split "root:x:0:0:root:/root:/bin/bash" #\:)
=>
("root" "x" "0" "0" "root" "/root" "/bin/bash")
(string-split "::" #\:)
=>
("" "" "")
(string-split "" #\:)
=>
("")
Portions of strings can be extracted by these procedures.
string-ref delivers individual characters whereas
substring can be used to extract substrings from longer strings.
0 <= start <= end <= (string-length str).
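A short sketch of both selectors. Indices are zero-based, and the end index passed to substring is exclusive; the string used is arbitrary.

```scheme
(define str "hello world")

(string-ref str 4)      ; => #\o
(substring str 0 5)     ; => "hello"
(substring str 6 11)    ; => "world"

;; start = end is allowed and gives the empty string.
(substring str 5 5)     ; => ""
```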
These procedures are for modifying strings in-place. This means that the result of the operation is not a new string; instead, the original string's memory representation is modified.
(define y "abcdefg")
(substring-fill! y 1 3 #\r)
y => "arrdefg"
The procedures in this section are similar to the character ordering
predicates (see section Characters), but are defined on character sequences.
They all return #t on success and #f on failure. The
predicates ending in -ci ignore the character case when comparing
strings.
#t if the two
strings are the same length and contain the same characters in
the same positions, otherwise return #f.
The procedure string-ci=? treats upper and lower case
letters as though they were the same character, but
string=? treats upper and lower case as distinct
characters.
#t if s1
is lexicographically less than s2.
#t if s1
is lexicographically less than or equal to s2.
#t if s1
is lexicographically greater than s2.
#t if s1
is lexicographically greater than or equal to s2.
#t if
the two strings are the same length and their component
characters match (ignoring case) at each position; otherwise
return #f.
#t if s1 is lexicographically less than s2
regardless of case.
#t if s1 is lexicographically less than or equal
to s2 regardless of case.
#t if s1 is lexicographically greater than
s2 regardless of case.
#t if s1 is lexicographically greater than or
equal to s2 regardless of case.
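Some illustrative comparisons, contrasting the case-sensitive predicates with their -ci counterparts. The strings are arbitrary.

```scheme
(string=? "Guile" "Guile")       ; => #t
(string=? "Guile" "GUILE")       ; => #f
(string-ci=? "Guile" "GUILE")    ; => #t

(string<? "apple" "banana")      ; => #t
(string>=? "abc" "abc")          ; => #t

;; #\Z (90) sorts before #\a (97) unless case is ignored.
(string<? "Zebra" "apple")       ; => #t
(string-ci<? "Zebra" "apple")    ; => #f
```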
When searching for the index of a character in a string, these procedures can be used.
index or
strchr functions from the C library.
(string-index "weiner" #\e) => 1
(string-index "weiner" #\e 2) => 4
(string-index "weiner" #\e 2 4) => #f
string-index, but search from the right of the
string rather than from the left. This procedure essentially
implements the rindex or strrchr functions from
the C library.
(string-rindex "weiner" #\e) => 4
(string-rindex "weiner" #\e 2 4) => #f
(string-rindex "weiner" #\e 2 5) => 4
These are procedures for mapping strings to their upper- or lower-case equivalents, respectively, or for capitalizing strings.
y => "arrdefg"
(string-upcase! y) => "ARRDEFG"
y => "ARRDEFG"

y => "ARRDEFG"
(string-downcase! y) => "arrdefg"
y => "arrdefg"

y => "hello world"
(string-capitalize! y) => "Hello World"
y => "Hello World"
The procedure string-append appends several strings together to
form a longer result string.
A regular expression (or regexp) is a pattern that describes a whole class of strings. A full description of regular expressions and their syntax is beyond the scope of this manual; an introduction can be found in the Emacs manual (see section `Syntax of Regular Expressions' in The GNU Emacs Manual), or in many general Unix reference books.
If your system does not include a POSIX regular expression library, and
you have not linked Guile with a third-party regexp library such as Rx,
these functions will not be available. You can tell whether your Guile
installation includes regular expression support by checking whether the
*features* list includes the regex symbol.
By default, Guile supports POSIX extended regular expressions. That means that the characters `(', `)', `+' and `?' are special, and must be escaped if you wish to match the literal characters.
This regular expression interface was modeled after that implemented by SCSH, the Scheme Shell. It is intended to be upwardly compatible with SCSH regular expressions.
string-match returns a match structure which
describes what, if anything, was matched by the regular
expression. See section Match Structures. If str does not match
pattern at all, string-match returns #f.
Each time string-match is called, it must compile its
pattern argument into a regular expression structure. This
operation is expensive, which makes string-match inefficient if
the same regular expression is used several times (for example, in a
loop). For better performance, you can compile a regular expression in
advance and then match strings against the compiled regexp.
make-regexp throws
a regular-expression-syntax error.
The flags arguments change the behavior of the compiled regular expression. The following flags may be supplied:
regexp/icase
regexp/newline
regexp/basic
regexp/extended
make-regexp includes
both regexp/basic and regexp/extended flags, the
one which comes last will override the earlier one.
str. If the optional integer start argument is
provided, begin matching from that position in the string.
Return a match structure describing the results of the match,
or #f if no match could be found.
The flags arguments change the matching behavior. The following flags may be supplied:
regexp/notbol
regexp/newline
is used). Use this when the beginning of the string should
not be considered the beginning of a line.
regexp/noteol
regexp/newline
is used). Use this when the end of the string should not be
considered the end of a line.
#t if obj is a compiled regular expression,
or #f otherwise.
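The advice about compiling a pattern once and reusing it can be sketched as follows. The names number-rx, count-numeric and icase-rx are hypothetical, and the use-modules line is an assumption: recent Guile versions provide part of the regexp interface through the (ice-9 regex) module, and loading it is harmless where the procedures are already available.

```scheme
(use-modules (ice-9 regex))

;; Compile the pattern once, outside the loop.
(define number-rx (make-regexp "^[0-9]+$"))

;; Count the strings that consist entirely of decimal digits.
(define (count-numeric strings)
  (let loop ((rest strings) (count 0))
    (cond ((null? rest) count)
          ((regexp-exec number-rx (car rest))
           (loop (cdr rest) (+ count 1)))
          (else (loop (cdr rest) count)))))

(count-numeric '("42" "x7" "2024" ""))   ; => 2

(regexp? number-rx)     ; => #t
(regexp? "^[0-9]+$")    ; => #f, a pattern string is not a compiled regexp

;; regexp/icase makes the compiled pattern ignore case.
(define icase-rx (make-regexp "guile" regexp/icase))
(if (regexp-exec icase-rx "GNU GUILE") #t #f)   ; => #t
```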
Regular expressions are commonly used to find patterns in one string and replace them with the contents of another string.
port may be #f, in which case nothing is written; instead,
regexp-substitute constructs a string from the specified
items and returns that.
regexp-substitute, but can be used to perform global
substitutions on str. Instead of taking a match structure as an
argument, regexp-substitute/global takes two string arguments: a
regexp string describing a regular expression, and a target
string which should be matched against this regular expression.
Each item behaves as in regexp-substitute, with the following exceptions:
regexp-substitute/global to recurse
on the unmatched portion of str. This must be supplied in
order to perform global search-and-replace on str; if it is not
present among the items, then regexp-substitute/global will
return after processing a single match.
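A minimal sketch of both substitution procedures, passing #f as the port so that the result is returned as a string. The patterns and target strings are arbitrary; the use-modules line is an assumption that applies to Guile versions which keep these procedures in the (ice-9 regex) module.

```scheme
(use-modules (ice-9 regex))

;; Single substitution: wrap the first run of digits in angle brackets.
;; 'pre and 'post stand for the text before and after the match,
;; and 0 stands for the whole match.
(define m (string-match "[0-9]+" "abc123def"))
(regexp-substitute #f m 'pre "<" 0 ">" 'post)
;; => "abc<123>def"

;; Global substitution: replace every run of blanks with a hyphen.
;; Here 'post makes the procedure recurse on the rest of the string.
(regexp-substitute/global #f "[ \t]+" "this   is   the text"
                          'pre "-" 'post)
;; => "this-is-the-text"
```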
A match structure is the object returned by string-match and
regexp-exec. It describes which portion of a string, if any,
matched the given regular expression. Match structures include: a
reference to the string that was checked for matches; the starting and
ending positions of the regexp match; and, if the regexp included any
parenthesized subexpressions, the starting and ending positions of each
submatch.
In each of the regexp match functions described below, the match
argument must be a match structure returned by a previous call to
string-match or regexp-exec. Most of these functions
return some information about the original target string that was
matched against a regular expression; we will call that string
target for easy reference.
#t if obj is a match structure returned by a
previous call to regexp-exec, or #f otherwise.
#f.
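The information held in a match structure can be inspected with the match: accessors. The sketch below uses an arbitrary pattern with two parenthesized subexpressions; the use-modules line is an assumption for Guile versions that keep the accessors in (ice-9 regex).

```scheme
(use-modules (ice-9 regex))

(define m (string-match "([a-z]+)@([a-z.]+)"
                        "mail bob@example.org now"))

(regexp-match? m)        ; => #t
(match:substring m)      ; => "bob@example.org"  (the whole match)
(match:substring m 1)    ; => "bob"              (first submatch)
(match:substring m 2)    ; => "example.org"      (second submatch)
(match:start m)          ; => 5
(match:end m)            ; => 20
(match:prefix m)         ; => "mail "
(match:suffix m)         ; => " now"
(match:count m)          ; => 3  (whole match plus two submatches)

;; A failed match yields #f rather than a match structure.
(string-match "[0-9]+" "no digits here")   ; => #f
```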
Sometimes you will want a regexp to match characters like `*' or `$' exactly. For example, to check whether a particular string represents a menu entry from an Info node, it would be useful to match it against a regexp like `^* [^:]*::'. However, this won't work; because the asterisk is a metacharacter, it won't match the `*' at the beginning of the string. In this case, we want to make the first asterisk un-magic.
You can do this by preceding the metacharacter with a backslash character `\'. (This is also called quoting the metacharacter, and is known as a backslash escape.) When Guile sees a backslash in a regular expression, it considers the following glyph to be an ordinary character, no matter what special meaning it would ordinarily have. Therefore, we can make the above example work by changing the regexp to `^\* [^:]*::'. The `\*' sequence tells the regular expression engine to match only a single asterisk in the target string.
Since the backslash is itself a metacharacter, you may force a regexp to match a backslash in the target string by preceding the backslash with itself. For example, to find variable references in a TeX program, you might want to find occurrences of the string `\let\' followed by any number of alphabetic characters. The regular expression `\\let\\[A-Za-z]*' would do this: the double backslashes in the regexp each match a single backslash in the target string.
Very important: Using backslash escapes in Guile source code (as in Emacs Lisp or C) can be tricky, because the backslash character has special meaning for the Guile reader. For example, if Guile encounters the character sequence `\n' in the middle of a string while processing Scheme code, it replaces those characters with a newline character. Similarly, the character sequence `\t' is replaced by a horizontal tab. Several of these escape sequences are processed by the Guile reader before your code is executed. Unrecognized escape sequences are ignored: if the characters `\*' appear in a string, they will be translated to the single character `*'.
This translation is obviously undesirable for regular expressions, since we want to be able to include backslashes in a string in order to escape regexp metacharacters. Therefore, to make sure that a backslash is preserved in a string in your Guile program, you must use two consecutive backslashes:
(define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
The string in this example is preprocessed by the Guile reader before
any code is executed. The resulting argument to make-regexp is
the string `^\* [^:]*', which is what we really want.
This also means that in order to write a regular expression that matches a single backslash character, the regular expression string in the source code must include four backslashes. Each consecutive pair of backslashes gets translated by the Guile reader to a single backslash, and the resulting double-backslash is interpreted by the regexp engine as matching a single backslash character. Hence:
(define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
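The two levels of escaping can be observed by looking at the strings the reader actually produces. This is a small sketch with arbitrary target strings; the use-modules line is an assumption for Guile versions that keep string-match in (ice-9 regex).

```scheme
(use-modules (ice-9 regex))

;; Ten characters of source text, but the reader produces the
;; nine-character string  ^\* [^:]*
(string-length "^\\* [^:]*")              ; => 9

(define m (string-match "^\\* [^:]*" "* Numbers:: numeric data"))
(match:substring m)                       ; => "* Numbers"

;; Four backslashes in the source become a two-character string,
;; which the regexp engine reads as "match one literal backslash".
(string-length "\\\\")                    ; => 2
(match:start (string-match "\\\\" "a\\b"))   ; => 1
```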
The reason for the unwieldiness of this syntax is historical. Both regular expression pattern matchers and Unix string processing systems have traditionally used backslashes with the special meanings described above. The POSIX regular expression specification and ANSI C standard both require these semantics. Attempting to abandon either convention would cause other kinds of compatibility problems, possibly more severe ones. Therefore, without extending the Scheme reader to support strings with different quoting conventions (an ungainly and confusing extension when implemented in other languages), we must adhere to this cumbersome escape syntax.
Symbols have two main uses. Crucially, they are used for denoting variables in a Scheme program. In addition, they are very useful for describing discrete literal data.
A symbol is an object with a name that consists of a string of characters. In the usual case (where the name doesn't include any characters that could be confused with other elements of Scheme syntax) a symbol can be written in a Scheme program by writing the sequence of characters that make up the symbol's name. For example, the read syntax for the symbol named "multiply-by-2" is simply
multiply-by-2
Symbols, then, look rather like strings but without any quotation marks. But there are several functional differences between them. The first big functional difference between symbols and strings concerns uniqueness. If the same-looking string is read twice from two different places in a program, the result is two distinguishable string objects whose contents just happen to be the same. If, on the other hand, the same-looking symbol is read twice from two different places in a program, the result is the same symbol object both times.
(define str1 "hello")
(define str2 "hello")
(eq? str1 str2) => #f
(define sym1 (quote hello))
(define sym2 (quote hello))
(eq? sym1 sym2) => #t
The second important difference is that symbols, unlike strings, are not
self-evaluating. An unquoted symbol is interpreted as a variable
reference, and the result of evaluating that symbol is the corresponding
variable's value. (By the way, this is why we needed the (quote
...)s in the example above: (quote hello) returns the symbol
object named "hello" itself, whereas an unquoted hello would try
to find and dereference a variable associated with that symbol.)
For example, when the expression (string-length "abcd") is read
and evaluated, the sequence of characters string-length is read
as the symbol whose name is "string-length". This symbol is associated
with a variable whose value is the procedure that implements string
length calculation. Therefore evaluation of the string-length
symbol results in that procedure.
Although the use of symbols for variable references is undoubtedly their most important role in Scheme, it is not documented further here. See instead section Definitions and Variable Bindings, for how associations between symbols and variables are created, and section Modules, for how those associations are affected by Guile's module system. The rest of this section explains how symbols can also be used to represent discrete values, and documents the procedures available that relate to symbols as data objects per se.
The read syntax for symbols is a sequence of letters, digits, and
extended alphabetic characters, beginning with a character that
cannot begin a number. In addition, the special cases of +,
-, and ... are read as symbols even though numbers can
begin with +, - or ..
Extended alphabetic characters may be used within identifiers as if they were letters. The set of extended alphabetic characters is:
! $ % & * + - . / : < = > ? @ ^ _ ~
In addition to the standard read syntax defined above (which is taken from R5RS (see section `Formal syntax' in The Revised^5 Report on Scheme)), Guile provides an extended symbol read syntax that allows the inclusion of unusual characters such as space characters, newlines and parentheses. If (for whatever reason) you need to write a symbol containing characters not mentioned above, you can do so as follows.
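Begin the symbol with the characters #{, follow this with the characters of the symbol's name, and finish with the characters }#.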
Here are a few examples of this form of read syntax. The first symbol needs to use extended syntax because it contains a space character, the second because it contains a line break, and the last because it looks like a number.
#{foo bar}#
#{what
ever}#
#{4242}#
Although Guile provides this extended read syntax for symbols, widespread usage of it is discouraged because it is not portable and not very readable.
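Symbols written with the extended syntax are ordinary symbols, as the following sketch suggests. Each one is quoted here so that it is treated as data rather than as a variable reference.

```scheme
(symbol? '#{foo bar}#)             ; => #t
(symbol->string '#{foo bar}#)      ; => "foo bar"

;; The same symbol can be obtained portably from a string.
(eq? '#{foo bar}# (string->symbol "foo bar"))   ; => #t

;; Looks like a number, but is a symbol.
(symbol? '#{4242}#)                ; => #t
(number? '#{4242}#)                ; => #f
```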
symbol? obj
Return #t if obj is a symbol, otherwise return
#f.
string->symbol string
Return the symbol whose name is string. See also symbol->string.
The following examples assume that the implementation's standard case is lower case:
(eq? 'mISSISSIppi 'mississippi) => #t
(string->symbol "mISSISSIppi") => the symbol with name "mISSISSIppi"
(eq? 'bitBlt (string->symbol "bitBlt")) => #f
(eq? 'JollyWog
     (string->symbol (symbol->string 'JollyWog))) => #t
(string=? "K. Harper, M.D."
          (symbol->string
            (string->symbol "K. Harper, M.D."))) => #t
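symbol->string symbol
Return the name of symbol as a string. If the symbol was part of an
object returned as the value of a literal expression or by a call to the
read procedure,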
and its name contains alphabetic characters, then the string
returned will contain characters in the implementation's
preferred standard case--some implementations will prefer
upper case, others lower case. If the symbol was returned by
string->symbol, the case of characters in the string
returned will be the same as the case in the string that was
passed to string->symbol. It is an error to apply
mutation procedures like string-set! to strings returned
by this procedure.
The following examples assume that the implementation's standard case is lower case:
(symbol->string 'flying-fish) => "flying-fish"
(symbol->string 'Martin) => "martin"
(symbol->string (string->symbol "Malvina")) => "Malvina"
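gensym [prefix]
Create a new symbol with a name constructed from a prefix and a counter
value. The prefix can be given as an optional string argument; the
default prefix is g. The counter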
is increased by 1 at each call. There is no provision for
resetting the counter.
Symbols are especially useful because two symbols which are spelled the
same way are equivalent in the sense of eq?. That means that
they are actually the same Scheme object. The advantage is that symbols
can be compared extremely efficiently, although they carry more
information for the human reader than, say, numbers.
It is very common in Scheme programs to use symbols as keys in association lists (see section Association Lists) or hash tables (see section Hash Tables), because symbol keys make the code more readable without causing any performance loss.
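A minimal sketch of this usage follows. The names settings and setting-ref are invented for the example; assq is the standard association-list lookup that compares keys with eq?.

```scheme
;; An association list keyed by symbols (the entries are illustrative).
(define settings '((width . 800) (height . 100) (bg . "white")))

;; assq compares keys with eq?, which is all that symbols need.
(define (setting-ref key default)
  (let ((entry (assq key settings)))
    (if entry (cdr entry) default)))

(setting-ref 'width 0)    ; => 800
(setting-ref 'depth 24)   ; => 24, because there is no depth entry
(eq? 'width 'width)       ; => #t, same spelling means same object
```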
What makes symbols useful is that they are automatically kept unique: no two distinct symbol objects have the same name. There is one exception, however. In addition to the normal symbols that have been discussed up to now, you can also create special uninterned symbols that behave slightly differently.
To understand what is different about them and why they might be useful, we look at how normal symbols are actually kept unique.
Whenever Guile wants to find the symbol with a specific name, for
example during read or when executing string->symbol, it
first looks into a table of all existing symbols to find out whether a
symbol with the given name already exists. When this is the case, Guile
just returns that symbol. When not, a new symbol with the name is
created and entered into the table so that it can be found later.
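The lookup-or-create step described above can be modelled in a few lines. This is only a toy sketch and not Guile's actual implementation: toy-table and toy-intern are hypothetical names, and a two-element list stands in for a real symbol object.

```scheme
;; Toy model of an intern table; NOT how Guile really stores symbols.
(define toy-table (make-hash-table))

(define (toy-intern name)
  (or (hash-ref toy-table name)               ; already known: reuse the object
      (let ((obj (list 'toy-symbol name)))    ; otherwise make a fresh object
        (hash-set! toy-table name obj)        ; and remember it for later lookups
        obj)))

(eq? (toy-intern "foo") (toy-intern "foo"))   ; => #t, the second call finds the first object
(eq? (toy-intern "foo") (toy-intern "bar"))   ; => #f, different names give different objects
```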
Sometimes you might want to create a symbol that is guaranteed `fresh', i.e., a symbol that did not exist previously. You might also want to somehow guarantee that no one else will ever unintentionally stumble across your symbol in the future. These properties of a symbol are often needed when generating code during macro expansion. When introducing new temporary variables, you want to guarantee that they don't conflict with variables in other people's code.
The simplest way to arrange for this is to create a new symbol and to not enter it into the global table of all symbols. That way, no one will ever get access to your symbol by chance. Symbols that are not in the table are called uninterned. Of course, symbols that are in the table are called interned.
You create new uninterned symbols with the function make-symbol.
You can test whether a symbol is interned or not with
symbol-interned?.
Uninterned symbols break the rule that the name of a symbol uniquely
identifies the symbol object. Because of this, they cannot be written
out and read back in like interned symbols. Currently, Guile has no
support for reading uninterned symbols. Note that the function
gensym does not return uninterned symbols for this reason.
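For the macro-expansion case mentioned earlier, a common pattern is to name the temporary variable with a symbol obtained from gensym. The sketch below assumes Guile's define-macro, which is not covered in this section, and my-swap! is a hypothetical macro written only for illustration.

```scheme
;; my-swap! exchanges the values of two variables.  The temporary is named
;; by a symbol obtained from gensym, so it is very unlikely to clash with
;; a variable that the caller already uses.
(define-macro (my-swap! a b)
  (let ((tmp (gensym "tmp")))
    `(let ((,tmp ,a))
       (set! ,a ,b)
       (set! ,b ,tmp))))

(define tmp 1)     ; deliberately named like the obvious choice of temporary
(define other 2)
(my-swap! tmp other)
(list tmp other)   ; => (2 1)
```

Had the macro used the literal symbol tmp for its temporary, the call above would have captured the caller's own tmp and the swap would have failed.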
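make-symbol name
Return a new uninterned symbol with the name name. The returned
symbol is guaranteed to be unique, and future calls to
string->symbol will not return it.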
symbol-interned? symbol
Return #t if symbol is interned, otherwise return
#f.
For example:
(define foo-1 (string->symbol "foo"))
(define foo-2 (string->symbol "foo"))
(define foo-3 (make-symbol "foo"))
(define foo-4 (make-symbol "foo"))
(eq? foo-1 foo-2)
=> #t ; Two interned symbols with the same name are the same object,
(eq? foo-1 foo-3)
=> #f ; but a call to make-symbol with the same name returns a
      ; distinct object.
(eq? foo-3 foo-4)
=> #f ; A call to make-symbol always returns a new object, even for
      ; the same name.
foo-3
=> #<uninterned-symbol foo 8085290>
      ; Uninterned symbols print differently from interned symbols,
(symbol? foo-3)
=> #t ; but they are still symbols,
(symbol-interned? foo-3)
=> #f ; just not interned.
Keywords are self-evaluating objects with a convenient read syntax that makes them easy to type.
Guile's keyword support conforms to R5RS, and adds a (switchable) read
syntax extension to permit keywords to begin with : as well as
#:.
Keywords are useful in contexts where a program or procedure wants to be able to accept a large number of optional arguments without making its interface unmanageable.
To illustrate this, consider a hypothetical make-window
procedure, which creates a new window on the screen for drawing into
using some graphical toolkit. There are many parameters that the caller
might like to specify, but which could also be sensibly defaulted, for
example the color depth, background color, width and height.
If make-window did not use keywords, the caller would have to
pass in a value for each possible argument, remembering the correct
argument order and using a special value to indicate the default value
for that argument:
(make-window 'default   ;; Color depth
             'default   ;; Background color
             800        ;; Width
             100        ;; Height
             ...)       ;; More make-window arguments
With keywords, on the other hand, defaulted arguments are omitted, and non-default arguments are clearly tagged by the appropriate keyword. As a result, the invocation becomes much clearer:
(make-window #:width 800 #:height 100)
On the other hand, for a simpler procedure with few arguments, the use
of keywords would be a hindrance rather than a help. The primitive
procedure cons, for example, would not be improved if it had to
be invoked as
(cons #:car x #:cdr y)
So the decision whether to use keywords or not is purely pragmatic: use them if they will clarify the procedure invocation at the point of call.
If a procedure wants to support keywords, it should take a rest argument and then use whatever means is convenient to extract keywords and their corresponding arguments from the contents of that rest argument.
The following example illustrates the principle: the code for
make-window uses a helper procedure called
get-keyword-value to extract individual keyword arguments from
the rest argument.
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))
(define (make-window . args)
  (let ((depth  (get-keyword-value args #:depth  screen-depth))
        (bg     (get-keyword-value args #:bg     "white"))
        (width  (get-keyword-value args #:width  800))
        (height (get-keyword-value args #:height 100))
        ...)
    ...))
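Because the make-window code above elides its body, here is a complete variant that can be tried at the REPL. It reuses get-keyword-value unchanged; window-settings is a hypothetical stand-in for make-window that simply returns the values it extracted.

```scheme
;; Same helper as in the text above.
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

;; Hypothetical stand-in for make-window: instead of opening a window it
;; returns the background, width and height it would have used.
(define (window-settings . args)
  (list (get-keyword-value args #:bg "white")
        (get-keyword-value args #:width 800)
        (get-keyword-value args #:height 100)))

(window-settings)                ; => ("white" 800 100)
(window-settings #:height 300)   ; => ("white" 800 300)
```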
But you don't need to write get-keyword-value. The (ice-9
optargs) module provides a set of powerful macros that you can use to
implement keyword-supporting procedures like this:
(use-modules (ice-9 optargs))
(define (make-window . args)
  (let-keywords args #f ((depth  screen-depth)
                         (bg     "white")
                         (width  800)
                         (height 100))
    ...))
Or, even more economically, like this:
(use-modules (ice-9 optargs))
(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  800)
                            (height 100))
  ...)
For further details on let-keywords, define* and other
facilities provided by the (ice-9 optargs) module, see
section Optional Arguments.
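The define* form can be tried out with a small self-contained procedure. In this sketch window-size is a hypothetical name, and the procedure only returns its arguments as a pair so that the effect of the defaults is visible.

```scheme
(use-modules (ice-9 optargs))

;; Hypothetical example procedure: returns (width . height).
(define* (window-size #:key (width 800) (height 100))
  (cons width height))

(window-size)                            ; => (800 . 100)
(window-size #:height 300)               ; => (800 . 300)
(window-size #:height 300 #:width 640)   ; => (640 . 300)
```

As the last call shows, keyword arguments may be given in any order.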
Guile, by default, only recognizes the keyword syntax specified by R5RS.
A token of the form #:NAME, where NAME has the same syntax
as a Scheme symbol (see section Extended Read Syntax for Symbols), is the external
representation of the keyword named NAME. Keyword objects print
using this syntax as well, so values containing keyword objects can be
read back into Guile. When used in an expression, keywords are
self-quoting objects.
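The self-quoting behaviour can be checked directly. In this short sketch the variable name k is arbitrary:

```scheme
;; A keyword evaluates to itself, so no quote is needed.
(define k #:width)

(keyword? k)        ; => #t
(keyword? 'width)   ; => #f, a symbol is not a keyword
(eq? k '#:width)    ; => #t, quoting makes no difference
```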
If the keyword read option is set to 'prefix, Guile also
recognizes the alternative read syntax :NAME. Otherwise, tokens
of the form :NAME are read as symbols, as required by R5RS.
To enable and disable the alternative non-R5RS keyword syntax, you use
the read-options procedure documented in section General option interface and section Reader options.
(read-set! keywords 'prefix)

#:type
=> #:type

:type
=> #:type

(read-set! keywords #f)

#:type
=> #:type

:type
-|
ERROR: In expression :type:
ERROR: Unbound variable: :type
ABORT: (unbound-variable)
The following procedures can be used for converting symbols to keywords and back.
Internally, a keyword is implemented as something like a tagged symbol,
where the tag identifies the keyword as being self-evaluating, and the
symbol, known as the keyword's dash symbol, has the same name as
the keyword name but prefixed by a single dash. For example, the
keyword #:name has the corresponding dash symbol -name.
Most keyword objects are constructed automatically by the reader when it
reads a token beginning with #:. However, if you need to
construct a keyword object programmatically, you can do so by calling
make-keyword-from-dash-symbol with the corresponding dash symbol
(as the reader does). The dash symbol for a keyword object can be
retrieved using the keyword-dash-symbol procedure.
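Guile releases differ in which conversion procedures they provide. The sketch below assumes a Guile that offers symbol->keyword and keyword->symbol, which are not documented in this section and work with the plain symbol rather than the dash symbol. It shows the same idea of constructing a keyword programmatically.

```scheme
;; Assumption: symbol->keyword and keyword->symbol are available.
(define k (symbol->keyword 'name))

(keyword? k)          ; => #t
(eq? k #:name)        ; => #t, same object the reader produces for #:name
(keyword->symbol k)   ; => name
```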
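keyword? obj
Return #t if the argument obj is a keyword, else
#f.
make-keyword-from-dash-symbol symbol
Make a keyword object from a symbol that starts with a dash.
keyword-dash-symbol keyword
Return the dash symbol for keyword. This is the inverse of
make-keyword-from-dash-symbol.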
Procedures and macros are documented in their own chapter: see section Procedures and Macros.
Variable objects are documented as part of the description of Guile's module system: see section Variables.
Asyncs, dynamic roots and fluids are described in the chapter on scheduling: see section Threads, Mutexes, Asyncs and Dynamic Roots.
Hooks are documented in the chapter on general utility functions: see section Hooks.
Ports are described in the chapter on I/O: see section Input and Output.