This chapter describes Guile's compound data types. By compound we mean that the primary purpose of these data types is to act as containers for other kinds of data (including other compound objects). For instance, a (non-uniform) vector with length 5 is a container that can hold five arbitrary Scheme objects.
The various kinds of container object differ from each other in how their memory is allocated, how they are indexed, and how particular values can be looked up within them.
Pairs are used to combine two Scheme objects into one compound object. Hence the name: A pair stores a pair of objects.
The data type pair is extremely important in Scheme, just like in any other Lisp dialect. The reason is that pairs are not only used to make two values available as one object, but that pairs are used for constructing lists of values. Because lists are so important in Scheme, they are described in a section of their own (see section Lists).
Pairs can literally get entered in source code or at the REPL, in the
so-called dotted list syntax. This syntax consists of an opening
parentheses, the first element of the pair, a dot, the second element
and a closing parentheses. The following example shows how a pair
consisting of the two numbers 1 and 2, and a pair containing the symbols
foo and bar can be entered. It is very important to write
the whitespace before and after the dot, because otherwise the Scheme
parser would not be able to figure out where to split the tokens.
(1 . 2) (foo . bar)
But beware, if you want to try out these examples, you have to quote the expressions. More information about quotation is available in the section (REFFIXME). The correct way to try these examples is as follows.
'(1 . 2) => (1 . 2) '(foo . bar) => (foo . bar)
A new pair is made by calling the procedure cons with two
arguments. Then the argument values are stored into a newly allocated
pair, and the pair is returned. The name cons stands for
"construct". Use the procedure pair? to test whether a
given Scheme object is a pair or not.
eq?) from every previously existing object.
#t if x is a pair; otherwise return
#f.
The two parts of a pair are traditionally called car and
cdr. They can be retrieved with procedures of the same name
(car and cdr), and can be modified with the procedures
set-car! and set-cdr!. Since a very common operation in
Scheme programs is to access the car of a pair, or the car of the cdr of
a pair, etc., the procedures called caar, cadr and so on
are also predefined.
car and cdr, where
for example caddr could be defined by
(define caddr (lambda (x) (car (cdr (cdr x)))))
set-car! is unspecified.
set-cdr! is unspecified.
A very important data type in Scheme--as well as in all other Lisp dialects--is the data type list.(8)
This is the short definition of what a list is:
(),
The syntax for lists is an opening parentheses, then all the elements of the list (separated by whitespace) and finally a closing parentheses.(9).
(1 2 3) ; a list of the numbers 1, 2 and 3
("foo" bar 3.1415) ; a string, a symbol and a real number
() ; the empty list
The last example needs a bit more explanation. A list with no elements, called the empty list, is special in some ways. It is used for terminating lists by storing it into the cdr of the last pair that makes up a list. An example will clear that up:
(car '(1)) => 1 (cdr '(1)) => ()
This example also shows that lists have to be quoted (REFFIXME) when written, because they would otherwise be mistakingly taken as procedure applications (see section Simple Procedure Invocation).
Often it is useful to test whether a given Scheme object is a list or not. List-processing procedures could use this information to test whether their input is valid, or they could do different things depending on the datatype of their arguments.
#t iff x is a proper list, else #f.
The predicate null? is often used in list-processing code to
tell whether a given list has run out of elements. That is, a loop
somehow deals with the elements of a list until the list satisfies
null?. Then, the algorithm terminates.
#t iff x is the empty list, else #f.
This section describes the procedures for constructing new lists.
list simply returns a list where the elements are the arguments,
cons* is similar, but the last argument is stored in the cdr of
the last pair of the list.
list.
list, but the last arg provides the tail of the
constructed list, returning (cons arg1 (cons
arg2 (cons ... argn))). Requires at least one
argument. If given one argument, that argument is returned as
result. This function is called list* in some other
Schemes and in Common LISP.
() if not given.
Note that list-copy only makes a copy of the pairs which make up
the spine of the lists. The list elements are not copied, which means
that modifying the elements of the new list also modifies the elements
of the old list. On the other hand, applying procedures like
set-cdr! or delv! to the new list will not alter the old
list. If you also need to copy the list elements (making a deep copy),
use the procedure copy-tree (see section Copying Deep Structures).
These procedures are used to get some information about a list, or to retrieve one or more elements of a list.
list-tail and list-cdr-ref are identical. It may help to
think of list-cdr-ref as accessing the kth cdr of the list,
or returning the results of cdring k times down lst.
append and append! are used to concatenate two or more
lists in order to form a new list. reverse and reverse!
return lists with the same elements as their arguments, but in reverse
order. The procedure variants with an ! directly modify the
pairs which form the list, whereas the other procedures create new
pairs. This is why you should be careful when using the side-effecting
variants.
(append '(x) '(y)) => (x y) (append '(a) '(b c d)) => (a b c d) (append '(a (b)) '((c))) => (a (b) (c))
The resulting list is always newly allocated, except that it shares structure with the last list argument. The last argument may actually be any object; an improper list results if the last argument is not a proper list.
(append '(a b) '(c . d)) => (a b c . d) (append '() 'a) => a
append (see section `Pairs and Lists' in The Revised^5 Report on Scheme). The cdr field
of each list's final pair is changed to point to the head of
the next list, so no consing is performed. Return a pointer to
the mutated list.
reverse (see section `Pairs and Lists' in The Revised^5 Report on Scheme). The cdr of each cell in lst is
modified to point to the previous list element. Return a pointer to the
head of the reversed list.
Caveat: because the list is modified in place, the tail of the original
list now becomes its head, and the head of the original list now becomes
the tail. Therefore, the lst symbol to which the head of the
original list was bound now points to the tail. To ensure that the head
of the modified list is not lost, it is wise to save the return value of
reverse!
The following procedures modify an existing list, either by changing elements of the list, or by changing the list structure itself.
eq? to item removed. This procedure mirrors
memq: delq compares elements of lst against
item with eq?.
eqv? to item removed. This procedure mirrors
memv: delv compares elements of lst against
item with eqv?.
equal? to item removed. This procedure mirrors
member: delete compares elements of lst
against item with equal?.
delq, delv
and delete: they modify the pointers in the existing lst
rather than creating a new list. Caveat evaluator: Like other
destructive list functions, these functions cannot modify the binding of
lst, and so cannot be used to delete the first element of
lst destructively.
delq!, but only deletes the first occurrence of
item from lst. Tests for equality using
eq?. See also delv1! and delete1!.
delv!, but only deletes the first occurrence of
item from lst. Tests for equality using
eqv?. See also delq1! and delete1!.
delete!, but only deletes the first occurrence of
item from lst. Tests for equality using
equal?. See also delq1! and delv1!.
The following procedures search lists for particular elements. They use
different comparison predicates for comparing list elements with the
object to be searched. When they fail, they return #f, otherwise
they return the sublist whose car is equal to the search object, where
equality depends on the equality predicate used.
eq?
to x where the sublists of lst are the non-empty
lists returned by (list-tail lst k) for
k less than the length of lst. If x does not
occur in lst, then #f (not the empty list) is
returned.
eqv?
to x where the sublists of lst are the non-empty
lists returned by (list-tail lst k) for
k less than the length of lst. If x does not
occur in lst, then #f (not the empty list) is
returned.
equal? to x where the sublists of lst are
the non-empty lists returned by (list-tail lst
k) for k less than the length of lst. If
x does not occur in lst, then #f (not the
empty list) is returned.
[FIXME: Is there any reason to have the `sloppy' functions available at high level at all? Maybe these docs should be relegated to a "Guile Internals" node or something. -twp]
memq, but does no type or error checking.
Its use is recommended only in writing Guile internals,
not for high-level Scheme programs.
memv, but does no type or error checking.
Its use is recommended only in writing Guile internals,
not for high-level Scheme programs.
member, but does no type or error checking.
Its use is recommended only in writing Guile internals,
not for high-level Scheme programs.
List processing is very convenient in Scheme because the process of iterating over the elements of a list can be highly abstracted. The procedures in this section are the most basic iterating procedures for lists. They take a procedure and one or more lists as arguments, and apply the procedure to each element of the list. They differ in their return value.
map, the order of procedure applications is not specified,
map-in-order applies the procedure from left to right to the list
elements.
map, but the procedure is always applied from left to right,
and the result(s) of the procedure applications are thrown away. The
return value is not specified.
Vectors are sequences of Scheme objects. Unlike lists, the length of a vector, once the vector is created, cannot be changed. The advantage of vectors over lists is that the time required to access one element of a vector given its position (synonymous with index), a zero-origin number, is constant, whereas lists have an access time linear to the position of the accessed element in the list.
Vectors can contain any kind of Scheme object; it is even possible to have different types of objects in the same vector. For vectors containing vectors, you may wish to use arrays, instead. Note, too, that some array procedures operate happily on vectors (see section Arrays).
Vectors can literally be entered in source code, just like strings,
characters or some of the other data types. The read syntax for vectors
is as follows: A sharp sign (#), followed by an opening
parentheses, all elements of the vector in their respective read syntax,
and finally a closing parentheses. The following are examples of the
read syntax for vectors; where the first vector only contains numbers
and the second three different object types: a string, a symbol and a
number in hexadecimal notation.
#(1 2 3)
#("Hello" foo #xdeadbeef)
#t if obj is a vector, otherwise return
#f.
list.
(vector 'a 'b 'c) => #(a b c)
(vector->list '#(dah dah didah)) => (dah dah didah) (list->vector '(dididit dah)) => #(dididit dah)
A vector created by any of the vector constructor procedures (see section Vectors) documented above can be modified using the following procedures.
NOTE: According to R5RS, using any of these procedures on literally entered vectors is an error, because these vectors are considered to be constant, although Guile currently does not detect this error.
(let ((vec (vector 0 '(2 2 2 2) "Anna")))
(vector-set! vec 1 '("Sue" "Sue"))
vec) => #(0 ("Sue" "Sue") "Anna")
vector-fill! is unspecified.
vector-move-left! copies elements in leftmost order.
Therefore, in the case where vec1 and vec2 refer to the
same vector, vector-move-left! is usually appropriate when
start1 is greater than start2.
vector-move-right! copies elements in rightmost order.
Therefore, in the case where vec1 and vec2 refer to the
same vector, vector-move-right! is usually appropriate when
start1 is less than start2.
These procedures return information about a given vector, such as the size or what elements are contained in the vector.
(vector-ref '#(1 1 2 3 5 8 13 21) 5) => 8
(vector-ref '#(1 1 2 3 5 8 13 21)
(let ((i (round (* 2 (acos -1)))))
(if (inexact? i)
(inexact->exact i)
i))) => 13
A record type is a first class object representing a user-defined data type. A record is an instance of a record type.
#t if obj is a record of any type and #f
otherwise.
Note that record? may be true of any Scheme value; there is no
promise that records are disjoint with other Scheme types.
make-record-type that created the type represented by rtd;
if the field-names argument is provided, it is an error if it
contains any duplicates or any symbols not in the default list.
make-record-type
that created the type represented by rtd.
make-record-type that created the type represented by
rtd.
record-predicate, the resulting predicate would return a true
value when passed the given record. Note that it is not necessarily the
case that the returned descriptor is the one that was passed to
record-constructor in the call that created the constructor
procedure that created the given record.
eqv? to the type-name argument given in
the call to make-record-type that created the type represented by
rtd.
equal? to the
field-names argument given in the call to make-record-type that
created the type represented by rtd.
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
A structure type is a first class user-defined data type. A structure is an instance of a structure type. A structure type is itself a structure.
Structures are less abstract and more general than traditional records. In fact, in Guile Scheme, records are implemented using structures.
A structure object consists of a handle, structure data, and a vtable. The handle is a Scheme value which points to both the vtable and the structure's data. Structure data is a dynamically allocated region of memory, private to the structure, divided up into typed fields. A vtable is another structure used to hold type-specific data. Multiple structures can share a common vtable.
Three concepts are key to understanding structures.
When a structure is created, a region of memory is allocated to hold its state. The layout of the structure's type determines how that memory is divided into fields.
Each field has a specified type. There are only three types allowed, each corresponding to a one letter code. The allowed types are:
Each field also has an associated access protection. There are only three kinds of protection, each corresponding to a one letter code. The allowed protections are:
A layout specification is described by stringing together pairs of letters: one to specify a field type and one to specify a field protection. For example, a traditional cons pair type object could be described as:
; cons pairs have two writable fields of Scheme data "pwpw"
A pair object in which the first field is held constant could be:
"prpw"
Binary fields, (fields of type "u"), hold one word each. The
size of a word is a machine dependent value defined to be equal to the
value of the C expression: sizeof (long).
The last field of a structure layout may specify a tail array. A tail array is indicated by capitalizing the field's protection code ('W', 'R' or 'O'). A tail-array field is replaced by a read-only binary data field containing an array size. The array size is determined at the time the structure is created. It is followed by a corresponding number of fields of the type specified for the tail array. For example, a conventional Scheme vector can be described as:
; A vector is an arbitrary number of writable fields holding Scheme ; values: "pW"
In the above example, field 0 contains the size of the vector and fields beginning at 1 contain the vector elements.
A kind of tagged vector (a constant tag followed by conventional vector elements) might be:
"prpW"
Structure layouts are represented by specially interned symbols whose name is a string of type and protection codes. To create a new structure layout, use this procedure:
fields must be a string made up of pairs of characters strung together. The first character of each pair describes a field type, the second a field protection. Allowed types are 'p' for GC-protected Scheme data, 'u' for unprotected binary data, and 's' for a field that points to the structure itself. Allowed protections are 'w' for mutable fields, 'r' for read-only fields, and 'o' for opaque fields. The last field protection specification may be capitalized to indicate that the field is a tail-array.
This section describes the basic procedures for creating and accessing structures.
type must be a vtable structure (see section Vtables).
tail-elts must be a non-negative integer. If the layout specification indicated by type includes a tail-array, this is the number of elements allocated to that array.
The init1, ... are optional arguments describing how successive fields of the structure should be initialized. Only fields with protection 'r' or 'w' can be initialized, except for fields of type 's', which are automatically initialized to point to the new structure itself; fields with protection 'o' can not be initialized by Scheme programs.
If fewer optional arguments than initializable fields are supplied, fields of type 'p' get default value #f while fields of type 'u' are initialized to 0.
Structs are currently the basic representation for record-like data structures in Guile. The plan is to eventually replace them with a new representation which will at the same time be easier to use and more powerful.
For more information, see the documentation for make-vtable-vtable.
#t iff x is a structure object, else
#f.
If the field is of type 'p', then it can be set to an arbitrary value.
If the field is of type 'u', then it can only be set to a non-negative integer value small enough to fit in one machine word.
Vtables are structures that are used to represent structure types. Each
vtable contains a layout specification in field
vtable-index-layout -- instances of the type are laid out
according to that specification. Vtables contain additional fields
which are used only internally to libguile. The variable
vtable-offset-user is bound to a field number. Vtable fields
at that position or greater are user definable.
#t iff x is a vtable structure.
If you have a vtable structure, V, you can create an instance of
the type it describes by using (make-struct V ...). But where
does V itself come from? One possibility is that V is an
instance of a user-defined vtable type, V', so that V is
created by using (make-struct V' ...). Another possibility is
that V is an instance of the type it itself describes. Vtable
structures of the second sort are created by this procedure:
user-fields is a string describing user defined fields of the
vtable beginning at index vtable-offset-user
(see make-struct-layout).
tail-size specifies the size of the tail-array (if any) of this vtable.
init1, ... are the optional initializers for the fields of the vtable.
Vtables have one initializable system field--the struct printer.
This field comes before the user fields in the initializers passed
to make-vtable-vtable and make-struct, and thus works as
a third optional argument to make-vtable-vtable and a fourth to
make-struct when creating vtables:
If the value is a procedure, it will be called instead of the standard printer whenever a struct described by this vtable is printed. The procedure will be called with arguments STRUCT and PORT.
The structure of a struct is described by a vtable, so the vtable is in essence the type of the struct. The vtable is itself a struct with a vtable. This could go on forever if it weren't for the vtable-vtables which are self-describing vtables, and thus terminate the chain.
There are several potential ways of using structs, but the standard one is to use three kinds of structs, together building up a type sub-system: one vtable-vtable working as the root and one or several "types", each with a set of "instances". (The vtable-vtable should be compared to the class <class> which is the class of itself.)
(define ball-root (make-vtable-vtable "pr" 0))
(define (make-ball-type ball-color)
(make-struct ball-root 0
(make-struct-layout "pw")
(lambda (ball port)
(format port "#<a ~A ball owned by ~A>"
(color ball)
(owner ball)))
ball-color))
(define (color ball) (struct-ref (struct-vtable ball) vtable-offset-user))
(define (owner ball) (struct-ref ball 0))
(define red (make-ball-type 'red))
(define green (make-ball-type 'green))
(define (make-ball type owner) (make-struct type 0 owner))
(define ball (make-ball green 'Nisse))
ball => #<a green ball owned by Nisse>
Conventional arrays are a collection of cells organized into an arbitrary number of dimensions. Each cell can hold any kind of Scheme value and can be accessed in constant time by supplying an index for each dimension. This contrasts with uniform arrays, which use memory more efficiently but can hold data of only a single type, and lists where inserting and deleting cells is more efficient, but more time is usually required to access a particular cell.
A conventional array is displayed as # followed by the rank
(number of dimensions) followed by the cells, organized into dimensions
using parentheses. The nesting depth of the parentheses is equal to
the rank.
When an array is created, the number of dimensions and range of each dimension must be specified, e.g., to create a 2x3 array with a zero-based index:
(make-array 'ho 2 3) => #2((ho ho ho) (ho ho ho))
The range of each dimension can also be given explicitly, e.g., another way to create the same array:
(make-array 'ho '(0 1) '(0 2)) => #2((ho ho ho) (ho ho ho))
A conventional array with one dimension based at zero is identical to a vector:
(make-array 'ho 3) => #(ho ho ho)
The following procedures can be used with conventional arrays (or vectors).
#t if the obj is an array, and #f if
not. The prototype argument is used with uniform arrays
and is described elsewhere.
(LOWER UPPER), where both lower
and upper are integers, possibly less than zero, and possibly the
same number (however, lower cannot be greater than upper).
(index1, index2) element in
array.
#t if its arguments would be acceptable to
array-ref.
(index1, index2) element in array to
new-value. The value returned by array-set! is unspecified.
make-shared-array can be used to create shared subarrays of other
arrays. The mapper is a function that translates coordinates in
the new array into coordinates in the old array. A mapper must be
linear, and its range must stay within the bounds of the old array, but
it can be otherwise arbitrary. A simple example:
(define fred (make-array #f 8 8)) (define freds-diagonal (make-shared-array fred (lambda (i) (list i i)) 8)) (array-set! freds-diagonal 'foo 3) (array-ref fred 3 3) => foo (define freds-center (make-shared-array fred (lambda (i j) (list (+ 3 i) (+ 3 j))) 2 2)) (array-ref freds-center 0 0) => foo
The values of dim0, dim1, ... correspond to dimensions in the array to be returned, their positions in the argument list to dimensions of array. Several dims may have the same value, in which case the returned array will have smaller rank than array.
(transpose-array '#2((a b) (c d)) 1 0) => #2((a c) (b d))
(transpose-array '#2((a b) (c d)) 0 0) => #1(a d)
(transpose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 1 0) =>
#2((a 4) (b 5) (c 6))
An enclosed array is not a general Scheme array. Its elements may not
be set using array-set!. Two references to the same element of
an enclosed array will be equal? but will not in general be
eq?. The value returned by array-prototype when given an
enclosed array is unspecified.
examples:
(enclose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1) => #<enclosed-array (#1(a d) #1(b e) #1(c f)) (#1(1 4) #1(2 5) #1(3 6))> (enclose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 0) => #<enclosed-array #2((a 1) (d 4)) #2((b 2) (e 5)) #2((c 3) (f 6))>
(array-shape (make-array 'foo '(-1 3) 5)) => ((-1 3) (0 4))
Array-dimensions is similar to array-shape but replaces
elements with a 0 minimum with one greater than the maximum. So:
(array-dimensions (make-array 'foo '(-1 3) 5)) => ((-1 3) 5)
0 is returned.
#t iff all arguments are arrays with the same shape, the
same type, and have corresponding elements which are either
equal? or array-equal?. This function differs from
equal? in that a one dimensional shared array may be
array-equal? but not equal? to a vector or uniform vector.
array-contents returns that shared array, otherwise it returns
#f. All arrays made by make-array and
make-uniform-array may be unrolled, some arrays made by
make-shared-array may not be.
If the optional argument strict is provided, a shared array will be returned only if its elements are stored internally contiguous in memory.
One can implement array-indexes as
(define (array-indexes array)
(let ((ra (apply make-array #f (array-shape array))))
(array-index-map! ra (lambda x x))
ra))
Another example:
(define (apl:index-generator n)
(let ((v (make-uniform-vector n 1)))
(array-index-map! v (lambda (i) i))
v))
Uniform arrays have elements all of the same type and occupy less storage than conventional arrays. Uniform arrays with a single zero-based dimension are also known as uniform vectors. The procedures in this section can also be used on conventional arrays, vectors, bit-vectors and strings.
When creating a uniform array, the type of data to be stored is indicated with a prototype argument. The following table lists the types available and example prototypes:
prototype type printing character #t boolean (bit-vector) b #\a char (string) a #\nul byte (integer) y 's short (integer) h 1 unsigned long (integer) u -1 signed long (integer) e 'l signed long long (integer) l 1.0 float (single precision) s 1/3 double (double precision float) i 0+i complex (double precision) c () conventional vector
Unshared uniform arrays of characters with a single zero-based dimension are identical to strings:
(make-uniform-array #\a 3) => "aaa"
Unshared uniform arrays of booleans with a single zero-based dimension are identical to section Bit Vectors.
(make-uniform-array #t 3) => #*111
Other uniform vectors are written in a form similar to that of vectors,
except that a single character from the above table is put between
# and (. For example, a uniform vector of signed
long integers is displayed in the form '#e(3 5 9).
#t if the obj is an array, and #f if not.
The prototype argument is used with uniform arrays and is described elsewhere.
make-uniform-array.
The optional arguments start and end allow a specified region of a vector (or linearized array) to be read, leaving the remainder of the vector unchanged.
uniform-array-read! returns the number of objects read.
port-or-fdes may be omitted, in which case it defaults to the value
returned by (current-input-port).
The optional arguments start and end allow a specified region of a vector (or linearized array) to be written.
The number of objects actually written is returned.
port-or-fdes may be
omitted, in which case it defaults to the value returned by
(current-output-port).
Bit vectors are a specific type of uniform array: an array of booleans with a single zero-based index.
They are displayed as a sequence of 0s and
1s prefixed by #*, e.g.,
(make-uniform-vector 8 #t #f) => #*00000000 #b(#t #f #t) => #*101
#f is returned.
#t, uve is OR'ed into
bv; If bool is #f, the inversion of uve is
AND'ed into bv.
If uve is a unsigned long integer vector all the elements of uve
must be between 0 and the length of bv. The bits
of bv corresponding to the indexes in uve are set to
bool. The return value is unspecified.
(bit-count (bit-set*! (if bool bv (bit-invert! bv)) uve #t) #t).
bv is not modified.
This chapter discusses dictionary objects: data structures that are useful for organizing and indexing large bodies of information.
A dictionary object is a data structure used to index
information in a user-defined way. In standard Scheme, the main
aggregate data types are lists and vectors. Lists are not really
indexed at all, and vectors are indexed only by number
(e.g. (vector-ref foo 5)). Often you will find it useful
to index your data on some other type; for example, in a library
catalog you might want to look up a book by the name of its
author. Dictionaries are used to help you organize information in
such a way.
An association list (or alist for short) is a list of
key-value pairs. Each pair represents a single quantity or
object; the car of the pair is a key which is used to
identify the object, and the cdr is the object's value.
A hash table also permits you to index objects with arbitrary keys, but in a way that makes looking up any one object extremely fast. A well-designed hash system makes hash table lookups almost as fast as conventional array or vector references.
Alists are popular among Lisp programmers because they use only the language's primitive operations (lists, car, cdr and the equality primitives). No changes to the language core are necessary. Therefore, with Scheme's built-in list manipulation facilities, it is very convenient to handle data stored in an association list. Also, alists are highly portable and can be easily implemented on even the most minimal Lisp systems.
However, alists are inefficient, especially for storing large quantities of data. Because we want Guile to be useful for large software systems as well as small ones, Guile provides a rich set of tools for using either association lists or hash tables.
An association list is a conventional data structure that is often used
to implement simple key-value databases. It consists of a list of
entries in which each entry is a pair. The key of each entry is
the car of the pair and the value of each entry is the
cdr.
ASSOCIATION LIST ::= '( (KEY1 . VALUE1)
(KEY2 . VALUE2)
(KEY3 . VALUE3)
...
)
Association lists are also known, for short, as alists.
The structure of an association list is just one example of the infinite
number of possible structures that can be built using pairs and lists.
As such, the keys and values in an association list can be manipulated
using the general list structure procedures cons, car,
cdr, set-car!, set-cdr! and so on. However,
because association lists are so useful, Guile also provides specific
procedures for manipulating them.
All of Guile's dedicated association list procedures, apart from
acons, come in three flavors, depending on the level of equality
that is required to decide whether an existing key in the association
list is the same as the key that the procedure call uses to identify the
required entry.
eq? to determine key
equality.
eqv? to determine
key equality.
equal? to
determine key equality.
acons is an exception because it is used to build association
lists which do not require their entries' keys to be unique.
acons adds a new entry to an association list and returns the
combined association list. The combined alist is formed by consing the
new entry onto the head of the alist specified in the acons
procedure call. So the specified alist is not modified, but its
contents become shared with the tail of the combined alist that
acons returns.
In the most common usage of acons, a variable holding the
original association list is updated with the combined alist:
(set! address-list (acons name address address-list))
In such cases, it doesn't matter that the old and new values of
address-list share some of their contents, since the old value is
usually no longer independently accessible.
Note that acons adds the specified new entry regardless of
whether the alist may already contain entries with keys that are, in
some sense, the same as that of the new entry. Thus acons is
ideal for building alists where there is no concept of key uniqueness.
(set! task-list (acons 3 "pay gas bill" '())) task-list => ((3 . "pay gas bill")) (set! task-list (acons 3 "tidy bedroom" task-list)) task-list => ((3 . "tidy bedroom") (3 . "pay gas bill"))
assq-set!, assv-set! and assoc-set! are used to add
or replace an entry in an association list where there is a
concept of key uniqueness. If the specified association list already
contains an entry whose key is the same as that specified in the
procedure call, the existing entry is replaced by the new one.
Otherwise, the new entry is consed onto the head of the old association
list to create the combined alist. In all cases, these procedures
return the combined alist.
assq-set! and friends may destructively modify the
structure of the old association list in such a way that an existing
variable is correctly updated without having to set! it to the
value returned:
address-list
=>
(("mary" . "34 Elm Road") ("james" . "16 Bow Street"))
(assoc-set! address-list "james" "1a London Road")
=>
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
address-list
=>
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
Or they may not:
(assoc-set! address-list "bob" "11 Newington Avenue")
=>
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
address-list
=>
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
The only safe way to update an association list variable when adding or
replacing an entry like this is to set! the variable to the
returned value:
(set! address-list
(assoc-set! address-list "bob" "11 Newington Avenue"))
address-list
=>
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
Because of this slight inconvenience, you may find it more convenient to use hash tables to store dictionary data. If your application will not be modifying the contents of an alist very often, this may not make much difference to you.
If you need to keep the old value of an association list in a form
independent from the list that results from modification by
acons, assq-set!, assv-set! or assoc-set!,
use list-copy to copy the old association list before modifying
it.
These functions do not attempt to verify the structure of alist, and so may cause unusual results if passed an object that is not an association list.
assq, assv and assoc take an alist and a key as
arguments and return the entry for that key if an entry exists, or
#f if there is no entry for that key. Note that, in the cases
where an entry exists, these procedures return the complete entry, that
is (KEY . VALUE), not just the value.
assq compares keys with eq?, assv
uses eqv? and assoc uses equal?. If key
cannot be found in alist (according to whichever equality
predicate is in use), then return #f. These functions
return the entire alist entry found (i.e. both the key and the value).
assq-ref, assv-ref and assoc-ref, on the other
hand, take an alist and a key and return just the value for that
key, if an entry exists. If there is no entry for the specified key,
these procedures return #f.
This creates an ambiguity: if the return value is #f, it means
either that there is no entry with the specified key, or that there
is an entry for the specified key, with value #f.
Consequently, assq-ref and friends should only be used where it
is known that an entry exists, or where the ambiguity doesn't matter
for some other reason.
assq, assv and assoc, except that only the
value associated with key in alist is returned. These
functions are equivalent to
(let ((ent (associator key alist))) (and ent (cdr ent)))
where associator is one of assq, assv or assoc.
To remove the element from an association list whose key matches a
specified key, use assq-remove!, assv-remove! or
assoc-remove! (depending, as usual, on the level of equality
required between the key that you specify and the keys in the
association list).
As with assq-set! and friends, the specified alist may or may not
be modified destructively, and the only safe way to update a variable
containing the alist is to set! it to the value that
assq-remove! and friends return.
address-list
=>
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
(set! address-list (assoc-remove! address-list "mary"))
address-list
=>
(("bob" . "11 Newington Avenue") ("james" . "1a London Road"))
Note that, when assq/v/oc-remove! is used to modify an
association list that has been constructed only using the corresponding
assq/v/oc-set!, there can be at most one matching entry in the
alist, so the question of multiple entries being removed in one go does
not arise. If assq/v/oc-remove! is applied to an association
list that has been constructed using acons, or an
assq/v/oc-set! with a different level of equality, or any mixture
of these, it removes only the first matching entry from the alist, even
if the alist might contain further matching entries. For example:
(define address-list '())
(set! address-list (assq-set! address-list "mary" "11 Elm Street"))
(set! address-list (assq-set! address-list "mary" "57 Pine Drive"))
address-list
=>
(("mary" . "57 Pine Drive") ("mary" . "11 Elm Street"))
(set! address-list (assoc-remove! address-list "mary"))
address-list
=>
(("mary" . "11 Elm Street"))
In this example, the two instances of the string "mary" are not the same
when compared using eq?, so the two assq-set! calls add
two distinct entries to address-list. When compared using
equal?, both "mary"s in address-list are the same as the
"mary" in the assoc-remove! call, but assoc-remove! stops
after removing the first matching entry that it finds, and so one of the
"mary" entries is left in place.
sloppy-assq, sloppy-assv and sloppy-assoc behave
like the corresponding non-sloppy- procedures, except that they
return #f when the specified association list is not well-formed,
where the non-sloppy- versions would signal an error.
Specifically, there are two conditions for which the non-sloppy-
procedures signal an error, which the sloppy- procedures handle
instead by returning #f. Firstly, if the specified alist as a
whole is not a proper list:
(assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
=>
ERROR: In procedure assoc in expression (assoc "mary" (quote #)):
ERROR: Wrong type argument in position 2 (expecting NULLP): "open sesame"
ABORT: (wrong-type-arg)
(sloppy-assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
=>
#f
Secondly, if one of the entries in the specified alist is not a pair:
(assoc 2 '((1 . 1) 2 (3 . 9))) => ERROR: In procedure assoc in expression (assoc 2 (quote #)): ERROR: Wrong type argument in position 2 (expecting CONSP): 2 ABORT: (wrong-type-arg) (sloppy-assoc 2 '((1 . 1) 2 (3 . 9))) => #f
Unless you are explicitly working with badly formed association lists,
it is much safer to use the non-sloppy- procedures, because they
help to highlight coding and data errors that the sloppy-
versions would silently cover up.
assq but does not do any error checking.
Recommended only for use in Guile internals.
assv but does not do any error checking.
Recommended only for use in Guile internals.
assoc but does not do any error checking.
Recommended only for use in Guile internals.
Here is a longer example of how alists may be used in practice.
(define capitals '(("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Miami")))
;; What's the capital of Oregon?
(assoc "Oregon" capitals) => ("Oregon" . "Salem")
(assoc-ref capitals "Oregon") => "Salem"
;; We left out South Dakota.
(set! capitals
(assoc-set! capitals "South Dakota" "Pierre"))
capitals
=> (("South Dakota" . "Pierre")
("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Miami"))
;; And we got Florida wrong.
(set! capitals
(assoc-set! capitals "Florida" "Tallahassee"))
capitals
=> (("South Dakota" . "Pierre")
("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Tallahassee"))
;; After Oregon secedes, we can remove it.
(set! capitals
(assoc-remove! capitals "Oregon"))
capitals
=> (("South Dakota" . "Pierre")
("New York" . "Albany")
("Florida" . "Tallahassee"))
Hash tables are dictionaries which offer similar functionality as association lists: They provide a mapping from keys to values. The difference is that association lists need time linear in the size of elements when searching for entries, whereas hash tables can normally search in constant time. The drawback is that hash tables require a little bit more memory, and that you can not use the normal list procedures (see section Lists) for working with them.
For demonstration purposes, this section gives a few usage examples of some hash table procedures, together with some explanation what they do.
First we start by creating a new hash table with 31 slots, and populate it with two key/value pairs.
(define h (make-hash-table 31)) (hashq-create-handle! h 'foo "bar") => (foo . "bar") (hashq-create-handle! h 'braz "zonk") => (braz . "zonk") (hashq-create-handle! h 'frob #f) => (frob . #f)
You can get the value for a given key with the procedure
hashq-ref, but the problem with this procedure is that you
cannot reliably determine whether a key does exists in the table. The
reason is that the procedure returns #f if the key is not in
the table, but it will return the same value if the key is in the
table and just happens to have the value #f, as you can see in
the following examples.
(hashq-ref h 'foo) => "bar" (hashq-ref h 'frob) => #f (hashq-ref h 'not-there) => #f
Better is to use the procedure hashq-get-handle, which makes a
distinction between the two cases. Just like assq, this
procedure returns a key/value-pair on success, and #f if the
key is not found.
(hashq-get-handle h 'foo) => (foo . "bar") (hashq-get-handle h 'not-there) => #f
There is no procedure for calculating the number of key/value-pairs in
a hash table, but hash-fold can be used for doing exactly that.
(hash-fold (lambda (key value seed) (+ 1 seed)) 0 h) => 3
Like the association list functions, the hash table functions come
in several varieties: hashq, hashv, and hash.
The hashq functions use eq? to determine whether two
keys match. The hashv functions use eqv?, and the
hash functions use equal?.
In each of the functions that follow, the table argument must be a vector. The key and value arguments may be any Scheme object.
#f if no default argument
is supplied). Uses eq? for equality testing.
#f if no default argument
is supplied). Uses eqv? for equality testing.
#f if no default argument
is supplied). Uses equal? for equality testing.
eq? for equality testing.
eqv? for equality testing.
equal? for equality
testing.
eq? for equality tests.
eqv? for equality tests.
equal? for equality tests.
The standard hash table functions may be too limited for some applications. For example, you may want a hash table to store strings in a case-insensitive manner, so that references to keys named "foobar", "FOOBAR" and "FooBaR" will all yield the same item. Guile provides you with extended hash tables that permit you to specify a hash function and associator function of your choosing. The functions described in the rest of this section can be used to implement such custom hash table structures.
If you are unfamiliar with the inner workings of hash tables, then this facility will probably be a little too abstract for you to use comfortably. If you are interested in learning more, see an introductory textbook on data structures or algorithms for an explanation of how hash tables are implemented.
eq? is
used as the equality predicate. The function returns an
integer in the range 0 to size - 1. Note that
hashq may use internal addresses. Thus two calls to
hashq where the keys are eq? are not guaranteed to
deliver the same value if the key object gets garbage collected
in between. This can happen, for example with symbols:
(hashq 'foo n) (gc) (hashq 'foo n) may produce two
different values, since foo will be garbage collected.
eqv? is
used as the equality predicate. The function returns an
integer in the range 0 to size - 1. Note that
(hashv key) may use internal addresses. Thus two calls
to hashv where the keys are eqv? are not guaranteed to
deliver the same value if the key object gets garbage collected
in between. This can happen, for example with symbols:
(hashv 'foo n) (gc) (hashv 'foo n) may produce two
different values, since foo will be garbage collected.
equal?
is used as the equality predicate. The function returns an
integer in the range 0 to size - 1.
ref
function, but uses hash as a hash function and
assoc to compare keys. hash must be a function
that takes two arguments, a key to be hashed and a table size.
assoc must be an associator function, like assoc,
assq or assv.
By way of illustration, hashq-ref table key is
equivalent to hashx-ref hashq assq table key.
set!
function, but uses hash as a hash function and
assoc to compare keys. hash must be a function
that takes two arguments, a key to be hashed and a table size.
assoc must be an associator function, like assoc,
assq or assv.
By way of illustration, hashq-set! table key is
equivalent to hashx-set! hashq assq table key.
(key . value) pair from the
hash table table. If table does not hold an
associated value for key, #f is returned.
Uses eq? for equality testing.
(key . value) pair from the
hash table table. If table does not hold an
associated value for key, #f is returned.
Uses eqv? for equality testing.
(key . value) pair from the
hash table table. If table does not hold an
associated value for key, #f is returned.
Uses equal? for equality testing.
-get-handle function, but uses hash as a hash
function and assoc to compare keys. hash must be
a function that takes two arguments, a key to be hashed and a
table size. assoc must be an associator function, like
assoc, assq or assv.
-create-handle function, but uses hash as a hash
function and assoc to compare keys. hash must be
a function that takes two arguments, a key to be hashed and a
table size. assoc must be an associator function, like
assoc, assq or assv.
(hash-fold acons '() tab) will convert a hash
table into an a-list of key-value pairs.
Go to the first, previous, next, last section, table of contents.