Go to the first, previous, next, last section, table of contents.


Memory Management and Garbage Collection

Garbage Collection

Scheme Procedure: gc
C Function: scm_gc ()
Scans all of SCM objects and reclaims for further use those that are no longer accessible. You normally don't need to call this function explicitely. It is called automatically when appropriate.

Scheme Procedure: gc-stats
C Function: scm_gc_stats ()
Return an association list of statistics about Guile's current use of storage.

Memory Blocks

In C programs, dynamic management of memory blocks is normally done with the functions malloc, realloc, and free. Guile has additional functions for dynamic memory allocation that are integrated into the garbage collector and the error reporting system.

Memory blocks that are associated with Scheme objects (for example a smob) should be allocated and freed with scm_gc_malloc and scm_gc_free. The function scm_gc_malloc will either return a valid pointer or signal an error. It will also assume that the new memory can be freed by a garbage collection. The garbage collector uses this information to decide when to try to actually collect some garbage. Memory blocks allocated with scm_gc_malloc must be freed with scm_gc_free.

For memory that is not associated with a Scheme object, you can use scm_malloc instead of malloc. Like scm_gc_malloc, it will either return a valid pointer or signal an error. However, it will not assume that the new memory block can be freed by a garbage collection. The memory can be freed with free.

There is also scm_gc_realloc and scm_realloc, to be used in place of realloc when appropriate.

For really specialized needs, take at look at scm_gc_register_collectable_memory and scm_gc_unregister_collectable_memory.

C Function: void *scm_malloc (size_t size)
Allocate size bytes of memory and return a pointer to it. When size is 0, return NULL. When not enough memory is available, signal an error. This function runs the GC to free up some memory when it deems it appropriate.

The memory is allocated by the libc malloc function and can be freed with free. There is no scm_free function to go with scm_malloc to make it easier to pass memory back and forth between different modules.

C Function: void *scm_realloc (void *mem, size_t new_size)
Change the size of the memory block at mem to new_size and return its new location. When new_size is 0, this is the same as calling free on mem and NULL is returned. When mem is NULL, this function behaves like scm_malloc and allocates a new block of size new_size.

When not enough memory is available, signal an error. This function runs the GC to free up some memory when it deems it appropriate.

C Function: void scm_gc_register_collectable_memory (void *mem, size_t size, const char *what)
Informs the GC that the memory at mem of size size can potentially be freed during a GC. That is, announce that mem is part of a GC controlled object and when the GC happens to free that object, size bytes will be freed along with it. The GC will not free the memory itself, it will just know that so-and-so much bytes of memory are associated with GC controlled objects and the memory system figures this into its decisions when to run a GC.

mem does not need to come from scm_malloc. You can only call this function once for every memory block.

The what argument is used for statistical purposes. It should describe the type of object that the memory will be used for so that users can identify just what strange objects are eating up their memory.

C Function: void scm_gc_unregister_collectable_memory (void *mem, size_t size)
Informs the GC that the memory at mem of size size is no longer associated with a GC controlled object. You must take care to match up every call to scm_gc_register_collectable_memory with a call to scm_gc_unregister_collectable_memory. If you don't do this, the GC might have a wrong impression of what is going on and run much less efficiently than it could.

C Function: void *scm_gc_malloc (size_t size, const char *what)
C Function: void *scm_gc_realloc (void *mem, size_t old_size, size_t new_size, const char *what);
Like scm_malloc or scm_realloc, but also call scm_gc_register_collectable_memory. Note that you need to pass the old size of a reallocated memory block as well. See below for a motivation.

C Function: void scm_gc_free (void *mem, size_t size, const char *what)
Like free, but also call scm_gc_unregister_collectable_memory.

Note that you need to explicitely pass the size parameter. This is done since it should normally be easy to provide this parameter (for memory that is associated with GC controlled objects) and this frees us from tracking this value in the GC itself, which will keep the memory management overhead very low.

Upgrading from scm_must_malloc et al

Version 1.6 of Guile and earlier did not have the functions from the previous section. In their place, it had the functions scm_must_malloc, scm_must_realloc and scm_must_free. This section explains why we want you to stop using them, and how to do this.

The functions scm_must_malloc and scm_must_realloc behaved like scm_gc_malloc and scm_gc_realloc do now, respectively. They would inform the GC about the newly allocated memory via the internal equivalent of scm_gc_register_collectable_memory. However, scm_must_free did not unregister the memory it was about to free. The usual way to unregister memory was to return its size from a smob free function.

This disconnectedness of the actual freeing of memory and reporting this to the GC proved to be bad in practice. It was easy to make mistakes and report the wrong size because allocating and freeing was not done with symmetric code, and because it is cumbersome to compute the total size of nested data structures that were freed with multiple calls to scm_must_free. Additionally, there was no equivalent to scm_malloc, and it was tempting to just use scm_must_malloc and never to tell the GC that the memory has been freed.

The effect was that the internal statistics kept by the GC drifted out of sync with reality and could even overflow in long running programs. When this happened, the result was a dramatic increase in (senseless) GC activity which would effectively stop the program dead.

The functions scm_done_malloc and scm_done_free were introduced to help restore balance to the force, but existing bugs did not magically disappear, of course.

Therefore we decided to force everybody to review their code by deprecating the existing functions and introducing new ones in their place that are hopefully easier to use correctly.

For every use of scm_must_malloc you need to decide whether to use scm_malloc or scm_gc_malloc in its place. When the memory block is not part of a smob or some other Scheme object whose lifetime is ultimately managed by the garbage collector, use scm_malloc and free. When it is part of a smob, use scm_gc_malloc and change the smob free function to use scm_gc_free instead of scm_must_free or free and make it return zero.

The important thing is to always pair scm_malloc with free; and to always pair scm_gc_malloc with scm_gc_free.

The same reasoning applies to scm_must_realloc and scm_realloc versus scm_gc_realloc.

Weak References

[FIXME: This chapter is based on Mikael Djurfeldt's answer to a question by Michael Livshin. Any mistakes are not theirs, of course. ]

Weak references let you attach bookkeeping information to data so that the additional information automatically disappears when the original data is no longer in use and gets garbage collected. In a weak key hash, the hash entry for that key disappears as soon as the key is no longer referenced from anywhere else. For weak value hashes, the same happens as soon as the value is no longer in use. Entries in a doubly weak hash disappear when either the key or the value are not used anywhere else anymore.

Object properties offer the same kind of functionality as weak key hashes in many situations. (see section Object Properties)

Here's an example (a little bit strained perhaps, but one of the examples is actually used in Guile):

Assume that you're implementing a debugging system where you want to associate information about filename and position of source code expressions with the expressions themselves.

Hashtables can be used for that, but if you use ordinary hash tables it will be impossible for the scheme interpreter to "forget" old source when, for example, a file is reloaded.

To implement the mapping from source code expressions to positional information it is necessary to use weak-key tables since we don't want the expressions to be remembered just because they are in our table.

To implement a mapping from source file line numbers to source code expressions you would use a weak-value table.

To implement a mapping from source code expressions to the procedures they constitute a doubly-weak table has to be used.

Weak key hashes

Scheme Procedure: make-weak-key-hash-table size
Scheme Procedure: make-weak-value-hash-table size
Scheme Procedure: make-doubly-weak-hash-table size
C Function: scm_make_weak_key_hash_table (size)
C Function: scm_make_weak_value_hash_table (size)
C Function: scm_make_doubly_weak_hash_table (size)
Return a weak hash table with size buckets. As with any hash table, choosing a good size for the table requires some caution.

You can modify weak hash tables in exactly the same way you would modify regular hash tables. (see section Hash Tables)

Scheme Procedure: weak-key-hash-table? obj
Scheme Procedure: weak-value-hash-table? obj
Scheme Procedure: doubly-weak-hash-table? obj
C Function: scm_weak_key_hash_table_p (obj)
C Function: scm_weak_value_hash_table_p (obj)
C Function: scm_doubly_weak_hash_table_p (obj)
Return #t if obj is the specified weak hash table. Note that a doubly weak hash table is neither a weak key nor a weak value hash table.

Scheme Procedure: make-weak-value-hash-table k

Scheme Procedure: weak-value-hash-table? x

Scheme Procedure: make-doubly-weak-hash-table k

Scheme Procedure: doubly-weak-hash-table? x

Weak vectors

Weak vectors are mainly useful in Guile's implementation of weak hash tables.

Scheme Procedure: make-weak-vector size [fill]
C Function: scm_make_weak_vector (size, fill)
Return a weak vector with size elements. If the optional argument fill is given, all entries in the vector will be set to fill. The default value for fill is the empty list.

Scheme Procedure: weak-vector . l
Scheme Procedure: list->weak-vector l
C Function: scm_weak_vector (l)
Construct a weak vector from a list: weak-vector uses the list of its arguments while list->weak-vector uses its only argument l (a list) to construct a weak vector the same way list->vector would.

Scheme Procedure: weak-vector? obj
C Function: scm_weak_vector_p (obj)
Return #t if obj is a weak vector. Note that all weak hashes are also weak vectors.

Guardians

Scheme Procedure: make-guardian [greedy?]
C Function: scm_make_guardian (greedy_p)
Create a new guardian. A guardian protects a set of objects from garbage collection, allowing a program to apply cleanup or other actions.

make-guardian returns a procedure representing the guardian. Calling the guardian procedure with an argument adds the argument to the guardian's set of protected objects. Calling the guardian procedure without an argument returns one of the protected objects which are ready for garbage collection, or #f if no such object is available. Objects which are returned in this way are removed from the guardian.

make-guardian takes one optional argument that says whether the new guardian should be greedy or sharing. If there is any chance that any object protected by the guardian may be resurrected, then you should make the guardian greedy (this is the default).

See R. Kent Dybvig, Carl Bruggeman, and David Eby (1993) "Guardians in a Generation-Based Garbage Collector". ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1993.

(the semantics are slightly different at this point, but the paper still (mostly) accurately describes the interface).

Scheme Procedure: destroy-guardian! guardian
C Function: scm_destroy_guardian_x (guardian)
Destroys guardian, by making it impossible to put any more objects in it or get any objects from it. It also unguards any objects guarded by guardian.

Scheme Procedure: guardian-greedy? guardian
C Function: scm_guardian_greedy_p (guardian)
Return #t if guardian is a greedy guardian, otherwise #f.

Scheme Procedure: guardian-destroyed? guardian
C Function: scm_guardian_destroyed_p (guardian)
Return #t if guardian has been destroyed, otherwise #f.


Go to the first, previous, next, last section, table of contents.