Get your own free workspace
View
 

Generational GC for Parrot

Page history last edited by PBworks 6 years, 8 months ago

A Generational, Thread-Aware GC for Parrot

 

 

Abstract:

 

Parrot, the VM for Perl6 (and other languages), currently uses a

plain Mark-and-Sweep GC. As it will be used on a wide range of systems, from

clusters to embedded, this scheme isn't really optimal. It also does not

consider multi-threading, which should be included later in Parrot.

 

The project is to implement a new GC that fixes all (or most of) these issues,

by using a generational, copying scheme.

 

 

 

Benefits:

 

This new GC should be more efficient and faster than the one that is currently

being used. It won't need to be rewritten when multi-threading comes to parrot.

It also provides a simple and cheap solution to the problem of timely

destruction, which is needed for python and backward compatibility with perl5.

Some parts should also be tunable (at least at build-time), which is always

useful when running on different platforms with different needs.

 

 

 

 

Project Detail:

 

This is a copy of the mail that was sent to perl6-internals@perl.org, where it

is currently being discussed

(http://www.nntp.perl.org/group/perl.perl6.internals/29994).

 

 

In order to allow more powerful GC schemes, we need to be able to copy objects

from a location of memory to another. There are problems with this, though, as

C structures can hold pointers to objects we are moving, and we have no way

(and don't want to) to make them update this info.

 

So we add a level of indirection: objects consist of a header and the actual

data. Headers are allocated once and for all at object creation and do not

move. They only hold a pointer to the 'real' object, thus making it possible

to move objects anywhere in the memory. Outside functions will see the header

address as the object address and some internal macros will handle the extra

indirection.

 

The big disadvantage of this approach is that we use one or two words (if

objects need to know where their header is, which seems reasonable) per object.

 

 

We then use a generational, incremental mark-and-sweep algorithm, such as

described by http://citeseer.csail.mit.edu/armstrong95one.html

 

We have different generations (in fact queues) of objects, from 1 to n-1

(the value of n could be tuned dynamically). We also have two special

generations, n, where objects do not grow old (to take care of timely

destruction) and 0, where objects are very constant and long-lived (mainly

used during parrot init). Objects in a generation are also sorted according

to their age.

The idea is to have the following invariant: all pointers go from new objects

to old objects. Thus, we can mark all objects in only one pass, starting from

the newest objects, generation n (using a Eratosthene-sieve-like method : if

an object is not marked, leave it, else mark all of its pointers). When an

entire generation has been processed, all non-marked objects are destroyed

and the others are compacted. Generation n becomes generation n-1, with the

exceptions of generation 0 and n.

Comparing the age of two objects then becomes very easy : if they are in

different generations, just compare their numbers. If not, the one with the

lower address is the youngest.

We can of course use a generational approach by limiting the number of

non-marked objects.

 

One problem is that our invariant can be broken. The solution is to track

down these inter-generational pointers (IGP) and simply add them to the root

set.

In order to reduce the number of IGP, we allocate an aggregate as being

artificially younger than scalars it has pointers to.

 

Thread issues still need to be discussed and are not covered here.

 

 

Keywords: generational, mark-and-collect, copying

Comments (0)

You don't have permission to comment on this page.