Skip to content
Menu
Open World News Open World News
  • Privacy Policy
Open World News Open World News

Andy Wingo: v8’s precise field-logging remembered set

Posted on January 5, 2024 by Michael G

Author:
Source

A remembered set is used by a garbage collector to identify graph
edges between partitioned sub-spaces of a heap. The canonical example
is in generational collection, where you allocate new objects in
newspace, and eventually promote survivor objects to oldspace. If
most objects die young, we can focus GC effort on newspace, to avoid
traversing all of oldspace all the time.

Collecting a subspace instead of the whole heap is sound if and only if
we can identify all live objects in the subspace. We start with some
set of roots that point into the subspace from outside, and then
traverse all links in those objects, but only to other objects within
the subspace.

The roots are, like, global variables, and the stack, and registers; and
in the case of a partial collection in which we identify live objects
only within newspace, also any link into newspace from other spaces
(oldspace, in our case). This set of inbound links is a remembered
set
.

There are a few strategies for maintaining a remembered set. Generally
speaking, you start by implementing a write barrier that intercepts all
stores in a program. Instead of:

obj[slot] := val;

You might abstract this away:

write_slot(obj, sizeof obj, &obj[slot], val);

As you can see, it’s quite an annoying transformation to do by hand;
typically you will want some sort of language-level abstraction that
lets you keep the more natural syntax. C++ can do this pretty well, or
if you are implementing a compiler, you just add this logic to the
code generator.

Then the actual write barrier… well its implementation is twingled up
with implementation of the remembered set. The simplest variant is a
card-marking scheme, whereby the heap is divided into equal-sized
power-of-two-sized cards, and each card has a bit. If the heap is
also divided into blocks (say, 2 MB in size), then you might divide
those blocks into 256-byte cards, yielding 8192 cards per block. A
barrier might look like this:

void write_slot(ObjRef obj, size_t size,
                SlotAddr slot, ObjRef val) {
  obj[slot] := val; // Start with the store.

  uintptr_t block_size = 1<<21;
  uintptr_t card_size = 1<<8;
  uintptr_t cards_per_block = block_size / card_size;

  uintptr_t obj_addr = obj;
  uintptr_t card_idx = (obj_addr / card_size) % cards_per_block;

  // Assume remset allocated at block start.
  void *block_start = obj_addr & ~(block_size-1);
  uint32_t *cards = block_start;

  // Set the bit.
  cards[card_idx / 32] |= 1 << (card_idx % 32);
}

Then when marking the new generation, you visit all cards, and for all
marked cards, trace all outbound links in all live objects that begin on
the card.

Card-marking is simple to implement and simple to statically allocate as
part of the heap. Finding marked cards takes time proportional to the
size of the heap, but you hope that the constant factors and SIMD
minimize this cost. However iterating over objects within a card can be
costly. You hope that there are few old-to-new links but what do you
know?

In Whippet I have been struggling a
bit with sticky-mark-bit generational
marking
,
in which new and old objects are not spatially partitioned. Sometimes
generational collection is a win, but in benchmarking I find that often
it isn’t, and I think Whippet’s card-marking
barrier

is at fault: it is simply too imprecise. Consider firstly that our
write barrier applies to stores to slots in all objects, not just those
in oldspace; a store to a new object will mark a card, but that card may
contain old objects which would then be re-scanned. Or consider a store
to an old object in a more dense part of oldspace; scanning the card may
incur more work than needed. It could also be that Whippet is being too
aggressive at re-using blocks for new allocations, where it should be
limiting itself to blocks that are very sparsely populated with old
objects.

what v8 does

There is a tradeoff in write barriers between the overhead imposed on
stores, the size of the remembered set, and the precision of the
remembered set. Card-marking is relatively low-overhead and usually
small as a fraction of the heap, but not very precise. It would be
better if a remembered set recorded objects, not cards. And it would be
even better if it recorded slots in objects, not just objects.

V8 takes this latter strategy: it has per-block remembered sets which
record slots containing “interesting” links. All of the above words
were to get here, to take a brief look at its remembered set.

The main operation is
RememberedSet::Insert.
It takes the MemoryChunk (a block, in our language from above) and the
address of a slot in the block. Each block has a remembered set; in
fact, six remembered
sets

for some reason. The remembered set itself is a
SlotSet,
whose interesting operations come from
BasicSlotSet.

The structure of a slot set is a bitvector partitioned into
equal-sized, possibly-empty
buckets
.
There is one bit per slot in the block, so in the limit the size
overhead for the remembered set may be 3% (1/32, assuming compressed
pointers). Currently each bucket is 1024 bits (128
bytes)
,
plus the 4 bytes for the bucket pointer itself.

Inserting into the slot
set

will first allocate a bucket (using C++ new) if needed, then load the
“cell” (32-bit
integer)

containing the slot. There is a template parameter declaring whether
this is an atomic or normal load. Finally, if the slot bit in the cell
is not yet set, V8 will set the bit, possibly using atomic
compare-and-swap.

In the language of Blackburn’s Design and analysis of field-logging
write
barriers
,
I believe this is a field-logging barrier, rather than the bit-stealing
slot barrier described by Yang et al in the 2012 Barriers
Reconsidered, Friendlier
Still!
.
Unlike Blackburn’s field-logging barrier, however, this remembered set
is implemented completely on the side: there is no in-object remembered
bit, nor remembered bits for the fields.

On the one hand, V8’s remembered sets are precise. There are some
tradeoffs, though: they require off-managed-heap dynamic allocation for
the buckets, and traversing the remembered
sets

takes time proportional to the whole heap size. And, should V8 ever
switch its minor mark-sweep generational
collector

to use sticky mark bits, the lack of a spatial partition could lead to
similar problems as I am seeing in Whippet. I will be interested to see
what they come up with in this regard.

Well, that’s all for today. Happy hacking in the new year!

Read more

Related Posts:

  • Andy Wingo: the sticky mark-bit algorithm
    Andy Wingo: the sticky mark-bit algorithm
  • Andy Wingo: a world to win: webassembly for the rest of us
    Andy Wingo: a world to win: webassembly for the rest of us
  • Andy Wingo: approaching cps soup
    Andy Wingo: approaching cps soup
  • Andy Wingo: unintentional concurrency
    Andy Wingo: unintentional concurrency
  • Andy Wingo: ephemeral success
    Andy Wingo: ephemeral success
  • Andy Wingo: coarse or lazy?
    Andy Wingo: coarse or lazy?

Recent Posts

  • [TUT] LoRa & LoRaWAN – MikroTik wAP LR8 kit mit The Things Network verbinden [4K | DE]
  • Mercado aguarda Powell e olha Trump, dados e Haddad | MINUTO TOURO DE OURO – 11/02/25
  • Dan Levy Gets Candid About Learning How To Act Differently After Schitt’s Creek: ‘It’s Physically…
  • Building a Rock Shelter & Overnight Stay in Heavy Snow 🏕️⛰️
  • Les milliardaires Elon Musk et Xavier Niel s’insultent copieusement

Categories

  • Android
  • Linux
  • News
  • Open Source
©2025 Open World News | Powered by Superb Themes
We use cookies on our website to give you the most relevant experience by remembering your preferences and repeat visits. By clicking “Accept All”, you consent to the use of ALL the cookies. However, you may visit "Cookie Settings" to provide a controlled consent.
Cookie SettingsAccept All
Manage consent

Privacy Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may affect your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. These cookies ensure basic functionalities and security features of the website, anonymously.
CookieDurationDescription
cookielawinfo-checkbox-analytics11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional11 monthsThe cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance11 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy11 monthsThe cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.
Functional
Functional cookies help to perform certain functionalities like sharing the content of the website on social media platforms, collect feedbacks, and other third-party features.
Performance
Performance cookies are used to understand and analyze the key performance indexes of the website which helps in delivering a better user experience for the visitors.
Analytics
Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.
Advertisement
Advertisement cookies are used to provide visitors with relevant ads and marketing campaigns. These cookies track visitors across websites and collect information to provide customized ads.
Others
Other uncategorized cookies are those that are being analyzed and have not been classified into a category as yet.
SAVE & ACCEPT