Red Programming Language: 0.6.6: Memory Management Improvements

This new milestone brings many low-level improvements to Red's memory management and garbage collection. Most of those are long-planned additions needed to complete the internal memory model and make it robust enough for the future stable Red v1.0.

First, here is a simplified overview of the Red memory model (existing parts in green, new parts in orange, non-Red parts in blue):

[Figure: simplified overview of the Red memory model]

All Red values are stored in series. Some Red values require one or more buffers to hold their content. The values can never reference a buffer directly, but only through a node reference, to enable relocation when expanding the series buffer or when moving it around during compaction by the GC.
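To make the indirection concrete, here is a minimal C sketch (not Red/System, and not the actual runtime code). The names buffer_t, node_t, value_t and node_expand are hypothetical stand-ins. The only point illustrated is that a value holds a node reference, so relocating a buffer means updating a single node slot while every value stays valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified types: not the actual Red runtime structures. */
typedef struct { int *data; size_t size; size_t capacity; } buffer_t;
typedef struct { buffer_t *buf; } node_t;               /* a node slot holds the buffer's current address */
typedef struct { node_t *node; size_t index; } value_t; /* a value refers to the node, never to the buffer */

/* Create a buffer with room for `capacity` items and bind it to `node`. */
static int node_init(node_t *node, size_t capacity) {
    buffer_t *b = malloc(sizeof *b);
    if (!b) return -1;
    b->data = malloc(capacity * sizeof *b->data);
    if (!b->data) { free(b); return -1; }
    b->size = 0;
    b->capacity = capacity;
    node->buf = b;
    return 0;
}

/* Relocate the content into a larger buffer. Only the node is updated. */
static int node_expand(node_t *node, size_t new_capacity) {
    buffer_t *old = node->buf;
    assert(new_capacity >= old->size);
    buffer_t *b = malloc(sizeof *b);
    if (!b) return -1;
    b->data = malloc(new_capacity * sizeof *b->data);
    if (!b->data) { free(b); return -1; }
    memcpy(b->data, old->data, old->size * sizeof *b->data);
    b->size = old->size;
    b->capacity = new_capacity;
    node->buf = b;            /* the single place that knows the buffer address */
    free(old->data);
    free(old);
    return 0;
}

/* Values always go through the node to reach the data. */
static int value_get(const value_t *v) { return v->node->buf->data[v->index]; }

static void node_release(node_t *node) {
    free(node->buf->data);
    free(node->buf);
    node->buf = NULL;
}
```

A value created before node_expand still reads the right item afterwards, because it never stored the buffer address itself.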

Now let's dive into the hairy details!

External resources GC

The Red/View engine backends rely on external resources provided by the OS. Some of those resources are linked to face! or font! objects and require special care when those objects are no longer reachable. So far, our GC (Garbage Collector) was not able to release such resources (image bitmap buffers and font handles), as unreachable Red aggregate values are seen as simple series during the GC sweeping stage. To improve that, we have added an external resources manager that tracks and frees unused resources, now allowing unrestricted use of images and fonts!

Accurate GC

The Red GC relies on walking allocated memory and scanning the native stack to identify live Red values. Scanning the native stack can be challenging. So far, the scanner used a conservative approach, which is simpler but can lead to corruption or crashes in rare cases (e.g. a floating point number being mistaken for a series or node pointer). Moreover, such an approach precluded a node frames GC, as there was no way to accurately identify node pointers on the stack. This is now solved. The plan was always to make the scanner precise when getting closer to Red v1.0, and that's what we did in this release.

In order to achieve that, several key additions were made:

➤ Frame records hints: the R/S compiler now generates a map of hints for arguments and local variables which are series/node pointers, using bit arrays stored in the .data segment and retrieved during scanning. In order to match a call frame on the stack to the right bit array, an offset is now pushed on the stack by each function call as part of its prolog sequence. Only the stack slots corresponding to 1's in the bit array are analyzed further to identify their origin series/node frame, then marked and stored in a list for the collector to update later if needed. The bit arrays are compressed using our CRUSH algorithm implementation, so that, e.g. for the GUI console executable, all bit arrays add only about 3472 extra bytes to the final executable.

➤ Variadic hints (typed vs untyped): for variadic functions, the bit array is created dynamically. If the typed mode is used, an accurate bitmap is produced. If the generic untyped variadic mode is used, all the argument stack slots are marked for processing. This could, in theory, create false positives, but in practice, in Red's runtime code, all such cases are safe, referencing only Red values.

➤ Optimized pointer identification performance: each pointer extracted from the stack needs to be confirmed as a valid series or node pointer. This check is now done using cached sorted lists and binary search, making it vastly faster.

➤ Optimized frame walking by skipping non-Red frames: the stack scanning is done by jumping between call frames, relying on the saved frame pointer in each frame to chain the frames. However, when R/S callback functions are invoked by external (mostly OS) code, those external frames should be skipped to avoid false positives and for the sake of performance. Now the scanner identifies which call frames are part of Red's code segment and skips the rest. However, one last hurdle remains: the dreadful compilation option in C compilers where the frame pointer is omitted in call frames (e.g. -fomit-frame-pointer in gcc). In such cases, walking the stack by dereferencing frame pointers is no longer an option. The workaround is to save an extra "last known Red frame" pointer before calling any external code, which is then used by the scanner to jump over external code directly into the parent Red frame.
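The first and third points above can be sketched together in C. This is an illustration under stated assumptions, not the Red/System implementation: range_t, find_range and scan_frame are hypothetical names, the bit array is assumed to be already decompressed, and each series/node frame is reduced to a plain address range in a sorted, non-overlapping list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range covered by one series or node frame: [start, end). */
typedef struct { uintptr_t start; uintptr_t end; } range_t;

/* Binary search over ranges sorted by start address and non-overlapping.
   Returns the index of the range containing p, or -1 if there is none. */
static int find_range(const range_t *ranges, size_t count, uintptr_t p) {
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (p < ranges[mid].start)     hi = mid;
        else if (p >= ranges[mid].end) lo = mid + 1;
        else return (int)mid;
    }
    return -1;
}

/* Scan one call frame. Only slots whose bit is set in `hints` are examined,
   so a float or integer slot is never mistaken for a pointer.
   Indexes of confirmed slots are written to `out`; returns how many were found. */
static size_t scan_frame(const uintptr_t *slots, size_t slot_count,
                         const uint8_t *hints,
                         const range_t *ranges, size_t range_count,
                         size_t *out) {
    size_t found = 0;
    assert(slots && hints && out);
    for (size_t i = 0; i < slot_count; i++) {
        if (!(hints[i / 8] & (1u << (i % 8)))) continue;   /* not flagged as a pointer slot */
        if (find_range(ranges, range_count, slots[i]) >= 0)
            out[found++] = i;   /* remembered so it can be updated after a relocation */
    }
    return found;
}
```

In this sketch, a slot that happens to contain an address-like bit pattern is ignored unless the compiler-generated hint flags it, which is the difference between a precise and a conservative scan.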

Node frames compaction

The GC is now capable of reclaiming node frames where the number of used slots is very low. Until now, this was a cause of memory leaks for long-running apps with bursts of many series allocations, as new node frames were allocated, but mostly unused (yet non-empty) ones were never released.

This is now taken care of through a special GC pass that runs when specific conditions are fulfilled, moving live nodes from emptier frames to fuller frames, then freeing the entirely empty frames. The GC then updates all references to relocated nodes during its marking and stack scanning stages.
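As a rough illustration of that pass, the C sketch below evacuates the live nodes of a sparse frame into the free slots of a fuller one and records each move so that references can be patched afterwards. Everything here is an assumption made for the example: frame_t, move_t, evacuate and forward are hypothetical names, a free slot is simply NULL, and the conditions that trigger the real pass are not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4   /* tiny frame size, for illustration only */

typedef struct { void *slots[SLOTS]; size_t used; } frame_t;   /* NULL slot = free */
typedef struct { void **from; void **to; } move_t;             /* forwarding record */

/* Move every live node of `src` into free slots of `dst`.
   Each move is recorded in `moves` so references can be patched later.
   Returns the number of moves; `src` is entirely empty afterwards if all nodes fit. */
static size_t evacuate(frame_t *src, frame_t *dst, move_t *moves) {
    size_t n = 0, d = 0;
    assert(src != dst);
    for (size_t s = 0; s < SLOTS && src->used > 0; s++) {
        if (!src->slots[s]) continue;              /* free slot, nothing to move */
        while (d < SLOTS && dst->slots[d]) d++;    /* next free slot in dst */
        if (d == SLOTS) break;                     /* dst is full */
        dst->slots[d] = src->slots[s];
        src->slots[s] = NULL;
        moves[n].from = &src->slots[s];
        moves[n].to = &dst->slots[d];
        n++;
        src->used--;
        dst->used++;
    }
    return n;
}

/* Patch one node reference found during marking or stack scanning. */
static void **forward(void **ref, const move_t *moves, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (moves[i].from == ref) return moves[i].to;
    return ref;   /* the node was not relocated */
}
```

Once a frame's used count drops to zero it can be returned to the allocator, which is what removes the leak described above.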

In addition, the internal structure of node frames was improved. The free slot tracking method was changed from a stack-oriented model to linked lists of free slots, doubling the node frame capacity while keeping constant-time allocation and freeing.
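A free-slot linked list can be kept inside the free slots themselves, which is one plausible reason a frame of the same size can hold more nodes than with a separate stack of free entries. The C sketch below shows that general technique only; node_frame_t, node_alloc and node_free are hypothetical names and the layout is not taken from the Red runtime.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_SLOTS 8   /* tiny for illustration; hypothetical value */

/* A slot holds either a live buffer pointer or the link to the next free slot,
   so no separate free-slot stack is needed. */
typedef union slot { void *buffer; union slot *next_free; } slot_t;

typedef struct {
    slot_t  slots[FRAME_SLOTS];
    slot_t *free_head;   /* first free slot, or NULL when the frame is full */
    size_t  used;
} node_frame_t;

/* Chain all slots into the free list. */
static void frame_init(node_frame_t *f) {
    for (size_t i = 0; i + 1 < FRAME_SLOTS; i++)
        f->slots[i].next_free = &f->slots[i + 1];
    f->slots[FRAME_SLOTS - 1].next_free = NULL;
    f->free_head = &f->slots[0];
    f->used = 0;
}

/* O(1): pop the head of the free list. Returns NULL when the frame is full. */
static slot_t *node_alloc(node_frame_t *f, void *buffer) {
    slot_t *s = f->free_head;
    if (!s) return NULL;
    f->free_head = s->next_free;
    s->buffer = buffer;
    f->used++;
    return s;
}

/* O(1): push the slot back onto the free list. */
static void node_free(node_frame_t *f, slot_t *s) {
    assert(f->used > 0);
    s->next_free = f->free_head;
    f->free_head = s;
    f->used--;
}
```

Both operations touch only the list head, so their cost does not depend on the frame size.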

External Red values reference management

Red values can sometimes be referenced by external non-Red code. The View engine relies on that and was storing copies of face object values inside external OS structures in order to be able to retrieve them on OS-generated events that would trigger Red callbacks. Such a practice is not reliable and not compatible with the new node GC, as some node pointers could be stored out of the GC's reach. So a new external values management system was introduced to only export a reference (an array index used as ID) to external code and keep all values inside Red series. The View backends were modified accordingly to rely on those references instead of copying the face object values.
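The idea of exporting an array index instead of a copy can be sketched in C as follows. This is a simplified illustration, not the View backend code: red_value_t is a stand-in for a Red value cell, and ref_table_t, ref_store, ref_get and ref_release are hypothetical names with a fixed-size table chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 16   /* hypothetical fixed size to keep the sketch short */

typedef struct { int type; void *node; } red_value_t;   /* stand-in for a Red value cell */

typedef struct {
    red_value_t   values[MAX_REFS];   /* the values stay on the Red side */
    unsigned char in_use[MAX_REFS];
} ref_table_t;                        /* zero-initialize before use */

/* Store a value and return its ID (an array index), or -1 if the table is full.
   Only this ID is handed to external code. */
static int ref_store(ref_table_t *t, red_value_t v) {
    for (int id = 0; id < MAX_REFS; id++) {
        if (!t->in_use[id]) {
            t->values[id] = v;
            t->in_use[id] = 1;
            return id;
        }
    }
    return -1;
}

/* Resolve an ID handed back by external code; NULL if the ID is invalid. */
static red_value_t *ref_get(ref_table_t *t, int id) {
    if (id < 0 || id >= MAX_REFS || !t->in_use[id]) return NULL;
    return &t->values[id];
}

/* Forget a value once the external object is destroyed. */
static void ref_release(ref_table_t *t, int id) {
    if (id >= 0 && id < MAX_REFS) t->in_use[id] = 0;
}
```

Because external code only ever holds the integer ID, the GC can update the node pointer stored in the table without the OS structures noticing.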

That sub-system could in the future also be used for libRed external values management, to replace the ring buffer used there, which is functionally almost identical but now redundant.

Low-level allocations tracking

The Red runtime code sometimes needs to allocate memory regions which last until the end of the Red process or need to be kept away from the GC. For such use cases, Red relies on malloc, simply imported from the C library. Instead of a direct mapping, it now uses a thin layer on top of malloc in order to track all allocations, providing extra features:

➤ Freeing of all system-allocated memory regions on Red exit. This is not strictly needed for the Red runtime, but makes it possible to track potential leaks (a rare case, as most such allocations are permanent).

➤ Ability to gather stats about such allocations (reported in show/info in "allocated on heap" part).

➤ Buffer overflow detection in debug mode, using guard barriers placed at the tail of allocated buffers and checked for possible overflows on freeing.

➤ This layer is part of the R/S runtime, so available to R/S code too.
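The features listed above can be illustrated with a small C wrapper around malloc. This is a sketch under assumptions, not the R/S runtime layer: tracked_alloc, tracked_free and tracked_free_all are hypothetical names, the guard size and byte pattern are arbitrary, and the guard is always checked here rather than only in a debug build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 8      /* arbitrary guard length */
#define GUARD_BYTE 0xA5   /* arbitrary guard pattern */

/* Header placed before each user buffer; links all live allocations. */
typedef union alloc_hdr {
    struct { union alloc_hdr *prev, *next; size_t size; } h;
    max_align_t align;    /* keeps the user buffer suitably aligned (C11) */
} alloc_hdr_t;

static alloc_hdr_t *alloc_list = NULL;   /* all live allocations */
static size_t alloc_total = 0;           /* bytes requested and still live */
static size_t alloc_count = 0;           /* number of live allocations */

void *tracked_alloc(size_t size) {
    if (size > SIZE_MAX - sizeof(alloc_hdr_t) - GUARD_SIZE) return NULL;
    alloc_hdr_t *hdr = malloc(sizeof *hdr + size + GUARD_SIZE);
    if (!hdr) return NULL;
    hdr->h.size = size;
    hdr->h.prev = NULL;
    hdr->h.next = alloc_list;
    if (alloc_list) alloc_list->h.prev = hdr;
    alloc_list = hdr;
    alloc_total += size;
    alloc_count++;
    unsigned char *user = (unsigned char *)(hdr + 1);
    memset(user + size, GUARD_BYTE, GUARD_SIZE);   /* guard barrier at the tail */
    return user;
}

/* Frees the buffer. Returns 0 if the guard is intact, -1 if it was overwritten. */
int tracked_free(void *ptr) {
    if (!ptr) return 0;
    alloc_hdr_t *hdr = (alloc_hdr_t *)ptr - 1;
    const unsigned char *guard = (const unsigned char *)ptr + hdr->h.size;
    int status = 0;
    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (guard[i] != GUARD_BYTE) status = -1;
    if (hdr->h.prev) hdr->h.prev->h.next = hdr->h.next;
    else alloc_list = hdr->h.next;
    if (hdr->h.next) hdr->h.next->h.prev = hdr->h.prev;
    assert(alloc_count > 0);
    alloc_total -= hdr->h.size;
    alloc_count--;
    free(hdr);
    return status;
}

/* Release everything still tracked, e.g. on exit. Returns how many were released. */
size_t tracked_free_all(void) {
    size_t n = 0;
    while (alloc_list) {
        tracked_free(alloc_list + 1);
        n++;
    }
    return n;
}
```

The counters give the kind of figures a stats report needs, and a non-zero result from tracked_free_all on exit points at an allocation that was never released explicitly.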

Other Changes

➤ stats native improvements: /info has been extended to also contain the total memory allocated from the OS and the memory allocated from the heap (see above). The /show refinement has been implemented to pretty-print all that information.

➤ Lowered memory allocations in Red runtime at start (about 1MB gained in total).

➤ Memory frames integrity auto-testing in debug mode (only node frames for now).

➤ Handle! values now hold a sub-type, revealed by mold/all (for debugging purposes):

    view [b: button "Hi!" [print mold/all b/state/1]]

    #[handle! 030A063Eh window]

➤ The final buffer is now preallocated internally for insert and append calls with the /dup refinement, resulting in much lower memory usage.

➤ Using zero? with a point3D value always returned false due to an incomplete copy/paste change. This is now fixed.

➤ Out-of-range integer math operations now promote results to float values.

➤ Updated GPIO definitions for Raspberry Pi devices. Pi 4 should be supported, but is untested so far. Pi 5 is not yet supported (should be updated soon).

➤ Camera widget: now viewport aspect ratio is honored regardless of the container face size. Viewport is now centered and black bars are added if needed to fill the container face. A %camera-resize.red script is provided for testing.

➤ Toolchain: --view <engine> compilation option added to force a different View engine than the default one for the target.

➤ 47 tickets closed with a fix since 0.6.5.

➤ Stable releases are back!

Red/System changes

➤ Added system/lib-image to support libRedRT image properties.

➤ Improved native stack trace reports (frame address, stack records chaining support).

➤ On the IA-32 backend, the passive casting mode (no conversion, all bits preserved) between integer! and float32! (as [integer!|float32!] keep) now returns the correct results.

What's next?

The next release (should be 0.7.0) will feature the full async IO support we are all waiting for! There are also other major features we're working on which will probably be released after 0.7.0.

In the meantime, enjoy this release!

9 comments:

-pekr-, March 20, 2025 at 7:54 AM
Great release, as always. Brings solid ground towards the further 1.0 development. Looking forward to the async IO :-)
Anonymous, March 20, 2025 at 11:18 AM
Great work!
It's really exciting to see the progress made on this fascinating language!
Anonymous, March 20, 2025 at 11:53 AM
Good job on this release!
Anonymous, March 24, 2025 at 12:12 AM
One nitpick: did you use the word "eventual" in the sense it has in our language, i.e. "possible, potential, ...", because in English it means "certain, final, inevitable..."
Regards!
Anonymous, April 12, 2025 at 10:38 PM
This is how programming languages should be - clear, readable and without cryptic syntax. Easy to learn, even for an old man like me, who come from an era where Basic was the norm. Thank you so much and please keep up the good work!
Anonymous, May 1, 2025 at 9:11 PM
Great work
Anonymous, June 22, 2025 at 2:09 PM
The link to "CRUSH algorithm" paper is wrong (the paper is basically about a poorly designed RLE), the actually implemented algorithm has nothing to do with it (it's a LZ77 derivative).

March 19, 2025