From what I understand, not enough people know about the tunable garbage collector (GC) provided by Ruby Enterprise Edition. Section 4.2 of the REE documentation gives you the run-down.
Now that you’ve read the documentation, why aren’t you tuning your GC? This is one of the easiest ways to speed up your test runs. Here is a quick example of the gains that can be achieved just by setting five environment variables.
Without tuning
410 scenarios (410 passed)
3213 steps (3213 passed)
9m29.685s
With tuning
410 scenarios (410 passed)
3213 steps (3213 passed)
5m58.661s
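To put a number on the difference, here is a small sketch that converts the two timings above into seconds and works out the saving. The helper name `to_seconds` is my own invention for illustration; only the two timing strings come from the runs above.

```ruby
# Convert a duration such as "9m29.685s" into seconds.
# (to_seconds is a hypothetical helper, not part of any library.)
def to_seconds(duration)
  match = duration.match(/\A(\d+)m([\d.]+)s\z/)
  raise ArgumentError, "unexpected duration: #{duration}" unless match
  match[1].to_i * 60 + match[2].to_f
end

before = to_seconds("9m29.685s") # untuned run
after  = to_seconds("5m58.661s") # tuned run
saved  = before - after

puts format("saved %.1f s (%.0f%% less wall-clock time)", saved, saved / before * 100)
```

That works out to roughly three and a half minutes saved, or a bit over a third of the original run time.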
Keep in mind, all I did was change 5 settings.
export RUBY_HEAP_MIN_SLOTS=1000000
export RUBY_HEAP_SLOTS_INCREMENT=1000000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
export RUBY_GC_MALLOC_LIMIT=1000000000
export RUBY_HEAP_FREE_MIN=500000
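If you would rather not export these in your shell (and risk them leaking into other processes), one option is a tiny launcher script that sets them only for the test run. This is a minimal sketch, not something from the REE docs: `GC_SETTINGS`, `apply_gc_settings`, and `run_tuned` are names I made up, and the `cucumber` command in the usage comment is an assumption about how you run your suite. As far as I know, REE reads these variables when the interpreter starts, so they affect the child process that gets launched, not the launcher itself.

```ruby
# The five settings from above. Environment values must be strings.
GC_SETTINGS = {
  "RUBY_HEAP_MIN_SLOTS"           => "1000000",
  "RUBY_HEAP_SLOTS_INCREMENT"     => "1000000",
  "RUBY_HEAP_SLOTS_GROWTH_FACTOR" => "1",
  "RUBY_GC_MALLOC_LIMIT"          => "1000000000",
  "RUBY_HEAP_FREE_MIN"            => "500000"
}.freeze

# Copy the settings into this process's environment so that
# any child process started afterwards inherits them.
def apply_gc_settings(settings = GC_SETTINGS)
  ENV.update(settings)
end

# Run a command with the tuned settings in its environment.
# Returns whatever Kernel#system returns (true, false, or nil).
def run_tuned(command, settings = GC_SETTINGS)
  apply_gc_settings(settings)
  system(command)
end

# Usage (the command name is an assumption, swap in your own):
# run_tuned("cucumber")
```

I used `ENV.update` followed by a plain `system` call, rather than passing a hash to `system`, so the script also works on 1.8-era Rubies.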
How do these settings affect your application during test runs, and how can you make your application (running on REE) faster? Well, let’s go over what we have. I will annotate the existing documentation with the reasoning for my settings:
- RUBY_HEAP_MIN_SLOTS
-
This specifies the initial number of heap slots. The default is 10000.
-
The default number of heap slots is pretty small. Since this configuration is just for my tests, I’m sure I will use way more slots, so I’m going to start off with more. This means that initially my test process will consume more memory.
- RUBY_HEAP_SLOTS_INCREMENT
-
The number of additional heap slots to allocate when Ruby needs to allocate new heap slots for the first time. The default is 10000.
For example, suppose that the default GC settings are in effect, and 10000 Ruby objects exist on the heap (= 10000 used heap slots). When the program creates another object, Ruby will allocate a new heap with 10000 heap slots in it. There are now 20000 heap slots in total, of which 10001 are used and 9999 are unused.
-
From what I understand, when the slots are all consumed, a new batch of slots will be allocated. Once again, this number is way too low, so I increase it by a factor of 100.
- RUBY_HEAP_SLOTS_GROWTH_FACTOR
-
Multiplicator used for calculating the number of new heaps slots to allocate next time Ruby needs new heap slots. The default is 1.8.
Take the program in the last example. Suppose that the program creates 10000 more objects. Upon creating the 10000th object, Ruby needs to allocate another heap. This heap will have 10000 * 1.8 = 18000 heap slots. There are now 20000 + 18000 = 38000 heap slots in total, of which 20001 are used and 17999 are unused.
The next time Ruby needs to allocate a new heap, that heap will have 18000 * 1.8 = 32400 heap slots.
- Since my increment is so large, I’m only going to use a factor of 1, which means every new heap stays the same size as the increment instead of growing.
- RUBY_GC_MALLOC_LIMIT
-
The amount of C data structures which can be allocated without triggering a garbage collection. If this is set too low, then the garbage collector will be started even if there are empty heap slots available. The default value is 8000000.
- This is my development box, so Ruby is free to use as much memory as it wants. I’m sure there is an upper limit on this number, but I haven’t hit it yet.
- RUBY_HEAP_FREE_MIN
-
The number of heap slots that should be available after a garbage collector run. If fewer heap slots are available, then Ruby will allocate a new heap according to the RUBY_HEAP_SLOTS_INCREMENT and RUBY_HEAP_SLOTS_GROWTH_FACTOR parameters. The default value is 4096.
Since we have lots of memory, we’ll just make this half of our minimum number of slots.
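To see how the slot settings interact, here is a simplified model of the heap growth described in the documentation quoted above: start with the minimum slots, add the increment the first time more are needed, then multiply each new heap by the growth factor. This is only a sketch of the arithmetic, not REE's actual allocator, and `heap_totals` is a hypothetical helper I wrote for illustration.

```ruby
# Simplified model of the heap growth described in the REE documentation.
# Returns the total number of heap slots after each new heap allocation.
def heap_totals(min_slots, increment, growth_factor, allocations)
  totals = [min_slots]
  next_heap = increment
  allocations.times do
    totals << totals.last + next_heap
    # Each later heap is the previous heap size times the growth factor.
    next_heap = (next_heap * growth_factor).round
  end
  totals
end

defaults = heap_totals(10_000, 10_000, 1.8, 3)
tuned    = heap_totals(1_000_000, 1_000_000, 1, 3)

puts "defaults: #{defaults.inspect}" # 10000, 20000, 38000, 70400
puts "tuned:    #{tuned.inspect}"    # 1000000, 2000000, 3000000, 4000000
```

With the defaults, three heap allocations still leave you at only 70,400 slots, which matches the 20000 and 38000 totals in the documentation's example. With my settings, the process starts with more slots than that and adds a million at a time.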
Please do not use these settings in production. Both 37signals and Twitter have provided their settings, and you can use those as a basis for tuning your app.
Thanks to Evan Weaver and Nick Gauthier for most of the guts of this blog post.