One time for the leakers, double up for the callers

EdLeak tracks only one level of callers on the allocation functions that it monitors. This keeps the monitoring overhead as low as possible. However, sometimes this is not enough to easily find the origin of a memory leak. This is why it is possible to trace the call stack of leaks. This feature has been available in EdLeak for a long time, but I never explained how to enable and use it. After reading this article you will be able to use it when needed.

How it works

Call stacks are not dumped by default since this would add a systematic overhead. Dumping a call stack is needed only when a leak is detected and the first caller level is not enough to find the culprit. So call stacks are enabled on demand, per allocer id. This is available in the Python client only, via the addStackWatch method of the EdLeak class. Each allocer added via this API will have all the branches of its backtrace recorded.

Manual dump

Using it is quite easy. The following examples are executed against the leaker tool provided with edLeak, whose purpose is to leak memory. First start it under edLeak:

LD_PRELOAD=$PWD/src/edleak/libedleak.so test/leaker/leaker

Then start ipython. I strongly suggest using ipython for interactive work since it provides many useful features such as command completion and pretty printing of result values. You first need to initialize the client:

import rpc.ws
import edleak.api
import edleak.slice_runner
ws_rpc = rpc.ws.WebService("localhost", 8080)
el = edleak.api.EdLeak(ws_rpc)

Then create and start a slice runner:

runner = edleak.slice_runner.SliceRunner(el)
asset = runner.run(1, 60)

Once the run is completed, you can retrieve the list of all allocers, and extract the leakers:

allocers = asset.getAllocerList()
leakers = [l for l in allocers if l['leak_factor']['leak'] > 0]
len(leakers)
=> Out: 7

In this first dump, only the first level of callers is available. The caller is found in the 'stack' field of the allocer, which contains only one entry for now. Here is the content for the first leaker:

In [28]: leakers[0]['stack']
Out[28]: [u'0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)']

If we consider that String::AppendString is not enough to find the origin of the leak, we can enable call-stack dumps on it:

el.addStackWatch(leakers[0]['id'])

Then start the slice runner again:

asset2 = runner.run(1, 60)
allocers2 = asset2.getAllocerList()
leakers2 = [l for l in allocers2 if l['leak_factor']['leak'] > 0]
len(leakers2)
=> Out: 11

There are more leakers than in the previous run, which is good news. The list of leakers originating from String::AppendString can be extracted this way:

append_leakers = [l for l in leakers2 if
    l['stack'][0] == '0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)']
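
The comparison above matches the whole stack entry, address included. That address can change from one process run to another (for example when the library is loaded at a different base address), so a script meant to be reused may prefer to match on the symbol only. Here is a minimal sketch, assuming stack entries always have the 'address:symbol' form shown in this article. frame_symbol and leakers_from are hypothetical helpers, not part of the edLeak client, and the ids in the sample data are made up.

```python
def frame_symbol(frame):
    """Return the symbol part of an 'address:symbol' stack entry."""
    # Split on the first ':' only, since C++ symbols contain '::'.
    address, sep, symbol = frame.partition(':')
    return symbol if sep else frame


def leakers_from(leakers, symbol_prefix):
    """Keep the leakers whose first caller starts with symbol_prefix."""
    return [l for l in leakers
            if l['stack'] and frame_symbol(l['stack'][0]).startswith(symbol_prefix)]


# Sample data shaped like the allocers shown in this article (ids are made up).
sample_leakers = [
    {'id': 1, 'stack': ['0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)']},
    {'id': 2, 'stack': ['0x7f7aa7abdb45:FileWriter::Loop()']},
]
matches = leakers_from(sample_leakers, 'String::AppendString')
print([l['id'] for l in matches])
```

With a real run, the same helper would be called as leakers_from(leakers2, 'String::AppendString').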

The filter returns 4 leakers. We can now print the call stack of each one:

In [38]: append_leakers[0]['stack']
Out[38]: [u'0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)']
In [39]: append_leakers[1]['stack']
Out[39]:
[u'0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)',
 u'0x7f7aa7ac4b5d:String::operator<<(char const*)',
 u'0x7f7aa7abd652:CU_GetSlice(String*, String*)',
 u'0x7f7aa7abdb45:FileWriter::Loop()',
 u'0x7f7aa7ac2c30:Thread::PThreadLoop(void*)']
In [40]: append_leakers[2]['stack']
Out[40]:
[u'0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)',
 u'0x7f7aa7ac4b5d:String::operator<<(char const*)',
 u'0x7f7aa7abd668:CU_GetSlice(String*, String*)',
 u'0x7f7aa7abdb45:FileWriter::Loop()',
 u'0x7f7aa7ac2c30:Thread::PThreadLoop(void*)']
In [41]: append_leakers[3]['stack']
Out[41]:
[u'0x7f7aa7ac4df4:String::AppendString(char const*, unsigned int)',
 u'0x7f7aa7ac4b5d:String::operator<<(char const*)',
 u'0x7f7aa7abd48a:CU_GetSlice(String*, String*)',
 u'0x7f7aa7abe3ce:WsMethodSlice::Call(DynObject const&, String*)',
 u'0x7f7aa7ac4417:WsInterface::Call(DynObject const&, String*)']

The first leak is the one from the first run. The 3 others come from edLeak's own code, allocating some strings. Yes, you read correctly: these leakers are the edLeak file and HTTP controllers! However, as I already explained, this does not mean that they really leak. A first hint is that they are classified as 'log' leaks in the leak factor, meaning that they are more likely a cache being filled. Looking further, this method is used to allocate the JSON string holding all available allocers. Since this information tends to grow quickly at the beginning and becomes constant once all allocers are found, it effectively behaves like a cache being filled. Such false positives can be filtered out by analyzing only the leakers that are classified as linear or exponential.
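
Such a filter could look like the sketch below. Only the 'leak_factor' and 'leak' keys appear in this article. The 'class' key and the 'linear' and 'exp' values are assumptions made for illustration, so check them against what your allocers actually contain. likely_leakers is a hypothetical helper and the sample data is made up.

```python
# Classes treated as likely real leaks; these names are assumptions.
SUSPICIOUS_CLASSES = ('linear', 'exp')


def likely_leakers(allocers, class_key='class'):
    """Keep allocers that leak and whose class is linear or exponential."""
    result = []
    for allocer in allocers:
        factor = allocer['leak_factor']
        # class_key is an assumed field name; a missing key simply excludes the allocer.
        if factor['leak'] > 0 and factor.get(class_key) in SUSPICIOUS_CLASSES:
            result.append(allocer)
    return result


# Made-up sample data for illustration.
sample_allocers = [
    {'id': 1, 'leak_factor': {'leak': 120, 'class': 'linear'}},
    {'id': 2, 'leak_factor': {'leak': 40, 'class': 'log'}},
    {'id': 3, 'leak_factor': {'leak': 0, 'class': 'constant'}},
]
print([a['id'] for a in likely_leakers(sample_allocers)])
```

Passing class_key as a parameter keeps the sketch usable if the real field has another name.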

Automated dump

All these manual tasks can easily be automated in a single Python script. Such a script is available as one of the examples: edleak_autodetect.py. It performs the manual steps that we just went through:

Initialize the RPC.
Start a first run.
Find the leakers.
Enable call stacks on all leakers.
Start a second run.
Print the leakers with their call stacks.

Note that the script discards leaks classified as constant and logarithmic to limit false positives.
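
The steps listed above can be sketched as follows. This is not the shipped edleak_autodetect.py, only a minimal illustration built from the calls used earlier in this article (runner.run, getAllocerList, addStackWatch). The autodetect and print_leakers names are hypothetical, the run arguments are simply the (1, 60) values used above, and unlike the real script this sketch does not discard any leak class.

```python
def autodetect(el, runner, run_args=(1, 60)):
    """Run twice, watching the stacks of the leakers found by the first run."""
    # First run: only one caller level is recorded.
    first = runner.run(*run_args).getAllocerList()
    for allocer in first:
        if allocer['leak_factor']['leak'] > 0:
            # Ask edLeak to record full call stacks for this allocer.
            el.addStackWatch(allocer['id'])

    # Second run: watched allocers now come back with deeper stacks.
    second = runner.run(*run_args).getAllocerList()
    return [a for a in second if a['leak_factor']['leak'] > 0]


def print_leakers(leakers):
    """Print each leaker's call stack, one frame per line."""
    for leaker in leakers:
        print('leaker %s' % leaker['id'])
        for frame in leaker['stack']:
            print('    ' + frame)
```

With the el and runner objects created in the manual session, this would be used as print_leakers(autodetect(el, runner)).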

Conclusion

Finding memory leaks is very easy and quite efficient with edLeak. Dumping call stacks on identified leakers helps to find the culprits even faster. With the examples provided in this article and the ones shipped with edKit, you should be able to write your own scripts matching your needs. It should also be quite easy to integrate it into a CI system that runs functional tests. This would be a good way to automate memory leak detection.