I am using
pandas.DataFrame in a multi-threaded code (actually a custom subclass of
Sound). I have noticed that I have a memory leak, since the memory usage of my program augments gradually over 10mn, to finally reach ~100% of my computer memory and crash.
I used objgraph to try tracking this leak, and found out that the count of instances of
MyDataFrame is going up all the time while it shouldn’t : every thread in its
run method creates an instance, makes some calculations, saves the result in a file and exits … so no references should be kept.
objgraph I found that all the data frames in memory have a similar reference graph :
I have no idea if that’s normal or not … it looks like this is what is keeping my objects in memory. Any idea, advice, insight ?
Confirmed that there’s some kind of memory leak going on in the indexing infrastructure. It’s not caused by the above reference graph. Let’s move the discussion to GitHub (SO is for Q&A):
EDIT: this actually appears to not be a memory leak at all, but has to do with the OS memory allocation issues perhaps. Please have a look at the github issue for more information