[mnet-devel] Ickle me, pickle me, tickle me too...

Jim McCoy mccoy at mad-scientist.com
Thu Aug 7 22:49:10 BST 2003


I am currently rooting around through the localblockstore and 
associated files and have been working on some bits for a replacement 
to localblockstore that I want to play with.  In the grand tradition of 
"replace bsddb whenever possible" I have a replacement indexer that is 
based on some things we did at Yahoo/Four11. The problem I am currently 
looking at is checkpointing the index.

I have not done much with pickle/cPickle other than a few toy examples, 
but it seems to be the right tool (or at least the best tool among 
those available to me) when it comes to providing a bit of persistence 
to the index. Unfortunately, the basic pickle.dump() is not going to 
get the job done on its own.  So here are a few questions, to see if 
anyone has suggestions before I start pestering the various Python 
lists...

1) I want to checkpoint to make restarts faster, but I am guessing that 
doing so after every insertion or deletion will cost a fair amount of 
CPU and disk I/O. Does anyone know if this is actually true?

2) I think that I will have to write a wrapper to overwrite the 
existing pickle file on each checkpoint. Because each update touches 
several objects in the index, pickling just the top-level class 
instance should capture everything, but rewriting the whole index every 
time might make the cost in question #1 worse.
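
A minimal sketch of the overwrite wrapper described in question 2, 
written for a present-day Python 3 rather than the interpreter in use 
when this was posted. The names checkpoint() and load_checkpoint() are 
hypothetical, and writing to a temporary file before renaming it over 
the old checkpoint is an assumption added here so that a crash 
mid-write does not destroy the previous copy.

```python
import os
import pickle
import tempfile


def checkpoint(index, path):
    """Pickle `index` to `path`, replacing any earlier checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(index, f, protocol=pickle.HIGHEST_PROTOCOL)
            f.flush()
            os.fsync(f.fileno())
        # Swap the new checkpoint into place over the old one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file and keep the old checkpoint.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the object stored by the most recent checkpoint()."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Every call rewrites the whole object graph reachable from `index`, so 
the cost grows with the size of the index, not with the size of the 
change.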

3) I can also just write out copies of the changed index items (each 
insertion or deletion touches several index items) at each 
insert/delete and push those objects into the pickle file, with the 
object reconstitution consisting of creating an empty index and then 
replaying all of the pickled index items. The downside to this is that 
there may end up being a lot of these items and lots of "overwrites" as 
things move along.  Maybe I should write a __getstate__ and 
__setstate__ that try to do the right thing, but since I am trying to 
reload the state in the index's __init__ it will also mean that I will 
probably have to get my hands dirty with __getinitargs__, etc.  More 
work that I would avoid if I could, but I am not sure if it will be 
avoidable if I want to do this "the right way".
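
The replay approach in question 3 can be sketched roughly as follows, 
again in present-day Python 3. JournaledIndex is a hypothetical toy 
(a plain dict standing in for the real indexer), and the 
(op, key, value) record layout is an assumption made for illustration. 
It relies on the fact that several pickles can be written back to back 
in one file and read back one at a time until EOFError.

```python
import pickle


class JournaledIndex:
    """Toy key -> value index that journals each change as a pickle."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.items = {}
        self._replay()  # rebuild state from any existing journal
        self._journal = open(journal_path, "ab")

    def _replay(self):
        try:
            f = open(self.journal_path, "rb")
        except FileNotFoundError:
            return  # no journal yet, so start with an empty index
        with f:
            while True:
                try:
                    op, key, value = pickle.load(f)
                except EOFError:
                    break  # reached the end of the journal
                self._apply(op, key, value)

    def _apply(self, op, key, value):
        if op == "set":
            self.items[key] = value
        else:
            self.items.pop(key, None)

    def _log(self, op, key, value=None):
        # Append the record to the journal, then update memory.
        pickle.dump((op, key, value), self._journal)
        self._journal.flush()
        self._apply(op, key, value)

    def insert(self, key, value):
        self._log("set", key, value)

    def delete(self, key):
        self._log("del", key)

    def close(self):
        self._journal.close()
```

The journal grows with every operation, including superseded ones, so 
a real version would need periodic compaction, and this sketch does not 
handle a record left truncated by a crash in the middle of a write.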

Jim


