What do __getstate__ and __setstate__ do?

Pickle is a Python library for serialization. Serialization means something like “packing into bytes”.
So it can turn an object (for example, an instance of your class) into bytes, which you can then save to disk.

To do so, pickle has to know how to pack your class.

If you don't provide any methods, the default behaviour is used: it simply saves self.__dict__ when pickling and restores it when unpickling. So everything in __dict__ is pickled by default. You can change this behaviour by defining these methods:

__getstate__ should return an object (representing the instance's state), which will be pickled and saved.
__setstate__ should take that object as its parameter and use it to restore the state as it was before.
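To make the two methods concrete, here is a minimal sketch. The Counter class and its callback attribute are made up for illustration (they are not from the notebook). It drops an attribute that cannot be pickled in __getstate__ and recreates it in __setstate__.

```python
import pickle


class Counter:
    """Hypothetical class, used only to illustrate the two hooks."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        # Lambdas cannot be pickled, so this attribute must be left out.
        self.callback = lambda: None

    def __getstate__(self):
        # Start from a copy of the default state and drop what cannot be pickled.
        state = self.__dict__.copy()
        del state["callback"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then rebuild the one that was dropped.
        self.__dict__.update(state)
        self.callback = lambda: None


c = Counter("clicks")
c.count = 3
restored = pickle.loads(pickle.dumps(c))  # round trip through bytes
```

After the round trip, restored should have the same name and count as c, plus a fresh callback.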


In the 08_data_block.ipynb notebook you have SplitData, which stores items (paths split into two sets). This __setstate__ method (I am assuming from the code) just makes sure that when you load previously split items from disk, it updates the existing __dict__ instead of overwriting the whole thing.

def __setstate__(self,data:Any): self.__dict__.update(data)

There is a comment above it, which suggests that this is some kind of workaround to make saving and loading work.

#This is needed if we want to pickle SplitData and be able to load it back without recursion errors
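My guess at where such a recursion error could come from, as a sketch: I am assuming the class forwards unknown attributes to self.train through __getattr__ (the Delegating classes below are hypothetical, not the notebook code). When unpickling, the object is created with an empty __dict__, pickle then looks for __setstate__, and if the class does not define it the lookup falls into __getattr__, which asks for self.train, which is not set yet, which calls __getattr__ again, and so on.

```python
import pickle


class Delegating:
    """Hypothetical class that forwards unknown attributes to self.train."""

    def __init__(self, train):
        self.train = train

    def __getattr__(self, name):
        # Only called when normal lookup fails. If 'train' itself is missing
        # (as it is on a freshly unpickled, still empty object), this recurses.
        return getattr(self.train, name)


class DelegatingFixed(Delegating):
    def __setstate__(self, data):
        # Defined on the class, so pickle finds it without touching __getattr__.
        self.__dict__.update(data)


def roundtrip_ok(obj):
    """Return True if obj survives pickle.dumps followed by pickle.loads."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except RecursionError:
        return False


broken = roundtrip_ok(Delegating([1, 2, 3]))
fixed = roundtrip_ok(DelegatingFixed([1, 2, 3]))
```

If that assumption holds, loading the first class fails with a RecursionError, while the version with an explicit __setstate__ loads fine.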