Pickle is a library for Python serialization. Serialization means something like “packing into bytes”.
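So it can turn an object, such as an instance of your class or the value of a variable, into bytes, which you can then save to disk. (Functions and classes themselves are pickled by reference, that is by their importable name, not by their code.)
To do so, Python and the pickle library need to know how to pack your class.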
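As a rough illustration of “packing into bytes” and saving to disk, here is a minimal sketch. The `settings` dictionary and the file name are made up for the example; only `pickle.dumps`, `pickle.loads`, `pickle.dump` and `pickle.load` come from the standard library.

```python
import os
import pickle
import tempfile

# Hypothetical data, made up for this sketch.
settings = {"name": "experiment-1", "sizes": [128, 256], "lr": 0.01}

# Pack the object into bytes in memory.
blob = pickle.dumps(settings)

# Save the same object to a file and load it back.
path = os.path.join(tempfile.mkdtemp(), "settings.pkl")
with open(path, "wb") as f:
    pickle.dump(settings, f)
with open(path, "rb") as f:
    restored = pickle.load(f)
```

After this, `restored` should be an equal but separate copy of `settings`.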
If you don't provide any methods, the defaults are used. The defaults simply save and restore the object's self.__dict__ attribute, so everything in __dict__ is pickled by default. You can change this behaviour by defining these methods:
__getstate__ should return an object (representing the instance's state) that will be pickled and saved.
__setstate__ should take that object as its parameter and use it to restore the instance's state as it was before.
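To make the two methods concrete, here is a small sketch. The `Counter` class and its `cache` attribute are hypothetical, invented only to show one possible use: leaving something out of the pickled state and rebuilding it on load.

```python
import pickle


class Counter:
    """Hypothetical class, made up for this sketch."""

    def __init__(self, start):
        self.count = start
        self.cache = {}  # pretend this is not worth saving

    def __getstate__(self):
        # Return the state to pickle: a copy of __dict__ without the cache.
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then recreate what was left out.
        self.__dict__.update(state)
        self.cache = {}


original = Counter(3)
original.cache["big"] = list(range(1000))

# Round trip through bytes; __getstate__ and __setstate__ are called for us.
restored = pickle.loads(pickle.dumps(original))
```

Here `restored` keeps `count` but starts with an empty `cache`, because the cache never went into the pickled bytes.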
In the 08_data_block.ipynb notebook you have SplitData, which stores the items (paths split into two sets). Its __setstate__ method (I am assuming from the code) makes sure that when you load previously split items from disk, the instance's __dict__ is updated with them instead of being overwritten as a whole:
def __setstate__(self,data:Any): self.__dict__.update(data)
There is a comment above it, which suggests that this is a workaround needed to pickle the object and load it back successfully:
#This is needed if we want to pickle SplitData and be able to load it back without recursion errors
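One plausible way such a recursion error can arise, sketched under an assumption the post does not confirm: the class forwards unknown attributes to self.train through __getattr__. On load, pickle creates the object without calling __init__ and then looks for __setstate__; if the class does not define it, the lookup falls into __getattr__, which asks for self.train, which is not set yet, and so on. The `Forwarding` classes below are hypothetical stand-ins, not the notebook's SplitData.

```python
import pickle


class Forwarding:
    """Hypothetical stand-in that forwards unknown attributes to self.train."""

    def __init__(self, train, valid):
        self.train, self.valid = train, valid

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        return getattr(self.train, name)


class ForwardingFixed(Forwarding):
    def __setstate__(self, data):
        # Found on the class, so __getattr__ is never consulted on load.
        self.__dict__.update(data)


def round_trip(obj):
    return pickle.loads(pickle.dumps(obj))


# Without __setstate__, loading is expected to recurse until Python gives up.
try:
    round_trip(Forwarding([1, 2], [3]))
    outcome = "loaded"
except RecursionError:
    outcome = "RecursionError"

# With __setstate__, the same data loads back normally.
fixed = round_trip(ForwardingFixed([1, 2], [3]))
```

If this assumption matches the notebook, the one-line __setstate__ is less about how __dict__ gets filled and more about giving pickle a real method to find.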