Hey everyone,
I was trying to save my databunch object using the data.save(“PATH/data.pkl”), when I encountered this error:
ValueError: ctypes objects containing pointers cannot be pickled
This is my code for creating databunch:
batch_size = 64
do_flip = True
flip_vert = True
max_rotate = 90
max_zoom = 1.1
max_lighting = 0.2
max_warp = 0.2
p_affine = 0.75
p_lighting = 0.75
tfms = get_transforms(do_flip=do_flip,
flip_vert=flip_vert,
max_rotate=max_rotate,
max_zoom=max_zoom,
max_lighting=max_lighting,
max_warp=max_warp,
p_affine=p_affine,
p_lighting=p_lighting)
train, valid = ObjectItemListSlide(train_images) ,ObjectItemListSlide(valid_images)
item_list = ItemLists(".", train, valid)
lls = item_list.label_from_func(lambda x: x.y, label_cls=SlideObjectCategoryList)
lls = lls.transform(tfms, tfm_y=True, size=patch_size)
data = lls.databunch(bs=batch_size, collate_fn=bb_pad_collate,num_workers=0).normalize()
Here train dataset is as follows:
SlideLabelList (100 items)
x: ObjectItemListSlide
Image (3, 256, 256),Image (3, 256, 256),Image (3, 256, 256),Image (3, 256, 256),Image (3, 256, 256)
y: SlideObjectCategoryList
ImageBBox (256, 256),ImageBBox (256, 256),ImageBBox (256, 256),ImageBBox (256, 256),ImageBBox (256, 256)
Path: .
And the train_images is a list of object_detection_fastai.helper.wsi_loader.SlideContainer objects that are created using the following function:
def create_wsi_container(annotations_df: pd.DataFrame):
container = []
for image_name in tqdm(annotations_df["file_name"].unique()):
image_annos = annotations_df[annotations_df["file_name"] == image_name]
bboxes = [box for box in image_annos["box"]]
labels = [label for label in image_annos["cat"]]
container.append(SlideContainer(image_folder/image_name, y=[bboxes, labels], level=res_level,width=patch_size, height=patch_size, sample_func=sample_function))
return container
I tried doing data.save(), the following error occurs:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-15-10be10a7c369> in <module>
----> 1 data.save("train-valid_data_bs8.pkl")
3 frames
/usr/local/lib/python3.7/dist-packages/fastai/basic_data.py in save(self, file)
153 warn("Serializing the `DataBunch` only works when you created it using the data block API.")
154 return
--> 155 try_save(self.label_list, self.path, file)
156
157 def add_test(self, items:Iterator, label:Any=None, tfms=None, tfm_y=None)->None:
/usr/local/lib/python3.7/dist-packages/fastai/torch_core.py in try_save(state, path, file)
414 #To avoid the warning that come from PyTorch about model not being checked
415 warnings.simplefilter("ignore")
--> 416 torch.save(state, target)
417 except OSError as e:
418 raise Exception(f"{e}\n Can't write {path/file}. Pass an absolute writable pathlib obj `fname`.")
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol, _use_new_zipfile_serialization)
378 if _use_new_zipfile_serialization:
379 with _open_zipfile_writer(opened_file) as opened_zipfile:
--> 380 _save(obj, opened_zipfile, pickle_module, pickle_protocol)
381 return
382 _legacy_save(obj, opened_file, pickle_module, pickle_protocol)
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _save(obj, zip_file, pickle_module, pickle_protocol)
587 pickler = pickle_module.Pickler(data_buf, protocol=pickle_protocol)
588 pickler.persistent_id = persistent_id
--> 589 pickler.dump(obj)
590 data_value = data_buf.getvalue()
591 zip_file.write_record('data.pkl', data_value, len(data_value))
ValueError: ctypes objects containing pointers cannot be pickled
I know this error comes because C is implementing pointers to the ObjectItemListSlide object at the lower level, and those ctypes object cant be pickled, and one of the solutions to that is to
define our own __getstate__
/__reduce__
methods for pickling and __setstate__
for unpickling.
But how can I implement it in fastai, and is there any other way to save databunch.
Thanking you all in advance,
Harshit