Bcolz_array_iterator


(ben.bowles) #1

Hi there,

I was just curious: has anyone found this module anywhere? I can’t find it in the downloadable materials. If anyone knows where it lives, I’d be glad to hear it, thanks!

Ben


(Matthew Kleinsmith) #2

(Constantin) #3

Has anyone had this issue when creating an on-disk carray:

I tried to create a bcolz carray shaped (11000, 500, 500, 3) and get this error. It works with about 8000 samples (i.e. 8000 × 500 × 500 × 3 × 4 bytes ≈ 24 GB). I get the impression that even though I set “rootdir=mydirectory”, bcolz still tries to create an np.array as an intermediate step (look at the error output cited above). If that is true, it would be a major issue for using bcolz with larger-than-RAM data.
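
To make the size estimate above concrete, here is a small sketch of the arithmetic. It assumes 4-byte elements (e.g. float32), which is what the 24 GB figure implies; `array_nbytes` is just an illustrative helper name, not part of bcolz.

```python
# Estimate the size of a dense array from its shape and element size.
# Assumption: 4 bytes per element (e.g. float32), as implied by the 24 GB figure.

def array_nbytes(shape, itemsize=4):
    """Return the number of bytes a dense array of this shape would occupy."""
    n = itemsize
    for dim in shape:
        n *= dim
    return n

working = array_nbytes((8000, 500, 500, 3))    # the size that still worked
failing = array_nbytes((11000, 500, 500, 3))   # the size that raised the error

print(f"{working / 1e9:.0f} GB vs {failing / 1e9:.0f} GB")
```

If any step materializes the full array as a plain NumPy array, the second figure is what would have to fit in RAM at once.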


(Constantin) #4

Workaround: Do not pre-allocate the entire array, but create using

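import bcolz
# bcolz.zeros takes a shape; bcolz.carray expects array data as its first argument
c = bcolz.zeros((0, height, width, channels), rootdir=mydir, mode='w', **kwargs)
# ...
c.append(myarray)
c.flush()  # make sure buffered data is written to disk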
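
A rough sketch of how the append step could be looped so that only one batch is held in memory at a time. `append_in_batches`, `load_batch` and `ListTarget` are hypothetical names for illustration; the stand-in class is only there so the sketch runs without bcolz installed.

```python
def append_in_batches(target, load_batch, n_samples, batch_size):
    """Append n_samples to `target` in slices of at most batch_size.

    target     -- any object with .append(batch), e.g. an on-disk carray
    load_batch -- hypothetical callable (start, stop) -> batch of samples
    """
    for start in range(0, n_samples, batch_size):
        stop = min(start + batch_size, n_samples)
        target.append(load_batch(start, stop))
    # Call flush() if the target provides one (bcolz carrays do).
    flush = getattr(target, "flush", None)
    if flush is not None:
        flush()


class ListTarget:
    """Stand-in for a carray: append() extends along the first axis."""

    def __init__(self):
        self.rows = []

    def append(self, batch):
        self.rows.extend(batch)


demo = ListTarget()
append_in_batches(demo, lambda a, b: list(range(a, b)), n_samples=10, batch_size=4)
print(len(demo.rows))
```

With bcolz, `target` would be the empty carray `c` created above and `load_batch` would return a NumPy array of shape (n, height, width, channels).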

#5

It seems that this iterator cannot run multi-threaded (through the workers parameter of fit_generator) because of the lock. Is there any way to make it parallel? Thanks.


(Jeremy Howard (Admin)) #6

There’s not really any reason to, since it’s not doing any processing.
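
For context on the lock being discussed, here is a generic sketch of a lock-guarded iterator. It is not the module's actual code; `LockedIterator` is a hypothetical name, and the sketch only illustrates why several workers sharing one such iterator take turns rather than run in parallel.

```python
import threading


class LockedIterator:
    """Wrap an iterable so that only one thread advances it at a time."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Other threads block here until the current next() call returns.
        with self._lock:
            return next(self._it)


batches = LockedIterator(range(100))
seen = []


def worker():
    for item in batches:
        seen.append(item)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(seen))
```

Each item is handed out exactly once, but the work inside `__next__` is serialized, so extra workers only help if there is real processing outside the lock.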


(Constantin) #7

I recently ran into another issue, along the lines of

        RuntimeError('fatal error during Blosc
        decompression: -1',) in
        'bcolz.carray_ext.chunk._getitem' ignored

I figured it must be related to several threads accessing the same carrays on disk. I had been preparing data in one notebook and reading from it in another using a generator. When I stopped one of the notebooks the error disappeared.