Applying a function to a pd.DataFrame efficiently to get images

This is with respect to Quick Doodle on Kaggle, but it can be extended to almost anything that involves a Pandas DataFrame.
Many people have shared their kernels, and many of them use a ConvNet approach at some point. Given a pd.DataFrame, there are many ways to extract the images; I've used df['drawing'].apply(strokes2img), where strokes2img returns the image. My question is whether there is a more efficient way to carry out this operation. I've also tried a bag from dask, but converting it back to a list is itself very expensive, which cancels out the benefit gained in the first place. I'm using 4 CPUs with a P100, and the operation above crashes the kernel every time. Is there any way to port this operation to the GPU, or is there a more efficient approach?
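To make the question concrete, here is a minimal sketch of one alternative I am considering: rendering the images in batches with a generator instead of materializing every image at once through apply. It does not move anything to the GPU; it only bounds memory use, on the assumption that the crash is memory-related. The strokes2img body here is a hypothetical stand-in (my real one is not shown), and the sketch assumes each 'drawing' entry is a JSON string of strokes in the form [[x0, x1, ...], [y0, y1, ...]] with coordinates in 0-255. The names image_batches, batch_size and size are also just illustrative.

```python
import json

import numpy as np
import pandas as pd


def strokes2img(drawing, size=32):
    """Hypothetical stand-in: mark stroke points on a size x size uint8 grid."""
    strokes = json.loads(drawing) if isinstance(drawing, str) else drawing
    img = np.zeros((size, size), dtype=np.uint8)
    for xs, ys in strokes:
        # Scale the assumed 0-255 coordinates down to grid indices.
        cols = np.asarray(xs, dtype=np.intp) * size // 256
        rows = np.asarray(ys, dtype=np.intp) * size // 256
        img[rows, cols] = 255
    return img


def image_batches(df, batch_size=256, size=32):
    """Yield uint8 arrays of shape (n, size, size), one batch at a time."""
    for start in range(0, len(df), batch_size):
        chunk = df['drawing'].iloc[start:start + batch_size]
        # Preallocate a compact uint8 buffer instead of a Series of objects.
        batch = np.empty((len(chunk), size, size), dtype=np.uint8)
        for i, drawing in enumerate(chunk):
            batch[i] = strokes2img(drawing, size)
        yield batch


# Tiny made-up frame: five drawings, each a single two-point stroke.
df = pd.DataFrame({'drawing': ['[[[0, 255], [0, 255]]]'] * 5})
batches = list(image_batches(df, batch_size=2))
shapes = [b.shape for b in batches]
```

Each batch could be fed straight to the model and then discarded, so only batch_size images are held in memory at a time. Whether this is actually faster than apply depends on how expensive strokes2img is, which I have not measured.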
