Help: reading an LMDB dataset in fastai

I’d like to read a dataset stored in LMDB format. Can anybody tell me how to use an LMDB dataset in fastai, the way Caffe does? Should I write a custom dataset, or does fastai already have an interface for this? Thank you!

There are lessons specifically on this dataset. Have you been through one of the previous lesson 1s?

Sorry, I have read the lessons on GitHub and the documentation at https://docs.fast.ai/, but I cannot find any topic about LMDB datasets. I also searched Google, but it only turns up IMDb rather than LMDB. :sob:

Oh sorry I just assumed you had typo’d. Any link to the LMDB dataset?

I mean the LMDB database, https://lmdb.readthedocs.io/en/release/, which is widely used in Caffe.

Oh! Database not dataset. Using a custom Dataset in fastai: https://docs.fast.ai/basic_data.html#Using-a-custom-Dataset-in-fastai

There is no easy support for LMDB, from my understanding.

:grinning:thanks a loooot.

It looks like you’ll need fastai’s data_block API https://docs.fast.ai/data_block.html along with LMDB’s Python wrapper.

I’d say the worst-case scenario is that you run each step by pulling a batch of data from LMDB, converting it into a DataFrame, and then feeding that into fastai to train the model. Run enough steps and you’ve got an epoch, and enough epochs will in theory give you a trained model.

If there’s an academic faculty member interested in this problem, I recently saw a Facebook award that would be well-suited to it. fastai is built on top of Facebook’s PyTorch, and given the amount of data that FB has, I suspect they would be very interested in machine learning with high-performance databases. https://research.fb.com/programs/research-awards/proposals/2020-networking-request-for-proposals/