I am working on a classification project with thousands of short videos, about 20 GB of data in total. I have been trying to find an efficient way to process the videos for use with Fast.AI's DataSet loader. My basic idea:
- Load each video with OpenCV
- Grab a few frames per second
- Either save as JPEG (StackOverflow1) or convert directly to tensors (StackOverflow2) and save to disk (sketched below)
- Train the classifier on these transformed images.
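Roughly what I have in mind for the extraction step, as a minimal sketch; the paths and the 2-frames-per-second rate are placeholders, not settled choices:

```python
import cv2
from pathlib import Path

def extract_frames(video_path, out_dir, frames_per_sec=2):
    """Decode a video with OpenCV and save roughly `frames_per_sec` frames as JPEGs."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    cap = cv2.VideoCapture(str(video_path))
    fps = cap.get(cv2.CAP_PROP_FPS) or 30  # fall back if the container reports no FPS
    step = max(1, int(round(fps / frames_per_sec)))  # keep every `step`-th frame

    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream or decode error
            break
        if idx % step == 0:
            # JPEG compression keeps the on-disk footprint far below raw uint8
            cv2.imwrite(str(out_dir / f"frame_{saved:05d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

# e.g. extract_frames("clip_0001.mp4", "frames/clip_0001")
```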
For an 11 MB video, I ended up with roughly 2 GB when saving the frames as a numpy array (`uint8`) or as JPEGs.
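For scale, a back-of-the-envelope check on why raw frames balloon: a decoded `uint8` RGB frame is width × height × 3 bytes, while the codec exploits temporal redundancy that raw frames do not. The numbers below are illustrative assumptions (1080p, 60 s clip, 3 fps sampling), since I haven't listed the actual resolution or duration:

```python
# Illustrative arithmetic only; resolution, duration, and sampling
# rate are assumed, not measured from my actual clips.
w, h = 1920, 1080
sampled_frames = 3 * 60                  # 3 fps over a 60 s clip
bytes_per_frame = w * h * 3              # uint8 RGB: ~6.2 MB per raw frame
total_gb = bytes_per_frame * sampled_frames / 1e9
print(f"{total_gb:.1f} GB")              # ~1.1 GB raw from an ~11 MB compressed clip
```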
This seems like a really inefficient process to me. Any guidance on how I might better approach this project? Thanks!