I’m trying to classify video frames, and for that I need to create a batch from a sequence of frames and pass it to the model. Currently I’m classifying frame by frame as follows:
```python
cap = cv2.VideoCapture(video_path)
_, frame = cap.read()
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
img_t = pil2tensor(frame, np.float32)
image = Image(img_t)
pred = model.predict(image)
```
Is there a way to create a batch directly from frames, or to add them to a dataset and get the batch from it?
Thanks in advance
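In other words, I’d like to stack the preprocessed frames into one batch array, something like this minimal sketch (the frame size and count here are arbitrary placeholders):

```python
import numpy as np

# Pretend these are 4 preprocessed RGB frames of shape (H, W, 3)
frames = [np.random.rand(224, 224, 3).astype(np.float32) for _ in range(4)]

# HWC -> CHW per frame, then stack into one (N, C, H, W) batch
batch = np.stack([f.transpose(2, 0, 1) for f in frames])
print(batch.shape)  # (4, 3, 224, 224)
```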
The way to go would be to create a custom Dataset that loads your data in `__init__` (if it fits in memory) and whose `__getitem__` retrieves one video and its label.
Once you have done that, a standard DataLoader can be set up with this Dataset object and your batch size; PyTorch/fastai will then automatically take care of generating your batches.
One example below:
Note: my dataset doesn’t fit in memory (~1.5TB), so I load the files in `__getitem__`.
```python
import numpy as np
import torch
from torch.utils.data import Dataset

class SequenceDataset(Dataset):
    def __init__(self,
                 file_list: np.ndarray = None,
                 list_labels: np.ndarray = None) -> None:
        self._list_sequence_file = file_list
        self.labels = list_labels

    def __getitem__(self, index):
        if index < len(self._list_sequence_file):
            sequence = np.load(self._list_sequence_file[index])
            sequence = sequence.astype(np.float32)
            # Any needed normalization goes here
            sequence = np.expand_dims(sequence, axis=0)
            sequence = torch.from_numpy(sequence).float()
            return sequence, self.labels[index]
        return None, None

    def __len__(self) -> int:
        return len(self._list_sequence_file)
```
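To show the DataLoader wiring mentioned above, here is a minimal sketch; the in-memory `ToyDataset` is a stand-in for a real file-backed dataset, and all names and sizes are illustrative:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Tiny in-memory dataset: 6 samples of 2 features each."""
    def __init__(self):
        self.data = np.arange(12, dtype=np.float32).reshape(6, 2)
        self.labels = np.array([0, 1, 0, 1, 0, 1])

    def __getitem__(self, index):
        return torch.from_numpy(self.data[index]), int(self.labels[index])

    def __len__(self):
        return len(self.data)

# DataLoader groups samples into batches automatically
loader = DataLoader(ToyDataset(), batch_size=3, shuffle=False)
xb, yb = next(iter(loader))
print(xb.shape, yb.shape)  # torch.Size([3, 2]) torch.Size([3])
```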
@bennnun thanks for your answer, I’ll try this.
But I was wondering: since I’m using an exported model, where should I apply (if needed) the transformation functions `tfms` (for example the one that normalizes the tensor)?
@kyis, I don’t think (though I’m not sure) that the fastai library can apply transformations to videos through the DataBunch.
I think your best bet is to apply the needed transformations directly in `__getitem__` before returning the two tensors (video + label).
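For example, normalization inside `__getitem__` could look like this minimal sketch; the mean/std values are the common ImageNet statistics, used here only as a placeholder for whatever your exported model expects:

```python
import numpy as np

# Example per-channel statistics (ImageNet values, illustrative only)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(sequence: np.ndarray) -> np.ndarray:
    # sequence has shape (..., H, W, 3); broadcast over the channel axis
    return (sequence - mean) / std

frame = np.ones((2, 2, 3), dtype=np.float32)
out = normalize(frame)
print(out.shape)  # (2, 2, 3)
```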
@bennnun OK, I’ll go this way. Many thanks!