Using Pascal VOC data with fastai

Hello to all;
I’m a bit stuck on the third fastai lesson.
The bear prediction model obviously hasn’t given me any problems.
After that, I built a classification model to tell Fender electric guitars apart. I downloaded the images via DuckDuckGo, and the model apparently worked with that data, but when I tried it on different images it didn’t distinguish the guitars correctly.
So I decided not to use the images downloaded from the net directly. Instead, I downloaded 100 images of each guitar model (there are five models, so 500 images in total) and labelled them with the LabelImg tool.
The difficulty I have now is that these labelled images are in an ‘xml’ format, and I don’t really know how to process them with fastai since, until now, I have used jpg or png images directly.
Would someone be so kind as to explain how to work with this data format, which I think is called ‘pascalVOC’?
Best regards!

Where did you get your images? Did you mean you downloaded from a different search engine? AFAIK downloading the images from duckduckgo doesn’t automatically make them relevant, you still need to go through them and make sure the labels correspond to the actual image. This is the data preparation step which you’ll need to do regardless of how you got the data.

What I would do is get the images from DDG, perhaps more than I want in my dataset (say 150 instead of 100 per category), so that after manual cleanup I can expect to be left with about 100 of each. Once they’re “good” and properly labeled (I’d do this part manually), I would train for an epoch and then run the tool provided by fastai to clean up any images that are mislabeled.

I don’t understand how images can be in ‘xml format’ … are they base 64 encoded and are included in the xml output of the labeling program you are using?

It is not that the images are XML; the annotation file is an XML file in the Pascal VOC style.

To work with this, you can either write your own get_x and get_y and pass them to a DataBlock. These get_x and get_y functions would most likely be XML parsers that extract the filename and path, depending on how the images are saved. This is hard to know without having seen the files themselves.
I think the path and filename of the images are rather simple to find in a Pascal VOC file. For the labels, I do not know exactly where they are stored, but I believe they are in the object - name structure. This is something you have to check yourself.
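To make the idea of an XML parser concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. It assumes the usual Pascal VOC layout, with a filename element under the root and one name element inside each object element; the sample annotation and the file and class names in it are made up for illustration, so check them against your own files.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, shaped like a typical Pascal VOC file.
SAMPLE_XML = """
<annotation>
  <folder>images</folder>
  <filename>strat_001.jpg</filename>
  <object>
    <name>stratocaster</name>
    <bndbox><xmin>34</xmin><ymin>20</ymin><xmax>410</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>
"""


def voc_filename_and_labels(root):
    """Return (image file name, list of class names) for one <annotation> element."""
    filename = root.findtext("filename", default="")
    # Every <object> carries its class label in a <name> child.
    labels = [obj.findtext("name", default="") for obj in root.iter("object")]
    return filename, labels


def read_voc_file(xml_path):
    """Parse one annotation file from disk."""
    return voc_filename_and_labels(ET.parse(xml_path).getroot())


filename, labels = voc_filename_and_labels(ET.fromstring(SAMPLE_XML.strip()))
print(filename, labels)
```

A get_y function for a DataBlock could then wrap read_voc_file and return either the first label or the whole list, depending on whether the task is single-label or multi-label.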

Otherwise, you could parse the Pascal VOC files before passing anything to a DataBlock, transform them into a DataFrame, and use the ImageDataLoaders.from_df that is included in fastai.

Hope this helps a little bit to a solution!

Hi, thanks for your response. If I understand correctly, in the DataBlock API, I will have to assign the independent variable get_x the folder of images, and the variable get_y the folder where those xml files are. Is it so? Could those tagged images be converted to a csv file?

If you could share an example, this would be much easier than guessing :slight_smile:

Here is an attempt that shows that it is/was possible to load the data ‘easily’: course-v3/pascal.ipynb at master · fastai/course-v3 · GitHub
It is based on Pascal VOC, and even though it uses JSON files to encode the annotations, I think it is a good start.

Alternatively, there is an implementation based on XML at voc_ssd/ at master · PuchatekwSzortach/voc_ssd · GitHub, but it is for plain torch, so please be aware of that.

Nevertheless, the get_x function should return images, the get_y function should return the label(s) assigned. In the Pascal VOC dataset, for example, one image can have multiple names, i.e. labels, with the corresponding bounding boxes.

And I did not understand the second question: why would you want to convert an image to a CSV? Do you mean “can the content of the Pascal VOC result be turned into a CSV”? Or, to be more explicit, can the result of LabelImg be converted to a CSV (that pandas can load)?

They sure can, with an xml parser and knowledge of the Pascal VOC style and what is needed for your task.
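As a rough illustration of that conversion, the sketch below walks a folder of annotation files and writes one CSV row per labelled object. The function name and column names are my own choice, and it assumes each object has a bndbox element with xmin, ymin, xmax and ymax children, as in the usual Pascal VOC layout.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_folder_to_csv(xml_dir, csv_path):
    """Write one CSV row per labelled object; return the number of rows written."""
    rows = []
    for xml_file in sorted(Path(xml_dir).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        # Fall back to the XML file's own stem if <filename> is absent or empty.
        image_name = root.findtext("filename") or xml_file.stem
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            if box is None:
                continue  # skip objects that have no bounding box
            coords = [
                int(float(box.findtext(tag, default="0")))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            ]
            rows.append([image_name, obj.findtext("name", default=""), *coords])

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "label", "xmin", "ymin", "xmax", "ymax"])
        writer.writerows(rows)
    return len(rows)
```

The resulting file can be loaded with pandas. For plain classification, only the filename and label columns are needed, and those are the two columns that ImageDataLoaders.from_df would read.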

Fast ai lession 9 | Kaggle: a Kaggle notebook for using VOC files; I think the loading part remains.
at master · Nitinguptadu/ · GitHub: converting from VOC XML to COCO JSON format. I am not sure how to use this further, but I think going from this to CSV is simple.
Google Colab: Zachary Mueller did some object detection on Pascal VOC in his walkwithfastai course, awesome! This should be useful!

I have two folders: one with .jpg images and the other with XML files, which I guess are the tags of those images. The ‘labelimg’ application takes an input path (the images) and, after each image is labelled, writes its output to a second folder.
I understand the example you have written, but you are talking about json files, not the same as xml files obviously.

The second to last link converts from XML by using ElementTree - with this as a template you should be able to generate the csv you want to have.

I do not have any Pascal VOC files on this computer, but since it sounds interesting, I looked on Stack Overflow (python - create pascol voc xml from csv - Stack Overflow); it is not that hard to find something like this.

I have done the following to try to create a datablock. I have a folder full of guitar pictures. I assign the path to that folder to ‘Path’. And when I ask it to return in ‘slicer’ variable[:7] it returns 0.

Interesting. Maybe read up on the get_image_files method on why this happens.

I cannot judge why this is happening. Having gone through the code of the function (as done here: Understanding get_image_files in fastai), my question now is: what format are your images saved in?
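One way to answer that question without fastai is to check whether the folder exists at the path you passed and which file extensions it actually contains. The helper below is a hypothetical diagnostic of my own, not part of fastai; as far as I understand get_image_files, it only keeps files whose extension is a known image type, so a wrong path or unexpected extensions would both give an empty result.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Return (folder exists?, Counter of lower-cased file suffixes found recursively)."""
    folder = Path(folder)
    if not folder.is_dir():
        return False, Counter()
    suffixes = Counter(p.suffix.lower() for p in folder.rglob("*") if p.is_file())
    return True, suffixes


if __name__ == "__main__":
    # "fender" is a placeholder; use the same path you pass to get_image_files.
    exists, suffixes = summarize_folder("fender")
    print("folder found:", exists)
    print("files by suffix:", dict(suffixes))
```

If the folder is reported as missing, the path is probably relative to a different working directory than you expect.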

Good morning,
I can’t quite understand the process of handling the different types of image data. I used the (Python) tool ‘labelimg’. I labeled each photo with a series of segmentations/tags, and the application saved the result as an XML file in a second folder called ‘labeling-images’ (as you can see in the screenshot).
I’ve been dealing with this data package, but without success, for a couple of days.

Captura de pantalla 2022-07-06 a las 9.40.53
Here I show the source folder of the images.

Captura de pantalla 2022-07-06 a las 9.46.12
Here is the destination folder.

I can’t manage to handle this format with fastai :sleepy:

The only thing I can suggest is to upload your data (the whole fender folder, zipped) somewhere.

I usually do not work with images, I must admit, but getting this working with some base model behind it should be possible in my off time. I would then share it back the same way or something similar.

Maybe you could also create a Colab notebook to show what you have done so far. If you give others access to it, they can change methods and try other tricks, which may make it easier to give you feedback.

I think it’s possible to upload datasets to kaggle and it doesn’t eat up your kaggle workspace, but lives in the dataset repository in public.

The question is that I don’t know how to deal with two folders, one with the images and the other with xml files (I guess they are the labels).

You should maybe look in the documentation for basic questions such as this Computer vision | fastai

Here, the author actually shows what happens when he downloads a dataset, unzips it and sets it as a path. The path contains two folders, he only wants to deal with the images and not care about the annotations, so he continues to show what he is doing.
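For the two-folder layout described in this thread, one possible approach is to pair each image with the XML file that has the same base name and read the label from it. The sketch below assumes that LabelImg named each annotation after its image (for example img1.jpg and img1.xml), that each image shows a single guitar model so the first object name can serve as the class, and that fastai is installed for the last function. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from functools import partial
from pathlib import Path


def annotation_for(image_path, xml_dir):
    """Return the XML path that shares the image's base name (it may not exist)."""
    return Path(xml_dir) / (Path(image_path).stem + ".xml")


def label_for_image(image_path, xml_dir):
    """Return the first object name from the image's annotation file."""
    root = ET.parse(annotation_for(image_path, xml_dir)).getroot()
    first = root.find("object")
    if first is None:
        raise ValueError(f"no <object> in annotation for {image_path}")
    return first.findtext("name", default="")


def images_without_annotation(image_dir, xml_dir):
    """List image files that have no matching XML file, so they can be fixed or dropped."""
    image_suffixes = {".jpg", ".jpeg", ".png"}
    images = [p for p in sorted(Path(image_dir).iterdir())
              if p.suffix.lower() in image_suffixes]
    return [p for p in images if not annotation_for(p, xml_dir).exists()]


def build_dataloaders(image_dir, xml_dir):
    """Sketch of a single-label DataBlock; requires fastai to be installed."""
    from fastai.vision.all import (CategoryBlock, DataBlock, ImageBlock,
                                   RandomSplitter, Resize, get_image_files)
    block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        get_items=get_image_files,
        # get_y receives one image path and looks up its label in the XML folder.
        get_y=partial(label_for_image, xml_dir=xml_dir),
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
        item_tfms=Resize(224),
    )
    return block.dataloaders(image_dir)
```

It is worth running images_without_annotation first, since any image without a matching XML file would make get_y fail when the DataLoaders are built.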