ImageCleaner missing argument in lesson 2 download notebook

Facing the same issue with Colab

Same issue with Colab as well. My workaround was to connect Colab to Google Drive:

from google.colab import drive
drive.mount('/content/gdrive')

Then download all the images to Drive, and manually clean up the dataset in the Drive view.
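
In case it helps, here is a rough sketch of the "download all the images to Drive" step. The folder names and the URL file (urls_black.csv, a bears folder) are assumptions borrowed from the lesson notebook; download_images is the fastai v1 helper the lesson uses:

from fastai.vision import *

# Data lives on the mounted Drive (folder names are assumptions).
path = Path('/content/gdrive/My Drive/fastai_data/bears')
dest = path/'black'
dest.mkdir(parents=True, exist_ok=True)

# urls_black.csv is a hypothetical file of image URLs, as in the lesson notebook.
download_images(path/'urls_black.csv', dest, max_pics=200)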

Using

ds, idxs = DatasetFormatter().from_toplosses(learn, ds_type=DatasetType.Train)

I’m able to see the images by simply indexing ds.x[idxs[0]], but with a large dataset, comparing images visually becomes cumbersome.
Please share how you retrieved the absolute paths.

I just compared them visually in Drive and discarded the “bad” ones. Maybe…

ds.to_df().iloc[idxs[:10]]
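
If you want the actual file paths rather than eyeballing the DataFrame, something along these lines should work (a sketch; in fastai v1, ds.x.items holds the underlying Path objects):

# Absolute paths of the ten highest-loss images, given ds and idxs
# from DatasetFormatter().from_toplosses(...) as above.
top_paths = [ds.x.items[int(i)] for i in idxs[:10]]
for p in top_paths:
    print(p)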


Yeah, I don’t want to do the visual comparison; just get the absolute paths.

This is kind of an annoying workaround if you have a low-performance laptop running Linux.

  1. Install Anaconda 3, then install fastai. This should be enough to run the Jupyter notebook locally.
  2. Use amatic’s suggestion of mounting Google Drive (then set the path to 'gdrive/fastai_data' or something similar):
     1. Run the notebook in Colab up to the point where you want to run the cleaner.
     2. Download the data and the notebook from Google Drive to your local machine.
     3. Run the notebook locally until you have created the learn object, but don’t do any training (that is the part you don’t want to run locally on a low-performance machine).
     4. Then skip to learn.load('stage-2'). This should work just fine, since the weights should have been downloaded with the data.
     5. Run the cleaner locally.
     6. Upload cleaned.csv to Google Drive.
     7. Reload the dataset using cleaned.csv and continue working (a sketch of this step follows the list).
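
For the last step, reloading from cleaned.csv usually looks roughly like this (a sketch: the path, transforms, and sizes are assumptions, but ImageDataBunch.from_csv with csv_labels='cleaned.csv' is the standard fastai v1 call the lesson uses):

from fastai.vision import *

path = Path('gdrive/fastai_data')  # folder containing cleaned.csv and the images

np.random.seed(42)
data = ImageDataBunch.from_csv(path, folder='.', csv_labels='cleaned.csv',
                               valid_pct=0.2, ds_tfms=get_transforms(),
                               size=224, num_workers=4).normalize(imagenet_stats)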

I’m trying to run ImageCleaner on a folder containing ['train', 'valid'],
with two categories in each ('mgg', 'met'), rather than a CSV file.
Do I need to run the cleanup for each folder separately?

I tried several configurations without success, e.g. running:

pathC = 'data/hggmetN2/train/hgg/'
ImageCleaner(ds, idxs, pathC)

It crashes on:

I’d appreciate your help!
Moran

I took care of this by setting a variable with the path.

path = ds.x.path

and then passing that to ImageCleaner:

ImageCleaner(ds, idxs, path)

You can see what’s in the dataset variable by printing it, if you want to display your path:

print(ds)
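
Putting the pieces together, a minimal end-to-end sketch (this assumes a trained learn object already exists; the fastai.widgets import is what provides DatasetFormatter and ImageCleaner in fastai v1):

from fastai.vision import *
from fastai.widgets import *

# Build the formatter from the training (or validation) set, then clean.
ds, idxs = DatasetFormatter().from_toplosses(learn, ds_type=DatasetType.Train)
path = ds.x.path              # folder the images were loaded from
print(ds)                     # the dataset summary includes its path
ImageCleaner(ds, idxs, path)  # writes cleaned.csv into that folder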


Thanks for your reply. I tried the following:
ds, idxs = DatasetFormatter().from_toplosses(learn, ds_type=DatasetType.Valid)
path = ds.x.path
ImageCleaner(ds, idxs, path)

where path is the main folder, containing the valid and train folders.
print(ds)

The cleaned.csv file was created in my path folder, but the ImageCleaner GUI does not pop up, and I get the following output:

What did I do wrong?
Thanks a lot
Moran

I’m not sure. I see that in the notebook in the GitHub repo, someone fixed the missing path parameter in the ImageCleaner call. Maybe try pulling to get the updated notebook?

Thanks!

Try running your notebook in Jupyter Notebook, not Jupyter Lab. That is https://localhost:8080/tree, not https://localhost:8080/lab, which I suppose opens by default. Running the ImageCleaner widget in Jupyter Lab is not currently supported.

Hi, is there a way to get the top-loss URLs (file paths) out of this:

losses, idxs = interp.top_losses()

Or to use the idxs to delete these pics from the dataset (without using the widget)?
Many thanks

@MichalAha

I am facing a similar problem as well. Did you find a solution?

I’m trying the method of downloading the notebook to run locally (macOS), but I run into a ModuleNotFoundError when trying to import from fastai. What am I doing wrong? There’s probably something obvious I’m missing, but I’m super new to fastai/Python in general. Thanks!

I had an error when running
from google.colab import drive in my Jupyter notebook on my local machine:

ModuleNotFoundError: No module named 'google.colab'

How do you deal with that situation? Thanks @jonesey

That command is only for notebooks in Google Colaboratory. If you want access locally, see the steps in this article:
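
Another option is simply to guard the import, so the same notebook runs both on Colab and locally (a sketch; the local folder name is an assumption):

from pathlib import Path

try:
    from google.colab import drive   # only available inside Colab
    drive.mount('/content/gdrive')
    path = Path('/content/gdrive/My Drive/fastai_data')
except ModuleNotFoundError:
    path = Path('data/fastai_data')  # running locally: use a plain local folder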


I am actually having trouble opening the lesson2-download notebook :frowning:
I get this error when I try to open the notebook:
Unreadable Notebook: /home/jupyter/tutorials/fastai/course-v3/nbs/dl1/lesson2-download.ipynb NotJSONError('Notebook does not appear to be JSON: '{\n "cells": [\n {\n "cell_type": "m…')

Here’s what I have already tried:

  • try to open the notebook on GitHub (fails to load, but that could be an inability of GitHub to render the notebook)
  • download the notebook from GitHub and try to render it on my local laptop (doesn’t work, gives the same Unreadable Notebook error)

I pulled the latest code, but this error seems to persist.

You can use an older version:

git checkout b5c7563f5d0d7b7413590d645660e285ded1765f

This is a temporary workaround. It’s a commit from yesterday (2019-06-26), so nothing much is lost by using it. Seems to work for me.

EDIT: this is for the “Unreadable Notebook” problem.

You can do it like this to see a list of file paths:

losses,idxs = interp.top_losses()
data.valid_ds.x.items[idxs]
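
And if, as asked above, you want to drop those images without the widget, something like this should work (a sketch; it deletes files on disk, which is irreversible, and you would need to rebuild the DataBunch afterwards):

losses, idxs = interp.top_losses()

# Inspect before deleting: paths of, say, the ten worst images.
worst = [data.valid_ds.x.items[int(i)] for i in idxs[:10]]
for p in worst:
    print(p)

# Permanently remove them from disk, then recreate `data` and the learner.
for p in worst:
    p.unlink()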