Platform: Kaggle Kernels

Hi @init_27, I just uploaded the IMDB-WIKI face dataset (from https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/) to Kaggle. I can see the folders and files in the “Your Dataset” section on Kaggle.

However, when I try to use this data in one of my kernels, I get errors.

Here are a few screenshots:

  1. The data as it appears in the “Your Dataset” section:
    [screenshot]

  2. In the Kaggle Kernel:
    [screenshot]

  3. It is visible in the kernel as well:
    [screenshot]

  4. But when I try to extract it with this code:
    [screenshot]

When I do, I get the error below:
[screenshot]

I have tried everything I can think of, but nothing seems to work. Could you please help me with this?

Best Regards
Abhik

Thanks, this worked.

After the pretrained model has been downloaded, is there a way to run the learner while not connected to the internet? To make a kernel submission on Kaggle, a notebook has to be disconnected from the internet. But when it’s disconnected, the kernel restarts, and when you run the learner again it tries to connect to the internet to download the pretrained model.

Hey @abhikjha, you don’t have to extract it. It is directly accessible. Try this: declare a variable like path = "../" (two dots and a slash) and then keep pressing Tab. You will be able to access your files with something like path = "../input/imdb_crop/imdb_crop/00/..." and so on. Hope this helps.
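If you’d rather browse in code than with tab completion, here’s a minimal sketch using plain pathlib (nothing fastai-specific; the helper name is mine) that lists what sits under a dataset directory such as ../input on Kaggle:

```python
from pathlib import Path

def list_dataset_entries(root):
    """Return the sorted names of the entries directly under a dataset
    directory, e.g. Path('../input') on a Kaggle kernel."""
    root = Path(root)
    if not root.exists():
        return []
    return sorted(p.name for p in root.iterdir())

# On Kaggle this would show your uploaded datasets:
# list_dataset_entries('../input')
```

Reading files in place like this avoids extracting anything; the dataset is mounted read-only under ../input.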

Hey @Mauro, read the thread from here :slight_smile:

Thanks Dipam! It worked… :smile:

Thanks Dipam, I did. It didn’t work for him either.

It worked

src = (ImageList.from_csv(path=path, csv_name='/kaggle/input/train_v2.csv', folder='train-jpg', suffix='.jpg')
       .random_split_by_pct(0.2).label_from_df(label_delim=' '))

So I labeled it exactly as you said. I got this from the official Jupyter notebook: https://nbviewer.jupyter.org/github/fastai/course-v3/blob/master/nbs/dl1/lesson3-planet.ipynb

I am wondering why batch size doesn’t seem to affect the training time in lesson 1. I’ve been varying the batch size between bs=8 and bs=256 and keep getting about the same training time, around 9 minutes: https://www.kaggle.com/nemilentsau/fork-of-fast-ai-v3-lesson-1 Shouldn’t training time decrease as the batch size increases?

Probably because your num_workers is 0. Kaggle used to have a memory problem, which is why it was set to 0, but I think that has been fixed, so try setting num_workers to something like 8.
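If you’d rather derive the worker count from the machine than hard-code 8, here’s a small illustration (this helper is mine, not a fastai API, and note that Kaggle GPU kernels only expose a couple of CPU cores):

```python
import os

def suggest_num_workers(cpu_count=None):
    """Pick a data-loading worker count: use all available cores but
    one (leaving one for the main process), and never fewer than one."""
    n = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, n - 1)

# e.g. data = src.databunch(bs=bs, num_workers=suggest_num_workers())
```

On a two-core Kaggle kernel this would suggest a single worker, which is consistent with num_workers having little effect there.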

Hey Mauro, I was just going through a kernel and it seems the author managed to use a pretrained model in a Kaggle competition without having to connect to the internet. Link here. Read the kernel and the comments. I’m going to try to understand how he did it tonight, but it looks like he uploaded the resnet50 weights as a dataset.
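If that’s the trick, it presumably works because torch checks its local checkpoint cache before downloading. Here’s a hedged sketch of the copy step; the weights file name and the cache location below are assumptions and depend on your torch/fastai versions:

```python
import shutil
from pathlib import Path

def install_weights(src, cache_dir):
    """Copy a weights file (uploaded as a Kaggle dataset) into the
    directory where torch looks for downloaded checkpoints, so model
    creation finds it locally instead of hitting the network."""
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)
    dest = cache_dir / Path(src).name
    shutil.copyfile(src, dest)
    return dest

# Hypothetical usage on Kaggle (both paths depend on your uploaded
# dataset and torch version -- check where your torch caches models):
# install_weights('../input/resnet50/resnet50-19c8e357.pth',
#                 '~/.torch/models')
```

After the copy, creating the learner with the matching architecture should pick up the cached file without any download.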

This doesn’t seem to make much of a difference. Increasing num_workers slightly improves the training time, but it barely changes how training time depends on the batch size. For example, with num_workers=4 (the optimal number; training is slower with num_workers=8), the training time is 1m29s for batch_size=128 and 1m41s for batch_size=8.

Hi, I would like to contribute by adding notes to the notebook. Is there a way to do this?

Kaggle GPU kernels are limited to two CPU cores. The speed limitation is probably due to the fastai image transformations, which run on the CPU. That said, Kaggle still uses Tesla K80s, which are not the fastest anymore. Google upgraded Colab to T4s (although a Colab GPU instance only gets one CPU core). Perhaps Google will give Kaggle the same upgrade soon.

When going through the Lesson 3 CamVid notebook, there is a block with commented-out code:

# path = Path('./data/camvid-small')

# def get_y_fn(x): return Path(str(x.parent)+'annot')/x.name

# codes = array(['Sky', 'Building', 'Pole', 'Road', 'Sidewalk', 'Tree',
#     'Sign', 'Fence', 'Car', 'Pedestrian', 'Cyclist', 'Void'])

# src = (SegmentationItemList.from_folder(path)
#        .split_by_folder(valid='val')
#        .label_from_func(get_y_fn, classes=codes))

# bs=8
# data = (src.transform(get_transforms(), tfm_y=True)
#         .databunch(bs=bs)
#         .normalize(imagenet_stats))

Coming from a previous notebook, where it was mentioned that some code is commented out because of Kaggle’s restriction on the number of downloads, I thought this was a similar case and tried uncommenting and running the code, which was confusing. It was only after close inspection that I found this code is completely unrelated (or is it useful?).

If it isn’t used, I would vote for deleting the cell. Is it possible?

Thanks for the wonderful notebooks! :slight_smile:

The Planet notebook contains the following code:

src = (ImageItemList.from_csv(path, 'train_v2.csv', folder='train-jpg', suffix='.jpg')
      .random_split_by_pct(0.2)
      .label_from_df(sep=' '))

but the API has changed since, so the code should be updated to:

src = (ImageList.from_csv(path, 'train_v2.csv', folder='train-jpg', suffix='.jpg')
      .split_by_rand_pct(0.2)
      .label_from_df(label_delim=' '))

I know that maintainers are busy and I would love to contribute!

cc @init_27

For those having problems saving work on Kaggle, or hitting an error when running learn.lr_find(), check this blog post for directions on how to solve it:

https://medium.com/machine-learning-demystify/how-to-effectively-make-savings-in-kaggle-workspace-a0bbf8636ce7?sk=26fcc639d111b522a72ee71e80613954

Hi,

I am facing the same problem and refreshing does not help. Here’s the full text of the error if it helps.

---------------------------------------------------------------------------
gaierror                                  Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/urllib3/connection.py in _new_conn(self)
    140             conn = connection.create_connection(
--> 141                 (self.host, self.port), self.timeout, **extra_kw)
    142 

/opt/conda/lib/python3.6/site-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     59 
---> 60     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
     61         af, socktype, proto, canonname, sa = res

/opt/conda/lib/python3.6/socket.py in getaddrinfo(host, port, family, type, proto, flags)
    744     addrlist = []
--> 745     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    746         af, socktype, proto, canonname, sa = res

gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    600                                                   body=body, headers=headers,
--> 601                                                   chunked=chunked)
    602 

/opt/conda/lib/python3.6/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    345         try:
--> 346             self._validate_conn(conn)
    347         except (SocketTimeout, BaseSSLError) as e:

/opt/conda/lib/python3.6/site-packages/urllib3/connectionpool.py in _validate_conn(self, conn)
    849         if not getattr(conn, 'sock', None):  # AppEngine might not have  `.sock`
--> 850             conn.connect()
    851 

/opt/conda/lib/python3.6/site-packages/urllib3/connection.py in connect(self)
    283         # Add certificate verification
--> 284         conn = self._new_conn()
    285 

/opt/conda/lib/python3.6/site-packages/urllib3/connection.py in _new_conn(self)
    149             raise NewConnectionError(
--> 150                 self, "Failed to establish a new connection: %s" % e)
    151 

NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7f5785904748>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    448                     retries=self.max_retries,
--> 449                     timeout=timeout
    450                 )

/opt/conda/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    638             retries = retries.increment(method, url, error=e, _pool=self,
--> 639                                         _stacktrace=sys.exc_info()[2])
    640             retries.sleep()

/opt/conda/lib/python3.6/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    387         if new_retry.is_exhausted():
--> 388             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    389 

MaxRetryError: HTTPSConnectionPool(host='s3.amazonaws.com', port=443): Max retries exceeded with url: /fast-ai-imageclas/oxford-iiit-pet.tgz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5785904748>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
<ipython-input-13-0dcb1b68c103> in <module>()
----> 1 path = untar_data(URLs.PETS); path

/opt/conda/lib/python3.6/site-packages/fastai/datasets.py in untar_data(url, fname, dest, data)
    118     dest = Path(ifnone(dest, url2path(url, data)))
    119     if not dest.exists():
--> 120         fname = download_data(url, fname=fname, data=data)
    121         tarfile.open(fname, 'r:gz').extractall(dest.parent)
    122     return dest

/opt/conda/lib/python3.6/site-packages/fastai/datasets.py in download_data(url, fname, data)
    111     if not fname.exists():
    112         print(f'Downloading {url}')
--> 113         download_url(f'{url}.tgz', fname)
    114     return fname
    115 

/opt/conda/lib/python3.6/site-packages/fastai/core.py in download_url(url, dest, overwrite, pbar, show_progress, chunk_size, timeout)
    162     if os.path.exists(dest) and not overwrite: return
    163 
--> 164     u = requests.get(url, stream=True, timeout=timeout)
    165     try: file_size = int(u.headers["Content-Length"])
    166     except: show_progress = False

/opt/conda/lib/python3.6/site-packages/requests/api.py in get(url, params, **kwargs)
     73 
     74     kwargs.setdefault('allow_redirects', True)
---> 75     return request('get', url, params=params, **kwargs)
     76 
     77 

/opt/conda/lib/python3.6/site-packages/requests/api.py in request(method, url, **kwargs)
     58     # cases, and look like a memory leak in others.
     59     with sessions.Session() as session:
---> 60         return session.request(method=method, url=url, **kwargs)
     61 
     62 

/opt/conda/lib/python3.6/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    531         }
    532         send_kwargs.update(settings)
--> 533         resp = self.send(prep, **send_kwargs)
    534 
    535         return resp

/opt/conda/lib/python3.6/site-packages/requests/sessions.py in send(self, request, **kwargs)
    644 
    645         # Send the request
--> 646         r = adapter.send(request, **kwargs)
    647 
    648         # Total elapsed time of the request (approximately)

/opt/conda/lib/python3.6/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    514                 raise SSLError(e, request=request)
    515 
--> 516             raise ConnectionError(e, request=request)
    517 
    518         except ClosedPoolError as e:

ConnectionError: HTTPSConnectionPool(host='s3.amazonaws.com', port=443): Max retries exceeded with url: /fast-ai-imageclas/oxford-iiit-pet.tgz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5785904748>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

Thanks!

Please double-check that the internet option is enabled for your kernel. The stack trace ends in a ConnectionError caused by a name-resolution failure, which usually means the kernel has no internet access.

Hi Everyone!
I’m a bit tied up at the moment, since I’ve just moved cities. Would anyone be interested in helping update the notebooks?

I’d be happy to continue it myself, but I’m swamped, so it might take a little longer.

Thanks in Advance!