ValueError: too many values to unpack (expected 2)

Judywawira · November 19, 2017, 11:07pm

I am running the following code for my model

data = ImageClassifierData.from_csv(PATH, 'boneage-training-dataset',label_csv , bs = 64, tfms=(None,None), val_idxs=val_idxs, suffix='.png',test_name=None, continuous=False, skip_header=True, num_workers=4)

However, I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-95-3b0293571f6a> in <module>()
----> 1 data = ImageClassifierData.from_csv(PATH, 'boneage-training-dataset',label_csv , bs = 64, tfms=(None,None), val_idxs=val_idxs, suffix='.png',test_name=None, continuous=False, skip_header=True, num_workers=4)
      2 learn = ConvLearner.pretrained(arch,data,precompute=True)
      3 
      4 #ImageClassifierData.from_csv(path, folder, csv_fname, bs=64, tfms=(None, None), val_idxs=None, suffix='', test_name=None, continuous=False, skip_header=True, num_workers=8)

~/fastai/courses/dl1/fastai/dataset.py in from_csv(cls, path, folder, csv_fname, bs, tfms, val_idxs, suffix, test_name, continuous, skip_header, num_workers)
    345             ImageClassifierData
    346         """
--> 347         fnames,y,classes = csv_source(folder, csv_fname, skip_header, suffix, continuous=continuous)
    348         ((val_fnames,trn_fnames),(val_y,trn_y)) = split_by_idx(val_idxs, np.array(fnames), y)
    349 

~/fastai/courses/dl1/fastai/dataset.py in csv_source(folder, csv_file, skip_header, suffix, continuous)
     73 
     74 def csv_source(folder, csv_file, skip_header=True, suffix='', continuous=False):
---> 75     fnames,csv_labels,all_labels,label2idx = parse_csv_labels(csv_file, skip_header)
     76     full_names = [os.path.join(folder,fn+suffix) for fn in fnames]
     77     if continuous:

~/fastai/courses/dl1/fastai/dataset.py in parse_csv_labels(fn, skip_header)
     62     skip = 1 if skip_header else 0
     63     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 64     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     65     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     66     label2idx = {v:k for k,v in enumerate(all_labels)}

~/fastai/courses/dl1/fastai/dataset.py in <dictcomp>(.0)
     62     skip = 1 if skip_header else 0
     63     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 64     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     65     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     66     label2idx = {v:k for k,v in enumerate(all_labels)}

ValueError: too many values to unpack (expected 2)

The head of my CSV file is below:

id	boneage	male
1377	180	FALSE
1378	12	FALSE
1379	94	FALSE
1380	120	TRUE

Is it because the file has multiple variables that I get this error? Should I adjust anything in my code or in the CSV-reading section?

Here is how I am reading the labels:

label_csv = f'{PATH}train.csv'
n = len(list(open(label_csv)))-1
val_idxs = get_cv_idxs(n)

Thanks

rsrivastava · November 20, 2017, 12:38am

I am also getting the same error. All I am doing is trying to apply the learning on a sample dataset.

df_sample = df.sample(n=10000)
df_sample.to_csv('data/amazonPlanet/temp/train_v2-SAMPLE.csv')
train_v2_SAMPLE = f'{PATH}train_v2-SAMPLE.csv'
val_idxs_SAMPLE = get_cv_idxs(n)
train_v2_csv = f'{PATH}train_v2.csv'
n = len(list(open(train_v2_csv)))-1
val_idxs = get_cv_idxs(n)
len(val_idxs_SAMPLE)
f_model = resnet34
def get_data_sample(sz):
    tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_top_down, max_zoom=1.1, pad=0, crop_type=None, tfm_y=None)
    return ImageClassifierData.from_csv(PATH, 'train-jpg', train_v2_SAMPLE, tfms=tfms, val_idxs=val_idxs_SAMPLE, suffix='.jpg', test_name='test-jpg', continuous=False, skip_header=True, num_workers=16)

get_data_sample(256)

ValueError                                Traceback (most recent call last)
in <module>()
----> 1 get_data_sample(256)

in get_data_sample(sz)
      2 def get_data_sample(sz):
      3     tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_top_down, max_zoom=1.1, pad=0, crop_type=None, tfm_y=None)
----> 4     return ImageClassifierData.from_csv(PATH, 'train-jpg', train_v2_SAMPLE, tfms=tfms, val_idxs=val_idxs_SAMPLE, suffix='.jpg', test_name='test-jpg', continuous=False, skip_header=True, num_workers=16)

~/fastai/courses/dl1/fastai/dataset.py in from_csv(cls, path, folder, csv_fname, bs, tfms, val_idxs, suffix, test_name, continuous, skip_header, num_workers)
    354             ImageClassifierData
    355         """
--> 356         fnames,y,classes = csv_source(folder, csv_fname, skip_header, suffix, continuous=continuous)
    357         ((val_fnames,trn_fnames),(val_y,trn_y)) = split_by_idx(val_idxs, np.array(fnames), y)
    358

~/fastai/courses/dl1/fastai/dataset.py in csv_source(folder, csv_file, skip_header, suffix, continuous)
     77
     78 def csv_source(folder, csv_file, skip_header=True, suffix='', continuous=False):
---> 79     fnames,csv_labels,all_labels,label2idx = parse_csv_labels(csv_file, skip_header)
     80     full_names = [os.path.join(folder,fn+suffix) for fn in fnames]
     81     if continuous:

~/fastai/courses/dl1/fastai/dataset.py in parse_csv_labels(fn, skip_header)
     66     skip = 1 if skip_header else 0
     67     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 68     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     69     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     70     label2idx = {v:k for k,v in enumerate(all_labels)}

~/fastai/courses/dl1/fastai/dataset.py in <dictcomp>(.0)
     66     skip = 1 if skip_header else 0
     67     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 68     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     69     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     70     label2idx = {v:k for k,v in enumerate(all_labels)}

ValueError: too many values to unpack (expected 2)
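
A quick way to narrow this down is to count the fields on each line the same way the traceback shows `parse_csv_labels` doing it (strip, then split on commas). The sketch below is only a diagnostic: `find_bad_rows` and the sample text are hypothetical and not part of fastai. One possible cause worth checking in the sample-file case is that pandas `DataFrame.to_csv` writes the index as an extra leading column unless `index=False` is passed, which would turn a two-column file into a three-column one.

```python
import io

def find_bad_rows(lines, expected_fields=2):
    """Return (line_number, fields) for each line whose field count differs.

    Mirrors the split used in the traceback: o.strip().split(',').
    `find_bad_rows` is a hypothetical helper, not a fastai function.
    """
    bad = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.strip().split(',')
        if len(fields) != expected_fields:
            bad.append((line_no, fields))
    return bad

# Hypothetical sample with a leading index column in front of the two real columns.
sample = io.StringIO(",image_name,tags\n0,train_0,haze primary\n1,train_1,clear primary\n")
for line_no, fields in find_bad_rows(sample):
    print(line_no, fields)
```

With a real file, passing `open(label_csv)` instead of the `StringIO` object should work the same way, since both yield one line at a time.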

ramesh · November 20, 2017, 12:43am

This is not a CSV. The header and the values need to be comma-separated (not tab- or space-separated).

Judywawira · November 20, 2017, 12:48am

I copied the wrong file. This is my actual file:

id,boneage,male
1377,180,False
1378,12,False
1379,94,False
1380,120,True
1381,82,False
1382,138,True
1383,150,True

ramesh · November 20, 2017, 1:06am

Now I understand. That’s not the format the image classifier expects in the CSV. Each row needs to be:
<image_id>,label

If you remove the male column from the CSV file, it should work fine. I am assuming you are trying to predict the boneage column.

You can also check Yannet’s Notebook on boneage dataset - https://github.com/yanneta/pytorch-tutorials/tree/master/bone-age
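
For anyone hitting the same thing, here is a minimal sketch of producing the two-column `<image_id>,label` layout from the three-column file shown earlier, using only the standard library. The helper name `keep_id_and_label` and its defaults are assumptions for illustration; the column names come from the posted file head.

```python
import csv
import io

def keep_id_and_label(src, dst, id_col="id", label_col="boneage"):
    """Copy only the id and label columns from src to dst.

    Hypothetical helper: src and dst are open text file objects.
    """
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow([id_col, label_col])  # header row, skipped later by skip_header=True
    for row in reader:
        writer.writerow([row[id_col], row[label_col]])

# In-memory stand-ins for the real files, using rows from the post above.
src = io.StringIO("id,boneage,male\n1377,180,False\n1378,12,False\n")
dst = io.StringIO()
keep_id_and_label(src, dst)
print(dst.getvalue())
```

With real files you would pass handles opened with `newline=''` instead of the `StringIO` objects. Note that this drops the male column entirely, so it only covers the single-label case.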

Judywawira · November 20, 2017, 1:24am

But the gender is important. I thought using the multi-label dataset would allow me to pass more than one label per image. Maybe @jeremy will have some suggestions on how to do this.

I am also getting really low accuracy, around 20%, similar to the notebook from Yannet.

memetzgz · November 20, 2017, 1:49am

@Judywawira, what ages are the images from? Perhaps gender is not so important if the images are from young children. Or, if you want to see if gender is important, you could try training separate models for males and females . . .

Judywawira · November 20, 2017, 1:52am

I know gender is important from a clinical point of view: boys and girls grow at different rates and have different references.

Judywawira · November 20, 2017, 11:12pm

@jeremy could you spend some time showing us how to use a CSV with multiple columns for labels to create the data? Mine fails with this error, which goes away when the training is based on a CSV with a single label column, that is, the bone age.

jeremy · November 21, 2017, 12:21am

I’ll try @Judywawira, although it’s possible we may not get to this until part 2. It’s a somewhat advanced topic, and not something that we’ve built in fastai just yet.

rsrivastava · November 26, 2017, 3:29am

I am just trying to copy the class lecture but am getting a "too many values to unpack" error. My Jupyter notebook is attached: StoreSalePrediction.pdf (160.5 KB)

ecdrid · November 26, 2017, 5:39am

Because proc_df can return either 3 or 4 values:

https://github.com/fastai/fastai/blob/master/fastai/structured.py

x , y, nas, mapper

Returns:
    --------
    [x, y, nas, mapper(optional)]:
        x: x is the transformed version of df. x will not have the response variable
            and is entirely numeric.
        y: y is the response variable
        nas: returns a dictionary of which nas it created, and the associated median.
        mapper: A DataFrameMapper which stores the mean and standard deviation of the corresponding continous
        variables which is then used for scaling of during test-time.
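
To illustrate the point, here is a small sketch of unpacking a function that returns either 3 or 4 values. `fake_proc_df` and its `return_mapper` flag are hypothetical stand-ins that only mimic the shape of the return value described in the docstring above; they are not the real fastai function or its signature.

```python
def fake_proc_df(return_mapper=False):
    """Hypothetical stand-in that mimics only the return shape of proc_df."""
    x, y, nas = [[1.0], [2.0]], [0, 1], {}
    result = [x, y, nas]
    if return_mapper:
        result.append("mapper")  # placeholder for the optional fourth value
    return result

# Unpacking into two names fails because at least three values come back.
try:
    x, y = fake_proc_df()
except ValueError as err:
    message = str(err)
    print(message)

# Star-unpacking accepts both the 3-value and the 4-value shape.
x, y, *rest = fake_proc_df()
nas = rest[0]
mapper = rest[1] if len(rest) > 1 else None
```

If you know which shape you are getting, plain unpacking into three (or four) names is simpler; the starred form is just one way to stay tolerant of both.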

rsrivastava · November 27, 2017, 1:57am

Thanks so much Aditya!! I really appreciate your input.

ecdrid · November 27, 2017, 5:21am

It’s my pleasure mam…

rikiya · November 27, 2017, 1:39pm

FYI, here is a blog post with the solution from the winning team of the bone age challenge.

savannahar68 · February 26, 2019, 5:39am

Yes, this solved the problem.
If you are following part 1 of the fast.ai Machine Learning course (random forests) and try to use
x, y = proc_df(df_raw, 'Target')

then it will throw an error. Instead, add one more variable to catch the columns created by proc_df (the nas dictionary):

x, y, nas = proc_df(df_raw, 'Target')

Thank you @ecdrid