ValueError: too many values to unpack (expected 2)

I am running the following code for my model

data = ImageClassifierData.from_csv(PATH, 'boneage-training-dataset',label_csv , bs = 64, tfms=(None,None), val_idxs=val_idxs, suffix='.png',test_name=None, continuous=False, skip_header=True, num_workers=4)

However, I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-95-3b0293571f6a> in <module>()
----> 1 data = ImageClassifierData.from_csv(PATH, 'boneage-training-dataset',label_csv , bs = 64, tfms=(None,None), val_idxs=val_idxs, suffix='.png',test_name=None, continuous=False, skip_header=True, num_workers=4)
      2 learn = ConvLearner.pretrained(arch,data,precompute=True)
      3 
      4 #ImageClassifierData.from_csv(path, folder, csv_fname, bs=64, tfms=(None, None), val_idxs=None, suffix='', test_name=None, continuous=False, skip_header=True, num_workers=8)

~/fastai/courses/dl1/fastai/dataset.py in from_csv(cls, path, folder, csv_fname, bs, tfms, val_idxs, suffix, test_name, continuous, skip_header, num_workers)
    345             ImageClassifierData
    346         """
--> 347         fnames,y,classes = csv_source(folder, csv_fname, skip_header, suffix, continuous=continuous)
    348         ((val_fnames,trn_fnames),(val_y,trn_y)) = split_by_idx(val_idxs, np.array(fnames), y)
    349 

~/fastai/courses/dl1/fastai/dataset.py in csv_source(folder, csv_file, skip_header, suffix, continuous)
     73 
     74 def csv_source(folder, csv_file, skip_header=True, suffix='', continuous=False):
---> 75     fnames,csv_labels,all_labels,label2idx = parse_csv_labels(csv_file, skip_header)
     76     full_names = [os.path.join(folder,fn+suffix) for fn in fnames]
     77     if continuous:

~/fastai/courses/dl1/fastai/dataset.py in parse_csv_labels(fn, skip_header)
     62     skip = 1 if skip_header else 0
     63     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 64     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     65     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     66     label2idx = {v:k for k,v in enumerate(all_labels)}

~/fastai/courses/dl1/fastai/dataset.py in <dictcomp>(.0)
     62     skip = 1 if skip_header else 0
     63     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 64     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     65     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     66     label2idx = {v:k for k,v in enumerate(all_labels)}

ValueError: too many values to unpack (expected 2)

The head of my CSV file is shown below:

id	boneage	male
1377	180	FALSE
1378	12	FALSE
1379	94	FALSE
1380	120	TRUE

Is it because the file has multiple variables that I get this error? Should I adjust anything in my code or in the section that reads the CSV?

Here is how I am reading the labels:

label_csv = f'{PATH}train.csv'
n = len(list(open(label_csv)))-1
val_idxs = get_cv_idxs(n)

Thanks

I am also getting the same error. All I am doing is trying to apply the learning on a sample dataset.

df_sample = df.sample(n=10000)
df_sample.to_csv('data/amazonPlanet/temp/train_v2-SAMPLE.csv')
train_v2_SAMPLE = f'{PATH}train_v2-SAMPLE.csv'
val_idxs_SAMPLE = get_cv_idxs(n)
train_v2_csv = f'{PATH}train_v2.csv'
n = len(list(open(train_v2_csv)))-1
val_idxs = get_cv_idxs(n)
len(val_idxs_SAMPLE)
f_model = resnet34
def get_data_sample(sz):
    tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_top_down, max_zoom=1.1, pad=0, crop_type=None, tfm_y=None)
    return ImageClassifierData.from_csv(PATH, 'train-jpg', train_v2_SAMPLE, tfms=tfms, val_idxs=val_idxs_SAMPLE, suffix='.jpg', test_name='test-jpg', continuous=False, skip_header=True, num_workers=16)

get_data_sample(256)


ValueError                                Traceback (most recent call last)
in ()
----> 1 get_data_sample(256)

in get_data_sample(sz)
      2 def get_data_sample(sz):
      3     tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_top_down, max_zoom=1.1, pad=0, crop_type=None, tfm_y=None)
----> 4     return ImageClassifierData.from_csv(PATH, 'train-jpg', train_v2_SAMPLE, tfms=tfms, val_idxs=val_idxs_SAMPLE, suffix='.jpg', test_name='test-jpg', continuous=False, skip_header=True, num_workers=16)

~/fastai/courses/dl1/fastai/dataset.py in from_csv(cls, path, folder, csv_fname, bs, tfms, val_idxs, suffix, test_name, continuous, skip_header, num_workers)
    354             ImageClassifierData
    355         """
--> 356         fnames,y,classes = csv_source(folder, csv_fname, skip_header, suffix, continuous=continuous)
    357         ((val_fnames,trn_fnames),(val_y,trn_y)) = split_by_idx(val_idxs, np.array(fnames), y)
    358 

~/fastai/courses/dl1/fastai/dataset.py in csv_source(folder, csv_file, skip_header, suffix, continuous)
     77 
     78 def csv_source(folder, csv_file, skip_header=True, suffix='', continuous=False):
---> 79     fnames,csv_labels,all_labels,label2idx = parse_csv_labels(csv_file, skip_header)
     80     full_names = [os.path.join(folder,fn+suffix) for fn in fnames]
     81     if continuous:

~/fastai/courses/dl1/fastai/dataset.py in parse_csv_labels(fn, skip_header)
     66     skip = 1 if skip_header else 0
     67     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 68     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     69     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     70     label2idx = {v:k for k,v in enumerate(all_labels)}

~/fastai/courses/dl1/fastai/dataset.py in <dictcomp>(.0)
     66     skip = 1 if skip_header else 0
     67     csv_lines = [o.strip().split(',') for o in open(fn)][skip:]
---> 68     csv_labels = {a:b.split(' ') for a,b in csv_lines}
     69     all_labels = sorted(list(set(p for o in csv_labels.values() for p in o)))
     70     label2idx = {v:k for k,v in enumerate(all_labels)}

ValueError: too many values to unpack (expected 2)

This is not a CSV. The header and the values need to be separated by commas, not by tabs or spaces.

I copied the wrong file. This is my actual file:

id,boneage,male
1377,180,False
1378,12,False
1379,94,False
1380,120,True
1381,82,False
1382,138,True
1383,150,True

Now I understand. That's not the format the image classifier expects in the CSV. Each row needs to be:
<image_id>,label

If you remove the male column from the CSV file, it should work fine. I am assuming you are trying to predict the boneage column.
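
To illustrate the point above, here is a minimal sketch in plain Python (no fastai needed). It reuses the dict comprehension from the traceback to show why a three-column row fails, then writes a two-column copy of the file and parses that instead. The helper names `parse_rows` and `keep_two_columns` and the temporary file names are made up for this example; they are not part of fastai.

```python
import csv
import os
import tempfile

def parse_rows(rows):
    # Same unpacking pattern as the failing line in the traceback:
    # every row must contain exactly two fields.
    return {a: b.split(' ') for a, b in rows}

def keep_two_columns(src, dst, id_col='id', label_col='boneage'):
    # Hypothetical helper: copy only the id and label columns to a new CSV.
    with open(src, newline='') as f_in, open(dst, 'w', newline='') as f_out:
        reader = csv.DictReader(f_in)
        writer = csv.writer(f_out)
        writer.writerow([id_col, label_col])
        for row in reader:
            writer.writerow([row[id_col], row[label_col]])

# Three fields per row cannot be unpacked into (a, b).
three_col = [['1377', '180', 'False'], ['1378', '12', 'False']]
try:
    parse_rows(three_col)
    error_message = None
except ValueError as e:
    error_message = str(e)
print(error_message)

# Write a small three-column file, then reduce it to two columns.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'train.csv')
dst = os.path.join(tmp_dir, 'train_two_col.csv')
with open(src, 'w', newline='') as f:
    f.write('id,boneage,male\n1377,180,False\n1378,12,False\n')
keep_two_columns(src, dst)

# Parse the reduced file the same way the traceback code does (header skipped).
with open(dst) as f:
    two_col = [line.strip().split(',') for line in f][1:]
labels = parse_rows(two_col)
print(labels)
```

The reduced file can then be passed as the label CSV in place of the original one.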

You can also check Yannet’s Notebook on boneage dataset - https://github.com/yanneta/pytorch-tutorials/tree/master/bone-age

But gender is important. I thought using the multi-label dataset would allow me to pass more than one label per image. Maybe @jeremy will have some suggestions on how to do this.

I am also getting really low accuracy, around 20%, similar to the notebook from Yannet.

@Judywawira, what ages are the images from? Perhaps gender is not so important if the images are from young children. Or, if you want to see if gender is important, you could try training separate models for males and females . . .
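
One way to try the separate-models idea is to split the label file by the `male` column first. The sketch below is only an illustration in plain Python: the function name `split_by_gender` is made up, and the column names are taken from the file shown earlier in the thread. Each resulting list could then be written out as its own two-column CSV.

```python
import csv
import io

def split_by_gender(csv_text, id_col='id', label_col='boneage', gender_col='male'):
    # Hypothetical helper: returns two lists of (id, label) pairs,
    # one for rows where the gender column is true and one for the rest.
    males, females = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        is_male = row[gender_col].strip().lower() == 'true'
        target = males if is_male else females
        target.append((row[id_col], row[label_col]))
    return males, females

sample = "id,boneage,male\n1377,180,False\n1378,12,False\n1380,120,True\n"
males, females = split_by_gender(sample)
print(males)
print(females)
```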

I know gender is important from a clinical point of view: boys and girls grow at different rates and have different references.

@jeremy, could you spend some time showing us how to use a CSV with multiple label columns to create the data? Mine fails with this error, which goes away when the training is based on a CSV with a single label column, namely the bone age.

I’ll try @Judywawira, although it’s possible we may not get to this until part 2. It’s a somewhat advanced topic, and not something that we’ve built in fastai just yet.

I am just trying to copy the class lecture but am getting a "too many values to unpack" error. My Jupyter notebook is attached: StoreSalePrediction.pdf (160.5 KB)

Because proc_df can return either 3 or 4 values:

x, y, nas, mapper

Returns:
    --------
    [x, y, nas, mapper(optional)]:
        x: x is the transformed version of df. x will not have the response variable
            and is entirely numeric.
        y: y is the response variable
        nas: returns a dictionary of which nas it created, and the associated median.
        mapper: A DataFrameMapper which stores the mean and standard deviation of the corresponding continuous
        variables, which is then used for scaling during test time.
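
The effect of a function that returns 3 or 4 values can be reproduced without fastai. In the sketch below, `fake_proc_df` is a hypothetical stand-in, not the real `proc_df`; the `do_scale` flag controlling the optional fourth value is an assumption made for the illustration. It shows the failing two-name unpacking, the three-name fix, and starred unpacking as a way to accept either length.

```python
def fake_proc_df(do_scale=False):
    # Hypothetical stand-in, NOT the real fastai proc_df:
    # returns 3 values, or 4 when the (assumed) do_scale flag is set.
    res = ['x', 'y', 'nas']
    if do_scale:
        res = res + ['mapper']
    return res

# Two names on the left, three values on the right: this raises ValueError.
try:
    x, y = fake_proc_df()
    error_message = None
except ValueError as e:
    error_message = str(e)
print(error_message)

# Matching the number of names to the number of returned values works.
x, y, nas = fake_proc_df()

# Starred unpacking collects whatever comes after the first two values.
x, y, *rest = fake_proc_df(do_scale=True)
print(rest)
```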

Thanks so much Aditya!! I really appreciate your input.

It’s my pleasure mam…

Here is a blog post with the solution from the winning team of the bone age challenge, FYI.

Yes, this solved the problem.
If you are following part 1 of the fast.ai Machine Learning course (random forests) and trying to use
x, y = proc_df(df_raw, 'Target')

then it will throw an error. Instead, add one more variable to catch the columns that were created by the proc_df function:

x, y, nas = proc_df(df_raw, 'Target')

Thank you @ecdrid