Getting issues with Pandas for Lesson 3 on Rossmann dataset

Nandu · July 29, 2018, 5:24pm

Hi,
I was trying to execute the following line of code in Lesson3-Rossman.ipynb and got the error message below. Can someone please help me out with this? Do we have to update/upgrade pandas or something here? From this line onwards, I am getting errors for most of the code. I am working on Google Colab using Clouderizer.

Code: for t in tables: display(DataFrameSummary(t).summary())

TypeError Traceback (most recent call last)
in ()
----> 1 for t in tables: display(DataFrameSummary(t).summary())

/usr/local/lib/python3.6/dist-packages/pandas_summary/__init__.py in __init__(self, df)
     25         self.df = df
     26         self.length = len(df)
---> 27         self.columns_stats = self._get_stats()
     28         self.corr = df.corr()
     29

/usr/local/lib/python3.6/dist-packages/pandas_summary/__init__.py in _get_stats(self)
     83         counts.name = 'counts'
     84         uniques = self._get_uniques()
---> 85         missing = self._get_missing(counts)
     86         stats = pd.concat([counts, uniques, missing], axis=1, sort=True)
     87

/usr/local/lib/python3.6/dist-packages/pandas_summary/__init__.py in _get_missing(self, counts)
    101         perc = (count / self.length).apply(self._percent)
    102         perc.name = 'missing_perc'
--> 103         return pd.concat([count, perc], axis=1, sort=True)
    104
    105     def _get_columns_info(self, stats):

TypeError: concat() got an unexpected keyword argument 'sort'
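
For readers following along: the TypeError above is Python's standard complaint when a function is called with a keyword it does not define. A minimal sketch, using a hypothetical stand-in named concat (not the real pandas function) whose signature lacks a sort parameter, reproduces the same kind of message:

```python
# Hypothetical stand-in for an older concat signature without `sort`.
# This is NOT pandas code; it only mimics a function that lacks the keyword.
def concat(objs, axis=0):
    return list(objs)


def call_with_sort():
    """Call the stand-in with sort=True and return the error text, if any."""
    try:
        concat([1, 2], axis=1, sort=True)
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_sort()
print(message)
```

If the installed pandas.concat really does lack the sort keyword, the call inside pandas_summary would fail in the same way, which is consistent with the traceback but is not confirmed in this thread.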

Patrick · July 29, 2018, 6:20pm

Is pandas-summary loaded?

Run the following in a Jupyter cell:

import pandas_summary
pandas_summary.__file__

What do you see?
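
A related check, offered as a rough sketch rather than a confirmed diagnosis: the sort keyword was added to pandas.concat in pandas 0.23.0, so an older pandas would raise exactly this TypeError when pandas_summary passes sort=True. The helper names below (version_tuple, supports_concat_sort) are made up for illustration.

```python
import re

# `sort` was added to pandas.concat in pandas 0.23.0.
MIN_PANDAS_FOR_SORT = (0, 23)


def version_tuple(version):
    """Turn a version string such as '0.22.0' into (0, 22)."""
    parts = []
    for piece in version.split(".")[:2]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def supports_concat_sort(pandas_version):
    """Return True if this pandas version should accept concat(..., sort=...)."""
    return version_tuple(pandas_version) >= MIN_PANDAS_FOR_SORT


try:
    import pandas as pd
    import pandas_summary

    print("pandas version:", pd.__version__)
    print("pandas_summary location:", pandas_summary.__file__)
    print("concat should accept sort:", supports_concat_sort(pd.__version__))
except ImportError as exc:
    # Either package may be missing in the current environment.
    print("Could not import:", exc)
```

If the last line prints False, upgrading pandas (for example with pip install --upgrade pandas, then restarting the runtime) would be the usual next step to try.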

Nandu · July 30, 2018, 1:55pm

Hi Patrick,
I tried the above code in a cell, but I am still getting the same error.

Error:

TypeError Traceback (most recent call last)
in ()
----> 1 for t in tables: display(DataFrameSummary(t).summary())

/usr/local/lib/python3.6/dist-packages/pandas_summary/__init__.py in __init__(self, df)
     25         self.df = df
     26         self.length = len(df)
---> 27         self.columns_stats = self._get_stats()
     28         self.corr = df.corr()
     29

/usr/local/lib/python3.6/dist-packages/pandas_summary/__init__.py in _get_stats(self)
     83         counts.name = 'counts'
     84         uniques = self._get_uniques()
---> 85         missing = self._get_missing(counts)
     86         stats = pd.concat([counts, uniques, missing], axis=1, sort=True)
     87

/usr/local/lib/python3.6/dist-packages/pandas_summary/__init__.py in _get_missing(self, counts)
    101         perc = (count / self.length).apply(self._percent)
    102         perc.name = 'missing_perc'
--> 103         return pd.concat([count, perc], axis=1, sort=True)
    104
    105     def _get_columns_info(self, stats):

TypeError: concat() got an unexpected keyword argument 'sort'

Fuhgidabowit · August 2, 2018, 7:46am

Did you uncomment the "concat_" function calls for the weather and google data at the beginning? Apparently you don't need this concat function at all (as far as I understood; at least it is not used).