TabularLine back to pandas dataframe

muellerzr · May 29, 2019, 8:51pm

Sure! I mostly wanted it for my own custom implementation of plot_top_losses() for tabular data. Here’s what I came up with:

import pandas as pd
from IPython.display import display

def plot_top_losses(self, k, largest=True, return_table:bool=False):
    "Shows the respective rows in top_losses along with their prediction, actual, loss, and probability"
    tl_val, tl_idx = self.top_losses(k, largest=largest)
    classes = self.data.classes
    cat_names = self.data.x.cat_names
    cont_names = self.data.x.cont_names
    # Pass a flat list of column names; a nested list would create a MultiIndex
    df = pd.DataFrame(columns=['Prediction', 'Actual', 'Loss', 'Prob'] + cat_names + cont_names)
    for i, idx in enumerate(tl_idx):
        da, cl = self.data.dl().dataset[idx]
        cl = int(cl)
        t1 = str(da)  # <- string form of the row
        t1 = t1.split(';')  # <- one "name value" piece per column
        arr = []
        arr.append(classes[self.pred_class[idx]])
        arr.append(classes[cl])
        arr.append(f'{self.losses[idx]:.2f}')
        arr.append(f'{self.probs[idx][cl]:.2f}')
        for x in range(len(t1)-1):  # <- skip the last piece, which follows the final ';'
            _, value = t1[x].rsplit(' ', 1)  # <- keep only the value, drop the column name
            arr.append(value)  # <-
        df.loc[i] = arr  # <-
    display(df)
    if return_table: return df

Focus specifically on the lines where t1 and cl are used (the ones marked with <-). Let me know if you have questions!
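If you want to try the string-parsing step on its own, here is a minimal sketch. It assumes the row prints as "name value; name value; " with a trailing separator, which is what the code above relies on; the helper name parse_row and the sample string are made up for illustration and are not part of fastai.

```python
import pandas as pd

def parse_row(text):
    """Parse 'name value; name value; ' into a {name: value} dict (hypothetical helper)."""
    row = {}
    for field in text.split(';'):
        field = field.strip()
        if not field:  # skip the empty piece left after the trailing ';'
            continue
        # Split on the last space only, so the column name stays in one piece
        name, value = field.rsplit(' ', 1)
        row[name] = value
    return row

# Made-up sample in the assumed "name value; " format
sample = "workclass Private; education HS-grad; age 0.7632; "
df = pd.DataFrame([parse_row(sample)])
print(df)
```

Note that every value comes back as a string, and a categorical value that itself contains a space would be split in the wrong place, so treat this as a quick workaround rather than a robust conversion.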