I think I have spotted a bug in the get_sample function. It always returns the same sample (the first n rows of the dataframe), when it should return a random sample. This is the source code for the function:
    def get_sample(df,n):
        idxs = sorted(np.random.permutation(len(df)))
        return df.iloc[idxs[:n]].copy()
np.random.permutation(len(df)) shuffles the row positions of the dataframe, but applying sorted to the whole permutation restores the original ordering 0, 1, ..., len(df)-1. As a result, idxs[:n] always selects the first n rows. The first line of the function body should be:
    idxs = np.random.permutation(len(df))
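
To make the problem easy to reproduce, here is a minimal sketch that runs both versions side by side. The names get_sample_buggy and get_sample_fixed and the toy dataframe are my own, made up for illustration; only the two function bodies come from the code above.

```python
import numpy as np
import pandas as pd


def get_sample_buggy(df, n):
    # sorted() undoes the shuffle, so idxs is always 0, 1, ..., len(df)-1
    idxs = sorted(np.random.permutation(len(df)))
    return df.iloc[idxs[:n]].copy()


def get_sample_fixed(df, n):
    # Keep the shuffled order and take the first n shuffled positions
    idxs = np.random.permutation(len(df))
    return df.iloc[idxs[:n]].copy()


# Toy dataframe (hypothetical data, just for the demonstration)
df = pd.DataFrame({"x": range(100)})

# The buggy version returns rows 0..4 on every call
buggy_always_head = all(
    get_sample_buggy(df, 5).index.tolist() == [0, 1, 2, 3, 4]
    for _ in range(20)
)

# The fixed version should draw different rows across calls
fixed_draws = {tuple(get_sample_fixed(df, 5).index) for _ in range(20)}

print(buggy_always_head)     # True
print(len(fixed_draws) > 1)  # whether the fixed samples vary between calls
```

If the sampled rows need to stay in their original relative order, one option is to sort after slicing instead, i.e. sorted(idxs[:n]), which keeps the sample random while preserving row order.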