The issue with this piece of code is that the “Before” values are negative. I have built upon your idea with the following:
for o in ['After']:
    for p in columns:
        a = o+p
        df_train.loc[df_train[a]<0, a] = 0
        df_test.loc[df_test[a]<0, a] = 0
for o in ['Before']:
    for p in columns:
        a = o+p
        df_train.loc[df_train[a]<-500, a] = -df_train['After' + p].max()
        df_test.loc[df_test[a]<-500, a] = -df_train['After' + p].max()
This seems to work well. It sets all negative values in the After fields to 0, and replaces any Before value below -500 with the negative of the largest After value in the training set, i.e. the maximum distance to a future event. I am surprised that more people are not commenting on this issue. The notebook in its current state does not work for me.
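For anyone who wants to try the same idea outside the notebook, here is a minimal self-contained sketch of it. The helper name `clean_elapsed`, the `threshold` parameter, and the `Promo` column with its toy values are my own assumptions for illustration; they are not from the notebook. It returns cleaned copies instead of modifying the frames in place, and it takes the replacement value from the training set only, so train and test get the same sentinel.

```python
import pandas as pd

def clean_elapsed(train, test, columns, threshold=-500):
    """Hypothetical helper: return cleaned copies of train and test."""
    train, test = train.copy(), test.copy()
    for p in columns:
        after, before = "After" + p, "Before" + p
        # Negative "After" values are set to 0.
        train[after] = train[after].clip(lower=0)
        test[after] = test[after].clip(lower=0)
        # Sentinel comes from the (already clipped) training column only.
        floor = -train[after].max()
        for df in (train, test):
            # Replace "Before" values below the threshold with the sentinel.
            df[before] = df[before].mask(df[before] < threshold, floor)
    return train, test

# Toy data, made up for illustration.
train = pd.DataFrame({"AfterPromo": [-3, 10, 40], "BeforePromo": [-1000, -5, -20]})
test = pd.DataFrame({"AfterPromo": [-1, 7], "BeforePromo": [-9999, -2]})

train_clean, test_clean = clean_elapsed(train, test, ["Promo"])
print(train_clean)
print(test_clean)
```

With real data you would pass `df_train`, `df_test` and the same `columns` list used above.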