Hi!
First post, so it must come with thanks for the amazing resources that FastAI and @jeremy are offering to the world with this course. I started my training a few weeks ago and I'm trying to reproduce the Excel Titanic exercise in Python, as Jeremy suggested in the video.
I don't think I'm too far off, but I'm struggling with a few questions that I wanted to submit to the community.
Here's where I am now:
Parameters definitions

import numpy as np
import pandas as pd
import torch

# Hyperparameters
layers = 2
seed = 42
learning_rate = 0.15

# Params: one weight per feature (9) per "layer", drawn uniformly from [-1, 1)
np.random.seed(seed)
params = np.random.uniform(-1, 1, size=(9, layers))
params = torch.tensor(params)
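To double-check myself, here is a tiny standalone snippet (dummy input, not real Titanic data) confirming the shapes involved:

```python
import numpy as np
import torch

# Same setup as above, with a fake passenger whose 9 feature values are all 1
np.random.seed(42)
params = torch.tensor(np.random.uniform(-1, 1, size=(9, 2)))

x = torch.ones(9, dtype=torch.float64)  # one fake passenger, 9 features
y = torch.matmul(x, params)             # one activation per "layer"
print(params.shape, y.shape)            # torch.Size([9, 2]) torch.Size([2])
```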
Model

from IPython.display import display

def relu(x):
    # Note: gradients are tracked on the inputs here, not on params
    x = torch.tensor(x, requires_grad=True)
    y = torch.matmul(x, params)
    y = torch.clip(y, 0.)
    y = torch.sum(y, dim=0)
    y.backward()
    return [y, x]

def model(df):
    result_matrix = []
    gradient_matrix = []
    for index, passenger in df.iterrows():
        pid = int(passenger["PassengerId"])
        survived = passenger["Survived"]
        passenger_matrix = passenger.drop(["PassengerId", "Survived"])
        result, passenger_matrix = relu(passenger_matrix)
        loss = (result - survived).pow(2)
        # .item() so the rows are plain numbers, not tensors carrying grad
        result_matrix.append([pid, result.item(), loss.item()])
        gradient_matrix.append(passenger_matrix.grad)
    result_matrix = torch.tensor(result_matrix)
    mean_loss = torch.mean(result_matrix[:, 2])  # column 2 holds the loss (column 1 is the prediction)
    mean_gradients = torch.mean(torch.stack(gradient_matrix), dim=0)
    return [mean_loss, mean_gradients]
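For reference, here is what that forward pass does on a made-up 3-feature passenger (all numbers are arbitrary, hand-checkable):

```python
import torch

# Dummy 3-feature version of the forward pass (3 features, 2 "layers")
params = torch.tensor([[1.0, -1.0],
                       [0.5,  0.5],
                       [-2.0, 1.0]])
x = torch.tensor([1.0, 2.0, 1.0], requires_grad=True)

y = torch.matmul(x, params)  # [1*1 + 2*0.5 + 1*(-2), 1*(-1) + 2*0.5 + 1*1] = [0.0, 1.0]
y = torch.clip(y, 0.)        # negative activations floored at 0
result = torch.sum(y)        # single prediction per passenger
print(result.item())         # 1.0
```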
def check_error_rate(validation_df, show_top_losses=False):
    success = 0
    validation_df["Prediction"] = 0
    validation_df["Loss"] = 0
    for index, passenger in validation_df.iterrows():
        passenger_matrix = passenger.drop(["PassengerId", "Survived", "Loss", "Prediction"])
        result, passenger_matrix = relu(passenger_matrix)
        validation_df.at[index, 'Loss'] = (result.item() - passenger["Survived"]) ** 2
        validation_df.at[index, 'Prediction'] = result.item()
        if torch.round(result) == passenger["Survived"]:
            success += 1
    print(f"Success: {success}/{len(validation_df)} - {round((success / len(validation_df)) * 100)}%")
    if show_top_losses:
        print("Top Losses")
        losses = validation_df.sort_values(by='Loss', ascending=False)
        display(losses.head(10))
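The accuracy count in check_error_rate boils down to this (dummy predictions, not model output):

```python
import torch

# Minimal version of the rounding-based accuracy count, on made-up numbers
predictions = torch.tensor([0.2, 0.8, 0.6, 0.4])
targets     = torch.tensor([0.0, 1.0, 0.0, 0.0])

# round() turns a raw prediction into a 0/1 guess, then compare to the target
success = int((torch.round(predictions) == targets).sum())
print(f"Success: {success}/{len(targets)} - {round(success / len(targets) * 100)}%")
# Success: 3/4 - 75%
```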
def fine_tune_params(mean_gradients):
    # Subtract the mean gradient (scaled by the learning rate) from each parameter row
    for i in range(len(params)):
        params[i] -= mean_gradients[i] * learning_rate
def train(df, cycles):
    for i in range(cycles):
        loss, gradients = model(df)
        print(f"Round {i + 1} Loss: {loss}")
        fine_tune_params(gradients)
    check_error_rate(validation_df)
    params_df = pd.DataFrame(params.T)
    display(params_df)
The full notebook is on Kaggle.
And now the questions:
- Looking at my top losses on the validation dataset, I noticed that they are almost all passengers who survived. In fact, after a few rounds, the model no longer predicts any survivors. I guess this is due to the fine-tuning of my params:
def fine_tune_params(mean_gradients):
    for i in range(len(params)):
        params[i] -= mean_gradients[i] * learning_rate
- That part is not explained in the video, which relies on Excel's built-in functions to do the gradient descent. So I tried something based on what is explained earlier, but I'm not sure it is right: I take the mean gradient over all passengers and subtract it (times the learning rate) from each param.
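Stripped of the loop, my update amounts to this (dummy params and gradients, picked so the arithmetic is easy to check):

```python
import torch

learning_rate = 0.15

# Made-up parameter rows and mean gradients
params = torch.tensor([[1.0, 1.0],
                       [2.0, 2.0]])
mean_gradients = torch.tensor([[0.5, 0.5],
                               [1.0, 1.0]])

# Subtract the scaled mean gradient from every parameter
params -= mean_gradients * learning_rate
print(params)
# row 0: 1.0 - 0.5*0.15 = 0.925 ; row 1: 2.0 - 1.0*0.15 = 1.85
```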
- I took the loss function from the Excel sheet. My intuition is that it should somehow feed into the fine-tuning of the parameters: the change in the params should scale with the size of the loss. Yet I'm only using the gradient of the relu's y:
def relu(x):
    x = torch.tensor(x, requires_grad=True)
    y = torch.matmul(x, params)
    y = torch.clip(y, 0.)
    y = torch.sum(y, dim=0)
    y.backward()
    return [y, x]
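To make that intuition concrete: if I called backward() on the loss instead of on y, I believe autograd would scale the gradient by the error term automatically. A tiny hand-checkable example (dummy numbers, clip omitted since y is positive anyway, and I'm not sure this is the intended approach):

```python
import torch

# Dummy data: 2 features, 1 "layer", so the gradient is easy to verify by hand
params = torch.tensor([[1.0], [1.0]], requires_grad=True)
x = torch.tensor([1.0, 2.0])
survived = 1.0

y = torch.matmul(x, params).sum()  # 1*1 + 2*1 = 3.0
loss = (y - survived).pow(2)       # (3 - 1)^2 = 4.0
loss.backward()                    # d(loss)/d(params) = 2*(y - survived) * x

print(params.grad)                 # [[4.0], [8.0]] : x scaled by the error 2*(3-1)=4
```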
- Also, @Jeremy hinted in the video that "it would be cool if we had some way" of constraining the predictions between 0 and 1. The lower bound is already handled by the clip: no result can go below 0, but it can still go above 1. My intuition is that this skews the results towards 0 over time. Should I add a line to cap the results at 1?
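Just to illustrate the cap I have in mind (not necessarily the right fix):

```python
import torch

# Made-up raw outputs
y = torch.tensor([-0.5, 0.3, 1.7])

floored = torch.clip(y, 0.)       # current behaviour: [0.0, 0.3, 1.7]
capped  = torch.clamp(y, 0., 1.)  # proposed: also cap at 1 -> [0.0, 0.3, 1.0]
print(capped)
```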
Thanks for your help, and again, for the amazing work of FastAI and its community.
Kind regards,
Daij