I’ll answer some of your questions:
Question 1: Once you have trained and validated your model and settled on the hyperparameters, it is advisable to retrain the model on the entire training set provided to you. The test set (at Kaggle) contains dates in the future, and your model will be evaluated on those.
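To make the "validate first, then retrain on everything" idea concrete, here is a rough sketch. It is not the code from the script you are asking about: the data is made up, and the "model" is a toy moving average whose window size stands in for a real hyperparameter. The names fit, mean_abs_error and candidate_windows are my own, purely for illustration.

```python
from datetime import date, timedelta

# Toy daily sales series (made-up numbers), already sorted by date.
start = date(2015, 1, 1)
sales = [100, 102, 98, 105, 110, 108, 112, 115, 111, 118, 120, 117]
history = [(start + timedelta(days=i), s) for i, s in enumerate(sales)]


def fit(rows, window):
    """'Train' a toy model: the mean of the last `window` sales values."""
    recent = [s for _, s in rows[-window:]]
    return sum(recent) / len(recent)


def mean_abs_error(prediction, rows):
    """Average absolute error of a constant prediction over `rows`."""
    return sum(abs(s - prediction) for _, s in rows) / len(rows)


# 1) Validate: hold out the most recent dates, which mimics a test set
#    that lies in the future relative to the training data.
train_rows, valid_rows = history[:-3], history[-3:]
candidate_windows = [1, 3, 5]  # the hyperparameter being tuned
best_window = min(
    candidate_windows,
    key=lambda w: mean_abs_error(fit(train_rows, w), valid_rows),
)

# 2) Retrain on the entire training set with the chosen hyperparameter,
#    so the final model also sees the most recent dates.
final_model = fit(history, best_window)
```

The point of step 2 is that the held-out dates are the ones closest to the future test period, so leaving them out of the final fit would throw away the most relevant data.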
Question 2: The code is not meant to be run linearly. When you train, you comment out the test[columns] code, and when you test, you comment out the corresponding training code instead.
Question 4: AFAICT ‘Id’ is not “added” here. Only the Id and Sales columns are selected to be written out to the CSV for submission to Kaggle.
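A small sketch of what "selected, not added" means. The rows and the extra columns (Store, DayOfWeek) are hypothetical, and I use the standard library csv module here only to keep the example self-contained. With a pandas DataFrame df, the equivalent selection would usually be df[["Id", "Sales"]].to_csv(path, index=False).

```python
import csv
import io

# Hypothetical prediction rows; Id is already present alongside other columns.
rows = [
    {"Id": 1, "Store": 1, "DayOfWeek": 4, "Sales": 5263.0},
    {"Id": 2, "Store": 3, "DayOfWeek": 4, "Sales": 6064.5},
]

# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()

# Only the listed fieldnames are written; extrasaction="ignore" drops the rest.
writer = csv.DictWriter(buffer, fieldnames=["Id", "Sales"], extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)

submission_text = buffer.getvalue()
```

Nothing new is created in this step: Id already exists in the data, and the write simply leaves every other column out.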