Share your work here ✅

Hey, I am playing around with the same competition, and I am confused about how to create the validation set.
I see that you have used valid_pct=0.2. From what I understand, this argument randomly takes 20% of the data from the train folder and holds it out as the validation set. Is this the correct way to do it? I ask only because this post mentions that no driver should appear in both the training and validation sets, and valid_pct=0.2 will not ensure that. So what is the correct approach here?
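
To make the concern concrete, here is a minimal sketch of a driver-level split in plain Python. It is not necessarily how the original notebook does it. It assumes each image can be paired with a driver id; the `records` list, the `split_by_driver` helper, and the file and driver names are all hypothetical and only illustrate the idea of holding out whole drivers instead of random images.

```python
import random

def split_by_driver(records, valid_pct=0.2, seed=42):
    """Hold out whole drivers so no driver appears in both splits.

    records: list of (image_name, driver_id) pairs (hypothetical layout).
    valid_pct: fraction of *drivers* (not images) to hold out.
    """
    # Collect the unique driver ids and shuffle them reproducibly.
    drivers = sorted({driver for _, driver in records})
    rng = random.Random(seed)
    rng.shuffle(drivers)

    # Pick at least one driver for validation.
    n_valid = max(1, round(len(drivers) * valid_pct))
    valid_drivers = set(drivers[:n_valid])

    # Every image follows its driver into exactly one split.
    train = [r for r in records if r[1] not in valid_drivers]
    valid = [r for r in records if r[1] in valid_drivers]
    return train, valid, valid_drivers

# Toy data (made up): 10 drivers with 5 images each.
records = [(f"img_{d}_{i}.jpg", f"p{d:03d}") for d in range(10) for i in range(5)]
train, valid, valid_drivers = split_by_driver(records)

train_drivers = {driver for _, driver in train}
print(len(train), len(valid), train_drivers & valid_drivers)
```

Note that valid_pct here is a fraction of drivers, so the share of images in validation will only be roughly 20% when drivers have different numbers of images. The resulting validation file list could then be passed to whatever splitter the training library offers for explicit file lists.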
