I think I understand what blending is: simply divide your training set into train and validation sets. Fit any number of models on the train set and run them on the validation set. Depending on their results on the validation set, use some algorithm to figure out what mixture of those models maximizes performance (weighted average, linear regression, whatever algorithm you might choose). Once you have figured out how to combine the outputs of the models, run them on the test set and combine their results based on what you learned on the train / validation sets.
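
To make the description above concrete, here is a minimal sketch of blending as I understand it. Everything in it is an assumption for illustration: the synthetic data, the choice of polynomial fits as base models, and plain least squares (no intercept) as the combiner. It is not taken from the paper or from any particular library's blending API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic regression data, for illustration only.
X = rng.uniform(-3, 3, size=300)
y = np.sin(X) + 0.1 * rng.normal(size=300)

# Split into train / validation / test.
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:225], y[150:225]
X_test, y_test = X[225:], y[225:]

# Step 1: fit the base models on the train set only.
# Here the "models" are polynomial fits of different degree (an assumption).
degrees = [1, 3, 5]
models = [np.polyfit(X_train, y_train, d) for d in degrees]


def base_predictions(x):
    """Return a matrix with one column of predictions per base model."""
    return np.column_stack([np.polyval(coeffs, x) for coeffs in models])


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))


# Step 2: run the base models on the validation set.
P_val = base_predictions(X_val)

# Step 3: learn how to mix them, using linear regression (least squares,
# no intercept) from validation predictions to validation targets.
weights, *_ = np.linalg.lstsq(P_val, y_val, rcond=None)

# Step 4: run the base models on the test set and combine with the learned weights.
P_test = base_predictions(X_test)
blend_test = P_test @ weights

print("blend weights:", weights)
for d, col in zip(degrees, P_test.T):
    print(f"degree {d} test MSE:", mse(col, y_test))
print("blended test MSE:", mse(blend_test, y_test))
```

The combiner only ever sees validation-set predictions, so the base models are not rewarded for overfitting the train set. A weighted average constrained to sum to one would be an equally valid choice for step 3.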

What is stacking, though? I tried googling it and reading the paper referenced below, but I can't seem to wrap my head around how it is different from blending. Even the author admits in the paper that the math makes it seem more difficult than it really is. I couldn't get past the math, so if anyone would be so kind as to shed some additional light on what stacking is, I would really appreciate it.

@davecg and @xinxin.li.seattle shared some interesting resources on model ensembling here.

The original paper on stacking can be found here.