Hi.
In the code below, does the split between training and validation sets influence the effective vocab? In other words, is the vocab built from the train AND validation sets, or from the train set only?
data_lm = TextLMDataBunch.from_df(train_df=df_trn, valid_df=df_val, path="", text_cols=['text'], label_cols=['label'], bs=64)
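To make the question concrete, here is a minimal sketch in plain Python of the two possibilities I mean. It is not fastai's implementation: `build_vocab`, the whitespace tokenization, and the toy texts are all made up for illustration. It only shows which tokens would be missing from the vocab if the validation split were ignored.

```python
from collections import Counter

def build_vocab(texts, min_freq=1):
    """Hypothetical helper: set of whitespace tokens seen at least min_freq times."""
    counts = Counter(tok for text in texts for tok in text.split())
    return {tok for tok, n in counts.items() if n >= min_freq}

# Toy stand-ins for df_trn['text'] and df_val['text'] (made-up data).
train_texts = ["the cat sat", "the dog ran"]
valid_texts = ["the bird sang"]

# Possibility 1: vocab built from the training split only.
vocab_train_only = build_vocab(train_texts)
# Possibility 2: vocab built from both splits.
vocab_train_and_valid = build_vocab(train_texts + valid_texts)

# Tokens that would be out-of-vocabulary under possibility 1.
only_in_valid = vocab_train_and_valid - vocab_train_only
print(sorted(only_in_valid))
```

If the vocab comes from the training split only, tokens like these would presumably be mapped to an unknown token in the validation set, which is why the split could matter.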
Thanks!