I have been trying to save the tokenized output from
TextBlock.from_folder to a custom folder rather than the
_tok folder that the method creates internally, but no luck so far. Here, it is mentioned that
output_dir can be used for this purpose, but I couldn’t get it to work.
My code is below:
xt = TextBlock.from_folder(path, output_dir=tok_output_folder, is_lm=True)