Implementation methods for reading 100K+ files in folders

I’m trying to read 100K files (Office-type documents).
I’m planning to use Apache Tika to extract their text.

To build a dictionary, I will first have to read all 100K files.
The folder structure will be part of the dictionary too.
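
To make the question concrete, here is a rough sketch of what I have in mind. It is in Python only for brevity and rests on several assumptions: "dictionary" is taken to mean a term-to-count mapping, `build_dictionary`, `iter_files` and `read_plain_text` are hypothetical names, and `read_plain_text` is just a placeholder for wherever the Apache Tika extraction call would go.

```python
import os
import re
from collections import Counter
from typing import Callable, Iterator

TOKEN_RE = re.compile(r"\w+")


def read_plain_text(path: str) -> str:
    """Placeholder extractor (assumption): stands in for the Tika call."""
    with open(path, "r", encoding="utf-8", errors="ignore") as fh:
        return fh.read()


def iter_files(root: str) -> Iterator[str]:
    """Yield every file path under root, one at a time."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def build_dictionary(
    root: str,
    extract_text: Callable[[str], str] = read_plain_text,
) -> Counter:
    """Build a term -> count mapping from file contents and folder names.

    Files are processed one by one, so only the running counts are
    kept in memory, not the text of all files.
    """
    counts: Counter = Counter()
    for path in iter_files(root):
        # Folder names (relative to root) become dictionary terms too.
        # Note: each folder name is counted once per file it contains.
        folders = os.path.dirname(os.path.relpath(path, root))
        if folders:
            for folder in folders.split(os.sep):
                counts.update(TOKEN_RE.findall(folder.lower()))
        try:
            text = extract_text(path)
        except Exception:
            # Skip the contents of files that cannot be parsed.
            continue
        counts.update(TOKEN_RE.findall(text.lower()))
    return counts
```

Calling `build_dictionary("/path/to/docs")` would return the counts for the whole tree. What I am unsure about is whether a single sequential pass like this is reasonable for 100K files, or whether the extraction step should be parallelized or batched.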

Can you shed some light on the best way to achieve this?