When working on deep learning projects, we often have to build on other people's codebases. For example, a scholar releases a new paper with open-source code and you would like to give it a try. The headache is that those codebases come in all sorts of styles and frameworks. (Unluckily, most of them are not written in fastai…)
In some scenarios, only a small subset of the repo is what you need. In other cases, the repo is well written but the codebase is so huge that you have no idea where to start.
From my recent experience, let me take the original CycleGAN and Pix2Pix work as an example. The codebase is well written in the sense that it provides a generic pipeline for both the CycleGAN and Pix2Pix models. It also exposes a whole bunch of hyperparameters that you can specify through different flags. But when I first came across the codebase, I found it a bit intimidating because of its complexity and abundance of flags. I struggled with how I should “interact” with such a codebase and how I should change it to fit my use case. (Not sure if you share the same feeling.)
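For context, the kind of flag-heavy entry point I mean looks roughly like this. This is a minimal sketch with hypothetical flag names, not the actual option list of the CycleGAN/Pix2Pix repo:

```python
import argparse

def get_parser():
    # Hypothetical subset of flags; real repos often define dozens of these,
    # sometimes spread across several "options" modules.
    parser = argparse.ArgumentParser(description="train a GAN model")
    parser.add_argument("--dataroot", type=str, required=True,
                        help="path to the training images")
    parser.add_argument("--model", type=str, default="cycle_gan",
                        help="which model variant to build")
    parser.add_argument("--lr", type=float, default=2e-4,
                        help="initial learning rate")
    parser.add_argument("--n_epochs", type=int, default=100,
                        help="number of training epochs")
    return parser

if __name__ == "__main__":
    opt = get_parser().parse_args()
    print(opt)
```

One hack that helps me get a handle on such repos: call `get_parser().parse_args([...])` with a hand-picked list of flags inside a notebook, so you can build the options object interactively and step through the pipeline without touching the shell.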
For all these reasons, I find it pretty costly to work with such a variety of “legacy codebases”. Hence, I am looking for systematic and efficient ways to work with them.
I find this topic rarely discussed in the machine learning community, but I personally find it pretty significant.
What is your general pipeline when you have to work on others’ repos, especially deep learning repos? Do you have any hacks for this? Would love to hear your thoughts!