I need to create a properly labeled audio dataset

Hello, for a project I want to train a model that detects the most important musical notes, as well as the scale and melodic motifs, of a song or any other piece of music.
To start the project, I need to create a properly labeled audio dataset.
My question is what to use to label these audio files: an API or library such as librosa or torchaudio (if labeling can be done with these tools), or professional audio editing software such as Amadeus Pro or Audacity? I would appreciate any free online resources for learning how to do this.
Thanks