Model score orchestra to piano reduction

Hi, I am trying to create a learning model with fastai. My goal is to convert an orchestral score into a piano score, what is technically called a ‘reduction’.
The first thing I am thinking about is the type of data I should use and, related to this, which approach to take (vision, NLP, etc.).
I have a dataset that contains orchestral scores, each paired with its piano reduction.
The dataset is distributed across several folders according to the origin of the files:

  • Each folder contains subfolders indexed with numbers.
  • A CSV file with the same name as the folder contains metadata about the files in that folder.
  • In each indexed subfolder, there are two MIDI files and their corresponding CSV files.
  • The MIDI files are an orchestration and its corresponding piano version.
  • The CSVs contain the mapping between the MIDI track names and normalized instrument names, e.g., Beethoven_classical_archives/32/beet9m2.csv and Beethoven_classical_archives/32/symphony_9_2_orch.csv.
Could someone give me some advice for this task? Also, if anyone is interested in the project, we could work on it as a team.
I would be very grateful!

Hi Silvino
WARNING: I have no musical ability, so this may be of no help whatsoever.
A VAE encoder takes an image and produces a smaller image which, when run back through the VAE decoder, produces a similar image. So a 28x28 picture is reduced to 7x7 and then returned to 28x28. This means you could compress by a factor of 4x4 = 16 before emailing a friend who also has the VAE decoder.
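The shape arithmetic above can be sketched without any training. This toy numpy sketch uses average-pooling down and nearest-neighbour upsampling back — it is not a real VAE (no learned weights, no sampling), just an illustration of the 16x compression:

```python
import numpy as np

def encode(img, factor=4):
    """Toy 'encoder': average-pool a 28x28 image down to 7x7."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def decode(latent, factor=4):
    """Toy 'decoder': nearest-neighbour upsample 7x7 back to 28x28 (lossy)."""
    return latent.repeat(factor, axis=0).repeat(factor, axis=1)

img = np.random.rand(28, 28)
latent = encode(img)   # 28*28 = 784 values compressed to 7*7 = 49
recon = decode(latent) # same shape as the input, detail lost
print(latent.shape, recon.shape)  # (7, 7) (28, 28)
```

A trained VAE replaces these fixed pooling operations with learned convolutions, so the reconstruction keeps much more of the original detail.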

So you have lots of MIDI streams, one for each instrument, and you want to compress them down to one instrument, the piano, and then maybe de-reduce back to a full orchestra.

So imagine a 2D picture with each instrument on its own staff, and then the same 2D picture for the piano.
Make the length of the picture equal to a musical phrase. Repeat for sub-phrases, motifs and sentences.
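That 2D "picture" is usually called a piano roll. A minimal sketch, assuming the notes have already been extracted from the MIDI files into hypothetical (instrument, pitch, start_step, end_step) tuples:

```python
import numpy as np

def piano_roll(events, n_instruments, n_steps, n_pitches=128):
    """Binary piano roll: one 2D (pitch x time) 'picture' per instrument.

    `events` is a list of (instrument, pitch, start_step, end_step) tuples,
    a hypothetical intermediate format extracted from the MIDI files.
    """
    roll = np.zeros((n_instruments, n_pitches, n_steps), dtype=np.uint8)
    for inst, pitch, start, end in events:
        roll[inst, pitch, start:end] = 1
    return roll

# Two instruments over a 16-step phrase: a melody and a held bass note.
events = [(0, 60, 0, 4), (0, 64, 4, 8), (1, 48, 0, 16)]
roll = piano_roll(events, n_instruments=2, n_steps=16)
print(roll.shape)  # (2, 128, 16)
```

Stacking the instruments along the first axis gives exactly the multi-channel "image" a vision model (or the VAE above) could consume, with the piano reduction as a 1-instrument target roll.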

Run the 2D pictures through the VAE to produce the compressed piano version.

Eventually you would have a model that takes the larger score and produces the piano.

My guess is it would not work because the MIDI piano can only play so many notes at once, but then I suppose human composers would not flood listeners with too many instruments at once either.

Regards Conwyn

Thanks for your answer; it is inspiring and I think I can apply some of it to my idea. But the first challenge I have to face now is labelling the data (in my case, the MIDI files), and it is difficult because the reduction from orchestra to piano involves a series of technical, harmonic and pianistic decisions that make the task quite complicated. For example, when making a reduction you have to eliminate instruments that are simply voices doubled from others, and at the same time know what is structural and what is essential in terms of texture and melody. So, for now I am stuck in the mud with the data issue :stuck_out_tongue_winking_eye:
Best regards!

Hello I found your post while searching as I’m interested in the same goal. Maybe we can team up?
I have some ideas about how to get this working but it depends on the data set.

Hello, how are you?
I am starting part two of Jeremy’s course (stable diffusion…), so I am not an ML expert. I am a conservatory music teacher, and my interests in ML are in the field of music education and musicological research. I started to develop the idea of being able to reduce orchestral scores to piano and vice versa. I searched the web for resources to implement this but found very little.
However, I found datasets from an IRCAM project (LOP database) that could be useful to start developing a possible application.
For work reasons, when I wrote the last post, which you answered, I had to abandon the project idea. Now might be a good time to take it up again.
Regarding the dataset the README says:
This is a database for projective orchestration learning. It contains scores of orchestral pieces and their piano reduction (or equivalently piano pieces and their orchestration). Orchestrations and reductions have been performed by famous composers or composition teachers.

Scores are in the midi format.

The database is split into several folders according to the origin of the files:

  • each folder (liszt_classical_archives/), contains subfolders indexed by number.
  • a csv file with the same name as the folder contains metadata about the files contained in this folder (liszt_classical_archives.csv)
  • in each indexed subfolder, there are two midi files and corresponding csv files (liszt_classical_archives/32)
  • midi files are an orchestration and its piano version (liszt_classical_archives/32/beet9m2.mid and liszt_classical_archives/32/symphony_9_2_orch.mid)
  • csv contains the mapping between the name of the midi tracks and a normalized nomenclature for the instrument names
    (liszt_classical_archives/32/beet9m2.csv and liszt_classical_archives/32/symphony_9_2_orch.csv)
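Walking that layout and pairing each orchestration with its reduction can be done with the standard library alone. A minimal sketch of an indexer, assuming the track-mapping CSVs are two columns (track name, normalized instrument) — the exact column layout is an assumption:

```python
import csv
from pathlib import Path

def index_lop(root):
    """Collect (piece_folder, midi_paths, track_maps) from the LOP layout
    described above: root/<origin>/<number>/ holds two .mid files, each
    with a matching .csv mapping track names to normalized instruments."""
    pairs = []
    for origin in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for piece in sorted(p for p in origin.iterdir() if p.is_dir()):
            midis = sorted(piece.glob("*.mid"))
            maps = {}
            for m in midis:
                csv_path = m.with_suffix(".csv")
                if csv_path.exists():
                    with open(csv_path, newline="") as f:
                        maps[m.name] = {row[0]: row[1]
                                        for row in csv.reader(f) if row}
            if len(midis) == 2:  # one orchestration + one piano version
                pairs.append((piece, midis, maps))
    return pairs
```

The normalized instrument names from the CSVs are what you would use to decide which tracks are doublings versus structural voices before building any model input.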

The first thing to do is to choose the learning model to use. Based on the datasets I have mentioned, perhaps NLP is the most appropriate approach to implement. But I have doubts: perhaps it would be necessary to retouch the dataset and to deal in more depth with nuances that are not established explicitly in the score, besides other problems related to timing, suppression of voices, aesthetics, etc.
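If the task is framed as NLP, i.e. sequence-to-sequence "translation" from orchestra tokens to piano tokens, the MIDI must first be turned into a token vocabulary. A minimal sketch with a hypothetical NOTE_ON/NOTE_OFF/SHIFT vocabulary (real projects typically use richer encodings that also carry velocity, tempo and instrument):

```python
def tokenize(notes):
    """Turn (pitch, start, duration) notes into an event-token sequence.

    Hypothetical minimal vocabulary: NOTE_ON_p, NOTE_OFF_p, SHIFT_t
    (SHIFT advances the clock by t time steps).
    """
    events = []
    for pitch, start, dur in notes:
        events.append((start, f"NOTE_ON_{pitch}"))
        events.append((start + dur, f"NOTE_OFF_{pitch}"))
    events.sort()
    tokens, clock = [], 0
    for t, tok in events:
        if t > clock:
            tokens.append(f"SHIFT_{t - clock}")
            clock = t
        tokens.append(tok)
    return tokens

print(tokenize([(60, 0, 2), (64, 2, 2)]))
# ['NOTE_ON_60', 'SHIFT_2', 'NOTE_OFF_60', 'NOTE_ON_64', 'SHIFT_2', 'NOTE_OFF_64']
```

With both scores encoded this way, the orchestra sequence becomes the "source language" and the piano reduction the "target language" for a standard seq2seq or transformer model.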
I take this opportunity to invite more people from the forum who might be interested in the subject to form a team and be able to work collegially.
Greetings