This thread is a starting point that I plan to keep up to date as I learn more during the course.
I plan to start this project as soon as we’ve covered NLP in Deep Learning 1 since I intend to use that technology for this project.
Goal of the project: learn from existing music and, given an input (e.g. a melodic line), propose new ideas including further melodies, harmonies, and musical structure.
Great idea! Signing up for updates or possibly teaming up.
I remember this used to be done using Markov models, but that was before the advent of DL. LSTMs and GANs come to mind nowadays. Exciting stuff!
That’s right. For example, David Cope (a Californian composer who had writer’s block and needed a solution) used Markov models (I believe) to emulate his own writing, and then discovered he could make Bach-like music, Mozart-like music, etc. He called the result EMI, and later Emily I think. Very exciting. There’s also a Radiolab podcast where he’s interviewed and explains what he did.
Anyways… MuseNet (OpenAI, 2019) is AFAIK the most impressive example that exists in the Deep Learning / NLP realm. Some other big research houses (incl. Google/Magenta) have done similar work, but IMHO nothing as impressive as MuseNet yet.
I can think of an endless roadmap for this project…
But to start, I’m hoping we’ll touch on some NLP concepts soon, and I want to come up with a super simple proof of concept: train a model on a small set of musical compositions and let it predict new notes based on some input.
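For what it’s worth, the simplest version of that proof of concept doesn’t even need deep learning: treat notes as tokens and learn next-note transition frequencies, in the spirit of the Markov-model approaches mentioned above. A minimal sketch in plain Python (the note names and toy melodies are made up for illustration):

```python
import random
from collections import Counter, defaultdict

# Toy "corpus": melodies as sequences of note names (made-up examples).
corpus = [
    ["C4", "D4", "E4", "C4", "E4", "G4", "E4", "C4"],
    ["C4", "E4", "G4", "C5", "G4", "E4", "C4"],
    ["D4", "E4", "C4", "D4", "E4", "G4", "E4"],
]

# Count how often each note follows each note (a first-order Markov model).
transitions = defaultdict(Counter)
for melody in corpus:
    for prev, nxt in zip(melody, melody[1:]):
        transitions[prev][nxt] += 1

def predict_next(note, rng):
    """Sample the next note from the learned transition frequencies."""
    counts = transitions[note]
    notes, weights = zip(*counts.items())
    return rng.choices(notes, weights=weights, k=1)[0]

def continue_melody(seed, length, seed_value=0):
    """Extend a seed melody by sampling one note at a time."""
    rng = random.Random(seed_value)
    melody = list(seed)
    for _ in range(length):
        melody.append(predict_next(melody[-1], rng))
    return melody

print(continue_melody(["C4", "D4"], 6))
```

A deep-learning version would replace the transition table with a learned model, but the tokenize-then-predict framing stays the same.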
@typehinting Would be great to team up. Where are you based? (I’m in San Francisco.) Any specific interest in this area from your POV? Exciting
Sounds like you have quite the background on this, that’s great! I don’t know much about the state of research. I’m just a music nerd and a machine learning practitioner (from Toronto, by the way).
I will check out MuseNet, and let’s stay in touch as the course progresses!
I’m also interested in this topic! I think you’re right that MuseNet is the most recent significant development here, but it’s worth reading about the (slightly older) Music Transformer from Magenta if you’re going through the research: https://magenta.tensorflow.org/music-transformer
Magenta also put out DrumBot in Dec. 2019, and I think it’s a really incredible thing. As they describe it, it’s “a ‘jamming’ app – one where you and a model play music together, at the same time”: https://magenta.tensorflow.org/drumbot
I’ll throw in https://chordify.net/pages/about/ which does chord recognition from audio.
I wasn’t blown away by their results, but I really like that you can use it on any YouTube video, which makes the corpus you can train on / predict on vast and diverse.
I am interested in this project. It sounds pretty cool to apply NLP to music. Yet I don’t know anything about music. How much should I learn? Where can I start?
I wrote a blog post about LSTMs in which I train a (pretty bad) drummer: https://machinethink.net/blog/recurrent-neural-networks-with-swift/
I think key to a project like this is to find a good dataset. Do you have something in mind already?
Hey @steef, thanks for posting this! I’m also super interested in this topic - for the course, I was hoping to do a project based on jazz composition. I play classical piano, but I’ve been wanting to pick up jazz for a really long time. I figured this would be a great opportunity to dive into it. Would love to collaborate/discuss some time! I’m also based in San Francisco.
Wow! There’s been a lot of interest in this. That’s awesome!
Wrt the dataset, @machinethink: really good question. I don’t know yet what to use. Christine Payne from MuseNet refers to multiple sources, some of which were very large I think, and in her prior work she referred to datasets too, IIRC. There are a couple of “standard” data sources, including the one that Magenta (or a similar project) uses, based on live-recorded MIDI data from an annual piano competition. And then there are massive MIDI file collections that people have used for similar projects in the past.
I’m a little concerned that (1) many of those datasets have various flaws (e.g. differences in handling multiple voices) and (2) some of these are massive which I presume will slow down fast iteration.
So I was wondering if we could find a small, cohesive, simple dataset first, e.g. children’s songs or something like that. Or, as some others have done, just train a model on Bach chorales.
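One cheap way to make a small dataset more cohesive (and to smooth over some of the flaws mentioned above) is to transpose every piece to a common key, so the corpus isn’t fragmented across twelve tonics. A hypothetical sketch over raw MIDI note numbers, assuming we already know each piece’s key (the melody here is made up):

```python
# Transpose melodies (given as MIDI note numbers) to C, so a small corpus
# shares one key. Each tonic's semitone offset from C:
KEY_OFFSETS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}

def transpose_to_c(notes, key):
    """Shift every MIDI note number down by the tonic's offset from C."""
    offset = KEY_OFFSETS[key]
    return [n - offset for n in notes]

# A made-up melody in G major: G4 A4 B4 G4 -> MIDI 67 69 71 67
print(transpose_to_c([67, 69, 71, 67], "G"))  # [60, 62, 64, 60] = C4 D4 E4 C4
```

Key detection itself is the harder part; for Bach chorales the key is usually available in the score metadata, which is another argument for starting there.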
Any other ideas? @machinethink? Others?
@wdhorton Yes, I <3 the Music Transformer. It seems like transformers in general are a major breakthrough for learning on long-form sequences.
@All: thanks so much for all of your interest! Very exciting!
Does anyone know when the Deep Learning course is supposed to touch on NLP? That seems like a good moment to kick this off for real.
NLP is chapter 10. If no chapters are skipped, I guess it will be in two weeks.
Sounds like a very cool project!
As I was telling you, I did some work in this space a while back (~2 years ago), and I recently (as in, today :-)) put the app online:
For reference this is the repo with all the code for the app (and the model, an LSTM-based recurrent neural network):
This is a very short presentation of what I did:
I am writing a blog post describing what I did in more detail.
If I were to redo it today I would use a different model, but the results were already good with LSTMs.
I have previously used the Wikifonia dataset but the legality of this is… questionable. (I received it from them for research purposes many years ago but can’t share it. There are some copies floating around the web but it may not be a good idea to use it.)
I was thinking of taking some inspiration from this project: https://deepjazz.io/.
Something that’s really interesting about it is that it uses only a single MIDI file as its training set. Given such limited training data, the end product is super impressive! I’m hoping to dig more into how exactly the model works.
Some other resources related to this project that other folks might find interesting:
New blog post from OpenAI (generating music from raw audio): https://openai.com/blog/jukebox/
Interestingly, it combines VQ-VAE and Transformer models.
Oh, great share! Thanks! I’ll dig into this after tonight’s study group.
FYI, next week Jeremy is covering NLP. After that I’m hoping to get started with NLP-and-music tests/projects.
Signing up to watch how the project evolves. It’s really interesting. I play music in my spare time, and the idea is great.