Can or will deep learning compose new music, paintings, movie scripts, or lyrics?

ilarum · January 31, 2018, 5:43am

I am wondering if we will advance to a stage where deep learning will eventually enable machines to compose new music, create a totally new painting, write a new script for a movie, or compose lyrics. Like computers working with their own imagination?

Do we think this is possible?

radek · January 31, 2018, 10:24am

I do not know much about imagination, either in general or in the context of computers, and would not be able to speak to that.

But if you are talking about creating music or art, one of the students does amazing things with GANs and her artwork. There have been musical pieces ‘composed’ by algorithms and played by orchestras; you can look them up if you’d like.

This course covers the techniques that can be used to such ends with great success.

jeremy · January 31, 2018, 9:14pm

Much of the best work is happening here: https://magenta.tensorflow.org/

xjdeng · February 1, 2018, 5:50am

As for music, I saw some guy on Youtube create an algorithm that trains itself on MIDI files and composes new music based on tones and rhythms similar to what it’s trained on. MIDIs are very primitive - I’m wondering if the same methodology can extend to, say, WAVs or MP3s.

Even · February 1, 2018, 4:50pm

On the script/storytelling front there’s a very interesting field called ‘Natural Language Generation’ that I’ve just started touching upon. The idea is to generate text from either text, images or structured data. It seems like deep learning is only just starting to be applied there and the results are quite interesting. It’s been very successfully applied for image captioning, but the field goes way beyond that.

I actually wanted to raise this topic area to @jeremy as I think it’s a very interesting one and would love to see it covered in part 2 of the course.

Here are a few papers/articles on the subject:
https://arxiv.org/pdf/1703.09902.pdf (A recent survey; 118 pages)
https://www.kdnuggets.com/2017/05/nlg-natural-language-generation-overview.html (An article focused on corporate solutions)
https://arxiv.org/abs/1707.02633 (Controlling Linguistic Style Aspects in Neural Language Generation)
http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP199.pdf (Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems)

At its core the idea is a really powerful one, especially when transforming structured data to text. In some ways it’s like a language model whose output you can influence/control.

radek · February 1, 2018, 6:23pm

Hey Even - that is a nice survey there! Seems it is just a day old? Well, can’t look at it ATM but saving for later.

DivingStill · February 2, 2018, 4:20am

Here are some artists/researchers using deep learning to recreate the sounds of musicians such as the Beatles and Battles: Dadabots

They link to the relevant research as well.

alessa · March 9, 2018, 1:31pm

Wondering if the fast.ai framework can provide better results than Magenta?

[edit:] I’ve just noticed that we will cover this in part 2, the generative models - applied to images and sentences. So I guess it will be easy to transpose for music as well.

This weekend I am participating in a music generation hackathon, and I was trying to figure out how I could apply fast.ai and whether it’s appropriate for it.

tyoc213 · March 9, 2018, 4:06pm

I don’t have links to papers, but

And for music, I think this would probably be one of the first things that showed up for me in this regard:

https://deepjazz.io/
https://medium.com/@granttimmerman/algo-rhythm-music-composition-using-neural-networks-f89897ff2df7
http://www.electronicbeats.net/the-feed/artificial-intelligence-just-wrote-produced-entire-pop-album/?platform=hootsuite
https://www.infoq.com/presentations/ai-machine-creativity (one of the first videos I saw about the subject)

Gius · March 10, 2018, 11:57am

It would be extremely interesting these days to start up a software package (with deep learning) that allows people to produce a complete, high-budget film without any budget. In fact, cinema is the least reproducible among the arts these days, because a technically good film requires a lot of money and people.

alessa · March 12, 2018, 10:40pm

I participated in my first music hackathon, and I used lesson 4 to generate some nice music files.
Here are the repository and the results: https://github.com/alessaww/fastai_ws/tree/master/musichack

I will write a blog about it some day.

reshama · March 12, 2018, 11:15pm

Today, write a blog about it today (or this week)…

gai · March 13, 2018, 1:04am

Siraj also has a video on deep learning and music. He even interviewed Taryn Southern, who creates music using AI, and she explains what kind of software she uses.

Example of Taryn’s music:

shoof · March 16, 2018, 7:39pm

Definitely interesting work! How did you annotate the key signature in each tune you process? I see the sample file being processed in G major, which means F is F#, but I see the following in the code, key: "K:maj", which is a bit confusing. Does the model learn sharps and flats from the structure of the tune, or does it learn the key names separately from the notes?

alessa · March 19, 2018, 4:42pm

So, unfortunately, I know nothing about music. I was inspired by [this work](https://github.com/IraKorshunova/folk-rnn). You can find some papers where they describe how they cleaned the data and what tokens they used.
What I’ve done is not much: I took the data they provided and trained some models with PyTorch.

The data does indeed come as notes, represented by letters, but I don’t know how to interpret them, so the most I can check is whether the result sounds good or bad.
I’ve also seen that the model always starts with K:maj, no matter how much I train and no matter what the training data contains.

Another thing that is kind of confusing: if you give the first part of a song as input and let the model predict the next sequences, it fails by repeating the same note over and over again.
So there is more research to be done. But I encourage you to play with it, since it’s super easy to train.
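
A note on the repetition described above: in character-level sequence models, one frequent cause is always picking the single most likely next token (greedy decoding). Whether that is what happens in this particular repository is an assumption. Below is a minimal sketch of the usual alternative, sampling with a temperature, using only the Python standard library. `sample_next_token` and its parameters are hypothetical names for illustration, not part of fast.ai or PyTorch.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, rng=random):
    """Pick the index of the next token from unnormalised scores.

    Hypothetical helper for illustration only.
    temperature <= 0 falls back to greedy decoding (always the top score).
    """
    if temperature <= 0:
        # Greedy decoding: deterministic, prone to getting stuck in loops.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Scale the scores, then apply a numerically stable softmax.
    scaled = [score / temperature for score in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]

    # Draw one index in proportion to its softmax weight.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


if __name__ == "__main__":
    scores = [0.1, 2.0, 0.3]  # made-up scores for three tokens
    print(sample_next_token(scores, temperature=0))    # greedy choice
    print(sample_next_token(scores, temperature=1.0))  # random draw
```

Lower temperatures behave more like greedy decoding, while higher ones give more varied (and eventually more random) output.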

shoof · March 19, 2018, 5:26pm

Thanks! Yes, ABC is a common notation format for folk music. Each letter represents a musical note, so C is ‘do’ and D is ‘re’, and so forth. Lower case means a higher octave. The Wikipedia article isn’t that good at explaining how it works, and I found this one better.
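
The letter-to-note mapping described above can be sketched in a few lines of Python. This is a simplified illustration, not a full ABC parser. It assumes the common ABC convention that upper-case `C` is middle C (MIDI number 60), that lower-case letters are one octave higher, and that `'` and `,` shift by further octaves. Accidentals, note lengths, and key signatures such as `K:G` are ignored. `abc_note_to_midi` is a hypothetical helper name.

```python
# Semitone offsets of the natural notes above C.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
MIDDLE_C = 60  # MIDI number of middle C, written as upper-case "C" in ABC


def abc_note_to_midi(token: str) -> int:
    """Convert one ABC pitch token (e.g. "C", "c", "c'", "C,") to a MIDI number.

    Hypothetical helper: accidentals, lengths and key signatures are ignored.
    """
    letter = token[0]
    if letter.upper() not in SEMITONES:
        raise ValueError(f"not an ABC note letter: {letter!r}")

    pitch = MIDDLE_C + SEMITONES[letter.upper()]
    if letter.islower():
        pitch += 12  # lower case means one octave higher
    pitch += 12 * token.count("'")  # each apostrophe raises an octave
    pitch -= 12 * token.count(",")  # each comma lowers an octave
    return pitch


if __name__ == "__main__":
    for tok in ["C", "D", "c", "c'", "C,"]:
        print(tok, abc_note_to_midi(tok))
```

With this mapping, `C` gives 60 and `c` gives 72. Under a G major key signature every F would additionally be raised by one semitone, which this sketch does not handle.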

I think some of the decisions made on cleaning are worth digging deeper into for sure. This will be a side project for part 2 then. I happen to play Irish music, so it would be particularly interesting to see how ‘good’ the model is by how it sounds, in addition to the common metrics.

shoof · March 19, 2018, 5:34pm

Thanks for sharing the article! Didn’t know much about this field but sounds fascinating for sure.

murphybridget · January 8, 2024, 6:24am

It is highly possible, though not any time soon. AI models are only as good as the data they get. Furthermore, current AI-made images and art seem to be very unrealistic.

usernotabuser · January 8, 2024, 7:27pm

Hi,

For most of you, I don’t need to introduce this video, so heading straight to the point of this thread, I have forwarded to the relevant part:

https://youtu.be/6avJHaC3C2U?t=2580

Regards

murphybridget · January 10, 2024, 9:09am

Thank you for sharing the video link.