Original Link: https://musicalai.substack.com/p/musing-1-music-creation-with-ai
Musing #1: Music Creation with AI
Human creativity meets technological progress in Gen AI
July 29, 2023
Music in many ways has been intertwined with the progress of society and technology. We started with our voices, and slowly built other means to create music, from drums to stringed instruments the world over like the sitar, veena, erhu and shamisen, and eventually complex instruments like harpsichords, pianos and harmoniums.
But these were just the beginning. In the last 150 years, since the advent of electricity, electronic instruments have been the in-thing, and most of our music enjoys the benefits of this technological explosion. Of course, that doesn’t mean we don’t enjoy a fancy pianoforte (if the Interstellar theme below is any indication), but even those recordings are often mastered to make them sound perfect (no acoustical blips).
Thanks for reading Muse-ical AI ! Subscribe for free to receive new posts and support my work.
Most of the music we listen to is synthesized in DAWs (digital audio workstations) and mastered to sound perfect to our ears. As a species with computational tools, we have mastered the ability to craft exactly the waveform we want. Yet we still don’t have a perfect objective measure of consonance and dissonance.
So when it comes to creating music from scratch, we often still start with the same fundamentals: chords, melodies and samples. These samples can come from other musicians we jam with, or from songs we listen to every day. Chords and melodies have foundations in Eastern and Western classical music, so we can generally use chord progressions we know are consonant.
Once we have these basic components, we can put any filter on them and modulate the waveforms however we want.
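To make that concrete, here is a minimal sketch of the idea in plain Python. It is not how a DAW works internally. The function names, the 8 kHz sample rate, the one-pole low-pass filter and the tremolo settings are all assumptions I picked for illustration. It builds a C major chord from sine waves, then filters and modulates the waveform.

```python
import math

SAMPLE_RATE = 8000  # samples per second; kept low (an assumption) so the example runs fast


def midi_to_hz(note):
    """Convert a MIDI note number to a frequency in Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def synth_chord(notes, seconds=0.5):
    """Sum one sine wave per note, scaled so samples stay within [-1, 1]."""
    n = int(SAMPLE_RATE * seconds)
    freqs = [midi_to_hz(m) for m in notes]
    return [
        sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f in freqs) / len(freqs)
        for i in range(n)
    ]


def low_pass(samples, alpha=0.2):
    """One-pole low-pass filter: each output moves a fraction alpha toward the input."""
    out, prev = [], 0.0
    for s in samples:
        prev = prev + alpha * (s - prev)
        out.append(prev)
    return out


def tremolo(samples, rate_hz=5.0, depth=0.5):
    """Amplitude modulation: scale the signal by a slow sine wave (an LFO)."""
    return [
        s * (1 - depth * 0.5 * (1 + math.sin(2 * math.pi * rate_hz * i / SAMPLE_RATE)))
        for i, s in enumerate(samples)
    ]


# C major triad (C4, E4, G4), then "put a filter on it" and modulate it.
c_major = synth_chord([60, 64, 67])
shaped = tremolo(low_pass(c_major))
```

Swapping the note list (for example `[57, 60, 64]` for A minor) or the filter settings changes the sound without touching the rest of the chain, which is roughly the workflow described above.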
Now, where does that leave us with AI? Well, AI is definitely a different beast. The mathematical cascade of functions that makes up Transformers, or any deep learning model, was initially meant to emulate the neurons in our brain, but it has created a whole new mechanism for creating music.
Neural networks imitate humans very well but do not think like them. To demonstrate, let’s try a text-to-image decoder, a text-to-speech decoder and a text-to-music decoder with a simple prompt: “a calming beach scene with waves crashing against the shore”.
a calming beach scene with waves crashing against the shore
Agatha speaking “a calming beach scene with waves crashing against the shore” aloud.
“a calming beach scene with waves crashing against the shore” as interpreted by a decoder.
Well, it’s not too bad. Many text and language encoders have a deep understanding of concepts like “calming”, “beach”, etc., since these concepts are quite common in the unsupervised pre-training data of these encoders. And since each of these models uses a variation of those encoders, it’s not surprising to see them doing a good job. Of the three, however, the music is the one that doesn’t quite match what I was looking for. It is chill and repetitive, with a rhythm and beat like music should have, but it doesn’t have that organic wave-crashing sound I was looking for.
It’s safe to say that music as a medium is more complex and nuanced, and language is not often the means by which we describe it. That is, data that describes music in detail is not as widespread, so while language encodings are helpful, mapping them to sequential outputs in a spectrogram is not as straightforward.
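For readers who haven’t met a spectrogram before, here is a rough sketch of what that target looks like as data. It is only an illustration, not the representation any particular music model uses. The function name, the frame size, the hop and the Hann window are my own assumptions. A spectrogram slices audio into short overlapping frames and records how much energy each frequency has in each frame.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds one magnitude per frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * i / (frame_size - 1)))
            for i, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform; fine for a tiny demo, slow for real audio.
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * i / frame_size)
                for i, x in enumerate(windowed)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# A pure 128 Hz tone sampled at 1024 Hz (toy numbers chosen for illustration).
sample_rate = 1024
tone = [math.sin(2 * math.pi * 128 * i / sample_rate) for i in range(256)]
spec = spectrogram(tone)

# Each bin spans sample_rate / frame_size = 16 Hz, so 128 Hz lands in bin 8.
loudest_bins = [frame.index(max(frame)) for frame in spec]
```

Even this toy output is a grid of numbers over time and frequency, so a model has to turn a short phrase like “waves crashing” into many such columns in the right order.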
Musicians create music by “jamming” with each other, going back and forth with different musical samples, with language as a guide to direct attention rather than to fully describe a piece of music.
So... with this in mind, how can we leverage the latest in generative AI? Well, first we should try a small project to test whether it can generate useful music. I’ll start by creating some Lofi Bollywood music, since I enjoy it, and will explore how generative models can help me in the process. I have no idea where this will take me... but I hope we can eventually develop representations that enable musicians to “jam” with AI in a productive way, to expand possibilities and create “better” (as defined by humans) music.
To be continued...