One evening, a loud song was playing over and over, which was unusual. My dad revealed that Uncle T had composed it in the style of the band Bon Jovi. This piqued my interest, as Uncle T was a very busy person. However, my dad quickly clarified that Uncle T had used an AI tool called Suno to create the piece.

Although Suno AI uses a proprietary system to generate music from text, some general information about its approach is known. When you submit a prompt, transformer models process the input. These models use a “self-attention” mechanism to weigh the importance of different words in a sentence, which lets the machine capture context and relationships between words regardless of how far apart they are. Once the AI has a representation of the “vibe” and structure of your request, it begins predicting which sounds should come next, much like how a smartphone suggests the next word in a text message. Instead of words, however, it predicts discrete audio tokens that represent acoustic properties like pitch, timbre, and rhythmic placement. These tokens are then passed through a neural audio codec – a neural network trained to compress audio into discrete tokens and to reconstruct audio from them – whose decoder assembles them into a continuous waveform. Consequently, the output sounds like a high-fidelity (“Hi-Fi”) recording rather than a jumble of noises. Because the system uses a large context window to recall what it generated at the beginning of the track, it can maintain long-range dependencies, so that the melody stays consistent and the chorus and motifs return throughout the song.
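To make the idea of “attend to the context, then predict the next token” more concrete, here is a toy sketch in Python. It is not Suno’s code, which is not public. Everything in it is an assumption made for illustration: the tiny vocabulary of eight “audio tokens”, the random embeddings, the function names, and the use of a single attention step with no learned weights. A real system would use trained models and a codec decoder to turn the tokens into sound.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Toy scaled dot-product self-attention (no learned projections).

    Each position scores every position, including distant ones,
    and returns a weighted mix of all the vectors.
    """
    dim = len(vectors[0])
    mixed = []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        mixed.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return mixed


def next_token(context, embeddings, rng):
    """Sample the next token id given every token generated so far."""
    vectors = [embeddings[t] for t in context]
    summary = self_attention(vectors)[-1]  # last position sees the whole context
    logits = [dot(summary, e) for e in embeddings]
    probs = softmax(logits)
    return rng.choices(range(len(embeddings)), weights=probs, k=1)[0]


def generate(prompt_tokens, steps, vocab_size=8, dim=4, seed=0):
    """Autoregressive loop: append one predicted token at a time."""
    rng = random.Random(seed)
    # Hypothetical random embeddings; a trained model would learn these.
    embeddings = [
        [rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)
    ]
    tokens = list(prompt_tokens)
    for _ in range(steps):
        tokens.append(next_token(tokens, embeddings, rng))
    return tokens


if __name__ == "__main__":
    print(generate([0, 1, 2], steps=5))
```

Because the embeddings here are random, the output is only a meaningless list of token ids. The point is the loop itself: each new token is chosen by looking back over everything that came before it.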

I was deflated by my uncle’s use of Suno AI, and a bit annoyed that my dad had conflated an AI’s work with that of a human. While listening to the song, I hunted for oddities, such as a bad chord progression, a repetitive loop, or an unnatural beat. But to my frustration, I couldn’t find any. Maybe I didn’t like Bon Jovi’s style of music, but there seemed to be no obvious musical flaws. More surprising was that all Uncle T had to do was hum a few phrases to get Suno to generate the entire piece. The output even sounded like the lead singer, apparently. Then again, if AI can already compose like Bach and rearrange any children’s song into a symphonic masterpiece, such an achievement should hardly be seen as remarkable.

I kept thinking about it for the next few days: Why did I still feel frustrated? Where did my annoyance at my dad come from? 

Several reasons became clear. First, I found the song’s flawless execution frustrating. The process of composing my own piece of music had been lengthy and full of scrapped ideas. I also suspected that my dad had been an undiscriminating audience all along. I recalled the countless times I’d played the piano for him, assuming that he appreciated all the details I was putting into my music. Yet here he was, repeatedly listening to a track generated by an algorithm, which replicated the style of an authentic musical legend, essentially appropriating it. How could he feel anything for a soulless transaction that ‘expressed’ nothing?

For me, originality matters because it is closely tied to ethics. The act of creation is a manifestation of an individual’s authenticity, identity, and uniqueness. An AI generator cannot assert these qualities when it “produces” content. Rather, it merely responds to the prompts it is given. For now, users are free to regard AI-generated works as entertainment, and they often lose interest quickly. They understand that such works lack the human exploration and struggle inherent in any kind of creation. This is the fundamental distinction between human creativity and algorithm-driven outputs.

But what if it all changes? Our idea of ‘originality’ could shift. Remember our younger days, when the biggest crime was to be a “copycat”? AI tools like Suno encourage people to produce imitations that obscure their links to the originals. In this way, copying a “style” is framed as an acceptable practice, as if it were the next natural step in artistic expression. If we repeat such an act constantly, will our values – those of respecting authenticity and unique identity – stay the same? Once our society adapts to the idea of “fakes,” we might one day have to ask whether original creations will keep their significance or diminish in value in a world filled with imitative works.

I went back to my dad to ask something that was even more important to me: Did he really connect with the song emotionally? His response was a clear no; he was merely impressed by its execution, nothing more.
