A single sentence can generate the melody you desire.

2024-05-14 10:39:14

A new era of intelligent music creation is beginning. All it takes is a simple typed command, such as: “Please write me a song filled with joy/sorrow.” After a brief wait, the complete song you asked for, with melody, vocals, and lyrics, is ready.

With the continuous advancement of artificial intelligence technology, AI music software products have sprung up like bamboo shoots after a rain. “Gege AI,” a product that has drawn considerable attention, is one of them. The company behind it recently raised a large funding round from well-known investment institutions.

“Gege AI” was launched in August 2022 by a company called Melody Spark, which positions itself in the field of AI music composition. The Melody Spark team is made up of engineers from major internet companies and experienced producers from the music industry. Team members have been developing AI music composition tools since 2016 and have built up considerable startup experience.

The technological leap around 2022, particularly the breakthroughs based on the Transformer architecture, greatly inspired the team. They firmly believe that this technology will truly change the face of the music industry. After the company was established in 2023, Melody Spark began by training the underlying models. Following a period of trial operation, it completed development of a standalone application in April 2024, which has now officially launched.

Gege AI’s grand goal is “to make everyone a musician.” The software is designed for all music-loving users and features an extremely intuitive and easy-to-use interface. Upon entering the application, users can choose from three modes: Free Mode, Surprise Mode, and Pure Music Mode. By communicating with the “AI producer,” users can generate their own exclusive music pieces.

In Free Mode, users only need to input a sentence as a prompt, and AI can compose a song with a specific theme, complete melody, and vocals. For example, after selecting freedom and courage as the themes, the application quickly generated a piece called “Song of Hope,” which is 2 minutes and 20 seconds long and comes with complete lyrics.

The melody and arrangement of the song are harmonious, with rich variations. Although there might be occasional issues with lyric fluidity or sentence breaks, users can manually adjust and optimize the content of the lyrics. Even without musical theory knowledge, users can easily adjust the pitch of every word through simple operations.

For those with higher standards, “Gege AI” also offers a “Surprise Mode,” which can create commercial-quality music pieces. Moreover, users have the option of a paid service to replace the AI-generated voice with their own, further personalizing their musical creation.

In this mode, users can make more detailed requests about their preferred musical style and the specific instrumentation. The AI can compose complete songs with a verse, chorus, bridge, and other sections. The generated vocals are also more refined and varied. From the use of vibrato to the transitions between pitches, everything sounds more natural and closer to a real human voice, reducing the so-called “AI feel.”

In these modes, users spend a certain amount of free credits to generate musical works. If users are satisfied with the AI-created demo and wish to expand or modify it, such as by changing the musical style, they can communicate with the “AI Producer” through a dialog box and pay a fee to have the AI create a more complete piece. Currently, the platform offers three paid tiers: 18 yuan, 48 yuan, and 98 yuan per month.

Wang Shupei, Chief Operating Officer of Melody Spark, said that “Surprise Mode” is built on an end-to-end large model. To ensure high musical quality, this mode does not yet support specific modifications to lyrics and vocals. Compared with many AI music generation products on the market, he said, this technical approach has significant advantages.

In the earlier stage of applying AI to music creation, the common output was MIDI music, a digitized representation of the melody, which often resulted in monotonous, mechanical-sounding monophonic melodies. As AI technology matured, music creation began to rely on small trained models and expert systems. In this approach, a large number of music clips are manually annotated, and separate small models are trained for different “subjects” such as melody, lyrics, and arrangement, with each part produced by its own model. The resulting pieces are then stitched together to form a complete song. The drawback of this method is that the parts are disconnected from one another, which leaves a mechanical feel and makes it difficult for the music to sound truly harmonious.
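To make the MIDI description above concrete, the following minimal Python sketch represents a melody as a list of note events. The `NoteEvent` class and the helper names are hypothetical and are not taken from Gege AI; only the MIDI note numbering (60 is middle C, 69 is the A at 440 Hz) is standard. It shows why a monophonic MIDI melody carries so little information: one pitch, one onset, and one duration per note, with no timbre or vocal detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NoteEvent:
    """A hypothetical MIDI-style note: pitch number plus timing in beats."""
    pitch: int          # MIDI note number, 0-127 (60 = middle C)
    start: float        # onset, in beats
    duration: float     # length, in beats
    velocity: int = 64  # loudness, 0-127


def midi_to_hz(pitch: int) -> float:
    """Convert a MIDI note number to frequency (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((pitch - 69) / 12)


def is_monophonic(notes: List[NoteEvent]) -> bool:
    """True if no two notes overlap in time, i.e. one note sounds at a time."""
    ordered = sorted(notes, key=lambda n: n.start)
    return all(a.start + a.duration <= b.start
               for a, b in zip(ordered, ordered[1:]))


# A three-note melody: C4, D4, E4
melody = [NoteEvent(60, 0.0, 1.0), NoteEvent(62, 1.0, 1.0), NoteEvent(64, 2.0, 2.0)]
print(is_monophonic(melody), round(midi_to_hz(60), 2))
```

Adding a second note that overlaps the first would make `is_monophonic` return `False`, which is the point at which the data starts to describe chords rather than a single melodic line.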

To improve music quality and produce more natural and cohesive works, Gege AI takes the end-to-end large model route. This means that music data does not need to be meticulously segmented or preprocessed before being fed into the model for training, and the model produces complete, consistent songs.
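The difference between the two routes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every function below is a hypothetical stand-in that returns a text label instead of audio, and nothing here reflects Gege AI's actual models or code. In the modular version each part is produced in isolation, while in the end-to-end version a single model maps the prompt directly to a finished song.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for separately trained small models.
def melody_model(prompt: str) -> str:
    return f"melody for '{prompt}'"


def lyrics_model(prompt: str) -> str:
    return f"lyrics for '{prompt}'"


def arrangement_model(prompt: str) -> str:
    return f"arrangement for '{prompt}'"


def modular_pipeline(prompt: str) -> Dict[str, str]:
    """Each part comes from its own model and never sees the other parts."""
    return {
        "melody": melody_model(prompt),
        "lyrics": lyrics_model(prompt),
        "arrangement": arrangement_model(prompt),
    }


def end_to_end_pipeline(prompt: str, model: Callable[[str], str]) -> str:
    """A single model maps the prompt straight to a finished song."""
    return model(prompt)


def toy_song_model(prompt: str) -> str:
    # Hypothetical placeholder for one large model trained on whole songs.
    return f"complete song for '{prompt}'"


parts = modular_pipeline("freedom and courage")
song = end_to_end_pipeline("freedom and courage", toy_song_model)
print(sorted(parts), "|", song)
```

In the modular sketch, nothing forces the melody and the arrangement to agree with each other, which mirrors the disconnection the article describes. The end-to-end sketch has no seam where such a mismatch could arise.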

Even more noteworthy, Gege AI’s goal is not merely “creating songs with AI” but to use generative AI to fundamentally change every stage of traditional music creation and distribution. The Melody Spark team points out that the content recommended by current music platforms is often homogeneous and cannot truly meet user needs. “Nowadays, several hundred thousand new songs are produced every day in the country, and with the help of generative AI technology, this number is expected to grow to tens of millions,” Wang Shupei said. He suggested that the spread of AI technology will allow for more personalized music production, which could be a major opportunity to break the monopoly of the current music market giants.

Beyond basic song generation functionality, Gege AI has also expanded into more sectors, focusing on generation and distribution areas, demonstrating its profound influence and broad prospects in the field of music innovation.

In today’s music field, the application of AI technology has become one of the important driving forces for innovation. Users can now transform their voices into unique songs with the help of AI models, choose different styles and genres, and possibly distribute their works to major music platforms. Gege AI is committed to realizing this technology, helping users easily produce and distribute AI music.

The technology already makes automatic pitch correction and AI mixing possible, and the team hopes that in the future users will be able to release their AI-generated music across various platforms through a one-click distribution function and collect royalties. Multimedia integration is also part of Gege AI’s plans: it currently supports downloading short videos with AI music, and it intends to expand to AI-generated videos and live broadcast scenarios in the future.

The recently funded Gege AI team plans to accelerate product iteration and invest resources to expand market share. The team is small yet professional, with fewer than ten people. CEO Long Yong has over 20 years of music production experience, has participated in projects such as The Voice of China and The Rap of China, and has extensive experience in music and copyright operations. COO Wang Shupei holds dual degrees in engineering and music production, has a strong background in guitar performance, and was the founder of the NetEase AI music project. CTO Zhang Wenbo is the founder of the “I Want to Write Songs” app. The professional backgrounds and experience of the three core team members bode well for the company’s future development.