Music Generation

This article is rated easy
Reading Time: 2 minutes
Last updated on August 7, 2024

Sander Schulhoff

Takeaways
  • This article explores different AI tools for music generation.

Music generation models are becoming increasingly popular and are likely to have a large impact on the music industry.

Music generation models can create chord progressions, melodies, or full songs. They can structure and create music in specific genres and compose or improvise in the style of specific artists.

However, despite their enormous potential, music models are currently difficult to prompt. Unlike with image or text generation models, prompts often give only coarse control over the generated output.

Suno

Suno is a platform for creating music with GenAI. Users can generate unique tracks, remixes, and soundscapes by inputting simple prompts or using the site's advanced customization features.

Udio

Udio is another, similar music generation platform.

Riffusion

Riffusion, a fine-tuned version of Stable Diffusion, can be controlled with prompts to generate instruments and pseudo styles, but it has a limited number of beats available.

Mubert

Mubert seems to interpret prompts through sentiment analysis that links appropriate musical styles to the prompt; controlling the musical parameters in detail via prompts is not possible. It is unclear how much of the resulting output is generated by AI.

Other

There are attempts to use GPT-3 as a text-to-music tool by prompting for musical elements at the "micro-level" of individual notes, rather than through the fairly vague style analogies that Mubert and Riffusion work with (e.g., "write the notes for a folk song that only uses A, B, C#, F#, and G"). However, at present those attempts are limited to single instruments.
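Because a note-level prompt states an explicit constraint, a model's reply can be checked against it mechanically. The following is a minimal sketch, not part of any tool mentioned here: the function names (`build_prompt`, `parse_notes`, `disallowed_notes`) are hypothetical, and it assumes the reply contains only note names separated by spaces or commas, each optionally followed by an octave digit.

```python
import re

# The note set from the example prompt above.
ALLOWED_NOTES = ("A", "B", "C#", "F#", "G")


def build_prompt(allowed=ALLOWED_NOTES):
    """Build a note-level prompt that restricts the song to `allowed` notes."""
    return f"Write the notes for a folk song that only uses {', '.join(allowed)}."


def parse_notes(reply):
    """Split a reply into note names, dropping any trailing octave digit.

    Assumes the reply is nothing but note tokens (e.g. "A B C#4, G").
    Raises ValueError on anything that does not look like a note name.
    """
    notes = []
    for token in re.split(r"[\s,]+", reply.strip()):
        if not token:
            continue
        match = re.fullmatch(r"([A-G][#b]?)\d?", token)
        if match is None:
            raise ValueError(f"not a note name: {token!r}")
        notes.append(match.group(1))
    return notes


def disallowed_notes(reply, allowed=ALLOWED_NOTES):
    """Return the sorted note names in `reply` that are not in `allowed`.

    Note: enharmonic spellings (e.g. "Db" vs "C#") are treated as different names.
    """
    return sorted(set(parse_notes(reply)) - set(allowed))
```

For instance, `disallowed_notes("A B D C# E")` reports the notes `D` and `E`, which a caller could use to reject the reply and prompt again.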

Other approaches include a model chain that converts any image into sound that represents it, and prompting ChatGPT to generate code for Python libraries that create sound.
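To give a sense of what sound-producing code can look like, here is a minimal sketch of the kind of program one might ask a chat model to write. It is an illustration rather than actual model output: it uses only the Python standard library (`math`, `struct`, `wave`), the function names are hypothetical, and it assumes equal temperament with A4 tuned to 440 Hz and every note placed in the same octave.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second

# Semitone distance of each note from A4 (assumption: all notes in octave 4).
SEMITONES_FROM_A4 = {
    "C": -9, "C#": -8, "D": -7, "D#": -6, "E": -5, "F": -4,
    "F#": -3, "G": -2, "G#": -1, "A": 0, "A#": 1, "B": 2,
}


def note_frequency(name):
    """Frequency in Hz of a note name, using equal temperament (A4 = 440 Hz)."""
    return 440.0 * 2 ** (SEMITONES_FROM_A4[name] / 12)


def tone_samples(frequency, seconds, volume=0.5):
    """Return 16-bit integer samples of a sine wave at `frequency`."""
    count = round(SAMPLE_RATE * seconds)
    return [
        int(volume * 32767 * math.sin(2 * math.pi * frequency * i / SAMPLE_RATE))
        for i in range(count)
    ]


def write_melody(path, notes, seconds_per_note=0.5):
    """Write `notes` as consecutive sine tones to a mono 16-bit WAV file.

    Returns the number of samples written.
    """
    samples = []
    for name in notes:
        samples.extend(tone_samples(note_frequency(name), seconds_per_note))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return len(samples)
```

Calling `write_melody("melody.wav", ["A", "B", "C#", "F#", "G"])` would write a short single-voice melody to `melody.wav`. Plain sine tones sound crude; the point is only that a text prompt can be turned into audio by way of generated code.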

Notes

Music prompting is not yet well developed. MusicLM looks promising, but at the time this article was written it was not available to the public.

Sander Schulhoff

Sander Schulhoff is the Founder of Learn Prompting and an ML Researcher at the University of Maryland. He created the first open-source Prompt Engineering guide, reaching 3M+ people and teaching them to use tools like ChatGPT. Sander also led a team behind Prompt Report, the most comprehensive study of prompting ever done, co-authored with researchers from the University of Maryland, OpenAI, Microsoft, Google, Princeton, Stanford, and other leading institutions. This 76-page survey analyzed 1,500+ academic papers and covered 200+ prompting techniques.

Footnotes

  1. Forsgren, S., & Martiros, H. (2022). Riffusion - Stable diffusion for real-time music generation. https://riffusion.com/about