Ultimate prompting guide for Lyria 3 models

Lyria 3, Google's family of music generation models, is designed to give you granular control over vocals, instrumentation, and arrangement. We spent weeks testing the models against every musical genre and use case we could imagine.


We put together this guide to share exactly what we learned and how you can get the best results.


What you'll learn in this guide:

* Model overview
* Breakdown of tech specs
* Best practices for effective prompting
* The core prompting framework
* Mastering vocals and lyrics
* Advanced creative workflows
* How Lyria 3 models work with other generative media models






Model overview




Lyria 3 Clip and Lyria 3 Pro are music generation models designed to support your creative workflows. The models excel in three key areas:




* Structural control: Prompt for specific elements like intros, verses, choruses, and bridges to build a complete arrangement.
* High-quality audio: Both models deliver high-fidelity stereo audio.
* Precision control: Dictate structural changes using timed lyrics, descriptive tempo conditioning, and multimodal inputs.






Breakdown of tech specs for Lyria 3 Clip and Lyria 3 Pro




Here is a breakdown of what the models can handle via the API on Vertex AI:




* Track length: Lyria 3 Clip generates 30-second songs, ideal for rapid prototyping and short-form assets. Lyria 3 Pro supports compositions up to three minutes long.
* Vocal support: Both models feature improved realism and expressiveness for vocals, supporting multi-vocal conditioning and generation in eight languages (English, German, Spanish, French, Hindi, Japanese, Korean, and Portuguese).
* Controls and conditioning: Lyria 3 Pro includes advanced controls for timed lyrics and tempo control through natural language descriptions.
* Multimodal inputs: You can generate music using text, PDF files, or up to 10 reference images.
* Trust and safety: All outputs include SynthID watermarking and support the C2PA open standard for cryptographically signed metadata.






For more details, see the Lyria 3 model card.
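The limits listed above can be checked before you send a request. The following is a minimal sketch and not part of any SDK: the model keys `"clip"` and `"pro"`, the constants, and the `check_request` function are hypothetical names chosen for illustration, and the numbers are simply the ones from the list above.

```python
# Hypothetical pre-flight check. None of these names come from a Google SDK;
# the limits are copied from the tech specs listed in this guide.
SUPPORTED_LANGUAGES = {
    "English", "German", "Spanish", "French",
    "Hindi", "Japanese", "Korean", "Portuguese",
}
MAX_SECONDS = {"clip": 30, "pro": 180}  # Clip: 30 s; Pro: up to three minutes
MAX_REFERENCE_IMAGES = 10


def check_request(model, seconds, languages=(), image_count=0):
    """Return a list of problems found; an empty list means none were found."""
    if model not in MAX_SECONDS:
        return [f"unknown model: {model!r}"]
    problems = []
    if seconds > MAX_SECONDS[model]:
        problems.append(
            f"{seconds}s exceeds the {MAX_SECONDS[model]}s limit for {model}"
        )
    for lang in languages:
        if lang not in SUPPORTED_LANGUAGES:
            problems.append(f"unsupported vocal language: {lang}")
    if image_count > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{image_count} images exceeds the limit of {MAX_REFERENCE_IMAGES}"
        )
    return problems


print(check_request("pro", 180, ["English", "French"], image_count=10))  # []
print(check_request("clip", 45))
```

Returning a list of problems instead of raising on the first one lets you surface every issue with a request at once.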


Best practices for effective prompting




Follow these guidelines so that your generated audio matches your intent:

* Be descriptive and specific: Use adjectives to create a clear description. The more detail you provide, the better Lyria understands your prompt.
* Reference genres and eras: Clearly state the musical category (for example, Rock or Pop) and stylistic timeframe (for example, the 1950s or the early 90s).
* Specify key instruments: Mention the important instruments driving the track; otherwise, Lyria chooses defaults based on the genre.
* Iterate: If the first result isn't perfect, refine your prompt by adjusting keywords.






The core prompting framework




A simple list of keywords can generate great songs, but for finer control over the output, use this framework:


[Genre and style] + [Mood] + [Instrumentation] + [Tempo and rhythm] + [Vocal style & language] + [Lyrics]




* Genre and style: Define the primary category, for example, "cinematic orchestral fantasy".
* Mood: Describe the emotional intent, for example, "tense and suspenseful".
* Instrumentation: Name the specific instruments, for example, "guitar" or "piano".
* Tempo and rhythm: Set the speed, pace, and groove using descriptive terms, such as "a fast, energetic pace with a driving beat".
* Instrumental vs. vocal: Specify "instrumental" to exclude vocals.
* Vocal style & language: Specify gender, tone (for example, raspy or smooth), delivery (for example, rapping), and language.
* Lyrics: Either provide a theme for Lyria to generate the words (for example, "song about a cross-cultural connection"), or provide your exact lyrics in quotes for the model to perform.






Example prompt: “A romantic fusion of classic Bossa Nova and modern R&B. The mood is intimate, warm, and deeply affectionate. Features a gentle acoustic nylon-string guitar, warm electric piano chords, and a crisp, laid-back modern hip-hop drum beat. A slow, swaying tempo. Featuring a vocal duet: a smooth male vocalist singing in English, and a soft, breathy female vocalist singing in French. The lyrics are a beautiful love song about an undeniable, cross-cultural connection.”








https://www.youtube.com/watch?v=WZcD6JNP2cg

















If you want an instrumental-only song, include the word “instrumental” in the prompt.


Prompt example: “A warm, modern lofi hip-hop beat for studying, featuring a muffled drum break and dusty jazz piano samples. Instrumental.”







Lyria 3 Pro - Lofi hip-hop
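If you build prompts in code, the framework above maps naturally onto a small data structure. The sketch below is one possible way to assemble a prompt string in framework order. `PromptParts` and `build_prompt` are hypothetical helper names, not part of any Lyria API, and writing one sentence per component is just a formatting choice.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """One field per framework component (hypothetical helper, not an SDK type)."""
    genre_and_style: str
    mood: str = ""
    instrumentation: str = ""
    tempo_and_rhythm: str = ""
    vocals: str = ""   # vocal style and language; leave empty for instrumental
    lyrics: str = ""   # a theme, or your exact lyrics
    instrumental: bool = False


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty components into one prompt, in framework order."""
    if parts.instrumental and (parts.vocals or parts.lyrics):
        raise ValueError("an instrumental prompt should not describe vocals or lyrics")
    ordered = [
        parts.genre_and_style,
        parts.mood,
        parts.instrumentation,
        parts.tempo_and_rhythm,
        parts.vocals,
        parts.lyrics,
    ]
    # Normalize each component to a single sentence ending in one period.
    sentences = [s.strip().rstrip(".") + "." for s in ordered if s.strip()]
    if parts.instrumental:
        sentences.append("Instrumental.")
    return " ".join(sentences)


prompt = build_prompt(PromptParts(
    genre_and_style="A warm, modern lofi hip-hop beat for studying",
    instrumentation="Featuring a muffled drum break and dusty jazz piano samples",
    instrumental=True,
))
print(prompt)
```

Keeping each component in its own field makes it easy to iterate on one element, such as the mood, while leaving the rest of the prompt unchanged.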


















Mastering vocals and lyrics 




Lyria 3 models give you control over both the lyrics and the vocal performance.


Incorporating specific lyrics




* Syntax for lyrics: To use your own lyrics, write "Lyrics:" before the lines you want the model to sing.
* Backing vocals: If you want backing singers to echo the main vocals, state in the prompt where the backing vocals should come in.
* Lyrics: If you prefer the model to write the lyrics for you, clearly describe the theme in your prompt, such as "a love song" or a "new happy birthday song". Otherwise, provide your own lyrics to the model.






Example prompt: “A smooth, moody jazz ballad featuring piano and upright bass. The vocals should be a female singer with a breathy, soulful soprano range. The vocal pattern should start out confident but get calmer and quieter as the track progresses. Song lyrics about meeting the love of her life in New York.”







Lyria 3 Pro - Moody jazz ballad
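As a sketch of the "Lyrics:" syntax described above, the helper below appends your own lyrics to a style description. The function name `with_lyrics` is hypothetical, the exact layout (a blank line, the label, then one lyric line per text line) is an assumption, and the lyric lines in the example are made-up placeholders.

```python
def with_lyrics(style_prompt: str, lyric_lines: list[str]) -> str:
    """Append user-written lyrics after a "Lyrics:" label.

    The layout used here is an assumption, not a documented requirement.
    """
    cleaned = [line.strip() for line in lyric_lines if line.strip()]
    if not cleaned:
        raise ValueError("no lyric lines given")
    return style_prompt.strip() + "\n\nLyrics:\n" + "\n".join(cleaned)


# Placeholder lyrics, written only for this illustration.
prompt = with_lyrics(
    "A smooth, moody jazz ballad featuring piano and upright bass.",
    [
        "I met you on a corner in the rain",
        "The city hummed and whispered out your name",
    ],
)
print(prompt)
```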


















Controlling the voice




Define the desired vocal style in detail to get the performance you want:




* Singer demographics and range: Specify whether you want a male or female singer, and dictate their vocal range. For example, you can ask for "commanding baritone vocals" or a "clear and high soprano range."
* Voice texture (timbre): Describe the texture of the voice. For example, you can ask for vocals that are "gravelly," "soulful," or "breathy."
* Vocal patterns and styles: Describe the specific vocal pattern you want to hear, such as a "fast-paced" or "laid-back" groove. You can also experiment with layering different vocal styles or having the vocals change dynamically, such as getting "calmer and quieter as the track progresses."
* Language: You can specify the language you want the vocals to be sung in. The model supports multi-vocal generation in eight languages: English, German, Spanish, French, Hindi, Japanese, Korean, and Portuguese (more languages coming soon).






Example prompt: “An upbeat, high-energy J-pop track with bright, sparkling synths, electric guitar, and a driving bassline. Featuring a clear, expressive male tenor vocal singing in Japanese. The vocal style should be fast-paced and melodic, with a sweet and highly polished texture.”







Lyria 3 Pro - J-pop
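The four vocal attributes above can be kept as separate fields and turned into a description only when you build the prompt. This is a minimal sketch. `VocalSpec` and its `describe` method are hypothetical names, and the sentence template simply mirrors the wording of the J-pop example.

```python
from dataclasses import dataclass


@dataclass
class VocalSpec:
    """Vocal attributes from the list above (hypothetical helper, not an SDK type)."""
    singer: str        # e.g. "male" or "female"
    vocal_range: str   # e.g. "tenor", "baritone", "soprano"
    texture: str       # e.g. "gravelly", "soulful", "breathy"
    pattern: str       # e.g. "fast-paced", "laid-back"
    language: str = "English"

    def describe(self) -> str:
        """Render the attributes as two sentences to include in a prompt."""
        return (
            f"Featuring a {self.singer} {self.vocal_range} vocal "
            f"singing in {self.language}. "
            f"The vocal style should be {self.pattern}, "
            f"with a {self.texture} texture."
        )


vocals = VocalSpec(
    singer="male",
    vocal_range="tenor",
    texture="sweet and highly polished",
    pattern="fast-paced and melodic",
    language="Japanese",
)
print(vocals.describe())
```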


















Advanced creative workflows




Workflow 1: Timestamp prompting


This workflow is ideal for creating dynamic genre shifts or scoring video content by assigning actions to timed segments.


Prompt example:


[00:00] Begin immediately with a massive gospel choir singing a powerful, uplifting harmony about being kind to yourself. 


[00:15] A heavy, modern hip-hop drum beat and a deep 808 bassline drop in, matching the energy of the choir. 


[00:30] A male lead vocalist begins rapping a confident verse about overcoming life's challenges, while the large choir punctuates his lines in the background. 


[01:10] Transition into a huge, triumphant chorus celebrating victory and winning. The gospel choir sings at full volume, layering rich, soulful harmonies over the driving hip-hop beat and triumphant brass horns. 


[01:50] The beat strips back to just a gentle Hammond B3 organ. The rapper delivers a quiet, emotional bridge about giving yourself grace, supported by soft, warm hums from the massive choir. 


[02:10] The full hip-hop beat and the giant choir return at maximum energy for an uplifting final chorus, before ending on a resonant, sustained choir chord at [03:00].







Lyria 3 Pro - Timestamp prompting
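If your segment timings come from a video edit or a cue sheet, you can generate the `[MM:SS]` markers from seconds instead of typing them by hand. The sketch below is one way to do that. `format_timestamp` and `build_timed_prompt` are hypothetical names, and the default 180-second cap is an assumption taken from the three-minute limit for Lyria 3 Pro.

```python
def format_timestamp(seconds: int) -> str:
    """Render a whole number of seconds as a [MM:SS] marker."""
    minutes, secs = divmod(seconds, 60)
    return f"[{minutes:02d}:{secs:02d}]"


def build_timed_prompt(segments, max_seconds=180):
    """Build a timestamp prompt from (start_seconds, description) pairs.

    Start times must be non-negative, strictly increasing, and below
    max_seconds (assumed here to be the three-minute Lyria 3 Pro limit).
    """
    lines = []
    previous = -1
    for start, text in segments:
        if start <= previous:
            raise ValueError("start times must be non-negative and strictly increasing")
        if start >= max_seconds:
            raise ValueError(f"segment at {start}s starts at or past {max_seconds}s")
        lines.append(f"{format_timestamp(start)} {text.strip()}")
        previous = start
    return "\n".join(lines)


timed_prompt = build_timed_prompt([
    (0, "Begin immediately with a massive gospel choir."),
    (15, "A heavy, modern hip-hop drum beat and a deep 808 bassline drop in."),
    (70, "Transition into a huge, triumphant chorus."),
])
print(timed_prompt)
```

Validating the order and range of the start times up front catches cue-sheet mistakes before you spend a generation on them.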


















Workflow 2: Multimodal generation 




Lyria 3 models allow you to upload reference images or PDFs to establish the emotional baseline for the track.


Prompt example: “A deeply emotional, modern Bollywood song in English. The lyrics and mood should match the story in the images attached.”







Lyria 3 Pro - Image to music


















Go further




Lyria 3 Clip and Lyria 3 Pro can be used with our other generative media models on Vertex AI.




* Lyria + Veo: Generate video assets using Veo, and then dictate the exact structural timing in Lyria 3 Pro to score a custom soundtrack that matches every scene transition.
* Lyria + Nano Banana: Generate images of a storyboard or vibe, and let Lyria create a song based on those images.
* Lyria + Gemini: If you are struggling to define your desired sound, use Gemini to analyze your creative brief and output a highly descriptive prompt to feed into Lyria 3 models. Gemini can also create lyrics for you based on your creative brief.
* Lyria + Agents: If you’re using these models with our GenMedia MCP tools, you can provide domain-specific sound design knowledge via this Agent Skill.
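For the Lyria + Gemini pairing, one simple approach is to wrap your creative brief in an instruction that asks for each element of the prompting framework. The sketch below only builds that instruction text. `brief_to_instruction` is a hypothetical helper, the instruction wording is an assumption, and the call that actually sends the text to Gemini is intentionally left out.

```python
# Component names follow the prompting framework described in this guide.
FRAMEWORK_FIELDS = [
    "Genre and style",
    "Mood",
    "Instrumentation",
    "Tempo and rhythm",
    "Vocal style and language",
    "Lyrics",
]


def brief_to_instruction(brief: str) -> str:
    """Wrap a creative brief in an instruction asking for a music prompt."""
    brief = brief.strip()
    if not brief:
        raise ValueError("creative brief is empty")
    fields = "\n".join(f"- {name}" for name in FRAMEWORK_FIELDS)
    return (
        "Read the creative brief below and write one descriptive "
        "music-generation prompt that covers each of these elements:\n"
        f"{fields}\n\n"
        f"Creative brief:\n{brief}"
    )


instruction = brief_to_instruction(
    "A 30-second launch video for a trail-running shoe, aimed at weekend hikers."
)
print(instruction)
```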






To get started, access the Lyria 3 models today through the API documentation, the Gen AI SDK for Python notebook, or our playground, Vertex AI Media Studio.

---



Thanks to Khulan Davaajav, Russ Khaimov, and Sandeep Gupta for their contributions to prompting guidance for customers.


https://cloud.google.com/blog/products/ai-machine-learning/ultimate-prompting-guide-for-lyria-3-pro/?utm_source=dlvr.it&utm_medium=blogger
