This endpoint layers AI-generated vocals on top of an existing instrumental. Given a prompt (e.g., lyrical concept or musical mood) and optional audio, it produces vocal output harmonized with the provided track.
- uploadUrl: Valid instrumental audio file URL
- prompt: Description of the desired vocal content and style
- title: Title for the generated vocal track (max 100 characters)
- style: Music and vocal style (e.g., “Jazz”, “Pop”, “Classical”)
- negativeTags: Vocal styles or characteristics to exclude
- callBackUrl: URL to receive completion notifications
- vocalGender: Preferred vocal gender (‘m’ for male, ‘f’ for female)
- styleWeight: Style adherence weight (0.00-1.00)
- weirdnessConstraint: Creativity/novelty constraint (0.00-1.00)
- audioWeight: Audio consistency weight (0.00-1.00)
- model: Model version used for generation. Allowed values: V4_5PLUS (default), V5

All endpoints require authentication using Bearer Token.
Add to request headers:
Authorization: Bearer YOUR_API_KEY
⚠️ Note:
- Keep your API Key secure and do not share it with others
- If you suspect your API Key has been compromised, reset it immediately from the management page
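As a minimal sketch of the header described above, the following Python helper builds the request headers. The function name `build_auth_headers`, the `Content-Type` header, and the `API_KEY` environment variable are assumptions for illustration, not part of the documented API. Reading the key from the environment instead of hard-coding it is one way to follow the note about keeping the key secure.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return request headers carrying the Bearer token (hypothetical helper)."""
    if not api_key or not api_key.strip():
        raise ValueError("API key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        # Assumption: the API accepts JSON request bodies.
        "Content-Type": "application/json",
    }


# Assumption: the key is stored in an environment variable named API_KEY.
headers = build_auth_headers(os.environ.get("API_KEY", "YOUR_API_KEY"))
```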
Description of the audio content to generate vocals for.
"A calm and relaxing piano track with soothing vocals"
The title of the music track.
"Relaxing Piano with Vocals"
Music styles or vocal traits to exclude from the generated track.
"Heavy Metal, Aggressive Vocals"
The music and vocal style.
"Jazz"
The URL of the uploaded audio file to add vocals to.
"https://example.com/instrumental.mp3"
The URL that receives notifications as vocal generation progresses. The callback process has three stages: text (text generation), first (first track complete), complete (all tracks complete). Note: In some cases the text and first stages may be skipped, and only complete is sent.
For the detailed callback format and an implementation guide, see Add Vocals Callbacks.
"https://api.example.com/callback"
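The three callback stages can be tracked with a small state holder that does not assume every stage arrives. This is only a sketch: the stage names come from the description above, but the `TaskProgress` class and the way the stage value is extracted from the callback body are hypothetical; the actual payload layout is defined in Add Vocals Callbacks.

```python
# Stage names as listed in the callBackUrl description.
CALLBACK_STAGES = ("text", "first", "complete")


class TaskProgress:
    """Records which callback stages were seen for one task (hypothetical helper)."""

    def __init__(self) -> None:
        self.seen = []

    def record(self, stage: str) -> bool:
        """Store the stage and return True once the task is finished."""
        if stage not in CALLBACK_STAGES:
            raise ValueError(f"Unknown callback stage: {stage!r}")
        if stage not in self.seen:
            self.seen.append(stage)
        # "complete" ends the task even if "text" and "first" were skipped.
        return stage == "complete"


progress = TaskProgress()
finished = progress.record("complete")  # no earlier stages were received
```

Treating `complete` as the only terminal signal keeps the handler correct in the case where the earlier stages are skipped.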
Preferred vocal gender. Optional. Allowed values: 'm' (male), 'f' (female).
Allowed values: m, f. Example: "m"
Style adherence weight. Optional. Range: 0-1. Two decimal places recommended.
0 <= x <= 1. Must be a multiple of 0.01. Example: 0.61
Creativity/novelty constraint. Optional. Range: 0-1. Two decimal places recommended.
0 <= x <= 1. Must be a multiple of 0.01. Example: 0.72
Relative weight of audio consistency versus other controls. Optional. Range: 0-1. Two decimal places recommended.
0 <= x <= 1. Must be a multiple of 0.01. Example: 0.65
Model version to use for generation. Optional. Default: V4_5PLUS.
Allowed values: V4_5PLUS, V5. Example: "V4_5PLUS"
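The parameter constraints above (title length, allowed values, weights between 0 and 1 in steps of 0.01) can be checked on the client before sending a request. The sketch below assumes the documented field names are used as JSON keys and treats every parameter not marked optional as required; the helper names `check_weight` and `build_payload` are hypothetical.

```python
from decimal import Decimal

ALLOWED_GENDERS = {"m", "f"}
ALLOWED_MODELS = {"V4_5PLUS", "V5"}


def check_weight(name: str, value: float) -> float:
    """Validate a weight: 0 <= x <= 1 and a multiple of 0.01."""
    # Decimal(str(x)) uses the short decimal repr, so 0.61 is treated as exactly 0.61.
    d = Decimal(str(value))
    if not Decimal("0") <= d <= Decimal("1"):
        raise ValueError(f"{name} must satisfy 0 <= x <= 1, got {value}")
    if d % Decimal("0.01") != 0:
        raise ValueError(f"{name} must be a multiple of 0.01, got {value}")
    return float(d)


def build_payload(upload_url, prompt, title, style, negative_tags, callback_url,
                  vocal_gender=None, style_weight=None,
                  weirdness_constraint=None, audio_weight=None,
                  model="V4_5PLUS") -> dict:
    """Assemble a request body from the documented parameters."""
    if len(title) > 100:
        raise ValueError("title must be at most 100 characters")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    payload = {
        "uploadUrl": upload_url,
        "prompt": prompt,
        "title": title,
        "style": style,
        "negativeTags": negative_tags,
        "callBackUrl": callback_url,
        "model": model,
    }
    if vocal_gender is not None:
        if vocal_gender not in ALLOWED_GENDERS:
            raise ValueError("vocalGender must be 'm' or 'f'")
        payload["vocalGender"] = vocal_gender
    # Optional weights are included only when the caller supplies them.
    for key, value in (("styleWeight", style_weight),
                       ("weirdnessConstraint", weirdness_constraint),
                       ("audioWeight", audio_weight)):
        if value is not None:
            payload[key] = check_weight(key, value)
    return payload


# Example values taken from the parameter descriptions above.
payload = build_payload(
    "https://example.com/instrumental.mp3",
    "A calm and relaxing piano track with soothing vocals",
    "Relaxing Piano with Vocals",
    "Jazz",
    "Heavy Metal, Aggressive Vocals",
    "https://api.example.com/callback",
    vocal_gender="m", style_weight=0.61,
    weirdness_constraint=0.72, audio_weight=0.65,
)
```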
Request successful
Allowed values: 200, 400, 401, 404, 405, 413, 429, 430, 455, 500. Example: 200
Error message when code != 200
"success"
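A client can treat any status code other than 200 as a failure and surface the accompanying message. In this sketch the `code` field follows the description above, while the `msg` field name and the `parse_response` helper are assumptions; adjust them to the actual response body.

```python
import json


def parse_response(body: str) -> dict:
    """Decode a JSON response body and raise if the status code is not 200."""
    data = json.loads(body)
    code = data.get("code")
    if code != 200:
        # Assumption: the error message is carried in a field named "msg".
        raise RuntimeError(f"Request failed with code {code}: {data.get('msg')}")
    return data


result = parse_response('{"code": 200, "msg": "success"}')
```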