Key Capabilities
- Accepts an existing instrumental via uploadUrl, with optional prompt-based stylistic input.
- Supports control parameters including:
  - prompt, style, tags, negativeTags (define lyrical content and vocal style)
  - vocalGender, styleWeight, weirdnessConstraint, audioWeight, callBackUrl
- Returns a taskId and supports the same 14-day retention and three-stage callback model as the instrumental endpoint.
Typical Use Cases
- Music platforms or tools enabling topline creation and rapid prototyping of lyrical ideas.
- Collaborative songwriting or co-creation workflows, where lyrics or vocal styles are iteratively tested over instrumental drafts.
Parameter Usage Guide
Required parameters for all requests:
- uploadUrl: Valid instrumental audio file URL
- prompt: Description of the desired vocal content and style
- title: Title for the generated vocal track (80 characters max)
- style: Music and vocal style (e.g., “Jazz”, “Pop”, “Classical”)
- negativeTags: Vocal styles or characteristics to exclude
- callBackUrl: URL to receive completion notifications
Optional parameters:
- vocalGender: Preferred vocal gender (‘m’ for male, ‘f’ for female)
- styleWeight: Style adherence weight (0.00-1.00)
- weirdnessConstraint: Creativity/novelty constraint (0.00-1.00)
- audioWeight: Audio consistency weight (0.00-1.00)
- model: Model version used for generation. Allowed values: V4_5PLUS (default), V5
Audio file requirements:
- Input Type: Instrumental or backing track audio files
- File Format: MP3, WAV, or other supported audio formats
- Quality: Clear instrumental tracks work best for vocal addition
- Accessibility: Ensure uploaded audio URLs are publicly accessible
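The parameter rules above can be sketched as a small payload builder. This is a minimal illustration, not the service's official client: the function name `build_add_vocals_payload` and its snake_case arguments are hypothetical, and only the JSON keys, the 80-character title limit, and the allowed values for vocalGender and model come from this guide.

```python
from typing import Any, Dict, Optional

ALLOWED_GENDERS = ("m", "f")
ALLOWED_MODELS = ("V4_5PLUS", "V5")
MAX_TITLE_LENGTH = 80


def build_add_vocals_payload(
    upload_url: str,
    prompt: str,
    title: str,
    style: str,
    negative_tags: str,
    callback_url: str,
    vocal_gender: Optional[str] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a request body using the parameter names from this guide."""
    payload: Dict[str, Any] = {
        "uploadUrl": upload_url,
        "prompt": prompt,
        "title": title,
        "style": style,
        "negativeTags": negative_tags,
        "callBackUrl": callback_url,
    }
    # Every required parameter must be a non-empty value.
    missing = [name for name, value in payload.items() if not value]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    if len(title) > MAX_TITLE_LENGTH:
        raise ValueError("title exceeds 80 characters")
    # Optional parameters are only sent when explicitly provided.
    if vocal_gender is not None:
        if vocal_gender not in ALLOWED_GENDERS:
            raise ValueError("vocalGender must be 'm' or 'f'")
        payload["vocalGender"] = vocal_gender
    if model is not None:
        if model not in ALLOWED_MODELS:
            raise ValueError("model must be V4_5PLUS or V5")
        payload["model"] = model
    return payload
```

Leaving optional keys out of the payload when they are not set lets the service apply its own defaults, such as V4_5PLUS for model.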
Developer Notes
- Generated vocal tracks are retained for 15 days before being deleted
- Ensure you have proper rights to use the uploaded audio content
- Use clear, well-mixed instrumental tracks for best results
- Be specific about vocal style in your prompt (e.g., “smooth jazz vocals”, “energetic pop vocals”)
- Callback process has three stages: text (text generation), first (first track complete), complete (all tracks complete)
- You can use the Get Music Generation Details endpoint to actively check task status instead of waiting for callbacks
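The three callback stages can be tracked with a small helper like the one below. It is only a sketch: the class name `CallbackTracker` is hypothetical, and it assumes your callback handler has already extracted the stage name (text, first, or complete) from the incoming notification, since the callback payload format is not described in this section.

```python
from typing import List

# Stage names as listed in the developer notes.
STAGES = ("text", "first", "complete")


class CallbackTracker:
    """Remember which callback stages have arrived for one task."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def record(self, stage: str) -> bool:
        """Store a stage and return True once the task is complete."""
        if stage not in STAGES:
            raise ValueError("unknown callback stage: " + stage)
        # Ignore repeated deliveries of the same stage.
        if stage not in self.seen:
            self.seen.append(stage)
        return stage == "complete"
```

The helper does not require stages to arrive in order, so a task whose first notification is already complete is handled the same way as one that reports all three stages.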
Authorizations
🔑 API Authentication
All endpoints require authentication using a Bearer token.
Get API Key
- Visit the API Key Management Page to obtain your API Key
Usage
Add to request headers:
Authorization: Bearer YOUR_API_KEY
⚠️ Note:
- Keep your API Key secure and do not share it with others
- If you suspect your API Key has been compromised, reset it immediately from the management page
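As a rough illustration, the header described above could be assembled like this. The function name `build_headers` is hypothetical, and the Content-Type value is an assumption (a JSON request body), not something this section states.

```python
from typing import Dict


def build_headers(api_key: str) -> Dict[str, str]:
    """Return request headers carrying the Bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    return {
        "Authorization": "Bearer " + key,
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

Loading the key from an environment variable or a secrets manager, rather than hard-coding it, keeps it out of source control.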
Body
prompt
Description of the audio content to generate vocals for.
- Required.
- Provides context about the desired vocal style and content.
- The more detailed your prompt, the better the vocal generation will match your vision.
Example: "A calm and relaxing piano track with soothing vocals"
title
The title of the music track.
- Required.
- This will be used as the title for the generated vocal track.
Example: "Relaxing Piano with Vocals"
negativeTags
Music styles or vocal traits to exclude from the generated track.
- Required.
- Use to avoid specific vocal styles or characteristics.
Example: "Heavy Metal, Aggressive Vocals"
style
The music and vocal style.
- Required.
- Examples: "Jazz", "Classical", "Electronic", "Pop".
- Describes the overall genre and vocal approach.
Example: "Jazz"
uploadUrl
The URL of the uploaded audio file to add vocals to.
- Required.
- Must be a valid audio file URL accessible by the system.
- The uploaded audio should be in a supported format (MP3, WAV, etc.).
Example: "https://example.com/instrumental.mp3"
callBackUrl
The URL to receive task completion notifications when vocal generation is complete. The callback process has three stages: text (text generation), first (first track complete), complete (all tracks complete). Note: In some cases, the text and first stages may be skipped, directly returning complete.
- For detailed callback format and implementation guide, see Add Vocals Callbacks
- Alternatively, you can use the Get Music Generation Details interface to poll task status
Example: "https://api.example.com/callback"
vocalGender
Preferred vocal gender. Optional. Allowed values: 'm' (male), 'f' (female).
Example: "m"
styleWeight
Style adherence weight. Optional. Range: 0-1. Two decimal places recommended.
- Constraints: 0 <= x <= 1; must be a multiple of 0.01
Example: 0.61
weirdnessConstraint
Creativity/novelty constraint. Optional. Range: 0-1. Two decimal places recommended.
- Constraints: 0 <= x <= 1; must be a multiple of 0.01
Example: 0.72
audioWeight
Relative weight of audio consistency versus other controls. Optional. Range: 0-1. Two decimal places recommended.
- Constraints: 0 <= x <= 1; must be a multiple of 0.01
Example: 0.65
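The shared constraints on styleWeight, weirdnessConstraint, and audioWeight (0 <= x <= 1, multiple of 0.01) can be checked client-side before sending a request. The sketch below uses a hypothetical helper named `check_weight`; it goes through `decimal.Decimal` because binary floats cannot represent most multiples of 0.01 exactly.

```python
from decimal import Decimal
from typing import Union


def check_weight(name: str, value: Union[int, float]) -> float:
    """Validate a weight parameter and return it as a float."""
    # Convert via str() so 0.61 becomes Decimal("0.61"), not its binary expansion.
    d = Decimal(str(value))
    if not (Decimal("0") <= d <= Decimal("1")):
        raise ValueError(name + " must be between 0 and 1")
    # A multiple of 0.01 is unchanged when rounded to two decimal places.
    if d != d.quantize(Decimal("0.01")):
        raise ValueError(name + " must be a multiple of 0.01")
    return float(d)
```

For instance, `check_weight("styleWeight", 0.61)` passes, while 0.615 or 1.5 raises a ValueError.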
model
Model version to use for generation. Optional. Default: V4_5PLUS. Allowed values: V4_5PLUS, V5.
Example: "V4_5PLUS"
Response
Request successful
Status Codes
- ✅ 200 - Request successful
- ⚠️ 400 - Invalid parameters
- ⚠️ 401 - Unauthorized access
- ⚠️ 404 - Invalid request method or path
- ⚠️ 405 - Rate limit exceeded
- ⚠️ 413 - Theme or prompt too long
- ⚠️ 429 - Insufficient credits
- ⚠️ 430 - Your call frequency is too high. Please try again later.
- ⚠️ 455 - System maintenance
- ❌ 500 - Server error
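A client can turn the status codes listed above into readable messages with a simple lookup. In this sketch the meanings are copied from the list, while the function name `interpret_code` and the choice of which codes are worth retrying later (`RETRY_LATER`) are assumptions made for illustration, not documented behavior.

```python
from typing import Tuple

# Meanings copied from the status code list.
STATUS_MEANINGS = {
    200: "Request successful",
    400: "Invalid parameters",
    401: "Unauthorized access",
    404: "Invalid request method or path",
    405: "Rate limit exceeded",
    413: "Theme or prompt too long",
    429: "Insufficient credits",
    430: "Your call frequency is too high. Please try again later.",
    455: "System maintenance",
    500: "Server error",
}

# Assumption: these conditions are temporary, so a later retry may succeed.
RETRY_LATER = frozenset({405, 430, 455, 500})


def interpret_code(code: int) -> Tuple[str, bool]:
    """Return (meaning, retry_later) for a response status code."""
    meaning = STATUS_MEANINGS.get(code, "Unknown status code")
    return meaning, code in RETRY_LATER
```

Codes such as 400, 401, and 429 are treated as non-retryable here because they need a change on the caller's side (fixed parameters, a valid key, or more credits) before another attempt can succeed.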
Response status code (code). Allowed values: 200, 400, 401, 404, 405, 413, 429, 430, 455, 500
Example: 200
Error message when code != 200
Example: "success"