Modellss Docs

Create transcription

POST /v1/audio/transcriptions
Learn how to turn audio into text or text into audio.
Related guide: Speech to text (https://platform.openai.com/docs/guides/speech-to-text)

Request

Header Params

Content-Type (string, required)
Example: multipart/form-data

Accept (string, required)
Example: application/json

Authorization (string, optional)
Example: Bearer {{YOUR_API_KEY}}

Body Params (multipart/form-data)

file (file, required)
The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

model (string, required)
ID of the model to use. Only whisper-1 (which is powered by our open source Whisper V2 model) is currently available.
Example: whisper-1

prompt (string, optional)
An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
Example: eiusmod nulla

response_format (string, optional)
The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.
Example: json

temperature (number, optional)
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
Example: 0

language (string, optional)
The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.
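
The constraints listed for the body params can be checked on the client before anything is uploaded. The following is a minimal Python sketch, not part of the API itself: `build_form_fields` is a hypothetical helper, the file format is only guessed from the file extension, and the language check only tests for a two-letter code shape rather than a real ISO-639-1 lookup.

```python
from pathlib import Path

# Values copied from the parameter descriptions above.
AUDIO_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_form_fields(file_path, model="whisper-1", prompt=None,
                      response_format=None, temperature=None, language=None):
    """Validate the documented body params and return the non-file form fields.

    Hypothetical helper: optional params left as None are omitted entirely.
    """
    # Assumption: the extension reflects the actual audio format.
    extension = Path(file_path).suffix.lower().lstrip(".")
    if extension not in AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio extension: {extension!r}")

    fields = {"model": model}
    if prompt is not None:
        fields["prompt"] = prompt
    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        fields["response_format"] = response_format
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        fields["temperature"] = str(temperature)
    if language is not None:
        # Shape check only: two ASCII letters, as ISO-639-1 codes are.
        if not (len(language) == 2 and language.isascii() and language.isalpha()):
            raise ValueError(f"language does not look like ISO-639-1: {language!r}")
        fields["language"] = language.lower()
    return fields
```

Omitting unset optional fields, rather than sending them as empty strings, keeps the request limited to values the caller actually chose.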

Request samples

Shell
# Replace {{YOUR_API_KEY}}, and fill in the empty file path after @ and the empty language value, before running.
curl --location --request POST 'https://api.modelless.co/v1/audio/transcriptions' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer {{YOUR_API_KEY}}' \
--header 'Content-Type: multipart/form-data' \
--form 'file=@""' \
--form 'model="whisper-1"' \
--form 'prompt="eiusmod nulla"' \
--form 'response_format="json"' \
--form 'temperature="0"' \
--form 'language=""'
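
For comparison, the same request can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not an official client: `encode_multipart` and `build_request` are hypothetical helper names, the file part is labelled `application/octet-stream` as a generic default, and a `boundary` parameter is appended to the Content-Type header because a hand-built multipart body needs one.

```python
import urllib.request
import uuid

# Endpoint taken from the curl sample above.
API_URL = "https://api.modelless.co/v1/audio/transcriptions"


def encode_multipart(fields, filename, file_bytes, boundary):
    """Encode text fields plus one file part as a multipart/form-data body."""
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    # Closing delimiter ends the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts)


def build_request(api_key, fields, filename, file_bytes):
    """Build (but do not send) the POST request for a transcription."""
    boundary = uuid.uuid4().hex
    body = encode_multipart(fields, filename, file_bytes, boundary)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the helper stops short of that so the request can be inspected first.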

Responses

🟢 200 OK
application/json
Body
text (string, required)
Example
{
    "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
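
Reading the transcript out of a JSON response like the one above can be sketched as follows. `extract_transcript` is a hypothetical helper, and it assumes the body is UTF-8 encoded JSON with the required `text` field, which is what the `json` response format is documented to return; other response formats are not handled here.

```python
import json


def extract_transcript(body: bytes) -> str:
    """Return the 'text' field from a JSON transcription response body."""
    payload = json.loads(body.decode("utf-8"))
    # The 200 response documents 'text' as a required string.
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        raise ValueError("response body has no string 'text' field")
    return payload["text"]
```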