📄️ Input Format
The input params are exactly the same as the OpenAI Create chat completion, so you can call any supported provider in the same format.
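For example, a minimal sketch of a call using the OpenAI-style params (the model name and a provider key in your environment are assumptions):

```python
import os
from litellm import completion

# Assumes a provider key is set, e.g. OPENAI_API_KEY.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
```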
📄️ Output Format
Here's the exact JSON output and type you can expect from all LiteLLM completion calls, for all models.
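A sketch of reading that OpenAI-style response shape (the model name is an assumption):

```python
from litellm import completion

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi"}],
)

# Every model returns this same OpenAI-style shape.
print(response["choices"][0]["message"]["content"])  # assistant reply
print(response["usage"])                             # token counts
```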
📄️ Streaming + Async
- Streaming responses
- Async completion calls
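A minimal sketch of both patterns, assuming gpt-3.5-turbo as the model:

```python
import asyncio
from litellm import completion, acompletion

# Streaming: pass stream=True and iterate over the chunks.
for chunk in completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Tell me a joke"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")

# Async: acompletion() is the awaitable counterpart of completion().
async def main():
    resp = await acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)

asyncio.run(main())
```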
📄️ Trimming Input Messages
Use litellm.trim_messages() to ensure your messages don't exceed a model's token limit or a specified max_tokens.
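A sketch of trimming before sending (the oversized prompt is a stand-in):

```python
from litellm import completion, trim_messages

messages = [{"role": "user", "content": "a very long prompt " * 5000}]

# Trim the messages to fit gpt-3.5-turbo's context window before calling.
response = completion(
    model="gpt-3.5-turbo",
    messages=trim_messages(messages, "gpt-3.5-turbo"),
)
```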
📄️ Model Alias
The model name you show an end-user might be different from the one you pass to LiteLLM, e.g. displaying GPT-3.5 to the user while calling gpt-3.5-turbo-16k on the backend.
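A sketch using litellm.model_alias_map (the alias name is an assumption):

```python
import litellm
from litellm import completion

# Map the user-facing alias to the real backend model.
litellm.model_alias_map = {"GPT-3.5": "gpt-3.5-turbo-16k"}

response = completion(
    model="GPT-3.5",  # resolved to gpt-3.5-turbo-16k
    messages=[{"role": "user", "content": "Hey"}],
)
```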
📄️ Reliability
LiteLLM supports helper functions for reliability, including retries and model fallbacks.
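For instance, a retry sketch with completion_with_retries() (the retry count is an arbitrary choice; check your LiteLLM version for the supported kwargs):

```python
from litellm import completion_with_retries

# Retries the underlying completion() call on failure.
response = completion_with_retries(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    num_retries=3,
)
```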
📄️ Batching Completion() Calls
In the batch_completion method, you provide a list of message lists; each sub-list is passed to litellm.completion(), letting you process multiple prompts efficiently in a single function call.
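A sketch with two prompts (the model choice is an assumption):

```python
from litellm import batch_completion

responses = batch_completion(
    model="gpt-3.5-turbo",
    # One sub-list of messages per prompt.
    messages=[
        [{"role": "user", "content": "Good morning?"}],
        [{"role": "user", "content": "What's the capital of France?"}],
    ],
)

for r in responses:
    print(r["choices"][0]["message"]["content"])
```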
📄️ Mock Requests
For testing purposes, you can use mock_completion() to mock calling the completion endpoint.
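A sketch (the mock_response value is an assumption):

```python
from litellm import mock_completion

# No provider is called; a canned reply comes back in the usual shape.
response = mock_completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hey, I'm a mock request"}],
    mock_response="It's a mock response",
)
print(response["choices"][0]["message"]["content"])
```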