diff --git a/docs/providers/oci.md b/docs/providers/oci.md
index 182bb440..5e0277a0 100644
--- a/docs/providers/oci.md
+++ b/docs/providers/oci.md
@@ -8,19 +8,20 @@ Check the [OCI Models List](https://docs.oracle.com/en-us/iaas/Content/generativ
## Supported Models
+For model lifecycle, retirement dates, and recommended replacements, see [OCI's on-demand model retirement page](https://docs.oracle.com/en-us/iaas/Content/generative-ai/deprecating-on-demand.htm) — Oracle is the authoritative source.
+
### Chat / Text Generation
#### Meta Llama Models
-- `meta.llama-4-maverick-17b-128e-instruct-fp8`
-- `meta.llama-4-scout-17b-16e-instruct`
+- `meta.llama-4-maverick-17b-128e-instruct-fp8` (multimodal)
+- `meta.llama-4-scout-17b-16e-instruct` (multimodal)
- `meta.llama-3.3-70b-instruct`
- `meta.llama-3.3-70b-instruct-fp8-dynamic`
-- `meta.llama-3.2-90b-vision-instruct`
-- `meta.llama-3.2-11b-vision-instruct`
-- `meta.llama-3.1-405b-instruct`
-- `meta.llama-3.1-70b-instruct`
+- `meta.llama-3.2-90b-vision-instruct` (multimodal)
+- `meta.llama-3.2-11b-vision-instruct` (multimodal)
#### xAI Grok Models
+- `xai.grok-4.3`
- `xai.grok-4.20`
- `xai.grok-4.20-multi-agent`
- `xai.grok-4`
@@ -36,26 +37,31 @@ Check the [OCI Models List](https://docs.oracle.com/en-us/iaas/Content/generativ
- `cohere.command-latest`
- `cohere.command-a-03-2025`
- `cohere.command-a-reasoning-08-2025`
-- `cohere.command-a-vision-07-2025`
+- `cohere.command-a-vision-07-2025` (multimodal)
- `cohere.command-a-translate-08-2025`
- `cohere.command-plus-latest`
-- `cohere.command-r-08-2024`
- `cohere.command-r-plus-08-2024`
+- `cohere.command-r-08-2024`
#### Google Gemini Models (via OCI)
-- `google.gemini-2.5-pro`
-- `google.gemini-2.5-flash`
-- `google.gemini-2.5-flash-lite`
+- `google.gemini-2.5-pro` (multimodal)
+- `google.gemini-2.5-flash` (multimodal)
+- `google.gemini-2.5-flash-lite` (multimodal)
+
+#### OpenAI Open-Source Models (via OCI)
+- `openai.gpt-oss-120b`
+- `openai.gpt-oss-20b`
### Embedding Models
+- `cohere.embed-v4.0` (1536 dimensions, multimodal)
- `cohere.embed-english-v3.0` (1024 dimensions)
- `cohere.embed-english-light-v3.0` (384 dimensions)
- `cohere.embed-multilingual-v3.0` (1024 dimensions)
- `cohere.embed-multilingual-light-v3.0` (384 dimensions)
- `cohere.embed-english-image-v3.0` (1024 dimensions, multimodal)
- `cohere.embed-english-light-image-v3.0` (384 dimensions, multimodal)
+- `cohere.embed-multilingual-image-v3.0` (1024 dimensions, multimodal)
- `cohere.embed-multilingual-light-image-v3.0` (384 dimensions, multimodal)
-- `cohere.embed-v4.0` (1536 dimensions, multimodal)
## Authentication
@@ -73,6 +79,21 @@ Provide individual OCI credentials directly to LiteLLM. Follow the [official Ora
This is the default method for LiteLLM AI Gateway (LLM Proxy) access to OCI GenAI models.
+**Environment Variables**
+
+Instead of passing credentials in code, you can set the following environment variables — LiteLLM will read them automatically:
+
+```bash
+export OCI_REGION="us-chicago-1"
+export OCI_USER="ocid1.user.oc1.."
+export OCI_FINGERPRINT="xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx"
+export OCI_TENANCY="ocid1.tenancy.oc1.."
+export OCI_COMPARTMENT_ID="ocid1.compartment.oc1.."
+# Provide either the private key content OR the path to the key file:
+export OCI_KEY_FILE="/path/to/oci_api_key.pem"
+# export OCI_KEY="-----BEGIN PRIVATE KEY-----\n..."
+```
+
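Before the first request, it can help to confirm that the variables above are actually set. The following is a minimal sketch, not part of LiteLLM: `missing_oci_env` is a hypothetical helper. It assumes `OCI_REGION` may be omitted, since the parameter table gives it a default, and that one of `OCI_KEY` or `OCI_KEY_FILE` is needed for manual auth.

```python
import os
from typing import List, Mapping

# Variable names come from the list above; the helper itself is hypothetical.
REQUIRED_OCI_VARS = ("OCI_USER", "OCI_FINGERPRINT", "OCI_TENANCY", "OCI_COMPARTMENT_ID")
KEY_VARS = ("OCI_KEY", "OCI_KEY_FILE")


def missing_oci_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of OCI variables that are unset or empty."""
    missing = [name for name in REQUIRED_OCI_VARS if not env.get(name)]
    # Either the key content or the key file path is enough.
    if not any(env.get(name) for name in KEY_VARS):
        missing.append("OCI_KEY or OCI_KEY_FILE")
    return missing
```

Calling `missing_oci_env()` with no arguments inspects the current process environment. An empty list means every variable the sketch checks for is present.
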
### Method 2: OCI SDK Signer
Use an OCI SDK `Signer` object for authentication. This method:
- Leverages the official [OCI SDK for signing](https://docs.oracle.com/en-us/iaas/tools/python/latest/api/signing.html)
@@ -220,6 +241,92 @@ print(response)
+## LiteLLM Proxy Usage
+
+Here's how to call OCI GenAI through the LiteLLM Proxy Server.
+
+### 1. Setup config.yaml
+
+```yaml
+model_list:
+  - model_name: oci-grok-4
+    litellm_params:
+      model: oci/xai.grok-4
+      oci_region: os.environ/OCI_REGION
+      oci_user: os.environ/OCI_USER
+      oci_fingerprint: os.environ/OCI_FINGERPRINT
+      oci_tenancy: os.environ/OCI_TENANCY
+      oci_key_file: os.environ/OCI_KEY_FILE
+      oci_compartment_id: os.environ/OCI_COMPARTMENT_ID
+
+  - model_name: oci-cohere-command
+    litellm_params:
+      model: oci/cohere.command-latest
+      oci_region: os.environ/OCI_REGION
+      oci_user: os.environ/OCI_USER
+      oci_fingerprint: os.environ/OCI_FINGERPRINT
+      oci_tenancy: os.environ/OCI_TENANCY
+      oci_key_file: os.environ/OCI_KEY_FILE
+      oci_compartment_id: os.environ/OCI_COMPARTMENT_ID
+```
+
+All supported OCI parameters (authentication and serving mode):
+
+```
+oci_region: Optional[str],
+oci_user: Optional[str],
+oci_fingerprint: Optional[str],
+oci_tenancy: Optional[str],
+oci_key: Optional[str], # private key content as string
+oci_key_file: Optional[str], # path to .pem file
+oci_compartment_id: Optional[str],
+oci_serving_mode: Optional[str], # "ON_DEMAND" (default) or "DEDICATED"
+oci_endpoint_id: Optional[str], # only used with DEDICATED
+```
+
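The relationship between `oci_serving_mode` and `oci_endpoint_id` can be sketched as a small helper. This is an illustrative assumption, not LiteLLM code: `oci_serving_params` is a hypothetical function that only uses the two parameter names and the two mode values listed above.

```python
from typing import Dict, Optional

VALID_SERVING_MODES = ("ON_DEMAND", "DEDICATED")


def oci_serving_params(
    serving_mode: str = "ON_DEMAND", endpoint_id: Optional[str] = None
) -> Dict[str, str]:
    """Build the serving-related keyword arguments for an OCI request."""
    if serving_mode not in VALID_SERVING_MODES:
        raise ValueError(f"oci_serving_mode must be one of {VALID_SERVING_MODES}")
    params = {"oci_serving_mode": serving_mode}
    # The endpoint OCID is only meaningful for dedicated endpoints.
    if serving_mode == "DEDICATED" and endpoint_id:
        params["oci_endpoint_id"] = endpoint_id
    return params
```

The resulting dictionary could then be merged into the other `oci_*` arguments, for example `completion(model=..., messages=..., **oci_serving_params("DEDICATED", endpoint_id))`.
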
+### 2. Start the proxy
+
+```bash
+litellm --config /path/to/config.yaml
+```
+
+### 3. Test it
+
+
+With curl:
+
+```shell
+curl --location 'http://0.0.0.0:4000/chat/completions' \
+--header 'Content-Type: application/json' \
+--data '{
+    "model": "oci-grok-4",
+    "messages": [
+        {"role": "user", "content": "what llm are you"}
+    ]
+}'
+```
+
+
+With the OpenAI Python SDK:
+
+```python
+import openai
+
+client = openai.OpenAI(
+    api_key="anything",
+    base_url="http://0.0.0.0:4000"
+)
+
+response = client.chat.completions.create(
+    model="oci-grok-4",
+    messages=[{"role": "user", "content": "write a short poem"}],
+)
+print(response)
+```
+
+
+
+
## Usage - Streaming
Just set `stream=True` when calling completion.
@@ -411,20 +518,202 @@ response = completion(
)
```
+## Usage - Function Calling / Tool Calling
+
+OCI GenAI supports OpenAI-compatible function calling. LiteLLM normalizes the request and response shape so the same code that targets OpenAI works with OCI Cohere and Generic (xAI Grok, Meta Llama, Google Gemini) models.
+
+
+Using the LiteLLM SDK:
+
+```python
+from litellm import completion
+
+tools = [
+    {
+        "type": "function",
+        "function": {
+            "name": "get_current_weather",
+            "description": "Get the current weather in a given location",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "location": {
+                        "type": "string",
+                        "description": "The city and state, e.g. San Francisco, CA",
+                    },
+                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
+                },
+                "required": ["location"],
+            },
+        },
+    }
+]
+
+response = completion(
+    model="oci/xai.grok-4",
+    messages=[{"role": "user", "content": "What's the weather in Boston today?"}],
+    tools=tools,
+    tool_choice="auto",
+    # Placeholder credentials: replace the empty strings with your own OCI values.
+    oci_region="us-chicago-1",
+    oci_user="",
+    oci_fingerprint="",
+    oci_tenancy="",
+    oci_key_file="",
+    oci_compartment_id="",
+)
+
+# Inspect the tool call
+print(response.choices[0].message.tool_calls)
+```
+
+
+Using the LiteLLM Proxy with the OpenAI Python SDK:
+
+```python
+import openai
+
+client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
+
+response = client.chat.completions.create(
+    model="oci-grok-4",
+    messages=[{"role": "user", "content": "What's the weather in Boston today?"}],
+    tools=[
+        {
+            "type": "function",
+            "function": {
+                "name": "get_current_weather",
+                "description": "Get the current weather in a given location",
+                "parameters": {
+                    "type": "object",
+                    "properties": {
+                        "location": {"type": "string"},
+                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
+                    },
+                    "required": ["location"],
+                },
+            },
+        }
+    ],
+    tool_choice="auto",
+)
+print(response.choices[0].message.tool_calls)
+```
+
+
+
+
+Tool calling works with both Cohere (`cohere.command-*`) and Generic (`xai.grok-*`, `meta.llama-*`, `google.gemini-*`) model families — LiteLLM adapts the OpenAI tool schema to each vendor's native format internally.
+
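Once the model returns a tool call, the application still has to run the function and send the result back. The sketch below shows one way to do that, assuming the standard OpenAI tool-call shape (`id`, `function.name`, and `function.arguments` as a JSON string). `get_current_weather`, `TOOL_REGISTRY`, and `run_tool_call` are hypothetical names used for illustration, and the weather values are made up.

```python
import json
from typing import Any, Callable, Dict


def get_current_weather(location: str, unit: str = "fahrenheit") -> Dict[str, Any]:
    """Hypothetical stand-in for a real weather lookup."""
    return {"location": location, "temperature": "72", "unit": unit}


# Maps the tool names advertised to the model onto local Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_current_weather": get_current_weather,
}


def run_tool_call(tool_call_id: str, name: str, arguments: str) -> Dict[str, str]:
    """Execute one tool call and build the `tool` message to send back."""
    func = TOOL_REGISTRY[name]
    kwargs = json.loads(arguments or "{}")  # arguments arrive as a JSON string
    result = func(**kwargs)
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps(result),
    }
```

For each `tc` in `response.choices[0].message.tool_calls`, one would call `run_tool_call(tc.id, tc.function.name, tc.function.arguments)`, append the assistant message and the returned tool message to `messages`, and then call `completion` again so the model can produce its final answer.
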
+## Usage - Vision / Multimodal
+
+OCI GenAI exposes vision-capable models that accept images alongside text. Pass images using the standard OpenAI `image_url` content block.
+
+```python
+from litellm import completion
+
+response = completion(
+    model="oci/meta.llama-4-maverick-17b-128e-instruct-fp8",
+    messages=[
+        {
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "What is in this image?"},
+                {
+                    "type": "image_url",
+                    "image_url": {
+                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
+                    },
+                },
+            ],
+        }
+    ],
+    # Placeholder credentials: replace the empty strings with your own OCI values.
+    oci_region="us-chicago-1",
+    oci_user="",
+    oci_fingerprint="",
+    oci_tenancy="",
+    oci_key_file="",
+    oci_compartment_id="",
+)
+print(response.choices[0].message.content)
+```
+
+Vision-capable models on OCI include:
+
+- `meta.llama-4-maverick-17b-128e-instruct-fp8`
+- `meta.llama-4-scout-17b-16e-instruct`
+- `meta.llama-3.2-11b-vision-instruct`
+- `meta.llama-3.2-90b-vision-instruct`
+- `cohere.command-a-vision-07-2025`
+- `google.gemini-2.5-pro`, `google.gemini-2.5-flash`, `google.gemini-2.5-flash-lite`
+
+Both image URLs and base64-encoded data URIs are supported.
+
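For local images, the base64 data URI form can be built with the standard library. This is a minimal sketch: `to_data_uri` and `image_message` are hypothetical helper names, and the caller is assumed to know the image's MIME type.

```python
import base64
from typing import Any, Dict


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> Dict[str, Any]:
    """Build a user message with a text part and an inline image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": to_data_uri(image_bytes, mime_type)}},
        ],
    }
```

The returned dictionary can be passed as the single entry of `messages`, for example after reading the bytes with `open("photo.jpg", "rb").read()`.
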
+## Usage - Reasoning / Thinking
+
+OCI Generic-vendor models (xAI Grok reasoning variants, Google Gemini, etc.) support a reasoning step. LiteLLM exposes this via the OpenAI-compatible `reasoning_effort` parameter — accepted values are `"low"`, `"medium"`, `"high"`, and `"disable"` (mapped to OCI's `NONE`).
+
+Returned reasoning tokens are surfaced on `usage.completion_tokens_details.reasoning_tokens`, matching the OpenAI shape.
+
+
+Using the LiteLLM SDK:
+
+```python
+from litellm import completion
+
+response = completion(
+    model="oci/xai.grok-3-mini",
+    messages=[{"role": "user", "content": "If 3x + 7 = 22, what is x? Show your reasoning."}],
+    reasoning_effort="high",  # "low" | "medium" | "high" | "disable"
+    # Placeholder credentials: replace the empty strings with your own OCI values.
+    oci_region="us-chicago-1",
+    oci_user="",
+    oci_fingerprint="",
+    oci_tenancy="",
+    oci_key_file="",
+    oci_compartment_id="",
+)
+
+print(response.choices[0].message.content)
+print("Reasoning tokens:", response.usage.completion_tokens_details.reasoning_tokens)
+```
+
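+
+Using the LiteLLM Proxy with the OpenAI Python SDK (this assumes `oci-grok-mini` is a `model_name` alias in your `config.yaml` that points to `oci/xai.grok-3-mini`):
+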
+```python
+import openai
+
+client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")
+
+response = client.chat.completions.create(
+    model="oci-grok-mini",
+    messages=[{"role": "user", "content": "If 3x + 7 = 22, what is x?"}],
+    reasoning_effort="high",
+)
+print(response.choices[0].message.content)
+```
+
+
+
+
+:::note
+`reasoning_effort` is only honored on Generic-vendor reasoning models (e.g., `xai.grok-3-mini`, `xai.grok-4`, `google.gemini-2.5-pro`). It is silently ignored for OCI Cohere models, which are not reasoning models.
+:::
+
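Because `reasoning_effort` is ignored on some models, code that reads `usage.completion_tokens_details.reasoning_tokens` may want to tolerate a missing field. The sketch below assumes that the details object or the count can be absent or `None` in that case. `reasoning_token_count` is a hypothetical helper, and the response object here is a stand-in built with `SimpleNamespace`, not a real LiteLLM response.

```python
from types import SimpleNamespace
from typing import Any


def reasoning_token_count(response: Any) -> int:
    """Return the reported reasoning token count, or 0 if it is not present."""
    usage = getattr(response, "usage", None)
    details = getattr(usage, "completion_tokens_details", None)
    return getattr(details, "reasoning_tokens", None) or 0


# Stand-in objects that mimic the attribute layout described above.
with_reasoning = SimpleNamespace(
    usage=SimpleNamespace(
        completion_tokens_details=SimpleNamespace(reasoning_tokens=128)
    )
)
without_details = SimpleNamespace(usage=SimpleNamespace(completion_tokens_details=None))

print(reasoning_token_count(with_reasoning))
print(reasoning_token_count(without_details))
```

Treating a missing count as zero keeps logging and cost tracking code the same across reasoning and non-reasoning models.
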
## Optional Parameters
-| Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
-| `oci_region` | string | `us-ashburn-1` | OCI region where the GenAI service is deployed |
-| `oci_serving_mode` | string | `ON_DEMAND` | Service mode: `ON_DEMAND` for managed models or `DEDICATED` for dedicated endpoints |
-| `oci_endpoint_id` | string | Same as `model` | (For DEDICATED mode) The OCID of your dedicated endpoint |
-| `oci_compartment_id` | string | **Required** | The OCID of the OCI compartment containing your resources |
-| `oci_user` | string | - | (Manual auth) The OCID of the OCI user |
-| `oci_fingerprint` | string | - | (Manual auth) The fingerprint of the API signing key |
-| `oci_tenancy` | string | - | (Manual auth) The OCID of your OCI tenancy |
-| `oci_key` | string | - | (Manual auth) The private key content as a string |
-| `oci_key_file` | string | - | (Manual auth) Path to the private key file |
-| `oci_signer` | object | - | (SDK auth) OCI SDK Signer object for authentication |
+| Parameter | Type | Default | Environment Variable | Description |
+|-----------|------|---------|----------------------|-------------|
+| `oci_region` | string | `us-ashburn-1` | `OCI_REGION` | OCI region where the GenAI service is deployed |
+| `oci_serving_mode` | string | `ON_DEMAND` | – | Service mode: `ON_DEMAND` for managed models or `DEDICATED` for dedicated endpoints |
+| `oci_endpoint_id` | string | Same as `model` | – | (For DEDICATED mode) The OCID of your dedicated endpoint |
+| `oci_compartment_id` | string | **Required** | `OCI_COMPARTMENT_ID` | The OCID of the OCI compartment containing your resources |
+| `oci_user` | string | – | `OCI_USER` | (Manual auth) The OCID of the OCI user |
+| `oci_fingerprint` | string | – | `OCI_FINGERPRINT` | (Manual auth) The fingerprint of the API signing key |
+| `oci_tenancy` | string | – | `OCI_TENANCY` | (Manual auth) The OCID of your OCI tenancy |
+| `oci_key` | string | – | `OCI_KEY` | (Manual auth) The private key content as a string |
+| `oci_key_file` | string | – | `OCI_KEY_FILE` | (Manual auth) Path to the private key file |
+| `oci_signer` | object | – | – | (SDK auth) OCI SDK Signer object for authentication |
+| `reasoning_effort` | string | – | – | Reasoning level for Generic-vendor reasoning models: `low`, `medium`, `high`, `disable` |
## Embeddings