feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support (#718)
* feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support
Adds OciClient (V1 API) and OciClientV2 (V2 API) for the OCI Generative
AI service, following the BedrockClient pattern with httpx event hooks.
Authentication: config file, custom profiles, session tokens, direct
credentials, instance principal, resource principal.
API coverage: embed (all models), chat with streaming (OciClient for
Command R family, OciClientV2 for Command A). Lazy-loads oci SDK as an
optional dependency; install with `pip install cohere[oci]`.
* fix: address review feedback — remove stale model names and fix test profile
- README: remove specific model names from Supported APIs and Model
Availability sections (per mkozakov review — will go out of date)
- tests: default OCI_PROFILE to DEFAULT instead of API_KEY_AUTH
* fix: remove dead chat_stream endpoint and body-based stream detection
The `"stream" in endpoint` check was dead code: both the V1 and V2 SDKs always
route through the "chat" endpoint (the v1/chat and v2/chat paths). Streaming is
reliably signalled via body["stream"], which the SDK always sets.
- Drop the `"stream" in endpoint` guard from is_stream and isStream detection
- Remove "chat_stream" from action_map, transform, and response branches
- Update unit tests to use the "chat" endpoint (the only real one)
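The resulting detection logic can be sketched as (an illustrative helper, not the SDK's actual function):

```python
def is_stream_request(endpoint: str, body: dict) -> bool:
    # Both V1 and V2 route through the "chat" endpoint, so streaming is
    # signalled solely by body["stream"] rather than by the endpoint name.
    return endpoint == "chat" and bool(body.get("stream"))
```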
* fix: don't trigger content-type transition on finish-only stream events
_current_content_type now returns None for events with no message content
(e.g. {"finishReason": "COMPLETE"}). The transition branch in
_transform_v2_event is skipped when event_content_type is None, so a
finish-only event after a thinking block no longer opens a spurious empty
text block before emitting content-end.
* test: add integration tests for light models, command-r-plus, multi-turn, and system message
* fix: use 'or []' to guard against explicit content=None in V2 messages
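The distinction matters because `dict.get` with a default does not help when the key is present but explicitly None; a sketch:

```python
def normalize_content(message: dict) -> list:
    # message.get("content", []) would still return None if the caller
    # passed content=None explicitly; `or []` covers both cases.
    return message.get("content") or []
```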
* Address cursor review: raise on unsupported endpoint, refresh session token per-request
- transform_request_to_oci now raises ValueError for endpoints other than
'embed' and 'chat' instead of silently returning the untransformed body
- Session token auth uses a refreshing wrapper that re-reads the token file
before each signing call, so OCI CLI token refreshes are picked up without
restarting the client
- Add test_unsupported_endpoint_raises to cover the new explicit error
- Update test_session_auth_prefers_security_token_signer to expect multi-call
behaviour from the refreshing signer
* Add test proving session token is re-read on subsequent requests
test_session_token_refreshed_on_subsequent_requests writes a real token file,
makes two requests with the file updated between them, and asserts that the
second signing call uses the new token — verifying the refreshing signer works
end-to-end.
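The refresh-per-request behaviour can be sketched with a minimal wrapper (the class and callback names here are illustrative; the real client wraps the OCI SDK's security token signer):

```python
from pathlib import Path


class RefreshingTokenSigner:
    """Re-reads the session token file before every signing call, so OCI
    CLI token refreshes are picked up without restarting the client."""

    def __init__(self, token_path, make_signer):
        self._token_path = Path(token_path)
        self._make_signer = make_signer  # builds a signer from a token string

    def sign(self, request):
        # Fresh read on every call: a token rotated on disk is used
        # by the very next request.
        token = self._token_path.read_text().strip()
        return self._make_signer(token).sign(request)
```

The integration test described above follows the same shape: write a token file, sign once, rewrite the file, sign again, and assert the second call saw the new token.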
    oci_profile="MY_SESSION_PROFILE",  # Profile with security_token_file
    oci_region="us-chicago-1",
    oci_compartment_id="ocid1.compartment.oc1...",
)
```

**4. Direct Credentials**

```python
co = cohere.OciClient(
    oci_user_id="ocid1.user.oc1...",
    oci_fingerprint="xx:xx:xx:...",
    oci_tenancy_id="ocid1.tenancy.oc1...",
    oci_private_key_path="~/.oci/key.pem",
    oci_region="us-chicago-1",
    oci_compartment_id="ocid1.compartment.oc1...",
)
```

**5. Instance Principal (for OCI Compute instances)**

```python
co = cohere.OciClient(
    auth_type="instance_principal",
    oci_region="us-chicago-1",
    oci_compartment_id="ocid1.compartment.oc1...",
)
```

### Supported OCI APIs

The OCI client supports the following Cohere APIs:

- **Embed**: full support for all embedding models
- **Chat**: full support with both the V1 (`OciClient`) and V2 (`OciClientV2`) APIs
  - Streaming is available via `chat_stream()`
  - Supports the Command R and Command A model families

### OCI Model Availability and Limitations

**Available on OCI On-Demand Inference:**

- ✅ **Embed models**: available on OCI Generative AI
- ✅ **Chat models**: available via `OciClient` (V1) and `OciClientV2` (V2)

**Not Available on OCI On-Demand Inference:**

- ❌ **Generate API**: OCI TEXT_GENERATION models are base models that require fine-tuning before deployment
- ❌ **Rerank API**: OCI TEXT_RERANK models are base models that require fine-tuning before deployment
- ❌ **Multiple embedding types**: OCI on-demand models support only one embedding type per request (you cannot request both `float` and `int8` simultaneously)

**Note**: To use Generate or Rerank models on OCI, you need to:

1. Fine-tune the base model using OCI's fine-tuning service
2. Deploy the fine-tuned model to a dedicated endpoint
3. Update your code to use the deployed model endpoint

For the latest model availability, see the [OCI Generative AI documentation](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm).

## Contributing
While we value open-source contributions to this SDK, the code is generated programmatically. Additions made directly would have to be moved over to our generation code, otherwise they would be overwritten upon the next generated release. Feel free to open a PR as a proof of concept, but know that we will not be able to merge it as-is. We suggest opening an issue first to discuss with us!