Commit 39b8697

revisited method parameters
1 parent d92fe4d commit 39b8697

11 files changed

Lines changed: 1050 additions & 1 deletion

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -160,3 +160,5 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+
+.DS_Store

README.md

Lines changed: 97 additions & 1 deletion
@@ -1 +1,97 @@
-# needle-python
+# Needle Python Library
+
+This Python library provides convenient access to the Needle API. It offers methods and data types that we believe will help you explore the Needle API quickly. Some functionality may be available in the REST API earlier than in this Python library. In any case, we recommend taking a look at the complete [documentation](https://docs.needle-ai.com). Thank you for flying with us. 🚀
+
+## Installation
+
+This library requires Python >3.8 and `pip` to use. You don't need the sources unless you want to modify them. Install with:
+
+```
+pip install needle-python
+```
+
+## Usage ⚡️
+
+To get started, generate an API key for your account in the developer settings menu at [Needle](https://needle-ai.com). Note that your key will be valid until you revoke it. Set the following environment variable before you run your code:
+
+```
+export NEEDLE_API_KEY=<your-api-key>
+```
+
+`NeedleClient` reads the API key from the environment by default. If you would like to override this behaviour, you can pass the key in as a parameter.
+
+### Retrieve context from Needle
+
```python
26+
from needle.v1 import NeedleClient
27+
from needle.v1.models import FileToAdd
28+
29+
30+
ndl = NeedleClient()
31+
collection = ndl.collections.create(name="Tech Trends")
32+
33+
# add file to collection
34+
files = ndl.collections.files.add(
35+
collection_id=collection_id,
36+
files=[
37+
FileToAdd(
38+
name="tech-radar-30.pdf",
39+
url="https://www.thoughtworks.com/content/dam/thoughtworks/documents/radar/2024/04/tr_technology_radar_vol_30_en.pdf",
40+
)
41+
],
42+
)
43+
44+
# wait until indexing is complete
45+
files = ndl.collections.files.list(collection_id)
46+
if not all(f.status == "indexed" for f in files):
47+
time.sleep(5)
48+
files = ndl.collections.files.list(collection_id)
49+
50+
# retrieve relevant context
51+
prompt = "What techniques moved into adopt in this volume of technology radar?"
52+
results = ndl.collections.search(collection_id, text=prompt)
53+
```
+
+Needle instantly extracts key points from your files.
+
+### Complete your RAG pipeline
+
+Naturally, to compose a human-friendly answer, use an LLM provider of your choice. For demo purposes, we used OpenAI in this example:
+
+```python
+from openai import OpenAI
+
+system_messages = [{"role": "system", "content": r.content} for r in results]  # results from Needle
+user_message = {
+    "role": "user",
+    "content": f"""
+    Do not hallucinate. Only answer the question based on the provided results data.
+    If there is no data in the provided data for the question, do not try to generate an answer that does not make sense.
+    This is the question: {prompt}
+    """,
+}
+
+openai_client = OpenAI()
+answer = openai_client.chat.completions.create(
+    model="gpt-3.5-turbo",
+    messages=[
+        *system_messages,
+        user_message,
+    ],
+)
+
+print(answer.choices[0].message.content)
+# -> Retrieval-Augmented Generation (RAG) is the technique that moved into "Adopt" in this volume of the Technology Radar.
+```
+
+This is one basic example of a RAG pipeline you can quickly implement using Needle and OpenAI. Feel free to engineer more precise prompts and explore other prompting techniques such as chain-of-thought (CoT), graph of thoughts (GoT), etc.
+
+The Needle API helps you with hassle-free contextualization but does not limit you to a certain RAG technique. Let us know what you build in our [Discord channel](https://discord.gg/JzJcHgTyZx) :)
+
+## Exceptions 🧨
+
+If a request to the Needle API fails, a `needle.v1.models.Error` object will be raised. There you can see a `message` and more details about the error.
+
+## Support 📞
+
+If you have questions, you can contact us in our [Discord channel](https://discord.gg/JzJcHgTyZx).
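
The README's wait-for-indexing step polls the file list until every file reports `indexed`. A bounded version of that loop could look like the sketch below. The helper `wait_until_indexed`, its parameters, the `"pending"` placeholder status, and the stub file objects are all hypothetical and not part of the library; only the `status == "indexed"` check comes from the README.

```python
import time
from types import SimpleNamespace
from typing import Callable, List


def wait_until_indexed(
    list_files: Callable[[], List],
    poll_interval: float = 5.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List:
    """Poll list_files() until every file is "indexed" or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        files = list_files()
        if all(f.status == "indexed" for f in files):
            return files
        if time.monotonic() >= deadline:
            raise TimeoutError("files were not indexed in time")
        sleep(poll_interval)


# Stub standing in for ndl.collections.files.list(collection_id):
# the file is reported as indexed on the third call.
calls = {"n": 0}


def fake_list():
    calls["n"] += 1
    status = "indexed" if calls["n"] >= 3 else "pending"  # "pending" is a placeholder
    return [SimpleNamespace(name="tech-radar-30.pdf", status=status)]


files = wait_until_indexed(fake_list, poll_interval=0.0)
print(files[0].status, calls["n"])  # indexed 3
```

With the real client, the callable would presumably wrap `ndl.collections.files.list(...)` in a lambda. Passing `sleep` as a parameter keeps the helper easy to test without real delays.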

needle/__init__.py

Whitespace-only changes.

needle/utils.py

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+from urllib.parse import urlparse, urlunparse
+
+
+def make_needle_search_url(needle_url: str):
+    parsed_url = urlparse(needle_url)
+    new_netloc = f"search.{parsed_url.netloc}"
+    return urlunparse(parsed_url._replace(netloc=new_netloc))
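
To see what this helper produces, the sketch below copies the function and runs it on two inputs. The `localhost` URL is an illustrative assumption, not something the commit uses.

```python
from urllib.parse import urlparse, urlunparse


def make_needle_search_url(needle_url: str) -> str:
    # Same logic as needle/utils.py: prefix the host with "search."
    parsed_url = urlparse(needle_url)
    new_netloc = f"search.{parsed_url.netloc}"
    return urlunparse(parsed_url._replace(netloc=new_netloc))


print(make_needle_search_url("https://needle-ai.com"))
# https://search.needle-ai.com
print(make_needle_search_url("http://localhost:8000/api"))
# http://search.localhost:8000/api
```

The second result shows why the client also accepts an explicit `_search_url`: a derived `search.` host is not always the right one, for example during local development.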

needle/v1/__init__.py

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+"""
+This module provides the NeedleClient class for interacting with the Needle API.
+"""
+
+from typing import Optional
+import os
+
+from needle.utils import make_needle_search_url
+from needle.v1.models import (
+    NeedleConfig,
+    NeedleBaseClient,
+)
+from needle.v1.collections import NeedleCollections
+
+
+NEEDLE_DEFAULT_URL = "https://needle-ai.com"
+
+
+class NeedleClient(NeedleBaseClient):
+    """
+    A client for interacting with the Needle API.
+
+    This class provides a high-level interface for interacting with the Needle API,
+    including managing collections and performing searches.
+
+    Initialize the client with an API key and an optional URL.
+    If no API key is provided, the client will use the `NEEDLE_API_KEY` environment variable.
+    If no URL is provided, the client will use the default Needle API URL, that is https://needle-ai.com.
+
+    Attributes:
+        collections (NeedleCollections): A client for managing collections within the Needle API.
+    """
+
+    def __init__(
+        self,
+        api_key: Optional[str] = os.environ.get("NEEDLE_API_KEY"),
+        url: Optional[str] = NEEDLE_DEFAULT_URL,
+        _search_url: Optional[str] = None,
+    ):
+        if not _search_url:
+            _search_url = make_needle_search_url(url)
+
+        config = NeedleConfig(api_key, url, search_url=_search_url)
+        headers = {"x-api-key": config.api_key}
+        super().__init__(config, headers)
+
+        # sub clients
+        self.collections = NeedleCollections(config, headers)
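
One detail of the `api_key` default is worth noting: Python evaluates default argument values once, when the function is defined. `os.environ.get("NEEDLE_API_KEY")` is therefore read when `needle.v1` is imported, not when `NeedleClient()` is called. The standalone sketch below illustrates the difference. The function names and the `DEMO_API_KEY` variable are hypothetical and exist only for this demonstration.

```python
import os
from typing import Optional

os.environ.pop("DEMO_API_KEY", None)  # start with the variable unset


def eager_key(api_key: Optional[str] = os.environ.get("DEMO_API_KEY")) -> Optional[str]:
    # The default was computed once, at definition time.
    return api_key


def lazy_key(api_key: Optional[str] = None) -> Optional[str]:
    # The environment is consulted on every call instead.
    if api_key is None:
        api_key = os.environ.get("DEMO_API_KEY")
    return api_key


os.environ["DEMO_API_KEY"] = "set-after-definition"
print(eager_key())  # None
print(lazy_key())   # set-after-definition
```

If the key is only set after the import (for example by a `.env` loader), the `None` sentinel pattern in `lazy_key` would pick it up, while the eager default would not.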

needle/v1/collections/__init__.py

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+"""
+This module provides the NeedleCollections class for interacting with the Needle API's collections endpoint.
+"""
+
+from typing import Optional
+
+import requests
+
+from needle.v1.models import (
+    NeedleConfig,
+    NeedleBaseClient,
+    Collection,
+    Error,
+    SearchResult,
+)
+from needle.v1.collections.files import NeedleCollectionsFiles
+
+
+class NeedleCollections(NeedleBaseClient):
+    """
+    A client for interacting with the Needle API's collections endpoint.
+
+    This class provides methods to create and manage collections within the Needle API.
+    It uses a requests session to handle HTTP requests with a default timeout of 120 seconds.
+    """
+
+    def __init__(self, config: NeedleConfig, headers: dict):
+        super().__init__(config, headers)
+
+        self.endpoint = f"{config.url}/api/v1/collections"
+        self.search_endpoint = f"{config.search_url}/api/v1/collections"
+
+        # requests config
+        self.session = requests.Session()
+        self.session.headers.update(headers)
+        self.session.timeout = 120
+
+        # sub clients
+        self.files = NeedleCollectionsFiles(config, headers)
+
+    def create(self, name: str, file_ids: Optional[list[str]] = None):
+        """
+        Creates a new collection with the specified name and file IDs.
+
+        Args:
+            name (str): The name of the collection.
+            file_ids (Optional[list[str]]): A list of file IDs to include in the collection.
+
+        Returns:
+            Collection: The created collection object.
+
+        Raises:
+            Error: If the API request fails.
+        """
+        req_body = {"name": name, "file_ids": file_ids}
+        resp = self.session.post(
+            f"{self.endpoint}",
+            json=req_body,
+        )
+        body = resp.json()
+        if resp.status_code >= 400:
+            error = body.get("error")
+            raise Error(**error)
+        c = body.get("result")
+        return Collection(**c)
+
+    def get(self, collection_id: str):
+        """
+        Retrieves a collection by its ID.
+
+        Args:
+            collection_id (str): The ID of the collection to retrieve.
+
+        Returns:
+            Collection: The retrieved collection object.
+
+        Raises:
+            Error: If the API request fails.
+        """
+        resp = self.session.get(f"{self.endpoint}/{collection_id}")
+        body = resp.json()
+        if resp.status_code >= 400:
+            error = body.get("error")
+            raise Error(**error)
+        c = body.get("result")
+        return Collection(**c)
+
+    def list(self):
+        """
+        Lists all collections.
+
+        Returns:
+            list[Collection]: A list of all collections.
+
+        Raises:
+            Error: If the API request fails.
+        """
+        resp = self.session.get(self.endpoint)
+        body = resp.json()
+        if resp.status_code >= 400:
+            error = body.get("error")
+            raise Error(**error)
+        return [Collection(**c) for c in body.get("result")]
+
+    def search(self, collection_id: str, text: str):
+        """Searches within a collection for the given text.
+
+        Args:
+            collection_id (str): The ID of the collection to search.
+            text (str): The query text to search for.
+
+        Returns:
+            list[SearchResult]: The search results.
+
+        Raises:
+            Error: If the API request fails.
+        """
+        endpoint = f"{self.search_endpoint}/{collection_id}/search"
+        req_body = {"text": text}
+        resp = self.session.post(endpoint, headers=self.headers, json=req_body)
+        body = resp.json()
+        if resp.status_code >= 400:
+            error = body.get("error")
+            raise Error(**error)
+        return [SearchResult(**c) for c in body.get("result")]
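
Every method above repeats the same steps: parse the JSON body, raise from `body["error"]` on a status of 400 or higher, otherwise read `body["result"]`. That shared step could be factored into one helper, as in the sketch below. `ApiError` is a hypothetical stand-in for `needle.v1.models.Error` (whose fields are not shown in this commit view), and `unwrap` is an illustrative name, not a library function.

```python
from typing import Any, Dict


class ApiError(Exception):
    """Hypothetical stand-in for needle.v1.models.Error."""

    def __init__(self, message: str, **details: Any):
        super().__init__(message)
        self.message = message
        self.details = details


def unwrap(status_code: int, body: Dict[str, Any]) -> Any:
    """Return body["result"], or raise ApiError built from body["error"]."""
    if status_code >= 400:
        # Fall back to a generic message if the body carries no "error" object.
        error = body.get("error") or {"message": f"HTTP {status_code}"}
        raise ApiError(**error)
    return body.get("result")


print(unwrap(200, {"result": [{"id": "c1"}]}))  # [{'id': 'c1'}]

try:
    unwrap(404, {"error": {"message": "collection not found"}})
except ApiError as e:
    print(e.message)  # collection not found
```

The fallback matters because `Error(**error)` would itself fail with a `TypeError` if a failing response had no `error` key.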

needle/v1/collections/files.py

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+"""
+This module provides the NeedleCollectionsFiles class for interacting with
+the Needle API's collection files endpoint.
+"""
+
+from dataclasses import asdict
+import requests
+
+from needle.v1.models import NeedleBaseClient, NeedleConfig
+from needle.v1.models import FileToAdd, Error, CollectionFile
+
+
+class NeedleCollectionsFiles(NeedleBaseClient):
+    """
+    A client for interacting with the Needle API's collection files endpoint.
+
+    This class provides methods to create and manage collection files within the Needle API.
+    It uses a requests session to handle HTTP requests with a default timeout of 120 seconds.
+    """
+
+    def __init__(self, config: NeedleConfig, headers: dict):
+        super().__init__(config, headers)
+
+        self.collections_endpoint = f"{config.url}/api/v1/collections"
+
+        # requests config
+        self.session = requests.Session()
+        self.session.headers.update(headers)
+        self.session.timeout = 120
+
+    def add(self, collection_id: str, files: list[FileToAdd]):
+        """
+        Adds files to a specified collection. Added files are indexed automatically and then become available for search within the collection.
+
+        Args:
+            collection_id (str): The ID of the collection to which files will be added.
+            files (list[FileToAdd]): A list of FileToAdd objects representing the files to be added.
+
+        Returns:
+            list[CollectionFile]: A list of CollectionFile objects representing the added files.
+
+        Raises:
+            Error: If the API request fails.
+        """
+
+        endpoint = f"{self.collections_endpoint}/{collection_id}/files"
+        req_body = {"files": [asdict(f) for f in files]}
+        resp = self.session.post(endpoint, json=req_body)
+        body = resp.json()
+        if resp.status_code >= 400:
+            error = body.get("error")
+            raise Error(**error)
+        return [CollectionFile(**cf) for cf in body.get("result")]
+
+    def list(self, collection_id: str):
+        """
+        Lists all files in a specified collection.
+
+        Args:
+            collection_id (str): The ID of the collection whose files will be listed.
+
+        Returns:
+            list[CollectionFile]: A list of CollectionFile objects representing the files in the collection.
+
+        Raises:
+            Error: If the API request fails.
+        """
+        endpoint = f"{self.collections_endpoint}/{collection_id}/files"
+        resp = self.session.get(endpoint)
+        body = resp.json()
+        if resp.status_code >= 400:
+            error = body.get("error")
+            raise Error(**error)
+        return [CollectionFile(**cf) for cf in body.get("result")]
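
Both client classes set `self.session.timeout = 120`, but `requests` does not read a `timeout` attribute from a `Session`. The timeout has to be passed on each request, so the assignment has no effect. One common workaround is a small `Session` subclass that fills in a default, sketched below. `TimeoutSession` and `default_timeout` are hypothetical names, not part of `requests` or of this library.

```python
import requests


class TimeoutSession(requests.Session):
    """Session that applies a default timeout to every request (hypothetical helper)."""

    def __init__(self, default_timeout: float = 120.0):
        super().__init__()
        self.default_timeout = default_timeout

    def request(self, method, url, **kwargs):
        # Only fill in the timeout when the caller did not pass one explicitly.
        kwargs.setdefault("timeout", self.default_timeout)
        return super().request(method, url, **kwargs)


session = TimeoutSession(default_timeout=120.0)
session.headers.update({"x-api-key": "<your-api-key>"})
# session.get(...) and session.post(...) now use timeout=120.0 unless overridden.
```

Because `Session.get` and `Session.post` both delegate to `Session.request`, overriding that one method covers every call the clients make.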
