@@ -36,17 +36,17 @@ Potential usages include sanitizing of dataset strings (e.g. a collection of soc
<hr/>
-
## Running Locally with Poetry
-
This project uses Poetry. To run this project, install `poetry` and proceed to follow the instructions under `/docs/LOCAL_SETUP.md`.
+
## Running Locally with uv
+
This project uses `uv` for dependency management. To run this project, install `uv` and proceed to follow the instructions under `/docs/LOCAL_SETUP.md`.
-
`Note: This project has only been tested with Ubuntu and MacOS and with Python versions 3.9 and 3.10. You may need to upgrade pip ahead of installation.`
+
`Note: This project has only been tested with Ubuntu and MacOS and with Python versions 3.11 and 3.12. You may need to upgrade pip ahead of installation.`
## Installing with PIP
-
Video capture of install provided in LOCAL_SETUP.md file. Make sure you set up a virtual environment with either python 3.9 or 3.10 and upgrade pip with:
+
Video capture of install provided in LOCAL_SETUP.md file. Make sure you set up a virtual environment with either python 3.11 or 3.12 and upgrade pip with:
```bash
pip install --upgrade pip
-
pip install -U pip setuptools wheel  # only needed if you haven't already done so
+
pip install -U pip uv  # only needed if you haven't already done so
```
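The tested-version note above can be checked before installing. The following is a minimal sketch, not part of the project: the helper name `is_tested_python` and the `TESTED_VERSIONS` set are hypothetical, and the versions are simply those listed in the note.

```python
import sys

# Versions the note above lists as tested (assumption: taken from the README text).
TESTED_VERSIONS = {(3, 11), (3, 12)}


def is_tested_python(version_info=sys.version_info):
    """Return True if the major.minor version is one of the tested versions."""
    return (version_info[0], version_info[1]) in TESTED_VERSIONS


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    if is_tested_python():
        print(f"Python {current} is one of the tested versions.")
    else:
        print(f"Python {current} is untested here; 3.11 or 3.12 is suggested.")
```

Running this inside the virtual environment confirms which interpreter `pip` will install into.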
Before adding `pii-codex` on your project, download the spaCy `en_core_web_lg` model:
`Note: The extras installed with pii-codex[detections] are the spaCy, Microsoft Presidio Analyzer, and Microsoft Presidio Anonymizer packages.`
-
Using Poetry:
+
Using uv:
```bash
-
poetry update
-
poetry add pii-codex
-
poetry install pii-codex --extras="detections"
+
uv sync
+
uv add pii-codex
+
uv add "pii-codex[detections]"
```
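After installing with the `detections` extra, it can help to confirm that the optional packages are importable. This is a rough sketch under stated assumptions: the helper `missing_modules` is hypothetical, and the import names `spacy`, `presidio_analyzer`, and `presidio_anonymizer` are assumed to be the top-level modules of the packages the extra installs.

```python
from importlib.util import find_spec

# Assumed top-level import names for the packages pulled in by the extra.
DETECTION_MODULES = ("spacy", "presidio_analyzer", "presidio_anonymizer")


def missing_modules(names=DETECTION_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing optional modules:", ", ".join(missing))
    else:
        print("All detection modules are importable.")
```

An empty list suggests the extra was installed into the active environment.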
For those using Google Colab, check out the example notebook:
@@ -162,7 +162,7 @@ For more information on usage, check out the respective documentation for guidan
<hr/>
## Attributions
-
This project benefited greatly from a number of PII research works like that from Milne et al (2016) with the definition of the types and categories, Schwartz and Solove (2012) with the severity levels of Non-Identifiable, Semi-Identifiable, and Identifiable, and the documentation by NIST, DHS (2012), and HIPAA (full list of foundational publications provided below). A special thanks to all the open source projects, and frameworks that made the setup and structuring of this project much easier like Poetry, Microsoft Presidio, spaCy (2017), Jupyter, and several others.
+
This project benefited greatly from a number of PII research works like that from Milne et al (2016) with the definition of the types and categories, Schwartz and Solove (2012) with the severity levels of Non-Identifiable, Semi-Identifiable, and Identifiable, and the documentation by NIST, DHS (2012), and HIPAA (full list of foundational publications provided below). A special thanks to all the open source projects, and frameworks that made the setup and structuring of this project much easier like uv, Microsoft Presidio, spaCy (2017), Jupyter, and several others.
### Foundational Publications
The following publications inspired and provided a foundation for this repository:
docs/LOCAL_SETUP.md (3 additions & 9 deletions)
@@ -14,23 +14,17 @@ Video Demo of package import and usage:
For those contributing or modifying the source, use the following to set up locally.
## Environment Config
-
You'll need the Python (^3.9) and Poetry configured on your machine. Once those are configured, create a virtual
+
You'll need Python (^3.11) and `uv` configured on your machine. Once those are configured, create a virtual
environment and install dependencies.
```bash
-
python3.9 -m venv venv
-
-
. venv/bin/activate
-
-
pip install --upgrade pip
-
-
make install
+
uv sync
```
Installing dependencies will vary by usage. For those in need of the PII-Codex integration of the MSFT Presidio Analyzer, it is recommended to install the `detections` extras:
```bash
-
poetry install --extras="detections"
+
uv sync --extra detections
```
As part of the `detections` extras installation, the download for the `en_core_web_lg` spaCy model will be enabled on first use of the `PresidioPIIAnalyzer()`. If more language support is needed, you'll need to download it separately. Reference <a href="https://github.com/explosion/spacy-models/releases?q=en_core_web_lg&expanded=true">explosion/spacy-models</a>.
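For additional models that must be downloaded separately, one option is to check for the model package and invoke spaCy's `download` command only when it is missing. The sketch below is an illustration, not the project's own mechanism: the helpers `model_is_installed`, `download_command`, and `ensure_model` are hypothetical names, and it assumes spaCy models are installed as ordinary Python packages and that `python -m spacy download <model>` is available in the environment.

```python
import subprocess
import sys
from importlib.util import find_spec


def model_is_installed(model="en_core_web_lg"):
    """Return True if a package with the model's name can be found."""
    return find_spec(model) is not None


def download_command(model="en_core_web_lg"):
    """Build the spaCy CLI invocation for the current interpreter."""
    return [sys.executable, "-m", "spacy", "download", model]


def ensure_model(model="en_core_web_lg"):
    """Download the model only if it is not already installed."""
    if not model_is_installed(model):
        # Raises CalledProcessError if the download fails.
        subprocess.run(download_command(model), check=True)
```

Calling `ensure_model()` with another model name from the spacy-models releases page would fetch that model into the active environment.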