Clinical Knowledge Graph
Clinical Knowledge Graph (CKG) is an ambitious project that analyzes, mines, and integrates knowledge from clinical data and from several popular biomedical databases. In this project we work specifically with meta-protein data. Our aim is not only to show the potential of Neo4j for analyzing Clinical Knowledge Graph data but also to optimize the performance of the graph database.
This blog post covers the problems we faced as a team while integrating the CKG into Neo4j and the solutions we came up with; it does not cover the steps to build the CKG. For the step-by-step procedure, please follow the article "Getting started with windows-Clinical Knowledge Graph".
- Neo4j
- Java and JRE (Java is usually preinstalled on Windows 10, so we only need to download the JRE)
- Python (the documentation suggests version 3.6, but version 3.8 worked better for us)
- R
Step 1: Download all the requirements in the specified versions
(Note: installing a version other than the one specified leads to errors.)
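Since most of our early errors came from version mismatches, a quick check of the interpreter version can save time. The snippet below is only a minimal sketch: the set of accepted versions (3.6 and 3.8) is our assumption based on the versions mentioned in this post, and `SUPPORTED_VERSIONS` and `is_supported` are hypothetical names, not part of CKG.

```python
import sys

# Assumption: 3.6 is the version the guide suggests, 3.8 the one that worked for us.
SUPPORTED_VERSIONS = {(3, 6), (3, 8)}


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version is in SUPPORTED_VERSIONS."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    status = "OK" if is_supported() else "not one of the versions discussed here"
    print("Python", current, "->", status)
```

Running it with the interpreter that will later build the CKG shows whether that interpreter is one of the versions discussed here.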
Step 2: Open Neo4j Desktop, create a database by clicking Add Database, then create a local graph with the password "NeO4J"
Step 2.1: Install APOC and the Graph Data Science Library by clicking Manage, then Plugins
Step 2.2: Go to the Settings tab and comment out the option dbms.directories.import=import by adding # at the beginning of the line.
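We made this change by hand in the Settings tab, but the same edit can be sketched in code. The helper below (`comment_out_setting` is a hypothetical name) works on the text of a settings file and prefixes the matching line with `#`. The sample settings text is made up for illustration.

```python
def comment_out_setting(conf_text, key):
    """Prefix every active line that sets `key` with '#'; keep other lines as they are."""
    out = []
    for line in conf_text.splitlines():
        # Lines that are already commented start with '#', so they never match here.
        if line.lstrip().startswith(key + "="):
            line = "#" + line
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Illustrative settings text, not a real Neo4j configuration file.
    sample = "dbms.memory.heap.initial_size=512m\ndbms.directories.import=import"
    print(comment_out_setting(sample, "dbms.directories.import"))
```

Lines that are already commented out are left alone, so applying the helper twice does not add a second `#`.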
Step 3: Add the full path to Python to Path under Environment Variables
Step 4: While installing Microsoft Visual C++ Build Tools, we can manage the workloads and install only **C++ build tools** under Workloads
Step 4.1: Also install the latest version of MSVC v142 - VS 2019 C++ x64/x86 build tools and the Windows 10 SDK from Individual Components
Problems encountered during steps 1 to 4
1. Version mismatch
Solution: Download the exact versions suggested in the document "Getting started with CKG for windows 10". If the problem persists, install Python 3.8.
2. Access denied
Solution 1: Run Command Prompt as administrator
Solution 2:
- Open a Python shell
- Go to Task Manager
- Find the Python process
- Right-click it and open the file location
- The folder will open in Explorer; go up one directory
- Right-click the folder and select Properties
- Click the Security tab and hit 'Edit'
- Add Everyone and give them Read and Write permission.
Step 5: In the Environment Variables dialog, click New under User variables and create a variable named R whose value is the path to the R executable.
Step 5.1: Add the same path to the Path variable under System variables as well
Step 5.2: Install the required R packages by running the following commands in R
install.packages('BiocManager')
BiocManager::install()
BiocManager::install(c('AnnotationDbi', 'GO.db', 'preprocessCore', 'impute'))
install.packages(c('flashClust','WGCNA', 'samr'), dependencies=TRUE, repos='http://cran.rstudio.com/')
Problem encountered during step 5:
- R isn't recognized by the command prompt
Solution: Add the path to the R executable to Path under User variables as well
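To confirm that the Path change took effect, we can ask Python where it finds R. This is a small sketch using the standard library's `shutil.which`; `find_executable` is a hypothetical helper name, and checking for `Rscript` as well as `R` is our own addition.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    for exe in ("R", "Rscript"):
        location = find_executable(exe)
        print(exe, "->", location or "not found on PATH")
```

Environment variable changes are only picked up by newly started processes, so run this from a command prompt opened after editing Path.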
Step 6: Create a virtual environment with the command
python -m venv path\to\env_name
Step 6.1: Activate the virtual environment
path\to\env_name\Scripts\activate.bat
Note: The virtual environment must be activated and Neo4j must be running while building the CKG graph
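It is easy to forget the activation step when opening a new command prompt. The check below is a minimal sketch that relies on the fact that, inside an environment created with `venv`, `sys.prefix` differs from `sys.base_prefix`. `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style virtual environment."""
    # Outside a virtual environment both prefixes point to the same installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```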
Step 7: Set up the CKG by cloning the master branch of the CKG GitHub repository onto your device
Step 7.1: Go to the cloned CKG directory and install all the required packages by running the following commands
> **_cd CKG_**
> **_pip3 install --upgrade pip_**
> **_pip3 install --ignore-installed -r requirements.txt_**
Step 7.2: Create the appropriate directory structure within the local copy of the cloned repository using the following commands
> **_python setup_CKG.py_**
> **_python setup_config_files.py_**
This automatically creates the data folder and all its subfolders, and sets up the configuration for the log files, where all errors and warnings related to the code are written.
On Windows, the database URL needs to be set to localhost instead of 0.0.0.0 (the unspecified address). To change this configuration, open the file C:\CKG\src\graphdb_connector\connector_config.yml and change the line db_url: "0.0.0.0" to db_url: "localhost".
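We edited the file in a text editor, but the change is simple enough to sketch as a text substitution. The helper below (`set_db_url` is a hypothetical name) rewrites only the `db_url:` line and leaves the rest of the text untouched. The sample content, including the `db_port` key, is made up for illustration and may not match the real connector_config.yml.

```python
import re

# Match the key and the spacing around the colon, then the rest of that line.
DB_URL_LINE = re.compile(r"^([ \t]*db_url[ \t]*:[ \t]*)[^\r\n]*", flags=re.MULTILINE)


def set_db_url(config_text, new_host="localhost"):
    """Return config_text with the value of the db_url line replaced by new_host."""
    return DB_URL_LINE.sub(lambda m: m.group(1) + '"' + new_host + '"', config_text)


if __name__ == "__main__":
    # Illustrative content only; the real file may contain other keys.
    sample = 'db_url: "0.0.0.0"\ndb_port: 7687'
    print(set_db_url(sample))
```

A plain text substitution keeps comments and key order in the file intact, which is why this sketch avoids parsing and re-dumping the YAML.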
Problem encountered:
- **_pip3 install --ignore-installed -r requirements.txt_** showed an error while downloading pandas
Solution: Install the package manually
- Go to https://www.lfd.uci.edu/~gohlke/pythonlibs/#pandas
- Download the pandas wheel from there and place it in the folder where your CKG is downloaded
- Run pip install pandas-1.2.0-cp38-cp38-win_amd64.whl --force-reinstall
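A manually downloaded wheel only installs if its Python tag matches the interpreter. The `cp38` in the file name above means CPython 3.8. The sketch below reads that tag from the file name and compares it with the running interpreter; `wheel_python_tag` and `matches_interpreter` are hypothetical helper names.

```python
import sys


def wheel_python_tag(filename):
    """Return the Python tag of a wheel file name, e.g. 'cp38'."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Wheel names end with <python tag>-<abi tag>-<platform tag>.
    return stem.split("-")[-3]


def matches_interpreter(filename, version_info=sys.version_info):
    """Return True if the wheel's Python tag is the CPython tag of version_info."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return wheel_python_tag(filename) == expected


if __name__ == "__main__":
    wheel = "pandas-1.2.0-cp38-cp38-win_amd64.whl"
    print(wheel_python_tag(wheel), "matches this interpreter:", matches_interpreter(wheel))
```

If the tags do not match, download the wheel built for your Python version instead of forcing the install.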
Step 8: Add CKG to the environment variables. In the Environment Variables dialog, click New in the top half of the dialog to create a new user variable.
Name the variable PYTHONPATH and set its value to the path of the CKG code directory, for example C:\CKG\src. Note that the path should always end with \CKG\src. To confirm that the environment variable is set correctly, type the following in the command line:
echo %PYTHONPATH%
This prints the path you used as the value (e.g. C:\CKG\src).
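The same check can be done from Python, which also verifies the rule that the path ends with \CKG\src. This is a minimal sketch: `ends_with_ckg_src` and `pythonpath_ok` are hypothetical helper names, and the comparison ignores case because Windows paths are case-insensitive.

```python
import os
from pathlib import PureWindowsPath


def ends_with_ckg_src(value):
    """Return True if the last two components of a Windows path are CKG and src."""
    parts = [p.lower() for p in PureWindowsPath(value).parts]
    return parts[-2:] == ["ckg", "src"]


def pythonpath_ok(environ=os.environ):
    """Return True if any ';'-separated PYTHONPATH entry ends with \\CKG\\src."""
    entries = environ.get("PYTHONPATH", "").split(";")
    return any(ends_with_ckg_src(entry) for entry in entries if entry)


if __name__ == "__main__":
    print("PYTHONPATH =", os.environ.get("PYTHONPATH", "(not set)"))
    print("Points to the CKG source directory:", pythonpath_ok())
```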
Step 9: Build the graph database
$ cd src/graphdb_builder/builder
$ python builder.py -b full -u neo4j
Before running builder.py, please make sure your Neo4j graph is running. The builder will fail otherwise.
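Because the builder fails when Neo4j is not running, it can help to test the connection first. The sketch below only checks that something accepts TCP connections on Neo4j's default Bolt port, 7687. It assumes the port was not changed in the database settings, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host="localhost", port=7687, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:7687; Neo4j appears to be running.")
    else:
        print("Nothing is listening on localhost:7687; start the Neo4j graph first.")
```

An open port does not prove that the database has finished starting or that the password is right, so treat this as a quick first check only.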
Problem encountered while running python builder.py -b full -u neo4j:
(CKG_ENV) C:\Users\sitas\CKG\src\graphdb_builder\builder>python builder.py -b full -u neo4j
Traceback (most recent call last):
  File "builder.py", line 14, in <module>
    from graphdb_builder.builder import importer, loader
  File "C:\Users\sitas\CKG\src\graphdb_builder\builder\importer.py", line 20, in <module>
    from graphdb_builder.users import users_controller as uh
  File "C:\Users\sitas\CKG\src\graphdb_builder\users\users_controller.py", line 8, in <module>
    from passlib.hash import bcrypt
  File "C:\Users\sitas\CKG_ENV\lib\site-packages\passlib\hash.py", line 25, in <module>
    from passlib.registry import proxy
  File "C:\Users\sitas\CKG_ENV\lib\site-packages\passlib\registry.py", line 12, in <module>
    from passlib.ifc import PasswordHash
  File "C:\Users\sitas\CKG_ENV\lib\site-packages\passlib\ifc.py", line 10, in <module>
    from passlib.utils.decor import deprecated_method
  File "C:\Users\sitas\CKG_ENV\lib\site-packages\passlib\utils\__init__.py", line 845, in <module>
    from time import clock as timer
ImportError: cannot import name 'clock' from 'time' (unknown location)
Solutions tried
1) Replacing time.clock with time.perf_counter (did not work). According to the source, the solution was to install Python 3.6, after which it runs perfectly.
However, I had already tried Python 3.6, which did not work, so I had to switch to Python 3.8. What to do now?
2) Instead of importing clock, import time: the import should read "from time import time", and every call to clock() should become time().
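For context, `time.clock` was removed in Python 3.8, which is why the import in the traceback fails on that version. The snippet below is a minimal sketch of an import fallback that keeps the name `timer` used in the traceback and works both before and after the removal. It is an illustration of the pattern, not a verified patch for passlib, and `elapsed` is a hypothetical helper added only to show the timer in use.

```python
try:
    # Python 3.7 and earlier still provide time.clock.
    from time import clock as timer
except ImportError:
    # time.clock is gone in Python 3.8+; perf_counter is a documented replacement.
    from time import perf_counter as timer


def elapsed(func, *args, **kwargs):
    """Call func and return (result, seconds taken), measured with `timer`."""
    start = timer()
    result = func(*args, **kwargs)
    return result, timer() - start


if __name__ == "__main__":
    total, seconds = elapsed(sum, range(1000000))
    print("sum =", total, "took", round(seconds, 4), "seconds")
```

Editing files inside site-packages is fragile because a reinstall overwrites them. It may be worth checking first whether a newer passlib release already handles Python 3.8.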