Clinical Knowledge Graph
Clinical Knowledge Graph (CKG) is an ambitious project that analyzes, mines, and integrates knowledge from clinical data and from several popular biomedical databases. In this project we work specifically with meta-protein data, and we aim both to show the potential of Neo4j for analyzing Clinical Knowledge Graph data and to optimize the performance of graph databases.
This blog post covers the problems our team faced while integrating CKG into Neo4j and the solutions we came up with, not the steps to build CKG. For the step-by-step procedure, please follow the article Getting started with windows-Clinical Knowledge Graph
| Software | Hardware |
|---|---|
| 1. Java 11 2. Neo4j 4.0 3. Python > 3.6 4. R >= 3.5.2 5. Microsoft Visual Studio | 1. 80 GB or more of disk storage |
Step 1: Java is not usually preinstalled on Windows 10, so download and install the JRE (Java 11, as listed in the requirements)
Step 1: Download Python and Neo4j
(Note: installing versions other than the ones specified will lead to errors)
Step 2: Open Neo4j Desktop and create a database by clicking Add Database, then create a local graph using the password "NeO4J"
Step 3: Install APOC and the Graph Data Science Library by clicking Manage, then Plugins
Step 4: Go to the Settings tab and comment out the option dbms.directories.import=import by adding # at the beginning of the line.
Step 5: Add the full path of the Python installation to Path under Environment Variables
Step 6: While installing Microsoft Visual C++ Build Tools, select only the C++ build tools under Workloads
Step 7: Also install the latest version of MSVC v142 - VS 2019 C++ x64/x86 build tools and the Windows 10 SDK from Individual Components
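Step 4 above can also be expressed as a small script. This is only a sketch of the idea, assuming the settings are available as plain text; the function name `comment_out_setting` and the sample text are hypothetical, and Neo4j Desktop's Settings tab remains the simplest way to make the change.

```python
def comment_out_setting(conf_text: str, key: str) -> str:
    """Return conf_text with every active `key=...` line prefixed by '#'."""
    out = []
    for line in conf_text.splitlines():
        # Lines that are already commented start with '#', so they do not match.
        if line.lstrip().startswith(key + "="):
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical sample settings text, for illustration only.
sample = "some.other.setting=1\ndbms.directories.import=import"
print(comment_out_setting(sample, "dbms.directories.import"))
```

Running the function twice on the same text leaves it unchanged the second time, because a line that already starts with # is skipped.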
1. Version mismatch
Solution: download the exact versions suggested in the document "Getting started with CKG for windows 10". If the problem persists, install Python 3.8
2. Access denied
Solution 1: Run Command Prompt as administrator
Solution 2:
- Open a Python shell
- Go to task manager
- Find the python process
- Right-click and open the location
- The folder will open in explorer, go up a directory
- Right-click the folder and select properties
- Click the Security tab and hit 'edit'
- Add "Everyone" and grant it Read and Write permissions.
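The version mismatch problem above can be caught early by checking the interpreter before installing anything. The sketch below is an illustration, not part of CKG: the helper `python_version_ok` is hypothetical, and it assumes that "Python > 3.6" in the requirements table means version 3.7 or newer.

```python
import sys


def python_version_ok(version, minimum=(3, 6)):
    """Return True if the (major, minor) part of `version` is newer than `minimum`.

    Assumption: "Python > 3.6" is read as "3.7 or newer".
    """
    return tuple(version[:2]) > tuple(minimum)


# Check the interpreter that is running this script.
current = sys.version_info
print(f"Python {current.major}.{current.minor} acceptable:", python_version_ok(current))
```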
Step 1: Download R
Step 2: In Environment Variables, click New under User variables and create a variable named R whose value is the path to the R executable.
Step 3: Add the same executable path to the Path variable under System variables as well
Step 4: Install the required packages from an R session:
install.packages('BiocManager')
BiocManager::install()
BiocManager::install(c('AnnotationDbi', 'GO.db', 'preprocessCore', 'impute'))
install.packages(c('flashClust','WGCNA', 'samr'), dependencies=TRUE, repos='http://cran.rstudio.com/')
1. R isn't recognized by the command prompt
Solution: add the path to the R executable to Path under User variables as well
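Whether the command prompt can see R can be checked from Python with the standard library. This is a minimal sketch; the helper name `find_on_path` is hypothetical, and `Rscript` is included only as a second executable that normally ships alongside R.

```python
import shutil


def find_on_path(*names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, location in find_on_path("R", "Rscript").items():
    print(name, "->", location or "not found on PATH")
```

Changes to environment variables only apply to newly started processes, so open a new Command Prompt before running the check.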
Step 1: Create a virtual environment with the command
python -m venv path\to\env_name
Step 2: Activate the virtual environment
path\to\env_name\Scripts\activate.bat
Note: the virtual environment must be activated and Neo4j must be running while building the CKG graph
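Both conditions in the note can be verified with a short script before starting a long build. This is a sketch under stated assumptions: the helper names are hypothetical, and port 7687 is Neo4j's default Bolt port, which may differ if the database settings were changed.

```python
import socket
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("virtual environment active:", in_virtualenv())
# Assumption: Neo4j listens on its default Bolt port, 7687.
print("Neo4j Bolt port reachable:", port_open("localhost", 7687))
```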
Step 1: Set up CKG by cloning the master branch of the CKG GitHub repository to your machine
Step 2: Go to the cloned CKG folder and install all the required packages by running the following commands
cd CKG\
pip3 install --upgrade pip
pip3 install --ignore-installed -r requirements.txt
Step 3: Create the required directory structure within the local copy of the cloned repository using the following commands
python setup_CKG.py
python setup_config_files.py
Step 4: Open the file C:\CKG\src\graphdb_connector\connector_config.yml and change the line db_url: "0.0.0.0" to db_url: "localhost"
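The edit in Step 4 is a one-line change, so it can be sketched as a plain text substitution without a YAML library. The function name `set_db_url` and the sample text are hypothetical, and editing the file by hand works just as well.

```python
import re


def set_db_url(config_text: str, new_host: str = "localhost") -> str:
    """Replace the value on the `db_url:` line, keeping indentation and key."""
    pattern = re.compile(r"^([ \t]*db_url:[ \t]*).*$", flags=re.MULTILINE)
    # A function replacement avoids any escaping issues in new_host.
    return pattern.sub(lambda m: f'{m.group(1)}"{new_host}"', config_text)


# Illustrative sample; the real file contains other keys as well.
sample = 'db_url: "0.0.0.0"'
print(set_db_url(sample))
```

To apply it to the real file, read connector_config.yml as text, pass the contents through the function, and write the result back.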
1) Error while installing pandas
Solution: Manually install the package
- Go to https://www.lfd.uci.edu/~gohlke/pythonlibs/#pandas
- Download the pandas wheel and place it in the folder where your CKG repo has been cloned
- Run the following command to install pandas:
pip install pandas-1.2.0-cp38-cp38-win_amd64.whl --force-reinstall
Step 1: Add CKG to the environment variables by creating a new user variable named PYTHONPATH whose value is the path to the CKG code directory
Step 2: To confirm that the environment variable is set correctly, type in the command line:
echo %PYTHONPATH%
This will print the path you used as a value (e.g. C:\CKG\src).
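The same check can be made from Python, which also copes with PYTHONPATH holding more than one entry. This is a minimal sketch; the helper `path_in_pythonpath` is hypothetical, and C:\CKG\src is the example path from the text above.

```python
import os


def _normalize(path: str) -> str:
    # Ignore trailing separators and, on Windows, letter case.
    return os.path.normcase(os.path.normpath(path))


def path_in_pythonpath(target: str, pythonpath: str) -> bool:
    """Return True if `target` is one of the entries in a PYTHONPATH string."""
    entries = {_normalize(p) for p in pythonpath.split(os.pathsep) if p}
    return _normalize(target) in entries


print(path_in_pythonpath(r"C:\CKG\src", os.environ.get("PYTHONPATH", "")))
```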
Step 1: Build the graph database
cd src/graphdb_builder/builder
python builder.py -b full -u neo4j
1) ImportError: cannot import name 'clock' from 'time' (unknown location)
Solution 1: Open the file CKG_ENV\lib\site-packages\passlib\utils\__init__.py and replace time.clock with time.perf_counter. time.clock was removed in Python 3.8, which is why the import fails.
Solution 2: Open the file CKG_ENV\lib\site-packages\passlib\utils\__init__.py and import time instead of clock, i.e. "from time import time". Then use time() wherever clock() was used. Note that time() measures wall-clock time, so Solution 1 is the closer replacement.
Then run python builder.py -b full -u neo4j again
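The idea behind Solution 1 can be written as a fallback import that works on both old and new Python versions. This sketch shows the general pattern only; it is not the actual passlib source, and the timing loop is just a demonstration.

```python
try:
    # time.clock exists only on Python versions before 3.8.
    from time import clock
except ImportError:
    # perf_counter is the recommended replacement for timing code.
    from time import perf_counter as clock

start = clock()
total = sum(range(100_000))  # some work to time
elapsed = clock() - start
print(f"elapsed: {elapsed:.6f} s")
```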