Clinical Knowledge Graph
Clinical Knowledge Graph (CKG) is an ambitious project that covers the analysis, mining, and integration of knowledge from clinical data drawn from various popular biomedical databases. In this project, we deal specifically with meta-protein data. We aim not only to show the potential of Neo4j for analyzing clinical knowledge graph data but also to optimize the performance of graph databases.
This blog post covers the problems we faced as a team while integrating the CKG into Neo4j and the solutions we came up with, not the steps to build the CKG. For the step-by-step procedure, please follow the article Getting started with Windows - Clinical Knowledge Graph.
- Neo4j
- Java and JRE (if Java is already installed on Windows 10, downloading the JRE is sufficient)
- Python (although the documentation suggests version 3.6, Python 3.8 is more appropriate)
- R
**Step 1:** The first step is to download all the requirements in their specified versions (note: installing a version other than the one specified leads to errors).
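A quick way to confirm which interpreter is active before going further is a small version check. This is a minimal sketch; the `RECOMMENDED` constant and the function name are illustrative assumptions, with 3.8 taken from the recommendation above.

```python
import sys

# Assumption: 3.8 is the target, as recommended in the requirements list.
RECOMMENDED = (3, 8)


def python_version_matches(version_info=sys.version_info, expected=RECOMMENDED):
    """Return True when the major.minor pair equals the expected one."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if python_version_matches():
        print("Python", current, "matches the recommended version")
    else:
        print("Python", current, "differs from the recommended 3.8")
```

Running this with the interpreter that is first on Path shows whether the command prompt picks up the intended Python installation.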
Step 2: Open Neo4j Desktop and create a database by clicking Add Database, then create a local graph with the password "NeO4J".
Step 2.1: Install the APOC and Graph Data Science Library plugins by clicking Manage, then Plugins.
Step 2.2: Go to the Settings tab and comment out the dbms.directories.import=import option by adding # at the beginning of the line.
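The edit in Step 2.2 can also be expressed as a small text transformation, which helps when the same change has to be repeated for several databases. This is only a sketch: the function name is hypothetical, and it works on the settings text as a string rather than on any particular file location.

```python
def comment_out_setting(conf_text, key="dbms.directories.import"):
    """Return conf_text with every active line that sets `key` prefixed by '#'."""
    result = []
    for line in conf_text.splitlines():
        stripped = line.lstrip()
        # Skip lines that are already comments; match the key left of '='.
        if not stripped.startswith("#") and stripped.split("=", 1)[0].strip() == key:
            line = "#" + line
        result.append(line)
    return "\n".join(result)


if __name__ == "__main__":
    # Sample settings text, made up for illustration.
    sample = "dbms.memory.heap.initial_size=512m\ndbms.directories.import=import"
    print(comment_out_setting(sample))
```

Lines that are already commented out are left untouched, so applying the function twice gives the same result as applying it once.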
Step 3: Add the full path to Python to the Path variable under Environment Variables.
Step 4: When installing Microsoft Visual C++ Build Tools, we can manage the workloads and install only the C++ build tools under Workloads.
Step 4.1: Also install the latest version of MSVC v142 - VS 2019 C++ x64/x86 build tools and _Windows 10 SDK_ from Individual Components.
Problems encountered:
**1. Version mismatch** Solution: Download the exact versions suggested in the document "Getting started with CKG for windows 10". If the problem persists, install Python 3.8.
**2. Access denied** Solution:
- Open a Python shell
- Go to task manager
- Find the python process
- Right-click and open the location
- The folder will open in explorer, go up a directory
- Right-click the folder and select properties
- Click the Security tab and hit 'edit'
- Add the Everyone group and grant it Read and Write permissions.
Step 5: Under Environment Variables, click the **New** button in User Variables and create a new variable named R, with the path to the R executable as its value.
Step 5.1: Add the same executable path to the Path variable under System Variables as well.
Step 5.2: Install the required R packages by running the following commands in R:
install.packages('BiocManager')
BiocManager::install()
BiocManager::install(c('AnnotationDbi', 'GO.db', 'preprocessCore', 'impute'))
install.packages(c('flashClust','WGCNA', 'samr'), dependencies=TRUE, repos='http://cran.rstudio.com/')
Problem encountered:
- R wasn't recognized by the command prompt. Solution: add the path to the R executable to Path under User Variables as well.
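To see which tools the command prompt can actually find, the Path lookup can be reproduced from Python with `shutil.which`. This is a minimal sketch; the list of executable names is an assumption for illustration and may differ on your machine.

```python
import shutil


def find_missing_tools(tools, which=shutil.which):
    """Return the tool names that cannot be located on Path."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them to the tools you installed.
    missing = find_missing_tools(["python", "R", "Rscript", "java"])
    print("Missing from Path:", missing or "none")
```

Command prompt windows that were already open do not pick up changes to environment variables, so run the check from a newly opened one.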
Step 6: Create a virtual environment with the command python -m venv path\to\env_name
Step 6.1: Activate the virtual environment with the command path\to\env_name\Scripts\activate.bat
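The same environment can be created from Python through the standard `venv` module, which is what `python -m venv` runs. The sketch below uses hypothetical helper names and a placeholder directory name `env_name`; it also returns the activation script path for the current platform.

```python
import os
import venv
from pathlib import Path


def activation_script(env_dir, windows=(os.name == "nt")):
    """Return the path of the activation script inside a virtual environment."""
    env_dir = Path(env_dir)
    if windows:
        return env_dir / "Scripts" / "activate.bat"
    return env_dir / "bin" / "activate"


def create_env(env_dir):
    """Create a virtual environment with pip and return its activation script."""
    venv.create(env_dir, with_pip=True)
    return activation_script(env_dir)


if __name__ == "__main__":
    # Only prints the expected location; call create_env("env_name") to build it.
    print(activation_script("env_name"))
```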
Step 7: Set up CKG by cloning the master branch of the CKG GitHub repository to your PC.
Step 7.1: All the required packages are listed in requirements.txt.
Step 7.2: Go to the cloned CKG folder and install all the required packages by running the following commands:
cd CKG
pip3 install --upgrade pip
pip3 install --ignore-installed -r requirements.txt
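When the bulk install stops partway, it helps to know which packages from requirements.txt are still missing. The following sketch uses `importlib.metadata` from the standard library (available from Python 3.8). The helper names are hypothetical, and the parsing is deliberately simple: it reads only the package name at the start of each line and ignores option lines.

```python
import re
from importlib import metadata
from pathlib import Path

# Matches the package name at the start of a requirement line.
NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(lines):
    """Extract package names, skipping blanks, comments, and option lines."""
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "-")):
            continue
        match = NAME_RE.match(line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the active environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.exists():
        names = requirement_names(req_file.read_text().splitlines())
        print("Not installed:", missing_packages(names) or "none")
```

Run it from the cloned CKG folder with the virtual environment activated, so that it inspects the same environment pip installs into.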
Problem Encountered:
- pip3 install --ignore-installed -r requirements.txt always failed while downloading pandas, with an error that was hard to decipher. Solution: Manually install the package.
- Go to https://www.lfd.uci.edu/~gohlke/pythonlibs/#pandas
- Download the pandas wheel from there and place it in the folder where CKG was downloaded
- Run pip install pandas-1.2.0-cp38-cp38-win_amd64.whl --force-reinstall
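A wheel only installs when the tags in its file name match the interpreter, which is why the file above carries cp38 (CPython 3.8) and win_amd64 (64-bit Windows). The sketch below splits a wheel file name into those parts and compares the Python tag with the running interpreter. The helper names are hypothetical, and the `cp` prefix assumes CPython.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-4].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel file name: " + filename)
    # The last three fields are always the python, abi, and platform tags.
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def current_python_tag():
    """Return the CPython tag of the running interpreter, e.g. 'cp38' on 3.8."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


if __name__ == "__main__":
    info = parse_wheel_name("pandas-1.2.0-cp38-cp38-win_amd64.whl")
    if info["python"] == current_python_tag():
        print("Wheel matches this interpreter")
    else:
        print("Wheel targets", info["python"], "but this is", current_python_tag())
```

If the tags differ, download the wheel whose tags match the installed Python version and architecture instead of forcing the install.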