Clinical Knowledge Graph

sitashma rajbhandari edited this page Jan 24, 2021 · 31 revisions

Clinical Knowledge Graph (CKG) is an ambitious project that analyzes, mines, and integrates knowledge from clinical data drawn from several widely used biomedical databases. In this project, we work specifically with meta-protein data and aim not only to show the potential of Neo4j for analyzing Clinical Knowledge Graph data but also to optimize the performance of graph databases.

This post covers the problems we faced as a team while integrating the CKG into Neo4j and the solutions we came up with, not the steps to build the CKG. For the step-by-step procedure, please follow the article Getting started with windows-Clinical Knowledge Graph.

Task: Integrating CKG into Neo4j

Requirements

Software:
  • Neo4j
  • Java
  • Python
  • R
  • Microsoft Visual Studio

Hardware:
  • Core i7 or equivalent
  • 128 GB of storage (SSD)
  • 16 GB of memory

1. Setting up Java

Step 1: Java is not usually preinstalled on Windows 10, so download and install the Java Runtime Environment (JRE)
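
Whether a Java runtime is already visible to the command prompt can be checked with a small script. The following is only a minimal sketch; the helper name java_version is made up for illustration and is not part of CKG.

```python
import shutil
import subprocess


def java_version(command="java"):
    """Return the text printed by `<command> -version`, or None if the
    command cannot be found on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    # `java -version` usually prints to stderr, so capture both streams.
    result = subprocess.run(
        [executable, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    info = java_version()
    print(info if info else "Java was not found on PATH; install a JRE first.")
```

If the script reports that Java was not found even after installation, the Java bin directory is probably missing from Path.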

2. Setting up Neo4j, Python, and Microsoft Visual C++ Build Tools

Step 1: Download Python and Neo4j (Note: installing versions other than those specified in the guide leads to errors)

Step 2: Open Neo4j Desktop and create a database by clicking Add Database, then create a local graph using the password "NeO4J"

Step 3: Install APOC and the Graph Data Science Library by clicking Manage, then Plugins

Step 4: Go to the Settings tab and comment out the option dbms.directories.import=import by adding # at the beginning of the line.

Step 5: Add the full path to the Python installation to Path under Environment Variables

Step 6: While installing Microsoft Visual C++ Build Tools, select only the C++ build tools under Workloads

Step 7: Also install the latest version of MSVC v142 - VS 2019 C++ x64/x86 build tools and Windows 10 SDK from Individual Components
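
The edit in Step 4 is normally done by hand in Neo4j Desktop, but the same change can be expressed as a small text transformation. This is a hedged sketch: the function name comment_out_setting and the sample settings text are illustrative assumptions, not the actual contents of a Neo4j configuration.

```python
def comment_out_setting(config_text, key="dbms.directories.import"):
    """Prefix every active line that sets `key` with '#'."""
    updated = []
    for line in config_text.splitlines():
        stripped = line.lstrip()
        # Compare the text before '=' with the key; lines that are already
        # commented out start with '#' and therefore never match.
        name = stripped.split("=", 1)[0].strip()
        if "=" in stripped and name == key:
            updated.append("#" + line)
        else:
            updated.append(line)
    return "\n".join(updated)


# Illustrative sample only, not a complete Neo4j settings file.
sample = "dbms.memory.heap.initial_size=512m\ndbms.directories.import=import"
print(comment_out_setting(sample))
```

Lines that are already commented out are left untouched, so running the function twice does not add a second #.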

Problems encountered while downloading and setting up Neo4j, Python, and Microsoft Visual C++ Build Tools

1. Version mismatch

Solution: Download the exact versions suggested in the document "Getting started with CKG for windows 10". If the problem persists, install Python 3.8.
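
A quick way to see which Python version the command prompt actually picks up is to compare it against the expected one. The sketch below assumes (3, 8) as the target only because Python 3.8 is the fallback mentioned above; matches_version is a hypothetical helper name.

```python
import sys


def matches_version(required, actual=None):
    """True if the major.minor part of `actual` equals `required`, e.g. (3, 8)."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(required)


if __name__ == "__main__":
    # (3, 8) is an assumption taken from the fallback suggested in this post.
    if matches_version((3, 8)):
        print("Python 3.8 is active.")
    else:
        print("Active Python is", ".".join(map(str, sys.version_info[:3])))
```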

2. Access denied

Solution 1: Run Command Prompt as administrator

Solution 2:

  • Open a Python shell
  • Go to task manager
  • Find the python process
  • Right-click and open the location
  • The folder will open in explorer, go up a directory
  • Right-click the folder and select properties
  • Click the Security tab and hit 'edit'
  • Add Everyone and grant Read and Write permission.
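
To confirm whether the permission change took effect, one option is to try creating a temporary file in the folder that holds the Python installation. This is a sketch under that assumption; can_write_to is an illustrative name.

```python
import os
import sys
import tempfile


def can_write_to(directory):
    """Try to create (and automatically delete) a temporary file in `directory`."""
    try:
        with tempfile.TemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # Covers PermissionError (access denied) and a missing directory.
        return False


if __name__ == "__main__":
    python_dir = os.path.dirname(sys.executable)
    print(python_dir, "writable:", can_write_to(python_dir))
```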

3. Setting up R

Step 1: Download R

Step 2: In Environment Variables, click New under User Variables and create a variable named R whose value is the path to the R executable.

Step 3: Add the same path to the Path variable under System Variables as well

Step 4: Install the required R packages by running the following commands in R

install.packages('BiocManager')
BiocManager::install()
BiocManager::install(c('AnnotationDbi', 'GO.db', 'preprocessCore', 'impute'))
install.packages(c('flashClust','WGCNA', 'samr'), dependencies=TRUE, repos='http://cran.rstudio.com/')

Problem encountered while setting up R

1. R isn't recognized by the command prompt

Solution: Add the path to the R executable to Path under User Variables as well
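
Whether a given folder really ended up in Path can be verified from a new command prompt. The sketch below takes the folder to look for as a command-line argument, since the exact R installation path differs between machines; directory_on_path is a hypothetical helper.

```python
import os
import sys


def directory_on_path(directory, path_value=None, sep=os.pathsep):
    """True if `directory` is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normcase(os.path.normpath(directory))
    for entry in path_value.split(sep):
        entry = entry.strip().strip('"')
        # Normalise both sides so trailing slashes do not cause a mismatch.
        if entry and os.path.normcase(os.path.normpath(entry)) == wanted:
            return True
    return False


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(directory_on_path(sys.argv[1]))
```

Environment variable changes only apply to command prompts opened after the change, so run the check in a fresh window.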

4. Create a virtual environment

Step 1: Create a virtual environment with the command

python -m venv path\to\env_name

Step 2: Activate the virtual environment

path\to\env_name\Scripts\activate.bat

Note: The virtual environment must be activated and Neo4j must be running while building the CKG graph
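
Both conditions in the note can be checked before starting a long build. The following is a minimal sketch, assuming Neo4j listens on its default Bolt port 7687 on the local machine; the function names are illustrative.

```python
import socket
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def port_is_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7687 is Neo4j's default Bolt port; adjust if the database uses another one.
    print("virtual environment active:", in_virtual_environment())
    print("Neo4j reachable on localhost:7687:", port_is_open("localhost", 7687))
```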

5. Setting up CKG

Step 1: Set up the CKG by cloning the master branch of the CKG GitHub repository onto your device

Step 2: Go to the cloned CKG and install all the required packages by running the following commands

cd CKG\
pip3 install --upgrade pip
pip3 install --ignore-installed -r requirements.txt

Step 3: Create the appropriate directory structure within the local copy of the cloned repository using the following commands

python setup_CKG.py
python setup_config_files.py

Step 4: Open the file C:\CKG\src\graphdb_connector\connector_config.yml and change the line db_url: "0.0.0.0" to db_url: "localhost"
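
The change in Step 4 is a one-line edit, shown here as a text substitution for clarity. This is a hedged sketch: it only assumes that the file contains a line of the form db_url: ..., and the second key in the sample is made up for illustration.

```python
import re


def set_db_url(config_text, new_host="localhost"):
    """Replace the value on every `db_url:` line, keeping its indentation."""
    # [ \t]* instead of \s* so the match never runs over a line break.
    pattern = re.compile(r"^([ \t]*db_url:[ \t]*).*$", flags=re.MULTILINE)
    return pattern.sub(lambda m: f'{m.group(1)}"{new_host}"', config_text)


# Illustrative sample; `other_key` is not taken from the real config file.
sample = 'db_url: "0.0.0.0"\nother_key: value'
print(set_db_url(sample))
```

Note that the quotes around the value must be plain straight quotes; curly quotes pasted from a web page are not valid YAML string delimiters.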

Problem encountered while setting up CKG

1. Error while installing pandas

Solution: Manually install the packages that fail (for example, with pip install pandas)
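
When one package in requirements.txt fails, it can help to install the entries one at a time so that the failing ones are easy to spot and install manually. The sketch below assumes the file contains plain requirement lines (no -r includes or other pip options); both function names are illustrative.

```python
import subprocess
import sys


def requirement_lines(requirements_text):
    """Return the non-empty, non-comment lines of a requirements file."""
    lines = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def install_one_by_one(requirements_path="requirements.txt"):
    """Install each requirement separately and return the ones that failed."""
    with open(requirements_path, encoding="utf-8") as handle:
        requirements = requirement_lines(handle.read())
    failed = []
    for requirement in requirements:
        # Use the current interpreter's pip so the active virtual
        # environment is the one that receives the packages.
        result = subprocess.run(
            [sys.executable, "-m", "pip", "install", requirement]
        )
        if result.returncode != 0:
            failed.append(requirement)
    return failed
```

Calling install_one_by_one() from the CKG directory with the virtual environment active returns the list of requirements that still need attention.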

6. Add CKG to environment variables

Step 1: Add CKG to the environment variables by creating a new user variable named PYTHONPATH whose value is the path to the CKG code directory

Step 2: To confirm that the environment variable is set correctly, type the following on the command line:

echo %PYTHONPATH%

This will print the path you used as the value (e.g. C:\CKG\src).
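
Beyond printing the variable, it is possible to check that Python can actually locate a CKG package through it. In this sketch the package name graphdb_connector is an assumption inferred from the path C:\CKG\src\graphdb_connector mentioned in section 5, and is_importable is a hypothetical helper.

```python
import importlib.util


def is_importable(module_name):
    """True if Python can locate the top-level module on its search path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Package name assumed from the directory layout described in this post.
    print("graphdb_connector importable:", is_importable("graphdb_connector"))
```

If this prints False, open a new command prompt (so the updated PYTHONPATH is loaded) and check the value again.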

7. Build Neo4j Graph Database

Step 1: To build the graph database, run

cd src/graphdb_builder/builder
python builder.py -b full -u neo4j

Problem encountered while building the Neo4j graph database

1. ImportError: cannot import name 'clock' from 'time' (unknown location)

This error occurs because time.clock was removed in Python 3.8.

Solution 1: Open the file CKG_ENV\lib\site-packages\passlib\utils\__init__.py and replace time.clock with time.perf_counter

Solution 2: In the same file, CKG_ENV\lib\site-packages\passlib\utils\__init__.py, import time instead of clock, i.e. use "from time import time", and replace each call to clock() with time()

Then run python builder.py -b full -u neo4j again
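
The idea behind both solutions can be illustrated with a small compatibility import that works on old and new Python versions alike. This is a generic sketch of the pattern, not the actual code inside passlib; the alias timer is chosen here for illustration.

```python
try:
    # time.clock exists only on Python versions before 3.8.
    from time import clock as timer
except ImportError:
    # perf_counter is the recommended replacement for measuring intervals.
    from time import perf_counter as timer

start = timer()
total = sum(range(100_000))
elapsed = timer() - start
print(f"elapsed: {elapsed:.6f} s")
```

perf_counter measures elapsed intervals with high resolution, while time() returns wall-clock time; either avoids the import error.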
