Java-Medical-microdata-Scraper

Java-Medical-microdata-Scraper is a Java web scraping tool for extracting medical microdata from web pages. It scrapes the company name, drug name, drug class, indications, dosage form, side effects, and warnings from a target URL. The extracted data is printed to the console and saved as XML for further analysis or storage.

Installation

To use Java-Medical-microdata-Scraper, follow these steps:

1. Clone the repository to your local machine:

   git clone https://github.com/fkitsantas/Java-Medical-microdata-Scrapper.git

2. Import the project into your preferred Java IDE.
3. Build the project to resolve any dependencies.
4. Run the MedDataScraper class to execute the scraper.

Usage

The scraper is designed to extract medical microdata from specific web pages. By default, it is set to scrape data from the URL http://linter.structured-data.org/examples/schema.org/Drug-TreatmentIndication-MedicalContraindication-273-rdfa.html. You can modify the URL in the code to scrape data from your desired source.
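As a rough illustration of this kind of extraction using only the JDK's DOM API, the sketch below collects the text of every element that carries an RDFa `property` or microdata `itemprop` attribute. It is not the repository's actual code: the class name `PropertyExtractorSketch`, the `extract` method, and the sample markup and property names are all hypothetical, and the approach assumes the input is well-formed XML, which real-world HTML often is not.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the MedDataScraper class from the repository.
class PropertyExtractorSketch {

    // Returns property name -> text content for every element that has an
    // RDFa "property" or microdata "itemprop" attribute.
    // Assumption: the markup is well-formed XML (XHTML); otherwise parse() throws.
    static Map<String, String> extract(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            // getAttribute returns "" when the attribute is absent
            String name = el.hasAttribute("property")
                    ? el.getAttribute("property")
                    : el.getAttribute("itemprop");
            if (!name.isEmpty()) {
                // Keep the first value seen for each property name
                fields.putIfAbsent(name, el.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample markup; the property names are illustrative only.
        String sample = "<div typeof=\"Drug\">"
                + "<span property=\"name\">ExampleDrug</span>"
                + "<span property=\"dosageForm\">tablet</span>"
                + "</div>";
        for (Map.Entry<String, String> e : extract(sample).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

`DocumentBuilder.parse` also accepts a URI string, so the same loop could in principle run against a live page, but only if that page parses as XML.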

The extracted data will be printed to the console, displaying drug information including drug name, company name, drug class, indications, dosage form, side effects, and warnings.

The scraped data is also saved as XML in a file named Scraped_Medical.xml in the project directory.
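For readers unfamiliar with writing XML from the JDK, here is a minimal sketch of how a set of scraped fields could be serialized with the DOM and Transformer APIs. The class name `DrugXmlWriterSketch`, the `toXml` method, the `drug` root element, and the sample field names are assumptions made for illustration; the layout of the real Scraped_Medical.xml may differ.

```java
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch, not the repository's actual XML writer.
class DrugXmlWriterSketch {

    // Builds <drug><key>value</key>...</drug> and returns it as a string.
    // Assumption: every map key is a valid XML element name.
    static String toXml(Map<String, String> fields) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("drug"); // assumed root element name
        doc.appendChild(root);

        for (Map.Entry<String, String> entry : fields.entrySet()) {
            Element child = doc.createElement(entry.getKey());
            child.setTextContent(entry.getValue()); // special characters are escaped on output
            root.appendChild(child);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up values standing in for scraped data.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("drugName", "ExampleDrug");
        fields.put("dosageForm", "tablet");

        // Writes the file into the current working directory.
        Files.write(Paths.get("Scraped_Medical.xml"),
                toXml(fields).getBytes(StandardCharsets.UTF_8));
    }
}
```

Returning the XML as a string keeps the serialization step separate from the file write, which makes it easy to inspect the output before saving it.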

Dependencies

Java-Medical-microdata-Scraper requires the following:

Java 8 or above
DOM API
XML API

The DOM and XML APIs ship with the standard Java Development Kit (JDK), so there is no need to install any additional packages.
