Skip to content

K-RET: Knowledgeable Biomedical Relation Extraction System

License

Notifications You must be signed in to change notification settings

lasigeBioTM/K-RET

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

K-RET: Knowledgeable Biomedical Relation Extraction System

K-RET is a flexible biomedical RE system, allowing for the use of any pre-trained BERT-based system (e.g., SciBERT and BioBERT) to inject knowledge in the form of knowledge graphs from a single source or multiple sources simultaneously. This knowledge can be applied to various contextualizing tokens or just to the tokens of the candidate relation for single and multi-token entities.

Our academic paper which describes K-RET in detail can be found here.

The uer folder corresponds to an updated version of the toolkit developed by Zhao et al. (2019) available here.

Downloading Pre-Trained Models

Available versions of the best performing pre-trained models are as follows:

The training details are described in our academic paper.

Getting Started

Our project includes code adaption of the K-BERT model available here.

Use the K-RET Image available at Docker Hub to set up the experimental environment.

Usage Example

 CUDA_VISIBLE_DEVICES='1,2,3' python3 -u run_classification.py \
    --pretrained_model_path ./models/pre_trained_model_scibert/output_model.bin \
    --config_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/config.json \
    --vocab_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/vocab.txt \
    --train_path ./datasets/ddi_corpus/train.tsv \
    --dev_path ./datasets/ddi_corpus/dev.tsv \
    --test_path ./datasets/ddi_corpus/test.tsv \
    --epochs_num 30 \
    --batch_size 32 \
    --kg_name "['ChEBI']" \
    --output_model_path ./outputs/scibert_ddi.bin | tee ./outputs/scibert_ddi.log &

For more options check run.sh and, for additional configuration settings (e.g., max_number_entities and contextual_knowledge), check brain/config.py.

Predict New Data Example

CUDA_VISIBLE_DEVICES='0' python3 -u run_classification.py \
    --pretrained_model_path ./models/pre_trained_model_scibert/output_model.bin \
    --config_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/config.json \
    --vocab_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/vocab.txt \
    --train_path ./datasets/ddi_corpus/train.tsv \
    --dev_path ./datasets/ddi_corpus/dev.tsv \
    --test_path ./datasets/ddi_corpus/test.tsv \
    --epochs_num 30 --batch_size 32 --kg_name "[]" \
    --testing True \
    --to_test_model ./outputs/scibert_ddi.bin \
    | tee ./outputs/ddi_results.log &

Process Results Example

python3 src/process_results.py ./outputs/ddi_results.log ./datasets/ddi_corpus/test.tsv ddi_results.tsv

Reference

  • Diana Sousa and Francisco M. Couto. 2022. K-RET: Knowledgeable Biomedical Relation Extraction System.