SOAP_framework

Multi-class Text Annotation and Classification using BERT-based Active Learning

This project has three independent components that work in a sequential manner. They are implemented in three .py files: section_tagger_1.py, active_learning_2.py, and final_model_3.py. Description of the files:

section_tagger_1.py: identifies SOAP sections in the text.
active_learning_2.py: implements transfer active learning, including the active learning strategies that select instances for labelling and a BERT-based classification model with an embedding layer.
final_model_3.py: the final classification model, deployed to classify text instances of SOAP sentences.
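
The README does not say which selection strategies active_learning_2.py uses. As an illustration only, one common strategy is uncertainty sampling: rank the unlabeled instances by the entropy of the classifier's predicted class probabilities and send the most uncertain ones for labelling. The sketch below assumes the probabilities have already been produced by some classifier; the function names and the example values are hypothetical and are not taken from the repository.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k pool instances with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical class probabilities for four unlabeled sentences (four classes).
pool_probs = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform prediction, maximal entropy
    [0.60, 0.30, 0.05, 0.05],
    [0.40, 0.40, 0.10, 0.10],
]

# Indices of the two instances that would be sent to an annotator.
to_label = select_most_uncertain(pool_probs, k=2)
```

In an active learning loop, the selected instances would be labelled, added to the training set, and the model retrained before the next round of selection.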

Running strategy:

Run section_tagger_1.py on the raw data to generate the identified SOAP sections as output.
Run active_learning_2.py, which takes the initial labeled data (seed data) from step 1 together with the held-out unlabeled data. This generates labeled data to use as training data for the final classification model.
Run final_model_3.py to create extended embeddings for the final classification model. This produces a trained model that can be applied to test data, as we did with the held-out test dataset. The same model can be used for future applications of a similar nature.
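
The three steps above could be chained from a small driver script. This is a minimal sketch, assuming each script takes no command-line arguments and reads its inputs from paths configured inside the script; the README does not document any arguments, and the helper names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys
from typing import List

# Script names come from the repository; the order mirrors the running strategy.
PIPELINE = ["section_tagger_1.py", "active_learning_2.py", "final_model_3.py"]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Build one command per stage (assumes the scripts need no CLI arguments)."""
    return [[python, script] for script in PIPELINE]


def run_pipeline() -> None:
    """Run the stages in order, stopping at the first one that fails."""
    for cmd in build_commands():
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a later stage never runs on the output of a failed one.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root would then execute the three stages one after another.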
