A Dictionary-based Oversampling Approach to Clinical Document Classification on Small and Imbalanced Dataset (19th IEEE/ACM WI-IAT 2020)

To address data imbalance in medical document classification, we propose a probabilistic, dictionary-based data augmentation approach. It oversamples the minority class, creating new, highly varied documents from synonyms weighted by their similarity to the original medical entity term.
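The idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that augmentation works by swapping a medical term for a dictionary synonym, sampled with probability proportional to a similarity score, and that oversampling continues until a two-class dataset is balanced. The `SYNONYMS` dictionary, its similarity scores, and the function names `augment` and `oversample` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical synonym dictionary: term -> list of (synonym, similarity) pairs.
# The entries and similarity scores are invented for illustration only.
SYNONYMS = {
    "myocardial infarction": [("heart attack", 0.9), ("cardiac infarction", 0.7)],
    "hypertension": [("high blood pressure", 0.9), ("elevated blood pressure", 0.6)],
}


def augment(doc, synonyms, rng):
    """Return a copy of doc with each known term swapped for a sampled synonym."""
    out = doc
    for term, candidates in synonyms.items():
        if term in out:
            words = [synonym for synonym, _ in candidates]
            weights = [similarity for _, similarity in candidates]
            # Sample one synonym with probability proportional to its similarity.
            choice = rng.choices(words, weights=weights, k=1)[0]
            out = out.replace(term, choice)
    return out


def oversample(docs, labels, minority_label, synonyms, seed=0):
    """Append augmented minority documents until both classes are the same size.

    Assumes a binary classification dataset (an assumption of this sketch).
    """
    rng = random.Random(seed)
    minority = [d for d, y in zip(docs, labels) if y == minority_label]
    if not minority:
        raise ValueError("no documents found for the minority label")
    n_majority = len(docs) - len(minority)
    n_needed = max(n_majority - len(minority), 0)
    new_docs = [augment(rng.choice(minority), synonyms, rng) for _ in range(n_needed)]
    return docs + new_docs, labels + [minority_label] * len(new_docs)


docs = ["patient has hypertension", "routine visit", "follow-up visit", "annual check"]
labels = [1, 0, 0, 0]
balanced_docs, balanced_labels = oversample(docs, labels, 1, SYNONYMS)
```

Because synonyms are sampled rather than fixed, repeated draws from the same source document can yield different variants, which is one plausible way to obtain the variety the approach aims for.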