Stockholm University

Publications from the Clinical Text Mining Group

New articles, conference papers, book chapters and other publications from the Clinical Text Mining Group.

Vakili, T., Hullmann T., Henriksson A. and H. Dalianis. 2024.
When Is a Name Sensitive? Eponyms in Clinical Text and Implications for De-Identification. In the Proceedings of the CALD-pseudo Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024, Malta.

Ngo, P., Tejedor M., Olsen Svenning T., Chomutare T., Budrionis A. and H. Dalianis. 2024.
Deidentifying a Norwegian clinical corpus – An effort to create a privacy-preserving Norwegian large clinical language model. In the Proceedings of the CALD-pseudo Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024, Malta.

Lamproudis, A., Mora, S., Olsen Svenning T., Torsvik T., Chomutare T., Dinh Ngo P. and H. Dalianis. 2023.
De-identifying Norwegian Clinical Text using Resources from Swedish and Danish. Proceedings of AMIA 2023, Annual Symposium, November 11–15. New Orleans, LA, USA.

Lamproudis, A., Olsen Svenning T., Torsvik T., Chomutare T., Budrionis A., Dinh Ngo P., Vakili T. and H. Dalianis. 2023.
Using a Large Open Clinical Corpus for Improved ICD-10 Diagnosis Coding. Proceedings of AMIA 2023, Annual Symposium, November 11–15. New Orleans, LA, USA.

Valik J.K., Ward W., Tanushi H., Johansson A.F., Färnert A., Mogensen M.L., Pickering B.W., Herasevich V., Dalianis H., Henriksson A. and P. Nauclér. 2023.
Predicting sepsis onset using a machine learned causal probabilistic network algorithm based on electronic health records data. Scientific reports, 13(1), 11760.

Vakili, T. and H. Dalianis. 2023.
Using Membership Inference Attacks to Evaluate Privacy-Preserving Language Modeling Fails for Pseudonymizing Data. Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa 2023). Faroe Islands, May 22-24, 2023.

Lamproudis, A. & Henriksson, A. (2023). On the Impact of the Vocabulary for Domain-Adaptive Pretraining of Clinical Language Models. In International Joint Conference on Biomedical Engineering Systems and Technologies, pp. 315–332.

Verberk, J.D.M., van der Werff, S.D., Weegar, R., Henriksson, A., Richir, M.C., van Mourik, M.S.M., Nauclér, P. (2023). The augmented value of using clinical notes in semi-automated surveillance of deep surgical site infections after colorectal surgery. Antimicrobial Resistance & Infection Control, 12:117.

Henriksson, A., Pawar, Y., Hedberg, P., Nauclér, P. (2023). Multimodal fine-tuning of clinical language models for predicting COVID-19 outcomes. Artificial Intelligence in Medicine, 146.

Dolk, A., Davidsen H., Dalianis H. and T. Vakili. 2022.
Evaluation of LIME and SHAP in Explaining Automatic ICD-10 Classifications of Swedish Gastrointestinal Discharge Summaries. Proceedings from the 18th Scandinavian Conference on Health Informatics - SHI 2022 in Tromsø, Norway on August 22-24, pp. 166–173.

Budrionis, A., Chomutare T., Olsen Svenning T. and H. Dalianis. 2022.
The Influence of NegEx on ICD-10 Code Prediction in Swedish: How is the Performance of BERT and SVM Models Affected by Negations? In Proceedings from the 18th Scandinavian Conference on Health Informatics - SHI 2022 in Tromsø, Norway on August 22-24, pp. 174–178.

Chomutare, T., Budrionis, A. and H. Dalianis. 2022, July.
Combining deep learning and fuzzy logic to predict rare ICD-10 codes from clinical notes. In Proceedings from the 2022 IEEE International Conference on Digital Health (ICDH), pp. 163–168.

Pawar, Y., Henriksson, A., Hedberg, P., Nauclér, P. (2022). Leveraging Clinical BERT in Multimodal Mortality Prediction Models for COVID-19. In Proceedings of the 35th IEEE International Symposium on Computer-Based Medical Systems (CBMS), pp. 199–204.

Lamproudis, A., Henriksson, A., Valik, J.K., Nauclér, P. (2022). Improving the Timeliness of Early Prediction Models for Sepsis through Utility Optimization. In Proceedings of IEEE International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1062–1069.

Lamproudis, A., Henriksson A. and H. Dalianis. 2022.
Evaluating Pretraining Strategies for Clinical BERT Models, in Proceedings of the 13th International Conference on Language Resources and Evaluation, LREC 2022, Marseille, France, pp. 410–416.

Vakili, T., Lamproudis, A., Henriksson, A. and H. Dalianis. 2022.
Downstream Task Performance of BERT Models Pre-Trained Using Automatically De-Identified Clinical Data, in Proceedings of the 13th International Conference on Language Resources and Evaluation, LREC 2022, Marseille, France, pp. 4245–4252.

Bridal, O., Vakili, T. and M. Santini. 2022.
Cross-Clinic De-Identification of Swedish Electronic Health Records: Nuances and Caveats. Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies in conjunction with 13th International Conference on Language Resources and Evaluation, (LREC 2022), Marseille, June 21-23.

Jerdhaf, O., Santini, M., Lundberg, P., Bjerner, T., Al-Abasse, Y., Jönsson, A. and T. Vakili. 2022.
Evaluating Pre-Trained Language Models for Focused Terminology Extraction from Swedish Medical Records. Proceedings of the Workshop Terminology in the 21st century: many faces, many places – Term 21 in conjunction with 13th International Conference on Language Resources and Evaluation, (LREC 2022), Marseille, June 21-23.

Vakili, T. and H. Dalianis. 2022.
Utility Preservation of Clinical Text After De-Identification. In Proceedings of the 21st Workshop on Biomedical Language Processing (pp. 383-388) in conjunction with ACL 2022, Dublin, Ireland.

Blanco, A., Remmer, S., Pérez, A., Dalianis, H. and A. Casillas. 2022.
Implementation of specialised attention mechanisms: ICD-10 classification of Gastrointestinal discharge summaries in English, Spanish and Swedish. Journal of Biomedical Informatics.

van der Werff, S. D., Fritzing, M., Tanushi, H., Henriksson, A., Dalianis, H., Ternhag, A., Färnert, A. and P. Nauclér. 2022.
The accuracy of fully automated algorithms for surveillance of healthcare-onset Clostridioides difficile infections in hospitalized patients. Antimicrobial Stewardship & Healthcare Epidemiology, 2(1).

Lamproudis, A., Henriksson, A. and H. Dalianis. 2022.
Vocabulary Modifications for Domain-adaptive Pretraining of Clinical Language Models. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies – Volume 5: HEALTHINF, pp. 180-188.

Vakili, T. and H. Dalianis. 2021.
Are Clinical BERT Models Privacy Preserving? The Difficulty of Extracting Patient-Condition Associations. In the Proceedings of the Association for the Advancement of Artificial Intelligence AAAI Fall 2021 Symposium in HUman partnership with Medical Artificial iNtelligence (HUMAN.AI), November 4-6, 2021.

Remmer, S., Lamproudis, A. and H. Dalianis. 2021.
Multi-label Diagnosis Classification of Swedish Discharge Summaries – ICD-10 Code Assignment Using KB-BERT. In the Proceedings of RANLP 2021: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria.

Blanco, A., Remmer, S., Pérez, A., Dalianis, H. and A. Casillas. 2021.
On the contribution of per-ICD attention mechanisms to classify health records in languages with fewer resources than English. In the Proceedings of RANLP 2021: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria.

Lamproudis, A., Henriksson, A. and H. Dalianis. 2021.
Developing a Clinical Language Model for Swedish: Continued Pretraining of Generic BERT with In-Domain Data. In the Proceedings of RANLP 2021: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria.

Grancharova, M. and H. Dalianis. 2021.
Applying and Sharing pre-trained BERT-models for Named Entity Recognition and Classification in Swedish Electronic Patient Records. In the Proceedings of the 23rd Nordic Conference on Computational Linguistics, NoDaLiDa 2021, Iceland, May 31 - June 2, 2021.

Dalianis, H. and H. Berg. 2021.
HB Deid - HB De-identification tool demonstrator. In the Proceedings of the 23rd Nordic Conference on Computational Linguistics, NoDaLiDa 2021, Iceland, May 31 - June 2, 2021.

Dalianis, H. and S. Velupillai.
Chapter 8. Textanalys. In Medicinsk Informatik (in Swedish). Eds. G. Petersson, M. Rydmark and A. Thurin, Liber.

Tayefi, M., P. Ngo, T. Chomutare, H. Dalianis, E. Salvi, A. Budrionis, and F. Godtliebsen. 2021.
Challenges and opportunities beyond structured data in analysis of electronic health records, Computational Statistics, Wiley.

Berg, H., A. Henriksson, U. Fors and H. Dalianis. 2021.
De-identification of Clinical Text for Secondary Use: Research Issues. In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, pp. 592-599.

Mahbub Ul, A., A. Henriksson, H. Tanushi, E. Thiman, P. Nauclér and H. Dalianis. 2021.
Terminology Expansion with Prototype Embeddings: Extracting Symptoms of Urinary Tract Infection from Clinical Text. In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, pp. 47-57.

Grancharova, M., H. Berg and H. Dalianis. 2020.
Improving Named Entity Recognition and Classification in Class Imbalanced Swedish Electronic Patient Records through Resampling. Compilation of abstracts in The Eighth Swedish Language Technology Conference (SLTC-2020), Göteborg.

Berg, H., A. Henriksson and H. Dalianis. 2020.
The Impact of De-identification on Downstream Named Entity Recognition in Clinical Text. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, Louhi 2020, in conjunction with EMNLP 2020, (pp. 1-11).

Caccamisi, A., L. Jørgensen, H. Dalianis and M. Rosenlund. 2020.
Natural language processing and machine learning to enable automatic extraction and classification of patients’ smoking status from electronic medical records. Upsala Journal of Medical Sciences, 1-9.

Chomutare, T., Yigzaw, K., Budrionis, A., Makhlysheva, A., Godtliebsen, F. and H. Dalianis. 2020.
De-identifying Swedish EHR Text using Public Resources in the General Domain. In the proceedings of Medical Informatics Europe MIE-2020, Geneva.

Berg, H. and H. Dalianis. 2020.
A Semi-supervised Approach for De-identification of Swedish Clinical Text. Proceedings of 12th Conference on Language Resources and Evaluation, LREC 2020, May 13-15, Marseille, pp. 4444–4450.

Berg, H., T. Chomutare and H. Dalianis. 2019.
Building a De-identification System for Real Swedish Clinical Text Using Pseudonymised Clinical Text. In the Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis, Louhi 2019, in conjunction with the Conference on Empirical Methods in Natural Language Processing (EMNLP), November 2019, Hong Kong, ACL, pp. 118-125.

Berg, H. and H. Dalianis. 2019.
Augmenting a De-identification System for Swedish Clinical Text Using Open Resources (and Deep learning). In the Proceedings of the Workshop on NLP and Pseudonymisation, in conjunction with the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), Turku, Finland, September 30, 2019.

Dalianis, H. 2019.
Pseudonymisation of Swedish Electronic Patient Records Using a Rule-based Approach. In the Proceedings of the Workshop on NLP and Pseudonymisation, in conjunction with the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), Turku, Finland, September 30, 2019.

Budrionis, A., H. Dalianis, K.Y. Yigzaw, A. Makhlysheva and T. Chomutare. 2018.
Negation detection in Norwegian medical text: Porting a Swedish NegEx to Norwegian. Work in progress. Compilation of abstracts in The Seventh Swedish Language Technology Conference (SLTC-2018), Stockholm.

Dalianis, H. 2018.
Clinical Text Mining: Secondary Use of Electronic Patient Records, 181 pages, Springer, Open Access.