Overview
Hi, I am Hang! Before joining Exeter as a lecturer, I was a senior research associate in Computer Science at the University of Oxford from 2022 to early 2024, working on Ontology Enrichment with Natural Language Processing for the EPSRC project ConCur. Before that, I was a research fellow in health informatics at the Usher Institute, University of Edinburgh, for over two years until 2021, working on text analytics for the Health Data Research UK projects, which proved timely when COVID-19 arrived. I received my doctoral degree in Computer Science from the University of Liverpool in 2020, working on knowledge discovery from text data. I also hold an MSc and a Bachelor's degree, from the information schools of the University of Sheffield and Wuhan University, respectively.
My research interests include, but are not limited to, the following (with open-access links to previous work):
- Machine learning for data, texts and structured knowledge: using machine learning, especially deep learning, for text mining, information extraction, entity linking, classification with a semantic structure, and ontology enrichment from texts and user-generated data. These tasks transform unstructured data into a structured form.
- Integrating Language Models and structured knowledge: exploring ways to support the knowledge reasoning capability of Language Models (for example, BERT variants and the GPT series) by using human-curated knowledge graphs, for example, ontologies or domain-specific concepts and relations in the healthcare domain.
- Healthcare text analytics and AI applications: Natural Language Processing and multimodal learning that integrate unstructured and structured data, with key applications in automated clinical coding, disease phenotyping, and visual language understanding in medical data, to support healthcare services and clinical decision-making. A key focus is model explainability, achieved by integrating knowledge with unstructured data.
Editorial service:
I am an editorial board member for Transactions on Graph Data and Knowledge (TGDK). I have also reviewed for top conferences and journals, including ACL Rolling Review, WWW, ISWC, ESWC, CIKM, IJCAI, FAccT, IEEE Trans. Med. Imaging, J. Biomed. Inform., J. Web Semant., and IEEE Trans. Neural Netw. Learn. Syst.
Publications
Copyright Notice: Any articles made available for download are for personal use only. Any other use requires prior permission of the author and the copyright holder.
2024
- Shi J, Dong H, Chen J, Wu Z, Horrocks I. (2024) Taxonomy Completion via Implicit Concept Insertion, The ACM Web Conference 2024, Singapore, 13th - 17th May 2024.
- Dong H, Chen J, He Y, Gao Y, Horrocks I. (2024) A Language Model based Framework for New Concept Placement in Ontologies, Extended Semantic Web Conference, Hersonissos, Crete, Greece, 26th - 30th May 2024.
- Falis M, Gema AP, Dong H, Daines L, Basetti S, Holder M, Penfold RS, Birch A, Alex B. (2024) Can GPT-3.5 Generate and Code Discharge Summaries? [PDF]
2023
- Dong H, Suárez-Paniagua V, Zhang H, Wang M, Casey A, Davidson E, Chen J, Alex B, Whiteley W, Wu H. (2023) Ontology-driven and weakly supervised rare disease identification from clinical notes, BMC Medical Informatics and Decision Making, volume 23, no. 1, article no. 86, DOI:10.1186/s12911-023-02181-9. [PDF]
- He Y, Chen J, Jiménez-Ruiz E, Dong H, Horrocks I. (2023) Language Model Analysis for Ontology Subsumption Inference, Findings of the Association for Computational Linguistics: ACL 2023, DOI:10.18653/v1/2023.findings-acl.213. [PDF]
- Dong H, Chen J, He Y, Horrocks I. (2023) Ontology Enrichment from Texts: A Biomedical Dataset for Concept Discovery and Placement, Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23), DOI:10.1145/3583780.3615126. [PDF]
- Dong H, Chen J, He Y, Liu Y, Horrocks I. (2023) Reveal the Unknown: Out-of-Knowledge-Base Mention Discovery with Entity Linking, Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23), DOI:10.1145/3583780.3615036. [PDF]
- Chen J, He Y, Geng Y, Jiménez-Ruiz E, Dong H, Horrocks I. (2023) Contextual semantic embeddings for ontology subsumption prediction, World Wide Web, volume 26, no. 5, pages 2569-2591, DOI:10.1007/s11280-023-01169-9.
2022
- Chen Q, Allot A, Leaman R, Islamaj R, Du J, Fang L, Wang K, Xu S, Zhang Y, Bagherzadeh P. (2022) Multi-label classification for biomedical literature: An overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations, Database, volume 2022, DOI:10.1093/database/baac069.
- Falis M, Dong H, Birch A, Alex B. (2022) Horses to Zebras: Ontology-Guided Data Augmentation and Synthesis for ICD-9 Coding, Proceedings of the Annual Meeting of the Association for Computational Linguistics, pages 389-401.
- Dong H, Falis M, Whiteley W, Alex B, Matterson J, Ji S, Chen J, Wu H. (2022) Automated clinical coding: what, why, and where we are?, npj Digital Medicine, volume 5, no. 1, DOI:10.1038/s41746-022-00705-7.
- He Y, Chen J, Dong H, Jiménez-Ruiz E, Hadian A, Horrocks I. (2022) Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 13489 LNCS, pages 575-591, DOI:10.1007/978-3-031-19433-7_33.
- Wu H, Wang M, Wu J, Francis F, Chang YH, Shavick A, Dong H, Poon MTC, Fitzpatrick N, Levine AP. (2022) A survey on clinical natural language processing in the United Kingdom from 2007 to 2022, npj Digital Medicine, volume 5, no. 1, DOI:10.1038/s41746-022-00730-6.
- Pour MAN, Algergawy A, Buche P, Castro LJ, Chen J, Dong H, Fallatah O, Faria D, Fundulaki I, Hertling S. (2022) Results of the Ontology Alignment Evaluation Initiative 2022, CEUR Workshop Proceedings, volume 3324, pages 84-128.
2021
- Falis M, Dong H, Birch A, Alex B. (2021) CoPHE: A Count-Preserving Hierarchical Evaluation Metric in Large-Scale Multi-Label Text Classification, EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings, pages 907-912.
- Dong H, Suárez-Paniagua V, Whiteley W, Wu H. (2021) Explainable automated coding of clinical notes using hierarchical label-wise attention networks and label embedding initialisation, Journal of Biomedical Informatics, volume 116, DOI:10.1016/j.jbi.2021.103728.
- Dong H, Wang W, Huang K, Coenen F. (2021) Automated Social Text Annotation with Joint Multilabel Attention Networks, IEEE Transactions on Neural Networks and Learning Systems, volume 32, no. 5, pages 2224-2238, DOI:10.1109/TNNLS.2020.3002798.
- Suárez-Paniagua V, Dong H, Casey A. (2021) A multi-BERT hybrid system for named entity recognition in Spanish radiology reports, CEUR Workshop Proceedings, volume 2936, pages 846-856.
- Davidson EM, Poon MTC, Casey A, Grivas A, Duma D, Dong H, Suárez-Paniagua V, Grover C, Tobin R, Whalley H. (2021) The reporting quality of natural language processing studies: systematic review of studies of radiology reports, BMC Medical Imaging, volume 21, no. 1, DOI:10.1186/s12880-021-00671-8.
- Dong H, Suárez-Paniagua V, Zhang H, Wang M, Whitfield E, Wu H. (2021) Rare Disease Identification from Clinical Notes with Ontologies and Weak Supervision, Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, volume 2021-January, pages 2294-2298, DOI:10.1109/EMBC46164.2021.9630043.
- Casey A, Davidson E, Poon M, Dong H, Duma D, Grivas A, Grover C, Suárez-Paniagua V, Tobin R, Whiteley W. (2021) A systematic review of natural language processing applied to radiology reports, BMC Medical Informatics and Decision Making, volume 21, no. 1, DOI:10.1186/s12911-021-01533-7.
2020
- Dong H, Wang W, Coenen F, Huang K. (2020) Knowledge base enrichment by relation learning from social tagging data, Information Sciences, volume 526, pages 203-220, DOI:10.1016/j.ins.2020.04.002.
2019
- Dong H, Wang W, Huang K, Coenen F. (2019) Joint multi-label attention networks for social text annotation, NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, volume 1, pages 1348-1354.
- Lee J, Oh S, Dong H, Wang F, Burnett G. (2019) Motivations for self-archiving on an academic social networking site: A study on ResearchGate, Journal of the Association for Information Science and Technology, volume 70, no. 6, pages 563-574, DOI:10.1002/asi.24138.
2018
- Dong H, Wang W, Coenen F. (2018) Rules for inducing hierarchies from social tagging data, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 10766 LNCS, pages 345-355, DOI:10.1007/978-3-319-78105-1_38.
- Dong H, Wang W, Coenen F. (2018) Learning relations from social tagging data, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 11012 LNAI, pages 29-41, DOI:10.1007/978-3-319-97304-3_3.
- Chen Y, Dong H, Wang W. (2018) Topic-graph based recommendation on social tagging systems: A study on ResearchGate, ACM International Conference Proceeding Series, pages 138-143, DOI:10.1145/3239283.3239316.
2015
- Dong H, Wang W, Liang HN. (2015) Learning structured knowledge from social tagging data: A critical review of methods and techniques, Proceedings - 2015 IEEE International Conference on Smart City, SmartCity 2015, Held Jointly with 8th IEEE International Conference on Social Computing and Networking, SocialCom 2015, 5th IEEE International Conference on Sustainable Computing and Communications, SustainCom 2015, 2015 International Conference on Big Data Intelligence and Computing, DataCom 2015, 5th International Symposium on Cloud and Service Computing, SC2 2015, pages 307-314, DOI:10.1109/SmartCity.2015.89.