I am a Research Associate in the Big Scholarly Data and Text Analytics Group at the Knowledge Media Institute, UK. My research is in the areas of Text Mining, Data Mining, and Machine Learning, with applications to scientific literature and tasks including information extraction, subject classification, and research evaluation.
The focus of my research is on helping scientists work more effectively by enabling intelligent access to the content of research publications. My doctoral research directly resulted in the development of a new area called semantometrics, which, in contrast to purely citation-based research evaluation methods, uses the full text of publications to derive new measures for research assessment.
Prior to my current appointment, I completed a two-year internship in the Computational Data Analytics Group at Oak Ridge National Laboratory, TN, USA, where I worked on a number of projects in the areas of information extraction, text mining, and research evaluation. Prior to that, I was a Computer Science PhD student at the Open University, UK. I completed my Master's degree in Information Systems and my Bachelor's degree in Information Technology at the Faculty of Information Technology, Brno University of Technology.
I have been a finalist in two international research competitions in the areas of scholarly data mining and text mining (NTCIR-10 CrossLink-2 and 2016 WSDM Cup Challenge), and have been organizing research workshops since 2014 to bring together different communities working on related problems in text mining of research publications.
A listing of my publications is also available through these sites:
Drahomira Herrmannova, Steven Young, Robert Patton, Christopher Stahl, Nicole Kleinstreuer, Mary Wolfe. Unsupervised Identification of Study Descriptors in Toxicology Research: An Experimental Study. Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis (LOUHI 2018 at EMNLP 2018).
Drahomira Herrmannova, Petr Knoth, Robert Patton. Analyzing Citation-Distance Networks for Evaluating Publication Impact. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).
Christopher Stahl, Steven Young, Drahomira Herrmannova, Robert Patton, Jack Wells. DeepPDF: A Deep Learning Approach to Extracting Text from PDFs. Proceedings of the Seventh International Workshop on Mining Scientific Publications (WOSP 2018 at LREC 2018).
Drahomira Herrmannova, Petr Knoth, Christopher Stahl, Robert Patton, Jack Wells. Text and Graph Based Approach for Analyzing Patterns of Research Collaboration: An analysis of the TrueImpactDataset. Proceedings of the First Workshop on Computational Impact Detection from Text Data (CIDTD 2018 at LREC 2018).
Drahomira Herrmannova, Robert Patton, Petr Knoth, Christopher Stahl. Do citations and readership identify seminal publications? Scientometrics.
Robert Patton, Drahomira Herrmannova, Christopher Stahl, Jack Wells, Thomas Potok. Audience Based View of Publication Impact. Proceedings of the Sixth International Workshop on Mining Scientific Publications (WOSP 2017 at JCDL 2017).
Drahomira Herrmannova, Robert Patton, Petr Knoth, Christopher Stahl. Citations and Readership are Poor Indicators of Research Excellence: Introducing TrueImpactDataset, a New Dataset for Validating Research Evaluation Metrics. Proceedings of the First International Workshop on Scholarly Web Mining (SWM 2017 at WSDM 2017).
Drahomira Herrmannova, Petr Knoth. Semantometrics: Towards fulltext-based research evaluation. Proceedings of the 2016 Joint Conference on Digital Libraries (JCDL 2016).
Drahomira Herrmannova, Petr Knoth. Simple Yet Effective Methods for Large-Scale Scholarly Publication Ranking: KMi and Mendeley (team BletchleyPark) at WSDM Cup 2016. Proceedings of the 2016 WSDM Cup - Entity Ranking Challenge (2016 WSDM Cup at WSDM 2016).
Drahomira Herrmannova, Petr Knoth. Towards full-text based research metrics: Exploring semantometrics. Report, Jisc Repository.
Drahomira Herrmannova, Petr Knoth. Semantometrics in Coauthorship Networks: Fulltext-based Approach for Analysing Patterns of Research Collaboration. D-Lib Magazine.
Drahomira Herrmannova, Petr Knoth. Semantometrics: Fulltext-based Measures for Analysing Research Collaboration. Proceedings of the 15th International Society of Scientometrics and Informetrics Conference (ISSI 2015).
Drahomira Herrmannova, Martin Hlosta, Jakub Kuzilek, Zdenek Zdrahal. Evaluating Weekly Predictions of At-Risk Students at The Open University: Results and Issues. Proceedings of the European Distance and E-Learning Network 2015 Annual Conference (EDEN 2015).
Jakub Kuzilek, Martin Hlosta, Drahomira Herrmannova, Zdenek Zdrahal, Annika Wolff. OU Analyse: analysing at-risk students at The Open University. Learning Analytics Review.
Petr Knoth, Drahomira Herrmannova. Towards Semantometrics: A New Semantic Similarity Based Measure for Assessing a Research Publication's Contribution. D-Lib Magazine.
Annika Wolff, Zdenek Zdrahal, Drahomira Herrmannova, Jakub Kuzilek, Martin Hlosta. Developing predictive models for early detection of at-risk students on distance learning modules. Proceedings of the 2014 Workshop on Learning Analytics and Machine Learning (LAML 2014 at LAK 2014).
Martin Hlosta, Drahomira Herrmannova, Lucie Vachova, Jakub Kuzilek, Zdenek Zdrahal, Annika Wolff. Modelling Student Online Behaviour in a Virtual Learning Environment. Proceedings of the 2014 Workshop on Learning Analytics and Machine Learning (LAML 2014 at LAK 2014).
Annika Wolff, Zdenek Zdrahal, Drahomira Herrmannova, Petr Knoth. Predicting Student Performance from Combined Data Sources. Chapter in Educational Data Mining.
Petr Knoth, Drahomira Herrmannova. Simple Yet Effective Methods for Cross-Lingual Link Discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2. Proceedings of the Tenth NTCIR Conference on Evaluation of Information Access Technologies (CrossLink-2 Task at NTCIR-10).
Drahomira Herrmannova, Petr Knoth. Visual Search for Supporting Content Exploration in Large Document Collections. D-Lib Magazine.