Drahomira Herrmannova
Research Scientist

I am a Research Scientist in the Learning Systems Group at Oak Ridge National Laboratory. My research focuses on helping scientists work more effectively by applying Artificial Intelligence methods to improve research workflows and enable intelligent access to the content of research publications. My research interests span Text Mining, Natural Language Processing, Machine Learning, and their applications to biomedical, scientific, and other expert literature and data.

My recent work focuses on developing models for literature screening and information extraction from scientific publications in low-resource settings, data extraction from tables in scientific documents, and deploying misinformation detection models to study the effects of misinformation on health outcomes. Prior to my current appointment, I was a postdoctoral researcher at the Knowledge Media Institute, The Open University, UK.

For more information, see my publications and my CV.


Publications

A listing of my publications is also available through these sites:

2021

Drahomira Herrmannova, Chathika Gunaratne, Vickie Walker, Andrew Rooney, Robert Patton, Mary Wolfe, Charles Schmitt. Weak Supervision for Scientific Document Relevance Tagging. Accepted for presentation at the 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL 2021).

Gautam Thakur, Janna Caspersen, Drahomira Herrmannova, Bryan Eaton, Jordan Burdette. A Mixed-Method Design Approach for Empirically Based Selection of Unbiased Data Annotators. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.

Drahomira Herrmannova, Gautam Thakur, Joshua Grant, Varisara Tansakul, Bryan Eaton, Olivera Kotevska, Jordan Burdette, Martin Smyth, Monica Smith. Challenges in Automated Detection of COVID-19 Misinformation. Workshop on Human Aspects of Misinformation Online at the 2021 ACM CHI Virtual Conference on Human Factors in Computing Systems.

2020

Ramakrishnan Kannan, Piyush Sao, Hao Lu, Drahomira Herrmannova, Vijay Thakkar, Robert Patton, Richard Vuduc, Thomas Potok. Scalable Knowledge Graph Analytics at 136 Petaflop/s. Proceedings of the 2020 International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2020). ACM Gordon Bell Prize Finalist.

2019

Drahomira Herrmannova, Nancy Pontika, Petr Knoth. Do Authors Deposit on Time? Tracking Open Access Policy Compliance. Proceedings of the 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL 2019). Vannevar Bush Best Paper Award.

2018

Drahomira Herrmannova, Steven Young, Robert Patton, Christopher Stahl, Nicole Kleinstreuer, Mary Wolfe. Unsupervised Identification of Study Descriptors in Toxicology Research: An Experimental Study. Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis (LOUHI 2018 at EMNLP 2018).

Drahomira Herrmannova, Petr Knoth, Robert Patton. Analyzing Citation-Distance Networks for Evaluating Publication Impact. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).

Christopher Stahl, Steven Young, Drahomira Herrmannova, Robert Patton, Jack Wells. DeepPDF: A Deep Learning Approach to Extracting Text from PDFs. Proceedings of the Seventh International Workshop on Mining Scientific Publications (WOSP 2018 at LREC 2018).

Drahomira Herrmannova, Petr Knoth, Christopher Stahl, Robert Patton, Jack Wells. Text and Graph Based Approach for Analyzing Patterns of Research Collaboration: An analysis of the TrueImpactDataset. Proceedings of the First Workshop on Computational Impact Detection from Text Data (CIDTD 2018 at LREC 2018).

Drahomira Herrmannova, Robert Patton, Petr Knoth, Christopher Stahl. Do citations and readership identify seminal publications? Scientometrics.

2017

Robert Patton, Drahomira Herrmannova, Christopher Stahl, Jack Wells, Thomas Potok. Audience Based View of Publication Impact. Proceedings of the Sixth International Workshop on Mining Scientific Publications (WOSP 2017 at JCDL 2017).

Drahomira Herrmannova, Robert Patton, Petr Knoth, Christopher Stahl. Citations and Readership are Poor Indicators of Research Excellence: Introducing TrueImpactDataset, a New Dataset for Validating Research Evaluation Metrics. Proceedings of the First International Workshop on Scholarly Web Mining (SWM 2017 at WSDM 2017).

2016

Drahomira Herrmannova, Petr Knoth. An Analysis of the Microsoft Academic Graph. D-Lib Magazine.

Drahomira Herrmannova, Petr Knoth. Semantometrics: Towards fulltext-based research evaluation. Proceedings of the 2016 Joint Conference on Digital Libraries (JCDL 2016). Best Poster Award.

Drahomira Herrmannova, Petr Knoth. Simple Yet Effective Methods for Large-Scale Scholarly Publication Ranking: KMi and Mendeley (team BletchleyPark) at WSDM Cup 2016. Proceedings of the 2016 WSDM Cup - Entity Ranking Challenge (2016 WSDM Cup at WSDM 2016).

Drahomira Herrmannova, Petr Knoth. Towards full-text based research metrics: Exploring semantometrics. Report, Jisc Repository.

2015

Drahomira Herrmannova, Petr Knoth. Semantometrics in Coauthorship Networks: Fulltext-based Approach for Analysing Patterns of Research Collaboration. D-Lib Magazine.

Drahomira Herrmannova, Petr Knoth. Semantometrics: Fulltext-based Measures for Analysing Research Collaboration. Proceedings of the 15th International Society of Scientometrics and Informetrics Conference (ISSI 2015).

Drahomira Herrmannova, Martin Hlosta, Jakub Kuzilek, Zdenek Zdrahal. Evaluating Weekly Predictions of At-Risk Students at The Open University: Results and Issues. Proceedings of the European Distance and E-Learning Network 2015 Annual Conference (EDEN 2015).

Jakub Kuzilek, Martin Hlosta, Drahomira Herrmannova, Zdenek Zdrahal, Annika Wolff. OU Analyse: analysing at-risk students at The Open University. Learning Analytics Review.

2014

Petr Knoth, Drahomira Herrmannova. Towards Semantometrics: A New Semantic Similarity Based Measure for Assessing a Research Publication's Contribution. D-Lib Magazine.

Annika Wolff, Zdenek Zdrahal, Drahomira Herrmannova, Jakub Kuzilek, Martin Hlosta. Developing predictive models for early detection of at-risk students on distance learning modules. Proceedings of the 2014 Workshop on Learning Analytics and Machine Learning (LAML 2014 at LAK 2014).

Martin Hlosta, Drahomira Herrmannova, Lucie Vachova, Jakub Kuzilek, Zdenek Zdrahal, Annika Wolff. Modelling Student Online Behaviour in a Virtual Learning Environment. Proceedings of the 2014 Workshop on Learning Analytics and Machine Learning (LAML 2014 at LAK 2014).

2013

Annika Wolff, Zdenek Zdrahal, Drahomira Herrmannova, Petr Knoth. Predicting Student Performance from Combined Data Sources. Chapter in Educational Data Mining.

Petr Knoth, Drahomira Herrmannova. Simple Yet Effective Methods for Cross-Lingual Link Discovery (CLLD) — KMI @ NTCIR-10 CrossLink-2. Proceedings of the Tenth NTCIR Conference on Evaluation of Information Access Technologies (CrossLink-2 Task at NTCIR-10).

2012

Drahomira Herrmannova, Petr Knoth. Visual Search for Supporting Content Exploration in Large Document Collections. D-Lib Magazine.


Academic Service

Conference programme committee and reviewing

  • Joint Conference on Digital Libraries (JCDL 2021), programme committee member
  • The SIGNLL Conference on Computational Natural Language Learning (CoNLL 2021), reviewer
  • The First Workshop on Bibliographic Data Analysis and Processing (BiblioDAP 2021), reviewer
  • Joint Conference on Digital Libraries (JCDL 2020), programme committee member
  • Language Resources and Evaluation Conference (LREC 2020), scientific committee member
  • The SIGNLL Conference on Computational Natural Language Learning (CoNLL 2020), reviewer
  • Workshop on Scientific Knowledge Graphs (SKG 2020) at TPDL 2020, programme committee member
  • Joint Conference on Digital Libraries (JCDL 2019), programme committee member and session chair
  • The SIGNLL Conference on Computational Natural Language Learning (CoNLL 2019), reviewer
  • Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2019)
  • Joint Conference on Digital Libraries (JCDL 2018), programme committee member and session chair
  • Workshop on Semantics, Analytics and Visualisation: Enhancing Scholarly Dissemination (SAVE-SD 2018)
  • Workshop on Learning Analytics and Machine Learning (LAML 2014)

Organizing committee

Journal reviewing


Accomplishments and awards

  • R&D 100 Awards Finalist, R&D World, 2021
  • ACM Gordon Bell Prize Finalist, International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2020)
  • Vannevar Bush Best Paper Award, ACM/IEEE Joint Conference on Digital Libraries (JCDL 2019)
  • WSDM Cup Challenge finalist, 9th ACM International Conference on Web Search and Data Mining (WSDM 2016)
  • Best Poster Award, ACM/IEEE Joint Conference on Digital Libraries (JCDL 2016)
  • CrossLink-2 Challenge Finalist, 2013 NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-10), Tokyo, Japan
  • Prize of Zdena Rabova for excellent study and science results, Brno University of Technology, 2012

Invited talks

  • Invited speaker (together with my colleague Chris Stahl) at the Organisation for Economic Co-operation and Development (OECD) Workshop on Systematic Reviews in the Scope of the Endocrine Disrupter Testing and Assessment (EDTA) Conceptual Framework Level 1, Paris, France, on the topic "Machine Learning for Data Extraction"
  • Invited speaker at Jisc Open Citations Workshop in London, UK, on the topic "Semantometrics: Towards full-text based research evaluation", 2016
  • Invited speaker at Jisc Digifest 2016 in Birmingham, UK, on the topic "Towards full-text based research metrics: exploring Semantometrics"

Mentoring

If you are looking for an internship, get in touch with me; we have internship opportunities available.

  • Mohammad Saad Salman, SULI Internship, summer 2021, currently a sophomore at Loyola Marymount University
  • Alex Perry, SULI Internship, summer 2020, currently a senior at
  • Tomas Danis, Erasmus Internship, spring 2019, currently a C++ Software Engineer at think-cell Software, Berlin, Germany
  • Shreyas Shahapur, work experience internship, summer 2019, currently a high school senior
  • Brett Hagan, SULI Internship, summer 2018, currently a graduate student at the University of Tennessee, Knoxville
  • Derek Shafer, SULI Internship, summer 2018, currently a graduate student at Tennessee Technological University

CV

My complete CV (as of September 2021) can be found here.