Text mining



Automated document annotation and biological information extraction.

OnTheFly2.0 is a web application that helps users collect biological information from documents. With OnTheFly2.0, users can:

  • Extract bioentities from individual articles in formats such as plain text, Microsoft Word, Excel and PDF.
  • Scan images and identify terms by using Optical Character Recognition (OCR).
  • Handle multiple files simultaneously.
  • Isolate proteins, chemical compounds, organisms, tissues, diseases/phenotypes and gene ontology terms.
  • Extract selected terms along with their database identifiers.
  • Perform functional enrichment analysis on a selected group of terms.
  • Identify co-occurring proteins in the scientific literature and in protein domain databases.
  • Generate and visualize protein-protein and protein-chemical interaction networks.
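
To make the idea of extracting terms together with their identifiers more concrete, here is a minimal sketch of dictionary-based term tagging in Python. It is an illustration only and not OnTheFly2.0's actual implementation: the `DICTIONARY` contents, the `EXAMPLE:` identifiers and the `tag_terms` function are all hypothetical, and a real tagger would rely on curated name dictionaries covering far more entities.

```python
import re

# Hypothetical toy dictionary: lowercase name -> (entity type, identifier).
# The identifiers are made-up placeholders, not real database accessions.
DICTIONARY = {
    "tp53": ("protein", "EXAMPLE:0001"),
    "aspirin": ("chemical", "EXAMPLE:0002"),
    "liver": ("tissue", "EXAMPLE:0003"),
}


def tag_terms(text):
    """Return (matched text, entity type, identifier, start offset) per dictionary hit."""
    # Try longer names first so a long name is preferred over a shorter
    # name that starts at the same position.
    names = sorted(DICTIONARY, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        entity_type, identifier = DICTIONARY[match.group(1).lower()]
        hits.append((match.group(1), entity_type, identifier, match.start()))
    return hits


if __name__ == "__main__":
    for hit in tag_terms("Aspirin alters TP53 expression in liver tissue."):
        print(hit)
```

Each hit keeps the text as it was written in the document, the entity type, the identifier and the character offset, which is the kind of information needed to highlight a term in place and link it to a database record.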