Text mining in full text articles - methodical and representation issues

Preprint for Nature Precedings, 22 April 2009, 1 p.
Klinger, R.; Pesch, R.; Mevissen, T.; Fluck, J.

Full text urn:nbn:de:0011-n-936860 (1.4 MByte PDF)
MD5 Fingerprint: eb74f44a09962834761d9b35f3cddce4
Created on: 26 August 2009

2009, 1 p.
International Biocuration Conference <3, 2009, Berlin>
Talk, electronic publication
Fraunhofer SCAI
visualization; PDF; text mining; full text; text parsing; HTML; journals; parsing; ProMiner; publishing; biodatabase; biocurator

In many cases, the information in abstracts of biomedical publications is not sufficient for annotating database entries. Text mining systems that support curators of biodatabases should therefore be able to process full text articles. Besides the technical problems arising from full text parsing, the representation of the annotated full text is an important issue. Journal articles are mostly available electronically in PDF or HTML format. Even when more easily manageable XML formats are available, readers would like to see annotations and semantic enrichment visualised directly in the PDF or HTML. We summarize the technical problems arising from parsing HTML and PDF journal full texts and show first results of visualisation in both formats.