  • Publication
    Artificial Intelligence. What Is Behind the Technology of the Future?
    (Springer Nature, 2024-05-16)
    Artificial Intelligence (AI) is already present in our daily routines, and in the future, we will encounter it in almost every aspect of life - from analyzing X-rays for medical diagnosis, driving autonomous cars, and maintaining complex machinery to drafting essays on environmental problems and drawing imaginative pictures. The potential of AI is enormous, yet many myths, uncertainties and challenges circulate that need to be tackled. This book is the English translation of "Künstliche Intelligenz - Was steckt hinter der Technologie der Zukunft?", originally published in German (Springer Vieweg, 2020). It is addressed to the general public, from interested citizens to corporate executives who want to develop a better and deeper understanding of AI technologies and assess their consequences. Mathematical basics, terminology, and methods are explained in understandable language. Adaptations to different media such as images, text, and speech and the corresponding generative models are introduced. A concluding discussion of opportunities and challenges helps readers evaluate new developments, demystify them, and assess their relevance for the future.
  • Publication
    A global scale comparison of risk aggregation in AI assessment frameworks
    (2024-05-06)
    Görge, Rebekka
    Cremers, Armin B.
    AI applications bear inherent risks in various risk dimensions, such as insufficient reliability, robustness, fairness or data protection. It is well known that trade-offs between these dimensions can arise: for example, a highly accurate AI application may reflect the unfairness and bias of real-world data, or may produce hard-to-explain outcomes because of its internal complexity. AI risk assessment frameworks aim to provide systematic approaches to risk assessment in various dimensions. The overall trustworthiness assessment is then generated by some form of risk aggregation among the risk dimensions. This paper provides a systematic overview of risk aggregation schemes used in existing AI risk assessment frameworks, focusing on the question of how potential trade-offs among the risk dimensions are incorporated. To this end, we examine how the general risk notion, the application context, the extent of risk quantification, and specific instructions for evaluation may influence overall risk aggregation. We discuss our findings on the current frameworks in terms of whether they provide meaningful and practicable guidance. Lastly, we derive recommendations for the further operationalization of risk aggregation from both horizontal and vertical perspectives.
  • Publication
    Developing trustworthy AI applications with foundation models
    (2024-04)
    Schmidt, Sebastian
    Müller, Felix Benjamin
    Görge, Rebekka
    Kern, Carmen
    Loh, Silke
    The trustworthiness of AI applications has been the subject of recent research and is also addressed in the EU's recently adopted AI Regulation. The currently emerging foundation models in the field of text, speech and image processing offer completely new possibilities for developing AI applications. This whitepaper shows how the trustworthiness of an AI application developed with foundation models can be evaluated and ensured. For this purpose, the application-specific, risk-based approach for testing and ensuring the trustworthiness of AI applications, as developed in the "AI Assessment Catalog - Guideline for Trustworthy Artificial Intelligence" by Fraunhofer IAIS, is transferred to the context of foundation models. Special consideration is given to the fact that specific risks of foundation models can have an impact on the AI application and must also be taken into account when checking trustworthiness.
  • Publication
    Machine Learning Operations (MLOps): Grundlagen, Chancen und Herausforderungen beim MLOps-Einsatz in Unternehmen
    (2024-04)
    Kerbel, Andreas
    Temath, Christian
    Zimmermann, Alexander
    Zorn, Alexander
    What is MLOps? And how do companies use it? In a study, experts from KI.NRW and the MLOps team at the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS interviewed a total of 29 companies to understand where they stand on their MLOps journey. The result is a compact overview of the fundamentals, opportunities and challenges of MLOps adoption, which offers companies valuable recommendations for action alongside a detailed assessment of the current state.
  • Publication
    Kreativität der generativen KI
    This contribution discusses whether generative AI systems can also produce creative content. It first describes how such systems work internally and how they can potentially generate new content. It then discusses the creative process and examines whether AI systems can deliver creative output across the media of text, images, and music. Standardized tests have shown that the language model GPT-4 now produces more creative answers than humans. Similar tests have shown that images created with an older version of DALL-E are difficult to distinguish from works by artists. Given the greatly improved level of detail in newer systems, it can be assumed that they are now more creative. Systems for generating music, by contrast, cannot yet match the creativity of human composers.
  • Publication
    Incorporating Query Recommendation for Improving In-Car Conversational Search
    (2024-03-23)
    Rony, Md. Rashad Al Hasan
    Khan, Abbas Goher
    Friedl, Ken E.
    Sudhi, Viju
    Süß, Christian
    Retrieval-augmented generation has become an effective mechanism for conversational systems in domain-specific settings. Retrieving the wrong document because the user utterance lacks context may lead to wrong answer generation. Such an issue may reduce user engagement and thereby system reliability. In this paper, we propose a context-guided follow-up question recommendation that internally improves document retrieval in an iterative approach for developing an in-car conversational system. Specifically, a user utterance is first reformulated, given the context of the conversation, so that the retriever can interpret it better. In cases where the documents retrieved by the retriever are not relevant enough for answering the user utterance, we employ a large language model (LLM) to generate a question recommendation, which is then used to perform a refined retrieval. An empirical evaluation confirms the effectiveness of our proposed approaches in in-car conversations, achieving 48% and 22% improvement in the retrieval and system-generated responses, respectively, against baseline approaches.
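    The control flow described in this abstract (reformulate, retrieve, and fall back to a recommended question when relevance is low) can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the token-overlap scorer, the `reformulate` stand-in, the `recommend` callable (a placeholder for the LLM call), and the `threshold` value are all assumptions made for the example.

```python
def score(query, doc):
    """Toy relevance: fraction of query tokens that also occur in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def retrieve(query, docs):
    """Return the best-scoring document and its score (hypothetical retriever)."""
    best = max(docs, key=lambda doc: score(query, doc))
    return best, score(query, best)


def reformulate(utterance, history):
    """Stand-in for context-aware rewriting: prepend the previous turn, if any."""
    return f"{history[-1]} {utterance}" if history else utterance


def retrieve_with_recommendation(utterance, history, docs, recommend, threshold=0.5):
    """Reformulate, retrieve, and retry once with a recommended question
    when the first result is not relevant enough."""
    query = reformulate(utterance, history)
    doc, relevance = retrieve(query, docs)
    if relevance < threshold:
        # `recommend` is a placeholder for the LLM that proposes a follow-up question.
        query = recommend(query)
        doc, relevance = retrieve(query, docs)
    return doc, relevance
```

    In this sketch the fallback runs at most once; an iterative variant would loop until the relevance threshold is met or a retry budget is exhausted.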
  • Publication
    Controlled Randomness Improves the Performance of Transformer Models
    (2024-03-19)
    Zhao, Cong
    Krämer, Wolfgang
    Leonhard, David
    During the pre-training step of natural language models, the main objective is to learn a general representation of the pre-training dataset, usually requiring large amounts of textual data to capture the complexity and diversity of natural language. In contrast, the data available for a specific downstream task is in most cases dwarfed by the pre-training dataset, especially in domains where data is scarce. We introduce controlled randomness, i.e. noise, into the training process to improve the fine-tuning of language models, and explore the performance of targeted noise added to the parameters of these models. We find that adding such noise can improve performance in our two downstream tasks: joint named entity recognition and relation extraction, and text summarization.
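    As a rough illustration of what adding noise to model parameters during fine-tuning can look like, the sketch below perturbs the weights with zero-mean Gaussian noise after each gradient step. The abstract does not specify the noise distribution, where it is applied, or how it is scaled, so the function names, the Gaussian choice, and the `sigma` parameter are assumptions for this example only.

```python
import random


def perturb_parameters(weights, sigma, rng):
    """Return a copy of `weights` with zero-mean Gaussian noise (std `sigma`) added."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def noisy_sgd_step(weights, grads, lr, sigma, rng):
    """One plain gradient-descent update followed by parameter noise injection.

    `sigma` controls the amount of randomness; sigma = 0 recovers ordinary SGD.
    """
    updated = [w - lr * g for w, g in zip(weights, grads)]
    return perturb_parameters(updated, sigma, rng)


# Usage: a seeded generator keeps the injected noise reproducible.
rng = random.Random(0)
weights = noisy_sgd_step([1.0, 2.0], [0.5, 0.5], lr=0.1, sigma=0.01, rng=rng)
```

    Passing the random generator explicitly keeps the randomness controlled in the literal sense: the same seed reproduces the same perturbations across runs.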
  • Publication
    Deep Dynamic Language Models
    This thesis investigates the domain of deep dynamic language models, focusing on the integration of temporal dynamics to enhance language modeling and its application in various tasks, such as text generation, recommendation systems, and predicting post popularity. Temporal content change, i.e., trends and themes that change over time in document collections such as academic journals, news articles and social media, makes traditional static language models (LMs) a suboptimal solution. To address this limitation, several approaches to developing dynamic LMs are proposed and explored in this thesis. Initially, the impact of incorporating temporal information is explored, specifically in the context of modeling online communities. For the analysis of temporal content change in Yelp - a crowd-sourced review platform - an instantaneous language model is proposed. This model combines a temporal point process (TPP) for modeling review creation times and an LM to capture textual aspects. Empirical evaluations demonstrate that this model significantly improves the performance of LMs in terms of both language modeling and prediction of review creation time. Building upon the success of the instantaneous LM, the research in this thesis is extended to more application-oriented tasks, such as recommender systems. Recognizing that user preferences and item reviews change over time, the proposed model leverages users' reviews to enhance rating predictions. By developing time-interval-aware representations, the proposed model outperforms several state-of-the-art recommender system models on real-world datasets. Additionally, the integration of dynamic topic models into LMs is explored. First, the problem of skewed topic distributions in topic modeling is addressed, which can cause models to learn more general topics present in the majority of documents, rather than rare topics present in only a few documents.
A neural dynamic focused topic model is proposed as a solution, which decouples topic activities from topic proportions in documents using sequences of Bernoulli random variables. Experimental evaluations show that this model outperforms state-of-the-art topic models in generalization tasks, while employing a comparable number of parameters and converging two times faster. Furthermore, the performance of large pre-trained language models (LPLMs) in dynamic environments is explored. The empirical analysis on Reddit datasets reveals significant performance drops when predicting the popularity of future posts due to temporal distribution shifts in data. To mitigate this issue, a model is proposed that combines neural variational dynamic topic models and attention mechanisms to infer temporal LM representations. The proposed model exhibits improved performance while using only a fraction of the parameters of LPLMs and providing interpretable representations that offer insights into real-world events. In summary, this thesis emphasizes the significance of incorporating temporal dynamics into LMs and explores their application in various tasks.
  • Publication
    Wie Agenten und Foundation-Modelle bei der Versorgung Schwerverletzter helfen
    (2024-03)
    Meyer, Mareen
    Defosse, Jérôme
    Hensen, Sandra
    Iser, Henri
    Salge, Torsten Oliver
    Stead, Susan
    Tjardes, Thorsten
    Waloßek, Nina
    Artificial intelligence in the trauma room: how can it relieve and support the medical team so that treatment becomes safer and better for patients? And which applications are particularly suited to this? This is where the development of new AI models comes into play. In particular, so-called foundation models and large language models (LLMs) enable a wide range of new use cases in hospitals. These span the entire chain of clinical processes, up to extreme situations such as the care of severely injured patients in the trauma room. Particularly relevant is that LLMs could solve a ubiquitous problem of data science in medicine: they can be adapted to use cases even with little training data and, thanks to their deep language understanding, deliver better-founded results than was previously possible. A particularly exciting development is LLM agents, which analyze an environment and then autonomously carry out actions, such as operating systems via interfaces. In this whitepaper, we illustrate the benefits of LLMs and agents using two applications in the trauma room that were implemented in the TraumAInterfaces project.
  • Publication
    Superkraft Sprachmodell?
    (2024-03)
    Dinnessen, Felix
    Bringmann, Björn
    Dang, David
    Halscheidt, Sandra
    Germany's public administration faces a fundamental transformation in view of the necessary digitalization and automation of previously manual processes. The rise in applications for housing benefit (Wohngeld), BAföG, or naturalization puts additional pressure on authorities. The resulting backlog contributes to declining trust in the capability of the public administration. At the same time, the administration must compensate for shrinking staff numbers resulting from demographic change. Generative artificial intelligence (GenAI), and large language models (LLMs) in particular, play an important role here in supporting and relieving staff in their tasks in the future, thereby freeing up capacity for more direct interaction with citizens. In this briefing, Fraunhofer IAIS and Deloitte present three example applications of large language models from which the public administration can already benefit today. When considering the framework conditions to be established, a distinction must be made between the prerequisites within authorities and the state infrastructure. This publication considers the prerequisites at the level of the individual authorities.