October 20, 2025
Journal Article
Title
Measuring Semantic Coherence of RAG-Generated Abstracts Through Complex Network Metrics
Abstract
The exponential growth of scientific literature demands scalable methods to evaluate large-language-model outputs beyond surface-level fluency. We present a two-phase framework that separates generation from evaluation: a retrieval-augmented generation system first produces candidate abstracts, which are then embedded into semantic co-occurrence graphs and assessed using seven robustness metrics from complex network theory. Two experiments were conducted. The first varied model, embedding, and prompt configurations, revealing clear performance differences; the best-performing family combined gemma-2b-it, a chain-of-thought-inspired prompt, and all-mpnet-base-v2, achieving the highest graph-based robustness. The second experiment refined the temperature setting for this family, identifying τ = 0.2 as optimal, which stabilized results (sd = 0.12) and improved robustness relative to retrieval baselines (ΔE_G = +0.08, Δρ = +0.55). While human evaluation was limited to a small set of abstracts, the results revealed a partial convergence between graph-based robustness and expert judgments of coherence and importance. Our approach contrasts with methods such as GraphRAG and establishes a reproducible, model-agnostic pathway for scalable quality control of LLM-generated scientific content.
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
Language
English