QAestro - semantic-based composition of question answering pipelines
The demand for interfaces that allow users to interact with computers in an intuitive, effective, and efficient way is increasing. Question Answering (QA) systems address this need by answering questions posed by humans using knowledge bases. In recent years, many QA systems and related components have been developed both by practitioners and the research community. Since QA involves a vast number of (partially overlapping) subtasks, existing QA components can be combined in various ways to build tailored QA systems that perform better in terms of scalability and accuracy in specific domains and use cases. However, to the best of our knowledge, no systematic way exists to formally describe and automatically compose such components. Thus, in this work, we introduce QAestro, a framework for semantically describing both QA components and developer requirements for QA component composition. QAestro relies on a controlled vocabulary and the Local-as-View (LAV) approach to model QA tasks and components, respectively. Furthermore, the problem of QA component composition is mapped to the problem of LAV query rewriting, and state-of-the-art SAT solvers are utilized to efficiently enumerate the solutions. We have formalized 51 existing QA components implemented in 20 QA systems using QAestro. Our empirical results suggest that QAestro enumerates the combinations of QA components that effectively implement QA developer requirements.
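As a rough illustration of the composition problem, the following Python sketch treats each QA component as the set of QA tasks it implements and enumerates the minimal component combinations that cover a developer's required tasks. It is a deliberately simplified stand-in: it uses brute-force enumeration rather than LAV query rewriting or a SAT solver, it ignores the input/output dependencies between tasks, and the component names, task names, and the `compose` helper are all hypothetical and not taken from QAestro.

```python
from itertools import combinations

# Hypothetical catalogue: each component is described by the set of QA tasks
# it implements. Names are illustrative only, not taken from the paper.
COMPONENTS = {
    "nerTool": {"NER"},
    "linkerA": {"NER", "disambiguation"},
    "linkerB": {"disambiguation"},
    "queryBuilder": {"queryConstruction"},
}


def compose(required_tasks, components):
    """Return all minimal component sets that jointly cover required_tasks."""
    names = sorted(components)
    solutions = []
    # Iterate by increasing size so smaller covers are found first.
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            chosen = set(combo)
            # Skip supersets of a solution already found: they are not minimal.
            if any(sol <= chosen for sol in solutions):
                continue
            covered = set().union(*(components[c] for c in combo))
            if required_tasks <= covered:
                solutions.append(chosen)
    return [sorted(s) for s in solutions]


if __name__ == "__main__":
    requirement = {"NER", "disambiguation", "queryConstruction"}
    for pipeline in compose(requirement, COMPONENTS):
        print(pipeline)
```

Exhaustive enumeration of this kind grows exponentially with the number of components, which is one reason a solver-based approach is attractive at larger catalogue sizes.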