Robustness for regression models with asymmetric error distribution
Robustheit für Regressionsmodelle mit asymmetrischen Fehlerverteilungen mit Anwendungen in der Extremwertstatistik
Pupashenko, Daria
 Kaiserslautern, 2015, XXV, 177 S. Kaiserslautern, TU, Diss., 2015 URN: urn:nbn:de:hbz:386kluedo40460 

 English 
 Dissertation, electronic publication 
 Fraunhofer ITWM 
Abstract
In this work we focus on regression models with asymmetric error distributions, more precisely, with extreme value error distributions. This thesis arose within the project "Robust Risk Estimation". Starting in July 2011, the project received three years of funding from the Volkswagen Foundation under the call "Extreme Events: Modelling, Analysis, and Prediction" within the initiative "New Conceptual Approaches to Modelling and Simulation of Complex Systems". The project involves applications in financial mathematics (operational and liquidity risk), medicine (length of stay and cost), and hydrology (river discharge data). These applications are linked by their common use of robustness and extreme value statistics. In each of them, issues arise that can be addressed by means of extreme value theory, with extra information added in the form of regression models. The particular challenge in this context concerns asymmetric error distributions, which significantly complicate the computations and make the desired robustification extremely difficult. This thesis contributes to meeting that challenge. The work consists of three main parts. The first part focuses on the basic notions and gives an overview of existing results in robust statistics and extreme value theory. We also provide some diagnostics, which are an important achievement of our project work. The second part presents a deeper analysis of the basic models and tools used to achieve the main results of the research. It is the most important part of the thesis and contains our own contributions. First, in Chapter 5, we develop robust procedures for the risk management of complex systems in the presence of extreme events. The applications mentioned above have a time structure (e.g. hydrology), so we provide extreme value theory methods with time dynamics. To this end, we considered two strategies within the project. 
In the first one, we capture the dynamics with a state-space model and apply extreme value theory to the residuals; in the second one, we integrate the dynamics by means of autoregressive models, where the regressors are described by generalized linear models. More precisely, since the classical procedures are not appropriate in the presence of outliers, for the first strategy we rework the classical Kalman smoother and extended Kalman procedures in a robust way for different types of outliers and illustrate the performance of the new procedures in a GPS application and a stylized outlier situation. Applying the shrinking-neighborhood approach requires some smoothness; therefore, for the second strategy, we derive smoothness of the generalized linear model in terms of L2 differentiability and establish sufficient conditions for it in the cases of stochastic and deterministic regressors. Moreover, we introduce time dependence in these models by linking the distribution parameters to the model's own past observations. The advantage of our approach is its applicability to error distributions with a higher-dimensional parameter and to regressors of possibly different length for each parameter. Further, we apply our results to models with generalized Pareto and generalized extreme value error distributions. Finally, we create an exemplary implementation in R of the fixed point iteration algorithm for computing the optimally robust influence curve. Here we do not aim to provide the most flexible implementation, but rather sketch how it should be done and retain points of particular importance. In the third part of the thesis we discuss three applications (operational risk, hospitalization times, and hydrological river discharge data), apply our code to a real data set from the Jena University Hospital ICU, and provide the reader with various illustrations and detailed conclusions.
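To make the phrase "regression models with generalized Pareto error distributions" more concrete, the following is a minimal Python sketch, not the thesis's R implementation. It assumes a generalized Pareto distribution with location 0 and a positive shape, and it assumes a log link between a single regressor and the scale parameter. The function names, the link, and the coefficient values are hypothetical choices made for illustration only.

```python
import numpy as np


def gpd_cdf(x, scale, shape):
    """CDF of the generalized Pareto distribution with location 0.

    Valid for x inside the support (x >= 0 when shape >= 0).
    """
    x = np.asarray(x, dtype=float)
    if shape == 0.0:
        # Limiting case: exponential distribution.
        return 1.0 - np.exp(-x / scale)
    return 1.0 - np.power(1.0 + shape * x / scale, -1.0 / shape)


def gpd_quantile(p, scale, shape):
    """Quantile function (inverse CDF) of the generalized Pareto distribution."""
    p = np.asarray(p, dtype=float)
    if shape == 0.0:
        return -scale * np.log1p(-p)
    return scale / shape * (np.power(1.0 - p, -shape) - 1.0)


def simulate_gpd_regression(x, beta, shape, rng):
    """Draw responses whose GPD scale depends on the regressor x.

    Assumption (illustrative only): log(scale) = beta[0] + beta[1] * x.
    """
    x = np.asarray(x, dtype=float)
    scale = np.exp(beta[0] + beta[1] * x)
    u = rng.uniform(size=x.shape)          # uniform draws on [0, 1)
    y = gpd_quantile(u, scale, shape)      # inverse-transform sampling
    return y, scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 200)
    y, scale = simulate_gpd_regression(x, beta=(0.0, 1.0), shape=0.3, rng=rng)
    print("largest simulated response:", y.max())
```

Because the error distribution is one-sided and heavy-tailed, a single large response can dominate a classical fit, which is the situation the robust procedures described in the abstract are meant to handle.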