On the need for model-driven engineering for data harvesters based on experiences from the German GovData.DE Portal
The German governmental data platform GovData.DE was launched in February 2013. Since then, it has accommodated a large number of datasets, which are made accessible through the associated portal. GovData.DE serves as a meta-data hub that provides a single point of access to governmental data, whereas the data itself remains available on the web portals of the respective institutions (also denoted as data providers), e.g. municipalities, city councils, or federal institutions such as the Federal Statistical Office of Germany. The meta-data is regularly obtained from the Internet platforms of the institutions in question. To achieve this, a large number of so-called data harvesters had to be developed, which regularly update the meta-data on GovData.DE based on changes on the data providers' side. In this paper, we report on our experiences in developing data harvesters and identify the need for a model-driven approach to their engineering, which at the same time constitutes an opportunity for tool providers to commercialize their MDE (model-driven engineering) tools. Furthermore, we argue that the use of MDE-based harvesting will improve the quality and timeliness of the provided datasets (including their meta-data) and will correspondingly encourage the utilization of Open Data platforms for commercial developments.