Extensibility of grid-enabled data mining platforms: A case study
In this paper, we discuss requirements for a distributed data mining platform, focusing on the requirement of extensibility. We describe the extensibility of the DataMiningGrid system and present a case study in which we integrate several new algorithms from the Weka data mining suite into the grid environment. Using these algorithms on a regression problem, we evaluate the system's performance. Additionally, we compare its extensibility with that offered by several other platforms. We conclude that DataMiningGrid offers a very flexible environment for the integration of third-party data mining algorithms, and that this flexibility does not come at the price of a large performance overhead.