Data Mining Techniques for Effective Search, Analysis and Investigation of Open Source Software
Dr. Sanjay Ranka, University of Florida
Manas Somaiya, University of Florida
Linda Fernandes, University of Florida
Chiquita Kerur, University of Florida
Supriya Sharma, University of Florida
We demonstrate the use of data mining techniques for effective search, analysis, and investigation of open source software (OSS). We have developed a model for maintaining and searching a database of OSS projects described by attributes such as application domain, programming languages used, number of bugs reported, and number of downloads. The proposed model can learn rules that predict the success of an OSS project from these attributes, using novel data mining techniques such as text mining, clustering, and rule-based classification. To the best of our knowledge, our approach is unique. We have performed initial validation using data extracted from the sourceforge.net website. Our goal is to further refine our approach and to test our techniques on the more comprehensive data available from UND.
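To make the idea of learning success rules from project attributes concrete, the following is a minimal sketch using OneR (one-rule), a standard and very simple rule-based classifier. It is swapped in purely for illustration and is not the model described above. The project records, the attribute names (`domain`, `language`, `downloads`), the `successful` label, and the helper functions `learn_one_rule` and `predict` are all hypothetical and do not come from sourceforge.net data.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, invented for illustration only (not real project data).
PROJECTS = [
    {"domain": "web",   "language": "java",   "downloads": "high", "successful": True},
    {"domain": "web",   "language": "python", "downloads": "low",  "successful": False},
    {"domain": "games", "language": "java",   "downloads": "high", "successful": True},
    {"domain": "games", "language": "c",      "downloads": "low",  "successful": False},
    {"domain": "tools", "language": "python", "downloads": "high", "successful": True},
    {"domain": "tools", "language": "c",      "downloads": "low",  "successful": False},
]


def learn_one_rule(records, attributes, target):
    """OneR: choose the single attribute whose value -> majority-class
    rules make the fewest errors on the training records."""
    best = None
    for attr in attributes:
        # Count target labels for each value of this attribute.
        counts = defaultdict(Counter)
        for rec in records:
            counts[rec[attr]][rec[target]] += 1
        # One rule per attribute value: predict the majority label.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors are the records outside each majority class.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best  # (attribute name, {value: predicted label}, training errors)


def predict(rule, record, default=None):
    """Apply a learned rule; return `default` for an unseen attribute value."""
    attr, rules, _ = rule
    return rules.get(record[attr], default)


if __name__ == "__main__":
    rule = learn_one_rule(PROJECTS, ["domain", "language", "downloads"], "successful")
    print("Chosen attribute:", rule[0])
    print("Rules:", rule[1])
    print("Training errors:", rule[2])
```

On this toy data the learner selects `downloads`, because it is the only attribute that separates the two classes without training errors. With real project data, attributes such as download counts would first need to be discretized, and a rule should be evaluated on held-out projects rather than on the training records.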