Software systems are at once the most complex and the least reliable technological systems human beings construct. A large software system can have over 10^20 states, and the reliability of software is infamously poor. Software engineers must usually make assertions about the reliability of software systems after having observed only an insignificant fraction of the possible states of the system. In this dissertation, three investigations into the use of computational intelligence and machine learning to support human software developers are reported.
The first contribution in this dissertation is the use of chaos theory for software reliability modeling. Software reliability growth models (SRGMs) are used to gauge the current and future reliability of a software system. Virtually all current SRGMs assume that software failures occur randomly in time, an assumption that has never been experimentally tested. In this dissertation, nonlinear time series analysis is used for the first time to ascertain whether software reliability data from three commercial software projects come from a stochastic process or from a nonlinear deterministic process. Evidence of deterministic behavior was found in these datasets, calling into question almost every SRGM published over the last thirty years.
The second contribution is the use of fuzzy clustering and data mining in software metrics datasets. Software metrics are measures of source code, intended as a basis for software quality improvement. Hundreds of metrics have been published in the literature, but no generally applicable regression model relating metrics and failure rates has been found. Instead of statistical regression, this dissertation uses unsupervised machine learning, in the form of the fuzzy c-means algorithm, to analyze three collections of software metrics from commercial systems. This investigation highlights additional challenges for machine learning in the software metrics domain, one of which is skewness. The most common machine learning approach to overcoming skewness is to resample the dataset; this has never been attempted in the software metrics domain. Hence, the third contribution in this dissertation is the use of resampling algorithms to calibrate a decision tree to preferentially recognize high-risk classes of modules.
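As an illustration of the resampling idea only, the following minimal Python sketch balances a skewed dataset by randomly duplicating examples of the under-represented class before a classifier is trained. It assumes simple random oversampling, which is not necessarily the resampling algorithm used in this dissertation; the function name oversample_minority, the metric values, and the "low"/"high" risk labels are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many examples as the largest class (assumed strategy)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical module metrics: (lines of code, cyclomatic complexity).
rows = [(120, 4), (80, 2), (200, 6), (95, 3), (1500, 41)]
labels = ["low", "low", "low", "low", "high"]

balanced_rows, balanced_labels = oversample_minority(rows, labels)
```

A decision tree trained on the balanced data sees the high-risk class as often as the low-risk class, which is one way to bias it toward recognizing the rare, high-risk modules.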