Classification methods are widely used to identify underlying groupings within datasets and, given a trained classifier, to predict the class of new data objects. This study introduces a project that combines simulations and classification techniques to predict epidemic curves and infer underlying disease parameters for an ongoing outbreak.
Six supervised classification methods (random forest, support vector machines, nearest neighbor with three decision rules, linear and flexible discriminant analysis) were used to identify partial epidemic curves from six agent-based stochastic simulations of influenza epidemics. The accuracy of the methods was compared using a performance metric based on the McNemar test.
The findings showed that (1) the assumptions a method makes about the structure of an epidemic curve influence its performance, i.e., methods with fewer assumptions perform best; (2) the performance of most methods is consistent across different individual-based networks for Seattle, Los Angeles, and New York; and (3) combining classifiers using a weighting approach does not guarantee better prediction.
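As an illustration of the kind of pairwise comparison that a McNemar-based metric builds on, the following minimal sketch applies the standard McNemar test (with continuity correction) to two classifiers evaluated on the same test cases. It is not the study's exact performance metric, which is not specified here. The function name `mcnemar_test` and the label vectors are hypothetical, with labels 0 to 5 standing in for the six simulations.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Standard McNemar test with continuity correction.

    Compares two classifiers on the same test cases using only the
    discordant pairs (cases where exactly one classifier is correct).
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    # b: classifier A correct, classifier B wrong.
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    # c: classifier A wrong, classifier B correct.
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical data: the true simulation index for 12 partial epidemic
# curves and the predictions of two made-up classifiers.
y_true = [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]
pred_a = [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 0]
pred_b = [0, 1, 2, 0, 0, 0, 1, 1, 2, 0, 4, 0]

stat, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(f"McNemar statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

Because the test uses only the cases on which the two classifiers disagree, two methods with similar overall accuracy can still be distinguished when their errors fall on different curves.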
©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston