2021

Author(s): Abdullahi T, Nitschke G

The incidence of most diseases varies greatly with the seasons, and global climate change is expected to increase disease risk. Predictive models that automatically capture relationships between climate and disease are likely to help minimize disease outbreaks. Machine learning (ML) predictive analytic tools have been popularized across many health-care applications; however, the optimal task performance of such ML tools largely depends on manual parameter tuning and calibration. Such manual tuning significantly limits the full potential of ML methods, especially in high-dimensional and complex task domains, as typified by real-world health-care data-sets. Additionally, the inaccessibility of many health-care data-sets compounds inherent problems of method comparison, predictive accuracy and the overall advancement of ML-based health-care applications. In this study we investigate the impact of Relevance Estimation and Value Calibration (REVAC), an evolutionary parameter optimization method, used to automate parameter tuning for two comparative ML methods (deep learning and Support Vector Machines) that predict daily diarrhoea cases across various geographic regions. Data augmentation is also used to complement noisy, sparse and incomplete real-world data-sets with synthetic data-sets for training, validation and testing. Results support the efficacy of evolutionary parameter optimization and data synthesis for boosting predictive accuracy in the given task, and indicate a significant boost in prediction accuracy for the deep-learning models across all data-sets.
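
As an illustration of what automated evolutionary parameter tuning can look like, the sketch below is a simplified search loosely modeled on Relevance Estimation and Value Calibration. It is not the authors' implementation. The function name `revac_style_tune`, its arguments, the default settings and the toy objective are all assumptions made for this example. In the setting described above, `evaluate` would instead train a deep-learning or Support Vector Machine model with the given parameters and return its validation error.

```python
import random


def revac_style_tune(evaluate, bounds, pop_size=20, n_parents=10, h=2,
                     steps=200, seed=0):
    """Simplified REVAC-style search (hypothetical helper, minimizes evaluate).

    bounds maps each parameter name to a (low, high) range.
    Returns (best_params, best_score) over everything evaluated.
    """
    rng = random.Random(seed)
    names = list(bounds)

    # Initial population: uniform samples; list order tracks age (oldest first).
    population = []
    for _ in range(pop_size):
        params = {k: rng.uniform(*bounds[k]) for k in names}
        population.append((params, evaluate(params)))
    best = min(population, key=lambda entry: entry[1])

    for _ in range(steps):
        # The best n_parents vectors act as parents for a single child.
        parents = sorted(population, key=lambda entry: entry[1])[:n_parents]
        child = {}
        for k in names:
            values = sorted(p[0][k] for p in parents)
            # Crossover: take this parameter's value from a random parent.
            i = rng.randrange(len(values))
            # Mutation: resample uniformly between the h-th neighbours
            # of that value among the sorted parent values.
            low = values[max(i - h, 0)]
            high = values[min(i + h, len(values) - 1)]
            child[k] = rng.uniform(low, high)

        entry = (child, evaluate(child))
        # The child replaces the oldest member of the population.
        population.pop(0)
        population.append(entry)
        if entry[1] < best[1]:
            best = entry
    return best


def toy_loss(params):
    # Stand-in for a model's validation error; minimum at x=3, y=-1.
    return (params["x"] - 3.0) ** 2 + (params["y"] + 1.0) ** 2


toy_bounds = {"x": (-10.0, 10.0), "y": (-10.0, 10.0)}
best_params, best_score = revac_style_tune(toy_loss, toy_bounds)
print(best_params, best_score)
```

Because each child is drawn from intervals spanned by the current parents, the sampled region tends to contract around well-performing parameter values as the search proceeds, which is the sense in which the tuning is automated rather than manual.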