In empirical studies, statistical theory lets us state how likely certain data are to appear, given some assumptions about the sampled population. From the Central Limit Theorem we developed one-sample tests. We mentioned the two ways of testing: inductively, by means of confidence intervals, and deductively, by means of hypothesis testing. These notions were then extended to the two-sample comparison case, with or without matched pairs.
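The link between the two ways of testing can be illustrated with a minimal sketch. It uses Python's standard library rather than the R used in the lecture, and it assumes a sample large enough for the normal approximation given by the Central Limit Theorem. The function name `z_interval_and_test`, the value `mu0`, and the run-time data are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def z_interval_and_test(sample, mu0, alpha=0.05):
    """Large-sample confidence interval and two-sided z-test for a mean."""
    n = len(sample)
    xbar = mean(sample)
    se = stdev(sample) / math.sqrt(n)              # estimated standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile
    ci = (xbar - z_crit * se, xbar + z_crit * se)  # (1 - alpha) confidence interval
    z = (xbar - mu0) / se                          # test statistic under H0: mean == mu0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return ci, z, p_value

# Hypothetical run times (seconds) of one algorithm on 40 instances.
run_times = [10.0 + 0.1 * ((7 * i) % 11 - 5) for i in range(40)]

ci, z, p = z_interval_and_test(run_times, mu0=10.0)
reject = p < 0.05                          # decision by hypothesis testing
outside = not (ci[0] <= 10.0 <= ci[1])     # decision by confidence interval
print(ci, z, p, reject, outside)
```

Both decisions agree by construction: the test rejects at level `alpha` exactly when `mu0` falls outside the corresponding confidence interval.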
Turning to our case of interest, the analysis of algorithms, we introduced the theory of Experimental Design. We went over a checklist of points that must be addressed when planning an experiment. In doing so we introduced the terminology: treatment and nuisance factors, blocking designs, factorial experiments, and experimental units.
To analyze and draw inferences from the designs that typically arise in algorithm testing, we introduced the statistical method called Analysis of Variance. We covered the main ideas behind this method for both single-factor and two-factor designs. Finally, we gave a short demonstration of how to carry out the tests introduced in R.
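As a reminder of what the single-factor Analysis of Variance computes, here is a minimal sketch in Python (standard library only, not the R code shown in the lecture). It splits the total variation into a between-treatment and a within-treatment part and forms the F statistic as the ratio of their mean squares. The function name `one_way_anova` and the three groups of results are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor design."""
    k = len(groups)                                  # number of treatments
    n_total = sum(len(g) for g in groups)            # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the treatment means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own treatment mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical solution costs of three algorithms, three runs each.
results = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
f_stat, df_b, df_w = one_way_anova(results)
print(f_stat, df_b, df_w)  # 21.0 2 6
```

The p-value is the upper tail of the F distribution with `df_b` and `df_w` degrees of freedom evaluated at `f_stat`. The standard library has no F distribution, so this sketch stops at the statistic.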
This part is described and applied in the article: R.L. Rardin, R. Uzsoy: Experimental Evaluation of Heuristic Optimization Algorithms: A Tutorial. Journal of Heuristics 7(3): 261-304 (2001).
In the next lecture we conclude the part on Empirical Methods by treating multiple comparisons testing and sequential testing, whose principle underlies the racing methodology. The following paper is suggested reading before the lecture:
M. Birattari, T. Stützle, L. Paquete, and K. Varrentrapp. (2002) A Racing Algorithm for Configuring Metaheuristics. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 11-18. Morgan Kaufmann, San Francisco, CA, USA.
Organize the statistical tests discussed in the lecture in a concept map that crosses the possible experimental designs with the parametric or nonparametric nature of the test. Consider the following experimental designs: