Tan, Kumar, and Srivastava describe a theoretical and experimental investigation of measures for association patterns. The authors survey a large number of such measures that have been developed by the statistics, machine learning, and data mining communities, and demonstrate that the measures can provide conflicting information.
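To make the point about conflicting information concrete, here is a minimal sketch comparing two widely used measures, confidence and lift, on two 2x2 contingency tables. The tables `P` and `Q` and their counts are hypothetical values chosen for illustration; they are not taken from the paper.

```python
from typing import NamedTuple


class Table(NamedTuple):
    """2x2 contingency table for items A and B (hypothetical counts)."""
    f11: int  # transactions containing both A and B
    f10: int  # transactions containing A but not B
    f01: int  # transactions containing B but not A
    f00: int  # transactions containing neither


def confidence(t: Table) -> float:
    """Confidence of the rule A -> B: P(B | A)."""
    return t.f11 / (t.f11 + t.f10)


def lift(t: Table) -> float:
    """Lift (interest factor): P(A, B) / (P(A) * P(B))."""
    n = sum(t)
    return n * t.f11 / ((t.f11 + t.f10) * (t.f11 + t.f01))


# Illustrative data only; not from the paper under review.
tables = {
    "P": Table(f11=90, f10=10, f01=810, f00=90),
    "Q": Table(f11=20, f10=30, f01=30, f00=920),
}


def rank(measure) -> list[str]:
    """Order table names from strongest to weakest under the given measure."""
    return sorted(tables, key=lambda name: measure(tables[name]), reverse=True)


by_confidence = rank(confidence)
by_lift = rank(lift)

print("confidence ranking:", by_confidence)
print("lift ranking:      ", by_lift)
```

On these counts, confidence prefers `P` (0.9 versus 0.4) while lift prefers `Q` (8.0 versus 1.0), so an analyst relying on one measure alone would reach the opposite ordering from one relying on the other.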
The authors investigate several key properties of the measures, in addition to the often-cited Piatetsky-Shapiro properties. These properties include symmetry under variable permutation and a number of invariance properties. The effects of support-based pruning and standardization on the properties of the measures are also explored. The authors describe two scenarios, support-based pruning and standardization, under which many of the measures become consistent with one another, a result that should be of great benefit to the data mining community. However, as the authors point out, there are several situations in which these scenarios are not applicable. The final section of the paper describes an interactive process, based on relative rankings provided by domain experts, that can be used to select appropriate measures.
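The kinds of properties mentioned above can be probed numerically. The sketch below checks two of them on a single hypothetical table: symmetry under variable permutation (swapping the roles of A and B) and one example of an invariance property, namely insensitivity to added transactions that contain neither item. The helper names `symmetric_on` and `null_invariant_on`, the table counts, and the choice of this particular invariance are assumptions made for illustration, not the paper's definitions.

```python
import math


def confidence(f11, f10, f01, f00):
    """Confidence of A -> B: P(B | A)."""
    return f11 / (f11 + f10)


def lift(f11, f10, f01, f00):
    """Lift: P(A, B) / (P(A) * P(B))."""
    n = f11 + f10 + f01 + f00
    return n * f11 / ((f11 + f10) * (f11 + f01))


def symmetric_on(measure, table):
    """True if swapping A and B (f10 <-> f01) leaves the value unchanged."""
    f11, f10, f01, f00 = table
    return math.isclose(measure(f11, f10, f01, f00),
                        measure(f11, f01, f10, f00))


def null_invariant_on(measure, table, extra=1000):
    """True if adding `extra` transactions with neither item changes nothing."""
    f11, f10, f01, f00 = table
    return math.isclose(measure(f11, f10, f01, f00),
                        measure(f11, f10, f01, f00 + extra))


# Hypothetical counts (f11, f10, f01, f00); not taken from the paper.
table = (20, 30, 60, 890)

# A single table can refute a property but cannot prove that it holds in general.
results = {
    m.__name__: (symmetric_on(m, table), null_invariant_on(m, table))
    for m in (confidence, lift)
}

for name, (sym, null_inv) in results.items():
    print(f"{name}: symmetric={sym}, null-invariant={null_inv}")
```

On this table, confidence fails the symmetry check but is unaffected by the added transactions, while lift shows the opposite pattern. Differences of this kind are one reason two measures can disagree on the same data.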
This paper is carefully organized, well written, and contains a wealth of information for data mining practitioners. The discussion of the properties of the measures, the scenarios under which many of the measures become consistent, and the interactive method for selecting an appropriate measure are all valuable contributions.