Distributional functionals are integral functionals of probability densities; examples include information divergence, mutual information, and entropy. These functionals play important roles in machine learning, signal processing, statistics, and information theory. This talk presents several applications of distributional functional estimation, focusing on a principled approach to classification via dimensionality reduction and estimation of the optimal probability of error (the Bayes error). We then present a simple, computationally tractable non-parametric estimator for a wide variety of distributional functionals that achieves parametric convergence rates under competitive smoothness conditions. The estimator is demonstrated on sunspot images and neural data from epilepsy patients.