This is the exercise for the Data Training Workshop: Introduction to Statistics and Machine Learning with R.

Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

The data can be used to test (ordinal) regression or classification methods (in effect, this is a multi-class task in which the classes are ordered). Other research issues are feature selection and outlier detection. The data includes two datasets:

  • winequality-red.csv - red wine preference samples;

  • winequality-white.csv - white wine preference samples
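The regression and ordinal classification views of the data can be sketched in R as follows. This is a minimal sketch, not the workshop's own solution: it assumes the files are semicolon-separated and contain a numeric quality column, and it uses a small inline sample with illustrative values (not real rows from the dataset) so that it runs without the files. The column names fixed acidity, alcohol and quality are assumptions about the file layout.

```r
# Hypothetical three-row sample in the semicolon-separated layout assumed for
# winequality-red.csv; the values are illustrative, not taken from the dataset.
sample_csv <- "fixed acidity;alcohol;quality
7.4;9.4;5
7.8;9.8;6
11.2;10.1;7"

# For the real files, replace `text = sample_csv` with the file path,
# e.g. read.csv("winequality-red.csv", sep = ";").
wine <- read.csv(text = sample_csv, sep = ";")

# Regression view: quality stays a numeric score.
str(wine$quality)

# Ordinal classification view: quality becomes an ordered factor,
# so the ordering of the classes (5 < 6 < 7) is kept in the labels.
wine$quality_class <- factor(wine$quality, ordered = TRUE)
levels(wine$quality_class)
```

Keeping both the numeric column and the ordered factor makes it possible to try a regression model and a classifier on the same data frame. Note that read.csv turns the header fixed acidity into fixed.acidity by default.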


Vinho verde is a unique product from the Minho (northwest) region of Portugal. Medium in alcohol, it is particularly appreciated for its freshness (especially in the summer). More details can be found at:

The details are described in [Cortez et al., 2009] by P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.