Categorical Data Analysis

Website for CATEGORICAL DATA ANALYSIS, 3rd edition

For the third edition of Categorical Data Analysis by Alan Agresti (Wiley, 2013), this site contains (1) information on the use of software (SAS, R and S-Plus, Stata, SPSS, and others), (2) data sets for examples and many exercises (for many of which only excerpts were shown in the text itself), (3) short answers for some of the exercises, (4) extra exercises that did not fit in the text itself, and (5) corrections of errors in early printings of the book. There is also (6) a seminar on the history of CDA, and (7) a survey paper on Bayesian inference for CDA. Here is a link to the website for the 2nd edition (2002) of Categorical Data Analysis, which is no longer being updated.

1. Software Appendix

In this appendix we provide details about how to use R, SAS, Stata, and SPSS statistical software for categorical data analysis, with examples in many cases showing how to perform analyses discussed in the text. This supplements the brief description found in Appendix A of the "Categorical Data Analysis" text, 3rd edition, Wiley (2013). For each package, the material is organized by chapter of presentation and refers to datasets analyzed in those chapters. The full data sets are available at datasets.


SAS

Go to SAS for a pdf file containing details about the use of SAS for CDA, with illustrations for data sets in the CDA text.

R and S-Plus

Go to R for a pdf file containing details about the use of R for CDA, and illustrations for data sets in the CDA text. Here is a manual that Dr. Laura Thompson prepared on the use of R and S-Plus to conduct all the analyses in the 2nd edition of the CDA text.


Stata

Go to Stata for discussion of using Stata for CDA.


SPSS

Go to SPSS for discussion of using SPSS for CDA.

Other software

Go to other software for discussion of other software useful for CDA, such as StatXact and LogXact.

2. Primary datasets:

Here are datasets for many of the main examples in the text, and for some of the exercises. The separate directory data files contains some individual files (Crabs for Table 4.3, Teratology for Table 4.7, Credit for Exercise 5.22, Endometrial for Table 7.2, Infection for Table 6.9, SoreThroat for Table 6.15, Substance use for Table 9.3, MBTI for Table 9.17, Substance2 for Table 10.1, Insomnia for Table 12.3, Abortion for Table 13.3). The horseshoe crab data are used to illustrate logistic regression (modeling whether a female crab has at least one satellite) and models for count data (e.g., negative binomial modeling of the number of satellites). For the count data, better models allow for zero-inflation, that is, more zero counts than a standard count distribution predicts. See crab zero-inflation for an excerpt about this, taken from my new book "Foundations of Linear and Generalized Linear Models" (published by Wiley, 2015).
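As a rough illustration of what zero-inflation means for count data such as the number of satellites, here is a minimal Python sketch of the zero-inflated Poisson (ZIP) distribution. It is not taken from the book or from the excerpt linked above. It uses a Poisson rather than a negative binomial count component for simplicity, and the function names (zip_pmf, zip_mean) and the parameter values are hypothetical, chosen only for illustration. The model mixes a point mass at zero (probability pi) with an ordinary Poisson distribution with mean mu, so zeros occur more often than the Poisson alone would predict.

```python
import math


def zip_pmf(y: int, mu: float, pi: float) -> float:
    """Zero-inflated Poisson probability P(Y = y).

    mu : mean of the Poisson component (mu > 0)
    pi : probability of a structural ("excess") zero, 0 <= pi < 1
    """
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    if y == 0:
        # A zero can arise from the structural-zero class
        # or from the Poisson class.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(mu: float, pi: float) -> float:
    """Mean of the zero-inflated Poisson distribution: (1 - pi) * mu."""
    return (1.0 - pi) * mu


# Hypothetical parameter values (assumptions, not estimates from the crab data).
mu, pi = 3.0, 0.3
probs = [zip_pmf(y, mu, pi) for y in range(60)]

print(f"P(Y=0) under plain Poisson: {math.exp(-mu):.4f}")
print(f"P(Y=0) under ZIP:           {probs[0]:.4f}")
print(f"Total probability (y < 60): {sum(probs):.6f}")
print(f"ZIP mean:                   {zip_mean(mu, pi):.4f}")
```

Setting pi = 0 recovers the ordinary Poisson distribution, which is one way to see the standard count model as a special case of the zero-inflated one.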

3. Selected short solutions to exercises:

Here is a pdf file of short solutions for some of the exercises at the ends of the chapters. These are mainly the solutions that were provided for some of the odd-numbered exercises from the 2nd edition of the book. Please report errors to AA@STAT.UFL.EDU, so they can be corrected in future revisions of this site. The author regrets that he cannot provide solutions to exercises not in this file.

4. Additional exercises:

Here is a pdf file containing Extra exercises, mainly taken from the first two editions of the book.

5. Corrections:

Here is a pdf file showing corrections of typos/errors in the third edition.

6. History of CDA:

The final chapter gives a historical tour of CDA.

Here is a seminar (in mp4 format) on the history of CDA.

7. Bayes:

David Hitchcock (Statistics Dept., Univ. of South Carolina) and I wrote a survey paper about Bayesian inference for categorical data analysis that appeared in Statistical Methods and Applications, the Journal of the Italian Statistical Society, in 2005 (volume 14, pages 297-330). It was partly a by-product of a very nice summer that I spent in Florence, Italy. A somewhat longer version of this paper is available as a technical report from the Statistics Department at UF.

Copyright © 2013, Alan Agresti, Department of Statistics, University of Florida.