Past Data Sets

Below is a table of data sets from past StCC projects. Each data set is provided as a tab- or comma-delimited text file. The description files, each titled "datasetname_descr.dat", contain information about the data sets themselves, such as size, variable information, and the people credited with granting permission for data use. The ".dat" file extension may cause problems on some systems, but each of these files is a plain text file and can be opened with any text editor (e.g., WordPad for Windows).

Data sets such as these can be valuable resources for practicing and learning about data management and statistical data analysis. Several of the data sets are deliberately structured to give users experience in reading data from alternative formats. Many consulting groups publish data sets such as these. After working with these data sets, we encourage you to explore the web for additional examples.
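As a starting point for reading these files, here is a minimal Python sketch that loads a tab- or comma-delimited ".dat" file using only the standard library. It assumes the first line of the file is a header row of variable names, which may not hold for every data set (check the description file first). The function names and the sample column names are hypothetical and used only for illustration.

```python
import csv
import io


def guess_delimiter(header_line):
    """Return a tab if the line has more tabs than commas, otherwise a comma."""
    return "\t" if header_line.count("\t") > header_line.count(",") else ","


def read_delimited(text):
    """Parse tab- or comma-delimited text into a list of dicts.

    Assumption: the first line is a header row naming the variables.
    All values are returned as strings; convert them as needed.
    """
    lines = text.splitlines()
    if not lines:
        return []
    delimiter = guess_delimiter(lines[0])
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


def read_dat_file(path):
    """Read a plain-text .dat file from disk and parse it."""
    with open(path, newline="", encoding="utf-8") as handle:
        return read_delimited(handle.read())


if __name__ == "__main__":
    # Hypothetical sample content; the real files have their own variables.
    sample = "site\tvalue\nA\t0.12\nB\t0.30\n"
    for row in read_delimited(sample):
        print(row["site"], float(row["value"]))
```

To use it on a downloaded file, call `read_dat_file` with the local path to that file. Guessing the delimiter from the header line is a deliberately simple heuristic; files with an unusual layout may need the delimiter set by hand.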

Questions about the use of data sets should be addressed to


Emissions Data
Monitoring of 1,3-butadiene at several sites in the Louisville, KY area, with relevant weather data appended. Note: These data were initially made available online by the West Jefferson County Community Task Force.
Data File: Emissions.dat
Help File: Emission_descr.dat

Drug Interaction
Survival data from a trial comparing single and double doses of a drug, alone and with an inhibitor.
Data File: DoubleDose.dat
Help File: DoubleDose_descr.dat

Obesity and Pregnancy
A cohort study of the effects of obesity on pregnancy, particularly the incidence of adverse outcomes.
Data File: ObesityPregnancy.dat
Help File: ObesityPregnancy_descr.dat

Literacy and Sleep Hygiene
A questionnaire survey investigating the association between literacy and sleep hygiene.
Data File: LiteracySleep.dat
Help File: LiteracySleep_descr.dat

Diabetic Neuropathy
A trial testing the effects of an exercise intervention on the general health of elderly diabetes patients, with specific focus on peripheral neuropathy.

Harmonic Tonsillectomy
Data from a trial comparing the ability of two surgical methods to reduce operating time, recovery time, and complications.
Data File: HarmTons.dat
Help File: HarmTons_descr.dat

Cervical Length and Pre-term Birth
Observational study of the predictive value of cervical length on the incidence of pre-term birth.

Cooling Tower Biocides
Sampling of treated national water towers for Legionella bacteria.
Data File: Biocide.dat

Diabetes Medication
Mouse study of diabetes medication.

Sickle Cell Disease and MPI
Study of differences in heart functionality (MPI) among severity/transfusion groups.
Data File: MPI.dat
Help File: MPI_descr.dat

