The Statistical Abstract of the United States, published from 1878 to 2012, is the authoritative and comprehensive summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference, and as a guide to other statistical publications and sources both in print and on the Web. These sources of data include the U.S. Census Bureau, Bureau of Labor Statistics, Bureau of Economic Analysis, and many other Federal agencies and private organizations.
The documentation is segmented by year, and then separated into parts. For example, the documentation for 1994 can be found here.
Example in Python
import pandas as pd

# Download variables of interest from the data portal
# You can load the data file like any text file
df = pd.read_table('default.dat')
Example in R
# Download variables of interest from the data portal
# You can load the data file like any text file
# read.delim() reads tab-separated text with a header row, like pandas' read_table
df <- read.delim("default.dat")
The General Social Survey (GSS) gathers data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes. The survey contains a standard core of demographic, behavioral, and attitudinal questions, plus topics of special interest. Among the topics covered are civil liberties, crime and violence, intergroup tolerance, morality, national spending priorities, psychological well-being, social mobility, and stress and traumatic events. The data are available in SPSS and Stata formats here.
IPUMS is not a collection of compiled statistics; it is composed of microdata. Each record is a person, with all characteristics numerically coded. In most samples persons are organized into households, making it possible to study the characteristics of people in the context of their families or other co-residents. Because the data are individual records and not tables, researchers must use a statistical package to analyze the millions of records in the database. A data extraction system enables users to select only the samples and variables they require. Extracts are delivered as gzip-compressed files. Data used for publication must be cited. The IPUMS download portal yields a data file as well as command files for SAS, SPSS, Stata, and R. Researchers using R are encouraged to use the ipumsr package (manual).
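To make the record structure concrete, here is a minimal Python sketch of reading a gzip-compressed, fixed-width extract and grouping person records into households. It uses only the standard library. The column positions in LAYOUT are hypothetical and chosen purely for illustration; the real positions for any extract come from the codebook or command files delivered with it.

```python
import gzip
from collections import defaultdict

# Hypothetical layout: (name, start, end) as 0-based, end-exclusive character
# positions. Real positions come from the extract's codebook or command files.
LAYOUT = [
    ("YEAR", 0, 4),
    ("SERIAL", 4, 12),
    ("PERNUM", 12, 16),
    ("AGE", 16, 19),
]


def parse_record(line):
    """Slice one fixed-width line into a dict of integer-coded variables."""
    return {name: int(line[start:end]) for name, start, end in LAYOUT}


def read_extract(path):
    """Read every person record from a gzip-compressed fixed-width file."""
    with gzip.open(path, "rt", encoding="ascii") as handle:
        return [parse_record(line) for line in handle if line.strip()]


def group_by_household(records):
    """Collect person records under their (year, household serial) key."""
    households = defaultdict(list)
    for record in records:
        households[(record["YEAR"], record["SERIAL"])].append(record)
    return dict(households)
```

Calling read_extract on the downloaded file and passing the result to group_by_household gives one list of person records per household. For large extracts, pandas' read_fwf with colspecs and compression="gzip" is likely a better fit than loading everything into Python lists.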
ipumsr
Helpful Links:
The purpose of this study was to collect extensive information on the sexual experiences and other social, demographic, attitudinal, and health-related characteristics of adults in the United States. The survey collected information on sexual practices with spouses/cohabitants and other sexual partners and collected background information about the partners. Major areas of investigation include sexual experiences such as number of sexual partners in given time periods, frequency of particular practices, and timing of various sexual events. The data cover childhood and adolescence, as well as adulthood. Other topics in the survey relate to sexual victimization, marriage and cohabitation, and fertility. Respondents were also queried about their physical health, including history of sexually transmitted diseases. Respondents’ attitudes toward premarital sex, the appeal of particular practices such as oral sex, and levels of satisfaction with particular sexual relationships were also studied. Demographic items include race, education, political and religious affiliation, income, and occupation.
The codebook can be found here.
The National Longitudinal Surveys (NLS) provide information on the labor market activities and other significant life events of several groups of men and women at multiple points in time. For more than four decades, NLS data have served as an important tool for economists, sociologists, and other researchers. The NLS program includes the following cohorts:
The download functionality for these data sets provides access to files for SPSS, SAS, Stata, R, or simply a csv. A tagset, codebook, description file, and log file are also included with a download.
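When working from the csv file, missing values need handling before analysis. The sketch below assumes that every value is an integer code and that negative codes from -1 to -5 mark missing data (refusals, "don't know" answers, skips, and non-interviews), as is common in NLS files. Both are assumptions to confirm against the codebook in the download. The column names and the two sample rows are synthetic.

```python
import csv
import io

# Assumption: these negative codes mark missing data. Confirm the exact
# codes against the codebook that ships with the download.
MISSING_CODES = {-1, -2, -3, -4, -5}


def load_nls_csv(handle):
    """Read CSV rows as dicts of ints, replacing missing codes with None."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for name, text in raw.items():
            value = int(text)
            row[name] = None if value in MISSING_CODES else value
        rows.append(row)
    return rows


# Synthetic sample with hypothetical column names, for illustration only.
sample = io.StringIO("ID,INCOME\n1,42000\n2,-4\n")
rows = load_nls_csv(sample)
```

Mapping the codes to None keeps them from being mistaken for real negative values in later arithmetic. To distinguish the reasons for missingness, keep the original code in a separate field instead.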
The R, SAS, and SPSS files contain code needed to load the data set, as well as short explanations for missing values and level names.
Example in Python