ACS Data Users Group

  • 1.  Using Stata svyset command with ACS PUMS

    Posted 10-26-2017 10:24 AM

    Hi. I'm preparing to run a probit regression in Stata using ACS PUMS data.

    Does anyone have experience using the Stata svyset command (or, more generally, specifying relevant survey design factors in a statistical analysis program) with PUMS files?

    The following is suggested for Current Population Survey analyses (see https://www.stata.com/statalist/archive/2008-04/msg00444.html), which uses state (gestcen) and consolidated statistical area (gtcsa) variables.

    egen psu=group(gestcen gtcsa)
    svyset [pw=mars], strat(gestcen) psu(psu)

    Does anyone have an analogous specification for PUMS?




  • 2.  RE: Using Stata svyset command with ACS PUMS

    Posted 10-27-2017 04:30 AM

    Hi Michele,

    I've used the Stata svy commands to analyze survey data (CPS, SIPP, NHIS). The first step is to svyset the data so Stata knows the sample design.

    svyset [pw=wgtp], sdr(wgtp1 - wgtp80) vce(sdr) mse

    (This example uses the single-year 2010 PUMS dataset, ss10hak. The weights used are household-level weights.)
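
    For a person-level model such as the probit in the original question, the same pattern should carry over to the person weights. A minimal sketch, assuming a PUMS person file that contains pwgtp and pwgtp1-pwgtp80; the outcome variable employed is hypothetical and would need to be constructed first:

```stata
* Assumes a PUMS person file with lowercase variable names.
* pwgtp is the full-sample person weight; pwgtp1-pwgtp80 are the replicates.
svyset [pw=pwgtp], sdrweight(pwgtp1-pwgtp80) vce(sdr) mse

* employed is a hypothetical 0/1 outcome; agep and sex are PUMS person variables.
svy: probit employed agep i.sex
```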

    After svysetting the data, you run the command using the svy: prefix, which passes along the options you defined above. Stata will execute this command using the full-sample weights and again for each set of replicate weights. There are two important things to note:

    (1) Not all Stata commands can be run with the svy: prefix.

    (2) If you want to limit your replicate analyses to a subset of the sample (for example, all persons aged 25-64 or all African Americans), you should not use if or in. Instead, use the subpop() option before the colon, as in

    . gen byte age25_64 = age>=25 & age<=64
    . svy, subpop(age25_64): command

    Note that you must first define the subpopulation with a dichotomous variable coded 1 for cases in the subpopulation and 0 for all cases that should be excluded from the analysis.
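
    Putting the two steps together, a sketch with an explicit 0/1 indicator might look like this. The variable names agep and employed are assumptions (employed is a hypothetical outcome), and the data are assumed to be svyset already:

```stata
* 1 = in the subpopulation, 0 = excluded, missing if age is missing.
gen byte age25_64 = inrange(agep, 25, 64) if !missing(agep)

* subpop() keeps the full design in the variance calculation while
* restricting the estimates to the subpopulation.
svy, subpop(age25_64): probit employed i.sex
```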

    Here are some additional resources that may be helpful:

    https://www.stata.com/manuals13/svysvyestimation.pdf

    https://www.stata.com/manuals13/svysvysdr.pdf

    https://usa.ipums.org/usa/repwt.shtml

    http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/sample_surveys/svy_commands/

    https://stats.idre.ucla.edu/other/mult-pkg/faq/sample-setups-for-commonly-used-survey-data-sets/



  • 3.  RE: Using Stata svyset command with ACS PUMS

    Posted 10-27-2017 06:15 AM

    Applying Occam's razor to the -subpop- option, you can just as well run

    svy, subpop( if inrange(age,25,64) ): command



  • 4.  RE: Using Stata svyset command with ACS PUMS

    Posted 10-27-2017 06:19 AM
    Also, I believe you can get versions of the ACS data off IPUMS that have strata and cluster/PSU variables. It is easier to deal with these -- or at least the commands run faster. I have not compared the standard errors produced one way vs. the other, but I would not expect to see huge differences.
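
    For reference, a sketch of that setup, assuming an IPUMS USA extract that includes the cluster, strata, and perwt variables (hhwt would take the place of perwt for household-level analyses); the outcome employed is hypothetical:

```stata
* IPUMS USA extract: cluster = PSU identifier, strata = stratum, perwt = person weight.
svyset cluster [pw=perwt], strata(strata)

* Linearized (Taylor series) standard errors instead of 80 replicate runs.
svy: probit employed age i.sex
```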


  • 5.  RE: Using Stata svyset command with ACS PUMS

    Posted 10-27-2017 08:30 AM
    Hi Amanda,

    Thanks so much! This is exactly what I needed.

    Michele


  • 6.  RE: Using Stata svyset command with ACS PUMS

    Posted 10-27-2017 07:29 AM
    Hi Stas,

    Thank you very much! This is extremely helpful. I'm grateful for your assistance.

    Michele