ACS Data Users Group

 View Only
  • 1.  Use of microdata file for average state income

    Posted 06-06-2023 11:09 AM

    I'm trying to calculate the average (mean) 2021 income of the state of Minnesota using PUMS to see if I can come close to the data in the ACS table S1901.

    From S1901, I find that in 2021, 1YR ACS, there are 2,281,033 households in the state and households have a mean income of $103,033.

    From microdata file PSAM_h27, I am calculating mean income (in a Power BI Measure) in two steps. First, I summarize for all records a product calculation for each record of HINCP*ADJINC*WGTP/1000000; and second, I divide that summation by the sum of WGTP of all records. In this case I get a total of 2,516,964 households with a mean income of $93,370. This does not seem close enough to the ACS table results.

    However, if I filter out all records with a minimum HINCP of 1, and use the same formula as above, I get a total of 2,262,459 households and a mean income of $104,990. This seems pretty close to the Table results.

    My question, is my general approach to getting mean income correct, and should I be filtering for HINCP?

    (While the question above is my main one, I would also be interested in knowing how the IPUM household data can be used to calculate the Median Income.)



  • 2.  RE: Use of microdata file for average state income

    Posted 06-06-2023 12:13 PM

    Hi cbachx--

    You're right to validate your results with published tabulations: a good way to identify there's a problem.

    When using the PSAM_H27 table, be sure that you've excluded vacant housing units. The "universe" of the H table is: Housing units combined with households characteristics for the occupied housing. It should be easy to filter those records of vacant units. Those records will be coded as vacant; and they'll have no households characteristics.

    I'll add... to make this post useful to everyone... also know that the "universe" of the P table includes both population in households and population in group quarters. If you were using the P table to get household characteristics (like, for example, grouping by age of reference person) then you would need to filter out the pop in GQs. (Though I don't think that's cbachx's issue.)


    Hope that helps,

    Todd Graham

    Metropolitan Council Research



  • 3.  RE: Use of microdata file for average state income

    Posted 06-06-2023 03:25 PM

    Thanks very much, Todd. Good to know that you are still out there with all your data knowledge!

    When I filtered on HINCP being at least 1, that must have filtered out the vacant units and maybe a few more that shouldn't have been excluded. I'll modify the filtering to just exclude vacant units.

    Do you have any ideas regarding how this table, PSAM_H27 can be used to find the Median Income?

    Chip Halbach



  • 4.  RE: Use of microdata file for average state income

    Posted 06-06-2023 04:31 PM

    Right. HINCP could be zero or negative; that would be valid.

    You're asking how to calc a weighted median? It depends on the stats analysis software you're using.

    Personally, I use SAS. If you're a SAS user, try the PROC MEANS process step; and be sure to invoke the WEIGHT option. PROC MEANS offers (despite its name) results that include the (weighted) 5th, 25th, 50th, 75th, and 95th percentile results.

    (I'm sure R offers some similar function, but I don't know it off the top of my head.)

    cheers,
    Todd Graham



  • 5.  RE: Use of microdata file for average state income

    Posted 06-10-2023 10:05 PM

    Thanks again Todd. I use the Microsoft Power BI program to work with databases so I'll have to probe regarding getting a median number, but my main interest was in making sure that I was approaching the microdata in the right way and you helped get me there.



  • 6.  RE: Use of microdata file for average state income

    Posted 06-11-2023 03:14 PM

    Hi!!

    absolutely agree with your statement

    Filtering for HINCP could be a good idea, as that could help you to remove any outliers that might be skewing your results. In general, your approach seems sound, but I would recommend doing some additional research to see if there are any other factors that might be affecting your results.

    regards,

    grace