Hello,
A question I get from time to time is how to calculate/estimate/approximate the median (age/income/home value/monthly rent/etc.) for several census tracts grouped together. I want to have a better answer for people than "do it using PUMS," because often they're really just looking for a quick approximation.
Here's what I'm thinking, and I would love feedback on it from this group of experts.
Step 1. Get the table for the full distribution with the most detailed intervals/smallest ranges you can find (for age, B01001; for household income, B19001).
Step 2. For each tract you're interested in, create a list/array in which the midpoint of each range is repeated once for every person (or household) counted in that range.
(For example, if a tract had 100 people age 0 to 4 and 105 people age 5 to 9, ..., my list would look like [2, 2, ... (100 times), 7, 7, 7, ... (105 times), ... etc.])
Step 3. Combine into 1 list/array (one that keeps repeated values). So if you have 4 tracts of interest, combine the values from all 4 into 1 big one.
Step 4. Sort and take the median (50th percentile).
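To make the steps concrete, here is a rough Python sketch of what I have in mind. The tract counts are made up, and the names (tract_a, tract_b, expand_midpoints, approximate_median) are just placeholders for illustration, not anything from a Census tool. It also assumes every range has both a lower and an upper bound; an open-ended top range (like "85 years and over") has no midpoint, so you would have to pick a stand-in value for it.

```python
from statistics import median

# Hypothetical counts per range for two tracts: {(low, high): count}.
# Real numbers would come from a table such as B01001 or B19001.
tract_a = {(0, 4): 100, (5, 9): 105, (10, 14): 90}
tract_b = {(0, 4): 80, (5, 9): 120, (10, 14): 110}


def expand_midpoints(bins):
    """Step 2: repeat each range's midpoint once per person counted in it."""
    values = []
    for (low, high), count in bins.items():
        midpoint = (low + high) / 2
        values.extend([midpoint] * count)
    return values


def approximate_median(tracts):
    """Steps 3 and 4: pool the expanded values from all tracts, take the median."""
    combined = []
    for bins in tracts:
        combined.extend(expand_midpoints(bins))
    # statistics.median sorts the values internally
    return median(combined)


approx = approximate_median([tract_a, tract_b])
print(approx)
```

One thing this makes obvious: the answer can only ever be one of the midpoints (or the average of two neighboring ones), so the approximation is only as fine as the ranges in the table.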
Advanced question: is it even possible to approximate an MOE for this?