Hi Joe, thanks for responding. Nearly infinite they may be, but the Census Bureau knows who they are and what their ZIPs and ZCTAs are. Presumably this knowledge is behind their "most instances" statement.
I have a large dataset of people with their ZIP codes but not their ZCTAs (because who knows their ZCTA?), which I would like to match to SES data. If I assume their ZIP codes are the same as their ZCTAs and match to ACS data, there will be 2 kinds of issues:
1) ZIPs that don't match to any ZCTA (because the ZIP is in the minority in every census block it occurs in, or because it is some entity like a PO box that the Census Bureau does not assign ZCTAs to). This is the issue you mentioned. I would lose these people in my analysis, but it does give me an estimate (assuming my dataset is large enough and random enough) of that no-match rate.
2) ZIPs that don't match their ZCTA. Say census block A is all ZIP code 99001 so their ZCTA is also 99001, and census block B is mostly ZIP code 99002 with a little 99001 so their ZCTA is 99002. Fred, in my dataset, lives in census block B with ZIP=99001. He is in ZCTA 99002 but my matching will place him in ZCTA 99001 because of his ZIP. It's this misclassification rate I would like to get some idea of.
I don't need to derive it -- I just want to know if anyone (besides the Census Bureau) knows what it is.
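To make the two cases concrete, here is a minimal sketch in Python with made-up data. All names and values (people, known_zctas, true_zcta, the 9900x codes) are hypothetical. The true_zcta field is exactly what my real dataset lacks, which is why I can estimate the first rate myself but not the second.

```python
from collections import Counter

# Hypothetical toy data. "true_zcta" is the ZCTA of the census block the
# person actually lives in; a real ZIP-only dataset would not have it.
people = [
    {"name": "Alice", "zip": "99001", "true_zcta": "99001"},  # block A
    {"name": "Fred", "zip": "99001", "true_zcta": "99002"},   # block B, minority ZIP
    {"name": "Gina", "zip": "99002", "true_zcta": "99002"},   # block B, majority ZIP
    {"name": "Hank", "zip": "99003", "true_zcta": "99002"},   # ZIP with no ZCTA
]

# Hypothetical set of ZCTA codes present in the ACS data.
known_zctas = {"99001", "99002"}


def classify(person, zctas):
    """Label what happens when the ZIP is used as if it were the ZCTA."""
    if person["zip"] not in zctas:
        return "no_match"       # issue 1: person drops out of the analysis
    if person["zip"] != person["true_zcta"]:
        return "misclassified"  # issue 2: matched, but to the wrong ZCTA
    return "correct"


def rates(people, zctas):
    """Share of people falling into each outcome."""
    counts = Counter(classify(p, zctas) for p in people)
    n = len(people)
    return {k: counts[k] / n for k in ("no_match", "misclassified", "correct")}


print(rates(people, known_zctas))
```

In this toy example Hank is the no-match case and Fred is the misclassified case. Only the no_match share can be computed from ZIP codes alone.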
Anne