Tuesday, October 22, 2019

What Zillow Has Learned Creating Web Surveys

Zillow conducts a variety of surveys to understand the housing market and real-estate consumer behavior. With the publication of the fourth annual Zillow Group Consumer Housing Trends Report, we've distilled several key lessons that have refined our approach and will inform our work going forward to obtain and disseminate high-quality survey data.

Challenges of online surveys

The main challenge of obtaining high-quality Web survey data stems from the fact that online surveys are non-probability samples. Both phone and mail surveys allow for sampling at random from the complete population of phone numbers and addresses. This is not possible with online surveys, and the concern is that people who choose to complete Web surveys may be different from those who don't – and their responses may not be representative of the population.

Despite this challenge, Zillow conducted a Web survey for several reasons:

  • Response rates have declined considerably for traditional phone surveys, limiting their viability as true probability samples and making them more prone to non-response bias.[1]
  • Traditional phone surveys are more expensive, and data collection takes much longer. This is especially true for hard-to-reach populations: for example, phone surveys are less feasible for recruiting participants who have a low incidence in the population, such as recent home buyers and sellers.
  • While Web surveys are prone to some forms of satisficing shortcuts, such as straight-lining down a list, they are less prone to social desirability bias on more sensitive topics.[2] This is important for understanding real-estate topics such as home affordability, preferences for neighborhood diversity, and disclosing dollar amounts for home prices, income, etc.

Unsurprisingly, online surveys are becoming more common, especially with the growth in internet use. But with the challenge of the non-probability design, it is difficult to create a Web survey that provides trustworthy results.

Drawing a More Representative Online Sample

Many Web surveys recruit participants without regard to constructing a sample resembling the target population, making it unlikely that the resulting data reflect the true sentiments of the population. This kind of survey data should be viewed with skepticism – but unfortunately, too often it is not.

A better approach to online surveys is to construct an initial sample designed to align with the population. This can be done, in part, by examining distributions of relevant characteristics in a trusted benchmark survey (like the Census) and recruiting an initial sample of Web survey participants that matches distributions of these attributes.

Representativeness can also be improved using matching techniques after drawing the initial sample:

  • First, a random or stratified sample is drawn from the target benchmark population data source.
  • Next, respondents from the online survey are selected who most closely match each member of the target benchmark sample on a set of variables.
    • Respondents without close matches to the benchmark sample are discarded, producing an online sample more representative of the target sample.[3]
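The matching steps above can be sketched in a few lines. This is a simplified, hypothetical illustration (greedy one-to-one nearest-neighbor matching on standardized variables), not Zillow's actual implementation:

```python
import numpy as np

def match_sample(web, target, max_dist=1.0):
    """Greedy one-to-one nearest-neighbor matching.

    web, target: 2-D arrays of standardized matching variables, one
    row per Web respondent / benchmark member. Returns the indices of
    Web respondents retained in the matched sample; respondents whose
    closest match is farther than max_dist are discarded.
    (Hypothetical sketch, not Zillow's production code.)
    """
    available = set(range(len(web)))
    matched = []
    for t in target:
        if not available:
            break
        # Find the still-unmatched Web respondent closest to this
        # benchmark member in the matching-variable space.
        best = min(available, key=lambda i: np.linalg.norm(web[i] - t))
        if np.linalg.norm(web[best] - t) <= max_dist:
            matched.append(best)
            available.remove(best)
    return matched
```

Matching without replacement (each Web respondent is used at most once) keeps the matched sample's size and composition tied to the benchmark sample rather than to a few "popular" respondents.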

We used a combination of these two techniques to create the sample in the Zillow Group Consumer Housing Trends Report. For data on home buyers, we selected Web survey respondents with similar distributions of age, gender, race and education as respondents from the most recent American Community Survey (ACS) who purchased a home in the past year within each region of the U.S. To further improve the representativeness of our initial sample, we matched Web survey respondents to members of a stratified sample drawn from the ACS based on their age, race, education level, gender, household composition, census region and number of bedrooms in their home. Individuals without a close match were removed from the online sample before constructing weights.

Weight, What?

While measures can be taken to form a representative sample prior to data collection, additional steps should also be taken after survey fielding (and matching) to adjust the sample to more accurately reflect the characteristics of the population. To do this, each respondent is given a weight used in data analysis to obtain more representative population estimates.

There are a variety of weighting techniques used in surveys, including weights to adjust for survey design (e.g. oversamples of small groups), to correct for survey non-response, and to balance the sample to better reflect known characteristics of the population. Depending on the survey, several weighting strategies are employed in a stepwise process to create the final weight for each respondent.
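To make the balancing idea concrete: a minimal post-stratification weight for a single categorical variable is just the ratio of the population share to the sample share in each cell. A hypothetical sketch (names and data are illustrative, not from the report):

```python
from collections import Counter

def cell_weights(sample_cells, pop_shares):
    """Post-stratification cell weighting.

    sample_cells: one cell label per respondent (e.g. "owner"/"renter").
    pop_shares: known population proportion for each cell label.
    Each respondent's weight is pop share / sample share for their
    cell, so weighted cell shares match the population.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [pop_shares[c] / (counts[c] / n) for c in sample_cells]
```

For example, if owners are 75% of the sample but 50% of the population, each owner gets weight 0.5/0.75 ≈ 0.67 and each renter gets 0.5/0.25 = 2, so the weighted sample is half owners, half renters.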

For the report, Zillow used a combination of propensity and post-stratification weighting techniques, on top of sampling and matching mentioned above, to improve representativeness of our online survey data. Using propensity weighting and matching techniques does not always improve accuracy beyond more basic methods such as raking,[4] but can reduce bias incrementally when used in combination with post-stratification and raking techniques.[5]
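Raking, mentioned above and described in footnote [4], can be sketched as iterative proportional fitting over a set of categorical variables. This is a hypothetical illustration assuming known population shares for each variable, not the report's actual procedure:

```python
import numpy as np

def rake(weights, cats, targets, iters=50):
    """Iterative proportional fitting (raking).

    cats: list of arrays, one per raking variable, giving each
    respondent's category label. targets: list of dicts mapping
    label -> known population share. Repeatedly rescales weights so
    the weighted share of each category matches its target; iterating
    across variables lets the adjustments converge jointly.
    (Hypothetical sketch.)
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(iters):
        for cat, target in zip(cats, targets):
            for label, share in target.items():
                mask = cat == label
                current = w[mask].sum() / w.sum()
                w[mask] *= share / current
    return w
```

Adjusting one variable can knock the others slightly out of balance, which is why the outer loop repeats the sweep until the weighted margins settle.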

Propensity weighting attempts to adjust for the probability of selection into the survey sample. After respondents from the survey sample are matched to the target sample, a model (e.g. logistic regression) estimates a propensity score – the probability that each survey respondent was included in the sample – based on a large set of observable characteristics. The propensity score function included attributes such as age, gender, race/ethnicity, years of education, Census division, homeownership, household composition and number of bedrooms.

Next, the propensity scores were grouped into deciles, and each respondent's weight was constructed as the inverse of the estimated probability of inclusion for their decile. As a final step, the propensity-score weights were post-stratified to match population distributions on several demographic variables and daily internet usage. Because those responding to online surveys are more likely to be heavy internet users, we've learned it is important to adjust for internet use in the population to achieve accurate responses to questions involving technology use (e.g. searching for a home using online tools).
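The decile-based step described above can be sketched as follows, assuming propensity scores have already been estimated. The function and its details are a hypothetical illustration, not the report's actual code:

```python
import numpy as np

def decile_weights(propensity):
    """Group propensity scores into deciles and weight each respondent
    by the inverse of the mean inclusion probability in their decile.
    Respondents who were unlikely to end up in the sample get larger
    weights. (Hypothetical sketch.)"""
    propensity = np.asarray(propensity, dtype=float)
    # Decile boundaries from the empirical distribution of scores.
    edges = np.quantile(propensity, np.linspace(0, 1, 11))
    decile = np.clip(np.searchsorted(edges, propensity, side="right") - 1, 0, 9)
    means = np.array([propensity[decile == d].mean() for d in range(10)])
    return 1.0 / means[decile]
```

Averaging within deciles, rather than inverting each raw score, keeps a few extreme propensity estimates from producing unstable weights.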

Validating Online Surveys

After collecting and weighting the survey data, it is important to validate the data for accuracy. Zillow builds attention checks into the survey and flags respondents for potential removal if they speed through the survey or give logically inconsistent answers.

Another important step in our validation process is to benchmark against responses to other trusted data sources. While all survey data contains some error, large government surveys such as the ACS undergo substantial testing on questionnaire design; have better response rates than non-government surveys; have large sample sizes; and use government records of household addresses to draw high-quality probability samples.[6]

We validate responses in our surveys against existing benchmark data for questions that also exist in other data sources and are not attributes used in weighting. This allows us to see how closely our survey data align with benchmark data and, when responses differ, where we should be more cautious about the results and investigate why. Sometimes these differences suggest the weighting should be revised or a question was misunderstood. In other instances, differences may be worth highlighting as noteworthy, especially if there is good reason to believe conditions have changed since the benchmark data were collected, or when we are examining a topic in a new or more nuanced way than prior studies.

Benchmarking informs the potential accuracy of the survey overall and can help inform the reliability of responses to questions that don't exist in any benchmarking data sources – essentially, where the only source of truth is our survey responses themselves. In these cases, we also review the results with subject matter experts for their assessment of whether the data are reasonable, and compare survey findings with Zillow's internal databases whenever possible.

Creating a Source of Truth

Ultimately, the purpose of survey research is to understand preferences, decisions or behavior when no objective data exist. It's an attempt to move beyond anecdotal stories and personal experience toward something closer to the truth. Armed with such insights, we can all make better decisions. But care is needed: there are right ways and wrong ways to field and analyze survey data. We continually refine our own techniques and hope any other surveys you use to inform your knowledge do the same.

 

[1] Meyer, B.D., Mok, W.K. and Sullivan, J.X., 2015. Household surveys in crisis. Journal of Economic Perspectives, 29(4), pp. 199-226.

[2] Lavrakas, P.J., Benson, G., Blumberg, S., Buskirk, T., Cervantes, I.F., Christian, L., Dutwin, D., Fahimi, M., Fienberg, H. and Guterbock, T., 2017. The Future of US General Population Telephone Survey Research.

[3] Rivers, D., 2007, August. Sampling for web surveys. In Joint Statistical Meetings.

[4] Raking is a common weighting technique used to balance the sample to reflect population characteristics by adjusting the sample weights so proportions of selected characteristics match known totals in the target data. This process is done iteratively across each variable used in weighting, until the weighted distributions match the target population on those attributes. For a description of various weighting techniques, see: Kalton, G. and Flores-Cervantes, I., 2003. "Weighting methods." Journal of Official Statistics, 19(2), p. 81.

[5] See: Dutwin, D. and Buskirk, T.D., 2017. "Apples to oranges or gala versus golden delicious? Comparing data quality of nonprobability Internet samples to low response rate probability samples." Public Opinion Quarterly, 81(S1), pp. 213-239; and Pew Research Center, Jan 2018, "For weighting online opt-in samples, what matters most?"

[6] U.S. Census Bureau, June 2018, "American Community Survey (ACS) Design and Methodology Report".

The post What Zillow Has Learned Creating Web Surveys appeared first on Zillow Research.


