Wednesday, February 13, 2019

What is Zillow's Buyer-Seller Index, and How is it Computed?

The Zillow Buyer-Seller Index (BSI) is a measure of the balance between sellers and buyers in a given market. A hot market, or seller's market, typically occurs when buyers are forced to compete for a limited supply of homes, often resulting in higher prices and/or quicker sales that tend to benefit sellers. A cold market, or buyer's market, is the opposite: a general lack of demand means homes can linger on the market longer and ultimately sell for less, putting negotiating power in the hands of buyers.

The media and others often use isolated housing market indicators and broad heuristics to classify markets as either hot or cold. Zillow's BSI uses a consistent, rigorously developed set of granular housing market fundamentals to create a nuanced scale of market heat, incorporating historical and regional quantitative context.

The full index includes two measurements:

  • The cross-time BSI measures how hot a region's housing market is relative to its own history.
  • The cross-region BSI measures how hot a region's housing market is relative to other regions at a single point in time.

Together, they capture the dynamics of negotiating power in a local housing market. Both indices are computed monthly.

Input Data

The BSI is computed using three input metrics:

  1. Percentage of listings with a price cut: The percentage of current for-sale listings on Zillow with a price cut during the month.
  2. Median days on Zillow: The median days on market of homes sold within a given month, including foreclosure re-sales.
  3. Median sale-to-list price ratio: The median of the ratio between the sale price and the list price for all homes (e.g., if a home with a list price of $200k sells for $250k, its ratio would be 5:4, or 1.25).

The days on Zillow and sale-to-list ratio used in our BSI calculation differ slightly from the series published regularly on our website. The differences are explained in the sections that follow.

Cleaning of Input Data

Because the percentage of listings with a price cut is calculated separately, it has its own suppression rules that determine which observations in each regional time series are suspicious. Only unsuppressed (publicly published) observations from each aggregate series are retained as inputs to the BSI.

A property-level algorithm matches transactions to listings, and the sale-to-list price ratio and the number of days on Zillow (time on market) are calculated for each matched transaction. To give the BSI a more stable input than the published data, both metrics receive special handling of observation counts, aggregation and imputation. Both metrics are first subjected to the following property-level suppression rules:

  • Days on Zillow is reported as NA (missing) for a given transaction if it's lower than 10 or higher than 365.
  • Sale-to-list price ratio is reported as NA for a given transaction if it's below 0.5 or above 2 (that is, the list price is more than twice or less than half the sale price).
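
The two property-level rules above can be sketched in a few lines of Python. This is a minimal illustration, not Zillow's implementation: the function names are hypothetical, NA is represented as None, and the handling of a non-positive list price is an assumption.

```python
from typing import Optional

# Thresholds taken from the suppression rules above.
MIN_DAYS, MAX_DAYS = 10, 365
MIN_RATIO, MAX_RATIO = 0.5, 2.0


def clean_days_on_zillow(days: Optional[float]) -> Optional[float]:
    """Return None (NA) when days on Zillow is lower than 10 or higher than 365."""
    if days is None or days < MIN_DAYS or days > MAX_DAYS:
        return None
    return days


def clean_sale_to_list(sale_price: float, list_price: float) -> Optional[float]:
    """Return the sale-to-list ratio, or None (NA) when it is below 0.5 or above 2."""
    if list_price <= 0:
        # Assumption: a non-positive list price cannot yield a usable ratio.
        return None
    ratio = sale_price / list_price
    if ratio < MIN_RATIO or ratio > MAX_RATIO:
        return None
    return ratio
```

With the example from the input list, clean_sale_to_list(250_000, 200_000) gives 1.25, while a home that sells for less than half its list price is reported as NA.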

Aggregation and Outlier Removal of Input Data

Percentage of listings with a price cut is pre-aggregated to all regional levels. The two directly calculated inputs, days on Zillow and sale-to-list price ratio, are aggregated at the following regional levels:

  • Neighborhood
  • ZIP code
  • City
  • County
  • Metro (CBSA)
  • State
  • Nation (United States)

The aggregation uses a rolling median with a three-month window. Employing a rolling median with a smaller window leads to many regions having too few observations, while a bigger window does not significantly increase the number of regions for which data is observed. Data are considered too sparse in a given month for a region if the number of observations is fewer than 15. In these cases, the data value is reported as NA.
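
As a rough sketch of this aggregation step, the function below pools property-level values over a three-month window and takes their median. Two details are assumptions, since the text does not spell them out: the window is taken to be trailing (the current month plus the two before it), and the 15-observation minimum is applied to the pooled window rather than to each month separately. The function name is hypothetical.

```python
from statistics import median
from typing import List, Optional

MIN_OBS = 15  # from the text: fewer than 15 observations is too sparse
WINDOW = 3    # from the text: three-month window


def rolling_median(monthly_values: List[List[float]]) -> List[Optional[float]]:
    """monthly_values[i] holds one region's cleaned property-level values for month i.

    For each month, pool it with up to two preceding months (assumed trailing
    window) and take the median; report None (NA) when the pooled count is
    below MIN_OBS.
    """
    result: List[Optional[float]] = []
    for i in range(len(monthly_values)):
        window = monthly_values[max(0, i - WINDOW + 1): i + 1]
        pooled = [value for month in window for value in month]
        result.append(median(pooled) if len(pooled) >= MIN_OBS else None)
    return result
```

A region with six matched sales per month would be NA in its first two months (6 and 12 pooled observations) and get its first value in the third month, once 18 observations are available.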

The three aggregate input metrics for BSI now have a time series structure for all regions of interest.

Input Data Volatility and Imputation

All input metrics can show significant month-over-month (MoM) volatility. To suppress outliers, MoM growth rates are calculated for each region. A metric's value for a given month-region combination is reported as NA if its MoM growth is less than -50 percent or more than 100 percent, that is, if the value more than halves or more than doubles in a single month. This is done at all geographic levels.
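
A minimal sketch of this filter follows. It assumes the lower bound is meant as a 50 percent decline (a growth rate of -0.5) and the upper bound as a 100 percent increase (a growth rate of 1.0); both are exposed as parameters. It also assumes growth is measured against the previous month's unfiltered value. The function name is hypothetical.

```python
from typing import List, Optional


def suppress_mom_outliers(
    series: List[Optional[float]],
    lower: float = -0.5,  # assumed: growth below -50 percent is an outlier
    upper: float = 1.0,   # from the text: growth above 100 percent is an outlier
) -> List[Optional[float]]:
    """Set a month to None (NA) when its month-over-month growth is out of bounds."""
    cleaned = list(series)
    for i in range(1, len(series)):
        previous, current = series[i - 1], series[i]
        if previous is None or current is None or previous == 0:
            continue  # growth rate is undefined, so the value is left as-is
        growth = current / previous - 1.0
        if growth < lower or growth > upper:
            cleaned[i] = None
    return cleaned
```

For a median days-on-Zillow series of 20, 21, 45, 44, the jump from 21 to 45 is growth of about 114 percent, so the third month is reported as NA.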

Data for child regions of metros are imputed using the parent metro's MoM growth rates, chaining backward and forward from observed values. If a metro's growth rates are missing for fewer than 12 months, they are linearly interpolated. If data are missing for more than one year, the state's growth rates are used. If those are missing as well (which rarely happens), U.S. growth rates are used. A region that is missing more than 50 percent of its time series observations is suppressed.
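
The chaining idea can be illustrated with a short sketch. It covers only the child-region step and the 50 percent suppression rule; the interpolation of metro growth rates and the fallback to state or U.S. rates are left out. The function name, the convention that parent_growth[i] is the growth from month i-1 to month i, and the order of the forward and backward passes are all assumptions.

```python
from typing import List, Optional


def impute_with_parent_growth(
    child: List[Optional[float]],
    parent_growth: List[Optional[float]],
) -> List[Optional[float]]:
    """Fill gaps in a child region's series using the parent's MoM growth rates.

    parent_growth[i] is assumed to be the parent's growth from month i-1 to
    month i (index 0 is unused). Returns all None when more than 50 percent
    of the child's observations are missing.
    """
    n = len(child)
    missing = sum(1 for value in child if value is None)
    if n == 0 or missing / n > 0.5:
        return [None] * n  # suppressed region

    filled = list(child)
    # Forward chaining: carry the last known value forward with parent growth.
    for i in range(1, n):
        g = parent_growth[i]
        if filled[i] is None and filled[i - 1] is not None and g is not None:
            filled[i] = filled[i - 1] * (1.0 + g)
    # Backward chaining: back out earlier values from the next known value.
    for i in range(n - 2, -1, -1):
        g = parent_growth[i + 1]
        if filled[i] is None and filled[i + 1] is not None and g is not None and g != -1.0:
            filled[i] = filled[i + 1] / (1.0 + g)
    return filled
```

For example, a ZIP code series of NA, 100, NA, 110 with parent metro growth of 10, 5 and 2 percent in months two through four would be filled to roughly 90.9, 100, 105, 110.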

From Inputs to Final Metrics

Comparable Distributions and Seasonality

Because each input is measured in its own absolute units, the inputs are normalized by forcing their respective distributions onto the inclusive interval 0 to 1. The minimum and maximum values used are the "global" minima and maxima of the unsuppressed data across all regions and time periods. This implies a comparison of data across regions. However, this may be the least restrictive way of normalizing the inputs while maintaining a valid comparison over time.
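
In code, this normalization is ordinary min-max scaling with one global minimum and maximum per input. The sketch below is illustrative only: the function name and the dictionary layout are hypothetical, and returning 0.0 when all observations are identical is an assumption.

```python
from typing import Dict, List, Optional

Series = List[Optional[float]]


def min_max_normalize(values_by_region: Dict[str, Series]) -> Dict[str, Series]:
    """Scale one input metric to [0, 1] using its global min and max.

    The extremes are taken over every region and month with an observed
    (non-None) value; None entries stay None.
    """
    observed = [v for series in values_by_region.values() for v in series if v is not None]
    if not observed:
        raise ValueError("no unsuppressed observations to normalize")
    lo, hi = min(observed), max(observed)
    span = hi - lo

    def scale(value: Optional[float]) -> Optional[float]:
        if value is None:
            return None
        # Assumption: a metric with no variation at all maps to 0.0.
        return 0.0 if span == 0 else (value - lo) / span

    return {region: [scale(v) for v in series] for region, series in values_by_region.items()}
```

Because every region shares the same extremes, a normalized value of 0.5 means the same thing in every region and month, which is what keeps the comparison over time valid.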

The fundamental Buyer-Seller Index (raw BSI) is defined for a given region and month as the average of the three normalized inputs. As it is a convex combination of normalized inputs, it is bounded between 0 and 1. Because of strong seasonality, a seasonally decomposed version of the raw BSI (the trend component) is also produced. The decomposition uses the X-13ARIMA-SEATS method. If the seasonal decomposition fails (because the likelihood optimizer does not converge), the region is excluded from the dataset to ensure comparability between the seasonally decomposed BSI and the raw BSI (this is the case for roughly five regions).
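
Written out, the definition and its bound look as follows. The notation is introduced here for illustration and does not come from the original post: x is the value of input k for region r in month t, and m and M are that input's global minimum and maximum. The post does not say whether inputs for which a larger value signals a colder market are flipped before averaging, so the formula takes the normalized inputs as given.

```latex
\tilde{x}^{(k)}_{r,t} = \frac{x^{(k)}_{r,t} - m_k}{M_k - m_k} \in [0, 1],
\qquad k = 1, 2, 3

\mathrm{BSI}^{\mathrm{raw}}_{r,t}
  = \frac{1}{3} \sum_{k=1}^{3} \tilde{x}^{(k)}_{r,t},
\qquad
0 \le \mathrm{BSI}^{\mathrm{raw}}_{r,t} \le 1
```

The bound follows because the three weights of 1/3 are non-negative and sum to 1, so the average cannot leave the interval that contains all three normalized inputs.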

Smoothing and Creating Final BSI Flavors

As the seasonally decomposed BSI and raw BSI can both still contain strong fluctuation, a three-month rolling (trailing) mean is calculated. This smoothed flavor of raw BSI is used to calculate the end products:

  1. To capture time variation in one region's market, each region is ranked against its own time series (BSI Cross-time).
  2. To capture how a region's market compares to surrounding areas' markets, a ranking takes place for any given month, comparing a region's raw BSI with those of its peer regions (BSI Cross-region).
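
The three-month trailing mean that feeds both rankings could be sketched like this. The function name is hypothetical, and returning None until a full window of observed values is available is an assumption; the post does not say how incomplete windows are treated.

```python
from typing import List, Optional


def trailing_mean(series: List[Optional[float]], window: int = 3) -> List[Optional[float]]:
    """Mean of the current month and the (window - 1) months before it.

    Assumption: a month gets None (NA) unless the full window is observed.
    """
    smoothed: List[Optional[float]] = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(value is None for value in chunk):
            smoothed.append(None)
        else:
            smoothed.append(sum(chunk) / window)
    return smoothed
```

Applied to a BSI series of 0.2, 0.4, 0.6, 0.8, the first two months have no full window, and the third and fourth months come out at about 0.4 and 0.6.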

This yields two metrics (following the order described above):

  • BSI Cross-time: BSI Cross-time Smoothed and Seasonal Adjusted (SSA)
  • BSI Cross-region: BSI Cross-region Smoothed and Seasonal Adjusted (SSA)

The ranking for (2) is structured as follows:

  • Neighborhoods, ZIPs, Cities are ranked in their respective parent metros
  • Metros, Counties, States are ranked across the U.S.
  • The U.S. is ranked with respect to itself (resulting in a constant value for BSI Cross-region)
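
Both rankings can be expressed with one helper applied to different sets of values. The post does not give the exact ranking formula, so the sketch below makes several assumptions: ranks are converted to a 0-to-10 scale linearly, ties share the average of their ranks, and a single value ranked against itself gets the midpoint 5.0 (the post only says the U.S. value is constant). The function name and the region labels are hypothetical.

```python
from typing import List, Optional


def scaled_rank(values: List[Optional[float]]) -> List[Optional[float]]:
    """Rank each observed value among the others and scale the rank to 0..10."""
    observed = [v for v in values if v is not None]
    n = len(observed)
    scores: List[Optional[float]] = []
    for v in values:
        if v is None:
            scores.append(None)
        elif n == 1:
            scores.append(5.0)  # assumption: a lone value gets the midpoint
        else:
            less = sum(1 for o in observed if o < v)
            equal = sum(1 for o in observed if o == v)
            # Average rank for ties, then map rank 0..n-1 onto 0..10.
            scores.append(10.0 * (less + (equal - 1) / 2) / (n - 1))
    return scores


# Cross-time: rank one region's smoothed BSI values against its own history.
cross_time = scaled_rank([0.2, 0.8, 0.5])

# Cross-region: rank peer regions against each other for a single month.
peers = {"ZIP A": 0.61, "ZIP B": 0.47, "ZIP C": 0.55}  # hypothetical values
cross_region = dict(zip(peers, scaled_rank(list(peers.values()))))
```

In the cross-region case, the peer set would be the regions sharing a parent metro for neighborhoods, ZIP codes and cities, and all regions of the same type nationwide for metros, counties and states.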

Interpretation of the Index

Because the index is relative to its own history, it is recomputed each month for all regions and all time periods.

The final BSI lies on the inclusive interval of 0 to 10. A value of 0 indicates a relatively cold (not tight) market in which buyers have more negotiating power. A value of 10 indicates a relatively hot (tight) market in which sellers have more negotiating power.

It's important to keep in mind that the index forces each observation to lie in the interval 0 through 10, even in markets which are relatively stable over time. Because of the forced boundaries, those markets may appear more volatile than they actually are.

 

The post What is Zillow's Buyer-Seller Index, and How is it Computed? appeared first on Zillow Research.


