pyblp.build_differentiation_instruments(formulation, product_data, version='local', interact=False)

Construct excluded differentiation instruments.

Differentiation instruments in the spirit of Gandhi and Houde (2017) are

(1)\[Z^\text{Diff}(X) = [Z^\text{Diff,Other}(X), Z^\text{Diff,Rival}(X)],\]

in which \(X\) is a matrix of product characteristics, \(Z^\text{Diff,Other}(X)\) is a second matrix that consists of sums over functions of differences between non-rival goods (other goods produced by the same firm), and \(Z^\text{Diff,Rival}(X)\) is a third matrix that consists of the same sums taken over rival goods (goods produced by other firms). Without optional interaction terms, all three matrices have the same dimensions.


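To construct simpler, firm-agnostic instruments that are sums over functions of differences between each good and all other goods in its market, specify a constant column of firm IDs and keep only the first half of the instrument columns. With constant firm IDs no good has any rivals, so the rival half of the columns is identically zero.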
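As a rough illustration of why the first half of the columns suffices, the NumPy sketch below works through a single market and a single characteristic, using squared differences as the function of differences. It is not pyblp's implementation: the helper name `quadratic_diff` and the toy data are hypothetical.

```python
import numpy as np

def quadratic_diff(x, firm_ids):
    """Hypothetical single-market helper: [Other, Rival] columns for one characteristic."""
    x = np.asarray(x, dtype=float)
    firm_ids = np.asarray(firm_ids)
    d2 = (x[None, :] - x[:, None]) ** 2            # d2[j, k] = (x_k - x_j)^2, zero on the diagonal
    same_firm = firm_ids[:, None] == firm_ids[None, :]
    other = np.where(same_firm, d2, 0.0).sum(axis=1)
    rival = np.where(same_firm, 0.0, d2).sum(axis=1)
    return np.column_stack([other, rival])

x = [1.0, 2.0, 4.0]
Z = quadratic_diff(x, firm_ids=np.zeros(3, dtype=int))  # constant firm IDs
Z_agnostic = Z[:, : Z.shape[1] // 2]                    # keep only the first half
```

Here the second column of `Z` is all zeros, and `Z_agnostic` holds the sums over all other goods in the market.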

Let \(x_{jt\ell}\) be characteristic \(\ell\) in \(X\) for product \(j\) in market \(t\), which is produced by firm \(f\). That is, \(j \in J_{ft}\), where \(J_{ft}\) is the set of products that firm \(f\) offers in market \(t\). Then in the “local” version of \(Z^\text{Diff}(X)\),

(2)\[\begin{split}Z_{jt\ell}^\text{Local,Other}(X) = \sum_{k \in J_{ft} \setminus \{j\}} 1(|d_{jkt\ell}| < \text{SD}_\ell), \\ Z_{jt\ell}^\text{Local,Rival}(X) = \sum_{k \notin J_{ft}} 1(|d_{jkt\ell}| < \text{SD}_\ell),\end{split}\]

where \(d_{jkt\ell} = x_{kt\ell} - x_{jt\ell}\) is the difference between products \(j\) and \(k\) in terms of characteristic \(\ell\), \(\text{SD}_\ell\) is the standard deviation of these pairwise differences computed across all markets, and \(1(|d_{jkt\ell}| < \text{SD}_\ell)\) indicates that products \(j\) and \(k\) are close to each other in terms of characteristic \(\ell\).
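Equation (2) can be sketched in NumPy for a single characteristic as follows. This is an illustration under stated assumptions, not pyblp's code: the function name `local_instruments` and the toy data are hypothetical, and \(\text{SD}_\ell\) is assumed to be the population standard deviation of within-market pairwise differences, pooled over all markets and excluding each product's difference with itself.

```python
import numpy as np

def local_instruments(x, market_ids, firm_ids):
    """Hypothetical sketch of equation (2) for one characteristic x (1-D)."""
    x = np.asarray(x, dtype=float)
    market_ids = np.asarray(market_ids)
    firm_ids = np.asarray(firm_ids)

    # Pool the off-diagonal pairwise differences from every market to get SD.
    pooled = []
    for t in np.unique(market_ids):
        xt = x[market_ids == t]
        d = xt[None, :] - xt[:, None]                  # d[j, k] = x_k - x_j
        pooled.append(d[~np.eye(xt.size, dtype=bool)])
    sd = np.std(np.concatenate(pooled))                # population SD (assumption)

    other = np.zeros(x.size)
    rival = np.zeros(x.size)
    for t in np.unique(market_ids):
        idx = np.flatnonzero(market_ids == t)
        xt, ft = x[idx], firm_ids[idx]
        close = np.abs(xt[None, :] - xt[:, None]) < sd  # indicator 1(|d| < SD)
        same_firm = ft[:, None] == ft[None, :]
        not_self = ~np.eye(idx.size, dtype=bool)
        other[idx] = np.sum(close & same_firm & not_self, axis=1)
        rival[idx] = np.sum(close & ~same_firm, axis=1)
    return other, rival, sd

# Two markets: three products in market 0, two in market 1.
x = [1.0, 2.0, 4.0, 0.0, 3.0]
market_ids = [0, 0, 0, 1, 1]
firm_ids = ["A", "A", "B", "A", "B"]
other, rival, sd = local_instruments(x, market_ids, firm_ids)
```

In this toy data the pooled standard deviation is \(\sqrt{5.75} \approx 2.4\), so the two products in market 1, which differ by 3, are not "close" and both of their instruments are zero.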

The intuition behind this “local” version is that demand for products is often most influenced by a small number of other goods that are very similar. For the “quadratic” version of \(Z^\text{Diff}(X)\), which uses a more continuous measure of the distance between goods,

(3)\[\begin{split}Z_{jt\ell}^\text{Quad,Other}(X) = \sum_{k \in J_{ft} \setminus\{j\}} d_{jkt\ell}^2, \\ Z_{jt\ell}^\text{Quad,Rival}(X) = \sum_{k \notin J_{ft}} d_{jkt\ell}^2.\end{split}\]
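A minimal NumPy sketch of equation (3) for a single characteristic is given below. It is an illustration only, not pyblp's implementation; the function name `quadratic_instruments` and the toy data are hypothetical.

```python
import numpy as np

def quadratic_instruments(x, market_ids, firm_ids):
    """Hypothetical sketch of equation (3) for one characteristic x (1-D)."""
    x = np.asarray(x, dtype=float)
    market_ids = np.asarray(market_ids)
    firm_ids = np.asarray(firm_ids)
    other = np.zeros(x.size)
    rival = np.zeros(x.size)
    for t in np.unique(market_ids):
        idx = np.flatnonzero(market_ids == t)
        xt, ft = x[idx], firm_ids[idx]
        d2 = (xt[None, :] - xt[:, None]) ** 2          # squared differences within market t
        same_firm = ft[:, None] == ft[None, :]
        # The diagonal of d2 is zero, so product j adds nothing to its own sum.
        other[idx] = np.where(same_firm, d2, 0.0).sum(axis=1)
        rival[idx] = np.where(same_firm, 0.0, d2).sum(axis=1)
    return other, rival

x = [1.0, 2.0, 4.0, 0.0, 3.0]
market_ids = [0, 0, 0, 1, 1]
firm_ids = ["A", "A", "B", "A", "B"]
other, rival = quadratic_instruments(x, market_ids, firm_ids)
```

Unlike the "local" version, every product in the market contributes to these sums, with distant products contributing more.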

With interaction terms, which reflect covariances between different characteristics, the summands for the “local” versions are \(1(|d_{jkt\ell}| < \text{SD}_\ell) \times d_{jkt\ell'}\) for all characteristics \(\ell'\), and the summands for the “quadratic” versions are \(d_{jkt\ell} \times d_{jkt\ell'}\) for all \(\ell' \geq \ell\).
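The "quadratic" interaction summands can be sketched for a single market as follows. This is a hypothetical illustration rather than pyblp's code: the function name `quadratic_interactions`, the toy data, and the column ordering (pairs \((\ell, \ell')\) with \(\ell' \geq \ell\), in row-major order) are all assumptions.

```python
import numpy as np

def quadratic_interactions(X, firm_ids):
    """Hypothetical single-market sketch of sums of d_l * d_l' for all l' >= l."""
    X = np.asarray(X, dtype=float)
    firm_ids = np.asarray(firm_ids)
    K = X.shape[1]
    D = X[None, :, :] - X[:, None, :]                  # D[j, k, l] = x_kl - x_jl
    same_firm = firm_ids[:, None] == firm_ids[None, :]
    other_cols, rival_cols = [], []
    for l in range(K):
        for l2 in range(l, K):                         # only pairs with l2 >= l
            prod = D[:, :, l] * D[:, :, l2]            # zero when k == j
            other_cols.append(np.where(same_firm, prod, 0.0).sum(axis=1))
            rival_cols.append(np.where(same_firm, 0.0, prod).sum(axis=1))
    return np.column_stack(other_cols), np.column_stack(rival_cols)

X = [[1.0, 0.0],
     [2.0, 5.0],
     [4.0, 1.0]]
firm_ids = ["A", "A", "B"]
other, rival = quadratic_interactions(X, firm_ids)
```

With \(K\) characteristics, this produces \(K(K+1)/2\) columns in each half; the columns with \(\ell' = \ell\) reproduce the squared-difference sums in (3), and the cross terms can be negative.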


Usually, any supply or demand shifters are added to these excluded instruments, depending on whether they are meant to be used for demand- or supply-side estimation.

Parameters

  • formulation (Formulation) – Formulation configuration for \(X\), the matrix of product characteristics used to build excluded instruments. Variable names should correspond to fields in product_data.

  • product_data (structured array-like) –

    Each row corresponds to a product. Markets can have differing numbers of products. The following fields are required:

    • market_ids : (object) - IDs that associate products with markets.

    • firm_ids : (object) - IDs that associate products with firms.

    Along with market_ids and firm_ids, the names of any additional fields can be used as variables in formulation.

  • version (str, optional) –

    The version of differentiation instruments to construct:

    • 'local' (default) - Construct the instruments in (2) that consider only the characteristics of “close” products in each market.

    • 'quadratic' - Construct the more continuous instruments in (3) that consider all products in each market.

  • interact (bool, optional) – Whether to include interaction terms between different product characteristics, which can help capture covariances between product characteristics.


Returns

Excluded differentiation instruments, \(Z^\text{Diff}(X)\).

Return type