Building Differentiation Instruments Example
[1]:
import pyblp
import numpy as np
import pandas as pd
np.set_printoptions(precision=3)
pyblp.__version__
[1]:
'1.1.0'
In this example, we’ll load the automobile product data from Berry, Levinsohn, and Pakes (1995), build some very simple excluded demand-side instruments for the problem in the spirit of Gandhi and Houde (2017), and demonstrate how to update the problem data to use these instruments instead of the default ones.
[2]:
product_data = pd.read_csv(pyblp.data.BLP_PRODUCTS_LOCATION)
product_data.head()
[2]:
| | market_ids | clustering_ids | car_ids | firm_ids | region | shares | prices | hpwt | air | mpd | ... | supply_instruments2 | supply_instruments3 | supply_instruments4 | supply_instruments5 | supply_instruments6 | supply_instruments7 | supply_instruments8 | supply_instruments9 | supply_instruments10 | supply_instruments11 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1971 | AMGREM71 | 129 | 15 | US | 0.001051 | 4.935802 | 0.528997 | 0 | 1.888146 | ... | 0.0 | 1.705933 | 1.595656 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.888146 |
1 | 1971 | AMHORN71 | 130 | 15 | US | 0.000670 | 5.516049 | 0.494324 | 0 | 1.935989 | ... | 0.0 | 1.680910 | 1.490295 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.935989 |
2 | 1971 | AMJAVL71 | 132 | 15 | US | 0.000341 | 7.108642 | 0.467613 | 0 | 1.716799 | ... | 0.0 | 1.801067 | 1.357703 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.716799 |
3 | 1971 | AMMATA71 | 134 | 15 | US | 0.000522 | 6.839506 | 0.426540 | 0 | 1.687871 | ... | 0.0 | 1.818061 | 1.261347 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.687871 |
4 | 1971 | AMAMBS71 | 136 | 15 | US | 0.000442 | 8.928395 | 0.452489 | 0 | 1.504286 | ... | 0.0 | 1.933210 | 1.237365 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.504286 |
5 rows × 33 columns
We’ll first build “local” differentiation instruments, which are the default version and which consist of counts of “close” rival and non-rival (same-firm) products in each market. Note that the formulation excludes the constant (the leading 0) because including it would yield collinear, constant columns of differentiation instruments.
[3]:
formulation = pyblp.Formulation('0 + hpwt + air + mpd')
local_instruments = pyblp.build_differentiation_instruments(
    formulation,
    product_data
)
local_instruments
[3]:
array([[ 4., 4., 4., 42., 87., 83.],
[ 4., 4., 4., 53., 87., 84.],
[ 4., 4., 4., 51., 87., 78.],
...,
[ 0., 0., 0., 86., 70., 62.],
[ 1., 1., 1., 3., 58., 91.],
[ 1., 1., 1., 13., 58., 72.]])
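To make the counting idea concrete, here is a minimal NumPy sketch for a single characteristic in a single market. It is an illustration only, not pyblp's implementation: the helper name local_counts, the toy data, and the fixed threshold of 0.1 are all assumptions made for this example, and pyblp applies its own closeness rule, which is described in its API documentation.

```python
import numpy as np

def local_counts(x, firm_ids, threshold):
    """Count "close" same-firm and rival products for each product.

    Hypothetical helper for illustration; `threshold` is an arbitrary
    cutoff chosen by the caller, not the rule pyblp uses internally.
    """
    x = np.asarray(x, dtype=float)
    firm_ids = np.asarray(firm_ids)
    # Pairwise absolute differences |x_j - x_k| within the market.
    close = np.abs(x[:, None] - x[None, :]) < threshold
    # A product should not be counted as its own neighbor.
    np.fill_diagonal(close, False)
    same_firm = firm_ids[:, None] == firm_ids[None, :]
    same_counts = (close & same_firm).sum(axis=1)
    rival_counts = (close & ~same_firm).sum(axis=1)
    return same_counts, rival_counts

# Toy market: four products sold by two firms.
x = [0.50, 0.52, 0.90, 0.55]
firms = ['A', 'A', 'B', 'B']
same, rival = local_counts(x, firms, threshold=0.1)
print(same)   # [1 1 0 0]
print(rival)  # [1 1 0 2]
```

In this toy market, the third product is far from everything else, so both of its counts are zero, while the fourth product has two close rivals and no close same-firm products.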
Next, we’ll build a more continuous “quadratic” version of the instruments, which consists of sums of squared characteristic differences between each product and its rival and non-rival products in each market.
[4]:
quadratic_instruments = pyblp.build_differentiation_instruments(
    formulation,
    product_data,
    version='quadratic'
)
quadratic_instruments
[4]:
array([[2.132e-02, 0.000e+00, 2.191e-01, 2.011e+00, 0.000e+00, 1.208e+01],
[8.261e-03, 0.000e+00, 2.983e-01, 2.014e+00, 0.000e+00, 1.198e+01],
[6.397e-03, 0.000e+00, 1.234e-01, 2.159e+00, 0.000e+00, 1.568e+01],
...,
[0.000e+00, 0.000e+00, 0.000e+00, 2.239e+00, 6.000e+01, 1.312e+02],
[1.467e-02, 0.000e+00, 6.317e-02, 1.864e+01, 7.100e+01, 6.185e+01],
[1.467e-02, 0.000e+00, 6.317e-02, 8.961e+00, 7.100e+01, 8.819e+01]])
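The quadratic construction can be sketched in the same spirit. The snippet below is a rough illustration for one characteristic in one market, assuming the instruments are plain sums of squared pairwise differences split by firm ownership; the helper name quadratic_sums and the toy data are hypothetical, and this is not pyblp's own code.

```python
import numpy as np

def quadratic_sums(x, firm_ids):
    """Sum squared differences to same-firm and to rival products.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    firm_ids = np.asarray(firm_ids)
    # Squared pairwise differences (x_j - x_k)^2; the diagonal is zero,
    # so a product contributes nothing to its own sums.
    squared = (x[:, None] - x[None, :]) ** 2
    same_firm = firm_ids[:, None] == firm_ids[None, :]
    same_sums = (squared * same_firm).sum(axis=1)
    rival_sums = (squared * ~same_firm).sum(axis=1)
    return same_sums, rival_sums

# Toy market: three products, two of them sold by firm A.
same_sums, rival_sums = quadratic_sums([1.0, 2.0, 4.0], ['A', 'A', 'B'])
print(same_sums)   # [1. 1. 0.]
print(rival_sums)  # [ 9.  4. 13.]
```

Unlike the counts, these sums change smoothly as characteristics move, which is what makes this version "more continuous".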
We could also use interact=True to include interaction terms in either version of instruments, which would help capture covariances between different product characteristics.
To use these instruments when setting up a Problem, the existing product data has to be updated or new product data has to be constructed. Since the existing product data is a Pandas DataFrame, it does not support matrices, so each column of instruments has to be added individually after deleting the existing instruments.
[5]:
for i in range(8):
    del product_data[f'demand_instruments{i}']
for i, column in enumerate(local_instruments.T):
    product_data[f'demand_instruments{i}'] = column
product_data
[5]:
| | market_ids | clustering_ids | car_ids | firm_ids | region | shares | prices | hpwt | air | mpd | ... | supply_instruments8 | supply_instruments9 | supply_instruments10 | supply_instruments11 | demand_instruments0 | demand_instruments1 | demand_instruments2 | demand_instruments3 | demand_instruments4 | demand_instruments5 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1971 | AMGREM71 | 129 | 15 | US | 0.001051 | 4.935802 | 0.528997 | 0 | 1.888146 | ... | 46.060389 | 29.786989 | 0.0 | 1.888146 | 4.0 | 4.0 | 4.0 | 42.0 | 87.0 | 83.0 |
1 | 1971 | AMHORN71 | 130 | 15 | US | 0.000670 | 5.516049 | 0.494324 | 0 | 1.935989 | ... | 46.060389 | 29.786989 | 0.0 | 1.935989 | 4.0 | 4.0 | 4.0 | 53.0 | 87.0 | 84.0 |
2 | 1971 | AMJAVL71 | 132 | 15 | US | 0.000341 | 7.108642 | 0.467613 | 0 | 1.716799 | ... | 46.060389 | 29.786989 | 0.0 | 1.716799 | 4.0 | 4.0 | 4.0 | 51.0 | 87.0 | 78.0 |
3 | 1971 | AMMATA71 | 134 | 15 | US | 0.000522 | 6.839506 | 0.426540 | 0 | 1.687871 | ... | 46.060389 | 29.786989 | 0.0 | 1.687871 | 4.0 | 4.0 | 4.0 | 52.0 | 87.0 | 77.0 |
4 | 1971 | AMAMBS71 | 136 | 15 | US | 0.000442 | 8.928395 | 0.452489 | 0 | 1.504286 | ... | 46.060389 | 29.786989 | 0.0 | 1.504286 | 4.0 | 4.0 | 4.0 | 52.0 | 87.0 | 69.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2212 | 1990 | VV74085 | 5584 | 6 | EU | 0.000488 | 16.140015 | 0.385917 | 1 | 2.639135 | ... | 97.039220 | 27.861181 | 38.0 | 2.639135 | 2.0 | 2.0 | 2.0 | 102.0 | 57.0 | 109.0 |
2213 | 1990 | VV760G87 | 5585 | 6 | EU | 0.000091 | 25.986993 | 0.435967 | 1 | 2.136442 | ... | 97.039220 | 27.861181 | 38.0 | 2.136442 | 2.0 | 2.0 | 2.0 | 112.0 | 57.0 | 86.0 |
2214 | 1990 | YGGVPL90 | 5589 | 23 | EU | 0.000067 | 3.393267 | 0.358289 | 0 | 3.518846 | ... | 98.024103 | 28.809765 | 0.0 | 3.518846 | 0.0 | 0.0 | 0.0 | 86.0 | 70.0 | 62.0 |
2215 | 1990 | PS911C90 | 5590 | 12 | EU | 0.000039 | 44.758990 | 0.814913 | 1 | 3.016154 | ... | 97.222743 | 28.407171 | 19.0 | 3.016154 | 1.0 | 1.0 | 1.0 | 3.0 | 58.0 | 91.0 |
2216 | 1990 | PS94490 | 5592 | 12 | EU | 0.000025 | 32.058148 | 0.693796 | 1 | 3.267500 | ... | 97.222743 | 28.407171 | 19.0 | 3.267500 | 1.0 | 1.0 | 1.0 | 13.0 | 58.0 | 72.0 |
2217 rows × 31 columns
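The loop above hard-codes the number of existing instrument columns (eight). As a possible alternative, the sketch below drops every column whose name starts with a given prefix and then adds the new columns, so it does not depend on knowing that count in advance. The helper name replace_instrument_columns and the toy frame are assumptions for illustration; only standard pandas and NumPy calls are used.

```python
import numpy as np
import pandas as pd

def replace_instrument_columns(df, instruments, prefix='demand_instruments'):
    """Return a copy of `df` with all `prefix` columns replaced.

    Hypothetical helper: old columns named prefix0, prefix1, ... are
    dropped, and one new column is added per column of `instruments`.
    """
    old_columns = [c for c in df.columns if c.startswith(prefix)]
    updated = df.drop(columns=old_columns)
    for i, column in enumerate(np.asarray(instruments).T):
        updated[f'{prefix}{i}'] = column
    return updated

# Toy frame with three old instrument columns and two new ones.
toy = pd.DataFrame({
    'shares': [0.1, 0.2],
    'demand_instruments0': [1.0, 2.0],
    'demand_instruments1': [3.0, 4.0],
    'demand_instruments2': [5.0, 6.0],
})
new_instruments = np.array([[7.0, 8.0], [9.0, 10.0]])
updated = replace_instrument_columns(toy, new_instruments)
print(list(updated.columns))
# ['shares', 'demand_instruments0', 'demand_instruments1']
```

Because the helper returns a new DataFrame, the original product data is left untouched, which can be convenient when comparing several instrument sets.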
Any data type that has fields can be used as product data. An alternative way to specify product_data for Problem initialization is to simply use a dict, where fields can be matrices. For example, we could use the following dict, which includes both the new demand instruments as well as a few other variables that might be used when setting up the problem.
[6]:
product_data_dict = {k: product_data[k] for k in ['market_ids', 'firm_ids', 'shares', 'prices', 'hpwt', 'air', 'mpd']}
product_data_dict['demand_instruments'] = local_instruments