
Building “Sums of Characteristics” BLP Instruments Example

[1]:
import pyblp
import numpy as np
import pandas as pd

np.set_printoptions(precision=3)
pyblp.__version__
[1]:
'1.1.0'

In this example, we’ll load the automobile product data from Berry, Levinsohn, and Pakes (1995) and show how to construct the included instruments from scratch.

[2]:
product_data = pd.read_csv(pyblp.data.BLP_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids clustering_ids car_ids firm_ids region shares prices hpwt air mpd ... supply_instruments2 supply_instruments3 supply_instruments4 supply_instruments5 supply_instruments6 supply_instruments7 supply_instruments8 supply_instruments9 supply_instruments10 supply_instruments11
0 1971 AMGREM71 129 15 US 0.001051 4.935802 0.528997 0 1.888146 ... 0.0 1.705933 1.595656 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.888146
1 1971 AMHORN71 130 15 US 0.000670 5.516049 0.494324 0 1.935989 ... 0.0 1.680910 1.490295 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.935989
2 1971 AMJAVL71 132 15 US 0.000341 7.108642 0.467613 0 1.716799 ... 0.0 1.801067 1.357703 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.716799
3 1971 AMMATA71 134 15 US 0.000522 6.839506 0.426540 0 1.687871 ... 0.0 1.818061 1.261347 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.687871
4 1971 AMAMBS71 136 15 US 0.000442 8.928395 0.452489 0 1.504286 ... 0.0 1.933210 1.237365 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.504286

5 rows × 33 columns

[3]:
product_data[[f'demand_instruments{i}' for i in range(8)]]
[3]:
demand_instruments0 demand_instruments1 demand_instruments2 demand_instruments3 demand_instruments4 demand_instruments5 demand_instruments6 demand_instruments7
0 4.0 1.840967 0.0 6.844945 87.0 44.555539 0.0 167.325082
1 4.0 1.875639 0.0 6.797102 87.0 44.555539 0.0 167.325082
2 4.0 1.902350 0.0 7.016291 87.0 44.555539 0.0 167.325082
3 4.0 1.943423 0.0 7.045220 87.0 44.555539 0.0 167.325082
4 4.0 1.917475 0.0 7.228805 87.0 44.555539 0.0 167.325082
... ... ... ... ... ... ... ... ...
2212 2.0 0.826512 2.0 4.775577 128.0 57.660253 57.0 351.758942
2213 2.0 0.776462 2.0 5.278269 128.0 57.660253 57.0 351.758942
2214 0.0 0.000000 0.0 0.000000 130.0 58.514393 60.0 355.654808
2215 1.0 0.693796 1.0 3.267500 129.0 57.363973 58.0 352.890000
2216 1.0 0.814913 1.0 3.016154 129.0 57.363973 58.0 352.890000

2217 rows × 8 columns

[4]:
product_data[[f'supply_instruments{i}' for i in range(12)]]
[4]:
supply_instruments0 supply_instruments1 supply_instruments2 supply_instruments3 supply_instruments4 supply_instruments5 supply_instruments6 supply_instruments7 supply_instruments8 supply_instruments9 supply_instruments10 supply_instruments11
0 4.0 -3.109718 0.0 1.705933 1.595656 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.888146
1 4.0 -3.041927 0.0 1.680910 1.490295 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.935989
2 4.0 -2.986377 0.0 1.801067 1.357703 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.716799
3 4.0 -2.894442 0.0 1.818061 1.261347 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.687871
4 4.0 -2.953498 0.0 1.933210 1.237365 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.504286
... ... ... ... ... ... ... ... ... ... ... ... ...
2212 2.0 -1.770401 2.0 1.272566 0.511989 128.0 -104.631050 57.0 97.039220 27.861181 38.0 2.639135
2213 2.0 -1.892345 2.0 1.483875 0.511989 128.0 -104.631050 57.0 97.039220 27.861181 38.0 2.136442
2214 0.0 0.000000 0.0 0.000000 0.000000 130.0 -106.327167 60.0 98.024103 28.809765 0.0 3.518846
2215 1.0 -0.365578 1.0 0.955511 0.142876 129.0 -106.783331 58.0 97.222743 28.407171 19.0 3.016154
2216 1.0 -0.204674 1.0 0.875469 0.089795 129.0 -106.783331 58.0 97.222743 28.407171 19.0 3.267500

2217 rows × 12 columns

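The demand-side “sums of characteristics” BLP instruments included in product_data can be built from scratch with the build_blp_instruments function. A formulation with four terms (a constant, hpwt, air, and mpd) yields eight columns, which correspond to demand_instruments0 through demand_instruments7.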

[5]:
demand_instruments = pyblp.build_blp_instruments(pyblp.Formulation('1 + hpwt + air + mpd'), product_data)
demand_instruments
[5]:
array([[  4.   ,   1.841,   0.   , ...,  44.556,   0.   , 167.325],
       [  4.   ,   1.876,   0.   , ...,  44.556,   0.   , 167.325],
       [  4.   ,   1.902,   0.   , ...,  44.556,   0.   , 167.325],
       ...,
       [  0.   ,   0.   ,   0.   , ...,  58.514,  60.   , 355.655],
       [  1.   ,   0.694,   1.   , ...,  57.364,  58.   , 352.89 ],
       [  1.   ,   0.815,   1.   , ...,  57.364,  58.   , 352.89 ]])
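To make the construction concrete, here is a minimal sketch of the “sums of characteristics” arithmetic on a small hypothetical dataset, using only pandas and NumPy. It is not the library's implementation. The toy values, the helper name sums_of_characteristics, and the column order (same-firm sums first, then rival sums, which is what the output above suggests) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one market, two firms, one characteristic x.
toy = pd.DataFrame({
    'market_ids': [1, 1, 1, 1, 1],
    'firm_ids': ['A', 'A', 'A', 'B', 'B'],
    'x': [1.0, 2.0, 3.0, 10.0, 20.0],
})
toy['const'] = 1.0  # stands in for the '1' term of a formulation


def sums_of_characteristics(data, columns):
    """Illustrative helper (not part of pyblp): same-firm and rival sums."""
    X = data[columns]
    # Sum of each characteristic over all products of the same firm and market.
    firm_sums = data.groupby(['market_ids', 'firm_ids'])[columns].transform('sum')
    # Sum of each characteristic over all products in the same market.
    market_sums = data.groupby('market_ids')[columns].transform('sum')
    other = firm_sums - X            # same firm, excluding the product itself
    rival = market_sums - firm_sums  # products sold by competing firms
    return np.c_[other.to_numpy(), rival.to_numpy()]


toy_instruments = sums_of_characteristics(toy, ['const', 'x'])
print(toy_instruments)
```

For the constant term, the two sums reduce to counts: the number of other products sold by the same firm and the number of products sold by rivals. This is consistent with the values 4.0 and 87.0 in the first rows of demand_instruments0 and demand_instruments4.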

The supply-side instruments from the original paper are also “sums of characteristics” BLP instruments, supplemented with mpd as a standalone shifter. Because of collinearity issues, the “rival” instrument constructed from the trend variable is excluded and only the “own” instrument is retained, which is why the code below keeps only the first column ([:, 0]) of the trend instruments.

[6]:
supply_instruments = np.c_[
    pyblp.build_blp_instruments(pyblp.Formulation('1 + log(hpwt) + air + log(mpg) + log(space)'), product_data),
    pyblp.build_blp_instruments(pyblp.Formulation('0 + trend'), product_data)[:, 0],
    product_data['mpd'],
]
supply_instruments
[6]:
array([[ 4.   , -3.11 ,  0.   , ..., 29.787,  0.   ,  1.888],
       [ 4.   , -3.042,  0.   , ..., 29.787,  0.   ,  1.936],
       [ 4.   , -2.986,  0.   , ..., 29.787,  0.   ,  1.717],
       ...,
       [ 0.   ,  0.   ,  0.   , ..., 28.81 ,  0.   ,  3.519],
       [ 1.   , -0.366,  1.   , ..., 28.407, 19.   ,  3.016],
       [ 1.   , -0.205,  1.   , ..., 28.407, 19.   ,  3.268]])
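The trend-based column can be read directly from the output above: rows with two other same-firm products show 38.0 and rows with one show 19.0, which is consistent with a trend of 19 in that market. The sketch below reproduces this arithmetic on hypothetical data, assuming trend is constant within a market, and then stacks the result next to an mpd column with np.c_ in the same way as the cell above. The toy values and variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy market: trend takes the same value for every product.
toy = pd.DataFrame({
    'market_ids': [1990, 1990, 1990, 1990],
    'firm_ids': ['A', 'A', 'A', 'B'],
    'trend': [19.0, 19.0, 19.0, 19.0],
    'mpd': [2.6, 2.1, 3.5, 3.0],
})

groups = toy.groupby(['market_ids', 'firm_ids'])['trend']

# "Own" instrument: sum of trend over the firm's other products in the market.
own_trend = groups.transform('sum') - toy['trend']

# Number of other products sold by the same firm in the same market.
other_count = groups.transform('count') - 1

# With a market-constant trend, the own instrument is trend times that count.
print(np.allclose(own_trend, toy['trend'] * other_count))

# np.c_ turns 1-D inputs into columns and stacks them side by side.
shifters = np.c_[own_trend, toy['mpd']]
print(shifters)
```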