Download the Jupyter Notebook for this section: petrin.ipynb

Micro Moments Tutorial with Automobile Data

[1]:
from IPython.display import display, HTML
display(HTML("<style>pre { white-space: pre !important; }</style>"))

import pyblp
import numpy as np
import pandas as pd

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

In this tutorial, we’ll use data from Petrin (2002) to solve the paper’s automobile problem. This tutorial is similar to the first automobile tutorial, but demonstrates how to incorporate micro moments into estimation.

Loading Data

We’ll use pandas to load two sets of data:

  1. product_data, which contains prices, shares, and other product characteristics.

  2. agent_data, which contains draws from the distribution of heterogeneity.

[2]:
product_data = pd.read_csv(pyblp.data.PETRIN_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids clustering_ids firm_ids region jp eu q households shares prices ... supply_instruments6 supply_instruments7 supply_instruments8 supply_instruments9 supply_instruments10 supply_instruments11 supply_instruments12 supply_instruments13 supply_instruments14 supply_instruments15
0 1981 1 7 EU 0 1 18.647 83527 0.000223 10.379538 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
1 1981 2 7 EU 0 1 17.611 83527 0.000211 13.140814 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
2 1981 2 7 EU 0 1 6.139 83527 0.000073 19.746975 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
3 1981 3 7 EU 0 1 2.553 83527 0.000031 13.085809 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
4 1981 4 15 US 0 0 43.198 83527 0.000517 6.660066 ... 0.0 0.0 0.0 149.0 -157.647246 114.055507 30.0 35.0 42.0 467.806186

5 rows × 62 columns

The product_data contains market IDs, product IDs, firm IDs, shares, prices, a number of product characteristics, and instruments. The product IDs are called clustering_ids because they will be used to compute clustered standard errors. For more information about the instruments and the example data as a whole, refer to the data module.

The agent_data contains market IDs, integration weights \(w_{it}\), integration nodes \(\nu_{it}\), and demographics \(d_{it}\). Here we use \(I_t = 1000\) scrambled Halton draws in each market, along with demographics resampled from the Consumer Expenditure Survey (CEX) used by the original paper. These draws differ slightly from those used in the original paper (pseudo-Monte Carlo draws with importance sampling). Note that following the original paper, the integration nodes are actually draws from a truncated \(\chi^2(3)\) distribution, rather than the more typical \(N(0, 1)\) draws that we have seen in prior tutorials.
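
To make this concrete, here is a minimal sketch of how such draws could be constructed with scipy. The truncation cutoff and the seed are hypothetical choices for illustration; they are not the values used to build the actual agent_data.

from scipy.stats import chi2, qmc

# Scrambled Halton draws on [0, 1), one dimension per random coefficient.
sampler = qmc.Halton(d=6, scramble=True, seed=0)
uniform_draws = sampler.random(1000)

# Map uniforms through the inverse CDF of a chi2(3) distribution truncated
# above at a hypothetical cutoff: for truncation at c, draw = F^(-1)(u * F(c)).
cutoff = 15.0
nodes = chi2(3).ppf(uniform_draws * chi2(3).cdf(cutoff))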

[3]:
agent_data = pd.read_csv(pyblp.data.PETRIN_AGENTS_LOCATION)
agent_data.head()
[3]:
market_ids weights nodes0 nodes1 nodes2 nodes3 nodes4 nodes5 fv income low mid high fs age
0 1981 0.001 2.533314 7.496742 2.649343 3.892549 0.833761 1.928344 0.749785 10.346577 1 0 0 4 1
1 1981 0.001 4.422582 0.858539 1.646447 2.973352 0.033288 1.683242 5.232336 13.944210 0 1 0 2 1
2 1981 0.001 1.341509 5.041918 4.118932 2.166338 1.314582 0.360087 1.860212 5.898788 1 0 0 4 0
3 1981 0.001 3.324113 2.354892 0.802351 0.261043 3.911970 1.027856 6.980909 8.125445 1 0 0 2 0
4 1981 0.001 1.895857 1.807990 1.827797 4.080565 1.709768 0.707514 2.450663 34.397295 0 0 1 2 1

Setting up the Problem

The problem configuration is based on that of the first automobile problem. It is very similar, again with both demand and supply sides, but with a few more product characteristics.

Again, we stack the three product formulations in order: \(X_1\), \(X_2\), and \(X_3\).

[4]:
product_formulations = (
    pyblp.Formulation('1 + hpwt + space + air + mpd + fwd + mi + sw + su + pv + pgnp + trend + trend2'),
    pyblp.Formulation('1 + I(-prices) + hpwt + space + air + mpd + fwd + mi + sw + su + pv'),
    pyblp.Formulation('1 + log(hpwt) + log(wt) + log(mpg) + air + fwd + trend * (jp + eu) + log(q)'),
)
product_formulations
[4]:
(1 + hpwt + space + air + mpd + fwd + mi + sw + su + pv + pgnp + trend + trend2,
 1 + I(-prices) + hpwt + space + air + mpd + fwd + mi + sw + su + pv,
 1 + log(hpwt) + log(wt) + log(mpg) + air + fwd + trend + jp + eu + trend:jp + trend:eu + log(q))

Again, we’ll use a first-order linear approximation to \(\log(y_i - p_j)\), in which \(y_i\) is income and \(p_j\) are prices. Unlike the previous automobile problem, however, we’ll allow its coefficient to vary for low-, mid-, and high-income consumers.
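
Concretely, expanding around \(p_j = 0\) gives

\[\log(y_i - p_j) \approx \log(y_i) - \frac{p_j}{y_i},\]

so the price term enters as \(-p_j\) interacted with \(1 / y_i\). Allowing the coefficient to differ by income group amounts to interacting \(-p_j\) with \(\textit{low}_i / y_i\), \(\textit{mid}_i / y_i\), and \(\textit{high}_i / y_i\), which is what the agent formulation below does.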

As in the original paper, we’ll also include \(\log(\textit{fs}_i) \times \textit{fv}_i\) where \(\textit{fs}_i\) is family size and \(\textit{fv}_i\) is another truncated \(\chi^2(3)\) draw. Finally, to help with constructing micro moments below, we’ll also include various additional demographics in the agent formulation.

[5]:
agent_formulation = pyblp.Formulation('1 + I(low / income) + I(mid / income) + I(high / income) + I(log(fs) * fv) + age + fs + mid + high')
agent_formulation
[5]:
1 + I(low / income) + I(mid / income) + I(high / income) + I(log(fs) * fv) + age + fs + mid + high

The Problem can again be constructed by combining the product_formulations, product_data, agent_formulation, and agent_data. We’ll again choose a log-linear specification for marginal costs \(c_{jt}\).
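
With costs_type='log', the supply side fits marginal costs as

\[\log c_{jt} = x_{jt} \gamma + \omega_{jt},\]

where \(x_{jt}\) collects the \(X_3\) characteristics from the third product formulation and \(\omega_{jt}\) is the unobserved cost shock.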

[6]:
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data, costs_type='log')
problem
[6]:
Dimensions:
========================================================
 T    N     F     I     K1    K2    K3    D    MD    MS
---  ----  ---  -----  ----  ----  ----  ---  ----  ----
13   2407  27   13000   13    11    12    9    35    28
========================================================

Formulations:
==============================================================================================================================================
       Column Indices:          0       1           2            3           4        5     6     7    8       9         10       11      12
-----------------------------  ---  ----------  ----------  -----------  ----------  ---  -----  ---  ----  --------  --------  ------  ------
 X1: Linear Characteristics     1      hpwt       space         air         mpd      fwd   mi    sw    su      pv       pgnp    trend   trend2
X2: Nonlinear Characteristics   1    -prices       hpwt        space        air      mpd   fwd   mi    sw      su        pv
X3: Log Cost Characteristics    1   log(hpwt)    log(wt)     log(mpg)       air      fwd  trend  jp    eu   jp*trend  eu*trend  log(q)
       d: Demographics          1   low/income  mid/income  high/income  fv*log(fs)  age   fs    mid  high
==============================================================================================================================================

The problem outputs a table of dimensions:

  • \(T\) denotes the number of markets.

  • \(N\) is the length of the dataset (the number of products across all markets).

  • \(F\) denotes the number of firms.

  • \(I = \sum_t I_t\) is the total number of agents across all markets (1000 draws per market times 13 markets).

  • \(K_1\) is the number of linear demand characteristics.

  • \(K_2\) is the number of nonlinear demand characteristics.

  • \(K_3\) is the number of linear supply characteristics.

  • \(D\) is the number of demographic variables.

  • \(M_D\) is the number of demand instruments, including exogenous regressors.

  • \(M_S\) is the number of supply instruments, including exogenous regressors.

The formulations table describes all four formulas for demand-side linear characteristics, demand-side nonlinear characteristics, supply-side characteristics, and demographics.

Setting up Micro Moments

Next, we will configure the micro moments that we will be adding to the problem. For background and notation involving micro moments, see Micro Moments.

Specifically, we will be adding micro moments that match key statistics computed from the CEX survey of potential automobile consumers. For a tutorial on how to compute optimal micro moments that use all the information in a full micro dataset linking individual choices to demographics, see the post-estimation tutorial.

To start, we will have to define a MicroDataset configuration that contains metadata about the micro dataset/survey. These metadata include a unique name for the dataset indexed by \(d\), the number of observations \(N_d\), a function that defines survey weights \(w_{dijt}\), and if relevant, a subset of markets from which the micro data was sampled.

[7]:
micro_dataset = pyblp.MicroDataset(
    name="CEX",
    observations=29125,
    compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
)
micro_dataset
[7]:
CEX: 29125 Observations in All Markets

We called the dataset “CEX”, defined the number of observations in it, and also defined a lambda function for computing survey weights in a market. The compute_weights function has three arguments: the current market’s ID \(t\), the \(J_t\) Products inside the market, and the \(I_t\) Agents inside the market. In this case, we are assuming that each product and agent/consumer type is sampled with equal probability, so we simply return a matrix of ones of shape \(I_t \times (1 + J_t)\). This sets each \(w_{dijt} = 1\).

By using \(1 + J_t\) instead of \(J_t\), we are specifying that the micro dataset contains observations of the outside option \(j = 0\). If we instead specified a matrix of shape \(I_t \times J_t\), this would be the same as setting the first column equal to all zeros, so that outside choices are not sampled from.
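
As a point of comparison, a hypothetical survey that sampled only new vehicle purchases (no outside option) in a single market could be configured like the following sketch, in which the name, observation count, and market are made up for illustration. The optional market_ids argument restricts the markets from which the micro data were sampled.

inside_dataset = pyblp.MicroDataset(
    name="Hypothetical inside-only survey",
    observations=10000,
    compute_weights=lambda t, p, a: np.ones((a.size, p.size)),  # J_t columns: no outside option
    market_ids=[1984],
)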

We will be matching a few different statistics that were computed from this survey. For convenience, they are packaged in a data file with pyblp.

[8]:
micro_statistics = pd.read_csv(pyblp.data.PETRIN_VALUES_LOCATION, index_col=0)
micro_statistics
[8]:
value
E[age | mi] 0.7830
E[fs | mi] 3.8600
E[age | sw] 0.7300
E[fs | sw] 3.1700
E[age | su] 0.7400
E[fs | su] 2.9700
E[age | pv] 0.6520
E[fs | pv] 3.4700
E[new | mid] 0.0794
E[new | high] 0.1581

We will match the average age and family size (“fs”) conditional on purchasing a minivan (“mi”), station wagon (“sw”), sport-utility vehicle (“su”), and full-size passenger van (“pv”). We will also match the probability that a consumer actually purchases a new vehicle, conditional on being mid- or high-income.

Each of these statistics is a conditional expectation, which we can rewrite as a ratio of unconditional expectations over all consumers. Each of these unconditional expectations is called a MicroPart (used to form full micro moments), which we will now configure.
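
For example, the first statistic decomposes as

\[E[\textit{age}_i \mid \textit{mi}_j = 1] = \frac{E[\textit{age}_i \times \textit{mi}_j]}{E[\textit{mi}_j]},\]

in which both the numerator and the denominator are unconditional expectations over all sampled consumers and their choices.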

Each micro part is an average/expectation in the sample/population over micro values \(v_{pijt}\). To match the statistics above, we will need averages/expectations over interactions between age/family size and dummies for purchasing the different automobile types. These will form the numerators in our conditional expectations.

[9]:
age_mi_part = pyblp.MicroPart(
    name="E[age_i * mi_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 7]]),
)
age_sw_part = pyblp.MicroPart(
    name="E[age_i * sw_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 8]]),
)
age_su_part = pyblp.MicroPart(
    name="E[age_i * su_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 9]]),
)
age_pv_part = pyblp.MicroPart(
    name="E[age_i * pv_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 10]]),
)
fs_mi_part = pyblp.MicroPart(
    name="E[fs_i * mi_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 7]]),
)
fs_sw_part = pyblp.MicroPart(
    name="E[fs_i * sw_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 8]]),
)
fs_su_part = pyblp.MicroPart(
    name="E[fs_i * su_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 9]]),
)
fs_pv_part = pyblp.MicroPart(
    name="E[fs_i * pv_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 10]]),
)

We will also need the denominators, which are simple averages/expectations of the dummies for purchasing the different types of automobiles. Since no demographic is involved, the code below interacts each dummy with the constant demographic in column 0.

[10]:
mi_part = pyblp.MicroPart(
    name="E[mi_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 7]]),
)
sw_part = pyblp.MicroPart(
    name="E[sw_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 8]]),
)
su_part = pyblp.MicroPart(
    name="E[su_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 9]]),
)
pv_part = pyblp.MicroPart(
    name="E[pv_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 10]]),
)

To form our probability that a consumer actually purchases a new vehicle, conditional on being mid- or high-income, we will also need the following micro parts. Note that the denominators E[mid_i] and E[high_i] place a 1 rather than a 0 in the outside option’s column, since these expectations average over all consumers regardless of their choice.

[11]:
inside_mid_part = pyblp.MicroPart(
    name="E[1{j > 0} * mid_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 7], np.r_[0, p.X2[:, 0]]),
)
inside_high_part = pyblp.MicroPart(
    name="E[1{j > 0} * high_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 8], np.r_[0, p.X2[:, 0]]),
)
mid_part = pyblp.MicroPart(
    name="E[mid_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 7], np.r_[1, p.X2[:, 0]]),
)
high_part = pyblp.MicroPart(
    name="E[high_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 8], np.r_[1, p.X2[:, 0]]),
)

Finally, we’ll put these micro parts together into MicroMoments. Each micro moment is configured to have a name, a value (one of the statistics above), and micro parts that go into it.

If our micro moments were simple unconditional expectations, we could just pass a single micro part to each micro moment and be done. However, since our micro moments are functions of multiple micro parts, we have to specify this function. We also have to specify its gradient, which is used for computing standard errors and analytic objective gradients.
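
For the ratio \(f(v_1, v_2) = v_1 / v_2\) used here, the gradient is

\[\frac{\partial f}{\partial v_1} = \frac{1}{v_2}, \qquad \frac{\partial f}{\partial v_2} = -\frac{v_1}{v_2^2},\]

which is what the two lambda functions below compute.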

[12]:
compute_ratio = lambda v: v[0] / v[1]
compute_ratio_gradient = lambda v: [1 / v[1], -v[0] / v[1]**2]

Given our functions that define the ratio and its gradient, we can form our micro moments.

[13]:
micro_moments = [
    pyblp.MicroMoment(
        name="E[age_i | mi_j]",
        value=0.783,
        parts=[age_mi_part, mi_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age_i | sw_j]",
        value=0.730,
        parts=[age_sw_part, sw_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age_i | su_j]",
        value=0.740,
        parts=[age_su_part, su_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age_i | pv_j]",
        value=0.652,
        parts=[age_pv_part, pv_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | mi_j]",
        value=3.86,
        parts=[fs_mi_part, mi_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | sw_j]",
        value=3.17,
        parts=[fs_sw_part, sw_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | su_j]",
        value=2.97,
        parts=[fs_su_part, su_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | pv_j]",
        value=3.47,
        parts=[fs_pv_part, pv_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[1{j > 0} | mid_i]",
        value=0.0794,
        parts=[inside_mid_part, mid_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[1{j > 0} | high_i]",
        value=0.1581,
        parts=[inside_high_part, high_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
]

Solving the Problem

Like for the first automobile problem, here we will just use the published estimates for the \(\Sigma\) and \(\Pi\) starting values.

[14]:
initial_sigma = np.diag([3.23, 0, 4.43, 0.46, 0.01, 2.58, 4.42, 0, 0, 0, 0])
initial_pi = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 7.52, 31.13, 34.49, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.57, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.28, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.31, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.42, 0, 0, 0, 0],
])

Finally, as in the original paper, we’ll use W_type and se_type to cluster by product IDs, which were specified as clustering_ids in product_data. We will use a simple BFGS optimization routine and slightly loosen the default tolerance of our inner SQUAREM iteration algorithm from 1e-14 to 1e-13 because the tighter tolerance tended to lead to convergence failures for this problem. We also pass our configured micro_moments when solving the problem.

[15]:
results = problem.solve(
    sigma=initial_sigma,
    pi=initial_pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-4}),
    iteration=pyblp.Iteration('squarem', {'atol': 1e-13}),
    se_type='clustered',
    W_type='clustered',
    micro_moments=micro_moments,
)
results
[15]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 2    +1.8E+02   +4.3E-05     +3.8E-01        +1.3E+03        0         +2.8E+11          +8.1E+07
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:34:10       Yes          72           87          10905        33592
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
================================================================================================================================================================================================================================================
Sigma:       1       -prices      hpwt       space        air         mpd         fwd         mi        sw        su        pv     |    Pi:       1      low/income  mid/income  high/income  fv*log(fs)    age        fs       mid       high
-------  ----------  --------  ----------  ----------  ----------  ----------  ----------  --------  --------  --------  --------  |  -------  --------  ----------  ----------  -----------  ----------  --------  --------  --------  --------
   1      +3.0E-02                                                                                                                 |     1     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
         (+5.3E-01)                                                                                                                |
                                                                                                                                   |
-prices   +0.0E+00   +0.0E+00                                                                                                      |  -prices  +0.0E+00   +3.9E+00    +1.2E+01    +2.4E+01     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                     (+3.6E-01)  (+1.0E+00)  (+2.4E+00)
                                                                                                                                   |
 hpwt     +0.0E+00   +0.0E+00   +1.2E-01                                                                                           |   hpwt    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                               (+8.1E-01)                                                                                          |
                                                                                                                                   |
 space    +0.0E+00   +0.0E+00   +0.0E+00    -9.2E-02                                                                               |   space   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                           (+6.1E-01)                                                                              |
                                                                                                                                   |
  air     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    -1.3E+00                                                                   |    air    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                       (+1.1E+00)                                                                  |
                                                                                                                                   |
  mpd     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    -1.6E-01                                                       |    mpd    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                   (+2.2E-01)                                                      |
                                                                                                                                   |
  fwd     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +1.6E+00                                           |    fwd    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                               (+3.7E-01)                                          |
                                                                                                                                   |
  mi      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00                                |    mi     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +4.2E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+5.2E-02)
                                                                                                                                   |
  sw      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00  +0.0E+00                      |    sw     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +1.7E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+4.2E-02)
                                                                                                                                   |
  su      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00            |    su     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +1.0E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+5.2E-02)
                                                                                                                                   |
  pv      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00  |    pv     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +2.5E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+8.1E-02)
================================================================================================================================================================================================================================================

Beta Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
==========================================================================================================================================================
    1          hpwt       space        air         mpd         fwd          mi          sw          su          pv         pgnp       trend       trend2
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
 -8.9E+00    +8.3E+00    +4.9E+00    +3.8E+00    -1.4E-01    -6.5E+00    -2.1E+00    -1.3E+00    -1.1E+00    -3.3E+00    +3.4E-02    +2.2E-01    -1.5E-02
(+1.4E+00)  (+2.4E+00)  (+1.6E+00)  (+1.2E+00)  (+3.2E-01)  (+1.8E+00)  (+4.8E-01)  (+2.0E-01)  (+2.8E-01)  (+5.2E-01)  (+1.2E-02)  (+9.2E-02)  (+6.4E-03)
==========================================================================================================================================================

Gamma Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
==============================================================================================================================================
    1       log(hpwt)    log(wt)     log(mpg)      air         fwd        trend         jp          eu       jp*trend    eu*trend     log(q)
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
 +1.4E+00    +8.8E-01    +1.4E+00    +1.2E-01    +2.7E-01    +6.9E-02    -1.2E-02    +1.0E-01    +4.6E-01    +1.6E-03    -1.1E-02    -6.9E-02
(+1.4E-01)  (+4.9E-02)  (+8.0E-02)  (+6.0E-02)  (+2.4E-02)  (+1.8E-02)  (+2.6E-03)  (+2.5E-02)  (+4.3E-02)  (+2.9E-03)  (+4.2E-03)  (+6.7E-03)
==============================================================================================================================================

Estimated Micro Moments:
===========================================================================================================
Observed  Estimated  Difference         Moment                 Part          Dataset  Observations  Markets
--------  ---------  ----------  --------------------  --------------------  -------  ------------  -------
+7.8E-01  +7.5E-01    +2.9E-02     E[age_i | mi_j]       E[age_i * mi_j]       CEX       29125        All
                                                             E[mi_j]           CEX       29125        All
+7.3E-01  +6.8E-01    +4.7E-02     E[age_i | sw_j]       E[age_i * sw_j]       CEX       29125        All
                                                             E[sw_j]           CEX       29125        All
+7.4E-01  +6.8E-01    +5.9E-02     E[age_i | su_j]       E[age_i * su_j]       CEX       29125        All
                                                             E[su_j]           CEX       29125        All
+6.5E-01  +7.3E-01    -7.7E-02     E[age_i | pv_j]       E[age_i * pv_j]       CEX       29125        All
                                                             E[pv_j]           CEX       29125        All
+3.9E+00  +3.9E+00    -1.2E-02      E[fs_i | mi_j]        E[fs_i * mi_j]       CEX       29125        All
                                                             E[mi_j]           CEX       29125        All
+3.2E+00  +3.2E+00    -7.6E-03      E[fs_i | sw_j]        E[fs_i * sw_j]       CEX       29125        All
                                                             E[sw_j]           CEX       29125        All
+3.0E+00  +3.0E+00    -8.5E-03      E[fs_i | su_j]        E[fs_i * su_j]       CEX       29125        All
                                                             E[su_j]           CEX       29125        All
+3.5E+00  +3.5E+00    -1.7E-02      E[fs_i | pv_j]        E[fs_i * pv_j]       CEX       29125        All
                                                             E[pv_j]           CEX       29125        All
+7.9E-02  +8.0E-02    -4.5E-04   E[1{j > 0} | mid_i]   E[1{j > 0} * mid_i]     CEX       29125        All
                                                             E[mid_i]          CEX       29125        All
+1.6E-01  +1.6E-01    -2.1E-03   E[1{j > 0} | high_i]  E[1{j > 0} * high_i]    CEX       29125        All
                                                            E[high_i]          CEX       29125        All
===========================================================================================================

There are some discrepancies between these results and those in the original paper, but the estimates are broadly similar. Although the estimates of \(\beta\) look substantially off, this is primarily because the \(\chi^2(3)\) distributions are not mean-zero, so differences in estimates of \(\Sigma\) result in shifted estimates of \(\beta\) too.

Running the Main Counterfactual

One result that is very similar is the paper’s headline number: a $367.29 million compensating variation from a counterfactual that removes the minivan in 1984. Using our estimates, we get a comparable number.

This subsection previews some of the routines used in the next tutorial on functions available after estimation. First, we will compute implied marginal costs in 1984.

[16]:
year = 1984
costs_1984 = results.compute_costs(market_id=year)

Next, we will set up a counterfactual simulation in which the minivan is removed.

[17]:
product_data_1984 = product_data[product_data['market_ids'] == year]
xi_1984 = results.xi[product_data['market_ids'] == year]
agent_data_1984 = agent_data[agent_data['market_ids'] == year]
simulation = pyblp.Simulation(
    product_formulations=product_formulations[:2],
    product_data=product_data_1984[product_data_1984['mi'] == 0],
    xi=xi_1984[product_data_1984['mi'] == 0],
    agent_formulation=problem.agent_formulation,
    agent_data=agent_data_1984,
    beta=results.beta,
    sigma=results.sigma,
    pi=results.pi,
)

We will then solve for equilibrium prices and shares under this counterfactual, using the above-computed marginal costs.

[18]:
simulation_results = simulation.replace_endogenous(costs=costs_1984[product_data_1984['mi'] == 0])

Finally, we will compute the change in consumer surplus.

[19]:
households = product_data_1984['households'].values[0]
cs = households * results.compute_consumer_surpluses(market_id=year)
counterfactual_cs = households * simulation_results.compute_consumer_surpluses()
cs - counterfactual_cs
[19]:
array([[425.90824856]])

We get an estimate that is in the same ballpark as $367.29 million. When bootstrapping this procedure (see the next tutorial for more on this), we get a standard error around $250 million.
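
For reference, the parameter draws behind such a bootstrap can be generated with ProblemResults.bootstrap. A minimal sketch, assuming 100 draws; the full exercise would repeat the simulation and consumer surplus computations above for each draw:

# Draw parameters from their estimated asymptotic distribution. Repeating the
# counterfactual for each draw yields a distribution of surplus changes.
bootstrapped_results = results.bootstrap(draws=100, seed=0)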