Download a PDF version of the documentation: pyblp.pdf
PyBLP¶
The documentation can be navigated with the sidebar, the links below, or the index.
Introduction¶
PyBLP is a Python 3 implementation of routines for estimating the demand for differentiated products with BLP-type random coefficients logit models. This package was created by Jeff Gortmaker in collaboration with Chris Conlon.
Development of the package has been guided by the work of many researchers and practitioners. For a full list of references, including the original work of Berry, Levinsohn, and Pakes (1995), refer to the references section of the documentation.
Citation¶
If you use PyBLP in your research, we ask that you also cite Conlon and Gortmaker (2020), which describes the advances implemented in the package.
@article{PyBLP,
author = {Conlon, Christopher and Gortmaker, Jeff},
title = {Best practices for differentiated products demand estimation with {PyBLP}},
journal = {The RAND Journal of Economics},
volume = {51},
number = {4},
pages = {1108-1161},
doi = {https://doi.org/10.1111/1756-2171.12352},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/1756-2171.12352},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1111/1756-2171.12352},
year = {2020}
}
If you use PyBLP’s micro moments functionality, we ask that you also cite Conlon and Gortmaker (2023), which describes the standardized framework implemented by PyBLP for incorporating micro data into BLP-style estimation.
@misc{MicroPyBLP,
author = {Conlon, Christopher and Gortmaker, Jeff},
title = {Incorporating micro data into differentiated products demand estimation with {PyBLP}},
note = {Working paper},
year = {2023}
}
Installation¶
The PyBLP package has been tested on Python versions 3.6 through 3.9. The SciPy instructions for installing related packages are a good guide for how to set up a scientific Python environment. A good choice is the Anaconda Distribution, since it comes packaged with the following PyBLP dependencies: NumPy, SciPy, SymPy, and Patsy. For absorption of high-dimensional fixed effects, PyBLP also depends on its companion package PyHDFE, which will be installed when PyBLP is installed.
However, PyBLP may not work with old versions of its dependencies. You can update PyBLP’s Anaconda dependencies with:
conda update numpy scipy sympy patsy
You can update PyHDFE with:
pip install --upgrade pyhdfe
You can install the current release of PyBLP with pip:
pip install pyblp
You can upgrade to a newer release with the --upgrade flag:
pip install --upgrade pyblp
If you lack permissions, you can install PyBLP in your user directory with the --user flag:
pip install --user pyblp
Alternatively, you can download a wheel or source archive from PyPI. You can find the latest development code on GitHub and the latest development documentation here.
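Once installed, you can check that the package imports correctly and see which version you have (the version string will depend on the release you installed):
import pyblp
pyblp.__version__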
Other Languages¶
Once installed, PyBLP can be incorporated into projects written in many other languages with the help of various tools that enable interoperability with Python.
For example, the reticulate package makes interacting with PyBLP in R straightforward. When supported, Python objects can be converted to their R counterparts with the py_to_r function; this must be done manually because we set convert=FALSE to avoid errors from attempts to automatically convert unsupported objects:
library(reticulate)
pyblp <- import("pyblp", convert=FALSE)
pyblp$options$flush_output <- TRUE
Similarly, PyCall can be used to incorporate PyBLP into a Julia workflow:
using PyCall
pyblp = pyimport("pyblp")
The py command serves a similar purpose in MATLAB:
py.pyblp
Features¶
R-style formula interface
Bertrand-Nash supply-side moments
Multiple equation GMM
Demographic interactions
Product-specific demographics
Consumer-specific product availability
Flexible micro moments that can match statistics based on survey data
Support for micro moments based on second choice data
Support for optimal micro moments that match micro data scores
Fixed effect absorption
Nonlinear functions of product characteristics
Concentrating out linear parameters
Flexible random coefficient distributions
Parameter bounds and constraints
Random coefficients nested logit (RCNL)
Approximation to the pure characteristics model
Varying nesting parameters across groups
Logit and nested logit benchmarks
Classic BLP instruments
Differentiation instruments
Optimal instruments
Covariance restrictions
Adjustments for simulation error
Tests of overidentifying and model restrictions
Parametric bootstrapping of post-estimation outputs
Elasticities and diversion ratios
Marginal costs and markups
Passthrough calculations
Profits and consumer surplus
Newton and fixed point methods for computing pricing equilibria
Merger simulation
Custom counterfactual simulation
Synthetic data construction
SciPy or Artleys Knitro optimization
Fixed point acceleration
Monte Carlo, quasi-random sequences, quadrature, and sparse grids
Importance sampling
Custom optimization and iteration routines
Robust and clustered errors
Linear or log-linear marginal costs
Partial ownership matrices
Analytic gradients
Finite difference Hessians
Market-by-market parallelization
Extended floating point precision
Robust error handling
Bugs and Requests¶
Please use the GitHub issue tracker to submit bugs or to request features.
Notation¶
The notation in PyBLP is a customized amalgamation of the notation employed by Berry, Levinsohn, and Pakes (1995), Nevo (2000a), Morrow and Skerlos (2011), Grigolon and Verboven (2014), and others.
Indices¶
Index | Description |
---|---|
\(j\) | Products |
\(t\) | Markets |
\(i\) | Agents/individuals |
\(f\) | Firms |
\(h\) | Nests |
\(c\) | Clusters |
\(m\) | Micro moments |
Dimensions/Sets¶
Dimension/Set | Description |
---|---|
\(T\) | Markets |
\(N\) | Products across all markets |
\(F\) | Firms across all markets |
\(I\) | Agents across all markets |
\(J_t\) | Products in market \(t\) |
\(F_t\) | Firms in market \(t\) |
\(J_{ft}\) | Products produced by firm \(f\) in market \(t\) |
\(I_t\) | Agents in market \(t\) |
\(K_1\) | Demand-side linear product characteristics |
\(K_1^\text{ex}\) | Exogenous demand-side linear product characteristics |
\(K_1^\text{en}\) | Endogenous demand-side linear product characteristics |
\(K_2\) | Demand-side nonlinear product characteristics |
\(K_3\) | Supply-side product characteristics |
\(K_3^\text{ex}\) | Exogenous supply-side product characteristics |
\(K_3^\text{en}\) | Endogenous supply-side product characteristics |
\(D\) | Demographic variables |
\(M_D\) | Demand-side instruments |
\(M_S\) | Supply-side instruments |
\(M_C\) | Covariance instruments |
\(M_M\) | Micro moments |
\(T_m\) | Markets over which micro moment \(m\) is averaged |
\(T_{mn}\) | Markets over which micro moments \(m\) and \(n\) are both averaged |
\(N_m\) | Observations underlying observed micro moment value \(m\) |
\(M\) | All moments |
\(E_D\) | Absorbed dimensions of demand-side fixed effects |
\(E_S\) | Absorbed dimensions of supply-side fixed effects |
\(H\) | Nesting groups |
\(J_{ht}\) | Products in nesting group \(h\) and market \(t\) |
\(C\) | Clusters |
\(J_{ct}\) | Products in cluster \(c\) and market \(t\) |
Matrices, Vectors, and Scalars¶
Symbol | Dimensions | Description |
---|---|---|
\(X_1\) | \(N \times K_1\) | Demand-side linear product characteristics |
\(X_1^\text{ex}\) | \(N \times K_1^\text{ex}\) | Exogenous demand-side linear product characteristics |
\(X_1^\text{en}\) | \(N \times K_1^\text{en}\) | Endogenous demand-side linear product characteristics |
\(X_2\) | \(N \times K_2\) | Demand-side nonlinear product characteristics |
\(X_3\) | \(N \times K_3\) | Supply-side product characteristics |
\(X_3^\text{ex}\) | \(N \times K_3^\text{ex}\) | Exogenous supply-side product characteristics |
\(X_3^\text{en}\) | \(N \times K_3^\text{en}\) | Endogenous supply-side product characteristics |
\(\xi\) | \(N \times 1\) | Unobserved demand-side product characteristics |
\(\omega\) | \(N \times 1\) | Unobserved supply-side product characteristics |
\(p\) | \(N \times 1\) | Prices |
\(s\) (\(s_{jt}\)) | \(N \times 1\) | Market shares |
\(s\) (\(s_{ht}\)) | \(H \times 1\) | Group shares in a market \(t\) |
\(s\) (\(s_{ijt}\)) | \(N \times I_t\) | Choice probabilities in a market \(t\) |
\(c\) | \(N \times 1\) | Marginal costs |
\(\tilde{c}\) | \(N \times 1\) | Linear or log-linear marginal costs, \(c\) or \(\log c\) |
\(\eta\) | \(N \times 1\) | Markup term from the BLP-markup equation |
\(\zeta\) | \(N \times 1\) | Markup term from the \(\zeta\)-markup equation |
\(\mathscr{H}\) | \(J_t \times J_t\) | Ownership or product holdings matrix in market \(t\) |
\(\kappa\) | \(F_t \times F_t\) | Cooperation matrix in market \(t\) |
\(\Delta\) | \(J_t \times J_t\) | Intra-firm matrix of (negative, transposed) demand derivatives in market \(t\) |
\(\Lambda\) | \(J_t \times J_t\) | Diagonal matrix used to decompose \(\eta\) and \(\zeta\) in market \(t\) |
\(\Gamma\) | \(J_t \times J_t\) | Another matrix used to decompose \(\eta\) and \(\zeta\) in market \(t\) |
\(d\) | \(I_t \times D\) | Observed agent characteristics called demographics in market \(t\) |
\(\nu\) | \(I_t \times K_2\) | Unobserved agent characteristics called integration nodes in market \(t\) |
\(a\) | \(I_t \times J_t\) | Agent-specific product availability in market \(t\) |
\(w\) | \(I_t \times 1\) | Integration weights in market \(t\) |
\(\delta\) | \(N \times 1\) | Mean utility |
\(\mu\) | \(J_t \times I_t\) | Agent-specific portion of utility in market \(t\) |
\(\epsilon\) | \(N \times 1\) | Type I Extreme Value idiosyncratic preferences |
\(\bar{\epsilon}\) (\(\bar{\epsilon}_{ijt}\)) | \(N \times 1\) | Type I Extreme Value term used to decompose \(\epsilon\) |
\(\bar{\epsilon}\) (\(\bar{\epsilon}_{iht}\)) | \(N \times 1\) | Group-specific term used to decompose \(\epsilon\) |
\(U\) | \(J_t \times I_t\) | Indirect utilities |
\(V\) (\(V_{ijt}\)) | \(J_t \times I_t\) | Indirect utilities minus \(\epsilon\) |
\(V\) (\(V_{iht}\)) | \(J_t \times I_t\) | Inclusive value of a nesting group |
\(\pi\) (\(\pi_{jt}\)) | \(N \times 1\) | Population-normalized gross expected profits |
\(\pi\) (\(\pi_{ft}\)) | \(F_t \times 1\) | Population-normalized gross expected profits of a firm in market \(t\) |
\(\beta\) | \(K_1 \times 1\) | Demand-side linear parameters |
\(\beta^\text{ex}\) | \(K_1^\text{ex} \times 1\) | Parameters in \(\beta\) on exogenous product characteristics |
\(\alpha\) | \(K_1^\text{en} \times 1\) | Parameters in \(\beta\) on endogenous product characteristics |
\(\Sigma\) | \(K_2 \times K_2\) | Cholesky root of the covariance matrix for unobserved taste heterogeneity |
\(\Pi\) | \(K_2 \times D\) | Parameters that measure how agent tastes vary with demographics |
\(\rho\) | \(H \times 1\) | Parameters that measure within nesting group correlation |
\(\gamma\) | \(K_3 \times 1\) | Supply-side linear parameters |
\(\gamma^\text{ex}\) | \(K_3^\text{ex} \times 1\) | Parameters in \(\gamma\) on exogenous product characteristics |
\(\gamma^\text{en}\) | \(K_3^\text{en} \times 1\) | Parameters in \(\gamma\) on endogenous product characteristics |
\(\theta\) | \(P \times 1\) | Parameters |
\(Z_D\) | \(N \times M_D\) | Demand-side instruments |
\(Z_S\) | \(N \times M_S\) | Supply-side instruments |
\(Z_C\) | \(N \times M_C\) | Covariance instruments |
\(W\) | \(M \times M\) | Weighting matrix |
\(S\) | \(M \times M\) | Moment covariances |
\(q\) | \(1 \times 1\) | Objective value |
\(g_D\) | \(N \times M_D\) | Demand-side moments |
\(g_S\) | \(N \times M_S\) | Supply-side moments |
\(g_C\) | \(N \times M_C\) | Covariance moments |
\(g_M\) | \(I \times M_M\) | Micro moments |
\(g\) (\(g_{jt}\)) | \(N \times (M_D + M_S + M_C)\) | Demand-side, supply-side, and covariance moments |
\(g\) (\(g_c\)) | \(C \times (M_D + M_S + M_C)\) | Clustered demand-side, supply-side, and covariance moments |
\(\bar{g}_D\) | \(M_D \times 1\) | Averaged demand-side moments |
\(\bar{g}_S\) | \(M_S \times 1\) | Averaged supply-side moments |
\(\bar{g}_C\) | \(M_C \times 1\) | Averaged covariance moments |
\(\bar{g}_M\) | \(M_M \times 1\) | Averaged micro moments |
\(\bar{g}\) | \(M \times 1\) | Averaged moments |
\(\bar{G}\) | \(M \times P\) | Jacobian of the averaged moments with respect to \(\theta\) |
\(\varepsilon\) | \(J_t \times J_t\) | Elasticities of demand in market \(t\) |
\(\mathscr{D}\) | \(J_t \times J_t\) | Diversion ratios in market \(t\) |
\(\bar{\mathscr{D}}\) | \(J_t \times J_t\) | Long-run diversion ratios in market \(t\) |
\(\mathscr{M}\) | \(N \times 1\) | Markups |
\(\mathscr{E}\) | \(1 \times 1\) | Aggregate elasticity of demand of a market |
\(\text{CS}\) | \(1 \times 1\) | Population-normalized consumer surplus of a market |
\(\text{HHI}\) | \(1 \times 1\) | Herfindahl-Hirschman Index of a market |
Background¶
The following sections provide a very brief overview of the BLP model and how it is estimated. The goal is to concisely introduce the notation and terminology used throughout the rest of the documentation. For a more in-depth overview, refer to Conlon and Gortmaker (2020).
The Model¶
There are \(t = 1, 2, \dotsc, T\) markets, each with \(j = 1, 2, \dotsc, J_t\) products produced by \(f = 1, 2, \dotsc, F_t\) firms, for a total of \(N\) products across all markets. There are \(i = 1, 2, \dotsc, I_t\) individuals/agents who choose among the \(J_t\) products and an outside good \(j = 0\). These numbers also represent sets. For example, \(J_t = \{1, 2, \dots, J_t\}\).
Demand¶
Observed demand-side product characteristics are contained in the \(N \times K_1\) matrix of linear characteristics, \(X_1\), and the \(N \times K_2\) matrix of nonlinear characteristics, \(X_2\), which is typically a subset of \(X_1\). Unobserved demand-side product characteristics, \(\xi\), are an \(N \times 1\) vector.
In market \(t\), observed agent characteristics are an \(I_t \times D\) matrix called demographics, \(d\). Unobserved agent characteristics are an \(I_t \times K_2\) matrix, \(\nu\).
The indirect utility of agent \(i\) from purchasing product \(j\) in market \(t\) is
in which the mean utility is, in vector-matrix form,
The \(K_1 \times 1\) vector of demand-side linear parameters, \(\beta\), is partitioned into two components: \(\alpha\) is a \(K_1^\text{en} \times 1\) vector of parameters on the \(N \times K_1^\text{en}\) submatrix of endogenous characteristics, \(X_1^\text{en}\), and \(\beta^\text{ex}\) is a \(K_1^\text{ex} \times 1\) vector of parameters on the \(N \times K_1^\text{ex}\) submatrix of exogenous characteristics, \(X_1^\text{ex}\). Usually, \(X_1^\text{en} = p\), prices, so \(\alpha\) is simply a scalar.
The agent-specific portion of utility in a single market is, in vector-matrix form,
The model incorporates both observable (demographics) and unobservable taste heterogeneity through random coefficients. For the unobserved heterogeneity, we let \(\nu\) denote independent draws from the standard normal distribution. These are scaled by a \(K_2 \times K_2\) lower-triangular matrix \(\Sigma\), which denotes the Cholesky root of the covariance matrix for unobserved taste heterogeneity. The \(K_2 \times D\) matrix \(\Pi\) measures how agent tastes vary with demographics.
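Putting these pieces together, a standard way to write the agent-specific portion of utility in a single market, using the notation above, is
\[\mu = X_2(\Sigma\nu' + \Pi d'),\]
so that each column of \(\mu\) stacks one agent's deviations from mean utility.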
In the above expression, random coefficients are assumed to be normally distributed, but this expression supports all elliptical distributions. To incorporate one or more lognormal random coefficients, the associated columns in the parenthesized expression can be exponentiated before being pre-multiplied by \(X_2\). For example, this allows for the coefficient on price to be lognormal so that demand slopes down for all agents. For lognormal random coefficients, a constant column is typically included in \(d\) so that its coefficients in \(\Pi\) parametrize the means of the logs of the random coefficients. More generally, all log-elliptical distributions are supported. A logit link function is also supported.
Random idiosyncratic preferences, \(\epsilon_{ijt}\), are assumed to be Type I Extreme Value, so that conditional on the heterogeneous coefficients, market shares follow the well-known logit form. Aggregate market shares are obtained by integrating over the distribution of individual heterogeneity. They are approximated with Monte Carlo integration or quadrature rules defined by the \(I_t \times K_2\) matrix of integration nodes, \(\nu\), and an \(I_t \times 1\) vector of integration weights, \(w\):
where the probability that agent \(i\) chooses product \(j\) in market \(t\) is
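Using the notation above, a standard way to write these approximated shares and conditional choice probabilities is
\[s_{jt} \approx \sum_{i=1}^{I_t} w_{it} s_{ijt}, \qquad s_{ijt} = \frac{\exp V_{ijt}}{1 + \sum_{k \in J_t} \exp V_{ikt}}, \qquad V_{ijt} = \delta_{jt} + \mu_{ijt}.\]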
There is a one in the denominator because the utility of the outside good is normalized to \(U_{i0t} = 0\). The scale of utility is normalized by the variance of \(\epsilon_{ijt}\).
Supply¶
Observed supply-side product characteristics are contained in the \(N \times K_3\) matrix of supply-side characteristics, \(X_3\). Prices cannot be supply-side characteristics, but non-price product characteristics often overlap with the demand-side characteristics in \(X_1\) and \(X_2\). Unobserved supply-side product characteristics, \(\omega\), are an \(N \times 1\) vector.
Firm \(f\) chooses prices in market \(t\) to maximize the profits of its products \(J_{ft} \subset J_t\):
In a single market, the corresponding multi-product differentiated Bertrand first order conditions are, in vector-matrix form,
where the multi-product Bertrand markup \(\eta\) depends on \(\Delta\), a \(J_t \times J_t\) matrix of intra-firm (negative, transposed) demand derivatives:
Here, \(\mathscr{H}\) denotes the market-level ownership or product holdings matrix in the market, where \(\mathscr{H}_{jk}\) is typically \(1\) if the same firm produces products \(j\) and \(k\), and \(0\) otherwise.
To include a supply side, we must specify a functional form for marginal costs:
The most common choices are \(f(c) = c\) and \(f(c) = \log(c)\).
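Written out with the notation above, this specification takes the standard linear form
\[\tilde{c} = f(c) = X_3\gamma + \omega,\]
in which the unobserved supply-side characteristics \(\omega\) play the role of the structural error.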
Estimation¶
A demand side is always estimated but including a supply side is optional. With only a demand side, there are three sets of parameters to be estimated: \(\beta\) (which may include \(\alpha\)), \(\Sigma\) and \(\Pi\). With a supply side, there is also \(\gamma\). The linear parameters, \(\beta\) and \(\gamma\), are typically concentrated out of the problem. The exception is \(\alpha\), which cannot be concentrated out when there is a supply side because it is needed to compute demand derivatives and hence marginal costs. Linear parameters that are not concentrated out along with unknown nonlinear parameters in \(\Sigma\) and \(\Pi\) are collectively denoted \(\theta\).
The GMM problem is
in which \(q(\theta)\) is the GMM objective. By default, PyBLP scales this value by \(N\) so that objectives across different problem sizes are comparable. This behavior can be disabled. In some of the BLP literature and in earlier versions of this package, the objective was scaled by \(N^2\).
Here, \(W\) is a \(M \times M\) weighting matrix and \(\bar{g}\) is a \(M \times 1\) vector of averaged demand- and supply-side moments:
where \(Z_D\) and \(Z_S\) are \(N \times M_D\) and \(N \times M_S\) matrices of demand- and supply-side instruments.
The vector \(\bar{g}\) contains sample analogues of the demand- and supply-side moment conditions \(E[g_{D,jt}] = E[g_{S,jt}] = 0\) where
In each GMM stage, a nonlinear optimizer finds the \(\hat{\theta}\) that minimizes the GMM objective value \(q(\theta)\).
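For reference, with the default scaling by \(N\), the objective and the averaged demand- and supply-side moments described here can be written as
\[q(\theta) = N\,\bar{g}(\theta)'W\bar{g}(\theta), \qquad \bar{g} = \begin{bmatrix} \bar{g}_D \\ \bar{g}_S \end{bmatrix} = \frac{1}{N}\begin{bmatrix} Z_D'\xi \\ Z_S'\omega \end{bmatrix}.\]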
The Objective¶
Given a \(\theta\), the first step to computing the objective \(q(\theta)\) is to compute \(\delta(\theta)\) in each market with the following standard contraction:
where \(s\) are the market’s observed shares and \(s(\delta, \theta)\) are calculated market shares. Iteration terminates when the norm of the change in \(\delta(\theta)\) is less than a small number.
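The standard BLP contraction referenced here updates mean utilities with the log difference between observed and computed shares:
\[\delta \leftarrow \delta + \log s - \log s(\delta, \theta).\]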
With a supply side, marginal costs are then computed according to (7):
Concentrated out linear parameters are recovered with linear IV-GMM:
where
With only a demand side, \(\alpha\) can be concentrated out, so \(X = X_1\), \(Z = Z_D\), and \(Y = \delta(\theta)\) recover the full \(\hat{\beta}\) in (15).
Finally, the unobserved product characteristics (i.e., the structural errors),
are interacted with the instruments to form \(\bar{g}(\theta)\) in (11), which gives the GMM objective \(q(\theta)\) in (10).
The Gradient¶
The gradient of the GMM objective in (10) is
where
Writing \(\delta\) as an implicit function of \(s\) in (4) gives the demand-side Jacobian:
The supply-side Jacobian is derived from the definition of \(\tilde{c}\) in (9):
The second term in this expression is derived from the definition of \(\eta\) in (7):
One thing to note is that \(\frac{\partial\xi}{\partial\theta} = \frac{\partial\delta}{\partial\theta}\) and \(\frac{\partial\omega}{\partial\theta} = \frac{\partial\tilde{c}}{\partial\theta}\) need not hold during optimization if we concentrate out linear parameters because these are then functions of \(\theta\). Fortunately, one can use orthogonality conditions to show that it is fine to treat these parameters as fixed when computing the gradient.
Weighting Matrices¶
Conventionally, the 2SLS weighting matrix is used in the first stage:
With two-step GMM, \(W\) is updated before the second stage according to
For heteroscedasticity robust weighting matrices,
For clustered weighting matrices with \(c = 1, 2, \dotsc, C\) clusters,
where, letting the set \(J_{ct} \subset J_t\) denote products in cluster \(c\) and market \(t\),
For unadjusted weighting matrices,
where \(\sigma_\xi^2\), \(\sigma_\omega^2\), and \(\sigma_{\xi\omega}\) are estimates of the variances and covariance between the structural errors.
Simulation error can be accounted for by resampling agents \(r = 1, \dots, R\) times, evaluating each \(\bar{g}_r\), and adding the following to \(S\):
Standard Errors¶
An estimate of the asymptotic covariance matrix of \(\sqrt{N}(\hat{\theta} - \theta_0)\) is
Standard errors are the square root of the diagonal of this matrix divided by \(N\).
If the weighting matrix was chosen such that \(W = S^{-1}\), this simplifies to
Standard errors extracted from this simpler expression are called unadjusted.
Fixed Effects¶
The unobserved product characteristics can be partitioned into
where \(k_1, k_2, \dotsc, k_{E_D}\) and \(\ell_1, \ell_2, \dotsc, \ell_{E_S}\) index unobserved characteristics that are fixed across \(E_D\) and \(E_S\) dimensions. For example, with \(E_D = 1\) dimension of product fixed effects, \(\xi_{jt} = \xi_j + \Delta\xi_{jt}\).
Small numbers of fixed effects can be estimated with dummy variables in \(X_1\), \(X_3\), \(Z_D\), and \(Z_S\). However, this approach does not scale with high dimensional fixed effects because it requires constructing and inverting an infeasibly large matrix in (15).
Instead, fixed effects are typically absorbed into \(X\), \(Z\), and \(Y(\theta)\) in (15). With one fixed effect, these matrices are simply de-meaned within each level of the fixed effect. Both \(X\) and \(Z\) can be de-meaned just once, but \(Y(\theta)\) must be de-meaned for each new \(\theta\).
This procedure is equivalent to replacing each column of the matrices with residuals from a regression of the column on the fixed effect. The Frisch-Waugh-Lovell (FWL) theorem of Frisch and Waugh (1933) and Lovell (1963) guarantees that using these residualized matrices gives the same results as including fixed effects as dummy variables. When \(E_D > 1\) or \(E_S > 1\), the matrices are residualized with more involved algorithms.
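As a rough illustration of what absorbing a single fixed effect does, the following sketch de-means a column within levels of a hypothetical product identifier using pandas. It mirrors the residualization described above; it is not how PyBLP implements absorption internally, which relies on PyHDFE.
import pandas as pd
# Hypothetical data with one dimension of product fixed effects.
df = pd.DataFrame({
    'product_ids': ['A', 'A', 'B', 'B'],
    'x': [1.0, 2.0, 3.0, 5.0],
})
# De-mean x within each product: equivalent to regressing x on product dummies
# and keeping the residuals, per the FWL logic described above.
df['x_demeaned'] = df['x'] - df.groupby('product_ids')['x'].transform('mean')
print(df)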
Once fixed effects have been absorbed, estimation is as described above with the structural errors \(\Delta\xi\) and \(\Delta\omega\).
Micro Moments¶
More detailed micro data on individual choices can be used to supplement the standard demand- and supply-side moments \(\bar{g}_D\) and \(\bar{g}_S\) in (11) with an additional \(m = 1, 2, \ldots, M_M\) micro moments, \(\bar{g}_M\), for a total of \(M = M_D + M_S + M_M\) moments:
Conlon and Gortmaker (2023) provides a standardized framework for incorporating micro moments into BLP-style estimation. What follows is a simplified summary of this framework. Each micro moment \(m\) is the difference between an observed value \(f_m(\bar{v})\) and its simulated analogue \(f_m(v)\):
in which \(f_m(\cdot)\) is a function that maps a vector of \(p = 1, \ldots, P_M\) micro moment parts \(\bar{v} = (\bar{v}_1, \dots, \bar{v}_{P_M})'\) or \(v = (v_1, \dots, v_{P_M})'\) into a micro statistic. Each sample micro moment part \(p\) is an average over observations \(n \in N_{d_p}\) in the associated micro dataset \(d_p\):
Its simulated analogue is
In which \(w_{it} s_{ijt} w_{d_pijt}\) is the probability an observation in the micro dataset is for an agent \(i\) who chooses \(j\) in market \(t\).
The simplest type of micro moment is just an average over the entire sample, with \(f_m(v) = v_1\). For example, with \(v_{1ijt}\) equal to the income of an agent \(i\) who chooses \(j\) in market \(t\), micro moment \(m\) would match the average income in dataset \(d_p\). Observed values such as conditional expectations, covariances, correlations, or regression coefficients can be matched by choosing the appropriate function \(f_m\). For example, with \(v_{2ijt}\) equal to the interaction between income and an indicator for the choice of the outside option, and with \(v_{3ijt}\) equal to an indicator for the choice of the outside option, \(f_m(v) = v_2 / v_3\) would match an observed conditional mean income among those who choose the outside option.
A micro dataset \(d\), often a survey, is defined by survey weights \(w_{dijt}\). For example, \(w_{dijt} = 1\{j \neq 0, t \in T_d\}\) defines a micro dataset that is a selected sample of inside purchasers in a few markets \(T_d \subset T\), giving each market an equal sampling weight. Different micro datasets are independent.
A micro dataset will often admit multiple micro moment parts. Each micro moment part \(p\) is defined by its dataset \(d_p\) and micro values \(v_{pijt}\). For example, a micro moment part \(p\) with \(v_{pijt} = y_{it}x_{jt}\) delivers the mean \(\bar{v}_p\) or expectation \(v_p\) of an interaction between some demographic \(y_{it}\) and some product characteristic \(x_{jt}\).
A micro moment is a function of one or more micro moment parts. The simplest type is a function of only one micro moment part, and matches the simple average defined by the micro moment part. For example, \(f_m(v) = v_p\) with \(v_{pijt} = y_{it} x_{jt}\) matches the mean of an interaction between \(y_{it}\) and \(x_{jt}\). Non-simple averages such as conditional means, covariances, correlations, or regression coefficients can be matched by choosing an appropriate function \(f_m\). For example, \(f_m(v) = v_1 / v_2\) with \(v_{1ijt} = y_{it}x_{jt}1\{j \neq 0\}\) and \(v_{2ijt} = 1\{j \neq 0\}\) matches the conditional mean of an interaction between \(y_{it}\) and \(x_{jt}\) among those who do not choose the outside option \(j = 0\).
Technically, if not all micro moments \(m\) are simple averages \(f_m(v) = v_m\), then the resulting estimator will no longer be a GMM estimator, but rather a more generic minimum distance estimator, since these “micro moments” are not technically sample moments. Regardless, the package uses GMM terminology for simplicity’s sake, and the statistical expressions are all the same. Micro moments are computed for each \(\theta\) and contribute to the GMM (or minimum distance) objective \(q(\theta)\) in (10). Their derivatives with respect to \(\theta\) are added as rows to \(\bar{G}\) in (19), and blocks are added to both \(W\) and \(S\) in (23) and (24). The covariance between standard moments and micro moments is zero, so these matrices are block-diagonal. The delta method delivers the covariance matrix for the micro moments:
The scaled covariance between micro moment parts \(p\) and \(q\) in \(S_P\) is zero if they are based on different micro datasets \(d_p \neq d_q\); otherwise, if based on the same dataset \(d_p = d_q = d\),
in which
Micro moment parts based on second choice are averages over values \(v_{pijkt}\) where \(k\) indexes second choices, and are based on datasets defined by survey weights \(w_{dijkt}\). A sample micro moment part is
Its simulated analogue is
in which \(s_{ik(-j)t}\) is the probability of choosing \(k\) when \(j\) is removed from the choice set. One can also define micro moment parts based on second choices where a group of products \(h(j)\) containing the first choice \(j\) is removed from the choice set. In this case, the above second choice probabilities become \(s_{ik(-h(j))t}\).
Covariances are defined analogously.
Random Coefficients Nested Logit¶
Incorporating parameters that measure within nesting group correlation gives the random coefficients nested logit (RCNL) model of Brenkers and Verboven (2006) and Grigolon and Verboven (2014). There are \(h = 1, 2, \dotsc, H\) nesting groups and each product \(j\) is assigned to a group \(h(j)\). The set \(J_{ht} \subset J_t\) denotes the products in group \(h\) and market \(t\).
In the RCNL model, idiosyncratic preferences are partitioned into
where \(\bar{\epsilon}_{ijt}\) is Type I Extreme Value and \(\bar{\epsilon}_{ih(j)t}\) is distributed such that \(\epsilon_{ijt}\) is still Type I Extreme Value.
The nesting parameters, \(\rho\), can either be a \(H \times 1\) vector or a scalar so that for all groups \(\rho_h = \rho\). Letting \(\rho \to 0\) gives the standard BLP model and \(\rho \to 1\) gives division by zero errors. With \(\rho_h \in (0, 1)\), the expression for choice probabilities in (5) becomes more complicated:
where
The contraction for \(\delta(\theta)\) in (13) is also slightly different:
Otherwise, estimation is as described above with \(\rho\) included in \(\theta\).
Logit and Nested Logit¶
Letting \(\Sigma = 0\) gives the simpler logit (or nested logit) model where there is a closed-form solution for \(\delta\). In the logit model,
and a lack of nonlinear parameters means that nonlinear optimization is often unneeded.
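In particular, with \(\Sigma = 0\) the inversion referenced here reduces to the familiar expression
\[\delta_{jt} = \log s_{jt} - \log s_{0t},\]
where \(s_{0t}\) is the share of the outside good.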
In the nested logit model, \(\rho\) must be optimized over, but there is still a closed-form solution for \(\delta\):
where
In both models, a supply side can still be estimated jointly with demand. Estimation is as described above with a representative agent in each market: \(I_t = 1\) and \(w_1 = 1\).
Equilibrium Prices¶
Counterfactual evaluation, synthetic data simulation, and optimal instrument generation often involve solving for prices implied by the Bertrand first order conditions in (7). Solving this system with Newton’s method is slow and iterating over \(p \leftarrow c + \eta(p)\) may not converge because it is not a contraction.
Instead, Morrow and Skerlos (2011) reformulate the solution to (7):
where \(\Lambda\) is a diagonal \(J_t \times J_t\) matrix approximated by
and \(\Gamma\) is a dense \(J_t \times J_t\) matrix approximated by
Equilibrium prices are computed by iterating over the \(\zeta\)-markup equation in (49),
which, unlike (7), is a contraction. Iteration terminates when the norm of firms’ first order conditions, \(||\Lambda(p)(p - c - \zeta(p))||\), is less than a small number.
If marginal costs depend on quantity, then they also depend on prices and need to be updated during each iteration: \(c_{jt} = c_{jt}(s_{jt}(p))\).
Tutorial¶
This section uses a series of Jupyter Notebooks to explain how PyBLP can be used to solve example problems, compute post-estimation outputs, and simulate problems.
For a more concise, targeted, and opinionated tutorial, see the Demand Estimation Mixtape Session, a three day workshop taught by Jeff Gortmaker in 2024. In order, the days cover pure logit estimation, aggregate BLP estimation, and micro BLP estimation. Each day has slides and extensive coding exercises (along with Jupyter Notebook solutions) that use PyBLP.
In addition to the notebooks here and the above workshop, other examples can be found in the API Documentation.
Download the Jupyter Notebook for this section: logit_nested.ipynb
Logit and Nested Logit Tutorial¶
[1]:
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'
In this tutorial, we’ll use data from Nevo (2000a) to solve the paper’s fake cereal problem. Locations of CSV files that contain the data are in the data module.
We will compare two simple models, the plain (IIA) logit model and the nested logit (GEV) model using the fake cereal dataset of Nevo (2000a).
Theory of Plain Logit¶
Let’s start with the plain logit model under independence of irrelevant alternatives (IIA). In this model (indirect) utility is given by
where \(\epsilon_{ijt}\) is distributed IID with the Type I Extreme Value (Gumbel) distribution. It is common to normalize the mean utility of the outside good to zero so that \(U_{i0t} = \epsilon_{i0t}\). This gives us aggregate market shares
If we take logs we get
and
By differencing the above we get a linear estimating equation:
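A standard way to write this estimating equation, using the notation above, is
\[\log s_{jt} - \log s_{0t} = \alpha p_{jt} + x_{jt}\beta^\text{ex} + \xi_{jt},\]
where \(s_{0t}\) is the share of the outside good.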
Because the left hand side is data, we can estimate this model using linear IV GMM.
Application of Plain Logit¶
A Logit Problem can be created by simply excluding the formulation for the nonlinear parameters, \(X_2\), along with any agent information. In other words, it requires only specifying the linear component of demand.
We’ll set up and solve a simple version of the fake data cereal problem from Nevo (2000a). Since we won’t include any demand-side nonlinear characteristics or parameters, we don’t have to worry about configuring an optimization routine.
There are some important reserved variable names:
market_ids are the unique market identifiers which we subscript with \(t\).
shares specifies the market shares, which need to be between zero and one, and within a market ID, \(\sum_{j} s_{jt} \leq 1\).
prices are prices \(p_{jt}\). These have some special properties and are always treated as endogenous.
demand_instruments0, demand_instruments1, and so on are numbered demand instruments. These represent only the excluded instruments. The exogenous regressors in \(X_1\) will be automatically added to the set of instruments. A minimal sketch of a table with these fields follows.
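For instance, an entirely hypothetical product table with just the required fields might look like the following; the real cereal data loaded below has many more columns:
import pandas as pd
# Hypothetical minimal product data: two markets with two products each.
minimal_data = pd.DataFrame({
    'market_ids': ['market1', 'market1', 'market2', 'market2'],
    'shares': [0.2, 0.3, 0.25, 0.4],
    'prices': [1.5, 2.0, 1.4, 2.2],
    'demand_instruments0': [0.7, 1.1, 0.6, 1.3],
})
print(minimal_data)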
We begin with two steps:
Load the product data which at a minimum consists of market_ids, shares, prices, and at least a single column of demand instruments, demand_instruments0.
Define a Formulation for the \(X_1\) (linear) demand model.
This and all other formulas are similar to R and patsy formulas.
It includes a constant by default. To exclude the constant, specify either a 0 or a -1.
To efficiently include fixed effects, use the absorb option and specify which categorical variables you would like to absorb.
Some model reduction may happen automatically. The constant will be excluded if you include fixed effects and some precautions are taken against collinearity. However, you will have to make sure that differently-named variables are not collinear.
Combine the Formulation and product data to construct a Problem.
Use Problem.solve to estimate parameters.
Loading the Data¶
The product_data argument of Problem should be a structured array-like object with fields that store data. Product data can be a structured NumPy array, a pandas DataFrame, or other similar objects.
[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids | city_ids | quarter | product_ids | firm_ids | brand_ids | shares | prices | sugar | mushy | ... | demand_instruments10 | demand_instruments11 | demand_instruments12 | demand_instruments13 | demand_instruments14 | demand_instruments15 | demand_instruments16 | demand_instruments17 | demand_instruments18 | demand_instruments19 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | C01Q1 | 1 | 1 | F1B04 | 1 | 4 | 0.012417 | 0.072088 | 2 | 1 | ... | 2.116358 | -0.154708 | -0.005796 | 0.014538 | 0.126244 | 0.067345 | 0.068423 | 0.034800 | 0.126346 | 0.035484 |
1 | C01Q1 | 1 | 1 | F1B06 | 1 | 6 | 0.007809 | 0.114178 | 18 | 1 | ... | -7.374091 | -0.576412 | 0.012991 | 0.076143 | 0.029736 | 0.087867 | 0.110501 | 0.087784 | 0.049872 | 0.072579 |
2 | C01Q1 | 1 | 1 | F1B07 | 1 | 7 | 0.012995 | 0.132391 | 4 | 1 | ... | 2.187872 | -0.207346 | 0.003509 | 0.091781 | 0.163773 | 0.111881 | 0.108226 | 0.086439 | 0.122347 | 0.101842 |
3 | C01Q1 | 1 | 1 | F1B09 | 1 | 9 | 0.005770 | 0.130344 | 3 | 0 | ... | 2.704576 | 0.040748 | -0.003724 | 0.094732 | 0.135274 | 0.088090 | 0.101767 | 0.101777 | 0.110741 | 0.104332 |
4 | C01Q1 | 1 | 1 | F1B11 | 1 | 11 | 0.017934 | 0.154823 | 12 | 0 | ... | 1.261242 | 0.034836 | -0.000568 | 0.102451 | 0.130640 | 0.084818 | 0.101075 | 0.125169 | 0.133464 | 0.121111 |
5 rows × 30 columns
The product data contains market_ids, product_ids, firm_ids, shares, prices, a number of other IDs and product characteristics, and some pre-computed excluded demand_instruments0, demand_instruments1, and so on. The product_ids will be incorporated as fixed effects.
For more information about the instruments and the example data as a whole, refer to the data module.
Setting Up the Problem¶
We can combine the Formulation and product_data to construct a Problem. We pass the Formulation first and the product_data second. We can also display the properties of the problem by typing its name.
[3]:
logit_formulation = pyblp.Formulation('prices', absorb='C(product_ids)')
logit_formulation
[3]:
prices + Absorb[C(product_ids)]
[4]:
problem = pyblp.Problem(logit_formulation, product_data)
problem
[4]:
Dimensions:
================================
T N F K1 MD ED
--- ---- --- ---- ---- ----
94 2256 5 1 20 1
================================
Formulations:
==================================
Column Indices: 0
-------------------------- ------
X1: Linear Characteristics prices
==================================
Two sets of properties are displayed:
Dimensions of the data.
Formulations of the problem.
The dimensions describe the shapes of matrices as laid out in Notation. They include:
\(T\) is the number of markets.
\(N\) is the length of the dataset (the number of products across all markets).
\(F\) is the number of firms, which we won’t use in this example.
\(K_1\) is the dimension of the linear demand parameters.
\(M_D\) is the dimension of the instrument variables (excluded instruments and exogenous regressors).
\(E_D\) is the number of fixed effect dimensions (one-dimensional fixed effects, two-dimensional fixed effects, etc.).
There is only a single Formulation for this model.
\(X_1\) is the linear component of utility for demand and depends only on prices (after the fixed effects are removed).
Solving the Problem¶
The Problem.solve method always returns a ProblemResults class, which can be used to compute post-estimation outputs. See the post estimation tutorial for more information.
[5]:
logit_results = problem.solve()
logit_results
[5]:
Problem Results Summary:
==========================================
GMM Objective Clipped Weighting Matrix
Step Value Shares Condition Number
---- --------- ------- ----------------
2 +1.9E+02 0 +5.7E+07
==========================================
Cumulative Statistics:
========================
Computation Objective
Time Evaluations
----------- -----------
00:00:00 2
========================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.0E+01
(+1.0E+00)
==========
Theory of Nested Logit¶
We can extend the logit model to allow for correlation within a group \(h\) so that
Now, we require that \(\epsilon_{jti} = \bar{\epsilon}_{h(j)ti} + (1 - \rho) \bar{\epsilon}_{jti}\) is distributed IID with the Type I Extreme Value (Gumbel) distribution. As \(\rho \rightarrow 1\), all consumers stay within their group. As \(\rho \rightarrow 0\), this collapses to the IIA logit. Note that if we wanted, we could allow \(\rho\) to differ between groups with the notation \(\rho_{h(j)}\).
This gives us aggregate market shares as the product of two logits, the within group logit and the across group logit:
where \(V_{jt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}\).
After some work we again obtain the linear estimating equation:
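A standard way to write this estimating equation, consistent with the notation above, is
\[\log s_{jt} - \log s_{0t} = \alpha p_{jt} + x_{jt}\beta^\text{ex} + \rho \log s_{j|h(j)t} + \xi_{jt},\]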
where \(s_{j|h(j)t} = s_{jt} / s_{h(j)t}\) and \(s_{h(j)t}\) is the share of group \(h\) in market \(t\). See Berry (1994) or Cardell (1997) for more information.
Again, the left hand side is data, though the \(\ln s_{j|h(j)t}\) is clearly endogenous, which means we must instrument for it. Rather than include \(\ln s_{j|h(j)t}\) along with the linear components of utility, \(X_1\), whenever nesting_ids are included in product_data, \(\rho\) is treated as a nonlinear \(X_2\) parameter. This means that the linear component is given instead by
This is done for two reasons:
It forces the user to treat \(\rho\) as an endogenous parameter.
It extends much more easily to the RCNL model of Brenkers and Verboven (2006).
A common choice for an additional instrument is the number of products per nest.
Application of Nested Logit¶
By including nesting_ids (another reserved name) as a field in product_data, we tell the package to estimate a nested logit model, and we don't need to change any of the formulas. We show how to construct the category groupings in two different ways:
We put all products in a single nest (only the outside good in the other nest).
We put products into two nests (either mushy or non-mushy).
We also construct an additional instrument based on the number of products per nest. Typically this is useful as a source of exogenous variation in the within group share \(\ln s_{j|h(j)t}\). However, in this example the number of products per nest does not vary across markets, so if we include product fixed effects, this instrument is irrelevant.
We'll define a function that constructs the additional instrument and solves the nested logit problem. We'll exclude product ID fixed effects, which are collinear with mushy, and we'll choose \(\rho = 0.7\) as the initial value at which the optimization routine will start.
[6]:
def solve_nl(df):
    # Count the products in each market-nest pair and use the count as an
    # additional excluded instrument.
    groups = df.groupby(['market_ids', 'nesting_ids'])
    df['demand_instruments20'] = groups['shares'].transform(np.size)
    # Exclude the constant, leaving only prices in X1.
    nl_formulation = pyblp.Formulation('0 + prices')
    problem = pyblp.Problem(nl_formulation, df)
    # Start the optimization over the nesting parameter at rho = 0.7.
    return problem.solve(rho=0.7)
First, we’ll solve the problem when there’s a single nest for all products, with the outside good in its own nest.
[7]:
df1 = product_data.copy()
df1['nesting_ids'] = 1
nl_results1 = solve_nl(df1)
nl_results1
[7]:
Problem Results Summary:
======================================================================================
GMM Objective Projected Reduced Clipped Weighting Matrix Covariance Matrix
Step Value Gradient Norm Hessian Shares Condition Number Condition Number
---- --------- ------------- -------- ------- ---------------- -----------------
2 +2.0E+02 +1.2E-09 +1.1E+04 0 +2.0E+09 +3.0E+04
======================================================================================
Cumulative Statistics:
=================================================
Computation Optimizer Optimization Objective
Time Converged Iterations Evaluations
----------- --------- ------------ -----------
00:00:04 Yes 3 8
=================================================
Rho Estimates (Robust SEs in Parentheses):
==========
All Groups
----------
+9.8E-01
(+1.4E-02)
==========
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-1.2E+00
(+4.0E-01)
==========
When we inspect the Problem, the only changes from the plain logit model are the additional instrument that contributes to \(M_D\) and the inclusion of \(H\), the number of nesting categories.
[8]:
nl_results1.problem
[8]:
Dimensions:
===============================
T N F K1 MD H
--- ---- --- ---- ---- ---
94 2256 5 1 21 1
===============================
Formulations:
==================================
Column Indices: 0
-------------------------- ------
X1: Linear Characteristics prices
==================================
Next, we’ll solve the problem when there are two nests for mushy and non-mushy.
[9]:
df2 = product_data.copy()
df2['nesting_ids'] = df2['mushy']
nl_results2 = solve_nl(df2)
nl_results2
[9]:
Problem Results Summary:
======================================================================================
GMM Objective Projected Reduced Clipped Weighting Matrix Covariance Matrix
Step Value Gradient Norm Hessian Shares Condition Number Condition Number
---- --------- ------------- -------- ------- ---------------- -----------------
2 +6.9E+02 +3.0E-10 +5.6E+03 0 +5.1E+08 +2.0E+04
======================================================================================
Cumulative Statistics:
=================================================
Computation Optimizer Optimization Objective
Time Converged Iterations Evaluations
----------- --------- ------------ -----------
00:00:03 Yes 3 8
=================================================
Rho Estimates (Robust SEs in Parentheses):
==========
All Groups
----------
+8.9E-01
(+1.9E-02)
==========
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-7.8E+00
(+4.8E-01)
==========
For both cases we find that \(\hat{\rho} > 0.8\).
Finally, we’ll also look at the adjusted parameter on prices, \(\alpha / (1-\rho)\).
[10]:
nl_results1.beta[0] / (1 - nl_results1.rho)
[10]:
array([[-67.39338888]])
[11]:
nl_results2.beta[0] / (1 - nl_results2.rho)
[11]:
array([[-72.27074638]])
Download the Jupyter Notebook for this section: nevo.ipynb
Random Coefficients Logit Tutorial with the Fake Cereal Data¶
[1]:
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'
In this tutorial, we’ll use data from Nevo (2000a) to solve the paper’s fake cereal problem. Locations of CSV files that contain the data are in the data module.
Theory of Random Coefficients Logit¶
The random coefficients model extends the plain logit model by allowing for correlated tastes for different product characteristics. In this model (indirect) utility is given by
The main addition is that \(\beta_i = (\alpha_i, \beta_i^\text{ex})\) have individual specific subscripts \(i\).
Conditional on \(\beta_i\), the individual market shares follow the same logit form as before. But now we must integrate over heterogeneous individuals to get the aggregate market share:
In general, this integral needs to be calculated numerically. This also means that we can’t directly linearize the model. It is common to re-parametrize the model to separate the aspects of mean utility that all individuals agree on, \(\delta_{jt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}\), from the individual specific heterogeneity, \(\mu_{ijt}(\theta)\). This gives us
Given a guess of \(\theta\) we can solve the system of nonlinear equations for the vector \(\delta\) which equates observed and predicted market share \(s_{jt} = s_{jt}(\delta, \theta)\). Now we can perform a linear IV GMM regression of the form
The moments are constructed by interacting the predicted residuals \(\hat{\xi}_{jt}(\theta)\) with instruments \(z_{jt}\) to form
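Using the notation above, these moments take the standard form
\[\bar{g}(\theta) = \frac{1}{N}\sum_{j,t} z_{jt}'\hat{\xi}_{jt}(\theta),\]
which is then weighted in a GMM objective as described in the Background section.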
Random Coefficients¶
To include random coefficients we need to add a Formulation for the demand-side nonlinear characteristics \(X_2\).
Just like in the logit case we have the same reserved field names in product_data:
market_ids are the unique market identifiers which we subscript \(t\).
shares specifies the market shares, which need to be between zero and one, and within a market ID, \(\sum_{j} s_{jt} < 1\).
prices are prices \(p_{jt}\). These have some special properties and are always treated as endogenous.
demand_instruments0, demand_instruments1, and so on are numbered demand instruments. These represent only the excluded instruments. The exogenous regressors in \(X_1\) (of which \(X_2\) is typically a subset) will be automatically added to the set of instruments.
We proceed with the following steps:
Load the product data which at a minimum consists of market_ids, shares, prices, and at least a single column of demand instruments, demand_instruments0.
Define a Formulation for the \(X_1\) (linear) demand model.
This and all other formulas are similar to R and patsy formulas.
It includes a constant by default. To exclude the constant, specify either a 0 or a -1.
To efficiently include fixed effects, use the absorb option and specify which categorical variables you would like to absorb.
Some model reduction may happen automatically. The constant will be excluded if you include fixed effects and some precautions are taken against collinearity. However, you will have to make sure that differently-named variables are not collinear.
Define a Formulation for the \(X_2\) (nonlinear) demand model.
Include only the variables over which we want random coefficients.
Do not absorb or include fixed effects.
It will include a random coefficient on the constant (to capture inside good vs. outside good preference) unless you specify not to with a 0 or a -1.
Define an Integration configuration to solve the market share integral from several available options:
Monte Carlo integration (pseudo-random draws).
Product rule quadrature.
Sparse grid quadrature.
Combine Formulation classes, product_data, and the Integration configuration to construct a Problem.
Use the Problem.solve method to estimate parameters.
It is required to specify an initial guess of the nonlinear parameters. This serves two primary purposes: speeding up estimation and indicating to the solver through initial values of zero which parameters are restricted to be always zero.
Specification of Random Taste Parameters¶
It is common to assume that \(f(\beta_i \mid \theta)\) follows a multivariate normal distribution, and to break it up into three parts:
A mean \(K_1 \times 1\) taste which all individuals agree on, \(\beta\).
A \(K_2 \times K_2\) covariance matrix, \(V\). As is common with multivariate normal distributions, \(V\) is not estimated directly. Rather, its matrix square (Cholesky) root \(\Sigma\) is estimated where \(\Sigma\Sigma' = V\).
Any \(K_2 \times D\) interactions, \(\Pi\), with observed \(D \times 1\) demographic data, \(d_i\).
Together this gives us that
Problem.solve takes an initial guess \(\Sigma_0\) of \(\Sigma\). It guarantees that \(\hat{\Sigma}\) (the estimated parameters) will have the same sparsity structure as \(\Sigma_0\), so any zero element of \(\Sigma_0\) is restricted to be zero in the solution \(\hat{\Sigma}\). For example, a popular restriction is that \(\Sigma\) is diagonal; this can be achieved by passing a diagonal matrix as \(\Sigma_0\).
Loading Data¶
The product_data argument of Problem should be a structured array-like object with fields that store data. Product data can be a structured NumPy array, a pandas DataFrame, or other similar objects.
[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids | city_ids | quarter | product_ids | firm_ids | brand_ids | shares | prices | sugar | mushy | ... | demand_instruments10 | demand_instruments11 | demand_instruments12 | demand_instruments13 | demand_instruments14 | demand_instruments15 | demand_instruments16 | demand_instruments17 | demand_instruments18 | demand_instruments19 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | C01Q1 | 1 | 1 | F1B04 | 1 | 4 | 0.012417 | 0.072088 | 2 | 1 | ... | 2.116358 | -0.154708 | -0.005796 | 0.014538 | 0.126244 | 0.067345 | 0.068423 | 0.034800 | 0.126346 | 0.035484 |
1 | C01Q1 | 1 | 1 | F1B06 | 1 | 6 | 0.007809 | 0.114178 | 18 | 1 | ... | -7.374091 | -0.576412 | 0.012991 | 0.076143 | 0.029736 | 0.087867 | 0.110501 | 0.087784 | 0.049872 | 0.072579 |
2 | C01Q1 | 1 | 1 | F1B07 | 1 | 7 | 0.012995 | 0.132391 | 4 | 1 | ... | 2.187872 | -0.207346 | 0.003509 | 0.091781 | 0.163773 | 0.111881 | 0.108226 | 0.086439 | 0.122347 | 0.101842 |
3 | C01Q1 | 1 | 1 | F1B09 | 1 | 9 | 0.005770 | 0.130344 | 3 | 0 | ... | 2.704576 | 0.040748 | -0.003724 | 0.094732 | 0.135274 | 0.088090 | 0.101767 | 0.101777 | 0.110741 | 0.104332 |
4 | C01Q1 | 1 | 1 | F1B11 | 1 | 11 | 0.017934 | 0.154823 | 12 | 0 | ... | 1.261242 | 0.034836 | -0.000568 | 0.102451 | 0.130640 | 0.084818 | 0.101075 | 0.125169 | 0.133464 | 0.121111 |
5 rows × 30 columns
The product data contains market_ids, product_ids, firm_ids, shares, prices, a number of other firm IDs and product characteristics, and some pre-computed excluded demand_instruments0, demand_instruments1, and so on. The product_ids will be incorporated as fixed effects.
For more information about the instruments and the example data as a whole, refer to the data module.
Setting Up and Solving the Problem Without Demographics¶
Formulations, product data, and an integration configuration are collectively used to initialize a Problem. Once initialized, Problem.solve runs the estimation routine. The arguments to Problem.solve configure how estimation is performed. For example, optimization and iteration arguments configure the optimization and iteration routines that are used by the outer and inner loops of estimation.
We’ll specify Formulation configurations for \(X_1\), the demand-side linear characteristics, and \(X_2\), the nonlinear characteristics.
The formulation for \(X_1\) consists of prices and fixed effects constructed from product_ids, which we will absorb using the absorb argument of Formulation.
If we were interested in reporting estimates for each fixed effect, we could replace the formulation for \(X_1\) with Formulation('prices + C(product_ids)').
Because sugar, mushy, and the constant are collinear with product_ids, we can include them in \(X_2\) but not in \(X_1\).
[3]:
X1_formulation = pyblp.Formulation('0 + prices', absorb='C(product_ids)')
X2_formulation = pyblp.Formulation('1 + prices + sugar + mushy')
product_formulations = (X1_formulation, X2_formulation)
product_formulations
[3]:
(prices + Absorb[C(product_ids)], 1 + prices + sugar + mushy)
We also need to specify an Integration configuration. We consider two alternatives:
Monte Carlo draws: we simulate 50 individuals from a random normal distribution. This is just for simplicity. In practice quasi-Monte Carlo sequences such as Halton draws are preferable, and there should generally be many more simulated individuals than just 50.
Product rules: we construct nodes and weights according to a product rule that exactly integrates polynomials of degree \(5 \times 2 - 1 = 9\) or less.
[4]:
mc_integration = pyblp.Integration('monte_carlo', size=50, specification_options={'seed': 0})
mc_integration
[4]:
Configured to construct nodes and weights with Monte Carlo simulation with options {seed: 0}.
[5]:
pr_integration = pyblp.Integration('product', size=5)
pr_integration
[5]:
Configured to construct nodes and weights according to the level-5 Gauss-Hermite product rule with options {}.
[6]:
mc_problem = pyblp.Problem(product_formulations, product_data, integration=mc_integration)
mc_problem
[6]:
Dimensions:
============================================
T N F I K1 K2 MD ED
--- ---- --- ---- ---- ---- ---- ----
94 2256 5 4700 1 4 20 1
============================================
Formulations:
===========================================================
Column Indices: 0 1 2 3
----------------------------- ------ ------ ----- -----
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
===========================================================
[7]:
pr_problem = pyblp.Problem(product_formulations, product_data, integration=pr_integration)
pr_problem
[7]:
Dimensions:
=============================================
T N F I K1 K2 MD ED
--- ---- --- ----- ---- ---- ---- ----
94 2256 5 58750 1 4 20 1
=============================================
Formulations:
===========================================================
Column Indices: 0 1 2 3
----------------------------- ------ ------ ----- -----
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
===========================================================
As an illustration of how to configure the optimization routine, we’ll use a simpler, non-default Optimization configuration that doesn’t support parameter bounds, and use a relatively loose tolerance so the problems are solved quickly. In practice along with more integration draws, it’s a good idea to use a tighter termination tolerance.
[8]:
bfgs = pyblp.Optimization('bfgs', {'gtol': 1e-4})
bfgs
[8]:
Configured to optimize using the BFGS algorithm implemented in SciPy with analytic gradients and options {gtol: +1.0E-04}.
We estimate three versions of the model:
An unrestricted covariance matrix for random tastes using Monte Carlo integration.
An unrestricted covariance matrix for random tastes using the product rule.
A restricted diagonal matrix for random tastes using Monte Carlo Integration.
Notice that the only thing that changes when we estimate the restricted covariance is our initial guess of \(\Sigma_0\). The elements above the diagonal in this initial guess are ignored because we are optimizing over the lower-triangular Cholesky root of \(V = \Sigma\Sigma'\).
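As a sketch of the third, restricted specification listed above, the initial guess could simply be a diagonal matrix (the variable name here is illustrative, and the solve call is shown for reference only):
# Zeros off the diagonal restrict those elements of Sigma to remain zero.
diagonal_sigma0 = np.eye(4)
# restricted_results = mc_problem.solve(sigma=diagonal_sigma0, optimization=bfgs)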
[9]:
results1 = mc_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)
results1
[9]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
2 +1.5E+02 +8.7E-05 +8.5E-02 +6.5E+03 0 +5.2E+07 +8.3E+05
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:01:48 Yes 58 75 82256 252902
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=========================================================================================================================
Sigma: 1 prices sugar mushy | Sigma Squared: 1 prices sugar mushy
------ ---------- ---------- ---------- ---------- | -------------- ---------- ---------- ---------- ----------
1 +1.2E+00 | 1 +1.5E+00 -1.4E+01 +7.3E-02 -7.1E-01
(+3.0E+00) | (+7.2E+00) (+5.2E+01) (+2.2E-01) (+2.3E+00)
|
prices -1.1E+01 +8.4E+00 | prices -1.4E+01 +2.0E+02 -1.5E+00 +1.5E+00
(+1.8E+01) (+1.2E+01) | (+5.2E+01) (+3.1E+02) (+1.2E+00) (+1.5E+01)
|
sugar +6.1E-02 -9.1E-02 +3.8E-02 | sugar +7.3E-02 -1.5E+00 +1.3E-02 +2.0E-02
(+2.5E-01) (+2.3E-01) (+8.3E-02) | (+2.2E-01) (+1.2E+00) (+2.8E-02) (+2.7E-01)
|
mushy -5.9E-01 -6.2E-01 -2.3E-02 +4.8E-01 | mushy -7.1E-01 +1.5E+00 +2.0E-02 +9.6E-01
(+2.1E+00) (+1.5E+00) (+2.5E+00) (+1.3E+00) | (+2.3E+00) (+1.5E+01) (+2.7E-01) (+4.0E+00)
=========================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.1E+01
(+6.0E+00)
==========
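As a quick check of the Cholesky parameterization noted above, the reported “Sigma Squared” block is simply \(\hat{\Sigma}\hat{\Sigma}'\) built from the estimated lower-triangular \(\hat{\Sigma}\). A minimal sketch using the results object:
# The estimated sigma is the lower-triangular Cholesky root of the covariance
# matrix; its outer product reproduces the "Sigma Squared" block reported above.
cholesky_root = results1.sigma
cholesky_root @ cholesky_root.T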
[10]:
results2 = pr_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)
results2
[10]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
2 +1.6E+02 +1.1E-05 +1.6E-02 +5.3E+03 0 +5.3E+07 +5.0E+20
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:04:57 No 63 130 96884 301280
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=========================================================================================================================
Sigma: 1 prices sugar mushy | Sigma Squared: 1 prices sugar mushy
------ ---------- ---------- ---------- ---------- | -------------- ---------- ---------- ---------- ----------
1 -7.4E-01 | 1 +5.5E-01 -9.4E+00 +8.3E-02 -1.1E-01
(+2.3E+00) | (+3.4E+00) (+3.5E+01) (+1.6E-01) (+6.4E-01)
|
prices +1.3E+01 +6.6E-06 | prices -9.4E+00 +1.6E+02 -1.4E+00 +1.9E+00
(+7.5E+00) (+2.7E+03) | (+3.5E+01) (+1.9E+02) (+8.0E-01) (+8.9E+00)
|
sugar -1.1E-01 -7.4E-08 -8.9E-09 | sugar +8.3E-02 -1.4E+00 +1.2E-02 -1.7E-02
(+2.0E-01) (+2.2E+05) (+5.2E+04) | (+1.6E-01) (+8.0E-01) (+2.2E-02) (+1.6E-01)
|
mushy +1.5E-01 -4.3E-07 +1.6E-07 +4.7E-08 | mushy -1.1E-01 +1.9E+00 -1.7E-02 +2.2E-02
(+6.8E-01) (+5.0E+03) (+7.0E+02) (+4.3E+02) | (+6.4E-01) (+8.9E+00) (+1.6E-01) (+2.0E-01)
=========================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.1E+01
(+4.0E+00)
==========
[11]:
results3 = mc_problem.solve(sigma=np.eye(4), optimization=bfgs)
results3
[11]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
2 +1.8E+02 +1.3E-06 +1.1E+00 +6.0E+03 0 +5.8E+07 +4.8E+04
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:00:28 Yes 16 24 19529 60447
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
======================================================
Sigma: 1 prices sugar mushy
------ ---------- ---------- ---------- ----------
1 +5.2E-02
(+1.1E+00)
prices +0.0E+00 -4.3E-01
(+8.0E+00)
sugar +0.0E+00 +0.0E+00 +3.6E-02
(+5.8E-02)
mushy +0.0E+00 +0.0E+00 +0.0E+00 +5.0E-01
(+1.7E+00)
======================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.0E+01
(+1.4E+00)
==========
We see that all three models give similar estimates of the price coefficient \(\hat{\alpha} \approx -30\). Note that a few of the estimated terms on the diagonal of \(\Sigma\) are negative. Since the diagonal consists of standard deviations, negative values are unrealistic. When using another optimization routine that supports bounds (like the default L-BFGS-B routine), these diagonal elements are by default bounded from below by zero.
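For example, re-solving the restricted model without overriding the optimizer would use this default bounded routine. This is a sketch only; we do not run it here because it is slower than the loose-tolerance BFGS runs above.
# With the default L-BFGS-B routine, the diagonal elements of sigma are bounded
# from below by zero, so negative standard deviations cannot occur.
bounded_results = mc_problem.solve(sigma=np.eye(4))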
Adding Demographics to the Problem¶
To add demographic data we need to make two changes:
We need to load agent_data, which for this cereal problem contains pre-computed Monte Carlo draws and demographics.
We need to add an agent_formulation to the model.
The agent data has several reserved column names.
market_ids are the indices that link the agent data to the market_ids in the product data.
weights are the weights \(w_{it}\) attached to each agent. In each market, these should sum to one so that \(\sum_i w_{it} = 1\). It is often the case that \(w_{it} = 1 / I_t\), where \(I_t\) is the number of agents in market \(t\), so that each agent gets equal weight. Other possibilities include quadrature nodes and weights.
nodes0, nodes1, and so on are the nodes at which the unobserved agent tastes \(\mu_{ijt}\) are evaluated. The nodes should be labeled from \(0, \ldots, K_2 - 1\), where \(K_2\) is the number of random coefficients.
Other fields are the realizations of the demographics \(d\) themselves.
[12]:
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)
agent_data.head()
[12]:
market_ids | city_ids | quarter | weights | nodes0 | nodes1 | nodes2 | nodes3 | income | income_squared | age | child | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | C01Q1 | 1 | 1 | 0.05 | 0.434101 | -1.500838 | -1.151079 | 0.161017 | 0.495123 | 8.331304 | -0.230109 | -0.230851 |
1 | C01Q1 | 1 | 1 | 0.05 | -0.726649 | 0.133182 | -0.500750 | 0.129732 | 0.378762 | 6.121865 | -2.532694 | 0.769149 |
2 | C01Q1 | 1 | 1 | 0.05 | -0.623061 | -0.138241 | 0.797441 | -0.795549 | 0.105015 | 1.030803 | -0.006965 | -0.230851 |
3 | C01Q1 | 1 | 1 | 0.05 | -0.041317 | 1.257136 | -0.683054 | 0.259044 | -1.485481 | -25.583605 | -0.827946 | 0.769149 |
4 | C01Q1 | 1 | 1 | 0.05 | -0.466691 | 0.226968 | 1.044424 | 0.092019 | -0.316597 | -6.517009 | -0.230109 | -0.230851 |
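As a quick sanity check (a minimal sketch), the weights indeed sum to one within each market:
# Each of the 94 markets has 20 agents with weight 0.05, so each sum should be 1.
agent_data.groupby('market_ids')['weights'].sum().head()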
The agent formulation
tells us which columns of demographic information to interact with \(X_2\).
[13]:
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')
agent_formulation
[13]:
income + income_squared + age + child
This tells us to include demographic realizations for income
, income_squared
, age
, and the presence of children, child
, but to ignore other possible demographics when interacting demographics with \(X_2\). We should also generally exclude the constant from the demographic formula.
Now we configure the Problem to include the agent_formulation
and agent_data
, which follow the product_formulations
and product_data
.
When we display the class, it lists the demographic interactions in the table of formulations and reports \(D = 4\), the dimension of the demographic draws.
[14]:
nevo_problem = pyblp.Problem(
product_formulations,
product_data,
agent_formulation,
agent_data
)
nevo_problem
[14]:
Dimensions:
=================================================
T N F I K1 K2 D MD ED
--- ---- --- ---- ---- ---- --- ---- ----
94 2256 5 1880 1 4 4 20 1
=================================================
Formulations:
===================================================================
Column Indices: 0 1 2 3
----------------------------- ------ -------------- ----- -----
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
d: Demographics income income_squared age child
===================================================================
The initialized problem can be solved with Problem.solve. We’ll use the same starting values as Nevo (2000a). By passing a diagonal matrix as starting values for \(\Sigma\), we’re choosing to ignore covariance terms. Similarly, zeros in the starting values for \(\Pi\) mean that those parameters will be fixed at zero.
To replicate common estimates, we’ll use the non-default BFGS optimization routine (with a slightly tighter tolerance to avoid getting stuck at a spurious local minimum), and we’ll configure Problem.solve to use 1-step GMM instead of 2-step GMM.
[15]:
initial_sigma = np.diag([0.3302, 2.4526, 0.0163, 0.2441])
initial_pi = np.array([
[ 5.4819, 0, 0.2037, 0 ],
[15.8935, -1.2000, 0, 2.6342],
[-0.2506, 0, 0.0511, 0 ],
[ 1.2650, 0, -0.8091, 0 ]
])
tighter_bfgs = pyblp.Optimization('bfgs', {'gtol': 1e-5})
nevo_results = nevo_problem.solve(
initial_sigma,
initial_pi,
optimization=tighter_bfgs,
method='1s'
)
nevo_results
[15]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
1 +4.6E+00 +6.9E-06 +3.3E-05 +1.6E+04 0 +6.9E+07 +8.4E+08
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:01:12 Yes 51 57 46389 143977
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy | Pi: income income_squared age child
------ ---------- ---------- ---------- ---------- | ------ ---------- -------------- ---------- ----------
1 +5.6E-01 | 1 +2.3E+00 +0.0E+00 +1.3E+00 +0.0E+00
(+1.6E-01) | (+1.2E+00) (+6.3E-01)
|
prices +0.0E+00 +3.3E+00 | prices +5.9E+02 -3.0E+01 +0.0E+00 +1.1E+01
(+1.3E+00) | (+2.7E+02) (+1.4E+01) (+4.1E+00)
|
sugar +0.0E+00 +0.0E+00 -5.8E-03 | sugar -3.8E-01 +0.0E+00 +5.2E-02 +0.0E+00
(+1.4E-02) | (+1.2E-01) (+2.6E-02)
|
mushy +0.0E+00 +0.0E+00 +0.0E+00 +9.3E-02 | mushy +7.5E-01 +0.0E+00 -1.4E+00 +0.0E+00
(+1.9E-01) | (+8.0E-01) (+6.7E-01)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-6.3E+01
(+1.5E+01)
==========
Results are similar to those in the original paper with a (scaled) objective value of \(q(\hat{\theta}) = 4.65\) and a price coefficient of \(\hat{\alpha} = -62.7\).
Restricting Parameters¶
Because the interactions between price
, income
, and income_squared
are potentially collinear, we might worry that \(\hat{\Pi}_{21} = 588\) and \(\hat{\Pi}_{22} = -30.2\) are pushing the price coefficient in opposite directions. Both are large in magnitude but statistically insignificant. One way of dealing with this is to restrict \(\Pi_{22} = 0\).
There are two ways we can do this:
Change the initial \(\Pi_0\) values to make this term zero.
Change the agent formula to drop
income_squared
.
First, we’ll change the initial \(\Pi_0\) values.
[16]:
restricted_pi = initial_pi.copy()
restricted_pi[1, 1] = 0
nevo_problem.solve(
initial_sigma,
restricted_pi,
optimization=tighter_bfgs,
method='1s'
)
[16]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
1 +1.5E+01 +5.2E-06 +4.7E-02 +1.7E+04 0 +6.9E+07 +5.7E+05
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:00:53 Yes 34 40 34390 106492
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy | Pi: income income_squared age child
------ ---------- ---------- ---------- ---------- | ------ ---------- -------------- ---------- ----------
1 +3.7E-01 | 1 +3.1E+00 +0.0E+00 +1.2E+00 +0.0E+00
(+1.2E-01) | (+1.1E+00) (+1.0E+00)
|
prices +0.0E+00 +1.8E+00 | prices +4.2E+00 +0.0E+00 +0.0E+00 +1.2E+01
(+9.2E-01) | (+4.6E+00) (+5.2E+00)
|
sugar +0.0E+00 +0.0E+00 -4.4E-03 | sugar -1.9E-01 +0.0E+00 +2.8E-02 +0.0E+00
(+1.2E-02) | (+3.5E-02) (+3.2E-02)
|
mushy +0.0E+00 +0.0E+00 +0.0E+00 +8.6E-02 | mushy +1.5E+00 +0.0E+00 -1.5E+00 +0.0E+00
(+1.9E-01) | (+6.5E-01) (+1.1E+00)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.2E+01
(+2.3E+00)
==========
Now we’ll drop both income_squared
and the corresponding column in \(\Pi_0\).
[17]:
restricted_formulation = pyblp.Formulation('0 + income + age + child')
deleted_pi = np.delete(initial_pi, 1, axis=1)
restricted_problem = pyblp.Problem(
product_formulations,
product_data,
restricted_formulation,
agent_data
)
restricted_problem.solve(
initial_sigma,
deleted_pi,
optimization=tighter_bfgs,
method='1s'
)
[17]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
1 +1.5E+01 +5.2E-06 +4.7E-02 +1.7E+04 0 +6.9E+07 +5.7E+05
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:00:55 Yes 34 40 34371 106443
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================
Sigma: 1 prices sugar mushy | Pi: income age child
------ ---------- ---------- ---------- ---------- | ------ ---------- ---------- ----------
1 +3.7E-01 | 1 +3.1E+00 +1.2E+00 +0.0E+00
(+1.2E-01) | (+1.1E+00) (+1.0E+00)
|
prices +0.0E+00 +1.8E+00 | prices +4.2E+00 +0.0E+00 +1.2E+01
(+9.2E-01) | (+4.6E+00) (+5.2E+00)
|
sugar +0.0E+00 +0.0E+00 -4.4E-03 | sugar -1.9E-01 +2.8E-02 +0.0E+00
(+1.2E-02) | (+3.5E-02) (+3.2E-02)
|
mushy +0.0E+00 +0.0E+00 +0.0E+00 +8.6E-02 | mushy +1.5E+00 -1.5E+00 +0.0E+00
(+1.9E-01) | (+6.5E-01) (+1.1E+00)
=====================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.2E+01
(+2.3E+00)
==========
The parameter estimates and standard errors are identical for both approaches. Based on the number of fixed point iterations, there is some evidence that the solver took a slightly different path for each problem, but both restricted problems arrived at identical answers. The \(\hat{\Pi}_{21}\) interaction between prices and income is still insignificant.
Download the Jupyter Notebook for this section: blp.ipynb
Supply Side Tutorial with Automobile Data¶
[1]:
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'
In this tutorial, we’ll use data from Berry, Levinsohn, and Pakes (1995) to solve the paper’s automobile problem. This tutorial is similar to the fake cereal tutorial, but exhibits some other features of pyblp:
Incorporating a supply side into demand estimation.
Allowing for simple price-income demographic effects.
Calculating clustered standard errors.
Loading Data¶
We’ll use pandas to load two sets of data:
product_data
, which contains prices, shares, and other product characteristics.agent_data
, which contains draws from the distribution of heterogeneity.
[2]:
product_data = pd.read_csv(pyblp.data.BLP_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids | clustering_ids | car_ids | firm_ids | region | shares | prices | hpwt | air | mpd | ... | supply_instruments2 | supply_instruments3 | supply_instruments4 | supply_instruments5 | supply_instruments6 | supply_instruments7 | supply_instruments8 | supply_instruments9 | supply_instruments10 | supply_instruments11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1971 | AMGREM71 | 129 | 15 | US | 0.001051 | 4.935802 | 0.528997 | 0 | 1.888146 | ... | 0.0 | 1.705933 | 1.595656 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.888146 |
1 | 1971 | AMHORN71 | 130 | 15 | US | 0.000670 | 5.516049 | 0.494324 | 0 | 1.935989 | ... | 0.0 | 1.680910 | 1.490295 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.935989 |
2 | 1971 | AMJAVL71 | 132 | 15 | US | 0.000341 | 7.108642 | 0.467613 | 0 | 1.716799 | ... | 0.0 | 1.801067 | 1.357703 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.716799 |
3 | 1971 | AMMATA71 | 134 | 15 | US | 0.000522 | 6.839506 | 0.426540 | 0 | 1.687871 | ... | 0.0 | 1.818061 | 1.261347 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.687871 |
4 | 1971 | AMAMBS71 | 136 | 15 | US | 0.000442 | 8.928395 | 0.452489 | 0 | 1.504286 | ... | 0.0 | 1.933210 | 1.237365 | 87.0 | -61.959985 | 0.0 | 46.060389 | 29.786989 | 0.0 | 1.504286 |
5 rows × 33 columns
The product_data
contains market IDs, product IDs, firm IDs, shares, prices, a number of product characteristics, and instruments. The product IDs are called clustering_ids
because they will be used to compute clustered standard errors. For more information about the instruments and the example data as a whole, refer to the data module.
The agent_data
contains market IDs, integration weights \(w_{it}\), integration nodes \(\nu_{it}\), and demographics \(d_{it}\). Here we use the \(I_t = 200\) importance sampling weights and nodes from the original paper.
In non-example problems, it is usually a better idea to use many more draws, or a more sophisticated Integration configuration such as sparse grid quadrature.
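For example, a sparse grid configuration could be built just like the other Integration configurations in these tutorials. This is a sketch only; we use the original paper's importance sampling draws below, and the 'grid' specification with size=7 is an illustrative assumption.
sparse_grid_integration = pyblp.Integration('grid', size=7)
sparse_grid_integration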
[3]:
agent_data = pd.read_csv(pyblp.data.BLP_AGENTS_LOCATION)
agent_data.head()
[3]:
market_ids | weights | nodes0 | nodes1 | nodes2 | nodes3 | nodes4 | income | |
---|---|---|---|---|---|---|---|---|
0 | 1971 | 0.000543 | 1.192188 | 0.478777 | 0.980830 | -0.824410 | 2.473301 | 109.560369 |
1 | 1971 | 0.000723 | 1.497074 | -2.026204 | -1.741316 | 1.412568 | -0.747468 | 45.457314 |
2 | 1971 | 0.000544 | 1.438081 | 0.813280 | -1.749974 | -1.203509 | 0.049558 | 127.146548 |
3 | 1971 | 0.000701 | 1.768655 | -0.177453 | 0.286602 | 0.391517 | 0.683669 | 22.604045 |
4 | 1971 | 0.000549 | 0.849970 | -0.135337 | 0.735920 | 1.036247 | -1.143436 | 170.226032 |
Setting up the Problem¶
Unlike the fake cereal problem, we won’t absorb any fixed effects in the automobile problem, so the linear part of demand \(X_1\) has more components. We also need to specify a formula for the random coefficients \(X_2\), including a random coefficient on the constant, which captures correlation among all inside goods.
The primary new addition to the model relative to the fake cereal problem is that we add a supply side formula for product characteristics that contribute to marginal costs, \(X_3\). The patsy-style formulas support functions of regressors such as the log
function used below.
We stack the three product formulations in order: \(X_1\), \(X_2\), and \(X_3\).
[4]:
product_formulations = (
pyblp.Formulation('1 + hpwt + air + mpd + space'),
pyblp.Formulation('1 + prices + hpwt + air + mpd + space'),
pyblp.Formulation('1 + log(hpwt) + air + log(mpg) + log(space) + trend')
)
product_formulations
[4]:
(1 + hpwt + air + mpd + space,
1 + prices + hpwt + air + mpd + space,
1 + log(hpwt) + air + log(mpg) + log(space) + trend)
The original specification for the automobile problem includes the term \(\log(y_i - p_j)\), in which \(y\) is income and \(p\) are prices. Instead of including this term, which gives rise to a host of numerical problems, we’ll follow Berry, Levinsohn, and Pakes (1999) and use its first-order linear approximation, \(p_j / y_i\).
The agent formulation for demographics, \(d\), includes a column of \(1 / y_i\) values, which we’ll interact with \(p_j\). To do this, we will treat draws of \(y_i\) as demographic variables.
[5]:
agent_formulation = pyblp.Formulation('0 + I(1 / income)')
agent_formulation
[5]:
I(1 / income)
As in the cereal example, the Problem can be constructed by combining the product_formulations
, product_data
, agent_formulation
, and agent_data
. We’ll also choose the functional form of marginal costs \(c_{jt}\). A linear marginal cost specification is the default setting, so we’ll need to use the costs_type
argument of Problem to employ the log-linear specification used by
Berry, Levinsohn, and Pakes (1995).
When initializing the problem, we get a warning about integration weights not summing to one. This is because the above product data were created by the original paper with importance sampling. To disable this warning, we could increase pyblp.options.weights_tol
.
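A minimal sketch of inspecting that option (we leave it at its default here so that the warning below is still displayed; raising it, for example to np.inf, is our assumption for how to silence the warning entirely):
pyblp.options.weights_tol  # increasing this tolerance (e.g., to np.inf) silences the warning below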
[6]:
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data, costs_type='log')
problem
Integration weights in the following markets sum to a value that differs from 1 by more than options.weights_tol: all markets. Sometimes this is fine, for example when weights were built with importance sampling. Otherwise, it is a sign that there is a data problem.
[6]:
Dimensions:
=======================================================
T N F I K1 K2 K3 D MD MS
--- ---- --- ---- ---- ---- ---- --- ---- ----
20 2217 26 4000 5 6 6 1 13 18
=======================================================
Formulations:
=======================================================================================
Column Indices: 0 1 2 3 4 5
----------------------------- ---------- --------- ---- -------- ---------- -----
X1: Linear Characteristics 1 hpwt air mpd space
X2: Nonlinear Characteristics 1 prices hpwt air mpd space
X3: Log Cost Characteristics 1 log(hpwt) air log(mpg) log(space) trend
d: Demographics 1*1/income
=======================================================================================
The problem outputs a table of dimensions:
\(T\) denotes the number of markets.
\(N\) is the length of the dataset (the number of products across all markets).
\(F\) denotes the number of firms.
\(I = \sum_t I_t\) is the total number of agents across all markets (200 draws per market times 20 markets).
\(K_1\) is the number of linear demand characteristics.
\(K_2\) is the number of nonlinear demand characteristics.
\(K_3\) is the number of linear supply characteristics.
\(D\) is the number of demographic variables.
\(M_D\) is the number of demand instruments, including exogenous regressors.
\(M_S\) is the number of supply instruments, including exogenous regressors.
The formulations table describes all four formulas for demand-side linear characteristics, demand-side nonlinear characteristics, supply-side characteristics, and demographics.
Solving the Problem¶
The only remaining decisions are:
Choosing \(\Sigma\) and \(\Pi\) starting values, \(\Sigma_0\) and \(\Pi_0\).
Potentially choosing bounds for \(\Sigma\) and \(\Pi\).
The decisions we will use are:
Use published estimates as our starting values in \(\Sigma_0\).
Interact the inverse of income, \(1 / y_i\), only with prices, and use the published estimate on \(\log(y_i - p_j)\) as our starting value for \(\alpha\) in \(\Pi_0\).
Bound \(\Sigma_0\) to be positive since it is a diagonal matrix where the diagonal consists of standard deviations.
When using a routine that supports bounds, it’s usually a good idea to set your own tighter bounds so that the routine doesn’t try out large parameter values that create numerical issues.
[7]:
initial_sigma = np.diag([3.612, 0, 4.628, 1.818, 1.050, 2.056])
initial_pi = np.c_[[0, -43.501, 0, 0, 0, 0]]
Note that there are only 5 nonzeros on the diagonal of \(\Sigma\), which means that we only need 5 columns of integration nodes to integrate over these 5 dimensions of unobserved heterogeneity. Indeed, agent_data
contains exactly 5 columns of nodes. If we were to ignore the \(\log(y_i - p_j)\) term (by not configuring \(\Pi\)) and include a term on prices in \(\Sigma\) instead, we would have needed 6 columns of integration nodes in our agent_data
.
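A quick sketch confirming this in the loaded agent data:
# The automobile agent data has exactly five columns of integration nodes,
# matching the five nonzero elements on the diagonal of initial_sigma.
[column for column in agent_data.columns if column.startswith('nodes')]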
A downside of the log-linear marginal costs specification is that nonpositive estimated marginal costs can create problems for the optimization routine when computing \(\log c(\hat{\theta})\). We’ll use the costs_bounds
argument to bound marginal costs from below by a small number.
Finally, as in the original paper, we’ll use W_type
and se_type
to cluster by product IDs, which were specified as clustering_ids
in product_data
, and set initial_update=True
to update the initial GMM weighting matrix and the mean utility at the starting parameter values.
[8]:
results = problem.solve(
initial_sigma,
initial_pi,
costs_bounds=(0.001, None),
W_type='clustered',
se_type='clustered',
initial_update=True,
)
results
[8]:
Problem Results Summary:
=======================================================================================================================
GMM Objective Projected Reduced Hessian Reduced Hessian Clipped Clipped Weighting Matrix Covariance Matrix
Step Value Gradient Norm Min Eigenvalue Max Eigenvalue Shares Costs Condition Number Condition Number
---- --------- ------------- --------------- --------------- ------- ------- ---------------- -----------------
2 +5.0E+02 +1.1E-08 +4.9E-01 +5.1E+02 0 0 +4.2E+09 +3.8E+08
=======================================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:10:14 No 58 126 36807 112905
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs Adjusted for 999 Clusters in Parentheses):
===================================================================================================
Sigma: 1 prices hpwt air mpd space | Pi: 1*1/income
------ ---------- -------- ---------- ---------- ---------- ---------- | ------ ----------
1 +2.0E+00 | 1 +0.0E+00
(+6.1E+00) |
|
prices +0.0E+00 +0.0E+00 | prices -4.5E+01
| (+9.2E+00)
|
hpwt +0.0E+00 +0.0E+00 +6.1E+00 | hpwt +0.0E+00
(+2.2E+00) |
|
air +0.0E+00 +0.0E+00 +0.0E+00 +4.0E+00 | air +0.0E+00
(+2.1E+00) |
|
mpd +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +2.5E-01 | mpd +0.0E+00
(+5.5E-01) |
|
space +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +1.9E+00 | space +0.0E+00
(+1.1E+00) |
===================================================================================================
Beta Estimates (Robust SEs Adjusted for 999 Clusters in Parentheses):
==========================================================
1 hpwt air mpd space
---------- ---------- ---------- ---------- ----------
-7.3E+00 +3.5E+00 -1.0E+00 +4.2E-01 +4.2E+00
(+2.8E+00) (+1.4E+00) (+2.1E+00) (+2.5E-01) (+6.6E-01)
==========================================================
Gamma Estimates (Robust SEs Adjusted for 999 Clusters in Parentheses):
======================================================================
1 log(hpwt) air log(mpg) log(space) trend
---------- ---------- ---------- ---------- ---------- ----------
+2.8E+00 +9.0E-01 +4.2E-01 -5.2E-01 -2.6E-01 +2.7E-02
(+1.2E-01) (+7.2E-02) (+8.7E-02) (+7.3E-02) (+2.1E-01) (+3.1E-03)
======================================================================
There are some discrepancies between our results and those in the original paper, but the results are broadly similar.
Download the Jupyter Notebook for this section: petrin.ipynb
Micro Moments Tutorial with Automobile Data¶
[1]:
from IPython.display import display, HTML
display(HTML("<style>pre { white-space: pre !important; }</style>"))
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'
In this tutorial, we’ll use data from Petrin (2002) to solve the paper’s automobile problem. This tutorial is similar to the first automobile tutorial, but exhibits how to incorporate micro moments into estimation.
Loading Data¶
We’ll use pandas to load two sets of data:
product_data
, which contains prices, shares, and other product characteristics.agent_data
, which contains draws from the distribution of heterogeneity.
[2]:
product_data = pd.read_csv(pyblp.data.PETRIN_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids | clustering_ids | firm_ids | region | jp | eu | q | households | shares | prices | ... | supply_instruments6 | supply_instruments7 | supply_instruments8 | supply_instruments9 | supply_instruments10 | supply_instruments11 | supply_instruments12 | supply_instruments13 | supply_instruments14 | supply_instruments15 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1981 | 1 | 7 | EU | 0 | 1 | 18.647 | 83527 | 0.000223 | 10.379538 | ... | 9.0 | 0.0 | 0.0 | 144.0 | -151.682461 | 108.724278 | 30.0 | 32.0 | 32.0 | 460.419731 |
1 | 1981 | 2 | 7 | EU | 0 | 1 | 17.611 | 83527 | 0.000211 | 13.140814 | ... | 9.0 | 0.0 | 0.0 | 144.0 | -151.682461 | 108.724278 | 30.0 | 32.0 | 32.0 | 460.419731 |
2 | 1981 | 2 | 7 | EU | 0 | 1 | 6.139 | 83527 | 0.000073 | 19.746975 | ... | 9.0 | 0.0 | 0.0 | 144.0 | -151.682461 | 108.724278 | 30.0 | 32.0 | 32.0 | 460.419731 |
3 | 1981 | 3 | 7 | EU | 0 | 1 | 2.553 | 83527 | 0.000031 | 13.085809 | ... | 9.0 | 0.0 | 0.0 | 144.0 | -151.682461 | 108.724278 | 30.0 | 32.0 | 32.0 | 460.419731 |
4 | 1981 | 4 | 15 | US | 0 | 0 | 43.198 | 83527 | 0.000517 | 6.660066 | ... | 0.0 | 0.0 | 0.0 | 149.0 | -157.647246 | 114.055507 | 30.0 | 35.0 | 42.0 | 467.806186 |
5 rows × 62 columns
The product_data
contains market IDs, product IDs, firm IDs, shares, prices, a number of product characteristics, and instruments. The product IDs are called clustering_ids
because they will be used to compute clustered standard errors. For more information about the instruments and the example data as a whole, refer to the data module.
The agent_data
contains market IDs, integration weights \(w_{it}\), integration nodes \(\nu_{it}\), and demographics \(d_{it}\). Here we use \(I_t = 1000\) scrambled Halton draws in each market, along with demographics resampled from the Consumer Expenditure Survey (CEX) used by the original paper. These draws are slightly different from those used in the original paper (pseudo Monte Carlo draws and importance sampling). Note that following the original paper, the integration
nodes are actually draws from a truncated \(\chi^2(3)\) distribution, rather than the more typical \(N(0, 1)\) draws that we have seen in prior tutorials.
[3]:
agent_data = pd.read_csv(pyblp.data.PETRIN_AGENTS_LOCATION)
agent_data.head()
[3]:
market_ids | weights | nodes0 | nodes1 | nodes2 | nodes3 | nodes4 | nodes5 | fv | income | low | mid | high | fs | age | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1981 | 0.001 | 2.533314 | 7.496742 | 2.649343 | 3.892549 | 0.833761 | 1.928344 | 0.749785 | 10.346577 | 1 | 0 | 0 | 4 | 1 |
1 | 1981 | 0.001 | 4.422582 | 0.858539 | 1.646447 | 2.973352 | 0.033288 | 1.683242 | 5.232336 | 13.944210 | 0 | 1 | 0 | 2 | 1 |
2 | 1981 | 0.001 | 1.341509 | 5.041918 | 4.118932 | 2.166338 | 1.314582 | 0.360087 | 1.860212 | 5.898788 | 1 | 0 | 0 | 4 | 0 |
3 | 1981 | 0.001 | 3.324113 | 2.354892 | 0.802351 | 0.261043 | 3.911970 | 1.027856 | 6.980909 | 8.125445 | 1 | 0 | 0 | 2 | 0 |
4 | 1981 | 0.001 | 1.895857 | 1.807990 | 1.827797 | 4.080565 | 1.709768 | 0.707514 | 2.450663 | 34.397295 | 0 | 0 | 1 | 2 | 1 |
Setting up the Problem¶
The problem configuration is based on that of the first automobile problem. It is very similar, with both demand and supply sides, although with a few more product characteristics.
Again, we stack the three product formulations in order: \(X_1\), \(X_2\), and \(X_3\).
[4]:
product_formulations = (
pyblp.Formulation('1 + hpwt + space + air + mpd + fwd + mi + sw + su + pv + pgnp + trend + trend2'),
pyblp.Formulation('1 + I(-prices) + hpwt + space + air + mpd + fwd + mi + sw + su + pv'),
pyblp.Formulation('1 + log(hpwt) + log(wt) + log(mpg) + air + fwd + trend * (jp + eu) + log(q)'),
)
product_formulations
[4]:
(1 + hpwt + space + air + mpd + fwd + mi + sw + su + pv + pgnp + trend + trend2,
1 + I(-prices) + hpwt + space + air + mpd + fwd + mi + sw + su + pv,
1 + log(hpwt) + log(wt) + log(mpg) + air + fwd + trend + jp + eu + trend:jp + trend:eu + log(q))
Again, we’ll use a first-order linear approximation to \(\log(y_i - p_j)\), in which \(y\) is income and \(p\) are prices. Unlike the previous automobile problem, however, we’ll allow its coefficient to vary for low-, mid-, and high-income consumers.
As in the original paper, we’ll also include \(\log(\textit{fs}_i) \times \textit{fv}_i\), where \(\textit{fs}_i\) is family size and \(\textit{fv}_i\) is another truncated \(\chi^2(3)\) draw. Finally, to help with constructing micro moments below, we’ll also include various additional demographics in the agent formulation.
[5]:
agent_formulation = pyblp.Formulation('1 + I(low / income) + I(mid / income) + I(high / income) + I(log(fs) * fv) + age + fs + mid + high')
agent_formulation
[5]:
1 + I(low / income) + I(mid / income) + I(high / income) + I(log(fs) * fv) + age + fs + mid + high
The Problem can again be constructed by combining the product_formulations
, product_data
, agent_formulation
, and agent_data
. We’ll again choose a log-linear specification for marginal costs \(c_{jt}\).
[6]:
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data, costs_type='log')
problem
[6]:
Dimensions:
========================================================
T N F I K1 K2 K3 D MD MS
--- ---- --- ----- ---- ---- ---- --- ---- ----
13 2407 27 13000 13 11 12 9 35 28
========================================================
Formulations:
==============================================================================================================================================
Column Indices: 0 1 2 3 4 5 6 7 8 9 10 11 12
----------------------------- --- ---------- ---------- ----------- ---------- --- ----- --- ---- -------- -------- ------ ------
X1: Linear Characteristics 1 hpwt space air mpd fwd mi sw su pv pgnp trend trend2
X2: Nonlinear Characteristics 1 -prices hpwt space air mpd fwd mi sw su pv
X3: Log Cost Characteristics 1 log(hpwt) log(wt) log(mpg) air fwd trend jp eu jp*trend eu*trend log(q)
d: Demographics 1 low/income mid/income high/income fv*log(fs) age fs mid high
==============================================================================================================================================
The problem outputs a table of dimensions:
\(T\) denotes the number of markets.
\(N\) is the length of the dataset (the number of products across all markets).
\(F\) denotes the number of firms.
\(I = \sum_t I_t\) is the total number of agents across all markets (1000 draws per market times 13 markets).
\(K_1\) is the number of linear demand characteristics.
\(K_2\) is the number of nonlinear demand characteristics.
\(K_3\) is the number of linear supply characteristics.
\(D\) is the number of demographic variables.
\(M_D\) is the number of demand instruments, including exogenous regressors.
\(M_S\) is the number of supply instruments, including exogenous regressors.
The formulations table describes all four formulas for demand-side linear characteristics, demand-side nonlinear characteristics, supply-side characteristics, and demographics.
Setting up Micro Moments¶
Next, we will configure the micro moments that we will be adding to the problem. For background and notation involving micro moments, see Micro Moments.
Specifically, we will be adding a few more moments that match key statistics computed from the CEX survey of potential automobile consumers. For a tutorial on how to compute optimal micro moments that use all the information in a full micro dataset linking individual choices to demographics, see the post estimation tutorial.
To start, we will have to define a MicroDataset configuration that contains metadata about the micro dataset/survey. These metadata include a unique name for the dataset indexed by \(d\), the number of observations \(N_d\), a function that defines survey weights \(w_{dijt}\), and if relevant, a subset of markets from which the micro data was sampled.
[7]:
micro_dataset = pyblp.MicroDataset(
name="CEX",
observations=29125,
compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
)
micro_dataset
[7]:
CEX: 29125 Observations in All Markets
We called the dataset “CEX”, defined the number of observations in it, and also defined a lambda function for computing survey weights in a market. The compute_weights
function has three arguments: the current market’s ID \(t\), the \(J_t\) Products inside the market, and the \(I_t\) Agents inside the market. In this case, we are assuming that each product and agent/consumer type is sampled with equal probability, so we simply return a matrix of ones of shape \(I_t \times (1 + J_t)\). This sets each \(w_{dijt} = 1\).
By using \(1 + J_t\) instead of \(J_t\), we are specifying that the micro dataset contains observations of the outside option \(j = 0\). If we instead specified a matrix of shape \(I_t \times J_t\), this would be the same as setting the first column equal to all zeros, so that outside choices are not sampled from.
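For contrast, a dataset whose observations never include the outside option would return weights of shape \(I_t \times J_t\) instead. This is a sketch only with a hypothetical name; we do not use this configuration below.
inside_only_dataset = pyblp.MicroDataset(
    name="CEX inside goods only",  # hypothetical name, for illustration only
    observations=29125,
    compute_weights=lambda t, p, a: np.ones((a.size, p.size)),  # no column for j = 0
)
inside_only_dataset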
We will be matching a few different statistics that were computed from this survey. For convenience, they are packaged in a data file with pyblp.
[8]:
micro_statistics = pd.read_csv(pyblp.data.PETRIN_VALUES_LOCATION, index_col=0)
micro_statistics
[8]:
value | |
---|---|
E[age | mi] | 0.7830 |
E[fs | mi] | 3.8600 |
E[age | sw] | 0.7300 |
E[fs | sw] | 3.1700 |
E[age | su] | 0.7400 |
E[fs | su] | 2.9700 |
E[age | pv] | 0.6520 |
E[fs | pv] | 3.4700 |
E[new | mid] | 0.0794 |
E[new | high] | 0.1581 |
We will match the average age and family size (“fs”) conditional on purchasing a minivan (“mi”), station wagon (“sw”), sport-utility (“su”), and full-size passenger van (“pv”). We will also match the probability that a consumer actually purchases a new vehicle, conditional on them being mid- and high-income.
Each of these statistics is a conditional expectation, which we can rewrite as a ratio of unconditional expectations over all consumers. Each of these unconditional expectations is called a MicroPart (used to form full micro moments), which we will now configure.
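For example, the first statistic above can be written as a ratio of two such unconditional expectations:
\[E[\textit{age}_i \mid \textit{mi}_j = 1] = \frac{E[\textit{age}_i \times \textit{mi}_j]}{E[\textit{mi}_j]}.\]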
Each micro part is an average/expectation in the sample/population over micro values \(v_{pijt}\). To match the above statistics, we will need averages/expectations over interactions between agent age/family size and dummies for purchasing the different automobile types. These will form the numerators in our conditional expectations.
[9]:
age_mi_part = pyblp.MicroPart(
name="E[age_i * mi_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 7]]),
)
age_sw_part = pyblp.MicroPart(
name="E[age_i * sw_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 8]]),
)
age_su_part = pyblp.MicroPart(
name="E[age_i * su_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 9]]),
)
age_pv_part = pyblp.MicroPart(
name="E[age_i * pv_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 10]]),
)
fs_mi_part = pyblp.MicroPart(
name="E[fs_i * mi_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 7]]),
)
fs_sw_part = pyblp.MicroPart(
name="E[fs_i * sw_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 8]]),
)
fs_su_part = pyblp.MicroPart(
name="E[fs_i * su_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 9]]),
)
fs_pv_part = pyblp.MicroPart(
name="E[fs_i * pv_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 10]]),
)
We will also need the denominators, which are simple averages/expectations of purchasing the different types of automobiles.
[10]:
mi_part = pyblp.MicroPart(
name="E[mi_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 7]]),
)
sw_part = pyblp.MicroPart(
name="E[sw_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 8]]),
)
su_part = pyblp.MicroPart(
name="E[su_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 9]]),
)
pv_part = pyblp.MicroPart(
name="E[pv_j]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 10]]),
)
To form our probability that a consumer actually purchases a new vehicle, conditional on them being mid- and high-income, we will also need the following micro parts.
[11]:
inside_mid_part = pyblp.MicroPart(
name="E[1{j > 0} * mid_i]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 7], np.r_[0, p.X2[:, 0]]),
)
inside_high_part = pyblp.MicroPart(
name="E[1{j > 0} * high_i]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 8], np.r_[0, p.X2[:, 0]]),
)
mid_part = pyblp.MicroPart(
name="E[mid_i]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 7], np.r_[1, p.X2[:, 0]]),
)
high_part = pyblp.MicroPart(
name="E[high_i]",
dataset=micro_dataset,
compute_values=lambda t, p, a: np.outer(a.demographics[:, 8], np.r_[1, p.X2[:, 0]]),
)
Finally, we’ll put these micro parts together into MicroMoments. Each micro moment is configured to have a name, a value (one of the statistics above), and micro parts that go into it.
If our micro moments were simple unconditional expectations, we could just pass a single micro part to each micro moment and be done. However, since our micro moments are functions of multiple micro parts, we have to specify this function. We also have to specify its derivative for computing standard errors and analytic objective gradients.
[12]:
compute_ratio = lambda v: v[0] / v[1]  # numerator micro part divided by denominator micro part
compute_ratio_gradient = lambda v: [1 / v[1], -v[0] / v[1]**2]  # derivatives of the ratio with respect to each part
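As a quick sanity check (a sketch), the analytic gradient agrees with a finite-difference approximation at an arbitrary point:
v = [0.7, 2.0]  # arbitrary values for the numerator and denominator parts
step = 1e-6
finite_differences = [
    (compute_ratio([v[0] + step, v[1]]) - compute_ratio(v)) / step,
    (compute_ratio([v[0], v[1] + step]) - compute_ratio(v)) / step,
]
finite_differences, compute_ratio_gradient(v)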
Given our functions that define a conditional expectation and its derivatives, we can form our micro moments.
[13]:
micro_moments = [
pyblp.MicroMoment(
name="E[age_i | mi_j]",
value=0.783,
parts=[age_mi_part, mi_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[age_i | sw_j]",
value=0.730,
parts=[age_sw_part, sw_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[age_i | su_j]",
value=0.740,
parts=[age_su_part, su_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[age_i | pv_j]",
value=0.652,
parts=[age_pv_part, pv_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[fs_i | mi_j]",
value=3.86,
parts=[fs_mi_part, mi_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[fs_i | sw_j]",
value=3.17,
parts=[fs_sw_part, sw_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[fs_i | su_j]",
value=2.97,
parts=[fs_su_part, su_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[fs_i | pv_j]",
value=3.47,
parts=[fs_pv_part, pv_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[1{j > 0} | mid_i]",
value=0.0794,
parts=[inside_mid_part, mid_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
pyblp.MicroMoment(
name="E[1{j > 0} | high_i]",
value=0.1581,
parts=[inside_high_part, high_part],
compute_value=compute_ratio,
compute_gradient=compute_ratio_gradient,
),
]
Solving the Problem¶
As with the first automobile problem, we will just use the published estimates for \(\Sigma\) and \(\Pi\) as starting values.
[14]:
initial_sigma = np.diag([3.23, 0, 4.43, 0.46, 0.01, 2.58, 4.42, 0, 0, 0, 0])
initial_pi = np.array([
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 7.52, 31.13, 34.49, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0.57, 0, 0, 0, 0],
[0, 0, 0, 0, 0.28, 0, 0, 0, 0],
[0, 0, 0, 0, 0.31, 0, 0, 0, 0],
[0, 0, 0, 0, 0.42, 0, 0, 0, 0],
])
Finally, as in the original paper, we’ll use W_type
and se_type
to cluster by product IDs, which were specified as clustering_ids
in product_data
. We will use a simple BFGS optimization routine and slightly loosen the default tolerance of our inner SQUAREM iteration algorithm from 1e-14
to 1e-13
because the tighter tolerance tended to lead to convergence failures for this problem. We also pass our configured micro_moments
when solving the problem.
[15]:
results = problem.solve(
sigma=initial_sigma,
pi=initial_pi,
optimization=pyblp.Optimization('bfgs', {'gtol': 1e-4}),
iteration=pyblp.Iteration('squarem', {'atol': 1e-13}),
se_type='clustered',
W_type='clustered',
micro_moments=micro_moments,
)
results
[15]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
2 +1.8E+02 +4.3E-05 +3.8E-01 +1.3E+03 0 +2.8E+11 +8.1E+07
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:34:10 Yes 72 87 10905 33592
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
================================================================================================================================================================================================================================================
Sigma: 1 -prices hpwt space air mpd fwd mi sw su pv | Pi: 1 low/income mid/income high/income fv*log(fs) age fs mid high
------- ---------- -------- ---------- ---------- ---------- ---------- ---------- -------- -------- -------- -------- | ------- -------- ---------- ---------- ----------- ---------- -------- -------- -------- --------
1 +3.0E-02 | 1 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
(+5.3E-01) |
|
-prices +0.0E+00 +0.0E+00 | -prices +0.0E+00 +3.9E+00 +1.2E+01 +2.4E+01 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
| (+3.6E-01) (+1.0E+00) (+2.4E+00)
|
hpwt +0.0E+00 +0.0E+00 +1.2E-01 | hpwt +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
(+8.1E-01) |
|
space +0.0E+00 +0.0E+00 +0.0E+00 -9.2E-02 | space +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
(+6.1E-01) |
|
air +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 -1.3E+00 | air +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
(+1.1E+00) |
|
mpd +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 -1.6E-01 | mpd +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
(+2.2E-01) |
|
fwd +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +1.6E+00 | fwd +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
(+3.7E-01) |
|
mi +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 | mi +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +4.2E-01 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
| (+5.2E-02)
|
sw +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 | sw +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +1.7E-01 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
| (+4.2E-02)
|
su +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 | su +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +1.0E-01 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
| (+5.2E-02)
|
pv +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 | pv +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00 +2.5E-01 +0.0E+00 +0.0E+00 +0.0E+00 +0.0E+00
| (+8.1E-02)
================================================================================================================================================================================================================================================
Beta Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
==========================================================================================================================================================
1 hpwt space air mpd fwd mi sw su pv pgnp trend trend2
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
-8.9E+00 +8.3E+00 +4.9E+00 +3.8E+00 -1.4E-01 -6.5E+00 -2.1E+00 -1.3E+00 -1.1E+00 -3.3E+00 +3.4E-02 +2.2E-01 -1.5E-02
(+1.4E+00) (+2.4E+00) (+1.6E+00) (+1.2E+00) (+3.2E-01) (+1.8E+00) (+4.8E-01) (+2.0E-01) (+2.8E-01) (+5.2E-01) (+1.2E-02) (+9.2E-02) (+6.4E-03)
==========================================================================================================================================================
Gamma Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
==============================================================================================================================================
1 log(hpwt) log(wt) log(mpg) air fwd trend jp eu jp*trend eu*trend log(q)
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
+1.4E+00 +8.8E-01 +1.4E+00 +1.2E-01 +2.7E-01 +6.9E-02 -1.2E-02 +1.0E-01 +4.6E-01 +1.6E-03 -1.1E-02 -6.9E-02
(+1.4E-01) (+4.9E-02) (+8.0E-02) (+6.0E-02) (+2.4E-02) (+1.8E-02) (+2.6E-03) (+2.5E-02) (+4.3E-02) (+2.9E-03) (+4.2E-03) (+6.7E-03)
==============================================================================================================================================
Estimated Micro Moments:
===========================================================================================================
Observed Estimated Difference Moment Part Dataset Observations Markets
-------- --------- ---------- -------------------- -------------------- ------- ------------ -------
+7.8E-01 +7.5E-01 +2.9E-02 E[age_i | mi_j] E[age_i * mi_j] CEX 29125 All
E[mi_j] CEX 29125 All
+7.3E-01 +6.8E-01 +4.7E-02 E[age_i | sw_j] E[age_i * sw_j] CEX 29125 All
E[sw_j] CEX 29125 All
+7.4E-01 +6.8E-01 +5.9E-02 E[age_i | su_j] E[age_i * su_j] CEX 29125 All
E[su_j] CEX 29125 All
+6.5E-01 +7.3E-01 -7.7E-02 E[age_i | pv_j] E[age_i * pv_j] CEX 29125 All
E[pv_j] CEX 29125 All
+3.9E+00 +3.9E+00 -1.2E-02 E[fs_i | mi_j] E[fs_i * mi_j] CEX 29125 All
E[mi_j] CEX 29125 All
+3.2E+00 +3.2E+00 -7.6E-03 E[fs_i | sw_j] E[fs_i * sw_j] CEX 29125 All
E[sw_j] CEX 29125 All
+3.0E+00 +3.0E+00 -8.5E-03 E[fs_i | su_j] E[fs_i * su_j] CEX 29125 All
E[su_j] CEX 29125 All
+3.5E+00 +3.5E+00 -1.7E-02 E[fs_i | pv_j] E[fs_i * pv_j] CEX 29125 All
E[pv_j] CEX 29125 All
+7.9E-02 +8.0E-02 -4.5E-04 E[1{j > 0} | mid_i] E[1{j > 0} * mid_i] CEX 29125 All
E[mid_i] CEX 29125 All
+1.6E-01 +1.6E-01 -2.1E-03 E[1{j > 0} | high_i] E[1{j > 0} * high_i] CEX 29125 All
E[high_i] CEX 29125 All
===========================================================================================================
There are some discrepancies between these results and those in the original paper, but broadly the estimates are similar. Although the estimates of \(\beta\) look substantially off, this is primarily because the \(\chi^2(3)\) distributions are not mean-zero, so differences in the estimates of \(\Sigma\) result in shifted estimates of \(\beta\) too.
Running the Main Counterfactual¶
One result that is very close is the paper’s headline number: a $367.29 million compensating variation from a counterfactual that removes the minivan in 1984. Using our estimates, we get a similar number.
This subsection previews some of the routines used in the next tutorial on functions available after estimation. First, we will compute implied marginal costs in 1984.
[16]:
year = 1984
costs_1984 = results.compute_costs(market_id=year)
Next, we will set up a counterfactual simulation in which the minivan is removed.
[17]:
product_data_1984 = product_data[product_data['market_ids'] == year]
xi_1984 = results.xi[product_data['market_ids'] == year]
agent_data_1984 = agent_data[agent_data['market_ids'] == year]
simulation = pyblp.Simulation(
product_formulations=product_formulations[:2],
product_data=product_data_1984[product_data_1984['mi'] == 0],
xi=xi_1984[product_data_1984['mi'] == 0],
agent_formulation=problem.agent_formulation,
agent_data=agent_data_1984,
beta=results.beta,
sigma=results.sigma,
pi=results.pi,
)
We will then solve for equilibrium prices and shares under this counterfactual, using the above-computed marginal costs.
[18]:
simulation_results = simulation.replace_endogenous(costs=costs_1984[product_data_1984['mi'] == 0])
Finally, we will compute the change in consumer surplus.
[19]:
households = product_data_1984['households'].values[0]
cs = households * results.compute_consumer_surpluses(market_id=year)
counterfactual_cs = households * simulation_results.compute_consumer_surpluses()
cs - counterfactual_cs
[19]:
array([[425.90824856]])
We get an estimate that is in the same ballpark as $367.29 million. When bootstrapping this procedure (see the next tutorial for more on this), we get a standard error around $250 million.
Download the Jupyter Notebook for this section: post_estimation.ipynb
Post-Estimation Tutorial¶
[1]:
%matplotlib inline
import pyblp
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'
This tutorial covers several features of pyblp that are available after estimation, including:
Calculating elasticities and diversion ratios.
Calculating marginal costs and markups.
Computing the effects of mergers: prices, shares, and HHI.
Using a parametric bootstrap to estimate standard errors.
Estimating optimal instruments.
Constructing optimal micro moments.
Problem Results¶
As in the fake cereal tutorial, we’ll first solve the fake cereal problem from Nevo (2000a). We load the fake data and estimate the model as in that tutorial, and we output the setup of the model to confirm that we have correctly configured the Problem.
[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)
product_formulations = (
pyblp.Formulation('0 + prices', absorb='C(product_ids)'),
pyblp.Formulation('1 + prices + sugar + mushy')
)
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data)
problem
[2]:
Dimensions:
=================================================
T N F I K1 K2 D MD ED
--- ---- --- ---- ---- ---- --- ---- ----
94 2256 5 1880 1 4 4 20 1
=================================================
Formulations:
===================================================================
Column Indices: 0 1 2 3
----------------------------- ------ -------------- ----- -----
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
d: Demographics income income_squared age child
===================================================================
We’ll solve the problem in the same way as before. The Problem.solve method returns a ProblemResults class, which displays basic estimation results. The results that are displayed are simply formatted information extracted from various class attributes such as ProblemResults.sigma and ProblemResults.sigma_se.
[3]:
initial_sigma = np.diag([0.3302, 2.4526, 0.0163, 0.2441])
initial_pi = [
[ 5.4819, 0, 0.2037, 0 ],
[15.8935, -1.2000, 0, 2.6342],
[-0.2506, 0, 0.0511, 0 ],
[ 1.2650, 0, -0.8091, 0 ]
]
results = problem.solve(
initial_sigma,
initial_pi,
optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
method='1s'
)
results
[3]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
1 +4.6E+00 +6.9E-06 +3.3E-05 +1.6E+04 0 +6.9E+07 +8.4E+08
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:01:16 Yes 51 57 46389 143977
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy | Pi: income income_squared age child
------ ---------- ---------- ---------- ---------- | ------ ---------- -------------- ---------- ----------
1 +5.6E-01 | 1 +2.3E+00 +0.0E+00 +1.3E+00 +0.0E+00
(+1.6E-01) | (+1.2E+00) (+6.3E-01)
|
prices +0.0E+00 +3.3E+00 | prices +5.9E+02 -3.0E+01 +0.0E+00 +1.1E+01
(+1.3E+00) | (+2.7E+02) (+1.4E+01) (+4.1E+00)
|
sugar +0.0E+00 +0.0E+00 -5.8E-03 | sugar -3.8E-01 +0.0E+00 +5.2E-02 +0.0E+00
(+1.4E-02) | (+1.2E-01) (+2.6E-02)
|
mushy +0.0E+00 +0.0E+00 +0.0E+00 +9.3E-02 | mushy +7.5E-01 +0.0E+00 -1.4E+00 +0.0E+00
(+1.9E-01) | (+8.0E-01) (+6.7E-01)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-6.3E+01
(+1.5E+01)
==========
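As noted above, the displayed tables are formatted views of array-valued attributes such as ProblemResults.sigma and ProblemResults.sigma_se, which can also be accessed directly. A minimal illustration:

# The tables above are built from attributes like these NumPy arrays.
estimated_sigma = results.sigma           # estimated Sigma matrix
sigma_standard_errors = results.sigma_se  # corresponding standard errors
estimated_sigma.shape, sigma_standard_errors.shape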
Additional post-estimation outputs can be computed with ProblemResults methods.
Elasticities and Diversion Ratios¶
We can estimate elasticities, \(\varepsilon\), and diversion ratios, \(\mathscr{D}\), with ProblemResults.compute_elasticities and ProblemResults.compute_diversion_ratios.
As a reminder, elasticities in each market are
\[\varepsilon_{jk} = \frac{x_k}{s_j} \frac{\partial s_j}{\partial x_k}.\]
Diversion ratios are
\[\mathscr{D}_{jk} = -\frac{\partial s_k / \partial x_j}{\partial s_j / \partial x_j}.\]
Following Conlon and Mortimer (2018), we report the diversion to the outside good \(D_{j0}\) on the diagonal instead of \(D_{jj}=-1\).
[4]:
elasticities = results.compute_elasticities()
diversions = results.compute_diversion_ratios()
Post-estimation outputs are computed for each market and stacked. We’ll use matplotlib functions to display the matrices associated with a single market.
[5]:
single_market = product_data['market_ids'] == 'C01Q1'
plt.colorbar(plt.matshow(elasticities[single_market]));

[6]:
plt.colorbar(plt.matshow(diversions[single_market]));

The diagonal of the first image consists of own elasticities and the diagonal of the second image consists of diversion ratios to the outside good. As one might expect, own price elasticities are large and negative while cross-price elasticities are positive but much smaller.
Elasticities and diversion ratios can be computed with respect to variables other than prices with the name argument of ProblemResults.compute_elasticities and ProblemResults.compute_diversion_ratios. Additionally, ProblemResults.compute_long_run_diversion_ratios can be used to understand substitution when products are eliminated from the choice set.
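For reference, a minimal sketch of these variants using the sugar characteristic from this product data (the tutorial itself does not use these outputs):

# Elasticities and diversion ratios with respect to a non-price characteristic.
sugar_elasticities = results.compute_elasticities(name='sugar')
sugar_diversions = results.compute_diversion_ratios(name='sugar')

# Long-run diversion ratios describe substitution when products leave the choice set.
long_run_diversions = results.compute_long_run_diversion_ratios()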
The convenience methods ProblemResults.extract_diagonals and ProblemResults.extract_diagonal_means can be used to extract information about own elasticities of demand from elasticity matrices.
[7]:
means = results.extract_diagonal_means(elasticities)
An alternative to summarizing full elasticity matrices is to use ProblemResults.compute_aggregate_elasticities to estimate aggregate elasticities of demand, \(E\), in each market, which reflect the change in total sales under a proportional sales tax of some factor.
[8]:
aggregates = results.compute_aggregate_elasticities(factor=0.1)
Since demand for an entire product category is generally less elastic than the average elasticity of individual products, mean own elasticities are generally larger in magnitude than aggregate elasticities.
[9]:
plt.hist(
[means.flatten(), aggregates.flatten()],
color=['red', 'blue'],
bins=50
);
plt.legend(['Mean Own Elasticities', 'Aggregate Elasticities']);

Marginal Costs and Markups¶
To compute marginal costs, \(c\), the product_data passed to Problem must have had a firm_ids field. Since we included firm IDs when configuring the problem, we can use ProblemResults.compute_costs.
[10]:
costs = results.compute_costs()
plt.hist(costs, bins=50);
plt.legend(["Marginal Costs"]);

Other methods that compute supply-side outputs often compute marginal costs themselves. For example, ProblemResults.compute_markups will compute marginal costs when estimating markups, \(\mathscr{M}\), but computation can be sped up if we just use our pre-computed values.
[11]:
markups = results.compute_markups(costs=costs)
plt.hist(markups, bins=50);
plt.legend(["Markups"]);

Mergers¶
Before computing post-merger outputs, we’ll supplement our pre-merger markups with some other outputs. We’ll compute Herfindahl-Hirschman Indices, \(\text{HHI}\), with ProblemResults.compute_hhi; population-normalized gross expected profits, \(\pi\), with ProblemResults.compute_profits; and population-normalized consumer surpluses, \(\text{CS}\), with ProblemResults.compute_consumer_surpluses.
[12]:
hhi = results.compute_hhi()
profits = results.compute_profits(costs=costs)
cs = results.compute_consumer_surpluses()
To compute post-merger outputs, we’ll create a new set of firm IDs that represent a merger of firms 2 and 1.
[13]:
product_data['merger_ids'] = product_data['firm_ids'].replace(2, 1)
We can use ProblemResults.compute_approximate_prices or ProblemResults.compute_prices to estimate post-merger prices. The first method, which is in the spirit of early approaches to merger evaluation such as Hausman, Leonard, and Zona (1994) and Werden (1997), is only a partial merger simulation in that it assumes shares and their price derivatives are unaffected by the merger.
The second method, which is used by Nevo (2000b), is a full merger simulation in that it does not make these assumptions, and is the preferred approach to merger simulation. By default, we iterate over the \(\zeta\)-markup equation from Morrow and Skerlos (2011) to solve the full system of \(J_t\) equations and \(J_t\) unknowns in each market \(t\). We’ll use the latter, since it is fast enough for this example problem.
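For reference, the partial merger simulation would look like the following sketch, which assumes the same firm_ids and costs arguments accepted by ProblemResults.compute_prices (we do not use its output below):

# Approximate post-merger prices, holding shares and their price derivatives fixed.
approximated_prices = results.compute_approximate_prices(
    firm_ids=product_data['merger_ids'],
    costs=costs
)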
[14]:
changed_prices = results.compute_prices(
firm_ids=product_data['merger_ids'],
costs=costs
)
We’ll compute post-merger shares with ProblemResults.compute_shares.
[15]:
changed_shares = results.compute_shares(changed_prices)
Post-merger prices and shares are used to compute other post-merger outputs. For example, \(\text{HHI}\) increases.
[16]:
changed_hhi = results.compute_hhi(
firm_ids=product_data['merger_ids'],
shares=changed_shares
)
plt.hist(changed_hhi - hhi, bins=50);
plt.legend(["HHI Changes"]);

Markups, \(\mathscr{M}\), and profits, \(\pi\), generally increase as well.
[17]:
changed_markups = results.compute_markups(changed_prices, costs)
plt.hist(changed_markups - markups, bins=50);
plt.legend(["Markup Changes"]);

[18]:
changed_profits = results.compute_profits(changed_prices, changed_shares, costs)
plt.hist(changed_profits - profits, bins=50);
plt.legend(["Profit Changes"]);

On the other hand, consumer surpluses, \(\text{CS}\), generally decrease.
[19]:
changed_cs = results.compute_consumer_surpluses(changed_prices)
plt.hist(changed_cs - cs, bins=50);
plt.legend(["Consumer Surplus Changes"]);

Bootstrapping Results¶
Post-estimation outputs can be informative, but they don’t mean much without a sense of sample-to-sample variability. One way to estimate confidence intervals for post-estimation outputs is with a standard bootstrap procedure:
Construct a large number of bootstrap samples by sampling with replacement from the original product data.
Initialize and solve a Problem for each bootstrap sample.
Compute the desired post-estimation output for each bootstrapped ProblemResults and, from the resulting empirical distribution, construct bootstrap confidence intervals.
Although this procedure is appealing because of its simplicity, the computational resources it requires are often prohibitive. Furthermore, human oversight of the optimization routine is often required to determine whether the routine ran into any problems and whether it successfully converged. Human oversight of estimation for each bootstrapped problem is usually not feasible.
A more reasonable alternative is a parametric bootstrap procedure:
Construct a large number of draws from the estimated joint distribution of parameters.
Compute the implied mean utility, \(\delta\), and shares, \(s\), for each draw. If a supply side was estimated, also compute the implied marginal costs, \(c\), and prices, \(p\).
Compute the desired post-estimation output under each of these parametric bootstrap samples. Again, from the resulting empirical distribution, construct bootstrap confidence intervals.
Compared to the standard bootstrap procedure, the parametric bootstrap requires far fewer computational resources, and is simple enough to not require human oversight of each bootstrap iteration. The primary complication to this procedure is that when supply is estimated, equilibrium prices and shares need to be computed for each parametric bootstrap sample by iterating over the \(\zeta\)-markup equation from Morrow and Skerlos (2011). Although nontrivial, this fixed point iteration problem is much less demanding than the full optimization routine required to solve the BLP problem from the start.
An empirical distribution of results computed according to this parametric bootstrap procedure can be created with the ProblemResults.bootstrap method, which returns a BootstrappedResults class that can be used just like ProblemResults to compute various post-estimation outputs. The difference is that BootstrappedResults methods return arrays with an extra first dimension, along which bootstrapped results are stacked.
We’ll construct 90% parametric bootstrap confidence intervals for estimated mean own elasticities in each market of the fake cereal problem. Usually, bootstrapped confidence intervals should be based on thousands of draws, but we’ll only use a few for the sake of speed in this example.
[20]:
bootstrapped_results = results.bootstrap(draws=100, seed=0)
bootstrapped_results
[20]:
Bootstrapped Results Summary:
======================
Computation Bootstrap
Time Draws
----------- ---------
00:00:09 100
======================
[21]:
bounds = np.percentile(
bootstrapped_results.extract_diagonal_means(
bootstrapped_results.compute_elasticities()
),
q=[10, 90],
axis=0
)
table = pd.DataFrame(index=problem.unique_market_ids, data={
'Lower Bound': bounds[0].flatten(),
'Mean Own Elasticity': means.flatten(),
'Upper Bound': bounds[1].flatten()
})
table.round(2).head()
[21]:
       | Lower Bound | Mean Own Elasticity | Upper Bound
------ | ----------- | ------------------- | -----------
C01Q1  | -4.31       | -4.21               | -3.88
C01Q2  | -4.07       | -3.96               | -3.68
C03Q1  | -3.71       | -3.40               | -3.20
C03Q2  | -3.65       | -3.34               | -3.16
C04Q1  | -3.31       | -3.15               | -2.97
Optimal Instruments¶
Given a consistent estimate of \(\theta\), we may want to compute the optimal instruments of Chamberlain (1987) and use them to re-solve the problem. Optimal instruments have been shown, for example, by Reynaert and Verboven (2014), to reduce bias, improve efficiency, and enhance stability of BLP estimates.
The ProblemResults.compute_optimal_instruments method computes the expected Jacobians that comprise the optimal instruments by integrating over the density of \(\xi\) (and \(\omega\) if a supply side was estimated). By default, the method approximates this integral by averaging over the Jacobian realizations computed under draws from the asymptotic normal distribution of the error terms. Since this process is computationally expensive and often doesn’t make much of a difference, we’ll use method='approximate' in this example to simply evaluate the Jacobians at the expected value of \(\xi\), zero.
[22]:
instrument_results = results.compute_optimal_instruments(method='approximate')
instrument_results
[22]:
Optimal Instrument Results Summary:
=======================
Computation Error Term
Time Draws
----------- ----------
00:00:00 1
=======================
We can use the OptimalInstrumentResults.to_problem method to re-create the fake cereal problem with the estimated optimal excluded instruments.
[23]:
updated_problem = instrument_results.to_problem()
updated_problem
[23]:
Dimensions:
=================================================
T N F I K1 K2 D MD ED
--- ---- --- ---- ---- ---- --- ---- ----
94 2256 5 1880 1 4 4 14 1
=================================================
Formulations:
===================================================================
Column Indices: 0 1 2 3
----------------------------- ------ -------------- ----- -----
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
d: Demographics income income_squared age child
===================================================================
We can solve this updated problem just like the original one. We’ll start at our consistent estimate of \(\theta\).
[24]:
updated_results = updated_problem.solve(
results.sigma,
results.pi,
optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
method='1s'
)
updated_results
[24]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
1 +8.0E-14 +3.0E-06 +1.6E-04 +2.9E+04 0 +7.8E+07 +1.8E+08
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:01:05 Yes 42 50 45902 142164
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy | Pi: income income_squared age child
------ ---------- ---------- ---------- ---------- | ------ ---------- -------------- ---------- ----------
1 +2.1E-01 | 1 +6.0E+00 +0.0E+00 +1.6E-01 +0.0E+00
(+7.8E-02) | (+5.2E-01) (+2.0E-01)
|
prices +0.0E+00 +3.0E+00 | prices +9.8E+01 -5.6E+00 +0.0E+00 +4.1E+00
(+6.5E-01) | (+8.6E+01) (+4.5E+00) (+2.2E+00)
|
sugar +0.0E+00 +0.0E+00 +2.7E-02 | sugar -3.1E-01 +0.0E+00 +4.9E-02 +0.0E+00
(+7.2E-03) | (+3.5E-02) (+1.3E-02)
|
mushy +0.0E+00 +0.0E+00 +0.0E+00 +3.0E-01 | mushy +9.7E-01 +0.0E+00 -5.4E-01 +0.0E+00
(+1.0E-01) | (+2.9E-01) (+1.8E-01)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.1E+01
(+4.5E+00)
==========
Optimal Micro Moments¶
Similarly, if we have micro data that links individual choices to demographics, we can use all the information in this data by constructing optimal micro moments that match the score of the micro data, evaluated at our consistent estimate of \(\theta\). See the micro moments tutorial for an introduction to constructing non-optimal micro moments.
We don’t have actual micro data for this empirical example, but we can simulate some just to demonstrate how to construct optimal micro moments. We’ll use the ProblemResults.simulate_micro_data method to simulate 1,000 observations from a micro dataset at the estimated \(\theta\). Like most micro datasets, we’ll define compute_weights such that we only have observations from consumers who purchase an inside good \(j \neq 0\). For simplicity, we’ll assume we only have micro data from a single market, 'C61Q1'. Again, see the micro moments tutorial for a more in-depth discussion of MicroDataset and micro moments.
[25]:
micro_dataset = pyblp.MicroDataset(
name="Simulated micro data",
observations=1_000,
compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
market_ids=['C61Q1'],
)
micro_data = results.simulate_micro_data(
dataset=micro_dataset,
seed=0,
)
The simulated micro data are a record array, which can be difficult to visualize. We’ll convert it to a pandas dataframe, which is what we would usually load from an actual micro dataset file.
[26]:
micro_data = pd.DataFrame(pyblp.data_to_dict(micro_data))
micro_data
[26]:
    | micro_ids | market_ids | agent_indices | choice_indices
--- | --------- | ---------- | ------------- | --------------
0   | 0         | C61Q1      | 10            | 24
1   | 1         | C61Q1      | 14            | 0
2   | 2         | C61Q1      | 12            | 0
3   | 3         | C61Q1      | 10            | 21
4   | 4         | C61Q1      | 8             | 10
... | ...       | ...        | ...           | ...
995 | 995       | C61Q1      | 1             | 23
996 | 996       | C61Q1      | 10            | 4
997 | 997       | C61Q1      | 18            | 0
998 | 998       | C61Q1      | 4             | 0
999 | 999       | C61Q1      | 13            | 2

1000 rows × 4 columns
The simulated micro data contain four columns:
micro_ids: This is simply an index from \(0\) to \(999\), indexing each micro observation.
market_ids: This is the market of each observation. We configured our MicroDataset to only sample from one market.
agent_indices: This is the within-market index (from \(0\) to \(I_t - 1\) in market \(t\)) of the agent data row that was sampled with probability \(w_{it}\), configured by the weights column in agent data.
choice_indices: This is the within-market index (from \(0\) to \(J_t - 1\) in market \(t\)) of the product data row that was sampled with probability \(s_{ijt}\), evaluated at the consistent estimate of \(\theta\).
The simulated data contain a bit more information than we would usually have in actual micro data. The agent_indices contain information not only about observed demographics of the micro observation, but also about unobserved preferences. We will merge in only observed demographics for each agent index to get a more realistic simulated micro dataset.
[27]:
agent_data['agent_indices'] = agent_data.groupby('market_ids').cumcount()
micro_data = micro_data.merge(
agent_data[['market_ids', 'agent_indices', 'income', 'income_squared', 'age', 'child']],
on=['market_ids', 'agent_indices']
)
del micro_data['agent_indices']
micro_data
[27]:
    | micro_ids | market_ids | choice_indices | income   | income_squared | age       | child
--- | --------- | ---------- | -------------- | -------- | -------------- | --------- | ---------
0   | 0         | C61Q1      | 24             | 1.212283 | 22.546328      | 0.624306  | -0.230851
1   | 3         | C61Q1      | 21             | 1.212283 | 22.546328      | 0.624306  | -0.230851
2   | 11        | C61Q1      | 11             | 1.212283 | 22.546328      | 0.624306  | -0.230851
3   | 28        | C61Q1      | 10             | 1.212283 | 22.546328      | 0.624306  | -0.230851
4   | 86        | C61Q1      | 10             | 1.212283 | 22.546328      | 0.624306  | -0.230851
... | ...       | ...        | ...            | ...      | ...            | ...       | ...
995 | 937       | C61Q1      | 0              | 0.636478 | 11.051747      | -0.090347 | -0.230851
996 | 952       | C61Q1      | 5              | 0.636478 | 11.051747      | -0.090347 | -0.230851
997 | 954       | C61Q1      | 14             | 0.636478 | 11.051747      | -0.090347 | -0.230851
998 | 971       | C61Q1      | 13             | 0.636478 | 11.051747      | -0.090347 | -0.230851
999 | 975       | C61Q1      | 22             | 0.636478 | 11.051747      | -0.090347 | -0.230851

1000 rows × 7 columns
This is more like real micro data that we would actually load from a file. For each observation, we know the market, choice, and demographics of the consumer.
To compute optimal micro moments at our consistent estimate of \(\theta\), we need to compute two types of scores:
The score for each observation in our micro data via ProblemResults.compute_micro_scores.
The score for each possible agent-choice in the model via ProblemResults.compute_agent_scores.
For each, we will integrate over unobserved heterogeneity with quadrature, but one could use Monte Carlo methods too. We will do so by just passing an Integration configuration to these two methods, but we could also do so by duplicating each row of micro data by as many integration nodes/weights as we wanted, adding weights and nodes columns as with agent data.
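For reference, the row-duplication alternative might look something like the following sketch, which reuses the imports and micro_data from the cells above. It assumes a level-5 Gauss-Hermite product rule over the \(K_2 = 4\) dimensions of unobserved heterogeneity, column names nodes0 through nodes3 as in agent data, and a pandas version that supports how='cross' merges. The tutorial itself proceeds with the simpler integration argument below.

# Build quadrature nodes and weights over the 4 nonlinear dimensions.
integration_data = pyblp.build_integration(pyblp.Integration('product', 5), dimensions=4)
quadrature = pd.DataFrame(
    np.c_[integration_data['weights'], integration_data['nodes']],
    columns=['weights'] + [f'nodes{k}' for k in range(4)]
)

# Duplicate each micro observation once per node/weight pair.
expanded_micro_data = micro_data.merge(quadrature, how='cross')

# expanded_micro_data now has weights and nodes columns, so it could be passed to
# ProblemResults.compute_micro_scores without an integration argument.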
[28]:
score_integration = pyblp.Integration('product', 5)
micro_scores = results.compute_micro_scores(micro_dataset, micro_data, integration=score_integration)
agent_scores = results.compute_agent_scores(micro_dataset, integration=score_integration)
Both micro_scores and agent_scores are lists, with one element for each nonlinear parameter in \(\theta\). The ordering is
[29]:
results.theta_labels
[29]:
['1 x 1',
'prices x prices',
'sugar x sugar',
'mushy x mushy',
'1 x income',
'1 x age',
'prices x income',
'prices x income_squared',
'prices x child',
'sugar x income',
'sugar x age',
'mushy x income',
'mushy x age']
The first element of micro_scores corresponds to the \(\sigma\) on the constant term (i.e., the '1 x 1' above). It has the estimated score for each observation in micro_data:
[30]:
micro_scores[0].shape
[30]:
(1000,)
The first element of agent_scores also corresponds to the \(\sigma\) on the constant term. It is a mapping from market IDs to arrays of scores for each of the \(I_t \times J_t\) possible agent-choices in that market.
[31]:
agent_scores[0]['C61Q1'].shape
[31]:
(20, 25)
We will construct one optimal MicroMoment for each parameter, matching the average score (via a MicroPart) for that parameter from the micro data with its model counterpart.
[32]:
optimal_micro_moments = []
for m, (micro_scores_m, agent_scores_m) in enumerate(zip(micro_scores, agent_scores)):
optimal_micro_moments.append(pyblp.MicroMoment(
name=f"Score for parameter #{m}",
value=micro_scores_m.mean(),
parts=pyblp.MicroPart(
name=f"Score for parameter #{m}",
dataset=micro_dataset,
compute_values=lambda t, p, a, v=agent_scores_m: v[t],
),
))
For example, some information about the optimal micro moment for the first parameter is as follows.
[33]:
optimal_micro_moments[0]
[33]:
Score for parameter #0: -2.8E-02 (Score for parameter #0 on Simulated micro data: 1000 Observations in Market 'C61Q1')
Now, we can use our problem with updated optimal IVs, including our optimal micro moments, to obtain an efficient estimator.
[34]:
updated_results = updated_problem.solve(
results.sigma,
results.pi,
optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
method='1s',
micro_moments=optimal_micro_moments,
)
updated_results
[34]:
Problem Results Summary:
=======================================================================================================
GMM Objective Gradient Hessian Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- -------- -------------- -------------- ------- ---------------- -----------------
1 +1.1E+02 +8.8E-06 +2.3E-03 +8.3E+04 0 +2.2E+08 +4.1E+07
=======================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:01:26 Yes 34 55 42977 133908
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy | Pi: income income_squared age child
------ ---------- ---------- ---------- ---------- | ------ ---------- -------------- ---------- ----------
1 +4.6E-01 | 1 +4.8E+00 +0.0E+00 +6.5E-01 +0.0E+00
(+6.7E-02) | (+3.5E-01) (+1.5E-01)
|
prices +0.0E+00 +3.1E+00 | prices +4.7E+02 -2.5E+01 +0.0E+00 +3.2E+00
(+5.3E-01) | (+3.1E+01) (+1.6E+00) (+1.7E+00)
|
sugar +0.0E+00 +0.0E+00 +2.0E-02 | sugar -4.0E-01 +0.0E+00 +4.6E-02 +0.0E+00
(+6.1E-03) | (+2.4E-02) (+9.2E-03)
|
mushy +0.0E+00 +0.0E+00 +0.0E+00 -6.6E-02 | mushy +1.1E+00 +0.0E+00 -1.1E+00 +0.0E+00
(+8.3E-02) | (+1.9E-01) (+1.1E-01)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-5.2E+01
(+2.2E+00)
==========
Estimated Micro Moments:
==============================================================================================================================
Observed Estimated Difference Moment Part Dataset Observations Markets
-------- --------- ---------- ----------------------- ----------------------- -------------------- ------------ -------
-2.8E-02 -2.3E-02 -5.5E-03 Score for parameter #0 Score for parameter #0 Simulated micro data 1000 1
+6.4E-04 +9.8E-04 -3.4E-04 Score for parameter #1 Score for parameter #1 Simulated micro data 1000 1
+1.1E-01 +1.5E-01 -3.4E-02 Score for parameter #2 Score for parameter #2 Simulated micro data 1000 1
-2.9E-04 -3.6E-04 +7.0E-05 Score for parameter #3 Score for parameter #3 Simulated micro data 1000 1
-2.0E-03 +2.1E-02 -2.3E-02 Score for parameter #4 Score for parameter #4 Simulated micro data 1000 1
-3.6E-02 -3.0E-02 -6.3E-03 Score for parameter #5 Score for parameter #5 Simulated micro data 1000 1
-2.5E-04 +2.1E-03 -2.4E-03 Score for parameter #6 Score for parameter #6 Simulated micro data 1000 1
-6.8E-03 +4.0E-02 -4.7E-02 Score for parameter #7 Score for parameter #7 Simulated micro data 1000 1
+1.7E-03 +1.0E-03 +6.8E-04 Score for parameter #8 Score for parameter #8 Simulated micro data 1000 1
+5.5E-02 +1.8E-01 -1.3E-01 Score for parameter #9 Score for parameter #9 Simulated micro data 1000 1
-4.8E-01 -4.7E-01 -1.0E-02 Score for parameter #10 Score for parameter #10 Simulated micro data 1000 1
-2.9E-03 +8.1E-03 -1.1E-02 Score for parameter #11 Score for parameter #11 Simulated micro data 1000 1
-2.6E-02 -5.5E-03 -2.0E-02 Score for parameter #12 Score for parameter #12 Simulated micro data 1000 1
==============================================================================================================================
Results are fairly similar to before because we simulated the micro data from our first-stage estimate of \(\theta\), which was somewhat close to our second-stage estimate. Scores are not matched perfectly because the model is now over-identified, with two times as many moments as there are parameters (one optimal IV and one optimal micro moment for each nonlinear parameter).
Download the Jupyter Notebook for this section: simulation.ipynb
Problem Simulation Tutorial¶
[1]:
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'
Before configuring and solving a problem with real data, it may be a good idea to perform Monte Carlo analysis on simulated data to verify that it is possible to accurately estimate model parameters. For example, before configuring and solving the example problems in the prior tutorials, it may have been a good idea to simulate data according to the assumed models of supply and demand. During such Monte Carlo analysis, the data would only be used to determine sample sizes and perhaps to choose reasonable true parameters.
Simulations are configured with the Simulation class, which requires many of the same inputs as Problem. The two main differences are:
Variables in formulations that cannot be loaded from product_data or agent_data will be drawn from independent uniform distributions.
True parameters and the distribution of unobserved product characteristics are specified.
First, we’ll use build_id_data to build market and firm IDs for a model in which there are \(T = 50\) markets, and in each market \(t\), a total of \(J_t = 20\) products produced by \(F = 10\) firms.
[2]:
id_data = pyblp.build_id_data(T=50, J=20, F=10)
Next, we’ll create an Integration configuration to build agent data according to a Gauss-Hermite product rule that exactly integrates polynomials of degree \(2 \times 9 - 1 = 17\) or less.
[3]:
integration = pyblp.Integration('product', 9)
integration
[3]:
Configured to construct nodes and weights according to the level-9 Gauss-Hermite product rule with options {}.
We’ll then pass these data to Simulation. We’ll use Formulation configurations to create an \(X_1\) that consists of a constant, prices, and an exogenous characteristic; an \(X_2\) that consists only of the same exogenous characteristic; and an \(X_3\) that consists of the common exogenous characteristic and a cost-shifter.
[4]:
simulation = pyblp.Simulation(
product_formulations=(
pyblp.Formulation('1 + prices + x'),
pyblp.Formulation('0 + x'),
pyblp.Formulation('0 + x + z')
),
beta=[1, -2, 2],
sigma=1,
gamma=[1, 4],
product_data=id_data,
integration=integration,
seed=0
)
simulation
[4]:
Dimensions:
=====================================
T N F I K1 K2 K3
--- ---- --- --- ---- ---- ----
50 1000 10 450 3 1 2
=====================================
Formulations:
=================================================
Column Indices: 0 1 2
------------------------------- --- ------ ---
X1: Linear Characteristics 1 prices x
X2: Nonlinear Characteristics x
X3: Linear Cost Characteristics x z
=================================================
Nonlinear Coefficient True Values:
================
Sigma: x
------ --------
x +1.0E+00
================
Beta True Values:
============================
1 prices x
-------- -------- --------
+1.0E+00 -2.0E+00 +2.0E+00
============================
Gamma True Values:
==================
x z
-------- --------
+1.0E+00 +4.0E+00
==================
When Simulation is initialized, it constructs Simulation.agent_data and simulates Simulation.product_data.
The Simulation can be further configured with other arguments that determine how unobserved product characteristics are simulated and how marginal costs are specified.
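For example, a sketch of one such configuration follows. The keyword names xi_variance, omega_variance, correlation, and costs_type come from the Simulation API reference, and the specific values are placeholders rather than recommendations.

# Hypothetical alternative: more variable unobserved characteristics that are less
# correlated across demand and supply, and log-linear instead of linear marginal costs.
alternative_simulation = pyblp.Simulation(
    product_formulations=(
        pyblp.Formulation('1 + prices + x'),
        pyblp.Formulation('0 + x'),
        pyblp.Formulation('0 + x + z')
    ),
    beta=[1, -2, 2],
    sigma=1,
    gamma=[1, 4],
    product_data=id_data,
    integration=integration,
    xi_variance=2,
    omega_variance=2,
    correlation=0.8,
    costs_type='log',
    seed=0
)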
At this stage, simulated variables are not consistent with true parameters, so we still need to solve the simulation with Simulation.replace_endogenous. This method replaces simulated prices and market shares with values that are consistent with the true parameters. Just like ProblemResults.compute_prices, to do so it iterates over the \(\zeta\)-markup equation from Morrow and Skerlos (2011).
[5]:
simulation_results = simulation.replace_endogenous()
simulation_results
[5]:
Simulation Results Summary:
======================================================================================================
Computation Fixed Point Fixed Point Contraction Profit Gradients Profit Hessians Profit Hessians
Time Failures Iterations Evaluations Max Norm Min Eigenvalue Max Eigenvalue
----------- ----------- ----------- ----------- ---------------- --------------- ---------------
00:00:01 0 721 721 +1.3E-13 -8.4E-01 -9.6E-06
======================================================================================================
Now, we can try to recover the true parameters by creating and solving a Problem.
The convenience method SimulationResults.to_problem constructs some basic “sums of characteristics” BLP instruments that are functions of all exogenous numerical variables in the problem. In this example, excluded demand-side instruments are the cost-shifter z and traditional BLP instruments constructed from x. Excluded supply-side instruments are traditional BLP instruments constructed from x and z.
[6]:
problem = simulation_results.to_problem()
problem
[6]:
Dimensions:
=================================================
T N F I K1 K2 K3 MD MS
--- ---- --- --- ---- ---- ---- ---- ----
50 1000 10 450 3 1 2 5 6
=================================================
Formulations:
=================================================
Column Indices: 0 1 2
------------------------------- --- ------ ---
X1: Linear Characteristics 1 prices x
X2: Nonlinear Characteristics x
X3: Linear Cost Characteristics x z
=================================================
We’ll choose starting values that are half the true parameters so that the optimization routine has to do some work. Note that since we’re jointly estimating the supply side, we need to provide an initial value for the linear coefficient on prices because this parameter cannot be concentrated out of the problem (unlike linear coefficients on exogenous characteristics).
[7]:
results = problem.solve(
sigma=0.5 * simulation.sigma,
pi=0.5 * simulation.pi,
beta=[None, 0.5 * simulation.beta[1], None],
optimization=pyblp.Optimization('l-bfgs-b', {'gtol': 1e-5})
)
results
[7]:
Problem Results Summary:
==============================================================================================================
GMM Objective Projected Reduced Hessian Reduced Hessian Clipped Weighting Matrix Covariance Matrix
Step Value Gradient Norm Min Eigenvalue Max Eigenvalue Shares Condition Number Condition Number
---- --------- ------------- --------------- --------------- ------- ---------------- -----------------
2 +6.4E+00 +6.9E-08 +7.2E+00 +3.8E+03 0 +3.7E+04 +1.5E+04
==============================================================================================================
Cumulative Statistics:
===========================================================================
Computation Optimizer Optimization Objective Fixed Point Contraction
Time Converged Iterations Evaluations Iterations Evaluations
----------- --------- ------------ ----------- ----------- -----------
00:00:16 Yes 23 30 8295 26312
===========================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
==================
Sigma: x
------ ----------
x +7.8E-01
(+5.2E-01)
==================
Beta Estimates (Robust SEs in Parentheses):
==================================
1 prices x
---------- ---------- ----------
+9.6E-01 -2.0E+00 +2.1E+00
(+9.3E-02) (+2.4E-02) (+1.4E-01)
==================================
Gamma Estimates (Robust SEs in Parentheses):
======================
x z
---------- ----------
+9.8E-01 +4.0E+00
(+8.7E-02) (+8.5E-02)
======================
The parameters seem to have been estimated reasonably well.
[8]:
np.c_[simulation.beta, results.beta]
[8]:
array([[ 1. , 0.96223514],
[-2. , -2.00792431],
[ 2. , 2.10032015]])
[9]:
np.c_[simulation.gamma, results.gamma]
[9]:
array([[1. , 0.97820624],
[4. , 4.03121577]])
[10]:
np.c_[simulation.sigma, results.sigma]
[10]:
array([[1. , 0.78358853]])
In addition to checking that the configuration for a model based on actual data makes sense, the Simulation class can also be a helpful tool for better understanding under what general conditions BLP models can be accurately estimated. Simulations are also used extensively in pyblp’s test suite.
API Documentation¶
The majority of the package consists of classes, which compartmentalize different aspects of the BLP model. There are some convenience functions as well.
Configuration Classes¶
Various components of the package require configurations for how to approximate integrals, solve fixed point problems, and solve optimization problems. Such configurations are specified with the following classes.
Configuration for designing matrices and absorbing fixed effects.
Configuration for building integration nodes and weights.
Configuration for solving fixed point problems.
Configuration for solving optimization problems.
Custom optimization configurations can be used to help debug optimization, to define non-standard optimization routines, or to add ad-hoc moments to configured problems. They can use various information about optimization progress so far.
Information about the current progress of optimization.
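As a rough sketch of a custom routine, assuming the callable interface described in the Optimization documentation (the method receives initial parameter values, bounds, an objective function that returns an OptimizationProgress object with objective and gradient attributes, and an iteration callback, and it returns final parameter values along with a convergence flag); these details are illustrative, not authoritative:

import numpy as np
from scipy import optimize
import pyblp

def custom_method(initial, bounds, objective_function, iteration_callback):
    """Wrap scipy.optimize.minimize as a custom optimization routine."""
    def wrapper(theta):
        progress = objective_function(theta)  # assumed to return an OptimizationProgress
        iteration_callback()                  # record that a major iteration finished
        objective = np.asarray(progress.objective).item()
        gradient = np.asarray(progress.gradient).flatten()
        return objective, gradient

    result = optimize.minimize(wrapper, initial, jac=True, bounds=bounds, method='L-BFGS-B')
    return result.x, result.success

custom_optimization = pyblp.Optimization(custom_method)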
Data Manipulation Functions¶
There are also a number of convenience functions that can be used to construct common components of product and agent data, or manipulate other PyBLP objects.
Construct a matrix according to a formulation.
Construct “sums of characteristics” excluded BLP instruments.
Construct excluded differentiation instruments.
Build a balanced panel of market and firm IDs.
Build ownership matrices, \(O\).
Build nodes and weights for integration over agent choice probabilities.
Convert a NumPy record array into a dictionary.
Save an object as a pickle file.
Load a pickled object into memory.
Problem Class¶
Given data and appropriate configurations, a BLP-type problem can be structured by initializing the following class.
A BLP-type problem.
Once initialized, the following method solves the problem.
Solve the problem.
Micro Moment Classes¶
Micro dataset configurations are passed to micro part configurations, which are passed to micro moment configurations, which in turn can be passed to Problem.solve().
Configuration for a micro dataset \(d\) on which micro moments are computed.
Configuration for a micro moment part \(p\).
Configuration for a micro moment \(m\).
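A schematic sketch of this nesting, using the same constructor arguments that appear in the tutorials (the names, the trivial weights and values functions, and the observed value 0.5 are placeholders):

import numpy as np
import pyblp

# A dataset feeds parts, parts feed moments, and moments are passed to Problem.solve.
example_dataset = pyblp.MicroDataset(
    name="Example survey",
    observations=1_000,
    compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
)
example_part = pyblp.MicroPart(
    name="Example part",
    dataset=example_dataset,
    compute_values=lambda t, p, a: np.ones((a.size, 1 + p.size)),
)
example_moment = pyblp.MicroMoment(name="Example moment", value=0.5, parts=example_part)
# results = problem.solve(..., micro_moments=[example_moment])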
Problem Results Class¶
Solved problems return the following results class.
Results of a solved BLP problem.
The results can be pickled or converted into a dictionary.
Save these results as a pickle file.
Convert these results into a dictionary that maps attribute names to values.
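For example (the file name is arbitrary):

# Persist the results to disk, reload them later, and convert them to a dictionary.
results.to_pickle('results.pkl')
reloaded_results = pyblp.read_pickle('results.pkl')
results_dictionary = results.to_dict()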
The following methods test the validity of overidentifying and model restrictions.
Test the validity of overidentifying restrictions with the Hansen \(J\) test.
Test the validity of model restrictions with the distance test.
Test the validity of model restrictions with the Lagrange multiplier test.
Test the validity of model restrictions with the Wald test.
In addition to class attributes, other post-estimation outputs can be estimated market-by-market with the following methods, each of which returns an array.
Estimate aggregate elasticities of demand, \(\mathscr{E}\), with respect to a variable, \(x\).
Estimate matrices of elasticities of demand, \(\varepsilon\), with respect to a variable, \(x\).
Estimate matrices of derivatives of demand with respect to a variable, \(x\).
Estimate arrays of second derivatives of demand with respect to a variable, \(x\).
Estimate arrays of second derivatives of profits with respect to prices.
Estimate matrices of diversion ratios, \(\mathscr{D}\), with respect to a variable, \(x\).
Estimate matrices of long-run diversion ratios, \(\bar{\mathscr{D}}\).
Estimate matrices of choice probabilities.
Extract diagonals from stacked \(J_t \times J_t\) matrices for each market \(t\).
Extract means of diagonals from stacked \(J_t \times J_t\) matrices for each market \(t\).
Estimate mean utilities, \(\delta\).
Estimate marginal costs, \(c\).
Estimate matrices of passthrough of marginal costs to equilibrium prices, \(\Upsilon\).
Approximate equilibrium prices after firm or cost changes, \(p^*\), under the assumption that shares and their price derivatives are unaffected by such changes.
Estimate equilibrium prices after firm or cost changes, \(p^*\).
Estimate shares.
Estimate Herfindahl-Hirschman Indices, \(\text{HHI}\).
Estimate markups, \(\mathscr{M}\).
Estimate population-normalized gross expected profits, \(\pi\).
Estimate population-normalized consumer surpluses, \(\text{CS}\).
A parametric bootstrap can be used, for example, to compute standard errors for post-estimation outputs. The following method returns a results class with the same methods as in the list directly above, each of which returns a distribution of post-estimation outputs corresponding to different bootstrapped samples.
Use a parametric bootstrap to create an empirical distribution of results.
Optimal instruments, which also return a results class instead of an array, can be estimated with the following method.
Estimate feasible optimal or efficient instruments, \(Z_D^\text{opt}\) and \(Z_S^\text{opt}\).
Importance sampling can be used to create new integration nodes and weights. Its method also returns a results class.
Use importance sampling to construct nodes and weights for integration.
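A minimal sketch of its usage (the number of draws and the seed are placeholders; see the method’s reference for its full set of arguments):

# Construct new integration nodes and weights with importance sampling, then
# re-create the problem with the resulting agent data.
sampling_results = results.importance_sampling(draws=500, seed=0)
sampling_problem = sampling_results.to_problem()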
The following methods can compute micro moment values, compute scores from micro data, or simulate such data.
Estimate micro moment values, \(f_m(v)\).
Compute scores for observations \(n \in N_d\) from a micro dataset \(d\).
Compute scores for all agent-choices, treated as observations \(n \in N_d\) from a micro dataset \(d\).
Simulate observations \(n \in N_d\) from a micro dataset \(d\).
Bootstrapped Problem Results Class¶
Parametric bootstrap computation returns the following class.
Bootstrapped results of a solved problem.
This class has many of the same methods as ProblemResults. It can also be pickled or converted into a dictionary.
Save these results as a pickle file.
Convert these results into a dictionary that maps attribute names to values.
Optimal Instrument Results Class¶
Optimal instrument computation returns the following results class.
Results of optimal instrument computation.
The results can be pickled or converted into a dictionary.
Save these results as a pickle file.
Convert these results into a dictionary that maps attribute names to values.
They can also be converted into a Problem with the following method.
Re-create the problem with estimated feasible optimal instruments.
This method returns the following class, which behaves exactly like a Problem.
A BLP problem updated with optimal excluded instruments.
Importance Sampling Results Class¶
Importance sampling returns the following results class:
Results of importance sampling.
The results can be pickled or converted into a dictionary.
Save these results as a pickle file.
Convert these results into a dictionary that maps attribute names to values.
They can also be converted into a Problem with the following method.
Re-create the problem with the agent data constructed from importance sampling.
This method returns the following class, which behaves exactly like a Problem.
A BLP problem updated after importance sampling.
Simulation Class¶
The following class allows for evaluation of more complicated counterfactuals than is possible with ProblemResults methods, or for simulation of synthetic data from scratch.
Simulation of data in BLP-type models.
Once initialized, the following method replaces prices and shares with equilibrium values that are consistent with true parameters.
Replace simulated prices and market shares with equilibrium values that are consistent with true parameters.
A less common way to solve the simulation is to assume simulated prices and shares represent an equilibrium and to replace exogenous variables instead.
Replace exogenous product characteristics with values that are consistent with true parameters.
Simulation Results Class¶
Solved simulations return the following results class.
Results of a solved simulation of synthetic BLP data.
This class has many of the same methods as ProblemResults. It can also be pickled or converted into a dictionary.
Save these results as a pickle file.
Convert these results into a dictionary that maps attribute names to values.
It can also be converted into a Problem with the following method.
Convert the solved simulation into a problem.
Structured Data Classes¶
Product and agent data that are passed or constructed by Problem and Simulation are structured internally into classes with field names that more closely resemble BLP notation. Although these structured data classes are not directly constructable, they can be accessed with Problem and Simulation class attributes. It can be helpful to compare these structured data classes with the data or configurations used to create them.
Product data structured as a record array.
Agent data structured as a record array.
Multiprocessing¶
A context manager can be used to enable parallel processing for methods that perform market-by-market computation.
Context manager used for parallel processing in a with statement context.
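For example, a sketch that spreads market-by-market computation across two processes (assuming a solved results object like those in the tutorials):

import pyblp

# Methods called inside the context split their market-by-market work across processes.
with pyblp.parallel(2):
    elasticities = results.compute_elasticities()
    costs = results.compute_costs()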
Options and Example Data¶
In addition to classes and functions, there are also two modules that can be used to configure global package options and locate example data that comes with the package.
Global options.
Locations of example data that are included in the package for convenience.
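For example:

import pandas as pd
import pyblp

# Show more digits in formatted output and turn off verbose estimation logging.
pyblp.options.digits = 3
pyblp.options.verbose = False

# Load one of the example datasets that ships with the package.
nevo_products = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)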
Exceptions¶
When errors occur, they will either be displayed as warnings or raised as exceptions.
Multiple errors that occurred around the same time.
Encountered nonpositive marginal costs in a log-linear specification.
Encountered nonpositive synthetic marginal costs in a log-linear specification.
Failed to compute standard errors because of invalid estimated covariances of GMM parameters.
Failed to compute a weighting matrix because of invalid estimated covariances of GMM moments.
Encountered a numerical error.
Encountered a numerical error when computing \(\delta\).
Encountered a numerical error when computing marginal costs.
Encountered a numerical error when computing micro moments.
Encountered a numerical error when computing the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) with respect to \(\theta\).
Encountered a numerical error when computing the Jacobian (holding \(\gamma\) fixed) of \(\omega\) (equivalently, of transformed marginal costs) with respect to \(\theta\).
Encountered a numerical error when computing the Jacobian of micro moments with respect to \(\theta\).
Encountered a numerical error when computing micro moment covariances.
Encountered a numerical error when computing synthetic prices.
Encountered a numerical error when computing synthetic shares.
Encountered a numerical error when computing the synthetic \(\delta\).
Encountered a numerical error when computing synthetic marginal costs.
Encountered a numerical error when computing synthetic micro data.
Encountered a numerical error when computing synthetic micro moments.
Encountered a numerical error when computing micro scores.
Encountered a numerical error when solving for a realization of equilibrium prices and shares.
Encountered a numerical error when computing a realization of the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) or \(\omega\) (equivalently, of transformed marginal costs) with respect to \(\theta\).
Encountered a numerical error when computing a post-estimation output.
A fixed effect absorption procedure failed to properly absorb fixed effects.
Shares were clipped during the final iteration of the fixed point routine for computing \(\delta\).
The optimization routine failed to converge.
The fixed point computation of \(\delta\) failed to converge.
The fixed point computation of synthetic prices failed to converge.
The fixed point computation of the synthetic \(\delta\) failed to converge.
The fixed point computation of equilibrium prices failed to converge.
Reverted a problematic GMM objective value.
Reverted problematic elements in the GMM objective gradient.
Reverted problematic elements in \(\delta\).
Reverted problematic marginal costs.
Reverted problematic micro moments.
Reverted problematic elements in the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) with respect to \(\theta\).
Reverted problematic elements in the Jacobian (holding \(\gamma\) fixed) of \(\omega\) (equivalently, of transformed marginal costs) with respect to \(\theta\).
Reverted problematic elements in the Jacobian of micro moments with respect to \(\theta\).
Failed to compute eigenvalues for the GMM objective’s (reduced) Hessian matrix.
Failed to compute eigenvalues for a firm’s profit Hessian.
Failed to invert an estimated covariance when computing fitted values.
Failed to invert a Jacobian of shares with respect to \(\xi\) when computing the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) with respect to \(\theta\).
Failed to invert an intra-firm Jacobian of shares with respect to prices.
Failed to invert the matrix to recover the passthrough matrix.
Failed to invert an estimated covariance matrix of linear parameters.
Failed to invert an estimated covariance matrix of GMM parameters.
Failed to invert an estimated covariance matrix of GMM moments.
References¶
This page contains a full list of references cited in the documentation, including the original work of Berry, Levinsohn, and Pakes (1995). If you use PyBLP in your research, we ask that you also cite Conlon and Gortmaker (2020) below, which describes the advances implemented in the package. If you use PyBLP’s micro moments functionality, we ask that you also cite Conlon and Gortmaker (2023), which describes the standardized framework implemented by PyBLP for incorporating micro data into BLP-style estimation.
Conlon and Gortmaker (2020)¶
Conlon, Christopher, and Jeff Gortmaker (2020). Best practices for differentiated products demand estimation with PyBLP. RAND Journal of Economics, 51 (4), 1108-1161.
Conlon and Gortmaker (2023)¶
Conlon, Christopher, and Jeff Gortmaker (2023). Incorporating micro data into differentiated products demand estimation with PyBLP. Working paper.
Other References¶
Amemiya (1977)¶
Amemiya, Takeshi (1977). A note on a heteroscedastic model. Journal of Econometrics, 6 (3), 365-370.
Andrews, Gentzkow, and Shapiro (2017)¶
Andrews, Isaiah, Matthew Gentzkow, and Jesse M. Shapiro (2017). Measuring the sensitivity of parameter estimates to estimation moments. Quarterly Journal of Economics, 132 (4), 1553-1592.
Armstrong (2016)¶
Armstrong, Timothy B. (2016). Large market asymptotics for differentiated product demand estimators with economic models of supply. Econometrica, 84 (5), 1961-1980.
Berry (1994)¶
Berry, Steven (1994). Estimating discrete-choice models of product differentiation. RAND Journal of Economics, 25 (2), 242-262.
Berry, Levinsohn, and Pakes (1995)¶
Berry, Steven, James Levinsohn, and Ariel Pakes (1995). Automobile prices in market equilibrium. Econometrica, 63 (4), 841-890.
Berry, Levinsohn, and Pakes (1999)¶
Berry, Steven, James Levinsohn, and Ariel Pakes (1999). Voluntary export restraints on automobiles: Evaluating a trade policy. American Economic Review, 89 (3), 400-430.
Berry, Levinsohn, and Pakes (2004)¶
Berry, Steven, James Levinsohn, and Ariel Pakes (2004). Differentiated products demand systems from a combination of micro and macro data: The new car market. Journal of Political Economy, 112 (1), 68-105.
Berry and Pakes (2007)¶
Berry, Steven, and Ariel Pakes (2007). The pure characteristics demand model. International Economic Review, 48 (4), 1193-1225.
Brenkers and Verboven (2006)¶
Brenkers, Randy, and Frank Verboven (2006). Liberalizing a distribution system: The European car market. Journal of the European Economic Association, 4 (1), 216-251.
Brunner, Heiss, Romahn, and Weiser (2017)¶
Brunner, Daniel, Florian Heiss, André Romahn, and Constantin Weiser (2017). Reliable estimation of random coefficient logit demand models. DICE Discussion Paper 267.
Cardell (1997)¶
Cardell, N. Scott (1997). Variance components structures for the extreme-value and logistic distributions with application to models of heterogeneity. Econometric Theory, 13 (2), 185-213.
Chamberlain (1987)¶
Chamberlain, Gary (1987). Asymptotic efficiency in estimation with conditional moment restrictions. Journal of Econometrics, 34 (3), 305-334.
Conlon and Mortimer (2018)¶
Conlon, Christopher T., and Julie H. Mortimer (2018). Empirical properties of diversion ratios. NBER working paper 24816.
Frisch and Waugh (1933)¶
Frisch, Ragnar, and Frederick V. Waugh (1933). Partial time regressions as compared with individual trends. Econometrica, 1 (4), 387-401.
Gandhi and Houde (2017)¶
Gandhi, Amit, and Jean-François Houde (2017). Measuring substitution patterns in differentiated products industries. Working paper.
Grigolon and Verboven (2014)¶
Grigolon, Laura, and Frank Verboven (2014). Nested logit or random coefficients logit? A comparison of alternative discrete choice models of product differentiation. Review of Economics and Statistics, 96 (5), 916-935.
Hausman, Leonard, and Zona (1994)¶
Hausman, Jerry, Gregory Leonard, and J. Douglas Zona (1994). Competitive analysis with differentiated products. Annals of Economics and Statistics, 34, 143-157.
Hansen (1982)¶
Hansen, Lars Peter (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50 (4), 1029-1054.
Heiss and Winschel (2008)¶
Heiss, Florian, and Viktor Winschel (2008). Likelihood approximation by numerical integration on sparse grids. Journal of Econometrics, 144 (1), 62-80.
Hess, Train, and Polak (2004)¶
Hess, Stephane, Kenneth E. Train, and John W. Polak (2004). On the use of a Modified Latin Hypercube Sampling (MLHS) method in the estimation of a mixed logit model for vehicle choice. Transportation Research Part B (40), 147-167.
Imbens and Lancaster (1994)¶
Imbens, Guido W., and Tony Lancaster (1994). Combining micro and macro data in microeconometric models. Review of Economic Studies, 61 (4), 655-680.
Judd and Skrainka (2011)¶
Judd, Kenneth L., and Ben Skrainka (2011). High performance quadrature rules: How numerical integration affects a popular model of product differentiation. CeMMAP working paper CWP03/11.
Knittel and Metaxoglou (2014)¶
Knittel, Christopher R., and Konstantinos Metaxoglou (2014). Estimation of random-coefficient demand models: Two empiricists’ perspective. Review of Economics and Statistics, 96 (1), 34-59.
Lovell (1963)¶
Lovell, Michael C. (1963). Seasonal adjustment of economic time series and multiple regression analysis. Journal of the American Statistical Association, 58 (304), 993-1010.
MacKay and Miller (2023)¶
MacKay, Alexander, and Nathan Miller (2023). Estimating models of supply and demand: Instruments and covariance restrictions. HBS working paper 19-051.
Morrow and Skerlos (2011)¶
Morrow, W. Ross, and Steven J. Skerlos (2011). Fixed-point approaches to computing Bertrand-Nash equilibrium prices under mixed-logit demand. Operations Research, 59 (2), 328-345.
Nevo (2000a)¶
Nevo, Aviv (2000). A practitioner’s guide to estimation of random-coefficients logit models of demand. Journal of Economics & Management Strategy, 9 (4), 513-548.
Nevo (2000b)¶
Nevo, Aviv (2000). Mergers with differentiated products: The case of the ready-to-eat cereal industry. RAND Journal of Economics, 31 (3), 395-421.
Newey and West (1987)¶
Newey, Whitney K., and Kenneth D. West (1987). Hypothesis testing with efficient method of moments estimation. International Economic Review, 28 (3), 777-787.
Owen (2013)¶
Owen, Art B. (2013). Monte Carlo theory, methods and examples.
Owen (2017)¶
Owen, Art B. (2017). A randomized Halton algorithm in R.
Petrin (2002)¶
Petrin, Amil (2002). Quantifying the benefits of new products: The case of the minivan. Journal of Political Economy, 110 (4), 705-729.
Reynaert and Verboven (2014)¶
Reynaert, Mathias, and Frank Verboven (2014). Improving the performance of random coefficients demand models: The role of optimal instruments. Journal of Econometrics, 179 (1), 83-98.
Reynaerts, Varadhan, and Nash (2012)¶
Reynaerts, Jo, Ravi Varadhan, and John C. Nash (2012). Enhancing the convergence properties of the BLP (1995) contraction mapping. VIVES discussion paper 35.
Varadhan and Roland (2008)¶
Varadhan, Ravi, and Christophe Roland (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scandinavian Journal of Statistics, 35 (2), 335-353.
Werden (1997)¶
Werden, Gregory J. (1997). Simulating the effects of differentiated products mergers: A practitioners’ guide. Economic Analysis Group, Proceedings of the NE-165 Conference, Washington, D.C., June 20–21, 1996.
Legal¶
Copyright 2021 Jeff Gortmaker and Christopher Conlon
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Contributing¶
Please use the GitHub issue tracker to report bugs or to request features. Contributions are welcome. Examples include:
Code optimizations.
Documentation improvements.
Alternate formulations that have been implemented in the literature but not in PyBLP.
Testing¶
Testing is done with the tox automation tool, which runs a pytest-backed test suite in the tests/ directory.
Testing Requirements¶
In addition to the installation requirements for the package itself, running tests and building documentation requires additional packages specified by the tests and docs extras in setup.py, along with any other explicitly specified deps in tox.ini.
The full suite of tests also requires installation of the following software:
Artleys Knitro version 10.3 or newer: for testing optimization routines.
MATLAB: for comparing sparse grids with those created by the nwspgr function of Florian Heiss and Viktor Winschel, which must be included in a directory on the MATLAB path.
R: for simulating nested logit errors with the evd package by Alec Stephenson, which must be installed.
If software is not installed, its associated tests will be skipped. Additionally, some tests that require support for extended precision will be skipped if, on the platform running the tests, numpy.longdouble has the same precision as numpy.float64. This tends to be the case on Windows.
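A quick way to check whether this applies to your platform is to compare the machine epsilon of the two types. The following is a minimal sketch using standard NumPy functionality; it is not part of the test suite itself:
import numpy as np

# Compare machine epsilon of the two types. If they are equal, extended
# precision is unavailable and the corresponding tests will be skipped.
extended_precision = np.finfo(np.longdouble).eps < np.finfo(np.float64).eps
print(f"Extended precision available: {extended_precision}")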
Running Tests¶
Defined in tox.ini are environments that test the package under different Python versions, check types, enforce style guidelines, verify the integrity of the documentation, and release the package. First, tox should be installed on top of an Anaconda installation. The following command can be run in the top-level pyblp directory to run all testing environments:
tox
You can choose to run only one environment, such as the one that builds the documentation, with the -e flag:
tox -e docs
Test Organization¶
Fixtures, which are defined in tests.conftest, configure the testing environment and simulate problems according to a range of specifications.
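To illustrate the pattern only (the names and contents below are hypothetical, not the actual fixtures in tests.conftest), a pytest fixture and a test that consumes it might look like the following sketch:
import pytest

# Hypothetical sketch of the pytest fixture pattern. The real fixtures in
# tests.conftest simulate full BLP problems under various specifications.
@pytest.fixture
def simulated_problem():
    return {"markets": 20, "products_per_market": 10}

def test_simulation_has_markets(simulated_problem):
    assert simulated_problem["markets"] > 0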
Most BLP-specific tests in tests.test_blp verify properties about results obtained by solving the simulated problems under various parameterizations. Examples include:
Reasonable formulations of problems should give rise to estimated parameters that are close to their true values.
Cosmetic changes such as the number of processes should not change estimates.
Post-estimation outputs should satisfy certain properties.
Optimization routines should behave as expected.
Derivatives computed with finite differences should approach analytic derivatives (a stylized illustration of this kind of check follows the list).
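As a rough sketch of the finite difference check mentioned above, applied to a simple function rather than to the derivatives the package actually computes:
import numpy as np

# Hypothetical sketch: a central finite difference should approach the
# analytic derivative of a simple test function. The real tests apply the
# same idea to gradients and other derivatives computed by the package.
def f(x):
    return np.exp(x) * np.sin(x)

def analytic_derivative(x):
    return np.exp(x) * (np.sin(x) + np.cos(x))

x = 1.2
h = 1e-6
finite_difference = (f(x + h) - f(x - h)) / (2 * h)
np.testing.assert_allclose(finite_difference, analytic_derivative(x), rtol=1e-6)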
Tests of generic utilities in tests.test_formulation, tests.test_integration, tests.test_iteration, and tests.test_optimization verify that matrix formulation, integral approximation, fixed point iteration, and nonlinear optimization all work as expected. Examples include:
Nonlinear formulas give rise to expected matrices and derivatives.
Gauss-Hermite integrals are better approximated with quadrature based on Gauss-Hermite rules than with Monte Carlo integration (a stylized illustration follows this list).
To solve a fixed point iteration problem for which it was developed, SQUAREM requires fewer fixed point evaluations than does simple iteration.
All optimization routines manage to solve a well-known optimization problem under different parameterizations.
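The quadrature comparison above can be illustrated with a toy example. This sketch approximates a moment of a standard normal variable in two ways; it only assumes standard NumPy routines and is not how the test suite itself compares the package's integration configurations:
import numpy as np

# Approximate E[x^4] = 3 for x ~ N(0, 1) with Gauss-Hermite quadrature and
# with Monte Carlo integration. With 5 nodes, the quadrature is exact for
# polynomials of this degree, while the Monte Carlo estimate is noisy.
nodes, weights = np.polynomial.hermite.hermgauss(5)
quadrature = np.sum(weights * (np.sqrt(2) * nodes) ** 4) / np.sqrt(np.pi)

rng = np.random.default_rng(0)
monte_carlo = np.mean(rng.standard_normal(1_000) ** 4)

print(quadrature)   # essentially 3.0
print(monte_carlo)  # a noisier estimate around 3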
Version Notes¶
These notes only include major changes.
1.1¶
Covariance restrictions
Demographic-specific product availability
1.0¶
Support matching smooth functions of micro means
Optimal micro moments
Support elimination of groups of products for second choices
Micro data simulation
Micro moment tutorials
0.13¶
Overhauled micro moment API
Product-specific demographics
Passthrough calculations
Added problem results methods to simulation results
Profit Hessian computation
Checks of pricing second order conditions
Newton-based methods for computing equilibrium prices
Large speedups for supply-side and micro moment derivatives
Universal display for fixed point iteration progress
Support adjusting for simulation error in moment covariances
0.12¶
Refactored micro moment API
Custom micro moments
Properly scale micro moment covariances
Pickling support
0.11¶
Elasticities and diversion ratios with respect to mean utility
Willingness to pay calculations
0.10¶
Simplify micro moment API
Second choice or diversion micro moments
Add share clipping to make fixed point more robust
Report covariance matrix estimates in addition to Cholesky root
Approximation to the pure characteristics model
Add option to always use finite differences
0.9¶
More control over matrices of instruments
Split off fixed effect absorption into companion package PyHDFE
Scrambled Halton and Modified Latin Hypercube Sampling (MLHS) integration
Importance sampling
Quantity dependent marginal costs
Speed up various matrix construction routines
Option to do initial GMM update at starting values
Update BLP example data to better replicate original paper
Lognormal random coefficients
Removed outdated default parameter bounds
Change default objective scaling for more comparable objective values across problem sizes
Add post-estimation routines to simplify integration error comparison
0.8¶
Micro moments that match product and agent characteristic covariances
Extended use of pseudo-inverses
Added more information to error messages
More flexible simulation interface
Alternative way to simulate data with specified prices and shares
Tests of overidentifying and model restrictions
Report projected gradients and reduced Hessians
Change objective gradient scaling
Switch to a lower-triangular covariance matrix to fix a bug with off-diagonal parameters
0.7¶
Support more fixed point and optimization solvers
Hessian computation with finite differences
Simplified interface for firm changes
Construction of differentiation instruments
Add collinearity checks
Update notation and explanations
0.6¶
Optimal instrument estimation
Structured all results as classes
Additional information in progress reports
Parametric bootstrapping of post-estimation outputs
Replaced all examples in the documentation with Jupyter notebooks
Updated the instruments for the BLP example problem
Improved support for multiple equation GMM
Made concentrating out linear parameters optional
Better support for larger nesting parameters
Improved robustness to overflow
0.5¶
Estimation of nesting parameters
Performance improvements for matrix algebra and matrix construction
Support for Python 3.7
Computation of reasonable default bounds on nonlinear parameters
Additional information in progress updates
Improved error handling and documentation
Simplified multiprocessing interface
Cancelled out delta in the nonlinear contraction to improve performance
Additional example data and improvements to the example problems
Cleaned up covariance estimation
Added type annotations and overhauled the testing suite
0.4¶
Estimation of a Logit benchmark model
Support for fixing of all nonlinear parameters
More efficient two-way fixed effect absorption
Clustered standard errors
0.3¶
Patsy- and SymPy-backed R-style formula API
More informative errors and displays of information
Absorption of arbitrary fixed effects
Reduction of memory footprint
0.2¶
Improved support for longdouble precision
Custom ownership matrices
New benchmarking statistics
Supply-side gradient computation
Improved configuration for the automobile example problem
0.1¶
Initial release