Download a PDF version of the documentation: pyblp.pdf

PyBLP

The documentation can be navigated with the sidebar, the links below, or the index.

Introduction

PyBLP is a Python 3 implementation of routines for estimating the demand for differentiated products with BLP-type random coefficients logit models. This package was created by Jeff Gortmaker in collaboration with Chris Conlon.

Development of the package has been guided by the work of many researchers and practitioners. For a full list of references, including the original work of Berry, Levinsohn, and Pakes (1995), refer to the references section of the documentation.

Citation

If you use PyBLP in your research, we ask that you also cite Conlon and Gortmaker (2020), which describes the advances implemented in the package.

@article{PyBLP,
    author = {Conlon, Christopher and Gortmaker, Jeff},
    title = {Best practices for differentiated products demand estimation with {PyBLP}},
    journal = {The RAND Journal of Economics},
    volume = {51},
    number = {4},
    pages = {1108-1161},
    doi = {https://doi.org/10.1111/1756-2171.12352},
    url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/1756-2171.12352},
    eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1111/1756-2171.12352},
    year = {2020}
}

If you use PyBLP’s micro moments functionality, we ask that you also cite Conlon and Gortmaker (2023), which describes the standardized framework implemented by PyBLP for incorporating micro data into BLP-style estimation.

@misc{MicroPyBLP,
    author = {Conlon, Christopher and Gortmaker, Jeff},
    title = {Incorporating micro data into differentiated products demand estimation with {PyBLP}},
    note = {Working paper},
    year = {2023}
}

Installation

The PyBLP package has been tested on Python versions 3.6 through 3.9. The SciPy instructions for installing related packages are a good guide for how to set up a scientific Python environment. A good choice is the Anaconda Distribution, since it comes packaged with the following PyBLP dependencies: NumPy, SciPy, SymPy, and Patsy. For absorption of high-dimensional fixed effects, PyBLP also depends on its companion package PyHDFE, which will be installed when PyBLP is installed.

However, PyBLP may not work with old versions of its dependencies. You can update PyBLP’s Anaconda dependencies with:

conda update numpy scipy sympy patsy

You can update PyHDFE with:

pip install --upgrade pyhdfe

You can install the current release of PyBLP with pip:

pip install pyblp

You can upgrade to a newer release with the --upgrade flag:

pip install --upgrade pyblp

If you lack permissions, you can install PyBLP in your user directory with the --user flag:

pip install --user pyblp

Alternatively, you can download a wheel or source archive from PyPI. You can find the latest development code on GitHub and the latest development documentation here.

Other Languages

Once installed, PyBLP can be incorporated into projects written in many other languages with the help of various tools that enable interoperability with Python.

For example, the reticulate package makes interacting with PyBLP in R straightforward (when supported, Python objects can be converted to their R counterparts with the py_to_r function; this must be done manually because we set convert=FALSE to avoid errors from automatic conversion of unsupported objects):

library(reticulate)
pyblp <- import("pyblp", convert=FALSE)
pyblp$options$flush_output <- TRUE

Similarly, PyCall can be used to incorporate PyBLP into a Julia workflow:

using PyCall
pyblp = pyimport("pyblp")

The py command serves a similar purpose in MATLAB:

py.pyblp

Features

  • R-style formula interface

  • Bertrand-Nash supply-side moments

  • Multiple equation GMM

  • Demographic interactions

  • Product-specific demographics

  • Consumer-specific product availability

  • Flexible micro moments that can match statistics based on survey data

  • Support for micro moments based on second choice data

  • Support for optimal micro moments that match micro data scores

  • Fixed effect absorption

  • Nonlinear functions of product characteristics

  • Concentrating out linear parameters

  • Flexible random coefficient distributions

  • Parameter bounds and constraints

  • Random coefficients nested logit (RCNL)

  • Approximation to the pure characteristics model

  • Varying nesting parameters across groups

  • Logit and nested logit benchmarks

  • Classic BLP instruments

  • Differentiation instruments

  • Optimal instruments

  • Covariance restrictions

  • Adjustments for simulation error

  • Tests of overidentifying and model restrictions

  • Parametric bootstrapping of post-estimation outputs

  • Elasticities and diversion ratios

  • Marginal costs and markups

  • Passthrough calculations

  • Profits and consumer surplus

  • Newton and fixed point methods for computing pricing equilibria

  • Merger simulation

  • Custom counterfactual simulation

  • Synthetic data construction

  • SciPy or Artleys Knitro optimization

  • Fixed point acceleration

  • Monte Carlo, quasi-random sequences, quadrature, and sparse grids

  • Importance sampling

  • Custom optimization and iteration routines

  • Robust and clustered errors

  • Linear or log-linear marginal costs

  • Partial ownership matrices

  • Analytic gradients

  • Finite difference Hessians

  • Market-by-market parallelization

  • Extended floating point precision

  • Robust error handling

Bugs and Requests

Please use the GitHub issue tracker to submit bugs or to request features.

Notation

The notation in PyBLP is a customized amalgamation of the notation employed by Berry, Levinsohn, and Pakes (1995), Nevo (2000a), Morrow and Skerlos (2011), Grigolon and Verboven (2014), and others.

Indices

\(j\): Products
\(t\): Markets
\(i\): Agents/individuals
\(f\): Firms
\(h\): Nests
\(c\): Clusters
\(m\): Micro moments

Dimensions/Sets

\(T\): Markets
\(N\): Products across all markets
\(F\): Firms across all markets
\(I\): Agents across all markets
\(J_t\): Products in market \(t\)
\(F_t\): Firms in market \(t\)
\(J_{ft}\): Products produced by firm \(f\) in market \(t\)
\(I_t\): Agents in market \(t\)
\(K_1\): Demand-side linear product characteristics
\(K_1^\text{ex}\): Exogenous demand-side linear product characteristics
\(K_1^\text{en}\): Endogenous demand-side linear product characteristics
\(K_2\): Demand-side nonlinear product characteristics
\(K_3\): Supply-side product characteristics
\(K_3^\text{ex}\): Exogenous supply-side product characteristics
\(K_3^\text{en}\): Endogenous supply-side product characteristics
\(D\): Demographic variables
\(M_D\): Demand-side instruments
\(M_S\): Supply-side instruments
\(M_C\): Covariance instruments
\(M_M\): Micro moments
\(T_m\): Markets over which micro moment \(m\) is averaged
\(T_{mn}\): Markets over which micro moments \(m\) and \(n\) are both averaged
\(N_m\): Observations underlying observed micro moment value \(m\)
\(M\): All moments
\(E_D\): Absorbed dimensions of demand-side fixed effects
\(E_S\): Absorbed dimensions of supply-side fixed effects
\(H\): Nesting groups
\(J_{ht}\): Products in nesting group \(h\) and market \(t\)
\(C\): Clusters
\(J_{ct}\): Products in cluster \(c\) and market \(t\)

Matrices, Vectors, and Scalars

\(X_1\) (\(N \times K_1\)): Demand-side linear product characteristics
\(X_1^\text{ex}\) (\(N \times K_1^\text{ex}\)): Exogenous demand-side linear product characteristics
\(X_1^\text{en}\) (\(N \times K_1^\text{en}\)): Endogenous demand-side linear product characteristics
\(X_2\) (\(N \times K_2\)): Demand-side nonlinear product characteristics
\(X_3\) (\(N \times K_3\)): Supply-side product characteristics
\(X_3^\text{ex}\) (\(N \times K_3^\text{ex}\)): Exogenous supply-side product characteristics
\(X_3^\text{en}\) (\(N \times K_3^\text{en}\)): Endogenous supply-side product characteristics
\(\xi\) (\(N \times 1\)): Unobserved demand-side product characteristics
\(\omega\) (\(N \times 1\)): Unobserved supply-side product characteristics
\(p\) (\(N \times 1\)): Prices
\(s\) (\(s_{jt}\), \(N \times 1\)): Market shares
\(s\) (\(s_{ht}\), \(H \times 1\)): Group shares in a market \(t\)
\(s\) (\(s_{ijt}\), \(N \times I_t\)): Choice probabilities in a market \(t\)
\(c\) (\(N \times 1\)): Marginal costs
\(\tilde{c}\) (\(N \times 1\)): Linear or log-linear marginal costs, \(c\) or \(\log c\)
\(\eta\) (\(N \times 1\)): Markup term from the BLP-markup equation
\(\zeta\) (\(N \times 1\)): Markup term from the \(\zeta\)-markup equation
\(\mathscr{H}\) (\(J_t \times J_t\)): Ownership or product holdings matrix in market \(t\)
\(\kappa\) (\(F_t \times F_t\)): Cooperation matrix in market \(t\)
\(\Delta\) (\(J_t \times J_t\)): Intra-firm matrix of (negative, transposed) demand derivatives in market \(t\)
\(\Lambda\) (\(J_t \times J_t\)): Diagonal matrix used to decompose \(\eta\) and \(\zeta\) in market \(t\)
\(\Gamma\) (\(J_t \times J_t\)): Another matrix used to decompose \(\eta\) and \(\zeta\) in market \(t\)
\(d\) (\(I_t \times D\)): Observed agent characteristics called demographics in market \(t\)
\(\nu\) (\(I_t \times K_2\)): Unobserved agent characteristics called integration nodes in market \(t\)
\(a\) (\(I_t \times J_t\)): Agent-specific product availability in market \(t\)
\(w\) (\(I_t \times 1\)): Integration weights in market \(t\)
\(\delta\) (\(N \times 1\)): Mean utility
\(\mu\) (\(J_t \times I_t\)): Agent-specific portion of utility in market \(t\)
\(\epsilon\) (\(N \times 1\)): Type I Extreme Value idiosyncratic preferences
\(\bar{\epsilon}\) (\(\bar{\epsilon}_{ijt}\), \(N \times 1\)): Type I Extreme Value term used to decompose \(\epsilon\)
\(\bar{\epsilon}\) (\(\bar{\epsilon}_{iht}\), \(N \times 1\)): Group-specific term used to decompose \(\epsilon\)
\(U\) (\(J_t \times I_t\)): Indirect utilities
\(V\) (\(V_{ijt}\), \(J_t \times I_t\)): Indirect utilities minus \(\epsilon\)
\(V\) (\(V_{iht}\), \(J_t \times I_t\)): Inclusive value of a nesting group
\(\pi\) (\(\pi_{jt}\), \(N \times 1\)): Population-normalized gross expected profits
\(\pi\) (\(\pi_{ft}\), \(F_t \times 1\)): Population-normalized gross expected profits of a firm in market \(t\)
\(\beta\) (\(K_1 \times 1\)): Demand-side linear parameters
\(\beta^\text{ex}\) (\(K_1^\text{ex} \times 1\)): Parameters in \(\beta\) on exogenous product characteristics
\(\alpha\) (\(K_1^\text{en} \times 1\)): Parameters in \(\beta\) on endogenous product characteristics
\(\Sigma\) (\(K_2 \times K_2\)): Cholesky root of the covariance matrix for unobserved taste heterogeneity
\(\Pi\) (\(K_2 \times D\)): Parameters that measure how agent tastes vary with demographics
\(\rho\) (\(H \times 1\)): Parameters that measure within nesting group correlation
\(\gamma\) (\(K_3 \times 1\)): Supply-side linear parameters
\(\gamma^\text{ex}\) (\(K_3^\text{ex} \times 1\)): Parameters in \(\gamma\) on exogenous product characteristics
\(\gamma^\text{en}\) (\(K_3^\text{en} \times 1\)): Parameters in \(\gamma\) on endogenous product characteristics
\(\theta\) (\(P \times 1\)): Parameters
\(Z_D\) (\(N \times M_D\)): Demand-side instruments
\(Z_S\) (\(N \times M_S\)): Supply-side instruments
\(Z_C\) (\(N \times M_C\)): Covariance instruments
\(W\) (\(M \times M\)): Weighting matrix
\(S\) (\(M \times M\)): Moment covariances
\(q\) (\(1 \times 1\)): Objective value
\(g_D\) (\(N \times M_D\)): Demand-side moments
\(g_S\) (\(N \times M_S\)): Supply-side moments
\(g_C\) (\(N \times M_C\)): Covariance moments
\(g_M\) (\(I \times M_M\)): Micro moments
\(g\) (\(g_{jt}\), \(N \times (M_D + M_S + M_C)\)): Demand-side, supply-side, and covariance moments
\(g\) (\(g_c\), \(C \times (M_D + M_S + M_C)\)): Clustered demand-side, supply-side, and covariance moments
\(\bar{g}_D\) (\(M_D \times 1\)): Averaged demand-side moments
\(\bar{g}_S\) (\(M_S \times 1\)): Averaged supply-side moments
\(\bar{g}_C\) (\(M_C \times 1\)): Averaged covariance moments
\(\bar{g}_M\) (\(M_M \times 1\)): Averaged micro moments
\(\bar{g}\) (\(M \times 1\)): Averaged moments
\(\bar{G}\) (\(M \times P\)): Jacobian of the averaged moments with respect to \(\theta\)
\(\varepsilon\) (\(J_t \times J_t\)): Elasticities of demand in market \(t\)
\(\mathscr{D}\) (\(J_t \times J_t\)): Diversion ratios in market \(t\)
\(\bar{\mathscr{D}}\) (\(J_t \times J_t\)): Long-run diversion ratios in market \(t\)
\(\mathscr{M}\) (\(N \times 1\)): Markups
\(\mathscr{E}\) (\(1 \times 1\)): Aggregate elasticity of demand of a market
\(\text{CS}\) (\(1 \times 1\)): Population-normalized consumer surplus of a market
\(\text{HHI}\) (\(1 \times 1\)): Herfindahl-Hirschman Index of a market

Background

The following sections provide a very brief overview of the BLP model and how it is estimated. The goal is to concisely introduce the notation and terminology used throughout the rest of the documentation. For a more in-depth overview, refer to Conlon and Gortmaker (2020).

The Model

There are \(t = 1, 2, \dotsc, T\) markets, each with \(j = 1, 2, \dotsc, J_t\) products produced by \(f = 1, 2, \dotsc, F_t\) firms, for a total of \(N\) products across all markets. There are \(i = 1, 2, \dotsc, I_t\) individuals/agents who choose among the \(J_t\) products and an outside good \(j = 0\). These numbers also represent sets. For example, \(J_t = \{1, 2, \dots, J_t\}\).

Demand

Observed demand-side product characteristics are contained in the \(N \times K_1\) matrix of linear characteristics, \(X_1\), and the \(N \times K_2\) matrix of nonlinear characteristics, \(X_2\), which is typically a subset of \(X_1\). Unobserved demand-side product characteristics, \(\xi\), are an \(N \times 1\) vector.

In market \(t\), observed agent characteristics are an \(I_t \times D\) matrix called demographics, \(d\). Unobserved agent characteristics are an \(I_t \times K_2\) matrix, \(\nu\).

The indirect utility of agent \(i\) from purchasing product \(j\) in market \(t\) is

(1)\[U_{ijt} = \underbrace{\delta_{jt} + \mu_{ijt}}_{V_{ijt}} + \epsilon_{ijt},\]

in which the mean utility is, in vector-matrix form,

(2)\[\delta = \underbrace{X_1^\text{en}\alpha + X_1^\text{ex}\beta^\text{ex}}_{X_1\beta} + \xi.\]

The \(K_1 \times 1\) vector of demand-side linear parameters, \(\beta\), is partitioned into two components: \(\alpha\) is a \(K_1^\text{en} \times 1\) vector of parameters on the \(N \times K_1^\text{en}\) submatrix of endogenous characteristics, \(X_1^\text{en}\), and \(\beta^\text{ex}\) is a \(K_1^\text{ex} \times 1\) vector of parameters on the \(N \times K_1^\text{ex}\) submatrix of exogenous characteristics, \(X_1^\text{ex}\). Usually, \(X_1^\text{en} = p\), prices, so \(\alpha\) is simply a scalar.

The agent-specific portion of utility in a single market is, in vector-matrix form,

(3)\[\mu = X_2(\Sigma \nu' + \Pi d').\]

The model incorporates both observable (demographics) and unobservable taste heterogeneity through random coefficients. For the unobserved heterogeneity, we let \(\nu\) denote independent draws from the standard normal distribution. These are scaled by a \(K_2 \times K_2\) lower-triangular matrix \(\Sigma\), which denotes the Cholesky root of the covariance matrix for unobserved taste heterogeneity. The \(K_2 \times D\) matrix \(\Pi\) measures how agent tastes vary with demographics.

In the above expression, random coefficients are assumed to be normally distributed, but this expression supports all elliptical distributions. To incorporate one or more lognormal random coefficients, the associated columns in the parenthesized expression can be exponentiated before being pre-multiplied by \(X_2\). For example, this allows for the coefficient on price to be lognormal so that demand slopes down for all agents. For lognormal random coefficients, a constant column is typically included in \(d\) so that its coefficients in \(\Pi\) parametrize the means of the logs of the random coefficients. More generally, all log-elliptical distributions are supported. A logit link function is also supported.
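
As a concrete (and purely illustrative) sketch of (3) and the lognormal transformation, the following NumPy snippet builds \(\mu\) for a single hypothetical market by exponentiating the row of the parenthesized term that corresponds to a lognormal coefficient. None of these names or values come from PyBLP itself.

import numpy as np

rng = np.random.default_rng(0)
J, I, K2, D = 3, 100, 2, 1                   # hypothetical products, agents, characteristics, demographics

X2 = rng.normal(size=(J, K2))                # nonlinear product characteristics in one market
nu = rng.normal(size=(I, K2))                # standard normal integration nodes
d = np.ones((I, D))                          # a constant demographic column
Sigma = np.diag([0.3, 0.5])                  # Cholesky root of the random coefficient covariance matrix
Pi = np.array([[-1.0], [0.0]])               # with a constant demographic, the mean of the log coefficient

inner = Sigma @ nu.T + Pi @ d.T              # the K_2 x I_t parenthesized term in (3)
inner[0] = np.exp(inner[0])                  # exponentiate the row associated with the lognormal coefficient
mu = X2 @ inner                              # the J_t x I_t matrix of agent-specific utilities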

Random idiosyncratic preferences, \(\epsilon_{ijt}\), are assumed to be Type I Extreme Value, so that conditional on the heterogeneous coefficients, market shares follow the well-known logit form. Aggregate market shares are obtained by integrating over the distribution of individual heterogeneity. They are approximated with Monte Carlo integration or quadrature rules defined by the \(I_t \times K_2\) matrix of integration nodes, \(\nu\), and an \(I_t \times 1\) vector of integration weights, \(w\):

(4)\[s_{jt} \approx \sum_{i \in I_t} w_{it} s_{ijt},\]

where the probability that agent \(i\) chooses product \(j\) in market \(t\) is

(5)\[s_{ijt} = \frac{\exp V_{ijt}}{1 + \sum_{k \in J_t} \exp V_{ikt}}.\]

There is a one in the denominator because the utility of the outside good is normalized to \(U_{i0t} = 0\). The scale of utility is normalized by the variance of \(\epsilon_{ijt}\).
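
As a concrete illustration of (4) and (5), this minimal NumPy sketch (with made-up values, not PyBLP code) computes choice probabilities in one market and aggregates them into approximate market shares:

import numpy as np

rng = np.random.default_rng(0)
delta = np.array([1.0, 0.5, -0.2])      # mean utilities for J_t = 3 products
mu = rng.normal(size=(3, 4))            # agent-specific utilities for I_t = 4 agents
w = np.full(4, 1 / 4)                   # integration weights that sum to one

V = delta[:, None] + mu                 # V_ijt = delta_jt + mu_ijt
exp_V = np.exp(V)
s_ij = exp_V / (1 + exp_V.sum(axis=0))  # choice probabilities in (5), with a one for the outside good
s_j = s_ij @ w                          # approximate aggregate market shares in (4)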

Supply

Observed supply-side product characteristics are contained in the \(N \times K_3\) matrix of supply-side characteristics, \(X_3\). Prices cannot be supply-side characteristics, but non-price product characteristics often overlap with the demand-side characteristics in \(X_1\) and \(X_2\). Unobserved supply-side product characteristics, \(\omega\), are an \(N \times 1\) vector.

Firm \(f\) chooses prices in market \(t\) to maximize the profits of its products \(J_{ft} \subset J_t\):

(6)\[\pi_{ft} = \sum_{j \in J_{ft}} (p_{jt} - c_{jt})s_{jt}.\]

In a single market, the corresponding multi-product differentiated Bertrand first order conditions are, in vector-matrix form,

(7)\[p - c = \underbrace{\Delta^{-1}s}_{\eta},\]

where the multi-product Bertrand markup \(\eta\) depends on \(\Delta\), a \(J_t \times J_t\) matrix of intra-firm (negative, transposed) demand derivatives:

(8)\[\Delta = -\mathscr{H} \odot \frac{\partial s}{\partial p}'.\]

Here, \(\mathscr{H}\) denotes the market-level ownership or product holdings matrix in the market, where \(\mathscr{H}_{jk}\) is typically \(1\) if the same firm produces products \(j\) and \(k\), and \(0\) otherwise.

To include a supply side, we must specify a functional form for marginal costs:

(9)\[\tilde{c} = f(c) = X_3\gamma + \omega.\]

The most common choices are \(f(c) = c\) and \(f(c) = \log(c)\).

Estimation

A demand side is always estimated but including a supply side is optional. With only a demand side, there are three sets of parameters to be estimated: \(\beta\) (which may include \(\alpha\)), \(\Sigma\) and \(\Pi\). With a supply side, there is also \(\gamma\). The linear parameters, \(\beta\) and \(\gamma\), are typically concentrated out of the problem. The exception is \(\alpha\), which cannot be concentrated out when there is a supply side because it is needed to compute demand derivatives and hence marginal costs. Linear parameters that are not concentrated out along with unknown nonlinear parameters in \(\Sigma\) and \(\Pi\) are collectively denoted \(\theta\).

The GMM problem is

(10)\[\min_\theta q(\theta) = \bar{g}(\theta)'W\bar{g}(\theta),\]

in which \(q(\theta)\) is the GMM objective. By default, PyBLP scales this value by \(N\) so that objectives across different problem sizes are comparable. This behavior can be disabled. In some of the BLP literature and in earlier versions of this package, the objective was scaled by \(N^2\).

Here, \(W\) is a \(M \times M\) weighting matrix and \(\bar{g}\) is a \(M \times 1\) vector of averaged demand- and supply-side moments:

(11)\[\begin{split}\bar{g} = \begin{bmatrix} \bar{g}_D \\ \bar{g}_S \end{bmatrix} = \frac{1}{N} \begin{bmatrix} \sum_{j,t} Z_{D,jt}'\xi_{jt} \\ \sum_{j,t} Z_{S,jt}'\omega_{jt} \end{bmatrix}\end{split}\]

where \(Z_D\) and \(Z_S\) are \(N \times M_D\) and \(N \times M_S\) matrices of demand- and supply-side instruments.

The vector \(\bar{g}\) contains sample analogues of the demand- and supply-side moment conditions \(E[g_{D,jt}] = E[g_{S,jt}] = 0\) where

(12)\[\begin{bmatrix} g_{D,jt} & g_{S,jt} \end{bmatrix} = \begin{bmatrix} \xi_{jt}Z_{D,jt} & \omega_{jt}Z_{S,jt} \end{bmatrix}.\]

In each GMM stage, a nonlinear optimizer finds the \(\hat{\theta}\) that minimizes the GMM objective value \(q(\theta)\).

The Objective

Given a \(\theta\), the first step to computing the objective \(q(\theta)\) is to compute \(\delta(\theta)\) in each market with the following standard contraction:

(13)\[\delta_{jt} \leftarrow \delta_{jt} + \log s_{jt} - \log s_{jt}(\delta, \theta)\]

where \(s\) are the market’s observed shares and \(s(\delta, \theta)\) are calculated market shares. Iteration terminates when the norm of the change in \(\delta(\theta)\) is less than a small number.
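
A minimal sketch of this contraction for a single market, assuming a user-supplied compute_shares function that implements (4) and (5) (the names and tolerance here are illustrative, not PyBLP's internal interface):

import numpy as np

def solve_delta(observed_shares, compute_shares, tol=1e-14, max_iterations=1000):
    """Iterate the contraction in (13) until the change in delta is small."""
    delta = np.log(observed_shares)  # simple starting values; the logit solution is another common choice
    for _ in range(max_iterations):
        new_delta = delta + np.log(observed_shares) - np.log(compute_shares(delta))
        if np.linalg.norm(new_delta - delta) < tol:
            return new_delta
        delta = new_delta
    return delta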

With a supply side, marginal costs are then computed according to (7):

(14)\[c_{jt}(\theta) = p_{jt} - \eta_{jt}(\theta).\]

Concentrated out linear parameters are recovered with linear IV-GMM:

(15)\[\begin{split}\begin{bmatrix} \hat{\beta}^\text{ex} \\ \hat{\gamma} \end{bmatrix} = (X'ZWZ'X)^{-1}X'ZWZ'Y(\theta)\end{split}\]

where

(16)\[\begin{split}X = \begin{bmatrix} X_1^\text{ex} & 0 \\ 0 & X_3 \end{bmatrix}, \quad Z = \begin{bmatrix} Z_D & 0 \\ 0 & Z_S \end{bmatrix}, \quad Y(\theta) = \begin{bmatrix} \delta(\theta) - X_1^\text{en}\hat{\alpha} \\ \tilde{c}(\theta) \end{bmatrix}.\end{split}\]

With only a demand side, \(\alpha\) can be concentrated out, so \(X = X_1\), \(Z = Z_D\), and \(Y = \delta(\theta)\) recover the full \(\hat{\beta}\) in (15).
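
For concreteness, a small NumPy sketch of the linear IV-GMM expression in (15), with illustrative variable names for the stacked \(X\), \(Z\), \(W\), and \(Y(\theta)\):

import numpy as np

def recover_linear_parameters(X, Z, W, Y):
    """Linear IV-GMM estimate in (15): (X'ZWZ'X)^{-1} X'ZWZ'Y."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ Z.T @ Y)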

Finally, the unobserved product characteristics (i.e., the structural errors),

(17)\[\begin{split}\begin{bmatrix} \xi(\theta) \\ \omega(\theta) \end{bmatrix} = \begin{bmatrix} \delta(\theta) - X_1\hat{\beta} \\ \tilde{c}(\theta) - X_3\hat{\gamma} \end{bmatrix},\end{split}\]

are interacted with the instruments to form \(\bar{g}(\theta)\) in (11), which gives the GMM objective \(q(\theta)\) in (10).

The Gradient

The gradient of the GMM objective in (10) is

(18)\[\nabla q(\theta) = 2\bar{G}(\theta)'W\bar{g}(\theta)\]

where

(19)\[\begin{split}\bar{G} = \begin{bmatrix} \bar{G}_D \\ \bar{G}_S \end{bmatrix} = \frac{1}{N} \begin{bmatrix} \sum_{j,t} Z_{D,jt}'\frac{\partial\xi_{jt}}{\partial\theta} \\ \sum_{j,t} Z_{S,jt}'\frac{\partial\omega_{jt}}{\partial\theta} \end{bmatrix}.\end{split}\]

Writing \(\delta\) as an implicit function of \(s\) in (4) gives the demand-side Jacobian:

(20)\[\frac{\partial\xi}{\partial\theta} = \frac{\partial\delta}{\partial\theta} = -\left(\frac{\partial s}{\partial\delta}\right)^{-1}\frac{\partial s}{\partial\theta}.\]

The supply-side Jacobian is derived from the definition of \(\tilde{c}\) in (9):

(21)\[\frac{\partial\omega}{\partial\theta} = \frac{\partial\tilde{c}}{\partial\theta} = -\frac{\partial\tilde{c}}{\partial c}\frac{\partial\eta}{\partial\theta}.\]

The second term in this expression is derived from the definition of \(\eta\) in (7):

(22)\[\frac{\partial\eta}{\partial\theta} = -\Delta^{-1}\left(\frac{\partial\Delta}{\partial\theta}\eta + \frac{\partial\Delta}{\partial\xi}\eta\frac{\partial\xi}{\partial\theta}\right).\]

One thing to note is that \(\frac{\partial\xi}{\partial\theta} = \frac{\partial\delta}{\partial\theta}\) and \(\frac{\partial\omega}{\partial\theta} = \frac{\partial\tilde{c}}{\partial\theta}\) need not hold during optimization if we concentrate out linear parameters because these are then functions of \(\theta\). Fortunately, one can use orthogonality conditions to show that it is fine to treat these parameters as fixed when computing the gradient.

Weighting Matrices

Conventionally, the 2SLS weighting matrix is used in the first stage:

(23)\[\begin{split}W = \begin{bmatrix} (Z_D'Z_D / N)^{-1} & 0 \\ 0 & (Z_S'Z_S / N)^{-1} \end{bmatrix}.\end{split}\]

With two-step GMM, \(W\) is updated before the second stage according to

(24)\[W = S^{-1}.\]

For heteroscedasticity robust weighting matrices,

(25)\[S = \frac{1}{N}\sum_{j,t} g_{jt}g_{jt}'.\]
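
As a sketch, for a single block of instruments and the corresponding structural errors (illustrative names, stacked the same way as above), the 2SLS weighting matrix in (23) and the robust moment covariances in (25) could be computed as:

import numpy as np

def block_weighting_and_covariances(Z, errors):
    """One block of the 2SLS weighting matrix in (23) and the robust covariances in (25)."""
    N = Z.shape[0]
    W_block = np.linalg.inv(Z.T @ Z / N)     # e.g., (Z_D'Z_D / N)^{-1} for the demand block
    g = Z * np.reshape(errors, (-1, 1))      # rows are g_jt, e.g., xi_jt * Z_{D,jt}
    S_block = g.T @ g / N                    # heteroscedasticity-robust covariances as in (25)
    return W_block, S_block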

For clustered weighting matrices with \(c = 1, 2, \dotsc, C\) clusters,

(26)\[S = \frac{1}{N}\sum_{c=1}^C g_cg_c',\]

where, letting the set \(J_{ct} \subset J_t\) denote products in cluster \(c\) and market \(t\),

(27)\[g_c = \sum_{t \in T} \sum_{j \in J_{ct}} g_{jt}.\]

For unadjusted weighting matrices,

(28)\[\begin{split}S = \frac{1}{N} \begin{bmatrix} \sigma_\xi^2 Z_D'Z_D & \sigma_{\xi\omega} Z_D'Z_S \\ \sigma_{\xi\omega} Z_S'Z_D & \sigma_\omega^2 Z_S'Z_S \end{bmatrix}\end{split}\]

where \(\sigma_\xi^2\), \(\sigma_\omega^2\), and \(\sigma_{\xi\omega}\) are estimates of the variances and covariance between the structural errors.

Simulation error can be accounted for by resampling agents \(r = 1, \dots, R\) times, evaluating each \(\bar{g}_r\), and adding the following to \(S\):

(29)\[\frac{1}{R - 1} \sum_{r=1}^R (\bar{g}_r - \bar{\bar{g}})(\bar{g}_r - \bar{\bar{g}})', \quad \bar{\bar{g}} = \frac{1}{R} \sum_{r=1}^R \bar{g}_r.\]

Standard Errors

An estimate of the asymptotic covariance matrix of \(\sqrt{N}(\hat{\theta} - \theta_0)\) is

(30)\[(\bar{G}'W\bar{G})^{-1}\bar{G}'WSW\bar{G}(\bar{G}'W\bar{G})^{-1}.\]

Standard errors are the square root of the diagonal of this matrix divided by \(N\).

If the weighting matrix was chosen such that \(W = S^{-1}\), this simplifies to

(31)\[(\bar{G}'W\bar{G})^{-1}.\]

Standard errors extracted from this simpler expression are called unadjusted.
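
A sketch of how (30) and (31) translate into standard errors, assuming \(\bar{G}\), \(W\), and \(S\) are available as NumPy arrays:

import numpy as np

def robust_standard_errors(G_bar, W, S, N):
    """Sandwich covariance matrix in (30), converted into standard errors."""
    bread = np.linalg.inv(G_bar.T @ W @ G_bar)
    covariance = bread @ G_bar.T @ W @ S @ W @ G_bar @ bread
    return np.sqrt(np.diag(covariance) / N)

def unadjusted_standard_errors(G_bar, W, N):
    """Simpler expression in (31), valid when W = S^{-1}."""
    covariance = np.linalg.inv(G_bar.T @ W @ G_bar)
    return np.sqrt(np.diag(covariance) / N)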

Fixed Effects

The unobserved product characteristics can be partitioned into

(32)\[\begin{split}\begin{bmatrix} \xi_{jt} \\ \omega_{jt} \end{bmatrix} = \begin{bmatrix} \xi_{k_1} + \xi_{k_2} + \cdots + \xi_{k_{E_D}} + \Delta\xi_{jt} \\ \omega_{\ell_1} + \omega_{\ell_2} + \cdots + \omega_{\ell_{E_S}} + \Delta\omega_{jt} \end{bmatrix}\end{split}\]

where \(k_1, k_2, \dotsc, k_{E_D}\) and \(\ell_1, \ell_2, \dotsc, \ell_{E_S}\) index unobserved characteristics that are fixed across \(E_D\) and \(E_S\) dimensions. For example, with \(E_D = 1\) dimension of product fixed effects, \(\xi_{jt} = \xi_j + \Delta\xi_{jt}\).

Small numbers of fixed effects can be estimated with dummy variables in \(X_1\), \(X_3\), \(Z_D\), and \(Z_S\). However, this approach does not scale with high dimensional fixed effects because it requires constructing and inverting an infeasibly large matrix in (15).

Instead, fixed effects are typically absorbed into \(X\), \(Z\), and \(Y(\theta)\) in (15). With one fixed effect, these matrices are simply de-meaned within each level of the fixed effect. Both \(X\) and \(Z\) can be de-meaned just once, but \(Y(\theta)\) must be de-meaned for each new \(\theta\).

This procedure is equivalent to replacing each column of the matrices with residuals from a regression of the column on the fixed effect. The Frisch-Waugh-Lovell (FWL) theorem of Frisch and Waugh (1933) and Lovell (1963) guarantees that using these residualized matrices gives the same results as including fixed effects as dummy variables. When \(E_D > 1\) or \(E_S > 1\), the matrices are residualized with more involved algorithms.
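
With one fixed effect, the within-transform can be sketched with pandas (the column names here are hypothetical):

import pandas as pd

def absorb_one_fixed_effect(df, columns, fe="product_ids"):
    """De-mean the given columns within each level of one fixed effect,
    which is equivalent to residualizing them on fixed effect dummies."""
    absorbed = df.copy()
    absorbed[columns] = df[columns] - df.groupby(fe)[columns].transform("mean")
    return absorbed

For multiple dimensions, the more involved residualization algorithms are provided by the companion package PyHDFE mentioned in the installation section.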

Once fixed effects have been absorbed, estimation is as described above with the structural errors \(\Delta\xi\) and \(\Delta\omega\).

Micro Moments

More detailed micro data on individual choices can be used to supplement the standard demand- and supply-side moments \(\bar{g}_D\) and \(\bar{g}_S\) in (11) with an additional \(m = 1, 2, \ldots, M_M\) micro moments, \(\bar{g}_M\), for a total of \(M = M_D + M_S + M_M\) moments:

(33)\[\begin{split}\bar{g} = \begin{bmatrix} \bar{g}_D \\ \bar{g}_S \\ \bar{g}_M \end{bmatrix}.\end{split}\]

Conlon and Gortmaker (2023) provides a standardized framework for incorporating micro moments into BLP-style estimation. What follows is a simplified summary of this framework. Each micro moment \(m\) is the difference between an observed value \(f_m(\bar{v})\) and its simulated analogue \(f_m(v)\):

(34)\[\bar{g}_{M,m} = f_m(\bar{v}) - f_m(v),\]

in which \(f_m(\cdot)\) is a function that maps a vector of \(p = 1, \ldots, P_M\) micro moment parts \(\bar{v} = (\bar{v}_1, \dots, \bar{v}_{P_M})'\) or \(v = (v_1, \dots, v_{P_M})'\) into a micro statistic. Each sample micro moment part \(p\) is an average over observations \(n \in N_{d_p}\) in the associated micro dataset \(d_p\):

(35)\[\bar{v}_p = \frac{1}{N_{d_p}} \sum_{n \in N_{d_p}} v_{pi_nj_nt_n}.\]

Its simulated analogue is

(36)\[v_p = \frac{\sum_{t \in T} \sum_{i \in I_t} \sum_{j \in J_t \cup \{0\}} w_{it} s_{ijt} w_{d_pijt} v_{pijt}}{\sum_{t \in T} \sum_{i \in I_t} \sum_{j \in J_t \cup \{0\}} w_{it} s_{ijt} w_{d_pijt}},\]

in which \(w_{it} s_{ijt} w_{d_pijt}\) is the probability that an observation in the micro dataset is for an agent \(i\) who chooses \(j\) in market \(t\).

The simplest type of micro moment is just an average over the entire sample, with \(f_m(v) = v_1\). For example, with \(v_{1ijt}\) equal to the income of an agent \(i\) who chooses \(j\) in market \(t\), micro moment \(m\) would match the average income in its micro dataset. Observed values such as conditional expectations, covariances, correlations, or regression coefficients can be matched by choosing the appropriate function \(f_m\). For example, with \(v_{2ijt}\) equal to the interaction between income and an indicator for the choice of the outside option, and with \(v_{3ijt}\) equal to an indicator for the choice of the outside option, \(f_m(v) = v_2 / v_3\) would match an observed conditional mean income among those who choose the outside option.

A micro dataset \(d\), often a survey, is defined by survey weights \(w_{dijt}\). For example, \(w_{dijt} = 1\{j \neq 0, t \in T_d\}\) defines a micro dataset that is a selected sample of inside purchasers in a few markets \(T_d \subset T\), giving each market an equal sampling weight. Different micro datasets are independent.

A micro dataset will often admit multiple micro moment parts. Each micro moment part \(p\) is defined by its dataset \(d_p\) and micro values \(v_{pijt}\). For example, a micro moment part \(p\) with \(v_{pijt} = y_{it}x_{jt}\) delivers the mean \(\bar{v}_p\) or expectation \(v_p\) of an interaction between some demographic \(y_{it}\) and some product characteristic \(x_{jt}\).

A micro moment is a function of one or more micro moment parts. The simplest type is a function of only one micro moment part, and matches the simple average defined by the micro moment part. For example, \(f_m(v) = v_p\) with \(v_{pijt} = y_{it} x_{jt}\) matches the mean of an interaction between \(y_{it}\) and \(x_{jt}\). Non-simple averages such as conditional means, covariances, correlations, or regression coefficients can be matched by choosing an appropriate function \(f_m\). For example, \(f_m(v) = v_1 / v_2\) with \(v_{1ijt} = y_{it}x_{jt}1\{j \neq 0\}\) and \(v_{2ijt} = 1\{j \neq 0\}\) matches the conditional mean of an interaction between \(y_{it}\) and \(x_{jt}\) among those who do not choose the outside option \(j = 0\).
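
To illustrate with made-up numbers, a conditional mean micro moment built from two parts might be formed as in the following sketch:

# Hypothetical observed and simulated micro moment parts for f_m(v) = v_1 / v_2.
observed_parts = {"v1": 0.42, "v2": 0.61}   # sample averages from the micro dataset, as in (35)
simulated_parts = {"v1": 0.39, "v2": 0.58}  # model analogues computed according to (36)

def f_m(v):
    """Conditional mean of an interaction among those who do not choose the outside option."""
    return v["v1"] / v["v2"]

g_M_m = f_m(observed_parts) - f_m(simulated_parts)  # the micro moment in (34)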

Technically, if not all micro moments \(m\) are simple averages \(f_m(v) = v_m\), then the resulting estimator will no longer be a GMM estimator, but rather a more generic minimum distance estimator, since these “micro moments” are not technically sample moments. Regardless, the package uses GMM terminology for simplicity’s sake, and the statistical expressions are all the same. Micro moments are computed for each \(\theta\) and contribute to the GMM (or minimum distance) objective \(q(\theta)\) in (10). Their derivatives with respect to \(\theta\) are added as rows to \(\bar{G}\) in (19), and blocks are added to both \(W\) and \(S\) in (23) and (24). The covariance between standard moments and micro moments is zero, so these matrices are block-diagonal. The delta method delivers the covariance matrix for the micro moments:

(37)\[S_M = \frac{\partial f(v)}{\partial v'} S_P \frac{\partial f(v)'}{\partial v}.\]

The scaled covariance between micro moment parts \(p\) and \(q\) in \(S_P\) is zero if they are based on different micro datasets \(d_p \neq d_q\); otherwise, if based on the same dataset \(d_p = d_q = d\),

(38)\[S_{P,pq} = \frac{N}{N_d} \text{Cov}(v_{pi_nj_nt_n}, v_{qi_nj_nt_n}),\]

in which

(39)\[\text{Cov}(v_{pi_nj_nt_n}, v_{qi_nj_nt_n}) = \frac{\sum_{t \in T} \sum_{i \in I_t} \sum_{j \in J_t \cup \{0\}} w_{it} s_{ijt} w_{dijt} (v_{pijt} - v_p)(v_{qijt} - v_q)}{\sum_{t \in T} \sum_{i \in I_t} \sum_{j \in J_t \cup \{0\}} w_{it} s_{ijt} w_{dijt}}.\]

Micro moment parts based on second choices are averages over values \(v_{pijkt}\), where \(k\) indexes second choices, and use datasets defined by survey weights \(w_{dijkt}\). A sample micro moment part is

(40)\[\bar{v}_p = \frac{1}{N_{d_p}} \sum_{n \in N_{d_p}} v_{pi_nj_nk_nt_n}.\]

Its simulated analogue is

(41)\[v_p = \frac{\sum_{t \in T} \sum_{i \in I_t} \sum_{j, k \in J_t \cup \{0\}} w_{it} s_{ijt} s_{ik(-j)t} w_{d_pijkt} v_{pijkt}}{\sum_{t \in T} \sum_{i \in I_t} \sum_{j, k \in J_t \cup \{0\}} w_{it} s_{ijt} s_{ik(-j)t} w_{d_pijkt}},\]

in which \(s_{ik(-j)t}\) is the probability of choosing \(k\) when \(j\) is removed from the choice set. One can also define micro moment parts based on second choices where a group of products \(h(j)\) containing the first choice \(j\) is removed from the choice set. In this case, the above second choice probabilities become \(s_{ik(-h(j))t}\).

Covariances are defined analogously.

Random Coefficients Nested Logit

Incorporating parameters that measure within nesting group correlation gives the random coefficients nested logit (RCNL) model of Brenkers and Verboven (2006) and Grigolon and Verboven (2014). There are \(h = 1, 2, \dotsc, H\) nesting groups and each product \(j\) is assigned to a group \(h(j)\). The set \(J_{ht} \subset J_t\) denotes the products in group \(h\) and market \(t\).

In the RCNL model, idiosyncratic preferences are partitioned into

(42)\[\epsilon_{ijt} = \bar{\epsilon}_{ih(j)t} + (1 - \rho_{h(j)})\bar{\epsilon}_{ijt}\]

where \(\bar{\epsilon}_{ijt}\) is Type I Extreme Value and \(\bar{\epsilon}_{ih(j)t}\) is distributed such that \(\epsilon_{ijt}\) is still Type I Extreme Value.

The nesting parameters, \(\rho\), can either be a \(H \times 1\) vector or a scalar so that for all groups \(\rho_h = \rho\). Letting \(\rho \to 0\) gives the standard BLP model and \(\rho \to 1\) gives division by zero errors. With \(\rho_h \in (0, 1)\), the expression for choice probabilities in (5) becomes more complicated:

(43)\[s_{ijt} = \frac{\exp[V_{ijt} / (1 - \rho_{h(j)})]}{\exp[V_{ih(j)t} / (1 - \rho_{h(j)})]}\cdot\frac{\exp V_{ih(j)t}}{1 + \sum_{h \in H} \exp V_{iht}}\]

where

(44)\[V_{iht} = (1 - \rho_h)\log\sum_{k \in J_{ht}} \exp[V_{ikt} / (1 - \rho_h)].\]

The contraction for \(\delta(\theta)\) in (13) is also slightly different:

(45)\[\delta_{jt} \leftarrow \delta_{jt} + (1 - \rho_{h(j)})[\log s_{jt} - \log s_{jt}(\delta, \theta)].\]

Otherwise, estimation is as described above with \(\rho\) included in \(\theta\).

Logit and Nested Logit

Letting \(\Sigma = 0\) gives the simpler logit (or nested logit) model where there is a closed-form solution for \(\delta\). In the logit model,

(46)\[\delta_{jt} = \log s_{jt} - \log s_{0t},\]

and a lack of nonlinear parameters means that nonlinear optimization is often unneeded.

In the nested logit model, \(\rho\) must be optimized over, but there is still a closed-form solution for \(\delta\):

(47)\[\delta_{jt} = \log s_{jt} - \log s_{0t} - \rho_{h(j)}[\log s_{jt} - \log s_{h(j)t}].\]

where

(48)\[s_{ht} = \sum_{j \in J_{ht}} s_{jt}.\]

In both models, a supply side can still be estimated jointly with demand. Estimation is as described above with a representative agent in each market: \(I_t = 1\) and \(w_1 = 1\).
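
As an illustration, the closed-form inversion in (47) and (48) can be sketched with pandas, assuming a product table with shares, market_ids, and nesting_ids columns (PyBLP performs this inversion internally; the function below is only a sketch):

import numpy as np
import pandas as pd

def nested_logit_delta(df, rho):
    """Compute delta_jt in (47), with group shares from (48) and the outside share from inside shares."""
    inside = df.groupby("market_ids")["shares"].transform("sum")
    outside = 1 - inside  # share of the outside good j = 0 in each market
    group = df.groupby(["market_ids", "nesting_ids"])["shares"].transform("sum")
    return np.log(df["shares"]) - np.log(outside) - rho * (np.log(df["shares"]) - np.log(group))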

Equilibrium Prices

Counterfactual evaluation, synthetic data simulation, and optimal instrument generation often involve solving for prices implied by the Bertrand first order conditions in (7). Solving this system with Newton’s method is slow and iterating over \(p \leftarrow c + \eta(p)\) may not converge because it is not a contraction.

Instead, Morrow and Skerlos (2011) reformulate the solution to (7):

(49)\[p - c = \underbrace{\Lambda^{-1}(\mathscr{H} \odot \Gamma)'(p - c) - \Lambda^{-1}s}_{\zeta}\]

where \(\Lambda\) is a diagonal \(J_t \times J_t\) matrix approximated by

(50)\[\Lambda_{jj} \approx \sum_{i \in I_t} w_{it} s_{ijt}\frac{\partial U_{ijt}}{\partial p_{jt}}\]

and \(\Gamma\) is a dense \(J_t \times J_t\) matrix approximated by

(51)\[\Gamma_{jk} \approx \sum_{i \in I_t} w_{it} s_{ijt}s_{ikt}\frac{\partial U_{ikt}}{\partial p_{kt}}.\]

Equilibrium prices are computed by iterating over the \(\zeta\)-markup equation in (49),

(52)\[p \leftarrow c + \zeta(p),\]

which, unlike (7), is a contraction. Iteration terminates when the norm of firms’ first order conditions, \(||\Lambda(p)(p - c - \zeta(p))||\), is less than a small number.

If marginal costs depend on quantity, then they also depend on prices and need to be updated during each iteration: \(c_{jt} = c_{jt}(s_{jt}(p))\).
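
A schematic sketch of this fixed point, assuming user-supplied zeta and Lambda functions that implement (49) and (50) for a single market (these are illustrative stand-ins, not PyBLP functions):

import numpy as np

def solve_prices(c, zeta, Lambda, initial_prices, tol=1e-12, max_iterations=1000):
    """Iterate p <- c + zeta(p) in (52) until firms' first order conditions are nearly satisfied."""
    p = initial_prices
    for _ in range(max_iterations):
        if np.linalg.norm(Lambda(p) @ (p - c - zeta(p))) < tol:
            break
        p = c + zeta(p)
    return p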

Tutorial

This section uses a series of Jupyter Notebooks to explain how PyBLP can be used to solve example problems, compute post-estimation outputs, and simulate problems.

For a more concise, targeted, and opinionated tutorial, see the Demand Estimation Mixtape Session, a three-day workshop taught by Jeff Gortmaker in 2024. In order, the days cover pure logit estimation, aggregate BLP estimation, and micro BLP estimation. Each day has slides and extensive coding exercises (along with Jupyter Notebook solutions) that use PyBLP.

In addition to the notebooks here and the above workshop, other examples can be found in the API Documentation.

Download the Jupyter Notebook for this section: logit_nested.ipynb

Logit and Nested Logit Tutorial

[1]:
import pyblp
import numpy as np
import pandas as pd

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

In this tutorial, we’ll use data from Nevo (2000a) to solve the paper’s fake cereal problem. Locations of CSV files that contain the data are in the data module.

We will compare two simple models, the plain (IIA) logit model and the nested logit (GEV) model using the fake cereal dataset of Nevo (2000a).

Theory of Plain Logit

Let’s start with the plain logit model under independence of irrelevant alternatives (IIA). In this model (indirect) utility is given by

(1)\[U_{ijt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt} + \epsilon_{ijt},\]

where \(\epsilon_{ijt}\) is distributed IID with the Type I Extreme Value (Gumbel) distribution. It is common to normalize the mean utility of the outside good to zero so that \(U_{i0t} = \epsilon_{i0t}\). This gives us aggregate market shares

(2)\[s_{jt} = \frac{\exp(\alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt})}{1 + \sum_k \exp(\alpha p_{kt} + x_{kt} \beta^\text{ex} + \xi_{kt})}.\]

If we take logs we get

(3)\[\log s_{jt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt} - \log \sum_k \exp(\alpha p_{kt} + x_{kt} \beta^\text{ex} + \xi_{kt})\]

and

(4)\[\log s_{0t} = -\log \sum_k \exp(\alpha p_{kt} + x_{kt} \beta^\text{ex} + \xi_{kt}).\]

By differencing the above we get a linear estimating equation:

(5)\[\log s_{jt} - \log s_{0t} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}.\]

Because the left hand side is data, we can estimate this model using linear IV GMM.
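
Although PyBLP constructs this dependent variable internally, it can also be computed by hand from the product data used below (a sketch for illustration only):

import numpy as np
import pandas as pd
import pyblp

product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)

# The left hand side of (5): log(s_jt) - log(s_0t), where the outside share
# is one minus the sum of inside shares in each market.
outside_share = 1 - product_data.groupby("market_ids")["shares"].transform("sum")
logit_delta = np.log(product_data["shares"]) - np.log(outside_share)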

Application of Plain Logit

A Logit Problem can be created by simply excluding the formulation for the nonlinear parameters, \(X_2\), along with any agent information. In other words, it requires only specifying the linear component of demand.

We’ll set up and solve a simple version of the fake data cereal problem from Nevo (2000a). Since we won’t include any demand-side nonlinear characteristics or parameters, we don’t have to worry about configuring an optimization routine.

There are some important reserved variable names:

  • market_ids are the unique market identifiers which we subscript with \(t\).

  • shares specifies the market shares, which need to be between zero and one, and within a market ID, \(\sum_{j} s_{jt} < 1\).

  • prices are prices \(p_{jt}\). These have some special properties and are always treated as endogenous.

  • demand_instruments0, demand_instruments1, and so on are numbered demand instruments. These represent only the excluded instruments. The exogenous regressors in \(X_1\) will be automatically added to the set of instruments.

We proceed with the following steps:

  1. Load the product data which at a minimum consists of market_ids, shares, prices, and at least a single column of demand instruments, demand_instruments0.

  2. Define a Formulation for the \(X_1\) (linear) demand model.

    • This and all other formulas are similar to R and patsy formulas.

    • It includes a constant by default. To exclude the constant, specify either a 0 or a -1.

    • To efficiently include fixed effects, use the absorb option and specify which categorical variables you would like to absorb.

    • Some model reduction may happen automatically. The constant will be excluded if you include fixed effects and some precautions are taken against collinearity. However, you will have to make sure that differently-named variables are not collinear.

  3. Combine the Formulation and product data to construct a Problem.

  4. Use Problem.solve to estimate parameters.

Loading the Data

The product_data argument of Problem should be a structured array-like object with fields that store data. Product data can be a structured NumPy array, a pandas DataFrame, or other similar objects.

[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids city_ids quarter product_ids firm_ids brand_ids shares prices sugar mushy ... demand_instruments10 demand_instruments11 demand_instruments12 demand_instruments13 demand_instruments14 demand_instruments15 demand_instruments16 demand_instruments17 demand_instruments18 demand_instruments19
0 C01Q1 1 1 F1B04 1 4 0.012417 0.072088 2 1 ... 2.116358 -0.154708 -0.005796 0.014538 0.126244 0.067345 0.068423 0.034800 0.126346 0.035484
1 C01Q1 1 1 F1B06 1 6 0.007809 0.114178 18 1 ... -7.374091 -0.576412 0.012991 0.076143 0.029736 0.087867 0.110501 0.087784 0.049872 0.072579
2 C01Q1 1 1 F1B07 1 7 0.012995 0.132391 4 1 ... 2.187872 -0.207346 0.003509 0.091781 0.163773 0.111881 0.108226 0.086439 0.122347 0.101842
3 C01Q1 1 1 F1B09 1 9 0.005770 0.130344 3 0 ... 2.704576 0.040748 -0.003724 0.094732 0.135274 0.088090 0.101767 0.101777 0.110741 0.104332
4 C01Q1 1 1 F1B11 1 11 0.017934 0.154823 12 0 ... 1.261242 0.034836 -0.000568 0.102451 0.130640 0.084818 0.101075 0.125169 0.133464 0.121111

5 rows × 30 columns

The product data contains market_ids, product_ids, firm_ids, shares, prices, a number of other IDs and product characteristics, and some pre-computed excluded demand_instruments0, demand_instruments1, and so on. The product_ids will be incorporated as fixed effects.

For more information about the instruments and the example data as a whole, refer to the data module.

Setting Up the Problem

We can combine the Formulation and product_data to construct a Problem. We pass the Formulation first and the product_data second. We can also display the properties of the problem by typing its name.

[3]:
logit_formulation = pyblp.Formulation('prices', absorb='C(product_ids)')
logit_formulation
[3]:
prices + Absorb[C(product_ids)]
[4]:
problem = pyblp.Problem(logit_formulation, product_data)
problem
[4]:
Dimensions:
================================
 T    N     F    K1    MD    ED
---  ----  ---  ----  ----  ----
94   2256   5    1     20    1
================================

Formulations:
==================================
     Column Indices:          0
--------------------------  ------
X1: Linear Characteristics  prices
==================================

Two sets of properties are displayed:

  1. Dimensions of the data.

  2. Formulations of the problem.

The dimensions describe the shapes of matrices as laid out in Notation. They include:

  • \(T\) is the number of markets.

  • \(N\) is the length of the dataset (the number of products across all markets).

  • \(F\) is the number of firms, which we won’t use in this example.

  • \(K_1\) is the dimension of the linear demand parameters.

  • \(M_D\) is the dimension of the instrument variables (excluded instruments and exogenous regressors).

  • \(E_D\) is the number of fixed effect dimensions (one-dimensional fixed effects, two-dimensional fixed effects, etc.).

There is only a single Formulation for this model.

  • \(X_1\) is the linear component of utility for demand and depends only on prices (after the fixed effects are removed).

Solving the Problem

The Problem.solve method always returns a ProblemResults class, which can be used to compute post-estimation outputs. See the post estimation tutorial for more information.

[5]:
logit_results = problem.solve()
logit_results
[5]:
Problem Results Summary:
==========================================
GMM   Objective  Clipped  Weighting Matrix
Step    Value    Shares   Condition Number
----  ---------  -------  ----------------
 2    +1.9E+02      0         +5.7E+07
==========================================

Cumulative Statistics:
========================
Computation   Objective
   Time      Evaluations
-----------  -----------
 00:00:00         2
========================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.0E+01
(+1.0E+00)
==========

Theory of Nested Logit

We can extend the logit model to allow for correlation within a group \(h\) so that

(6)\[U_{ijt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt} + \bar{\epsilon}_{h(j)ti} + (1 - \rho) \bar{\epsilon}_{ijt}.\]

Now, we require that \(\epsilon_{ijt} = \bar{\epsilon}_{h(j)ti} + (1 - \rho) \bar{\epsilon}_{ijt}\) is distributed IID with the Type I Extreme Value (Gumbel) distribution. As \(\rho \rightarrow 1\), all consumers stay within their group. As \(\rho \rightarrow 0\), this collapses to the IIA logit. Note that if we wanted, we could allow \(\rho\) to differ between groups with the notation \(\rho_{h(j)}\).

This gives us aggregate market shares as the product of two logits, the within group logit and the across group logit:

(7)\[s_{jt} = \frac{\exp[V_{jt} / (1 - \rho)]}{\exp[V_{h(j)t} / (1 - \rho)]}\cdot\frac{\exp V_{h(j)t}}{1 + \sum_h \exp V_{ht}},\]

where \(V_{jt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}\).

After some work we again obtain the linear estimating equation:

(8)\[\log s_{jt} - \log s_{0t} = \alpha p_{jt}+ x_{jt} \beta^\text{ex} +\rho \log s_{j|h(j)t} + \xi_{jt},\]

where \(s_{j|h(j)t} = s_{jt} / s_{h(j)t}\) and \(s_{h(j)t}\) is the share of group \(h\) in market \(t\). See Berry (1994) or Cardell (1997) for more information.

Again, the left hand side is data, though \(\log s_{j|h(j)t}\) is clearly endogenous, which means we must instrument for it. Rather than including \(\log s_{j|h(j)t}\) among the linear components of utility in \(X_1\), whenever nesting_ids are included in product_data, \(\rho\) is treated as a nonlinear \(X_2\) parameter. This means that the linear component is given instead by

(9)\[\log s_{jt} - \log s_{0t} - \rho \log s_{j|h(j)t} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}.\]

This is done for two reasons:

  1. It forces the user to treat \(\rho\) as an endogenous parameter.

  2. It extends much more easily to the RCNL model of Brenkers and Verboven (2006).

A common choice for an additional instrument is the number of products per nest.

Application of Nested Logit

By including nesting_ids (another reserved name) as a field in product_data, we tell the package to estimate a nested logit model, and we don’t need to change any of the formulas. We show how to construct the category groupings in two different ways:

  1. We put all products in a single nest (only the outside good in the other nest).

  2. We put products into two nests (either mushy or non-mushy).

We also construct an additional instrument based on the number of products per nest. Typically this is useful as a source of exogenous variation in the within group share \(\log s_{j|h(j)t}\). However, in this example, because the number of products per nest does not vary across markets, this instrument is irrelevant if we include product fixed effects.

We’ll define a function that constructs the additional instrument and solves the nested logit problem. We’ll exclude product ID fixed effects, which are collinear with mushy, and we’ll choose \(\rho = 0.7\) as the initial value at which the optimization routine will start.

[6]:
def solve_nl(df):
    groups = df.groupby(['market_ids', 'nesting_ids'])
    df['demand_instruments20'] = groups['shares'].transform(np.size)
    nl_formulation = pyblp.Formulation('0 + prices')
    problem = pyblp.Problem(nl_formulation, df)
    return problem.solve(rho=0.7)

First, we’ll solve the problem when there’s a single nest for all products, with the outside good in its own nest.

[7]:
df1 = product_data.copy()
df1['nesting_ids'] = 1
nl_results1 = solve_nl(df1)
nl_results1
[7]:
Problem Results Summary:
======================================================================================
GMM   Objective    Projected    Reduced   Clipped  Weighting Matrix  Covariance Matrix
Step    Value    Gradient Norm  Hessian   Shares   Condition Number  Condition Number
----  ---------  -------------  --------  -------  ----------------  -----------------
 2    +2.0E+02     +1.2E-09     +1.1E+04     0         +2.0E+09          +3.0E+04
======================================================================================

Cumulative Statistics:
=================================================
Computation  Optimizer  Optimization   Objective
   Time      Converged   Iterations   Evaluations
-----------  ---------  ------------  -----------
 00:00:04       Yes          3             8
=================================================

Rho Estimates (Robust SEs in Parentheses):
==========
All Groups
----------
 +9.8E-01
(+1.4E-02)
==========

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -1.2E+00
(+4.0E-01)
==========

When we inspect the Problem, the only changes from the plain logit model are the additional instrument that contributes to \(M_D\) and the inclusion of \(H\), the number of nesting categories.

[8]:
nl_results1.problem
[8]:
Dimensions:
===============================
 T    N     F    K1    MD    H
---  ----  ---  ----  ----  ---
94   2256   5    1     21    1
===============================

Formulations:
==================================
     Column Indices:          0
--------------------------  ------
X1: Linear Characteristics  prices
==================================

Next, we’ll solve the problem when there are two nests for mushy and non-mushy.

[9]:
df2 = product_data.copy()
df2['nesting_ids'] = df2['mushy']
nl_results2 = solve_nl(df2)
nl_results2
[9]:
Problem Results Summary:
======================================================================================
GMM   Objective    Projected    Reduced   Clipped  Weighting Matrix  Covariance Matrix
Step    Value    Gradient Norm  Hessian   Shares   Condition Number  Condition Number
----  ---------  -------------  --------  -------  ----------------  -----------------
 2    +6.9E+02     +3.0E-10     +5.6E+03     0         +5.1E+08          +2.0E+04
======================================================================================

Cumulative Statistics:
=================================================
Computation  Optimizer  Optimization   Objective
   Time      Converged   Iterations   Evaluations
-----------  ---------  ------------  -----------
 00:00:03       Yes          3             8
=================================================

Rho Estimates (Robust SEs in Parentheses):
==========
All Groups
----------
 +8.9E-01
(+1.9E-02)
==========

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -7.8E+00
(+4.8E-01)
==========

For both cases we find that \(\hat{\rho} > 0.8\).

Finally, we’ll also look at the adjusted parameter on prices, \(\alpha / (1-\rho)\).

[10]:
nl_results1.beta[0] / (1 - nl_results1.rho)
[10]:
array([[-67.39338888]])
[11]:
nl_results2.beta[0] / (1 - nl_results2.rho)
[11]:
array([[-72.27074638]])
Treating Within Group Shares as Exogenous

The package is designed to prevent the user from treating the within group share, \(\log s_{j|h(j)t}\), as an exogenous variable. For example, if we were to compute a group_share variable and use the algebraic functionality of Formulation by including the expression log(shares / group_share) in our formula for \(X_1\), the package would raise an error because the package knows that shares should not be included in this formulation.

To demonstrate why this is a bad idea, we’ll override this feature by calculating \(\log s_{j|h(j)t}\) and including it as an additional variable in \(X_1\). To do so, we’ll first re-define our function for setting up and solving the nested logit problem.

[12]:
def solve_nl2(df):
    groups = df.groupby(['market_ids', 'nesting_ids'])
    df['group_share'] = groups['shares'].transform(np.sum)
    df['within_share'] = df['shares'] / df['group_share']
    df['demand_instruments20'] = groups['shares'].transform(np.size)
    nl2_formulation = pyblp.Formulation('0 + prices + log(within_share)')
    problem = pyblp.Problem(nl2_formulation, df.drop(columns=['nesting_ids']))
    return problem.solve()

Again, we’ll solve the problem when there’s a single nest for all products, with the outside good in its own nest.

[13]:
nl2_results1 = solve_nl2(df1)
nl2_results1
[13]:
Problem Results Summary:
=============================================================
GMM   Objective  Clipped  Weighting Matrix  Covariance Matrix
Step    Value    Shares   Condition Number  Condition Number
----  ---------  -------  ----------------  -----------------
 2    +2.0E+02      0         +2.1E+09          +1.1E+04
=============================================================

Cumulative Statistics:
========================
Computation   Objective
   Time      Evaluations
-----------  -----------
 00:00:00         2
========================

Beta Estimates (Robust SEs in Parentheses):
=============================
  prices    log(within_share)
----------  -----------------
 -1.0E+00       +9.9E-01
(+2.4E-01)     (+7.9E-03)
=============================

And again, we’ll solve the problem when there are two nests for mushy and non-mushy.

[14]:
nl2_results2 = solve_nl2(df2)
nl2_results2
[14]:
Problem Results Summary:
=============================================================
GMM   Objective  Clipped  Weighting Matrix  Covariance Matrix
Step    Value    Shares   Condition Number  Condition Number
----  ---------  -------  ----------------  -----------------
 2    +7.0E+02      0         +5.5E+08          +7.7E+03
=============================================================

Cumulative Statistics:
========================
Computation   Objective
   Time      Evaluations
-----------  -----------
 00:00:00         2
========================

Beta Estimates (Robust SEs in Parentheses):
=============================
  prices    log(within_share)
----------  -----------------
 -6.8E+00       +9.3E-01
(+2.9E-01)     (+1.1E-02)
=============================

One can observe that we obtain parameter estimates which are quite different from those above.

[15]:
nl2_results1.beta[0] / (1 - nl2_results1.beta[1])
[15]:
array([-86.37368445])
[16]:
nl2_results2.beta[0] / (1 - nl2_results2.beta[1])
[16]:
array([-100.14496891])

Download the Jupyter Notebook for this section: nevo.ipynb

Random Coefficients Logit Tutorial with the Fake Cereal Data

[1]:
import pyblp
import numpy as np
import pandas as pd

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

In this tutorial, we’ll use data from Nevo (2000a) to solve the paper’s fake cereal problem. Locations of CSV files that contain the data are in the data module.

Theory of Random Coefficients Logit

The random coefficients model extends the plain logit model by allowing for correlated tastes for different product characteristics. In this model (indirect) utility is given by

(1)\[u_{ijt} = \alpha_i p_{jt} + x_{jt} \beta_i^\text{ex} + \xi_{jt} + \epsilon_{ijt}\]

The main addition is that \(\beta_i = (\alpha_i, \beta_i^\text{ex})\) have individual specific subscripts \(i\).

Conditional on \(\beta_i\), the individual market shares follow the same logit form as before. But now we must integrate over heterogeneous individuals to get the aggregate market share:

(2)\[s_{jt}(\alpha, \beta, \theta) = \int \frac{\exp(\alpha_i p_{jt} + x_{jt} \beta_i^\text{ex} + \xi_{jt})}{1 + \sum_k \exp(\alpha_i p_{kt} + x_{kt} \beta_i^\text{ex} + \xi_{kt})} f(\alpha_i, \beta_i \mid \theta).\]

In general, this integral needs to be calculated numerically. This also means that we can’t directly linearize the model. It is common to re-parametrize the model to separate the aspects of mean utility that all individuals agree on, \(\delta_{jt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}\), from the individual specific heterogeneity, \(\mu_{ijt}(\theta)\). This gives us

(3)\[s_{jt}(\delta_{jt}, \theta) = \int \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_k \exp(\delta_{kt} + \mu_{ikt})} f(\mu_{it} | \theta).\]

Given a guess of \(\theta\) we can solve the system of nonlinear equations for the vector \(\delta\) which equates observed and predicted market shares, \(s_{jt} = s_{jt}(\delta, \theta)\). Now we can perform a linear IV GMM regression of the form

(4)\[\delta_{jt}(\theta) = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}.\]

The moments are constructed by interacting the predicted residuals \(\hat{\xi}_{jt}(\theta)\) with instruments \(z_{jt}\) to form

(5)\[\bar{g}(\theta) =\frac{1}{N} \sum_{j,t} z_{jt}' \hat{\xi}_{jt}(\theta).\]
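To fix ideas, the following minimal NumPy sketch (independent of PyBLP’s own routines, with all values hypothetical) approximates the share integral in (3) by Monte Carlo for a single market with one random coefficient:

import numpy as np

rng = np.random.default_rng(0)
delta = np.array([1.0, 0.5, -0.2])   # hypothetical mean utilities for J = 3 products
x2 = np.array([0.3, 1.1, 0.7])       # hypothetical nonlinear characteristic
sigma = 0.5                          # hypothetical standard deviation of its random coefficient
nu = rng.standard_normal((1000, 1))  # draws for I = 1,000 simulated individuals

mu = sigma * nu * x2                 # individual deviations mu_ijt (I x J)
exp_utilities = np.exp(delta + mu)   # exp(delta_jt + mu_ijt)
shares = (exp_utilities / (1 + exp_utilities.sum(axis=1, keepdims=True))).mean(axis=0)

Averaging the individual choice probabilities with equal weights is an equal-weighted Monte Carlo approximation of the integral; the Integration configurations introduced below construct such nodes and weights automatically.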

Random Coefficients

To include random coefficients we need to add a Formulation for the demand-side nonlinear characteristics \(X_2\).

Just like in the logit case we have the same reserved field names in product_data:

  • market_ids are the unique market identifiers which we subscript \(t\).

  • shares specifies the market shares, which need to be between zero and one; within a market ID, \(\sum_{j} s_{jt} < 1\).

  • prices are prices \(p_{jt}\). These have some special properties and are always treated as endogenous.

  • demand_instruments0, demand_instruments1, and so on are numbered demand instruments. These represent only the excluded instruments. The exogenous regressors in \(X_1\) (of which \(X_2\) is typically a subset) will be automatically added to the set of instruments.

We proceed with the following steps:

  1. Load the product data which at a minimum consists of market_ids, shares, prices, and at least a single column of demand instruments, demand_instruments0.

  2. Define a Formulation for the \(X_1\) (linear) demand model.

    • This and all other formulas are similar to R and patsy formulas.

    • It includes a constant by default. To exclude the constant, specify either a 0 or a -1.

    • To efficiently include fixed effects, use the absorb option and specify which categorical variables you would like to absorb.

    • Some model reduction may happen automatically. The constant will be excluded if you include fixed effects, and some precautions are taken against collinearity. However, you will have to make sure that differently-named variables are not collinear.

  3. Define a Formulation for the \(X_2\) (nonlinear) demand model.

    • Include only the variables over which we want random coefficients.

    • Do not absorb or include fixed effects.

    • It will include a random coefficient on the constant (to capture inside good vs. outside good preference) unless you specify not to with a 0 or a -1.

  4. Define an Integration configuration to solve the market share integral from several available options:

    • Monte Carlo integration (pseudo-random draws).

    • Product rule quadrature.

    • Sparse grid quadrature.

  5. Combine Formulation classes, product_data, and the Integration configuration to construct a Problem.

  6. Use the Problem.solve method to estimate parameters (a compact sketch of the full workflow follows this list).

    • An initial guess of the nonlinear parameters is required. It serves two primary purposes: speeding up estimation and, through initial values of zero, indicating to the solver which parameters are restricted to always be zero.
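The following compact sketch strings these steps together with hypothetical file and variable names; the rest of this tutorial carries them out with the actual fake cereal data:

import numpy as np
import pandas as pd
import pyblp

product_data = pd.read_csv('products.csv')  # hypothetical CSV with the reserved fields above

X1_formulation = pyblp.Formulation('0 + prices + x', absorb='C(product_ids)')  # linear part
X2_formulation = pyblp.Formulation('1 + prices + x')                           # random coefficients
integration = pyblp.Integration('monte_carlo', size=500, specification_options={'seed': 0})

problem = pyblp.Problem((X1_formulation, X2_formulation), product_data, integration=integration)
results = problem.solve(sigma=np.eye(3))  # diagonal initial guess for the 1, prices, and x coefficients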

Specification of Random Taste Parameters

It is common to assume that \(f(\beta_i \mid \theta)\) follows a multivariate normal distribution, and to break it up into three parts:

  1. A mean \(K_1 \times 1\) taste which all individuals agree on, \(\beta\).

  2. A \(K_2 \times K_2\) covariance matrix, \(V\). As is common with multivariate normal distributions, \(V\) is not estimated directly. Rather, its lower-triangular Cholesky root \(\Sigma\) is estimated, where \(\Sigma\Sigma' = V\).

  3. Any \(K_2 \times D\) interactions, \(\Pi\), with observed \(D \times 1\) demographic data, \(d_i\).

Together this gives us that

(6)\[\beta_i \sim N(\beta + \Pi d_i, \Sigma\Sigma').\]

Problem.solve takes an initial guess \(\Sigma_0\) of \(\Sigma\). It guarantees that \(\hat{\Sigma}\) (the estimated parameters) will have the same sparsity structure as \(\Sigma_0\), so any zero element of \(\Sigma_0\) is restricted to be zero in the solution \(\hat{\Sigma}\). For example, a popular restriction is that \(\Sigma\) is diagonal; this can be achieved by passing a diagonal matrix as \(\Sigma_0\).
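As a concrete illustration of equation (6) and of why the structure of \(\Sigma_0\) matters, here is a small hypothetical sketch (not part of any PyBLP routine) of drawing one individual’s tastes:

import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0])      # mean tastes for K2 = 2 characteristics (hypothetical)
sigma = np.array([[0.5, 0.0],
                  [0.2, 0.3]])    # lower-triangular Cholesky root of V (hypothetical)
pi = np.array([[0.1],
               [0.4]])            # K2 x D interactions with D = 1 demographic (hypothetical)
d_i = np.array([1.5])             # one individual's demographic draw (hypothetical)

beta_i = beta + pi @ d_i + sigma @ rng.standard_normal(2)

Passing np.diag([0.5, 0.3]) instead of the full lower-triangular matrix as \(\Sigma_0\) would fix the off-diagonal element at zero throughout estimation.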

Loading Data

The product_data argument of Problem should be a structured array-like object with fields that store data. Product data can be a structured NumPy array, a pandas DataFrame, or other similar objects.

[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids city_ids quarter product_ids firm_ids brand_ids shares prices sugar mushy ... demand_instruments10 demand_instruments11 demand_instruments12 demand_instruments13 demand_instruments14 demand_instruments15 demand_instruments16 demand_instruments17 demand_instruments18 demand_instruments19
0 C01Q1 1 1 F1B04 1 4 0.012417 0.072088 2 1 ... 2.116358 -0.154708 -0.005796 0.014538 0.126244 0.067345 0.068423 0.034800 0.126346 0.035484
1 C01Q1 1 1 F1B06 1 6 0.007809 0.114178 18 1 ... -7.374091 -0.576412 0.012991 0.076143 0.029736 0.087867 0.110501 0.087784 0.049872 0.072579
2 C01Q1 1 1 F1B07 1 7 0.012995 0.132391 4 1 ... 2.187872 -0.207346 0.003509 0.091781 0.163773 0.111881 0.108226 0.086439 0.122347 0.101842
3 C01Q1 1 1 F1B09 1 9 0.005770 0.130344 3 0 ... 2.704576 0.040748 -0.003724 0.094732 0.135274 0.088090 0.101767 0.101777 0.110741 0.104332
4 C01Q1 1 1 F1B11 1 11 0.017934 0.154823 12 0 ... 1.261242 0.034836 -0.000568 0.102451 0.130640 0.084818 0.101075 0.125169 0.133464 0.121111

5 rows × 30 columns

The product data contains market_ids, product_ids, firm_ids, shares, prices, a number of other IDs and product characteristics, and some pre-computed excluded demand_instruments0, demand_instruments1, and so on. The product_ids will be incorporated as fixed effects.

For more information about the instruments and the example data as a whole, refer to the data module.

Setting Up and Solving the Problem Without Demographics

Formulations, product data, and an integration configuration are collectively used to initialize a Problem. Once initialized, Problem.solve runs the estimation routine. The arguments to Problem.solve configure how estimation is performed. For example, optimization and iteration arguments configure the optimization and iteration routines that are used by the outer and inner loops of estimation.

We’ll specify Formulation configurations for \(X_1\), the demand-side linear characteristics, and \(X_2\), the nonlinear characteristics.

  • The formulation for \(X_1\) consists of prices and fixed effects constructed from product_ids, which we will absorb using the absorb argument of Formulation.

  • If we were interested in reporting estimates for each fixed effect, we could replace the formulation for \(X_1\) with Formulation('prices + C(product_ids)').

  • Because sugar, mushy, and the constant are collinear with product_ids, we can include them in \(X_2\) but not in \(X_1\).

[3]:
X1_formulation = pyblp.Formulation('0 + prices', absorb='C(product_ids)')
X2_formulation = pyblp.Formulation('1 + prices + sugar + mushy')
product_formulations = (X1_formulation, X2_formulation)
product_formulations
[3]:
(prices + Absorb[C(product_ids)], 1 + prices + sugar + mushy)

We also need to specify an Integration configuration. We consider two alternatives:

  1. Monte Carlo draws: we simulate 50 individuals from a random normal distribution. This is just for simplicity. In practice quasi-Monte Carlo sequences such as Halton draws are preferable, and there should generally be many more simulated individuals than just 50.

  2. Product rules: we construct nodes and weights according to a product rule that exactly integrates polynomials of degree \(5 \times 2 - 1 = 9\) or less.

[4]:
mc_integration = pyblp.Integration('monte_carlo', size=50, specification_options={'seed': 0})
mc_integration
[4]:
Configured to construct nodes and weights with Monte Carlo simulation with options {seed: 0}.
[5]:
pr_integration = pyblp.Integration('product', size=5)
pr_integration
[5]:
Configured to construct nodes and weights according to the level-5 Gauss-Hermite product rule with options {}.
[6]:
mc_problem = pyblp.Problem(product_formulations, product_data, integration=mc_integration)
mc_problem
[6]:
Dimensions:
============================================
 T    N     F    I     K1    K2    MD    ED
---  ----  ---  ----  ----  ----  ----  ----
94   2256   5   4700   1     4     20    1
============================================

Formulations:
===========================================================
       Column Indices:           0       1       2      3
-----------------------------  ------  ------  -----  -----
 X1: Linear Characteristics    prices
X2: Nonlinear Characteristics    1     prices  sugar  mushy
===========================================================
[7]:
pr_problem = pyblp.Problem(product_formulations, product_data, integration=pr_integration)
pr_problem
[7]:
Dimensions:
=============================================
 T    N     F     I     K1    K2    MD    ED
---  ----  ---  -----  ----  ----  ----  ----
94   2256   5   58750   1     4     20    1
=============================================

Formulations:
===========================================================
       Column Indices:           0       1       2      3
-----------------------------  ------  ------  -----  -----
 X1: Linear Characteristics    prices
X2: Nonlinear Characteristics    1     prices  sugar  mushy
===========================================================

As an illustration of how to configure the optimization routine, we’ll use a simpler, non-default Optimization configuration that doesn’t support parameter bounds, and a relatively loose tolerance so the problems are solved quickly. In practice, along with using more integration draws, it’s a good idea to use a tighter termination tolerance.

[8]:
bfgs = pyblp.Optimization('bfgs', {'gtol': 1e-4})
bfgs
[8]:
Configured to optimize using the BFGS algorithm implemented in SciPy with analytic gradients and options {gtol: +1.0E-04}.

We estimate three versions of the model:

  1. An unrestricted covariance matrix for random tastes using Monte Carlo integration.

  2. An unrestricted covariance matrix for random tastes using the product rule.

  3. A restricted diagonal matrix for random tastes using Monte Carlo Integration.

Notice that the only thing that changes when we estimate the restricted covariance is our initial guess of \(\Sigma_0\). The upper triangle of this initial guess is ignored because we are optimizing over the lower-triangular Cholesky root of \(V = \Sigma\Sigma'\).

[9]:
results1 = mc_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)
results1
[9]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 2    +1.5E+02   +8.7E-05     +8.5E-02        +6.5E+03        0         +5.2E+07          +8.3E+05
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:01:48       Yes          58           75          82256       252902
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=========================================================================================================================
Sigma:      1         prices      sugar       mushy     |  Sigma Squared:      1         prices      sugar       mushy
------  ----------  ----------  ----------  ----------  |  --------------  ----------  ----------  ----------  ----------
  1      +1.2E+00                                       |        1          +1.5E+00    -1.4E+01    +7.3E-02    -7.1E-01
        (+3.0E+00)                                      |                  (+7.2E+00)  (+5.2E+01)  (+2.2E-01)  (+2.3E+00)
                                                        |
prices   -1.1E+01    +8.4E+00                           |      prices       -1.4E+01    +2.0E+02    -1.5E+00    +1.5E+00
        (+1.8E+01)  (+1.2E+01)                          |                  (+5.2E+01)  (+3.1E+02)  (+1.2E+00)  (+1.5E+01)
                                                        |
sugar    +6.1E-02    -9.1E-02    +3.8E-02               |      sugar        +7.3E-02    -1.5E+00    +1.3E-02    +2.0E-02
        (+2.5E-01)  (+2.3E-01)  (+8.3E-02)              |                  (+2.2E-01)  (+1.2E+00)  (+2.8E-02)  (+2.7E-01)
                                                        |
mushy    -5.9E-01    -6.2E-01    -2.3E-02    +4.8E-01   |      mushy        -7.1E-01    +1.5E+00    +2.0E-02    +9.6E-01
        (+2.1E+00)  (+1.5E+00)  (+2.5E+00)  (+1.3E+00)  |                  (+2.3E+00)  (+1.5E+01)  (+2.7E-01)  (+4.0E+00)
=========================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.1E+01
(+6.0E+00)
==========
[10]:
results2 = pr_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)
results2
[10]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 2    +1.6E+02   +1.1E-05     +1.6E-02        +5.3E+03        0         +5.3E+07          +5.0E+20
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:04:57       No           63           130         96884       301280
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=========================================================================================================================
Sigma:      1         prices      sugar       mushy     |  Sigma Squared:      1         prices      sugar       mushy
------  ----------  ----------  ----------  ----------  |  --------------  ----------  ----------  ----------  ----------
  1      -7.4E-01                                       |        1          +5.5E-01    -9.4E+00    +8.3E-02    -1.1E-01
        (+2.3E+00)                                      |                  (+3.4E+00)  (+3.5E+01)  (+1.6E-01)  (+6.4E-01)
                                                        |
prices   +1.3E+01    +6.6E-06                           |      prices       -9.4E+00    +1.6E+02    -1.4E+00    +1.9E+00
        (+7.5E+00)  (+2.7E+03)                          |                  (+3.5E+01)  (+1.9E+02)  (+8.0E-01)  (+8.9E+00)
                                                        |
sugar    -1.1E-01    -7.4E-08    -8.9E-09               |      sugar        +8.3E-02    -1.4E+00    +1.2E-02    -1.7E-02
        (+2.0E-01)  (+2.2E+05)  (+5.2E+04)              |                  (+1.6E-01)  (+8.0E-01)  (+2.2E-02)  (+1.6E-01)
                                                        |
mushy    +1.5E-01    -4.3E-07    +1.6E-07    +4.7E-08   |      mushy        -1.1E-01    +1.9E+00    -1.7E-02    +2.2E-02
        (+6.8E-01)  (+5.0E+03)  (+7.0E+02)  (+4.3E+02)  |                  (+6.4E-01)  (+8.9E+00)  (+1.6E-01)  (+2.0E-01)
=========================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.1E+01
(+4.0E+00)
==========
[11]:
results3 = mc_problem.solve(sigma=np.eye(4), optimization=bfgs)
results3
[11]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 2    +1.8E+02   +1.3E-06     +1.1E+00        +6.0E+03        0         +5.8E+07          +4.8E+04
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:28       Yes          16           24          19529        60447
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
======================================================
Sigma:      1         prices      sugar       mushy
------  ----------  ----------  ----------  ----------
  1      +5.2E-02
        (+1.1E+00)

prices   +0.0E+00    -4.3E-01
                    (+8.0E+00)

sugar    +0.0E+00    +0.0E+00    +3.6E-02
                                (+5.8E-02)

mushy    +0.0E+00    +0.0E+00    +0.0E+00    +5.0E-01
                                            (+1.7E+00)
======================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.0E+01
(+1.4E+00)
==========

We see that all three models give similar estimates of the price coefficient \(\hat{\alpha} \approx -30\). Note that a few of the estimated terms on the diagonal of \(\Sigma\) are negative. Since the diagonal consists of standard deviations, negative values are unrealistic. When using another optimization routine that supports bounds (like the default L-BFGS-B routine), these diagonal elements are by default bounded from below by zero.
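For example, one could re-solve the diagonal specification with the default bounded routine; a hypothetical sketch (not run in this tutorial) would be:

lbfgsb = pyblp.Optimization('l-bfgs-b', {'gtol': 1e-4})
bounded_results = mc_problem.solve(sigma=np.eye(4), optimization=lbfgsb)  # diagonal of Sigma bounded below by zero by default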

Adding Demographics to the Problem

To add demographic data we need to make two changes:

  1. We need to load agent_data, which for this cereal problem contains pre-computed Monte Carlo draws and demographics.

  2. We need to add an agent_formulation to the model.

The agent data has several reserved column names.

  • market_ids are the indices that link the agent data to the market_ids in the product data.

  • weights are the weights \(w_{it}\) attached to each agent. In each market, these should sum to one so that \(\sum_i w_{it} = 1\). It is often the case that \(w_{it} = 1 / I_t\) where \(I_t\) is the number of agents in market \(t\), so that each agent gets equal weight (see the sketch after this list). Other possibilities include quadrature nodes and weights.

  • nodes0, nodes1, and so on are the nodes at which the unobserved agent tastes \(\mu_{ijt}\) are evaluated. The nodes should be labeled from \(0, \ldots, K_2 - 1\) where \(K_2\) is the number of random coefficients.

  • Other fields are the realizations of the demographics \(d\) themselves.
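For instance, a hypothetical agent_data with equal weights and standard normal nodes for \(K_2 = 2\) random coefficients could be built by hand as follows (this cereal example instead loads pre-computed draws and demographics below):

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
draws = 100  # hypothetical number of draws per market
agent_data = pd.concat(
    pd.DataFrame({
        'market_ids': t,
        'weights': 1 / draws,                 # equal weights that sum to one within each market
        'nodes0': rng.standard_normal(draws),
        'nodes1': rng.standard_normal(draws),
    })
    for t in ['C01Q1', 'C01Q2']               # hypothetical market identifiers
)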

[12]:
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)
agent_data.head()
[12]:
market_ids city_ids quarter weights nodes0 nodes1 nodes2 nodes3 income income_squared age child
0 C01Q1 1 1 0.05 0.434101 -1.500838 -1.151079 0.161017 0.495123 8.331304 -0.230109 -0.230851
1 C01Q1 1 1 0.05 -0.726649 0.133182 -0.500750 0.129732 0.378762 6.121865 -2.532694 0.769149
2 C01Q1 1 1 0.05 -0.623061 -0.138241 0.797441 -0.795549 0.105015 1.030803 -0.006965 -0.230851
3 C01Q1 1 1 0.05 -0.041317 1.257136 -0.683054 0.259044 -1.485481 -25.583605 -0.827946 0.769149
4 C01Q1 1 1 0.05 -0.466691 0.226968 1.044424 0.092019 -0.316597 -6.517009 -0.230109 -0.230851

The agent formulation tells us which columns of demographic information to interact with \(X_2\).

[13]:
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')
agent_formulation
[13]:
income + income_squared + age + child

This tells us to include demographic realizations for income, income_squared, age, and the presence of children, child, but to ignore other possible demographics when interacting demographics with \(X_2\). We should also generally exclude the constant from the demographic formula.

Now we configure the Problem to include the agent_formulation and agent_data, which follow the product_formulations and product_data.

When we display the class, it lists the demographic interactions in the table of formulations and reports \(D = 4\), the dimension of the demographic draws.

[14]:
nevo_problem = pyblp.Problem(
    product_formulations,
    product_data,
    agent_formulation,
    agent_data
)
nevo_problem
[14]:
Dimensions:
=================================================
 T    N     F    I     K1    K2    D    MD    ED
---  ----  ---  ----  ----  ----  ---  ----  ----
94   2256   5   1880   1     4     4    20    1
=================================================

Formulations:
===================================================================
       Column Indices:           0           1           2      3
-----------------------------  ------  --------------  -----  -----
 X1: Linear Characteristics    prices
X2: Nonlinear Characteristics    1         prices      sugar  mushy
       d: Demographics         income  income_squared   age   child
===================================================================

The initialized problem can be solved with Problem.solve. We’ll use the same starting values as Nevo (2000a). By passing a diagonal matrix as starting values for \(\Sigma\), we’re choosing to ignore covariance terms. Similarly, zeros in the starting values for \(\Pi\) mean that those parameters will be fixed at zero.

To replicate common estimates, we’ll use the non-default BFGS optimization routine (with a slightly tighter tolerance to avoid getting stuck at a spurious local minimum), and we’ll configure Problem.solve to use 1-step GMM instead of 2-step GMM.

[15]:
initial_sigma = np.diag([0.3302, 2.4526, 0.0163, 0.2441])
initial_pi = np.array([
  [ 5.4819,  0,      0.2037,  0     ],
  [15.8935, -1.2000, 0,       2.6342],
  [-0.2506,  0,      0.0511,  0     ],
  [ 1.2650,  0,     -0.8091,  0     ]
])
tighter_bfgs = pyblp.Optimization('bfgs', {'gtol': 1e-5})
nevo_results = nevo_problem.solve(
    initial_sigma,
    initial_pi,
    optimization=tighter_bfgs,
    method='1s'
)
nevo_results
[15]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +4.6E+00   +6.9E-06     +3.3E-05        +1.6E+04        0         +6.9E+07          +8.4E+08
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:01:12       Yes          51           57          46389       143977
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +5.6E-01                                       |    1      +2.3E+00      +0.0E+00      +1.3E+00    +0.0E+00
        (+1.6E-01)                                      |          (+1.2E+00)                  (+6.3E-01)
                                                        |
prices   +0.0E+00    +3.3E+00                           |  prices   +5.9E+02      -3.0E+01      +0.0E+00    +1.1E+01
                    (+1.3E+00)                          |          (+2.7E+02)    (+1.4E+01)                (+4.1E+00)
                                                        |
sugar    +0.0E+00    +0.0E+00    -5.8E-03               |  sugar    -3.8E-01      +0.0E+00      +5.2E-02    +0.0E+00
                                (+1.4E-02)              |          (+1.2E-01)                  (+2.6E-02)
                                                        |
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +9.3E-02   |  mushy    +7.5E-01      +0.0E+00      -1.4E+00    +0.0E+00
                                            (+1.9E-01)  |          (+8.0E-01)                  (+6.7E-01)
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -6.3E+01
(+1.5E+01)
==========

Results are similar to those in the original paper with a (scaled) objective value of \(q(\hat{\theta}) = 4.65\) and a price coefficient of \(\hat{\alpha} = -62.7\).

Restricting Parameters

Because the interactions between price, income, and income_squared are potentially collinear, we might worry that \(\hat{\Pi}_{21} = 588\) and \(\hat{\Pi}_{22} = -30.2\) are pushing the price coefficient in opposite directions. Both are large in magnitude but statistically insignificant. One way of dealing with this is to restrict \(\Pi_{22} = 0\).

There are two ways we can do this:

  1. Change the initial \(\Pi_0\) values to make this term zero.

  2. Change the agent formula to drop income_squared.

First, we’ll change the initial \(\Pi_0\) values.

[16]:
restricted_pi = initial_pi.copy()
restricted_pi[1, 1] = 0
nevo_problem.solve(
    initial_sigma,
    restricted_pi,
    optimization=tighter_bfgs,
    method='1s'
)
[16]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +1.5E+01   +5.2E-06     +4.7E-02        +1.7E+04        0         +6.9E+07          +5.7E+05
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:53       Yes          34           40          34390       106492
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +3.7E-01                                       |    1      +3.1E+00      +0.0E+00      +1.2E+00    +0.0E+00
        (+1.2E-01)                                      |          (+1.1E+00)                  (+1.0E+00)
                                                        |
prices   +0.0E+00    +1.8E+00                           |  prices   +4.2E+00      +0.0E+00      +0.0E+00    +1.2E+01
                    (+9.2E-01)                          |          (+4.6E+00)                              (+5.2E+00)
                                                        |
sugar    +0.0E+00    +0.0E+00    -4.4E-03               |  sugar    -1.9E-01      +0.0E+00      +2.8E-02    +0.0E+00
                                (+1.2E-02)              |          (+3.5E-02)                  (+3.2E-02)
                                                        |
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +8.6E-02   |  mushy    +1.5E+00      +0.0E+00      -1.5E+00    +0.0E+00
                                            (+1.9E-01)  |          (+6.5E-01)                  (+1.1E+00)
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.2E+01
(+2.3E+00)
==========

Now we’ll drop both income_squared and the corresponding column in \(\Pi_0\).

[17]:
restricted_formulation = pyblp.Formulation('0 + income + age + child')
deleted_pi = np.delete(initial_pi, 1, axis=1)
restricted_problem = pyblp.Problem(
    product_formulations,
    product_data,
    restricted_formulation,
    agent_data
)
restricted_problem.solve(
    initial_sigma,
    deleted_pi,
    optimization=tighter_bfgs,
    method='1s'
)
[17]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +1.5E+01   +5.2E-06     +4.7E-02        +1.7E+04        0         +6.9E+07          +5.7E+05
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:55       Yes          34           40          34371       106443
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income       age        child
------  ----------  ----------  ----------  ----------  |  ------  ----------  ----------  ----------
  1      +3.7E-01                                       |    1      +3.1E+00    +1.2E+00    +0.0E+00
        (+1.2E-01)                                      |          (+1.1E+00)  (+1.0E+00)
                                                        |
prices   +0.0E+00    +1.8E+00                           |  prices   +4.2E+00    +0.0E+00    +1.2E+01
                    (+9.2E-01)                          |          (+4.6E+00)              (+5.2E+00)
                                                        |
sugar    +0.0E+00    +0.0E+00    -4.4E-03               |  sugar    -1.9E-01    +2.8E-02    +0.0E+00
                                (+1.2E-02)              |          (+3.5E-02)  (+3.2E-02)
                                                        |
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +8.6E-02   |  mushy    +1.5E+00    -1.5E+00    +0.0E+00
                                            (+1.9E-01)  |          (+6.5E-01)  (+1.1E+00)
=====================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.2E+01
(+2.3E+00)
==========

The parameter estimates and standard errors are identical for both approaches. Based on the number of fixed point iterations, there is some evidence that the solver took a slightly different path for each problem, but both restricted problems arrived at identical answers. The \(\hat{\Pi}_{21}\) interaction term is still insignificant.

Download the Jupyter Notebook for this section: blp.ipynb

Supply Side Tutorial with Automobile Data

[1]:
import pyblp
import numpy as np
import pandas as pd

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

In this tutorial, we’ll use data from Berry, Levinsohn, and Pakes (1995) to solve the paper’s automobile problem. This tutorial is similar to the fake cereal tutorial, but exhibits some other features of pyblp:

  • Incorporating a supply side into demand estimation.

  • Allowing for simple price-income demographic effects.

  • Calculating clustered standard errors.

Loading Data

We’ll use pandas to load two sets of data:

  1. product_data, which contains prices, shares, and other product characteristics.

  2. agent_data, which contains draws from the distribution of heterogeneity.

[2]:
product_data = pd.read_csv(pyblp.data.BLP_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids clustering_ids car_ids firm_ids region shares prices hpwt air mpd ... supply_instruments2 supply_instruments3 supply_instruments4 supply_instruments5 supply_instruments6 supply_instruments7 supply_instruments8 supply_instruments9 supply_instruments10 supply_instruments11
0 1971 AMGREM71 129 15 US 0.001051 4.935802 0.528997 0 1.888146 ... 0.0 1.705933 1.595656 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.888146
1 1971 AMHORN71 130 15 US 0.000670 5.516049 0.494324 0 1.935989 ... 0.0 1.680910 1.490295 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.935989
2 1971 AMJAVL71 132 15 US 0.000341 7.108642 0.467613 0 1.716799 ... 0.0 1.801067 1.357703 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.716799
3 1971 AMMATA71 134 15 US 0.000522 6.839506 0.426540 0 1.687871 ... 0.0 1.818061 1.261347 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.687871
4 1971 AMAMBS71 136 15 US 0.000442 8.928395 0.452489 0 1.504286 ... 0.0 1.933210 1.237365 87.0 -61.959985 0.0 46.060389 29.786989 0.0 1.504286

5 rows × 33 columns

The product_data contains market IDs, product IDs, firm IDs, shares, prices, a number of product characteristics, and instruments. The product IDs are called clustering_ids because they will be used to compute clustered standard errors. For more information about the instruments and the example data as a whole, refer to the data module.

The agent_data contains market IDs, integration weights \(w_{it}\), integration nodes \(\nu_{it}\), and demographics \(d_{it}\). Here we use the \(I_t = 200\) importance sampling weights and nodes from the original paper.

In non-example problems, it is usually a better idea to use many more draws, or a more sophisticated Integration configuration such as sparse grid quadrature.
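For instance, a hypothetical sparse grid configuration (not used in this tutorial, which keeps the original paper’s draws) could be constructed with:

sparse_grid_integration = pyblp.Integration('grid', size=7)  # sparse grid quadrature of level 7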

[3]:
agent_data = pd.read_csv(pyblp.data.BLP_AGENTS_LOCATION)
agent_data.head()
[3]:
market_ids weights nodes0 nodes1 nodes2 nodes3 nodes4 income
0 1971 0.000543 1.192188 0.478777 0.980830 -0.824410 2.473301 109.560369
1 1971 0.000723 1.497074 -2.026204 -1.741316 1.412568 -0.747468 45.457314
2 1971 0.000544 1.438081 0.813280 -1.749974 -1.203509 0.049558 127.146548
3 1971 0.000701 1.768655 -0.177453 0.286602 0.391517 0.683669 22.604045
4 1971 0.000549 0.849970 -0.135337 0.735920 1.036247 -1.143436 170.226032

Setting up the Problem

Unlike the fake cereal problem, we won’t absorb any fixed effects in the automobile problem, so the linear part of demand \(X_1\) has more components. We also need to specify a formula for the random coefficients \(X_2\), including a random coefficient on the constant, which captures correlation among all inside goods.

The primary new addition to the model relative to the fake cereal problem is that we add a supply side formula for product characteristics that contribute to marginal costs, \(X_3\). The patsy-style formulas support functions of regressors such as the log function used below.

We stack the three product formulations in order: \(X_1\), \(X_2\), and \(X_3\).

[4]:
product_formulations = (
   pyblp.Formulation('1 + hpwt + air + mpd + space'),
   pyblp.Formulation('1 + prices + hpwt + air + mpd + space'),
   pyblp.Formulation('1 + log(hpwt) + air + log(mpg) + log(space) + trend')
)
product_formulations
[4]:
(1 + hpwt + air + mpd + space,
 1 + prices + hpwt + air + mpd + space,
 1 + log(hpwt) + air + log(mpg) + log(space) + trend)

The original specification for the automobile problem includes the term \(\log(y_i - p_j)\), in which \(y\) is income and \(p\) are prices. Instead of including this term, which gives rise to a host of numerical problems, we’ll follow Berry, Levinsohn, and Pakes (1999) and use its first-order linear approximation, \(p_j / y_i\).

The agent formulation for demographics, \(d\), includes a column of \(1 / y_i\) values, which we’ll interact with \(p_j\). To do this, we will treat draws of \(y_i\) as demographic variables.

[5]:
agent_formulation = pyblp.Formulation('0 + I(1 / income)')
agent_formulation
[5]:
I(1 / income)

As in the cereal example, the Problem can be constructed by combining the product_formulations, product_data, agent_formulation, and agent_data. We’ll also choose the functional form of marginal costs \(c_{jt}\). A linear marginal cost specification is the default setting, so we’ll need to use the costs_type argument of Problem to employ the log-linear specification used by Berry, Levinsohn, and Pakes (1995).

When initializing the problem, we get a warning about integration weights not summing to one. This is because the above agent data were created by the original paper with importance sampling. To disable this warning, we could increase pyblp.options.weights_tol.
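For instance, since these weights were intentionally built with importance sampling, one hypothetical way to silence the warning would be to raise the tolerance before initializing the problem (we leave the warning in place below):

pyblp.options.weights_tol = np.inf  # treat any deviation of weight sums from one as acceptable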

[6]:
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data, costs_type='log')
problem
Integration weights in the following markets sum to a value that differs from 1 by more than options.weights_tol: all markets. Sometimes this is fine, for example when weights were built with importance sampling. Otherwise, it is a sign that there is a data problem.
[6]:
Dimensions:
=======================================================
 T    N     F    I     K1    K2    K3    D    MD    MS
---  ----  ---  ----  ----  ----  ----  ---  ----  ----
20   2217  26   4000   5     6     6     1    13    18
=======================================================

Formulations:
=======================================================================================
       Column Indices:             0           1       2       3          4         5
-----------------------------  ----------  ---------  ----  --------  ----------  -----
 X1: Linear Characteristics        1         hpwt     air     mpd       space
X2: Nonlinear Characteristics      1        prices    hpwt    air        mpd      space
X3: Log Cost Characteristics       1       log(hpwt)  air   log(mpg)  log(space)  trend
       d: Demographics         1*1/income
=======================================================================================

The problem outputs a table of dimensions:

  • \(T\) denotes the number of markets.

  • \(N\) is the length of the dataset (the number of products across all markets).

  • \(F\) denotes the number of firms.

  • \(I = \sum_t I_t\) is the total number of agents across all markets (200 draws per market times 20 markets).

  • \(K_1\) is the number of linear demand characteristics.

  • \(K_2\) is the number of nonlinear demand characteristics.

  • \(K_3\) is the number of linear supply characteristics.

  • \(D\) is the number of demographic variables.

  • \(M_D\) is the number of demand instruments, including exogenous regressors.

  • \(M_S\) is the number of supply instruments, including exogenous regressors.

The formulations table describes all four formulas for demand-side linear characteristics, demand-side nonlinear characteristics, supply-side characteristics, and demographics.

Solving the Problem

The only remaining decisions are:

  • Choosing \(\Sigma\) and \(\Pi\) starting values, \(\Sigma_0\) and \(\Pi_0\).

  • Potentially choosing bounds for \(\Sigma\) and \(\Pi\).

The decisions we will use are:

  • Use published estimates as our starting values in \(\Sigma_0\).

  • Interact the inverse of income, \(1 / y_i\), only with prices, and use the published estimate on \(\log(y_i - p_j)\) as our starting value for \(\alpha\) in \(\Pi_0\).

  • Bound \(\Sigma_0\) to be positive since it is a diagonal matrix where the diagonal consists of standard deviations.

When using a routine that supports bounds, it’s usually a good idea to set your own tighter bounds so that the routine doesn’t try out large parameter values that create numerical issues.

[7]:
initial_sigma = np.diag([3.612, 0, 4.628, 1.818, 1.050, 2.056])
initial_pi = np.c_[[0, -43.501, 0, 0, 0, 0]]

Note that there are only 5 nonzeros on the diagonal of \(\Sigma\), which means that we only need 5 columns of integration nodes to integrate over these 5 dimensions of unobserved heterogeneity. Indeed, agent_data contains exactly 5 columns of nodes. If we were to ignore the \(\log(y_i - p_j)\) term (by not configuring \(\Pi\)) and instead include a term on prices in \(\Sigma\), we would need 6 columns of integration nodes in our agent_data.
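As a quick illustrative check (not required for estimation), we could count both:

print(np.count_nonzero(initial_sigma.diagonal()))                        # 5 nonzero standard deviations
print(sum(column.startswith('nodes') for column in agent_data.columns))  # 5 columns of integration nodes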

A downside of the log-linear marginal costs specification is that nonpositive estimated marginal costs can create problems for the optimization routine when computing \(\log c(\hat{\theta})\). We’ll use the costs_bounds argument to bound marginal costs from below by a small number.

Finally, as in the original paper, we’ll use W_type and se_type to cluster by product IDs, which were specified as clustering_ids in product_data, and set initial_update=True to update the initial GMM weighting matrix and the mean utility at the starting parameter values.

[8]:
results = problem.solve(
    initial_sigma,
    initial_pi,
    costs_bounds=(0.001, None),
    W_type='clustered',
    se_type='clustered',
    initial_update=True,
)
results
[8]:
Problem Results Summary:
=======================================================================================================================
GMM   Objective    Projected    Reduced Hessian  Reduced Hessian  Clipped  Clipped  Weighting Matrix  Covariance Matrix
Step    Value    Gradient Norm  Min Eigenvalue   Max Eigenvalue   Shares    Costs   Condition Number  Condition Number
----  ---------  -------------  ---------------  ---------------  -------  -------  ----------------  -----------------
 2    +5.0E+02     +1.1E-08        +4.9E-01         +5.1E+02         0        0         +4.2E+09          +3.8E+08
=======================================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:10:14       No           58           126         36807       112905
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs Adjusted for 999 Clusters in Parentheses):
===================================================================================================
Sigma:      1        prices      hpwt        air         mpd        space     |   Pi:    1*1/income
------  ----------  --------  ----------  ----------  ----------  ----------  |  ------  ----------
  1      +2.0E+00                                                             |    1      +0.0E+00
        (+6.1E+00)                                                            |
                                                                              |
prices   +0.0E+00   +0.0E+00                                                  |  prices   -4.5E+01
                                                                              |          (+9.2E+00)
                                                                              |
 hpwt    +0.0E+00   +0.0E+00   +6.1E+00                                       |   hpwt    +0.0E+00
                              (+2.2E+00)                                      |
                                                                              |
 air     +0.0E+00   +0.0E+00   +0.0E+00    +4.0E+00                           |   air     +0.0E+00
                                          (+2.1E+00)                          |
                                                                              |
 mpd     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +2.5E-01               |   mpd     +0.0E+00
                                                      (+5.5E-01)              |
                                                                              |
space    +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +1.9E+00   |  space    +0.0E+00
                                                                  (+1.1E+00)  |
===================================================================================================

Beta Estimates (Robust SEs Adjusted for 999 Clusters in Parentheses):
==========================================================
    1          hpwt        air         mpd        space
----------  ----------  ----------  ----------  ----------
 -7.3E+00    +3.5E+00    -1.0E+00    +4.2E-01    +4.2E+00
(+2.8E+00)  (+1.4E+00)  (+2.1E+00)  (+2.5E-01)  (+6.6E-01)
==========================================================

Gamma Estimates (Robust SEs Adjusted for 999 Clusters in Parentheses):
======================================================================
    1       log(hpwt)      air       log(mpg)   log(space)    trend
----------  ----------  ----------  ----------  ----------  ----------
 +2.8E+00    +9.0E-01    +4.2E-01    -5.2E-01    -2.6E-01    +2.7E-02
(+1.2E-01)  (+7.2E-02)  (+8.7E-02)  (+7.3E-02)  (+2.1E-01)  (+3.1E-03)
======================================================================

There are some discrepancies between our results and those in the original paper, but the results are broadly similar.

Download the Jupyter Notebook for this section: petrin.ipynb

Micro Moments Tutorial with Automobile Data

[1]:
from IPython.display import display, HTML
display(HTML("<style>pre { white-space: pre !important; }</style>"))

import pyblp
import numpy as np
import pandas as pd

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

In this tutorial, we’ll use data from Petrin (2002) to solve the paper’s automobile problem. This tutorial is similar to the first automobile tutorial, but exhibits how to incorporate micro moments into estimation.

Loading Data

We’ll use pandas to load two sets of data:

  1. product_data, which contains prices, shares, and other product characteristics.

  2. agent_data, which contains draws from the distribution of heterogeneity.

[2]:
product_data = pd.read_csv(pyblp.data.PETRIN_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids clustering_ids firm_ids region jp eu q households shares prices ... supply_instruments6 supply_instruments7 supply_instruments8 supply_instruments9 supply_instruments10 supply_instruments11 supply_instruments12 supply_instruments13 supply_instruments14 supply_instruments15
0 1981 1 7 EU 0 1 18.647 83527 0.000223 10.379538 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
1 1981 2 7 EU 0 1 17.611 83527 0.000211 13.140814 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
2 1981 2 7 EU 0 1 6.139 83527 0.000073 19.746975 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
3 1981 3 7 EU 0 1 2.553 83527 0.000031 13.085809 ... 9.0 0.0 0.0 144.0 -151.682461 108.724278 30.0 32.0 32.0 460.419731
4 1981 4 15 US 0 0 43.198 83527 0.000517 6.660066 ... 0.0 0.0 0.0 149.0 -157.647246 114.055507 30.0 35.0 42.0 467.806186

5 rows × 62 columns

The product_data contains market IDs, product IDs, firm IDs, shares, prices, a number of product characteristics, and instruments. The product IDs are called clustering_ids because they will be used to compute clustered standard errors. For more information about the instruments and the example data as a whole, refer to the data module.

The agent_data contains market IDs, integration weights \(w_{it}\), integration nodes \(\nu_{it}\), and demographics \(d_{it}\). Here we use \(I_t = 1000\) scrambled Halton draws in each market, along with demographics resampled from the Consumer Expenditure Survey (CEX) used by the original paper. These draws are slightly different from those used in the original paper (pseudo Monte Carlo draws and importance sampling). Note that following the original paper, the integration nodes are actually draws from a truncated \(\chi^2(3)\) distribution, rather than the more typical \(N(0, 1)\) draws that we have seen in prior tutorials.

[3]:
agent_data = pd.read_csv(pyblp.data.PETRIN_AGENTS_LOCATION)
agent_data.head()
[3]:
market_ids weights nodes0 nodes1 nodes2 nodes3 nodes4 nodes5 fv income low mid high fs age
0 1981 0.001 2.533314 7.496742 2.649343 3.892549 0.833761 1.928344 0.749785 10.346577 1 0 0 4 1
1 1981 0.001 4.422582 0.858539 1.646447 2.973352 0.033288 1.683242 5.232336 13.944210 0 1 0 2 1
2 1981 0.001 1.341509 5.041918 4.118932 2.166338 1.314582 0.360087 1.860212 5.898788 1 0 0 4 0
3 1981 0.001 3.324113 2.354892 0.802351 0.261043 3.911970 1.027856 6.980909 8.125445 1 0 0 2 0
4 1981 0.001 1.895857 1.807990 1.827797 4.080565 1.709768 0.707514 2.450663 34.397295 0 0 1 2 1

Setting up the Problem

The problem configuration is based on that of the first automobile problem. It is very similar, with both demand and supply sides, although with a few more product characteristics.

Again, we stack the three product formulations in order: \(X_1\), \(X_2\), and \(X_3\).

[4]:
product_formulations = (
    pyblp.Formulation('1 + hpwt + space + air + mpd + fwd + mi + sw + su + pv + pgnp + trend + trend2'),
    pyblp.Formulation('1 + I(-prices) + hpwt + space + air + mpd + fwd + mi + sw + su + pv'),
    pyblp.Formulation('1 + log(hpwt) + log(wt) + log(mpg) + air + fwd + trend * (jp + eu) + log(q)'),
)
product_formulations
[4]:
(1 + hpwt + space + air + mpd + fwd + mi + sw + su + pv + pgnp + trend + trend2,
 1 + I(-prices) + hpwt + space + air + mpd + fwd + mi + sw + su + pv,
 1 + log(hpwt) + log(wt) + log(mpg) + air + fwd + trend + jp + eu + trend:jp + trend:eu + log(q))

Again, we’ll use a first-order linear approximation to \(\log(y_i - p_j)\), in which \(y\) is income and \(p\) are prices. Unlike the previous automobile problem, however, we’ll allow its coefficient to vary for low-, mid-, and high-income consumers.

As in the original paper, we’ll also include \(\log(\textit{fs}_i) \times \textit{fv}_i\) where \(\textit{fs}_i\) is family size and \(\textit{fv}_i\) is another truncated \(\chi^2(3)\) draw. Finally, to help with constructing micro moments below, we’ll also include various additional demographics in the agent formulation.

[5]:
agent_formulation = pyblp.Formulation('1 + I(low / income) + I(mid / income) + I(high / income) + I(log(fs) * fv) + age + fs + mid + high')
agent_formulation
[5]:
1 + I(low / income) + I(mid / income) + I(high / income) + I(log(fs) * fv) + age + fs + mid + high

The Problem can again be constructed by combining the product_formulations, product_data, agent_formulation, and agent_data. We’ll again choose a log-linear specification for marginal costs \(c_{jt}\).

[6]:
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data, costs_type='log')
problem
[6]:
Dimensions:
========================================================
 T    N     F     I     K1    K2    K3    D    MD    MS
---  ----  ---  -----  ----  ----  ----  ---  ----  ----
13   2407  27   13000   13    11    12    9    35    28
========================================================

Formulations:
==============================================================================================================================================
       Column Indices:          0       1           2            3           4        5     6     7    8       9         10       11      12
-----------------------------  ---  ----------  ----------  -----------  ----------  ---  -----  ---  ----  --------  --------  ------  ------
 X1: Linear Characteristics     1      hpwt       space         air         mpd      fwd   mi    sw    su      pv       pgnp    trend   trend2
X2: Nonlinear Characteristics   1    -prices       hpwt        space        air      mpd   fwd   mi    sw      su        pv
X3: Log Cost Characteristics    1   log(hpwt)    log(wt)     log(mpg)       air      fwd  trend  jp    eu   jp*trend  eu*trend  log(q)
       d: Demographics          1   low/income  mid/income  high/income  fv*log(fs)  age   fs    mid  high
==============================================================================================================================================

The problem outputs a table of dimensions:

  • \(T\) denotes the number of markets.

  • \(N\) is the length of the dataset (the number of products across all markets).

  • \(F\) denotes the number of firms.

  • \(I = \sum_t I_t\) is the total number of agents across all markets (1000 draws per market times 13 markets).

  • \(K_1\) is the number of linear demand characteristics.

  • \(K_2\) is the number of nonlinear demand characteristics.

  • \(K_3\) is the number of linear supply characteristics.

  • \(D\) is the number of demographic variables.

  • \(M_D\) is the number of demand instruments, including exogenous regressors.

  • \(M_S\) is the number of supply instruments, including exogenous regressors.

The formulations table describes all four formulas for demand-side linear characteristics, demand-side nonlinear characteristics, supply-side characteristics, and demographics.

Setting up Micro Moments

Next, we will configure the micro moments that we will be adding to the problem. For background and notation involving micro moments, see Micro Moments.

Specifically, we will be adding a few more moments that match key statistics computed from the CEX survey of potential automobile consumers. For a tutorial on how to compute optimal micro moments that use all the information in a full micro dataset linking individual choices to demographics, see the post estimation tutorial.

To start, we will have to define a MicroDataset configuration that contains metadata about the micro dataset/survey. These metadata include a unique name for the dataset indexed by \(d\), the number of observations \(N_d\), a function that defines survey weights \(w_{dijt}\), and if relevant, a subset of markets from which the micro data was sampled.

[7]:
micro_dataset = pyblp.MicroDataset(
    name="CEX",
    observations=29125,
    compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
)
micro_dataset
[7]:
CEX: 29125 Observations in All Markets

We called the dataset “CEX”, defined the number of observations in it, and also defined a lambda function for computing survey weights in a market. The compute_weights function has three arguments: the current market’s ID \(t\), the \(J_t\) Products inside the market, and the \(I_t\) Agents inside the market. In this case, we are assuming that each product and agent/consumer type are sampled with equal probability, so we simply return a matrix of ones of shape \(I_t \times (1 + J_t)\). This sets each \(w_{dijt} = 1\).

By using \(1 + J_t\) instead of \(J_t\), we are specifying that the micro dataset contains observations of the outside option \(j = 0\). If we instead specified a matrix of shape \(I_t \times J_t\), this would be the same as setting the first column equal to all zeros, so that outside choices are not sampled from.
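
For illustration only (we do not use this here), a survey that recorded only inside-good purchases could be configured by dropping the outside column. This is a minimal sketch under that assumption:

# Hypothetical variant, not used in this tutorial: a micro dataset that does not
# sample the outside option. Returning an I_t x J_t matrix of weights is equivalent
# to an I_t x (1 + J_t) matrix whose first column is all zeros.
inside_only_dataset = pyblp.MicroDataset(
    name="CEX inside goods",  # hypothetical name for this illustrative variant
    observations=29125,
    compute_weights=lambda t, p, a: np.ones((a.size, p.size)),  # no outside column
)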

We will be matching a few different statistics that were computed from this survey. For convenience, they are packaged in a data file with pyblp.

[8]:
micro_statistics = pd.read_csv(pyblp.data.PETRIN_VALUES_LOCATION, index_col=0)
micro_statistics
[8]:
value
E[age | mi] 0.7830
E[fs | mi] 3.8600
E[age | sw] 0.7300
E[fs | sw] 3.1700
E[age | su] 0.7400
E[fs | su] 2.9700
E[age | pv] 0.6520
E[fs | pv] 3.4700
E[new | mid] 0.0794
E[new | high] 0.1581

We will match the average age and family size (“fs”) conditional on purchasing a minivan (“mi”), station wagon (“sw”), sport-utility (“su”), and full-size passenger van (“pv”). We will also match the probability that a consumer actually purchases a new vehicle, conditional on them being mid- and high-income.

Each of these statistics is a conditional expectation, which we can rewrite as a ratio of unconditional expectations over all consumers. Each of these unconditional expectations is called a MicroPart (used to form full micro moments), which we will now configure.
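
For example, the first statistic decomposes into the two unconditional expectations that we configure below:

\[E[\text{age}_i \mid \text{mi}_j = 1] = \frac{E[\text{age}_i \times \text{mi}_j]}{E[\text{mi}_j]},\]

where both the numerator and the denominator are averages over all consumers in the survey.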

Each micro part is an average/expectation in the sample/population over micro values \(v_{pijt}\). To match the statistics above, we will need averages/expectations of interactions between age/family size and dummies for purchasing the different automobile types. These will form the numerators in our conditional expectations.

[9]:
age_mi_part = pyblp.MicroPart(
    name="E[age_i * mi_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 7]]),
)
age_sw_part = pyblp.MicroPart(
    name="E[age_i * sw_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 8]]),
)
age_su_part = pyblp.MicroPart(
    name="E[age_i * su_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 9]]),
)
age_pv_part = pyblp.MicroPart(
    name="E[age_i * pv_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 5], np.r_[0, p.X2[:, 10]]),
)
fs_mi_part = pyblp.MicroPart(
    name="E[fs_i * mi_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 7]]),
)
fs_sw_part = pyblp.MicroPart(
    name="E[fs_i * sw_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 8]]),
)
fs_su_part = pyblp.MicroPart(
    name="E[fs_i * su_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 9]]),
)
fs_pv_part = pyblp.MicroPart(
    name="E[fs_i * pv_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 6], np.r_[0, p.X2[:, 10]]),
)

We will also need the denominators, which are simply the average probabilities of purchasing the different types of automobiles.

[10]:
mi_part = pyblp.MicroPart(
    name="E[mi_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 7]]),
)
sw_part = pyblp.MicroPart(
    name="E[sw_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 8]]),
)
su_part = pyblp.MicroPart(
    name="E[su_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 9]]),
)
pv_part = pyblp.MicroPart(
    name="E[pv_j]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 0], np.r_[0, p.X2[:, 10]]),
)

To form our probability that a consumer actually purchases a new vehicle, conditional on them being mid- and high-income, we will also need the following micro parts.

[11]:
inside_mid_part = pyblp.MicroPart(
    name="E[1{j > 0} * mid_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 7], np.r_[0, p.X2[:, 0]]),
)
inside_high_part = pyblp.MicroPart(
    name="E[1{j > 0} * high_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 8], np.r_[0, p.X2[:, 0]]),
)
mid_part = pyblp.MicroPart(
    name="E[mid_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 7], np.r_[1, p.X2[:, 0]]),
)
high_part = pyblp.MicroPart(
    name="E[high_i]",
    dataset=micro_dataset,
    compute_values=lambda t, p, a: np.outer(a.demographics[:, 8], np.r_[1, p.X2[:, 0]]),
)

Finally, we’ll put these micro parts together into MicroMoments. Each micro moment is configured to have a name, a value (one of the statistics above), and micro parts that go into it.

If our micro moments were simple unconditional expectations, we could just pass a single micro part to each micro moment and be done. However, since our micro moments are functions of multiple micro parts, we have to specify this function. We also have to specify its derivative for computing standard errors and analytic objective gradients.

[12]:
compute_ratio = lambda v: v[0] / v[1]
compute_ratio_gradient = lambda v: [1 / v[1], -v[0] / v[1]**2]

Given our functions that define a conditional expectation and its derivatives, we can form our micro moments.

[13]:
micro_moments = [
    pyblp.MicroMoment(
        name="E[age_i | mi_j]",
        value=0.783,
        parts=[age_mi_part, mi_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age_i | sw_j]",
        value=0.730,
        parts=[age_sw_part, sw_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age_i | su_j]",
        value=0.740,
        parts=[age_su_part, su_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[age_i | pv_j]",
        value=0.652,
        parts=[age_pv_part, pv_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | mi_j]",
        value=3.86,
        parts=[fs_mi_part, mi_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | sw_j]",
        value=3.17,
        parts=[fs_sw_part, sw_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | su_j]",
        value=2.97,
        parts=[fs_su_part, su_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[fs_i | pv_j]",
        value=3.47,
        parts=[fs_pv_part, pv_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[1{j > 0} | mid_i]",
        value=0.0794,
        parts=[inside_mid_part, mid_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
    pyblp.MicroMoment(
        name="E[1{j > 0} | high_i]",
        value=0.1581,
        parts=[inside_high_part, high_part],
        compute_value=compute_ratio,
        compute_gradient=compute_ratio_gradient,
    ),
]

Solving the Problem

As with the first automobile problem, we’ll just use the published estimates for \(\Sigma\) and \(\Pi\) as starting values.

[14]:
initial_sigma = np.diag([3.23, 0, 4.43, 0.46, 0.01, 2.58, 4.42, 0, 0, 0, 0])
initial_pi = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 7.52, 31.13, 34.49, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.57, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.28, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.31, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.42, 0, 0, 0, 0],
])

Finally, as in the original paper, we’ll use W_type and se_type to cluster by product IDs, which were specified as clustering_ids in product_data. We will use a simple BFGS optimization routine and slightly loosen the default tolerance of our inner SQUAREM iteration algorithm from 1e-14 to 1e-13 because the tighter tolerance tended to lead to convergence failures for this problem. We also pass our configured micro_moments when solving the problem.

[15]:
results = problem.solve(
    sigma=initial_sigma,
    pi=initial_pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-4}),
    iteration=pyblp.Iteration('squarem', {'atol': 1e-13}),
    se_type='clustered',
    W_type='clustered',
    micro_moments=micro_moments,
)
results
[15]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 2    +1.8E+02   +4.3E-05     +3.8E-01        +1.3E+03        0         +2.8E+11          +8.1E+07
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:34:10       Yes          72           87          10905        33592
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
================================================================================================================================================================================================================================================
Sigma:       1       -prices      hpwt       space        air         mpd         fwd         mi        sw        su        pv     |    Pi:       1      low/income  mid/income  high/income  fv*log(fs)    age        fs       mid       high
-------  ----------  --------  ----------  ----------  ----------  ----------  ----------  --------  --------  --------  --------  |  -------  --------  ----------  ----------  -----------  ----------  --------  --------  --------  --------
   1      +3.0E-02                                                                                                                 |     1     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
         (+5.3E-01)                                                                                                                |
                                                                                                                                   |
-prices   +0.0E+00   +0.0E+00                                                                                                      |  -prices  +0.0E+00   +3.9E+00    +1.2E+01    +2.4E+01     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                     (+3.6E-01)  (+1.0E+00)  (+2.4E+00)
                                                                                                                                   |
 hpwt     +0.0E+00   +0.0E+00   +1.2E-01                                                                                           |   hpwt    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                               (+8.1E-01)                                                                                          |
                                                                                                                                   |
 space    +0.0E+00   +0.0E+00   +0.0E+00    -9.2E-02                                                                               |   space   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                           (+6.1E-01)                                                                              |
                                                                                                                                   |
  air     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    -1.3E+00                                                                   |    air    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                       (+1.1E+00)                                                                  |
                                                                                                                                   |
  mpd     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    -1.6E-01                                                       |    mpd    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                   (+2.2E-01)                                                      |
                                                                                                                                   |
  fwd     +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +1.6E+00                                           |    fwd    +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                               (+3.7E-01)                                          |
                                                                                                                                   |
  mi      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00                                |    mi     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +4.2E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+5.2E-02)
                                                                                                                                   |
  sw      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00  +0.0E+00                      |    sw     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +1.7E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+4.2E-02)
                                                                                                                                   |
  su      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00            |    su     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +1.0E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+5.2E-02)
                                                                                                                                   |
  pv      +0.0E+00   +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00    +0.0E+00   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00  |    pv     +0.0E+00   +0.0E+00    +0.0E+00    +0.0E+00     +2.5E-01   +0.0E+00  +0.0E+00  +0.0E+00  +0.0E+00
                                                                                                                                   |                                                          (+8.1E-02)
================================================================================================================================================================================================================================================

Beta Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
==========================================================================================================================================================
    1          hpwt       space        air         mpd         fwd          mi          sw          su          pv         pgnp       trend       trend2
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
 -8.9E+00    +8.3E+00    +4.9E+00    +3.8E+00    -1.4E-01    -6.5E+00    -2.1E+00    -1.3E+00    -1.1E+00    -3.3E+00    +3.4E-02    +2.2E-01    -1.5E-02
(+1.4E+00)  (+2.4E+00)  (+1.6E+00)  (+1.2E+00)  (+3.2E-01)  (+1.8E+00)  (+4.8E-01)  (+2.0E-01)  (+2.8E-01)  (+5.2E-01)  (+1.2E-02)  (+9.2E-02)  (+6.4E-03)
==========================================================================================================================================================

Gamma Estimates (Robust SEs Adjusted for 898 Clusters in Parentheses):
==============================================================================================================================================
    1       log(hpwt)    log(wt)     log(mpg)      air         fwd        trend         jp          eu       jp*trend    eu*trend     log(q)
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
 +1.4E+00    +8.8E-01    +1.4E+00    +1.2E-01    +2.7E-01    +6.9E-02    -1.2E-02    +1.0E-01    +4.6E-01    +1.6E-03    -1.1E-02    -6.9E-02
(+1.4E-01)  (+4.9E-02)  (+8.0E-02)  (+6.0E-02)  (+2.4E-02)  (+1.8E-02)  (+2.6E-03)  (+2.5E-02)  (+4.3E-02)  (+2.9E-03)  (+4.2E-03)  (+6.7E-03)
==============================================================================================================================================

Estimated Micro Moments:
===========================================================================================================
Observed  Estimated  Difference         Moment                 Part          Dataset  Observations  Markets
--------  ---------  ----------  --------------------  --------------------  -------  ------------  -------
+7.8E-01  +7.5E-01    +2.9E-02     E[age_i | mi_j]       E[age_i * mi_j]       CEX       29125        All
                                                             E[mi_j]           CEX       29125        All
+7.3E-01  +6.8E-01    +4.7E-02     E[age_i | sw_j]       E[age_i * sw_j]       CEX       29125        All
                                                             E[sw_j]           CEX       29125        All
+7.4E-01  +6.8E-01    +5.9E-02     E[age_i | su_j]       E[age_i * su_j]       CEX       29125        All
                                                             E[su_j]           CEX       29125        All
+6.5E-01  +7.3E-01    -7.7E-02     E[age_i | pv_j]       E[age_i * pv_j]       CEX       29125        All
                                                             E[pv_j]           CEX       29125        All
+3.9E+00  +3.9E+00    -1.2E-02      E[fs_i | mi_j]        E[fs_i * mi_j]       CEX       29125        All
                                                             E[mi_j]           CEX       29125        All
+3.2E+00  +3.2E+00    -7.6E-03      E[fs_i | sw_j]        E[fs_i * sw_j]       CEX       29125        All
                                                             E[sw_j]           CEX       29125        All
+3.0E+00  +3.0E+00    -8.5E-03      E[fs_i | su_j]        E[fs_i * su_j]       CEX       29125        All
                                                             E[su_j]           CEX       29125        All
+3.5E+00  +3.5E+00    -1.7E-02      E[fs_i | pv_j]        E[fs_i * pv_j]       CEX       29125        All
                                                             E[pv_j]           CEX       29125        All
+7.9E-02  +8.0E-02    -4.5E-04   E[1{j > 0} | mid_i]   E[1{j > 0} * mid_i]     CEX       29125        All
                                                             E[mid_i]          CEX       29125        All
+1.6E-01  +1.6E-01    -2.1E-03   E[1{j > 0} | high_i]  E[1{j > 0} * high_i]    CEX       29125        All
                                                            E[high_i]          CEX       29125        All
===========================================================================================================

There are some discrepancies between these results and those in the original paper, but the estimates are broadly similar. Although the estimates of \(\beta\) look substantially off, this is primarily because the \(\chi^2(3)\) distributions are not mean-zero, so differences in estimates of \(\Sigma\) result in shifted estimates of \(\beta\) as well.

Running the Main Counterfactual

One result that is very similar is the paper’s headline number: a $367.29 million compensating variation from a counterfactual that removes the minivan in 1984. Using our estimates, we get a very similar number.

This subsection previews some of the routines used in the next tutorial on functions available after estimation. First, we will compute implied marginal costs in 1984.

[16]:
year = 1984
costs_1984 = results.compute_costs(market_id=year)

Next, we will set up a counterfactual simulation in which the minivan is removed.

[17]:
product_data_1984 = product_data[product_data['market_ids'] == year]
xi_1984 = results.xi[product_data['market_ids'] == year]
agent_data_1984 = agent_data[agent_data['market_ids'] == year]
simulation = pyblp.Simulation(
    product_formulations=product_formulations[:2],
    product_data=product_data_1984[product_data_1984['mi'] == 0],
    xi=xi_1984[product_data_1984['mi'] == 0],
    agent_formulation=problem.agent_formulation,
    agent_data=agent_data_1984,
    beta=results.beta,
    sigma=results.sigma,
    pi=results.pi,
)

We will then solve for equilibrium prices and shares under this counterfactual, using the above-computed marginal costs.

[18]:
simulation_results = simulation.replace_endogenous(costs=costs_1984[product_data_1984['mi'] == 0])

Finally, we will compute the change in consumer surplus.

[19]:
households = product_data_1984['households'].values[0]
cs = households * results.compute_consumer_surpluses(market_id=year)
counterfactual_cs = households * simulation_results.compute_consumer_surpluses()
cs - counterfactual_cs
[19]:
array([[425.90824856]])

We get an estimate that is in the same ballpark as $367.29 million. When bootstrapping this procedure (see the next tutorial for more on this), we get a standard error around $250 million.

Download the Jupyter Notebook for this section: post_estimation.ipynb

Post-Estimation Tutorial

[1]:
%matplotlib inline

import pyblp
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

This tutorial covers several features of pyblp that are available after estimation, including:

  1. Calculating elasticities and diversion ratios.

  2. Calculating marginal costs and markups.

  3. Computing the effects of mergers: prices, shares, and HHI.

  4. Using a parametric bootstrap to estimate standard errors.

  5. Estimating optimal instruments.

  6. Constructing optimal micro moments.

Problem Results

As in the fake cereal tutorial, we’ll first solve the fake cereal problem from Nevo (2000a). We load the fake data and estimate the model as in the previous tutorial, and we output the configured model to confirm that we have set up the Problem correctly.

[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)
product_formulations = (
   pyblp.Formulation('0 + prices', absorb='C(product_ids)'),
   pyblp.Formulation('1 + prices + sugar + mushy')
)
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')
problem = pyblp.Problem(product_formulations, product_data, agent_formulation, agent_data)
problem
[2]:
Dimensions:
=================================================
 T    N     F    I     K1    K2    D    MD    ED
---  ----  ---  ----  ----  ----  ---  ----  ----
94   2256   5   1880   1     4     4    20    1
=================================================

Formulations:
===================================================================
       Column Indices:           0           1           2      3
-----------------------------  ------  --------------  -----  -----
 X1: Linear Characteristics    prices
X2: Nonlinear Characteristics    1         prices      sugar  mushy
       d: Demographics         income  income_squared   age   child
===================================================================

We’ll solve the problem in the same way as before. The Problem.solve method returns a ProblemResults class, which displays basic estimation results. The results that are displayed are simply formatted information extracted from various class attributes such as ProblemResults.sigma and ProblemResults.sigma_se.

[3]:
initial_sigma = np.diag([0.3302, 2.4526, 0.0163, 0.2441])
initial_pi = [
  [ 5.4819,  0,      0.2037,  0     ],
  [15.8935, -1.2000, 0,       2.6342],
  [-0.2506,  0,      0.0511,  0     ],
  [ 1.2650,  0,     -0.8091,  0     ]
]
results = problem.solve(
    initial_sigma,
    initial_pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
    method='1s'
)
results
[3]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +4.6E+00   +6.9E-06     +3.3E-05        +1.6E+04        0         +6.9E+07          +8.4E+08
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:01:16       Yes          51           57          46389       143977
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +5.6E-01                                       |    1      +2.3E+00      +0.0E+00      +1.3E+00    +0.0E+00
        (+1.6E-01)                                      |          (+1.2E+00)                  (+6.3E-01)
                                                        |
prices   +0.0E+00    +3.3E+00                           |  prices   +5.9E+02      -3.0E+01      +0.0E+00    +1.1E+01
                    (+1.3E+00)                          |          (+2.7E+02)    (+1.4E+01)                (+4.1E+00)
                                                        |
sugar    +0.0E+00    +0.0E+00    -5.8E-03               |  sugar    -3.8E-01      +0.0E+00      +5.2E-02    +0.0E+00
                                (+1.4E-02)              |          (+1.2E-01)                  (+2.6E-02)
                                                        |
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +9.3E-02   |  mushy    +7.5E-01      +0.0E+00      -1.4E+00    +0.0E+00
                                            (+1.9E-01)  |          (+8.0E-01)                  (+6.7E-01)
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -6.3E+01
(+1.5E+01)
==========

Additional post-estimation outputs can be computed with ProblemResults methods.

Elasticities and Diversion Ratios

We can estimate elasticities, \(\varepsilon\), and diversion ratios, \(\mathscr{D}\), with ProblemResults.compute_elasticities and ProblemResults.compute_diversion_ratios.

As a reminder, elasticities in each market are

(1)\[\varepsilon_{jk} = \frac{x_k}{s_j}\frac{\partial s_j}{\partial x_k}.\]

Diversion ratios are

(2)\[\mathscr{D}_{jk} = -\frac{\partial s_k}{\partial x_j} \Big/ \frac{\partial s_j}{\partial x_j}.\]

Following Conlon and Mortimer (2018), we report the diversion to the outside good \(\mathscr{D}_{j0}\) on the diagonal instead of \(\mathscr{D}_{jj}=-1\).
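
In terms of (2), the diagonal entries reported here are

\[\mathscr{D}_{j0} = -\frac{\partial s_0}{\partial x_j} \Big/ \frac{\partial s_j}{\partial x_j},\]

where \(\partial s_0 / \partial x_j = -\sum_k \partial s_k / \partial x_j\) because the inside shares and the outside share sum to one in each market.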

[4]:
elasticities = results.compute_elasticities()
diversions = results.compute_diversion_ratios()

Post-estimation outputs are computed for each market and stacked. We’ll use matplotlib functions to display the matrices associated with a single market.

[5]:
single_market = product_data['market_ids'] == 'C01Q1'
plt.colorbar(plt.matshow(elasticities[single_market]));
[Figure: heatmap of the elasticity matrix for market C01Q1]
[6]:
plt.colorbar(plt.matshow(diversions[single_market]));
[Figure: heatmap of the diversion ratio matrix for market C01Q1]

The diagonal of the first image consists of own elasticities and the diagonal of the second image consists of diversion ratios to the outside good. As one might expect, own price elasticities are large and negative while cross-price elasticities are positive but much smaller.

Elasticities and diversion ratios can be computed with respect to variables other than prices with the name argument of ProblemResults.compute_elasticities and ProblemResults.compute_diversion_ratios. Additionally, ProblemResults.compute_long_run_diversion_ratios can be used to understand substitution when products are eliminated from the choice set.

The convenience methods ProblemResults.extract_diagonals and ProblemResults.extract_diagonal_means can be used to extract information about own elasticities of demand from elasticity matrices.
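
As a minimal sketch of these variants (assuming we want elasticities with respect to the sugar characteristic), the calls would look something like the following.

# Not used below: elasticities with respect to a non-price characteristic,
# long-run diversion ratios, and own elasticities extracted market by market.
sugar_elasticities = results.compute_elasticities(name='sugar')
long_run_diversions = results.compute_long_run_diversion_ratios()
own_elasticities = results.extract_diagonals(elasticities)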

[7]:
means = results.extract_diagonal_means(elasticities)

An alternative to summarizing full elasticity matrices is to use ProblemResults.compute_aggregate_elasticities to estimate aggregate elasticities of demand, \(E\), in each market, which reflect the change in total sales under a proportional sales tax of some factor.

[8]:
aggregates = results.compute_aggregate_elasticities(factor=0.1)

Since demand for an entire product category is generally less elastic than demand for any individual product, mean own elasticities are generally larger in magnitude than aggregate elasticities.

[9]:
plt.hist(
    [means.flatten(), aggregates.flatten()],
    color=['red', 'blue'],
    bins=50
);
plt.legend(['Mean Own Elasticities', 'Aggregate Elasticities']);
[Figure: histograms of mean own elasticities and aggregate elasticities]

Marginal Costs and Markups

To compute marginal costs, \(c\), the product_data passed to Problem must have had a firm_ids field. Since we included firm IDs when configuring the problem, we can use ProblemResults.compute_costs.

[10]:
costs = results.compute_costs()
plt.hist(costs, bins=50);
plt.legend(["Marginal Costs"]);
[Figure: histogram of marginal costs]

Other methods that compute supply-side outputs often compute marginal costs themselves. For example, ProblemResults.compute_markups will compute marginal costs when estimating markups, \(\mathscr{M}\), but computation can be sped up if we just use our pre-computed values.

[11]:
markups = results.compute_markups(costs=costs)
plt.hist(markups, bins=50);
plt.legend(["Markups"]);
[Figure: histogram of markups]

Mergers

Before computing post-merger outputs, we’ll supplement our pre-merger markups with some other outputs. We’ll compute Herfindahl-Hirschman Indices, \(\text{HHI}\), with ProblemResults.compute_hhi; population-normalized gross expected profits, \(\pi\), with ProblemResults.compute_profits; and population-normalized consumer surpluses, \(\text{CS}\), with ProblemResults.compute_consumer_surpluses.

[12]:
hhi = results.compute_hhi()
profits = results.compute_profits(costs=costs)
cs = results.compute_consumer_surpluses()

To compute post-merger outputs, we’ll create a new set of firm IDs that represent a merger of firms 2 and 1.

[13]:
product_data['merger_ids'] = product_data['firm_ids'].replace(2, 1)

We can use ProblemResults.compute_approximate_prices or ProblemResults.compute_prices to estimate post-merger prices. The first method, which is in the spirit of early approaches to merger evaluation such as Hausman, Leonard, and Zona (1994) and Werden (1997), is only a partial merger simulation in that it assumes shares and their price derivatives are unaffected by the merger.

The second method, which is used by Nevo (2000b), is a full merger simulation in that it does not make these assumptions, and is the preferred approach to merger simulation. By default, we iterate over the \(\zeta\)-markup equation from Morrow and Skerlos (2011) to solve the full system of \(J_t\) equations and \(J_t\) unknowns in each market \(t\). We’ll use the latter, since it is fast enough for this example problem.
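
For reference, a minimal sketch of the partial approach, assuming ProblemResults.compute_approximate_prices accepts the same firm_ids and costs arguments, would be:

# Not used below: partial merger simulation that holds shares and their price
# derivatives fixed at their pre-merger values.
approximated_prices = results.compute_approximate_prices(
    firm_ids=product_data['merger_ids'],
    costs=costs
)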

[14]:
changed_prices = results.compute_prices(
    firm_ids=product_data['merger_ids'],
    costs=costs
)

We’ll compute post-merger shares with ProblemResults.compute_shares.

[15]:
changed_shares = results.compute_shares(changed_prices)

Post-merger prices and shares are used to compute other post-merger outputs. For example, \(\text{HHI}\) increases.

[16]:
changed_hhi = results.compute_hhi(
    firm_ids=product_data['merger_ids'],
    shares=changed_shares
)
plt.hist(changed_hhi - hhi, bins=50);
plt.legend(["HHI Changes"]);
[Figure: histogram of HHI changes]

Markups, \(\mathscr{M}\), and profits, \(\pi\), generally increase as well.

[17]:
changed_markups = results.compute_markups(changed_prices, costs)
plt.hist(changed_markups - markups, bins=50);
plt.legend(["Markup Changes"]);
[Figure: histogram of markup changes]
[18]:
changed_profits = results.compute_profits(changed_prices, changed_shares, costs)
plt.hist(changed_profits - profits, bins=50);
plt.legend(["Profit Changes"]);
[Figure: histogram of profit changes]

On the other hand, consumer surpluses, \(\text{CS}\), generally decrease.

[19]:
changed_cs = results.compute_consumer_surpluses(changed_prices)
plt.hist(changed_cs - cs, bins=50);
plt.legend(["Consumer Surplus Changes"]);
[Figure: histogram of consumer surplus changes]

Bootstrapping Results

Post-estimation outputs can be informative, but they don’t mean much without a sense of sample-to-sample variability. One way to estimate confidence intervals for post-estimation outputs is with a standard bootstrap procedure:

  1. Construct a large number of bootstrap samples by sampling with replacement from the original product data.

  2. Initialize and solve a Problem for each bootstrap sample.

  3. Compute the desired post-estimation output for each bootstrapped ProblemResults and, from the resulting empirical distribution, construct bootstrap confidence intervals.

Although appealing because of its simplicity, the computational resources required for this procedure are often prohibitive. Furthermore, human oversight of the optimization routine is often required to determine whether the routine ran into any problems and whether it successfully converged. Human oversight of estimation for each bootstrapped problem is usually not feasible.

A more reasonable alternative is a parametric bootstrap procedure:

  1. Construct a large number of draws from the estimated joint distribution of parameters.

  2. Compute the implied mean utility, \(\delta\), and shares, \(s\), for each draw. If a supply side was estimated, also compute the implied marginal costs, \(c\), and prices, \(p\).

  3. Compute the desired post-estimation output under each of these parametric bootstrap samples. Again, from the resulting empirical distribution, construct bootstrap confidence intervals.

Compared to the standard bootstrap procedure, the parametric bootstrap requires far fewer computational resources, and is simple enough to not require human oversight of each bootstrap iteration. The primary complication to this procedure is that when supply is estimated, equilibrium prices and shares need to be computed for each parametric bootstrap sample by iterating over the \(\zeta\)-markup equation from Morrow and Skerlos (2011). Although nontrivial, this fixed point iteration problem is much less demanding than the full optimization routine required to solve the BLP problem from the start.

An empirical distribution of results computed according to this parametric bootstrap procedure can be created with the ProblemResults.bootstrap method, which returns a BootstrappedResults class that can be used just like ProblemResults to compute various post-estimation outputs. The difference is that BootstrappedResults methods return arrays with an extra first dimension, along which bootstrapped results are stacked.

We’ll construct 90% parametric bootstrap confidence intervals for estimated mean own elasticities in each market of the fake cereal problem. Usually, bootstrapped confidence intervals should be based on thousands of draws, but we’ll only use a few for the sake of speed in this example.

[20]:
bootstrapped_results = results.bootstrap(draws=100, seed=0)
bootstrapped_results
[20]:
Bootstrapped Results Summary:
======================
Computation  Bootstrap
   Time        Draws
-----------  ---------
 00:00:09       100
======================
[21]:
bounds = np.percentile(
    bootstrapped_results.extract_diagonal_means(
        bootstrapped_results.compute_elasticities()
    ),
    q=[10, 90],
    axis=0
)
table = pd.DataFrame(index=problem.unique_market_ids, data={
    'Lower Bound': bounds[0].flatten(),
    'Mean Own Elasticity': means.flatten(),
    'Upper Bound': bounds[1].flatten()
})
table.round(2).head()
[21]:
Lower Bound Mean Own Elasticity Upper Bound
C01Q1 -4.31 -4.21 -3.88
C01Q2 -4.07 -3.96 -3.68
C03Q1 -3.71 -3.40 -3.20
C03Q2 -3.65 -3.34 -3.16
C04Q1 -3.31 -3.15 -2.97

Optimal Instruments

Given a consistent estimate of \(\theta\), we may want to compute the optimal instruments of Chamberlain (1987) and use them to re-solve the problem. Optimal instruments have been shown, for example, by Reynaert and Verboven (2014), to reduce bias, improve efficiency, and enhance stability of BLP estimates.

The ProblemResults.compute_optimal_instruments method computes the expected Jacobians that comprise the optimal instruments by integrating over the density of \(\xi\) (and \(\omega\) if a supply side was estimated). By default, the method approximates this integral by averaging over the Jacobian realizations computed under draws from the asymptotic normal distribution of the error terms. Since this process is computationally expensive and often doesn’t make much of a difference, we’ll use method='approximate' in this example to simply evaluate the Jacobians at the expected value of \(\xi\), zero.

[22]:
instrument_results = results.compute_optimal_instruments(method='approximate')
instrument_results
[22]:
Optimal Instrument Results Summary:
=======================
Computation  Error Term
   Time        Draws
-----------  ----------
 00:00:00        1
=======================

We can use the OptimalInstrumentResults.to_problem method to re-create the fake cereal problem with the estimated optimal excluded instruments.

[23]:
updated_problem = instrument_results.to_problem()
updated_problem
[23]:
Dimensions:
=================================================
 T    N     F    I     K1    K2    D    MD    ED
---  ----  ---  ----  ----  ----  ---  ----  ----
94   2256   5   1880   1     4     4    14    1
=================================================

Formulations:
===================================================================
       Column Indices:           0           1           2      3
-----------------------------  ------  --------------  -----  -----
 X1: Linear Characteristics    prices
X2: Nonlinear Characteristics    1         prices      sugar  mushy
       d: Demographics         income  income_squared   age   child
===================================================================

We can solve this updated problem just like the original one. We’ll start at our consistent estimate of \(\theta\).

[24]:
updated_results = updated_problem.solve(
    results.sigma,
    results.pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
    method='1s'
)
updated_results
[24]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +8.0E-14   +3.0E-06     +1.6E-04        +2.9E+04        0         +7.8E+07          +1.8E+08
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:01:05       Yes          42           50          45902       142164
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +2.1E-01                                       |    1      +6.0E+00      +0.0E+00      +1.6E-01    +0.0E+00
        (+7.8E-02)                                      |          (+5.2E-01)                  (+2.0E-01)
                                                        |
prices   +0.0E+00    +3.0E+00                           |  prices   +9.8E+01      -5.6E+00      +0.0E+00    +4.1E+00
                    (+6.5E-01)                          |          (+8.6E+01)    (+4.5E+00)                (+2.2E+00)
                                                        |
sugar    +0.0E+00    +0.0E+00    +2.7E-02               |  sugar    -3.1E-01      +0.0E+00      +4.9E-02    +0.0E+00
                                (+7.2E-03)              |          (+3.5E-02)                  (+1.3E-02)
                                                        |
mushy    +0.0E+00    +0.0E+00    +0.0E+00    +3.0E-01   |  mushy    +9.7E-01      +0.0E+00      -5.4E-01    +0.0E+00
                                            (+1.0E-01)  |          (+2.9E-01)                  (+1.8E-01)
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -3.1E+01
(+4.5E+00)
==========

Optimal Micro Moments

Similarly, if we have micro data that links individual choices to demographics, we can use all the information in this data by constructing optimal micro moments that match the score of the micro data, evaluated at our consistent estimate of \(\theta\). See the micro moments tutorial for an introduction to constructing non-optimal micro moments.

We don’t have actual micro data for this empirical example, but we can simulate some just to demonstrate how to construct optimal micro moments. We’ll use the ProblemResults.simulate_micro_data method to simulate 1,000 observations from a micro dataset at the estimated \(\theta\). Like most micro datasets, we’ll define compute_weights such that we only have observations from consumers who purchase an inside good \(j \neq 0\). For simplicity, we’ll assume we only have micro data from a single market, 'C61Q1'. Again, see the micro moments tutorial for a more in-depth discussion of MicroDataset and micro moments.

[25]:
micro_dataset = pyblp.MicroDataset(
    name="Simulated micro data",
    observations=1_000,
    compute_weights=lambda t, p, a: np.ones((a.size, 1 + p.size)),
    market_ids=['C61Q1'],
)
micro_data = results.simulate_micro_data(
    dataset=micro_dataset,
    seed=0,
)

The simulated micro data are a record array, which can be difficult to visualize. We’ll convert it to a pandas dataframe, which is what we would usually load from an actual micro dataset file.

[26]:
micro_data = pd.DataFrame(pyblp.data_to_dict(micro_data))
micro_data
[26]:
micro_ids market_ids agent_indices choice_indices
0 0 C61Q1 10 24
1 1 C61Q1 14 0
2 2 C61Q1 12 0
3 3 C61Q1 10 21
4 4 C61Q1 8 10
... ... ... ... ...
995 995 C61Q1 1 23
996 996 C61Q1 10 4
997 997 C61Q1 18 0
998 998 C61Q1 4 0
999 999 C61Q1 13 2

1000 rows × 4 columns

The simulated micro data contain four columns:

  • micro_ids: This is simply an index from \(0\) to \(999\), indexing each micro observation.

  • market_ids: This is the market of each observation. We configured our MicroDataset to only sample from one market.

  • agent_indices: This is the within-market index (from \(0\) to \(I_t - 1\) in market \(t\)) of the agent data row that was sampled with probability \(w_{it}\), configured by the weights column in agent data.

  • choice_indices: This is the within-market index (from \(0\) to \(J_t - 1\) in market \(t\)) of the product data row that was sampled with probability \(s_{ijt}\), evaluated at the consistent estimate of \(\theta\).

The simulated data contain a bit more information than we would usually have in actual micro data. The agent_indices contain information not only about observed demographics of the micro observation, but also about unobserved preferences. We will merge in only observed demographics for each agent index to get a more realistic simulated micro dataset.

[27]:
agent_data['agent_indices'] = agent_data.groupby('market_ids').cumcount()
micro_data = micro_data.merge(
    agent_data[['market_ids', 'agent_indices', 'income', 'income_squared', 'age', 'child']],
    on=['market_ids', 'agent_indices']
)
del micro_data['agent_indices']
micro_data
[27]:
micro_ids market_ids choice_indices income income_squared age child
0 0 C61Q1 24 1.212283 22.546328 0.624306 -0.230851
1 3 C61Q1 21 1.212283 22.546328 0.624306 -0.230851
2 11 C61Q1 11 1.212283 22.546328 0.624306 -0.230851
3 28 C61Q1 10 1.212283 22.546328 0.624306 -0.230851
4 86 C61Q1 10 1.212283 22.546328 0.624306 -0.230851
... ... ... ... ... ... ... ...
995 937 C61Q1 0 0.636478 11.051747 -0.090347 -0.230851
996 952 C61Q1 5 0.636478 11.051747 -0.090347 -0.230851
997 954 C61Q1 14 0.636478 11.051747 -0.090347 -0.230851
998 971 C61Q1 13 0.636478 11.051747 -0.090347 -0.230851
999 975 C61Q1 22 0.636478 11.051747 -0.090347 -0.230851

1000 rows × 7 columns

This is more like real micro data that we would actually load from a file. For each observation, we know the market, choice, and demographics of the consumer.

To compute optimal micro moments at our consistent estimate of \(\theta\), we need to compute two types of scores:

  1. The score for each observation in our micro data via ProblemResults.compute_micro_scores.

  2. The score for each possible agent-choice in the model via ProblemResults.compute_agent_scores.

For each, we will integrate over unobserved heterogeneity with quadrature, but one could use Monte Carlo methods too. We will do so by just passing an Integration configuration to these two methods, but we could also do so by duplicating each row of micro data by as many integration nodes/weights as we want for each observation, adding weights and nodes columns as in agent data.

[28]:
score_integration = pyblp.Integration('product', 5)
micro_scores = results.compute_micro_scores(micro_dataset, micro_data, integration=score_integration)
agent_scores = results.compute_agent_scores(micro_dataset, integration=score_integration)

Both micro_scores and agent_scores are lists, with one element for each nonlinear parameter in \(\theta\). The ordering is

[29]:
results.theta_labels
[29]:
['1 x 1',
 'prices x prices',
 'sugar x sugar',
 'mushy x mushy',
 '1 x income',
 '1 x age',
 'prices x income',
 'prices x income_squared',
 'prices x child',
 'sugar x income',
 'sugar x age',
 'mushy x income',
 'mushy x age']

The first element of micro_scores corresponds to the \(\sigma\) on the constant term (i.e., the '1 x 1' above). It has the estimated score for each observation in micro_data:

[30]:
micro_scores[0].shape
[30]:
(1000,)

The first element of agent_scores also corresponds to the \(\sigma\) on the constant term. It is a mapping from market IDs to arrays of scores for each of the \(I_t \times J_t\) possible agent-choices in that market.

[31]:
agent_scores[0]['C61Q1'].shape
[31]:
(20, 25)

We will construct one optimal MicroMoment for each parameter, matching the average score (via a MicroPart) for that parameter from the micro data with its model counterpart.

[32]:
optimal_micro_moments = []
for m, (micro_scores_m, agent_scores_m) in enumerate(zip(micro_scores, agent_scores)):
    optimal_micro_moments.append(pyblp.MicroMoment(
        name=f"Score for parameter #{m}",
        value=micro_scores_m.mean(),
        parts=pyblp.MicroPart(
            name=f"Score for parameter #{m}",
            dataset=micro_dataset,
            compute_values=lambda t, p, a, v=agent_scores_m: v[t],
        ),
    ))

For example, some information about the optimal micro moment for the first parameter is as follows.

[33]:
optimal_micro_moments[0]
[33]:
Score for parameter #0: -2.8E-02 (Score for parameter #0 on Simulated micro data: 1000 Observations in Market 'C61Q1')

Now, we can use our problem with updated optimal IVs, including our optimal micro moments, to obtain an efficient estimator.

[34]:
updated_results = updated_problem.solve(
    results.sigma,
    results.pi,
    optimization=pyblp.Optimization('bfgs', {'gtol': 1e-5}),
    method='1s',
    micro_moments=optimal_micro_moments,
)
updated_results
[34]:
Problem Results Summary:
=======================================================================================================
GMM   Objective  Gradient      Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step    Value      Norm    Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number
----  ---------  --------  --------------  --------------  -------  ----------------  -----------------
 1    +1.1E+02   +8.8E-06     +2.3E-03        +8.3E+04        0         +2.2E+08          +4.1E+07
=======================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:01:26       Yes          34           55          42977       133908
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma:      1         prices      sugar       mushy     |   Pi:      income    income_squared     age        child
------  ----------  ----------  ----------  ----------  |  ------  ----------  --------------  ----------  ----------
  1      +4.6E-01                                       |    1      +4.8E+00      +0.0E+00      +6.5E-01    +0.0E+00
        (+6.7E-02)                                      |          (+3.5E-01)                  (+1.5E-01)
                                                        |
prices   +0.0E+00    +3.1E+00                           |  prices   +4.7E+02      -2.5E+01      +0.0E+00    +3.2E+00
                    (+5.3E-01)                          |          (+3.1E+01)    (+1.6E+00)                (+1.7E+00)
                                                        |
sugar    +0.0E+00    +0.0E+00    +2.0E-02               |  sugar    -4.0E-01      +0.0E+00      +4.6E-02    +0.0E+00
                                (+6.1E-03)              |          (+2.4E-02)                  (+9.2E-03)
                                                        |
mushy    +0.0E+00    +0.0E+00    +0.0E+00    -6.6E-02   |  mushy    +1.1E+00      +0.0E+00      -1.1E+00    +0.0E+00
                                            (+8.3E-02)  |          (+1.9E-01)                  (+1.1E-01)
=====================================================================================================================

Beta Estimates (Robust SEs in Parentheses):
==========
  prices
----------
 -5.2E+01
(+2.2E+00)
==========

Estimated Micro Moments:
==============================================================================================================================
Observed  Estimated  Difference          Moment                    Part                  Dataset         Observations  Markets
--------  ---------  ----------  -----------------------  -----------------------  --------------------  ------------  -------
-2.8E-02  -2.3E-02    -5.5E-03   Score for parameter #0   Score for parameter #0   Simulated micro data      1000         1
+6.4E-04  +9.8E-04    -3.4E-04   Score for parameter #1   Score for parameter #1   Simulated micro data      1000         1
+1.1E-01  +1.5E-01    -3.4E-02   Score for parameter #2   Score for parameter #2   Simulated micro data      1000         1
-2.9E-04  -3.6E-04    +7.0E-05   Score for parameter #3   Score for parameter #3   Simulated micro data      1000         1
-2.0E-03  +2.1E-02    -2.3E-02   Score for parameter #4   Score for parameter #4   Simulated micro data      1000         1
-3.6E-02  -3.0E-02    -6.3E-03   Score for parameter #5   Score for parameter #5   Simulated micro data      1000         1
-2.5E-04  +2.1E-03    -2.4E-03   Score for parameter #6   Score for parameter #6   Simulated micro data      1000         1
-6.8E-03  +4.0E-02    -4.7E-02   Score for parameter #7   Score for parameter #7   Simulated micro data      1000         1
+1.7E-03  +1.0E-03    +6.8E-04   Score for parameter #8   Score for parameter #8   Simulated micro data      1000         1
+5.5E-02  +1.8E-01    -1.3E-01   Score for parameter #9   Score for parameter #9   Simulated micro data      1000         1
-4.8E-01  -4.7E-01    -1.0E-02   Score for parameter #10  Score for parameter #10  Simulated micro data      1000         1
-2.9E-03  +8.1E-03    -1.1E-02   Score for parameter #11  Score for parameter #11  Simulated micro data      1000         1
-2.6E-02  -5.5E-03    -2.0E-02   Score for parameter #12  Score for parameter #12  Simulated micro data      1000         1
==============================================================================================================================

Results are fairly similar to before because we simulated the micro data from our first-stage estimate of \(\theta\), which was somewhat close to our second-stage estimate. Scores are not matched perfectly because the model is now over-identified, with two times as many moments as there are parameters (one optimal IV and one optimal micro moment for each nonlinear parameter).

Download the Jupyter Notebook for this section: simulation.ipynb

Problem Simulation Tutorial

[1]:
import pyblp
import numpy as np
import pandas as pd

pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'1.1.0'

Before configuring and solving a problem with real data, it may be a good idea to perform Monte Carlo analysis on simulated data to verify that it is possible to accurately estimate model parameters. For example, before configuring and solving the example problems in the prior tutorials, it may have been a good idea to simulate data according to the assumed models of supply and demand. During such Monte Carlo analysis, the data would only be used to determine sample sizes and perhaps to choose reasonable true parameters.

Simulations are configured with the Simulation class, which requires many of the same inputs as Problem. The two main differences are:

  1. Variables in formulations that cannot be loaded from product_data or agent_data will be drawn from independent uniform distributions.

  2. True parameters and the distribution of unobserved product characteristics are specified.

First, we’ll use build_id_data to build market and firm IDs for a model in which there are \(T = 50\) markets, and in each market \(t\), a total of \(J_t = 20\) products produced by \(F = 10\) firms.

[2]:
id_data = pyblp.build_id_data(T=50, J=20, F=10)

Next, we’ll create an Integration configuration to build agent data according to a Gauss-Hermite product rule that exactly integrates polynomials of degree \(2 \times 9 - 1 = 17\) or less.

[3]:
integration = pyblp.Integration('product', 9)
integration
[3]:
Configured to construct nodes and weights according to the level-9 Gauss-Hermite product rule with options {}.

We’ll then pass these data to Simulation. We’ll use Formulation configurations to create an \(X_1\) that consists of a constant, prices, and an exogenous characteristic; an \(X_2\) that consists only of the same exogenous characteristic; and an \(X_3\) that consists of the common exogenous characteristic and a cost-shifter.

[4]:
simulation = pyblp.Simulation(
    product_formulations=(
        pyblp.Formulation('1 + prices + x'),
        pyblp.Formulation('0 + x'),
        pyblp.Formulation('0 + x + z')
    ),
    beta=[1, -2, 2],
    sigma=1,
    gamma=[1, 4],
    product_data=id_data,
    integration=integration,
    seed=0
)
simulation
[4]:
Dimensions:
=====================================
 T    N     F    I    K1    K2    K3
---  ----  ---  ---  ----  ----  ----
50   1000  10   450   3     1     2
=====================================

Formulations:
=================================================
        Column Indices:           0     1      2
-------------------------------  ---  ------  ---
  X1: Linear Characteristics      1   prices   x
 X2: Nonlinear Characteristics    x
X3: Linear Cost Characteristics   x     z
=================================================

Nonlinear Coefficient True Values:
================
Sigma:     x
------  --------
  x     +1.0E+00
================

Beta True Values:
============================
   1       prices      x
--------  --------  --------
+1.0E+00  -2.0E+00  +2.0E+00
============================

Gamma True Values:
==================
   x         z
--------  --------
+1.0E+00  +4.0E+00
==================

When Simulation is initialized, it constructs Simulation.agent_data and simulates Simulation.product_data.

The Simulation can be further configured with other arguments that determine how unobserved product characteristics are simulated and how marginal costs are specified.

At this stage, simulated variables are not consistent with the true parameters, so we still need to solve the simulation with Simulation.replace_endogenous. This method replaces simulated prices and market shares with values that are consistent with the true parameters. Like ProblemResults.compute_prices, it does so by iterating over the \(\zeta\)-markup equation from Morrow and Skerlos (2011).

[5]:
simulation_results = simulation.replace_endogenous()
simulation_results
[5]:
Simulation Results Summary:
======================================================================================================
Computation  Fixed Point  Fixed Point  Contraction  Profit Gradients  Profit Hessians  Profit Hessians
   Time       Failures    Iterations   Evaluations      Max Norm      Min Eigenvalue   Max Eigenvalue
-----------  -----------  -----------  -----------  ----------------  ---------------  ---------------
 00:00:00         0           721          721          +1.3E-13         -8.4E-01         -9.6E-06
======================================================================================================

Now, we can try to recover the true parameters by creating and solving a Problem.

The convenience method SimulationResults.to_problem constructs some basic “sums of characteristics” BLP instruments that are functions of all exogenous numerical variables in the problem. In this example, excluded demand-side instruments are the cost-shifter z and traditional BLP instruments constructed from x. Excluded supply-side instruments are traditional BLP instruments constructed from x and z.

[6]:
problem = simulation_results.to_problem()
problem
[6]:
Dimensions:
=================================================
 T    N     F    I    K1    K2    K3    MD    MS
---  ----  ---  ---  ----  ----  ----  ----  ----
50   1000  10   450   3     1     2     5     6
=================================================

Formulations:
=================================================
        Column Indices:           0     1      2
-------------------------------  ---  ------  ---
  X1: Linear Characteristics      1   prices   x
 X2: Nonlinear Characteristics    x
X3: Linear Cost Characteristics   x     z
=================================================

We’ll choose starting values that are half the true parameters so that the optimization routine has to do some work. Note that since we’re jointly estimating the supply side, we need to provide an initial value for the linear coefficient on prices because this parameter cannot be concentrated out of the problem (unlike linear coefficients on exogenous characteristics).

[8]:
results = problem.solve(
    sigma=0.5 * simulation.sigma,
    pi=0.5 * simulation.pi,
    beta=[None, 0.5 * simulation.beta[1, 0], None],
    optimization=pyblp.Optimization('l-bfgs-b', {'gtol': 1e-5})
)
results
[8]:
Problem Results Summary:
==============================================================================================================
GMM   Objective    Projected    Reduced Hessian  Reduced Hessian  Clipped  Weighting Matrix  Covariance Matrix
Step    Value    Gradient Norm  Min Eigenvalue   Max Eigenvalue   Shares   Condition Number  Condition Number
----  ---------  -------------  ---------------  ---------------  -------  ----------------  -----------------
 2    +6.4E+00     +6.9E-08        +7.2E+00         +3.8E+03         0         +3.7E+04          +1.5E+04
==============================================================================================================

Cumulative Statistics:
===========================================================================
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:04       Yes          23           30          8289         26305
===========================================================================

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
==================
Sigma:      x
------  ----------
  x      +7.8E-01
        (+5.2E-01)
==================

Beta Estimates (Robust SEs in Parentheses):
==================================
    1         prices        x
----------  ----------  ----------
 +9.6E-01    -2.0E+00    +2.1E+00
(+9.3E-02)  (+2.4E-02)  (+1.4E-01)
==================================

Gamma Estimates (Robust SEs in Parentheses):
======================
    x           z
----------  ----------
 +9.8E-01    +4.0E+00
(+8.7E-02)  (+8.5E-02)
======================

The parameters seem to have been estimated reasonably well.

[9]:
np.c_[simulation.beta, results.beta]
[9]:
array([[ 1.        ,  0.96223514],
       [-2.        , -2.00792431],
       [ 2.        ,  2.10032015]])
[10]:
np.c_[simulation.gamma, results.gamma]
[10]:
array([[1.        , 0.97820624],
       [4.        , 4.03121577]])
[11]:
np.c_[simulation.sigma, results.sigma]
[11]:
array([[1.        , 0.78358853]])

In addition to checking that the configuration for a model based on actual data makes sense, the Simulation class can also be a helpful tool for better understanding under what general conditions BLP models can be accurately estimated. Simulations are also used extensively in PyBLP’s test suite.

API Documentation

The majority of the package consists of classes, which compartmentalize different aspects of the BLP model. There are some convenience functions as well.

Configuration Classes

Various components of the package require configurations for how to approximate integrals, solve fixed point problems, and solve optimization problems. Such configurations are specified with the following classes.

Formulation(formula[, absorb, …])

Configuration for designing matrices and absorbing fixed effects.

Integration(specification, size[, …])

Configuration for building integration nodes and weights.

Iteration(method[, method_options, …])

Configuration for solving fixed point problems.

Optimization(method[, method_options, …])

Configuration for solving optimization problems.
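
For example, a minimal sketch of how a few such configurations might be constructed (the formulas, the product_ids field name, and the tolerances here are illustrative):

import pyblp

# design matrices: a simple linear specification, and one that absorbs product fixed effects
linear_formulation = pyblp.Formulation('1 + prices + x')
absorbed_formulation = pyblp.Formulation('0 + prices + x', absorb='C(product_ids)')

# integration, fixed point, and optimization configurations
integration = pyblp.Integration('product', 9)
iteration = pyblp.Iteration('squarem', {'atol': 1e-14})
optimization = pyblp.Optimization('l-bfgs-b', {'gtol': 1e-5})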

Custom optimization configurations can be used to help debug optimization, to define non-standard optimization routines, or to add ad-hoc moments to configured problems. They can use various information about optimization progress so far.

OptimizationProgress

Information about the current progress of optimization.

Data Manipulation Functions

There are also a number of convenience functions that can be used to construct common components of product and agent data, or manipulate other PyBLP objects.

build_matrix(formulation, data)

Construct a matrix according to a formulation.

build_blp_instruments(formulation, product_data)

Construct “sums of characteristics” excluded BLP instruments.

build_differentiation_instruments(…[, …])

Construct excluded differentiation instruments.

build_id_data(T, J, F)

Build a balanced panel of market and firm IDs.

build_ownership(product_data[, …])

Build ownership matrices, \(O\).

build_integration(integration, dimensions)

Build nodes and weights for integration over agent choice probabilities.

data_to_dict(data[, ignore_empty])

Convert a NumPy record array into a dictionary.

save_pickle(x, path)

Save an object as a pickle file.

read_pickle(path)

Load a pickled object into memory.
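
As a rough sketch of how a few of these helpers fit together (the dimensions and the file path are illustrative):

import pyblp

# a balanced panel of IDs: 50 markets, 20 products per market, 10 firms
id_data = pyblp.build_id_data(T=50, J=20, F=10)

# standard ownership matrices implied by the firm IDs
ownership = pyblp.build_ownership(id_data)

# convert a record array into a dictionary and pickle it to disk
id_dict = pyblp.data_to_dict(id_data)
pyblp.save_pickle(id_dict, 'id_data.pkl')
loaded = pyblp.read_pickle('id_data.pkl')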

Problem Class

Given data and appropriate configurations, a BLP-type problem can be structured by initializing the following class.

Problem(product_formulations, product_data)

A BLP-type problem.

Once initialized, the following method solves the problem.

Problem.solve([sigma, pi, rho, beta, gamma, …])

Solve the problem.

Micro Moment Classes

Micro dataset configurations are passed to micro part configurations, which are passed to micro moment configurations, which in turn can be passed to Problem.solve().

MicroDataset(name, observations, compute_weights)

Configuration for a micro dataset \(d\) on which micro moments are computed.

MicroPart(name, dataset, compute_values)

Configuration for a micro moment part \(p\).

MicroMoment(name, value, parts[, …])

Configuration for a micro moment \(m\).
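
A minimal sketch of this chain, assuming a hypothetical survey of 1,000 consumers and an illustrative moment value to match (the callable signatures follow the patterns used in the micro moment tutorials; see each class’s documentation for their exact requirements):

import numpy as np
import pyblp

# a hypothetical survey of 1000 consumers covering all markets
dataset = pyblp.MicroDataset(
    name="Survey",
    observations=1000,
    # weights on agent-choice pairs, including the outside option in the first column
    compute_weights=lambda t, products, agents: np.ones((agents.size, 1 + products.size)),
)

def compute_values(t, products, agents):
    # the x characteristic of the chosen product, with zero for the outside option
    return np.tile(np.r_[0, products.X2[:, 0]], (agents.size, 1))

part = pyblp.MicroPart(name="E[chosen x]", dataset=dataset, compute_values=compute_values)

# match a hypothetical observed average of 0.5 by passing micro_moments=[moment] to Problem.solve
moment = pyblp.MicroMoment(name="E[chosen x]", value=0.5, parts=part)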

Problem Results Class

Solved problems return the following results class.

ProblemResults

Results of a solved BLP problem.

The results can be pickled or converted into a dictionary.

ProblemResults.to_pickle(path)

Save these results as a pickle file.

ProblemResults.to_dict([attributes])

Convert these results into a dictionary that maps attribute names to values.
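
For example, given a ProblemResults instance `results` (such as the one from the simulation tutorial above), a brief sketch in which the path and attribute names are illustrative:

# save estimation results and re-load them later
results.to_pickle('results.pkl')
results = pyblp.read_pickle('results.pkl')

# keep only a few attributes in a plain dictionary
estimates = results.to_dict(['beta', 'sigma', 'gamma'])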

The following methods test the validity of overidentifying and model restrictions.

ProblemResults.run_hansen_test()

Test the validity of overidentifying restrictions with the Hansen \(J\) test.

ProblemResults.run_distance_test(unrestricted)

Test the validity of model restrictions with the distance test.

ProblemResults.run_lm_test()

Test the validity of model restrictions with the Lagrange multiplier test.

ProblemResults.run_wald_test(restrictions, …)

Test the validity of model restrictions with the Wald test.
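
For example, a brief sketch of the two tests that take no arguments:

# given the ProblemResults instance `results` from an over-identified problem
j_statistic = results.run_hansen_test()
lm_statistic = results.run_lm_test()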

In addition to class attributes, other post-estimation outputs can be estimated market-by-market with the following methods, each of which returns an array.

ProblemResults.compute_aggregate_elasticities([…])

Estimate aggregate elasticities of demand, \(\mathscr{E}\), with respect to a variable, \(x\).

ProblemResults.compute_elasticities([name, …])

Estimate matrices of elasticities of demand, \(\varepsilon\), with respect to a variable, \(x\).

ProblemResults.compute_demand_jacobians([…])

Estimate matrices of derivatives of demand with respect to a variable, \(x\).

ProblemResults.compute_demand_hessians([…])

Estimate arrays of second derivatives of demand with respect to a variable, \(x\).

ProblemResults.compute_profit_hessians([…])

Estimate arrays of second derivatives of profits with respect to prices.

ProblemResults.compute_diversion_ratios([…])

Estimate matrices of diversion ratios, \(\mathscr{D}\), with respect to a variable, \(x\).

ProblemResults.compute_long_run_diversion_ratios([…])

Estimate matrices of long-run diversion ratios, \(\bar{\mathscr{D}}\).

ProblemResults.compute_probabilities([…])

Estimate matrices of choice probabilities.

ProblemResults.extract_diagonals(matrices[, …])

Extract diagonals from stacked \(J_t \times J_t\) matrices for each market \(t\).

ProblemResults.extract_diagonal_means(matrices)

Extract means of diagonals from stacked \(J_t \times J_t\) matrices for each market \(t\).

ProblemResults.compute_delta([agent_data, …])

Estimate mean utilities, \(\delta\).

ProblemResults.compute_costs([firm_ids, …])

Estimate marginal costs, \(c\).

ProblemResults.compute_passthrough([…])

Estimate matrices of passthrough of marginal costs to equilibrium prices, \(\Upsilon\).

ProblemResults.compute_approximate_prices([…])

Approximate equilibrium prices after firm or cost changes, \(p^*\), under the assumption that shares and their price derivatives are unaffected by such changes.

ProblemResults.compute_prices([firm_ids, …])

Estimate equilibrium prices after firm or cost changes, \(p^*\).

ProblemResults.compute_shares([prices, …])

Estimate shares.

ProblemResults.compute_hhi([firm_ids, …])

Estimate Herfindahl-Hirschman Indices, \(\text{HHI}\).

ProblemResults.compute_markups([prices, …])

Estimate markups, \(\mathscr{M}\).

ProblemResults.compute_profits([prices, …])

Estimate population-normalized gross expected profits, \(\pi\).

ProblemResults.compute_consumer_surpluses([…])

Estimate population-normalized consumer surpluses, \(\text{CS}\).
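
A short sketch of a few of these methods, again assuming a ProblemResults instance `results` such as the one from the tutorial above (all arguments are left at their defaults):

elasticities = results.compute_elasticities()  # stacked J_t x J_t matrices
mean_own_elasticities = results.extract_diagonal_means(elasticities)
diversion_ratios = results.compute_diversion_ratios()
costs = results.compute_costs()
markups = results.compute_markups()
consumer_surpluses = results.compute_consumer_surpluses()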

A parametric bootstrap can be used, for example, to compute standard errors for post-estimation outputs. The following method returns a results class with the same methods as in the list directly above, each of which returns a distribution of post-estimation outputs corresponding to different bootstrapped samples.

ProblemResults.bootstrap([draws, seed, …])

Use a parametric bootstrap to create an empirical distribution of results.
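
A minimal sketch with an illustrative number of bootstrap draws:

# given the ProblemResults instance `results` from the tutorial above
bootstrapped_results = results.bootstrap(draws=100, seed=0)

# its methods mirror those listed above, returning one set of outputs per bootstrap draw
bootstrapped_elasticities = bootstrapped_results.compute_elasticities()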

Optimal instruments, which also return a results class instead of an array, can be estimated with the following method.

ProblemResults.compute_optimal_instruments([…])

Estimate feasible optimal or efficient instruments, \(Z_D^\text{opt}\) and \(Z_S^\text{opt}\).
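
A brief sketch of this workflow:

# given the ProblemResults instance `results` from the tutorial above
instrument_results = results.compute_optimal_instruments()
updated_problem = instrument_results.to_problem()

The updated problem can then be solved again, typically starting from the first-stage estimates; as in the tutorial above, a problem with a supply side also needs an initial value for the linear coefficient on prices.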

Importance sampling can be used to create new integration nodes and weights. Its method also returns a results class.

ProblemResults.importance_sampling(draws[, …])

Use importance sampling to construct nodes and weights for integration.
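
A brief sketch with an illustrative number of importance sampling draws:

# given the ProblemResults instance `results` from the tutorial above
sampling_results = results.importance_sampling(draws=500)
updated_problem = sampling_results.to_problem()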

The following methods can compute micro moment values, compute scores from micro data, or simulate such data.

ProblemResults.compute_micro_values(…)

Estimate micro moment values, \(f_m(v)\).

ProblemResults.compute_micro_scores(dataset, …)

Compute scores for observations \(n \in N_d\) from a micro dataset \(d\).

ProblemResults.compute_agent_scores(dataset)

Compute scores for all agent-choices, treated as observations \(n \in N_d\) from a micro dataset \(d\).

ProblemResults.simulate_micro_data(dataset)

Simulate observations \(n \in N_d\) from a micro dataset \(d\).

Bootstrapped Problem Results Class

Parametric bootstrap computation returns the following class.

BootstrappedResults

Bootstrapped results of a solved problem.

This class has many of the same methods as ProblemResults. It can also be pickled or converted into a dictionary.

BootstrappedResults.to_pickle(path)

Save these results as a pickle file.

BootstrappedResults.to_dict([attributes])

Convert these results into a dictionary that maps attribute names to values.

Optimal Instrument Results Class

Optimal instrument computation returns the following results class.

OptimalInstrumentResults

Results of optimal instrument computation.

The results can be pickled or converted into a dictionary.

OptimalInstrumentResults.to_pickle(path)

Save these results as a pickle file.

OptimalInstrumentResults.to_dict([attributes])

Convert these results into a dictionary that maps attribute names to values.

They can also be converted into a Problem with the following method.

OptimalInstrumentResults.to_problem([…])

Re-create the problem with estimated feasible optimal instruments.

This method returns the following class, which behaves exactly like a Problem.

OptimalInstrumentProblem

A BLP problem updated with optimal excluded instruments.

Importance Sampling Results Class

Importance sampling returns the following results class.

ImportanceSamplingResults

Results of importance sampling.

The results can be pickled or converted into a dictionary.

ImportanceSamplingResults.to_pickle(path)

Save these results as a pickle file.

ImportanceSamplingResults.to_dict([attributes])

Convert these results into a dictionary that maps attribute names to values.

They can also be converted into a Problem with the following method.

ImportanceSamplingResults.to_problem()

Re-create the problem with the agent data constructed from importance sampling.

This method returns the following class, which behaves exactly like a Problem.

ImportanceSamplingProblem

A BLP problem updated after importance sampling.

Simulation Class

The following class allows for evaluation of more complicated counterfactuals than is possible with ProblemResults methods, or for simulation of synthetic data from scratch.

Simulation(product_formulations, …[, …])

Simulation of data in BLP-type models.

Once initialized, the following method replaces prices and shares with equilibrium values that are consistent with true parameters.

Simulation.replace_endogenous([costs, …])

Replace simulated prices and market shares with equilibrium values that are consistent with true parameters.

A less common way to solve the simulation is to assume simulated prices and shares represent an equilibrium and to replace exogenous variables instead.

Simulation.replace_exogenous(X1_name[, …])

Replace exogenous product characteristics with values that are consistent with true parameters.
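
A brief sketch using the simulation configured in the tutorial above, in which x enters both \(X_1\) and \(X_3\) (see the method’s documentation for cases where different characteristics must be replaced):

# keep the initially simulated prices and shares, and instead solve for the
# characteristic x that rationalizes them given the true parameters
alternative_results = simulation.replace_exogenous('x')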

Simulation Results Class

Solved simulations return the following results class.

SimulationResults

Results of a solved simulation of synthetic BLP data.

This class has many of the same methods as ProblemResults. It can also be pickled or converted into a dictionary.

SimulationResults.to_pickle(path)

Save these results as a pickle file.

SimulationResults.to_dict([attributes])

Convert these results into a dictionary that maps attribute names to values.

It can also be converted into a Problem with the following method.

SimulationResults.to_problem([…])

Convert the solved simulation into a problem.

Structured Data Classes

Product and agent data that are passed or constructed by Problem and Simulation are structured internally into classes with field names that more closely resemble BLP notation. Although these structured data classes are not directly constructable, they can be accessed with Problem and Simulation class attributes. It can be helpful to compare these structured data classes with the data or configurations used to create them.

Products

Product data structured as a record array.

Agents

Agent data structured as a record array.

Multiprocessing

A context manager can be used to enable parallel processing for methods that perform market-by-market computation.

parallel(processes[, use_pathos])

Context manager used for parallel processing in a with statement context.
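
For example, a sketch that parallelizes one of the market-by-market methods above across two processes:

# given the ProblemResults instance `results` from the tutorial above
with pyblp.parallel(2):
    elasticities = results.compute_elasticities()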

Options and Example Data

In addition to classes and functions, there are also two modules that can be used to configure global package options and locate example data that comes with the package.

options

Global options.

data

Locations of example data that are included in the package for convenience.
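
A brief sketch of both modules (the example data attribute follows the data module’s documented locations of the cereal data used in earlier tutorials):

import pandas as pd
import pyblp

# global options: display fewer digits and suppress verbose estimation output
pyblp.options.digits = 2
pyblp.options.verbose = False

# example product data included with the package
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)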

Exceptions

When errors occur, they will either be displayed as warnings or raised as exceptions.

exceptions.MultipleErrors

Multiple errors that occurred around the same time.

exceptions.NonpositiveCostsError

Encountered nonpositive marginal costs in a log-linear specification.

exceptions.NonpositiveSyntheticCostsError

Encountered nonpositive synthetic marginal costs in a log-linear specification.

exceptions.InvalidParameterCovariancesError

Failed to compute standard errors because of invalid estimated covariances of GMM parameters.

exceptions.InvalidMomentCovariancesError

Failed to compute a weighting matrix because of invalid estimated covariances of GMM moments.

exceptions.GenericNumericalError

Encountered a numerical error.

exceptions.DeltaNumericalError

Encountered a numerical error when computing \(\delta\).

exceptions.CostsNumericalError

Encountered a numerical error when computing marginal costs.

exceptions.MicroMomentsNumericalError

Encountered a numerical error when computing micro moments.

exceptions.XiByThetaJacobianNumericalError

Encountered a numerical error when computing the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) with respect to \(\theta\).

exceptions.OmegaByThetaJacobianNumericalError

Encountered a numerical error when computing the Jacobian (holding \(\gamma\) fixed) of \(\omega\) (equivalently, of transformed marginal costs) with respect to \(\theta\).

exceptions.MicroMomentsByThetaJacobianNumericalError

Encountered a numerical error when computing the Jacobian of micro moments with respect to \(\theta\).

exceptions.MicroMomentCovariancesNumericalError

Encountered a numerical error when computing micro moment covariances.

exceptions.SyntheticPricesNumericalError

Encountered a numerical error when computing synthetic prices.

exceptions.SyntheticSharesNumericalError

Encountered a numerical error when computing synthetic shares.

exceptions.SyntheticDeltaNumericalError

Encountered a numerical error when computing the synthetic \(\delta\).

exceptions.SyntheticCostsNumericalError

Encountered a numerical error when computing synthetic marginal costs.

exceptions.SyntheticMicroDataNumericalError

Encountered a numerical error when computing synthetic micro data.

exceptions.SyntheticMicroMomentsNumericalError

Encountered a numerical error when computing synthetic micro moments.

exceptions.MicroScoresNumericalError

Encountered a numerical error when computing micro scores.

exceptions.EquilibriumRealizationNumericalError

Encountered a numerical error when solving for a realization of equilibrium prices and shares.

exceptions.JacobianRealizationNumericalError

Encountered a numerical error when computing a realization of the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) or \(\omega\) (equivalently, of transformed marginal costs) with respect to \(\theta\).

exceptions.PostEstimationNumericalError

Encountered a numerical error when computing a post-estimation output.

exceptions.AbsorptionError

A fixed effect absorption procedure failed to properly absorb fixed effects.

exceptions.ClippedSharesError

Shares were clipped during the final iteration of the fixed point routine for computing \(\delta\).

exceptions.ThetaConvergenceError

The optimization routine failed to converge.

exceptions.DeltaConvergenceError

The fixed point computation of \(\delta\) failed to converge.

exceptions.SyntheticPricesConvergenceError

The fixed point computation of synthetic prices failed to converge.

exceptions.SyntheticDeltaConvergenceError

The fixed point computation of the synthetic \(\delta\) failed to converge.

exceptions.EquilibriumPricesConvergenceError

The fixed point computation of equilibrium prices failed to converge.

exceptions.ObjectiveReversionError

Reverted a problematic GMM objective value.

exceptions.GradientReversionError

Reverted problematic elements in the GMM objective gradient.

exceptions.DeltaReversionError

Reverted problematic elements in \(\delta\).

exceptions.CostsReversionError

Reverted problematic marginal costs.

exceptions.MicroMomentsReversionError

Reverted problematic micro moments.

exceptions.XiByThetaJacobianReversionError

Reverted problematic elements in the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) with respect to \(\theta\).

exceptions.OmegaByThetaJacobianReversionError

Reverted problematic elements in the Jacobian (holding \(\gamma\) fixed) of \(\omega\) (equivalently, of transformed marginal costs) with respect to \(\theta\).

exceptions.MicroMomentsByThetaJacobianReversionError

Reverted problematic elements in the Jacobian of micro moments with respect to \(\theta\).

exceptions.HessianEigenvaluesError

Failed to compute eigenvalues for the GMM objective’s (reduced) Hessian matrix.

exceptions.ProfitHessianEigenvaluesError

Failed to compute eigenvalues for a firm’s profit Hessian.

exceptions.FittedValuesInversionError

Failed to invert an estimated covariance when computing fitted values.

exceptions.SharesByXiJacobianInversionError

Failed to invert a Jacobian of shares with respect to \(\xi\) when computing the Jacobian (holding \(\beta\) fixed) of \(\xi\) (equivalently, of \(\delta\)) with respect to \(\theta\).

exceptions.IntraFirmJacobianInversionError

Failed to invert an intra-firm Jacobian of shares with respect to prices.

exceptions.PassthroughInversionError

Failed to invert the matrix to recover the passthrough matrix.

exceptions.LinearParameterCovariancesInversionError

Failed to invert an estimated covariance matrix of linear parameters.

exceptions.GMMParameterCovariancesInversionError

Failed to invert an estimated covariance matrix of GMM parameters.

exceptions.GMMMomentCovariancesInversionError

Failed to invert an estimated covariance matrix of GMM moments.

References

This page contains a full list of references cited in the documentation, including the original work of Berry, Levinsohn, and Pakes (1995). If you use PyBLP in your research, we ask that you also cite Conlon and Gortmaker (2020) below, which describes the advances implemented in the package. If you use PyBLP’s micro moments functionality, we ask that you also cite Conlon and Gortmaker (2023), which describes the standardized framework implemented by PyBLP for incorporating micro data into BLP-style estimation.

Conlon and Gortmaker (2020)

Conlon, Christopher, and Jeff Gortmaker (2020). Best practices for differentiated products demand estimation with PyBLP. RAND Journal of Economics, 51 (4), 1108-1161.

Conlon and Gortmaker (2023)

Conlon, Christopher, and Jeff Gortmaker (2023). Incorporating micro data into differentiated products demand estimation with PyBLP. Working paper.

Other References

Amemiya (1977)

Amemiya, Takeshi (1977). A note on a heteroscedastic model. Journal of Econometrics, 6 (3), 365-370.

Andrews, Gentzkow, and Shapiro (2017)

Andrews, Isaiah, Matthew Gentzkow, and Jesse M. Shapiro (2017). Measuring the sensitivity of parameter estimates to estimation moments. Quarterly Journal of Economics, 132 (4), 1553-1592.

Armstrong (2016)

Armstrong, Timothy B. (2016). Large market asymptotics for differentiated product demand estimators with economic models of supply. Econometrica, 84 (5), 1961-1980.

Berry (1994)

Berry, Steven (1994). Estimating discrete-choice models of product differentiation. RAND Journal of Economics, 25 (2), 242-262.

Berry, Levinsohn, and Pakes (1995)

Berry, Steven, James Levinsohn, and Ariel Pakes (1995). Automobile prices in market equilibrium. Econometrica, 63 (4), 841-890.

Berry, Levinsohn, and Pakes (1999)

Berry, Steven, James Levinsohn, and Ariel Pakes (1999). Voluntary export restraints on automobiles: Evaluating a trade policy. American Economic Review, 89 (3), 400-430.

Berry, Levinsohn, and Pakes (2004)

Berry, Steven, James Levinsohn, and Ariel Pakes (2004). Differentiated products demand systems from a combination of micro and macro data: The new car market. Journal of Political Economy, 112 (1), 68-105.

Berry and Pakes (2007)

Berry, Steven, and Ariel Pakes (2007). The pure characteristics demand model. International Economic Review, 48 (4), 1193-1225.

Brenkers and Verboven (2006)

Brenkers, Randy, and Frank Verboven (2006). Liberalizing a distribution system: The European car market. Journal of the European Economic Association, 4 (1), 216-251.

Brunner, Heiss, Romahn, and Weiser (2017)

Brunner, Daniel, Florian Heiss, André Romahn, and Constantin Weiser (2017). Reliable estimation of random coefficient logit demand models. DICE Discussion Paper 267.

Cardell (1997)

Cardell, N. Scott (1997). Variance components structures for the extreme-value and logistic distributions with application to models of heterogeneity. Econometric Theory, 13 (2), 185-213.

Chamberlain (1987)

Chamberlain, Gary (1987). Asymptotic efficiency in estimation with conditional moment restrictions. Journal of Econometrics, 34 (3), 305-334.

Conlon and Mortimer (2018)

Conlon, Christopher T., and Julie H. Mortimer (2018). Empirical properties of diversion ratios. NBER working paper 24816.

Frisch and Waugh (1933)

Frisch, Ragnar, and Frederick V. Waugh (1933). Partial time regressions as compared with individual trends. Econometrica, 1 (4), 387-401.

Gandhi and Houde (2017)

Gandhi, Amit, and Jean-François Houde (2017). Measuring substitution patterns in differentiated products industries. Working paper.

Grigolon and Verboven (2014)

Grigolon, Laura, and Frank Verboven (2014). Nested logit or random coefficients logit? A comparison of alternative discrete choice models of product differentiation. Review of Economics and Statistics, 96 (5), 916-935.

Hausman, Leonard, and Zona (1994)

Hausman, Jerry, Gregory Leonard, and J. Douglas Zona (1994). Competitive analysis with differentiated products. Annals of Economics and Statistics, 34, 143-157.

Hansen (1982)

Hansen, Lars Peter (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50 (4), 1029-1054.

Heiss and Winschel (2008)

Heiss, Florian, and Viktor Winschel (2008). Likelihood approximation by numerical integration on sparse grids. Journal of Econometrics, 144 (1), 62-80.

Hess, Train, and Polak (2004)

Hess, Stephane, Kenneth E. Train, and John W. Polak (2004). On the use of a Modified Latin Hypercube Sampling (MLHS) method in the estimation of a mixed logit model for vehicle choice. Transportation Research Part B (40), 147-167.

Imbens and Lancaster (1994)

Imbens, Guido W., and Tony Lancaster (1994). Combining micro and macro data in microeconometric models. Review of Economic Studies, 61 (4), 655-680.

Judd and Skrainka (2011)

Judd, Kenneth L., and Ben Skrainka (2011). High performance quadrature rules: How numerical integration affects a popular model of product differentiation. CeMMAP working paper CWP03/11.

Knittel and Metaxoglou (2014)

Knittel, Christopher R., and Konstantinos Metaxoglou (2014). Estimation of random-coefficient demand models: Two empiricists’ perspective. Review of Economics and Statistics, 96 (1), 34-59.

Lovell (1963)

Lovell, Michael C. (1963). Seasonal adjustment of economic time series and multiple regression analysis. Journal of the American Statistical Association, 58 (304), 993-1010.

MacKay and Miller (2023)

MacKay, Alexander, and Nathan Miller (2023). Estimating models of supply and demand: Instruments and covariance restrictions. HBS working paper 19-051.

Morrow and Skerlos (2011)

Morrow, W. Ross, and Steven J. Skerlos (2011). Fixed-point approaches to computing Bertrand-Nash equilibrium prices under mixed-logit demand. Operations Research, 59 (2), 328-345.

Nevo (2000a)

Nevo, Aviv (2000). A practitioner’s guide to estimation of random‐coefficients logit models of demand. Journal of Economics & Management Strategy, 9 (4), 513-548.

Nevo (2000b)

Nevo, Aviv (2000). Mergers with differentiated products: The case of the ready-to-eat cereal industry. RAND Journal of Economics, 31 (3), 395-421.

Newey and West (1987)

Newey, Whitney K., and Kenneth D. West (1987). Hypothesis testing with efficient method of moments estimation. International Economic Review, 28 (3), 777-787.

Owen (2013)

Owen, Art B. (2013). Monte Carlo theory, methods and examples.

Owen (2017)

Owen, Art B. (2017). A randomized Halton algorithm in R.

Petrin (2002)

Petrin, Amil (2002). Quantifying the benefits of new products: The case of the minivan. Journal of Political Economy, 110 (4), 705-729.

Reynaert and Verboven (2014)

Reynaert, Mathias, and Frank Verboven (2014). Improving the performance of random coefficients demand models: The role of optimal instruments. Journal of Econometrics, 179 (1), 83-98.

Reynaerts, Varadhan, and Nash (2012)

Reynaerts, Jo, Ravi Varadhan, and John C. Nash (2012). Enhancing the convergence properties of the BLP (1995) contraction mapping. VIVES discussion paper 35.

Varadhan and Roland (2008)

Varadhan, Ravi, and Christophe Roland (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scandinavian Journal of Statistics, 35 (2), 335-353.

Werden (1997)

Werden, Gregory J. (1997). Simulating the effects of differentiated products mergers: A practitioners’ guide. Economic Analysis Group, Proceedings of NE-165 Conference, Washington, D.C., June 20–21, 1996, 1997.

Contributing

Please use the GitHub issue tracker to report bugs or to request features. Contributions are welcome. Examples include:

  • Code optimizations.

  • Documentation improvements.

  • Alternate formulations that have been implemented in the literature but not in PyBLP.

Testing

Testing is done with the tox automation tool, which runs a pytest-backed test suite in the tests/ directory.

Testing Requirements

In addition to the installation requirements for the package itself, running tests and building documentation requires additional packages specified by the tests and docs extras in setup.py, along with any other explicitly specified deps in tox.ini.

The full suite of tests also requires installation of the following software:

  • Artleys Knitro version 10.3 or newer: testing optimization routines.

  • MATLAB: comparing sparse grids with those created by the function nwspgr created by Florian Heiss and Viktor Winschel, which must be included in a directory on the MATLAB path.

  • R: simulating nested logit errors created by the package evd created by Alec Stephenson, which must be installed.

If software is not installed, its associated tests will be skipped. Additionally, some tests that require support for extended precision will be skipped if, on the platform running the tests, numpy.longdouble has the same precision as numpy.float64. This tends to be the case on Windows.

Running Tests

Defined in tox.ini are environments that test the package under different Python versions, check types, enforce style guidelines, verify the integrity of the documentation, and release the package. First, tox should be installed on top of an Anaconda installation. The following command can be run in the top-level pyblp directory to run all testing environments:

tox

You can choose to run only one environment, such as the one that builds the documentation, with the -e flag:

tox -e docs

Test Organization

Fixtures, which are defined in tests.conftest, configure the testing environment and simulate problems according to a range of specifications.

Most BLP-specific tests in tests.test_blp verify properties about results obtained by solving the simulated problems under various parameterizations. Examples include:

  • Reasonable formulations of problems should give rise to estimated parameters that are close to their true values.

  • Cosmetic changes such as the number of processes should not change estimates.

  • Post-estimation outputs should satisfy certain properties.

  • Optimization routines should behave as expected.

  • Derivatives computed with finite differences should approach analytic derivatives.

Tests of generic utilities in tests.test_formulation, tests.test_integration, tests.test_iteration, and tests.test_optimization verify that matrix formulation, integral approximation, fixed point iteration, and nonlinear optimization all work as expected. Examples include:

  • Nonlinear formulas give rise to expected matrices and derivatives.

  • Gauss-Hermite integrals are better approximated with quadrature based on Gauss-Hermite rules than with Monte Carlo integration.

  • To solve a fixed point iteration problem for which it was developed, SQUAREM requires fewer fixed point evaluations than does simple iteration.

  • All optimization routines manage to solve a well-known optimization problem under different parameterizations.

Version Notes

These notes will only include major changes.

1.1

  • Covariance restrictions

  • Demographic-specific product availability

1.0

  • Support matching smooth functions of micro means

  • Optimal micro moments

  • Support elimination of groups of products for second choices

  • Micro data simulation

  • Micro moment tutorials

0.13

  • Overhauled micro moment API

  • Product-specific demographics

  • Passthrough calculations

  • Added problem results methods to simulation results

  • Profit Hessian computation

  • Checks of pricing second order conditions

  • Newton-based methods for computing equilibrium prices

  • Large speedups for supply-side and micro moment derivatives

  • Universal display for fixed point iteration progress

  • Support adjusting for simulation error in moment covariances

0.12

  • Refactored micro moment API

  • Custom micro moments

  • Properly scale micro moment covariances

  • Pickling support

0.11

  • Elasticities and diversion ratios with respect to mean utility

  • Willingness to pay calculations

0.10

  • Simplify micro moment API

  • Second choice or diversion micro moments

  • Add share clipping to make fixed point more robust

  • Report covariance matrix estimates in addition to Cholesky root

  • Approximation to the pure characteristics model

  • Add option to always use finite differences

0.9

  • More control over matrices of instruments

  • Split off fixed effect absorption into companion package PyHDFE

  • Scrambled Halton and Modified Latin Hypercube Sampling (MLHS) integration

  • Importance sampling

  • Quantity dependent marginal costs

  • Speed up various matrix construction routines

  • Option to do initial GMM update at starting values

  • Update BLP example data to better replicate original paper

  • Lognormal random coefficients

  • Removed outdated default parameter bounds

  • Change default objective scaling for more comparable objective values across problem sizes

  • Add post-estimation routines to simplify integration error comparison

0.8

  • Micro moments that match product and agent characteristic covariances

  • Extended use of pseudo-inverses

  • Added more information to error messages

  • More flexible simulation interface

  • Alternative way to simulate data with specified prices and shares

  • Tests of overidentifying and model restrictions

  • Report projected gradients and reduced Hessians

  • Change objective gradient scaling

  • Switch to a lower-triangular covariance matrix to fix a bug with off-diagonal parameters

0.7

  • Support more fixed point and optimization solvers

  • Hessian computation with finite differences

  • Simplified interface for firm changes

  • Construction of differentiation instruments

  • Add collinearity checks

  • Update notation and explanations

0.6

  • Optimal instrument estimation

  • Structured all results as classes

  • Additional information in progress reports

  • Parametric bootstrapping of post-estimation outputs

  • Replaced all examples in the documentation with Jupyter notebooks

  • Updated the instruments for the BLP example problem

  • Improved support for multiple equation GMM

  • Made concentrating out linear parameters optional

  • Better support for larger nesting parameters

  • Improved robustness to overflow

0.5

  • Estimation of nesting parameters

  • Performance improvements for matrix algebra and matrix construction

  • Support for Python 3.7

  • Computation of reasonable default bounds on nonlinear parameters

  • Additional information in progress updates

  • Improved error handling and documentation

  • Simplified multiprocessing interface

  • Cancelled out delta in the nonlinear contraction to improve performance

  • Additional example data and improvements to the example problems

  • Cleaned up covariance estimation

  • Added type annotations and overhauled the testing suite

0.4

  • Estimation of a Logit benchmark model

  • Support for fixing of all nonlinear parameters

  • More efficient two-way fixed effect absorption

  • Clustered standard errors

0.3

  • Patsy- and SymPy-backed R-style formula API

  • More informative errors and displays of information

  • Absorption of arbitrary fixed effects

  • Reduction of memory footprint

0.2

  • Improved support for longdouble precision

  • Custom ownership matrices

  • New benchmarking statistics

  • Supply-side gradient computation

  • Improved configuration for the automobile example problem

0.1

  • Initial release