Download the Jupyter Notebook for this section: nevo.ipynb
Random Coefficients Logit Tutorial with the Fake Cereal Data¶
[1]:
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
[1]:
'0.7.0'
In this tutorial, we’ll use data from Nevo (2000) to solve the paper’s fake cereal problem. Locations of CSV files that contain the data are in the data module.
Theory of Random Coefficients Logit¶
The random coefficients model extends the plain logit model by allowing for correlated tastes for different product characteristics. In this model (indirect) utility is given by
The main addition is that \(\beta_i = (\alpha_i, \beta_i^x)\) have individual specific subscripts \(i\).
Conditional on \(\beta_i\), the individual marketshares follow the same logit form as before. But now we must integrate over heterogeneous individuals to get the aggregate marketshares:
In general, this integral needs to be calculated numerically. This also means that we can’t directly linearize the model. It is common to reparametrize the model to separate the aspects of mean utility that all individuals agree on, \(\delta_{jt} = \alpha p_{jt} + x_{jt} \beta^x + \xi_{jt}\), from the individual specific heterogeneity, \(\mu_{jti}(\theta)\). This gives us
Given a guess of \(\theta\) we can solve the system of nonlinear equations for the vector \(\delta\) which equates observed and predicted marketshares \(s_{jt} = s_{jt}(\delta, \theta)\). Now we can perform a linear IV GMM regression of the form
The moments are constructed by interacting the predicted residuals \(\hat{\xi}_{jt}(\theta)\) with instruments \(z_{jt}\) to form
Application of Random Coefficients Logit with the Fake Cereal Data¶
To include random coefficients we need to add a Formulation for the nonlinear characteristics \(X_2\).
Just like in the logit case we have the same reserved field names in product_data
:
market_ids
are the unique market identifiers which we subscript \(t\).shares
specifies the marketshares which need to be between zero and one, and within a market ID, \(\sum_{j} s_{jt} < 1\).prices
are prices \(p_{jt}\). These have some special properties and are always treated as endogenous.demand_instruments0
,demand_instruments1
, and so on are numbered demand instruments. These represent only the excluded instruments. The exogenous regressors in \(X_1\) (of which \(X_2\) is typically a subset) will be automatically added to the set of instruments.
We proceed with the following steps:
Load the
product data
which at a minimum consists ofmarket_ids
,shares
,prices
, and at least a single column of demand instruments,demand_instruments0
.Define a Formulation for the \(X_1\) (linear) demand model.
This and all other formulas are similar to R and patsy formulas.
It includes a constant by default. To exclude the constant, specify either a
0
or a1
.To efficiently include fixed effects, use the
absorb
option and specify which categorical variables you would like to absorb.Some model reduction may happen automatically. The constant will be excluded if you include fixed effects and some precautions are taken against collinearity. However, you will have to make sure that differentlynamed variables are not collinear.
Define a Formulation for the \(X_2\) (nonlinear) demand model.
Include only the variables over which we want random coefficients.
Do not absorb or include fixed effects.
It will include a random coefficient on the constant (to capture inside good vs. outside good preference) unless you specify not to with a
0
or a1
.
Define an Integration configuration to solve the market share integral from several available options:
Monte Carlo integration (pseudorandom draws).
Product rule quadrature.
Sparse grid quadrature.
Combine Formulation classes,
product_data
, and the Integration configuration to construct a Problem.Use the Problem.solve method to estimate paramters.
It is required to specify an initial guess of the nonlinear parameters. This serves two primary purposes: speeding up estimation and indicating to the solver through initial values of zero which parameters are restricted to be always zero.
Specification of Random Taste Parameters¶
It is common to assume that \(f(\beta_i \mid \theta)\) follows a multivariate normal distribution, and to break it up into three parts:
A mean \(K_1 \times 1\) taste which all individuals agree on, \(\beta\).
A \(K_2 \times K_2\) covariance matrix, \(\Sigma\) .
Any \(K_2 \times D\) interactions, \(\Pi\), with observed \(D \times 1\) demographic data, \(d_i\).
Together this gives us that
Problem.solve takes an initial guess \(\Sigma_0\) of \(\Sigma\). It guarantees that \(\hat{\Sigma}\) (the estimated parameters) will have the same sparsity structure as \(\Sigma_0\). So any zero element of \(\Sigma\) is restricted to be zero in the solution \(\hat{\Sigma}\). For example, a popular restriction is that \(\Sigma\) is diagonal, this can be achieved by passing a diagonal matrix as \(\Sigma_0\).
As is common with multivariate normal distributions, \(\Sigma\) is not estimated directly. Rather, its matrix square (Cholesky) root \(L\) is estimated where \(LL' = \Sigma\).
Loading the Data¶
The product_data
argument of Problem should be a structured arraylike object with fields that store data. Product data can be a structured NumPy array, a pandas DataFrame, or other similar objects.
[2]:
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()
[2]:
market_ids  city_ids  quarter  product_ids  firm_ids  brand_ids  shares  prices  sugar  mushy  ...  demand_instruments10  demand_instruments11  demand_instruments12  demand_instruments13  demand_instruments14  demand_instruments15  demand_instruments16  demand_instruments17  demand_instruments18  demand_instruments19  

0  C01Q1  1  1  F1B04  1  4  0.012417  0.072088  2  1  ...  2.116358  0.154708  0.005796  0.014538  0.126244  0.067345  0.068423  0.034800  0.126346  0.035484 
1  C01Q1  1  1  F1B06  1  6  0.007809  0.114178  18  1  ...  7.374091  0.576412  0.012991  0.076143  0.029736  0.087867  0.110501  0.087784  0.049872  0.072579 
2  C01Q1  1  1  F1B07  1  7  0.012995  0.132391  4  1  ...  2.187872  0.207346  0.003509  0.091781  0.163773  0.111881  0.108226  0.086439  0.122347  0.101842 
3  C01Q1  1  1  F1B09  1  9  0.005770  0.130344  3  0  ...  2.704576  0.040748  0.003724  0.094732  0.135274  0.088090  0.101767  0.101777  0.110741  0.104332 
4  C01Q1  1  1  F1B11  1  11  0.017934  0.154823  12  0  ...  1.261242  0.034836  0.000568  0.102451  0.130640  0.084818  0.101075  0.125169  0.133464  0.121111 
5 rows × 30 columns
The product data contains market_ids
, product_ids
, firm_ids
, shares
, prices
, a number of other firm IDs and product characteristics, and some precomputed excluded demand_instruments0
, demand_instruments1
, and so on. The product_ids
will be incorporated as fixed effects.
For more information about the instruments and the example data as a whole, refer to the data module.
Setting Up and Solving the Problem Without Demographics¶
Formulations, product data, and an integration configuration are collectively used to initialize a Problem. Once initialized, Problem.solve runs the estimation routine. The arguments to Problem.solve configure how estimation is performed. For example, optimization
and iteration
arguments configure the optimization and
iteration routines that are used by the outer and inner loops of estimation.
We’ll specify Formulation configurations for \(X_1\), the linear characteristics, and \(X_2\), the nonlinear characteristics.
The formulation for \(X_1\) consists of
prices
and fixed effects constructed fromproduct_ids
, which we will absorb usingabsorb
argument of Formulation.If we were interested in reporting estimates for each fixed effect, we could replace the formulation for \(X_1\) with
Formulation('prices + C(product_ids)')
.Because
sugar
,mushy
, and the constant are collinear withproduct_ids
, we can include them in \(X_2\) but not in \(X_1\).
[3]:
X1_formulation = pyblp.Formulation('0 + prices', absorb='C(product_ids)')
X2_formulation = pyblp.Formulation('1 + prices + sugar + mushy')
product_formulations = (X1_formulation, X2_formulation)
product_formulations
[3]:
(prices + Absorb[C(product_ids)], 1 + prices + sugar + mushy)
We also need to specify an Integration configuration. We consider two alternatives:
Monte Carlo draws: we simulate 50 individuals from a random normal distribution.
Product rules: we construct nodes and weights according to a product rule that exactly integrates polynomials of degree \(5 \times 2  1 = 9\) or less.
[4]:
mc_integration = pyblp.Integration('monte_carlo', size=50, seed=0)
mc_integration
[4]:
Configured to construct nodes and weights with Monte Carlo simulation.
[5]:
pr_integration = pyblp.Integration('product', size=5)
pr_integration
[5]:
Configured to construct nodes and weights according to the level5 GaussHermite product rule.
[6]:
mc_problem = pyblp.Problem(product_formulations, product_data, integration=mc_integration)
mc_problem
[6]:
Dimensions:
============================================
T N F I K1 K2 MD ED
       
94 2256 5 4700 1 4 20 1
============================================
Formulations:
===========================================================
Column Indices: 0 1 2 3
    
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
===========================================================
[7]:
pr_problem = pyblp.Problem(product_formulations, product_data, integration=pr_integration)
pr_problem
[7]:
Dimensions:
=============================================
T N F I K1 K2 MD ED
       
94 2256 5 58750 1 4 20 1
=============================================
Formulations:
===========================================================
Column Indices: 0 1 2 3
    
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
===========================================================
As an illustration of how to configure the optimization routine, we’ll use a simpler, nondefault Optimization configuration that doesn’t support parameter bounds.
[8]:
bfgs = pyblp.Optimization('bfgs')
bfgs
[8]:
Configured to optimize using the BFGS algorithm implemented in SciPy with analytic gradients and options {}.
We estimate three versions of the model:
An unrestricted covariance matrix for random tastes using Monte Carlo integration.
An unrestricted covariance matrix for random tastes using the product rule.
A restricted diagonal matrix for random tastes using Monte Carlo Integration.
Notice that the only thing that changes when we estimate the restricted covariance is our initial guess of \(\Sigma_0\). The lower diagonal in this initial guess is ignored because we are optimizing over the upper triangular (Cholesky) root of \(\Sigma\).
[9]:
results1 = mc_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)
results1
[9]:
Problem Results Summary:
========================================================================================================================
Smallest Largest
Computation GMM Optimization Objective Fixed Point Contraction Objective Gradient Hessian Hessian
Time Step Iterations Evaluations Iterations Evaluations Value Infinity Norm Eigenvalue Eigenvalue
         
00:05:19 2 62 171 175814 541988 +3.0E+05 +8.0E03 +1.5E+02 +1.2E+07
========================================================================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
======================================================
Sigma: 1 prices sugar mushy
    
1 +9.1E02 +1.3E01 +1.1E01 5.7E02
(+1.8E+00) (+8.2E+00) (+9.5E+00) (+4.3E+00)
prices 1.4E+00 +3.5E+00 +1.3E+01
(+6.6E+01) (+7.4E+01) (+1.9E+01)
sugar 1.1E02 1.4E01
(+1.7E01) (+1.8E01)
mushy 2.6E02
(+7.5E01)
======================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices

3.4E+01
(+1.5E+01)
==========
[10]:
results2 = pr_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)
results2
[10]:
Problem Results Summary:
========================================================================================================================
Smallest Largest
Computation GMM Optimization Objective Fixed Point Contraction Objective Gradient Hessian Hessian
Time Step Iterations Evaluations Iterations Evaluations Value Infinity Norm Eigenvalue Eigenvalue
         
00:10:49 2 69 153 100768 312910 +3.6E+05 +1.3E02 +3.3E+00 +1.2E+07
========================================================================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
======================================================
Sigma: 1 prices sugar mushy
    
1 9.8E10 1.0E09 +3.2E01 +6.7E01
(+2.2E+09) (+5.1E+09) (+9.9E+01) (+5.2E+01)
prices +1.4E08 6.2E+00 1.1E+01
(+2.8E+10) (+1.6E+03) (+9.3E+02)
sugar +5.7E02 +9.7E02
(+1.4E+01) (+7.9E+00)
mushy 1.6E01
(+1.5E+01)
======================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices

3.1E+01
(+3.1E+01)
==========
[11]:
results3 = mc_problem.solve(sigma=np.eye(4), optimization=bfgs)
results3
[11]:
Problem Results Summary:
========================================================================================================================
Smallest Largest
Computation GMM Optimization Objective Fixed Point Contraction Objective Gradient Hessian Hessian
Time Step Iterations Evaluations Iterations Evaluations Value Infinity Norm Eigenvalue Eigenvalue
         
00:01:19 2 16 77 43049 134868 +4.0E+05 +3.9E04 +2.4E+03 +1.4E+07
========================================================================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
======================================================
Sigma: 1 prices sugar mushy
    
1 +5.2E02 +0.0E+00 +0.0E+00 +0.0E+00
(+1.1E+00)
prices 4.3E01 +0.0E+00 +0.0E+00
(+8.0E+00)
sugar +3.6E02 +0.0E+00
(+5.8E02)
mushy +5.0E01
(+1.7E+00)
======================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices

3.0E+01
(+1.4E+00)
==========
We see that all three models give similar estimates of the price coefficient \(\hat{\alpha} \approx 30\). Note a few of the estimated terms on the diagonal of \(\Sigma\) are negative. Since the diagonal consists of standard deviations, negative values are unrealistic. When using another optimization routine that supports bounds (like the default LBFGSB routine), these diagonal elements are by default bounded from below by zero.
Adding Demographics to the Problem¶
To add demographic data we need to make two changes:
We need to load
agent_data
, which for this cereal problem contains precomputed Monte Carlo draws and demographics.We need to add an
agent_formulation
to the model.
The agent data
has several reserved column names.
market_ids
are the index that link theagent data
to themarket_ids
inproduct data
.weights
are the weights \(w_{ti}\) attached to each agent. In each market, these should sum to one so that \(\sum_i w_{ti} = 1\). It is often the case the \(w_{ti} = 1 / I_t\) where \(I_t\) is the number of agents in market \(t\), so that each agent gets equal weight. Other possibilities include quadrature nodes and weights.nodes0
,nodes1
, and so on are the nodes at which the unobserved agent tastes \(\mu_{jti}\) are evaluated. The nodes should be labeled from \(0, \ldots, K_2  1\) where \(K_2\) is the number of random coefficients.Other fields are the realizations of the demographics \(d\) themselves.
[12]:
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)
agent_data.head()
[12]:
market_ids  city_ids  quarter  weights  nodes0  nodes1  nodes2  nodes3  income  income_squared  age  child  

0  C01Q1  1  1  0.05  0.434101  1.500838  1.151079  0.161017  0.495123  8.331304  0.230109  0.230851 
1  C01Q1  1  1  0.05  0.726649  0.133182  0.500750  0.129732  0.378762  6.121865  2.532694  0.769149 
2  C01Q1  1  1  0.05  0.623061  0.138241  0.797441  0.795549  0.105015  1.030803  0.006965  0.230851 
3  C01Q1  1  1  0.05  0.041317  1.257136  0.683054  0.259044  1.485481  25.583605  0.827946  0.769149 
4  C01Q1  1  1  0.05  0.466691  0.226968  1.044424  0.092019  0.316597  6.517009  0.230109  0.230851 
The agent formulation
tells us which columns of demographic information to interact with \(X_2\).
[13]:
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')
agent_formulation
[13]:
income + income_squared + age + child
This tells us to include demographic realizations for income
, income_squared
, age
, and the presence of children, child
, but to ignore other possible demographics when interacting demographics with \(X_2\). We should also generally exclude the constant from the demographic formula.
Now we configure the Problem to include the agent_formulation
and agent_data
, which follow the product_formulations
and product_data
.
When we display the class, it lists the demographic interactions in the table of formulations and reports \(D = 4\), the dimension of the demographic draws.
[14]:
nevo_problem = pyblp.Problem(
product_formulations,
product_data,
agent_formulation,
agent_data
)
nevo_problem
[14]:
Dimensions:
=================================================
T N F I K1 K2 D MD ED
        
94 2256 5 1880 1 4 4 20 1
=================================================
Formulations:
===================================================================
Column Indices: 0 1 2 3
    
X1: Linear Characteristics prices
X2: Nonlinear Characteristics 1 prices sugar mushy
d: Demographics income income_squared age child
===================================================================
The initialized problem can be solved with Problem.solve. We’ll use the same starting values as Nevo (2000). By passing a diagonal matrix as starting values for \(\Sigma\), we’re choosing to ignore covariance terms. Similarly, zeros in the starting values for \(\Pi\) mean that those parameters will be fixed at zero.
To replicate common estimates, we’ll use the nondefault BFGS optimization routine, and we’ll configure Problem.solve to use 1step GMM instead of 2step GMM.
[15]:
initial_sigma = np.diag([0.3302, 2.4526, 0.0163, 0.2441])
initial_pi = np.array([
[ 5.4819, 0, 0.2037, 0 ],
[15.8935, 1.2000, 0, 2.6342],
[0.2506, 0, 0.0511, 0 ],
[ 1.2650, 0, 0.8091, 0 ]
])
nevo_results = nevo_problem.solve(
initial_sigma,
initial_pi,
optimization=bfgs,
method='1s'
)
nevo_results
[15]:
Problem Results Summary:
========================================================================================================================
Smallest Largest
Computation GMM Optimization Objective Fixed Point Contraction Objective Gradient Hessian Hessian
Time Step Iterations Evaluations Iterations Evaluations Value Infinity Norm Eigenvalue Eigenvalue
         
00:01:18 1 51 58 46236 143503 +4.6E+00 +6.9E06 +2.8E05 +1.6E+04
========================================================================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy  Pi: income income_squared age child
          
1 +5.6E01 +0.0E+00 +0.0E+00 +0.0E+00  1 +2.3E+00 +0.0E+00 +1.3E+00 +0.0E+00
(+1.6E01)  (+1.2E+00) (+6.3E01)

prices +3.3E+00 +0.0E+00 +0.0E+00  prices +5.9E+02 3.0E+01 +0.0E+00 +1.1E+01
(+1.3E+00)  (+2.7E+02) (+1.4E+01) (+4.1E+00)

sugar 5.8E03 +0.0E+00  sugar 3.8E01 +0.0E+00 +5.2E02 +0.0E+00
(+1.4E02)  (+1.2E01) (+2.6E02)

mushy +9.3E02  mushy +7.5E01 +0.0E+00 1.4E+00 +0.0E+00
(+1.9E01)  (+8.0E01) (+6.7E01)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices

6.3E+01
(+1.5E+01)
==========
Results are similar to those in the original paper with a GMM objective value of \(q(\hat{\theta}) = 4.56\) and a price coefficient of \(\hat{\alpha} = 62.7\).
Restricting Parameters¶
Because the interactions between price
, income
, and income_squared
are potentially collinear, we might worry that \(\hat{\Pi}_{21} = 588\) and \(\hat{\Pi}_{22} = 30.2\) are pushing the price coefficient in opposite directions. Both are large in magnitude but statistically insignficant. One way of dealing with this is to restrict \(\Pi_{22} = 0\).
There are two ways we can do this:
Change the initial \(\Pi_0\) values to make this term zero.
Change the agent formula to drop
income_squared
.
First, we’ll change the initial \(\Pi_0\) values.
[16]:
restricted_pi = initial_pi.copy()
restricted_pi[1, 1] = 0
nevo_problem.solve(
initial_sigma,
restricted_pi,
optimization=bfgs,
method='1s'
)
[16]:
Problem Results Summary:
========================================================================================================================
Smallest Largest
Computation GMM Optimization Objective Fixed Point Contraction Objective Gradient Hessian Hessian
Time Step Iterations Evaluations Iterations Evaluations Value Infinity Norm Eigenvalue Eigenvalue
         
00:00:57 1 34 41 34457 106706 +1.5E+01 +5.2E06 +4.7E02 +1.7E+04
========================================================================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================================
Sigma: 1 prices sugar mushy  Pi: income income_squared age child
          
1 +3.7E01 +0.0E+00 +0.0E+00 +0.0E+00  1 +3.1E+00 +0.0E+00 +1.2E+00 +0.0E+00
(+1.2E01)  (+1.1E+00) (+1.0E+00)

prices +1.8E+00 +0.0E+00 +0.0E+00  prices +4.2E+00 +0.0E+00 +0.0E+00 +1.2E+01
(+9.2E01)  (+4.6E+00) (+5.2E+00)

sugar 4.4E03 +0.0E+00  sugar 1.9E01 +0.0E+00 +2.8E02 +0.0E+00
(+1.2E02)  (+3.5E02) (+3.2E02)

mushy +8.6E02  mushy +1.5E+00 +0.0E+00 1.5E+00 +0.0E+00
(+1.9E01)  (+6.5E01) (+1.1E+00)
=====================================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices

3.2E+01
(+2.3E+00)
==========
Now we’ll drop both income_squared
and the corresponding column in \(\Pi_0\).
[17]:
restricted_formulation = pyblp.Formulation('0 + income + age + child')
deleted_pi = np.delete(initial_pi, 1, axis=1)
restricted_problem = pyblp.Problem(
product_formulations,
product_data,
restricted_formulation,
agent_data
)
restricted_problem.solve(
initial_sigma,
deleted_pi,
optimization=bfgs,
method='1s'
)
[17]:
Problem Results Summary:
========================================================================================================================
Smallest Largest
Computation GMM Optimization Objective Fixed Point Contraction Objective Gradient Hessian Hessian
Time Step Iterations Evaluations Iterations Evaluations Value Infinity Norm Eigenvalue Eigenvalue
         
00:00:54 1 34 41 34472 106726 +1.5E+01 +5.2E06 +4.7E02 +1.7E+04
========================================================================================================================
Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
=====================================================================================================
Sigma: 1 prices sugar mushy  Pi: income age child
         
1 +3.7E01 +0.0E+00 +0.0E+00 +0.0E+00  1 +3.1E+00 +1.2E+00 +0.0E+00
(+1.2E01)  (+1.1E+00) (+1.0E+00)

prices +1.8E+00 +0.0E+00 +0.0E+00  prices +4.2E+00 +0.0E+00 +1.2E+01
(+9.2E01)  (+4.6E+00) (+5.2E+00)

sugar 4.4E03 +0.0E+00  sugar 1.9E01 +2.8E02 +0.0E+00
(+1.2E02)  (+3.5E02) (+3.2E02)

mushy +8.6E02  mushy +1.5E+00 1.5E+00 +0.0E+00
(+1.9E01)  (+6.5E01) (+1.1E+00)
=====================================================================================================
Beta Estimates (Robust SEs in Parentheses):
==========
prices

3.2E+01
(+2.3E+00)
==========
The parameter estimates and standard errors are identical for both approaches. Based on the number of fixed point iterations, there is some evidence that the solver took a slightly different path for each problem, but both restricted problems arrived at identical answers. The \(\hat{\Pi}_{12}\) interaction term is still insignificant.