Download the Jupyter Notebook for this section: `logit_nested.ipynb`

# Logit and Nested Logit Tutorial¶

```
[1]:
```

```
import pyblp
import numpy as np
import pandas as pd
pyblp.options.digits = 2
pyblp.options.verbose = False
pyblp.__version__
```

```
[1]:
```

```
'0.12.0'
```

In this tutorial, we’ll use data from Nevo (2000) to solve the paper’s fake cereal problem. Locations of CSV files that contain the data are in the data module.

We will compare two simple models, the plain (IIA) logit model and the nested logit (GEV) model using the fake cereal dataset of Nevo (2000).

## Theory of Plain Logit¶

Let’s start with the plain logit model under independence of irrelevant alternatives (IIA). In this model (indirect) utility is given by

where \(\epsilon_{ijt}\) is distributed IID with the Type I Extreme Value (Gumbel) distribution. It is common to normalize the mean utility of the outside good to zero so that \(U_{i0t} = \epsilon_{i0t}\). This gives us aggregate market shares

If we take logs we get

and

By differencing the above we get a linear estimating equation:

Because the left hand side is data, we can estimate this model using linear IV GMM.

## Application of Plain Logit¶

A Logit Problem can be created by simply excluding the formulation for the nonlinear parameters, \(X_2\), along with any agent information. In other words, it requires only specifying the *linear component* of demand.

We’ll set up and solve a simple version of the fake data cereal problem from Nevo (2000). Since we won’t include any demand-side nonlinear characteristics or parameters, we don’t have to worry about configuring an optimization routine.

There are some important reserved variable names:

`market_ids`

are the unique market identifiers which we subscript with \(t\).`shares`

specifies the market shares which need to be between zero and one, and within a market ID, \(\sum_{j} s_{jt} \leq 1\).`prices`

are prices \(p_{jt}\). These have some special properties and are*always*treated as endogenous.`demand_instruments0`

,`demand_instruments1`

, and so on are numbered demand instruments. These represent only the*excluded*instruments. The exogenous regressors in \(X_1\) will be automatically added to the set of instruments.

We begin with two steps:

Load the product data which at a minimum consists of

`market_ids`

,`shares`

,`prices`

, and at least a single column of demand instruments,`demand_instruments0`

.Define a Formulation for the \(X_1\) (linear) demand model.

This and all other formulas are similar to R and patsy formulas.

It includes a constant by default. To exclude the constant, specify either a

`0`

or a`-1`

.To efficiently include fixed effects, use the

`absorb`

option and specify which categorical variables you would like to absorb.Some model reduction may happen automatically. The constant will be excluded if you include fixed effects and some precautions are taken against collinearity. However, you will have to make sure that differently-named variables are not collinear.

Combine the Formulation and product data to construct a Problem.

Use Problem.solve to estimate paramters.

### Loading the Data¶

The `product_data`

argument of Problem should be a structured array-like object with fields that store data. Product data can be a structured NumPy array, a pandas DataFrame, or other similar objects.

```
[2]:
```

```
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()
```

```
[2]:
```

market_ids | city_ids | quarter | product_ids | firm_ids | brand_ids | shares | prices | sugar | mushy | ... | demand_instruments10 | demand_instruments11 | demand_instruments12 | demand_instruments13 | demand_instruments14 | demand_instruments15 | demand_instruments16 | demand_instruments17 | demand_instruments18 | demand_instruments19 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

0 | C01Q1 | 1 | 1 | F1B04 | 1 | 4 | 0.012417 | 0.072088 | 2 | 1 | ... | 2.116358 | -0.154708 | -0.005796 | 0.014538 | 0.126244 | 0.067345 | 0.068423 | 0.034800 | 0.126346 | 0.035484 |

1 | C01Q1 | 1 | 1 | F1B06 | 1 | 6 | 0.007809 | 0.114178 | 18 | 1 | ... | -7.374091 | -0.576412 | 0.012991 | 0.076143 | 0.029736 | 0.087867 | 0.110501 | 0.087784 | 0.049872 | 0.072579 |

2 | C01Q1 | 1 | 1 | F1B07 | 1 | 7 | 0.012995 | 0.132391 | 4 | 1 | ... | 2.187872 | -0.207346 | 0.003509 | 0.091781 | 0.163773 | 0.111881 | 0.108226 | 0.086439 | 0.122347 | 0.101842 |

3 | C01Q1 | 1 | 1 | F1B09 | 1 | 9 | 0.005770 | 0.130344 | 3 | 0 | ... | 2.704576 | 0.040748 | -0.003724 | 0.094732 | 0.135274 | 0.088090 | 0.101767 | 0.101777 | 0.110741 | 0.104332 |

4 | C01Q1 | 1 | 1 | F1B11 | 1 | 11 | 0.017934 | 0.154823 | 12 | 0 | ... | 1.261242 | 0.034836 | -0.000568 | 0.102451 | 0.130640 | 0.084818 | 0.101075 | 0.125169 | 0.133464 | 0.121111 |

5 rows × 30 columns

The product data contains `market_ids`

, `product_ids`

, `firm_ids`

, `shares`

, `prices`

, a number of other IDs and product characteristics, and some pre-computed excluded `demand_instruments0`

, `demand_instruments1`

, and so on. The `product_ids`

will be incorporated as fixed effects.

For more information about the instruments and the example data as a whole, refer to the data module.

### Setting Up the Problem¶

We can combine the Formulation and `product_data`

to construct a Problem. We pass the Formulation first and the `product_data`

second. We can also display the properties of the problem by typing its name.

```
[3]:
```

```
logit_formulation = pyblp.Formulation('prices', absorb='C(product_ids)')
logit_formulation
```

```
[3]:
```

```
prices + Absorb[C(product_ids)]
```

```
[4]:
```

```
problem = pyblp.Problem(logit_formulation, product_data)
problem
```

```
[4]:
```

```
Dimensions:
================================
T N F K1 MD ED
--- ---- --- ---- ---- ----
94 2256 5 1 20 1
================================
Formulations:
==================================
Column Indices: 0
-------------------------- ------
X1: Linear Characteristics prices
==================================
```

Two sets of properties are displayed:

Dimensions of the data.

Formulations of the problem.

The dimensions describe the shapes of matrices as laid out in Notation. They include:

\(T\) is the number of markets.

\(N\) is the length of the dataset (the number of products across all markets).

\(F\) is the number of firms, which we won’t use in this example.

\(K_1\) is the dimension of the linear demand parameters.

\(M_D\) is the dimension of the instrument variables (excluded instruments and exogenous regressors).

\(E_D\) is the number of fixed effect dimensions (one-dimensional fixed effects, two-dimensional fixed effects, etc.).

There is only a single Formulation for this model.

\(X_1\) is the linear component of utility for demand and depends only on prices (after the fixed effects are removed).

### Solving the Problem¶

The Problem.solve method always returns a ProblemResults class, which can be used to compute post-estimation outputs. See the post estimation tutorial for more information.

```
[5]:
```

```
logit_results = problem.solve()
logit_results
```

```
[5]:
```

```
Problem Results Summary:
==========================================
GMM Objective Clipped Weighting Matrix
Step Value Shares Condition Number
---- --------- ------- ----------------
2 +1.9E+02 0 +5.7E+07
==========================================
Cumulative Statistics:
========================
Computation Objective
Time Evaluations
----------- -----------
00:00:00 2
========================
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-3.0E+01
(+1.0E+00)
==========
```

## Theory of Nested Logit¶

We can extend the logit model to allow for correlation within a group \(h\) so that

Now, we require that \(\epsilon_{jti} = \bar{\epsilon}_{h(j)ti} + (1 - \rho) \bar{\epsilon}_{jti}\) is distributed IID with the Type I Extreme Value (Gumbel) distribution. As \(\rho \rightarrow 1\), all consumers stay within their group. As \(\rho \rightarrow 0\), this collapses to the IIA logit. Note that if we wanted, we could allow \(\rho\) to differ between groups with the notation \(\rho_{h(j)}\).

This gives us aggregate market shares as the product of two logits, the within group logit and the across group logit:

where \(V_{jt} = \alpha p_{jt} + x_{jt} \beta^\text{ex} + \xi_{jt}\).

After some work we again obtain the linear estimating equation:

where \(s_{j|h(j)t} = s_{jt} / s_{h(j)t}\) and \(s_{h(j)t}\) is the share of group \(h\) in market \(t\). See Berry (1994) or Cardell (1997) for more information.

Again, the left hand side is data, though the \(\ln s_{j|h(j)t}\) is clearly endogenous which means we must instrument for it. Rather than include \(\ln s_{j|h(j)t}\) along with the linear components of utility, \(X_1\), whenever `nesting_ids`

are included in `product_data`

, \(\rho\) is treated as a nonlinear \(X_2\) parameter. This means that the linear component is given instead by

This is done for two reasons:

It forces the user to treat \(\rho\) as an endogenous parameter.

It extends much more easily to the RCNL model of Brenkers and Verboven (2006).

A common choice for an additional instrument is the number of products per nest.

## Application of Nested Logit¶

By including `nesting_ids`

(another reserved name) as a field in `product_data`

, we tell the package to estimate a nested logit model, and we don’t need to change any of the formulas. We show how to construct the category groupings in two different ways:

We put all products in a single nest (only the outside good in the other nest).

We put products into two nests (either mushy or non-mushy).

We also construct an additional instrument based on the number of products per nest. Typically this is useful as a source of exogenous variation in the within group share \(\ln s_{j|h(j)t}\). However, in this example because the number of products per nest does not vary across markets, if we include product fixed effects, this instrument is irrelevant.

We’ll define a function that constructs the additional instrument and solves the nested logit problem. We’ll exclude product ID fixed effects, which are collinear with `mushy`

, and we’ll choose \(\rho = 0.7\) as the initial value at which the optimization routine will start.

```
[6]:
```

```
def solve_nl(df):
groups = df.groupby(['market_ids', 'nesting_ids'])
df['demand_instruments20'] = groups['shares'].transform(np.size)
nl_formulation = pyblp.Formulation('0 + prices')
problem = pyblp.Problem(nl_formulation, df)
return problem.solve(rho=0.7)
```

First, we’ll solve the problem when there’s a single nest for all products, with the outside good in its own nest.

```
[7]:
```

```
df1 = product_data.copy()
df1['nesting_ids'] = 1
nl_results1 = solve_nl(df1)
nl_results1
```

```
[7]:
```

```
Problem Results Summary:
======================================================================================
GMM Objective Projected Reduced Clipped Weighting Matrix Covariance Matrix
Step Value Gradient Norm Hessian Shares Condition Number Condition Number
---- --------- ------------- -------- ------- ---------------- -----------------
2 +2.0E+02 +7.9E-10 +1.1E+04 0 +2.0E+09 +3.0E+04
======================================================================================
Cumulative Statistics:
=================================================
Computation Optimizer Optimization Objective
Time Converged Iterations Evaluations
----------- --------- ------------ -----------
00:00:05 Yes 3 8
=================================================
Rho Estimates (Robust SEs in Parentheses):
==========
All Groups
----------
+9.8E-01
(+1.4E-02)
==========
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-1.2E+00
(+4.0E-01)
==========
```

When we inspect the Problem, the only changes from the plain logit model is the additional instrument that contributes to \(M_D\) and the inclusion of \(H\), the number of nesting categories.

```
[8]:
```

```
nl_results1.problem
```

```
[8]:
```

```
Dimensions:
===============================
T N F K1 MD H
--- ---- --- ---- ---- ---
94 2256 5 1 21 1
===============================
Formulations:
==================================
Column Indices: 0
-------------------------- ------
X1: Linear Characteristics prices
==================================
```

Next, we’ll solve the problem when there are two nests for mushy and non-mushy.

```
[9]:
```

```
df2 = product_data.copy()
df2['nesting_ids'] = df2['mushy']
nl_results2 = solve_nl(df2)
nl_results2
```

```
[9]:
```

```
Problem Results Summary:
======================================================================================
GMM Objective Projected Reduced Clipped Weighting Matrix Covariance Matrix
Step Value Gradient Norm Hessian Shares Condition Number Condition Number
---- --------- ------------- -------- ------- ---------------- -----------------
2 +6.9E+02 +8.2E-09 +5.6E+03 0 +5.1E+08 +2.0E+04
======================================================================================
Cumulative Statistics:
=================================================
Computation Optimizer Optimization Objective
Time Converged Iterations Evaluations
----------- --------- ------------ -----------
00:00:06 Yes 4 13
=================================================
Rho Estimates (Robust SEs in Parentheses):
==========
All Groups
----------
+8.9E-01
(+1.9E-02)
==========
Beta Estimates (Robust SEs in Parentheses):
==========
prices
----------
-7.8E+00
(+4.8E-01)
==========
```

For both cases we find that \(\hat{\rho} > 0.8\).

Finally, we’ll also look at the adjusted parameter on prices, \(\alpha / (1-\rho)\).

```
[10]:
```

```
nl_results1.beta[0] / (1 - nl_results1.rho)
```

```
[10]:
```

```
array([[-67.39338888]])
```

```
[11]:
```

```
nl_results2.beta[0] / (1 - nl_results2.rho)
```

```
[11]:
```

```
array([[-72.27074638]])
```