{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Logit and Nested Logit Tutorial" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.7.0'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pyblp\n", "import numpy as np\n", "import pandas as pd\n", "\n", "pyblp.options.digits = 2\n", "pyblp.options.verbose = False\n", "pyblp.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this tutorial, we'll use data from [Nevo (2000)](https://pyblp.readthedocs.io/en/stable/references.html#nevo-2000) to solve the paper's fake cereal problem. Locations of CSV files that contain the data are in the [data](https://pyblp.readthedocs.io/en/stable/_api/pyblp.data.html#module-pyblp.data) module.\n", "\n", "We will compare two simple models, the plain (IIA) logit model and the nested logit (GEV) model using the fake cereal dataset of [Nevo (2000)](https://pyblp.readthedocs.io/en/stable/references.html#nevo-2000).\n", "\n", "## Theory of Plain Logit\n", "\n", "Let's start with the plain logit model under independence of irrelevant alternatives (IIA). In this model (indirect) utility is given by\n", "\n", "$$U_{jti} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt} + \\epsilon_{jti},$$\n", "\n", "where $\\epsilon_{jti}$ is distributed IID with the Type I Extreme Value (Gumbel) distribution. It is common to normalize the mean utility of the outside good to zero so that $U_{0ti} = \\epsilon_{0ti}$. 
This gives us aggregate market shares\n", "\n", "$$s_{jt} = \\frac{\\exp(\\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt})}{1 + \\sum_k \\exp(\\alpha p_{kt} + x_{kt} \\beta^x + \\xi_{kt})}.$$\n", "\n", "If we take logs we get\n", "\n", "$$\\log s_{jt} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt} - \\log \\Big(1 + \\sum_k \\exp(\\alpha p_{kt} + x_{kt} \\beta^x + \\xi_{kt})\\Big)$$\n", "\n", "and\n", "\n", "$$\\log s_{0t} = 0 - \\log \\Big(1 + \\sum_k \\exp(\\alpha p_{kt} + x_{kt} \\beta^x + \\xi_{kt})\\Big).$$\n", "\n", "By differencing the above we get a linear estimating equation:\n", "\n", "$$\\log s_{jt} - \\log s_{0t} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt}.$$\n", "\n", "Because the left-hand side is data, we can estimate this model using linear IV GMM." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Application of Plain Logit\n", "\n", "A Logit [Problem](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.html#pyblp.Problem) can be created by simply excluding the formulation for the nonlinear parameters, $X_2$, along with any agent information. In other words, it requires only specifying the _linear component_ of demand.\n", "\n", "We'll set up and solve a simple version of the fake data cereal problem from [Nevo (2000)](https://pyblp.readthedocs.io/en/stable/references.html#nevo-2000). Since we won't include any nonlinear characteristics or parameters, we don't have to worry about configuring an optimization routine.\n", "\n", "There are some important reserved variable names:\n", "\n", "- market_ids are the unique market identifiers which we subscript with $t$.\n", "- shares specifies the market shares, which must be between zero and one and must satisfy $\\sum_{j} s_{jt} \\leq 1$ within each market ID.\n", "- prices are prices $p_{jt}$. These have some special properties and are _always_ treated as endogenous.\n", "- demand_instruments0, demand_instruments1, and so on are numbered demand instruments. These represent only the _excluded_ instruments. 
The exogenous regressors in $X_1$ will be automatically added to the set of instruments.\n", "\n", "Estimation proceeds in four steps:\n", "\n", "1. Load the product data which at a minimum consists of market_ids, shares, prices, and at least a single column of demand instruments, demand_instruments0.\n", "2. Define a [Formulation](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Formulation.html#pyblp.Formulation) for the $X_1$ (linear) demand model.\n", "\n", " - This and all other formulas are similar to R and [patsy](https://patsy.readthedocs.io/en/stable/) formulas.\n", " - It includes a constant by default. To exclude the constant, specify either a 0 or a -1.\n", " - To efficiently include fixed effects, use the absorb option and specify which categorical variables you would like to absorb.\n", " - Some model reduction may happen automatically. The constant will be excluded if you include fixed effects and some precautions are taken against collinearity. However, you will have to make sure that differently-named variables are not collinear.\n", " \n", "3. Combine the [Formulation](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Formulation.html#pyblp.Formulation) and product data to construct a [Problem](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.html#pyblp.Problem).\n", "4. Use [Problem.solve](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.solve.html#pyblp.Problem.solve) to estimate parameters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loading the Data\n", "\n", "The product_data argument of [Problem](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.html#pyblp.Problem) should be a structured array-like object with fields that store data. Product data can be a structured [NumPy](https://www.numpy.org/) array, a [pandas](https://pandas.pydata.org/) DataFrame, or other similar objects." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr><th></th><th>market_ids</th><th>city_ids</th><th>quarter</th><th>product_ids</th><th>firm_ids</th><th>brand_ids</th><th>shares</th><th>prices</th><th>sugar</th><th>mushy</th><th>...</th><th>demand_instruments10</th><th>demand_instruments11</th><th>demand_instruments12</th><th>demand_instruments13</th><th>demand_instruments14</th><th>demand_instruments15</th><th>demand_instruments16</th><th>demand_instruments17</th><th>demand_instruments18</th><th>demand_instruments19</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>0</th><td>C01Q1</td><td>1</td><td>1</td><td>F1B04</td><td>1</td><td>4</td><td>0.012417</td><td>0.072088</td><td>2</td><td>1</td><td>...</td><td>2.116358</td><td>-0.154708</td><td>-0.005796</td><td>0.014538</td><td>0.126244</td><td>0.067345</td><td>0.068423</td><td>0.034800</td><td>0.126346</td><td>0.035484</td></tr>\n", "    <tr><th>1</th><td>C01Q1</td><td>1</td><td>1</td><td>F1B06</td><td>1</td><td>6</td><td>0.007809</td><td>0.114178</td><td>18</td><td>1</td><td>...</td><td>-7.374091</td><td>-0.576412</td><td>0.012991</td><td>0.076143</td><td>0.029736</td><td>0.087867</td><td>0.110501</td><td>0.087784</td><td>0.049872</td><td>0.072579</td></tr>\n", "    <tr><th>2</th><td>C01Q1</td><td>1</td><td>1</td><td>F1B07</td><td>1</td><td>7</td><td>0.012995</td><td>0.132391</td><td>4</td><td>1</td><td>...</td><td>2.187872</td><td>-0.207346</td><td>0.003509</td><td>0.091781</td><td>0.163773</td><td>0.111881</td><td>0.108226</td><td>0.086439</td><td>0.122347</td><td>0.101842</td></tr>\n", "    <tr><th>3</th><td>C01Q1</td><td>1</td><td>1</td><td>F1B09</td><td>1</td><td>9</td><td>0.005770</td><td>0.130344</td><td>3</td><td>0</td><td>...</td><td>2.704576</td><td>0.040748</td><td>-0.003724</td><td>0.094732</td><td>0.135274</td><td>0.088090</td><td>0.101767</td><td>0.101777</td><td>0.110741</td><td>0.104332</td></tr>\n", "    <tr><th>4</th><td>C01Q1</td><td>1</td><td>1</td><td>F1B11</td><td>1</td><td>11</td><td>0.017934</td><td>0.154823</td><td>12</td><td>0</td><td>...</td><td>1.261242</td><td>0.034836</td><td>-0.000568</td><td>0.102451</td><td>0.130640</td><td>0.084818</td><td>0.101075</td><td>0.125169</td><td>0.133464</td><td>0.121111</td></tr>\n", "  </tbody>\n", "</table>\n", "<p>5 rows \u00d7 30 columns</p>\n", "</div>
" ], "text/plain": [ " market_ids city_ids quarter product_ids firm_ids brand_ids shares \\\n", "0 C01Q1 1 1 F1B04 1 4 0.012417 \n", "1 C01Q1 1 1 F1B06 1 6 0.007809 \n", "2 C01Q1 1 1 F1B07 1 7 0.012995 \n", "3 C01Q1 1 1 F1B09 1 9 0.005770 \n", "4 C01Q1 1 1 F1B11 1 11 0.017934 \n", "\n", " prices sugar mushy ... demand_instruments10 \\\n", "0 0.072088 2 1 ... 2.116358 \n", "1 0.114178 18 1 ... -7.374091 \n", "2 0.132391 4 1 ... 2.187872 \n", "3 0.130344 3 0 ... 2.704576 \n", "4 0.154823 12 0 ... 1.261242 \n", "\n", " demand_instruments11 demand_instruments12 demand_instruments13 \\\n", "0 -0.154708 -0.005796 0.014538 \n", "1 -0.576412 0.012991 0.076143 \n", "2 -0.207346 0.003509 0.091781 \n", "3 0.040748 -0.003724 0.094732 \n", "4 0.034836 -0.000568 0.102451 \n", "\n", " demand_instruments14 demand_instruments15 demand_instruments16 \\\n", "0 0.126244 0.067345 0.068423 \n", "1 0.029736 0.087867 0.110501 \n", "2 0.163773 0.111881 0.108226 \n", "3 0.135274 0.088090 0.101767 \n", "4 0.130640 0.084818 0.101075 \n", "\n", " demand_instruments17 demand_instruments18 demand_instruments19 \n", "0 0.034800 0.126346 0.035484 \n", "1 0.087784 0.049872 0.072579 \n", "2 0.086439 0.122347 0.101842 \n", "3 0.101777 0.110741 0.104332 \n", "4 0.125169 0.133464 0.121111 \n", "\n", "[5 rows x 30 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)\n", "product_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The product data contains market_ids, product_ids, firm_ids, shares, prices, a number of other IDs and product characteristics, and some pre-computed excluded demand_instruments0, demand_instruments1, and so on. The product_ids will be incorporated as fixed effects. 
\n", "\n", "For more information about the instruments and the example data as a whole, refer to the [data](https://pyblp.readthedocs.io/en/stable/_api/pyblp.data.html#module-pyblp.data) module." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting Up the Problem\n", "\n", "We can combine the [Formulation](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Formulation.html#pyblp.Formulation) and product_data to construct a [Problem](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.html#pyblp.Problem). We pass the [Formulation](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Formulation.html#pyblp.Formulation) first and the product_data second. We can also display the properties of the problem by typing its name. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "prices + Absorb[C(product_ids)]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logit_formulation = pyblp.Formulation('prices', absorb='C(product_ids)')\n", "logit_formulation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dimensions:\n", "================================\n", " T N F K1 MD ED \n", "--- ---- --- ---- ---- ----\n", "94 2256 5 1 20 1 \n", "================================\n", "\n", "Formulations:\n", "=====================================\n", " Column Indices: 0 \n", "----------------------------- ------\n", " X1: Linear Characteristics prices\n", "=====================================" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "problem = pyblp.Problem(logit_formulation, product_data)\n", "problem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two sets of properties are displayed:\n", "\n", "1. Dimensions of the data.\n", "2. 
Formulations of the problem.\n", "\n", "The dimensions describe the shapes of matrices as laid out in [Notation](https://pyblp.readthedocs.io/en/stable/notation.html#notation). They include:\n", "\n", "- $T$ is the number of markets.\n", "- $N$ is the length of the dataset (the number of products across all markets).\n", "- $F$ is the number of firms, which we won't use in this example.\n", "- $K_1$ is the dimension of the linear demand parameters.\n", "- $M_D$ is the dimension of the instrument variables (excluded instruments and exogenous regressors).\n", "- $E_D$ is the number of fixed effect dimensions (one-dimensional fixed effects, two-dimensional fixed effects, etc.).\n", "\n", "There is only a single [Formulation](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Formulation.html#pyblp.Formulation) for this model. \n", "\n", "- $X_1$ is the linear component of utility for demand and depends only on prices (after the fixed effects are removed)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solving the Problem\n", "\n", "The [Problem.solve](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.solve.html#pyblp.Problem.solve) method always returns a [ProblemResults](https://pyblp.readthedocs.io/en/stable/_api/pyblp.ProblemResults.html#pyblp.ProblemResults) class, which can be used to compute post-estimation outputs. See the [post estimation](post_estimation.ipynb) tutorial for more information." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Problem Results Summary:\n", "=========================================\n", "Computation GMM Objective Objective\n", " Time Step Evaluations Value \n", "----------- ---- ----------- ---------\n", " 00:00:01 2 2 +4.2E+05 \n", "=========================================\n", "\n", "Beta Estimates (Robust SEs in Parentheses):\n", "==========\n", " prices \n", "----------\n", " -3.0E+01 \n", "(+1.0E+00)\n", "==========" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logit_results = problem.solve()\n", "logit_results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Theory of Nested Logit\n", "\n", "We can extend the logit model to allow for correlation within a group $h$ so that\n", "\n", "$$U_{jti} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt} + \\bar{\\epsilon}_{h(j)ti} + (1 - \\rho) \\bar{\\epsilon}_{jti}.$$\n", "\n", "Now, we require that $\\epsilon_{jti} = \\bar{\\epsilon}_{h(j)ti} + (1 - \\rho) \\bar{\\epsilon}_{jti}$ is distributed IID with the Type I Extreme Value (Gumbel) distribution. As $\\rho \\rightarrow 1$, all consumers stay within their group. As $\\rho \\rightarrow 0$, this collapses to the IIA logit. 
Note that if we wanted, we could allow $\\rho$ to differ between groups with the notation $\\rho_{h(j)}$.\n", "\n", "This gives us aggregate marketshares as the product of two logits, the within group logit and the across group logit:\n", "\n", "$$s_{jt} = \\frac{\\exp[V_{jt} / (1 - \\rho)]}{\\exp[V_{h(j)t} / (1 - \\rho)]}\\cdot\\frac{\\exp V_{h(j)t}}{1 + \\sum_h \\exp V_{ht}},$$\n", "\n", "where $V_{jt} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt}$.\n", "\n", "After some work we again obtain the linear estimating equation:\n", "\n", "$$\\log s_{jt} - \\log s_{0t} = \\alpha p_{jt}+ x_{jt} \\beta^x +\\rho \\log s_{j|h(j)t} + \\xi_{jt},$$\n", "\n", "where $s_{j|h(j)t} = s_{jt} / s_{h(j)t}$ and $s_{h(j)t}$ is the share of group $h$ in market $t$. See [Berry (1994)](https://pyblp.readthedocs.io/en/stable/references.html#berry-1994) or [Cardell (1997)](https://pyblp.readthedocs.io/en/stable/references.html#cardell-1997) for more information.\n", "\n", "Again, the left hand side is data, though the $\\ln s_{j|h(j)t}$ is clearly endogenous which means we must instrument for it. Rather than include $\\ln s_{j|h(j)t}$ along with the linear components of utility, $X_1$, whenever nesting_ids are included in product_data, $\\rho$ is treated as a nonlinear $X_2$ parameter. This means that the linear component is given instead by\n", "\n", "$$\\log s_{jt} - \\log s_{0t} - \\rho \\log s_{j|h(j)t} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt}.$$\n", "\n", "This is done for two reasons:\n", "\n", "1. It forces the user to treat $\\rho$ as an endogenous parameter.\n", "2. It extends much more easily to the RCNL model of [Brenkers and Verboven (2006)](https://pyblp.readthedocs.io/en/stable/references.html#brenkers-and-verboven-2006).\n", "\n", "A common choice for an additional instrument is the number of products per nest." 
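, "\n", "\n", "The \"some work\" behind the nested logit estimating equation can be sketched as follows. This is a reader's sketch rather than part of the original derivation: it assumes that $V_{h(j)t}$ denotes the inclusive value of group $h$, that is, $V_{ht} = (1 - \\rho) \\log D_{ht}$ with $D_{ht} = \\sum_{k \\in h} \\exp[V_{kt} / (1 - \\rho)]$, as in [Berry (1994)](https://pyblp.readthedocs.io/en/stable/references.html#berry-1994). The symbol $D_{ht}$ is introduced here only for illustration.\n", "\n", "Under that assumption, the within group share, the group share, and the outside share are\n", "\n", "$$s_{j|h(j)t} = \\frac{\\exp[V_{jt} / (1 - \\rho)]}{D_{h(j)t}}, \\quad s_{h(j)t} = \\frac{D_{h(j)t}^{1 - \\rho}}{1 + \\sum_h D_{ht}^{1 - \\rho}}, \\quad s_{0t} = \\frac{1}{1 + \\sum_h D_{ht}^{1 - \\rho}}.$$\n", "\n", "Taking logs and using $s_{jt} = s_{j|h(j)t} \\cdot s_{h(j)t}$ gives\n", "\n", "$$\\log s_{jt} - \\log s_{0t} = \\frac{V_{jt}}{1 - \\rho} - \\log D_{h(j)t} + (1 - \\rho) \\log D_{h(j)t} = \\frac{V_{jt}}{1 - \\rho} - \\rho \\log D_{h(j)t}.$$\n", "\n", "The within group share implies $\\log D_{h(j)t} = V_{jt} / (1 - \\rho) - \\log s_{j|h(j)t}$. Substituting this in,\n", "\n", "$$\\log s_{jt} - \\log s_{0t} = \\frac{V_{jt}}{1 - \\rho} - \\frac{\\rho V_{jt}}{1 - \\rho} + \\rho \\log s_{j|h(j)t} = V_{jt} + \\rho \\log s_{j|h(j)t},$$\n", "\n", "which is the linear estimating equation once $V_{jt} = \\alpha p_{jt} + x_{jt} \\beta^x + \\xi_{jt}$ is written out."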
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Application of Nested Logit\n", "\n", "By including nesting_ids (another reserved name) as a field in product_data, we tell the package to estimate a nested logit model, and we don't need to change any of the formulas. We show how to construct the category groupings in two different ways:\n", "\n", "1. We put all products in a single nest (only the outside good in the other nest).\n", "2. We put products into two nests (either mushy or non-mushy).\n", "\n", "We also construct an additional instrument based on the number of products per nest. Typically this is useful as a source of exogenous variation in the within group share $\\ln s_{j|h(j)t}$. However, in this example because the number of products per nest does not vary across markets, if we include product fixed effects, this instrument is irrelevant.\n", "\n", "We'll define a function that constructs the additional instrument and solves the nested logit problem. We'll exclude product ID fixed effects, which are collinear with mushy, and we'll choose $\\rho = 0.7$ as the initial value at which the optimization routine will start." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def solve_nl(df):\n", " groups = df.groupby(['market_ids', 'nesting_ids'])\n", " df['demand_instruments20'] = groups['shares'].transform(np.size)\n", " nl_formulation = pyblp.Formulation('0 + prices')\n", " problem = pyblp.Problem(nl_formulation, df)\n", " return problem.solve(rho=0.7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we'll solve the problem when there's a single nest for all products, with the outside good in its own nest." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Problem Results Summary:\n", "==================================================================================\n", "Computation GMM Optimization Objective Objective Gradient Hessian \n", " Time Step Iterations Evaluations Value Infinity Norm Eigenvalue\n", "----------- ---- ------------ ----------- --------- ------------- ----------\n", " 00:00:11 2 3 8 +4.6E+05 +3.1E-06 +2.4E+07 \n", "==================================================================================\n", "\n", "Rho Estimates (Robust SEs in Parentheses):\n", "==========\n", "All Groups\n", "----------\n", " +9.8E-01 \n", "(+1.4E-02)\n", "==========\n", "\n", "Beta Estimates (Robust SEs in Parentheses):\n", "==========\n", " prices \n", "----------\n", " -1.2E+00 \n", "(+4.0E-01)\n", "==========" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 = product_data.copy()\n", "df1['nesting_ids'] = 1\n", "nl_results1 = solve_nl(df1)\n", "nl_results1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we inspect the [Problem](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Problem.html#pyblp.Problem), the only changes from the plain logit model are the additional instrument that contributes to $M_D$ and the inclusion of $H$, the number of nesting categories."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dimensions:\n", "===============================\n", " T N F K1 MD H \n", "--- ---- --- ---- ---- ---\n", "94 2256 5 1 21 1 \n", "===============================\n", "\n", "Formulations:\n", "=====================================\n", " Column Indices: 0 \n", "----------------------------- ------\n", " X1: Linear Characteristics prices\n", "=====================================" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl_results1.problem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we'll solve the problem when there are two nests for mushy and non-mushy." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Problem Results Summary:\n", "==================================================================================\n", "Computation GMM Optimization Objective Objective Gradient Hessian \n", " Time Step Iterations Evaluations Value Infinity Norm Eigenvalue\n", "----------- ---- ------------ ----------- --------- ------------- ----------\n", " 00:00:13 2 3 8 +1.6E+06 +9.4E-06 +1.3E+07 \n", "==================================================================================\n", "\n", "Rho Estimates (Robust SEs in Parentheses):\n", "==========\n", "All Groups\n", "----------\n", " +8.9E-01 \n", "(+1.9E-02)\n", "==========\n", "\n", "Beta Estimates (Robust SEs in Parentheses):\n", "==========\n", " prices \n", "----------\n", " -7.8E+00 \n", "(+4.8E-01)\n", "==========" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = product_data.copy()\n", "df2['nesting_ids'] = df2['mushy']\n", "nl_results2 = solve_nl(df2)\n", "nl_results2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For both cases we find that $\\hat{\\rho} > 0.8$.\n", "\n", "Finally, we'll also look at the adjusted 
parameter on prices, $\\alpha / (1-\\rho)$." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-67.39338888]])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl_results1.beta / (1 - nl_results1.rho)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-72.27074638]])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl_results2.beta / (1 - nl_results2.rho)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Treating Within Group Shares as Exogenous\n", "\n", "The package is designed to prevent the user from treating the within group share, $\\log s_{j|h(j)t}$, as an exogenous variable. For example, if we were to compute a group_share variable and use the algebraic functionality of [Formulation](https://pyblp.readthedocs.io/en/stable/_api/pyblp.Formulation.html#pyblp.Formulation) by including the expression log(shares / group_share) in our formula for $X_1$, the package would raise an error because the package knows that shares should not be included in this formulation.\n", "\n", "To demonstrate why this is a bad idea, we'll override this feature by calculating $\\log s_{j|h(j)t}$ and including it as an additional variable in $X_1$. To do so, we'll first re-define our function for setting up and solving the nested logit problem." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def solve_nl2(df):\n", " groups = df.groupby(['market_ids', 'nesting_ids'])\n", " df['group_share'] = groups['shares'].transform(np.sum)\n", " df['within_share'] = df['shares'] / df['group_share']\n", " df['demand_instruments20'] = groups['shares'].transform(np.size)\n", " nl2_formulation = pyblp.Formulation('0 + prices + log(within_share)')\n", " problem = pyblp.Problem(nl2_formulation, df.drop(columns=['nesting_ids']))\n", " return problem.solve()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, we'll solve the problem when there's a single nest for all products, with the outside good in its own nest." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Problem Results Summary:\n", "=========================================\n", "Computation GMM Objective Objective\n", " Time Step Evaluations Value \n", "----------- ---- ----------- ---------\n", " 00:00:01 2 2 +4.6E+05 \n", "=========================================\n", "\n", "Beta Estimates (Robust SEs in Parentheses):\n", "=============================\n", " prices log(within_share)\n", "---------- -----------------\n", " -1.0E+00 +9.9E-01 \n", "(+2.4E-01) (+7.9E-03) \n", "=============================" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl2_results1 = solve_nl2(df1)\n", "nl2_results1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And again, we'll solve the problem when there are two nests for mushy and non-mushy." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Problem Results Summary:\n", "=========================================\n", "Computation GMM Objective Objective\n", " Time Step Evaluations Value \n", "----------- ---- ----------- ---------\n", " 00:00:01 2 2 +1.6E+06 \n", "=========================================\n", "\n", "Beta Estimates (Robust SEs in Parentheses):\n", "=============================\n", " prices log(within_share)\n", "---------- -----------------\n", " -6.8E+00 +9.3E-01 \n", "(+2.9E-01) (+1.1E-02) \n", "=============================" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl2_results2 = solve_nl2(df2)\n", "nl2_results2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We obtain parameter estimates that are quite different from those above. This is clearest in the adjusted parameter on prices, $\\alpha / (1-\\rho)$, where the coefficient on $\\log s_{j|h(j)t}$ (the second element of beta) now plays the role of $\\rho$." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-86.37368446])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl2_results1.beta[0] / (1 - nl2_results1.beta[1])" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-100.14496892])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nl2_results2.beta[0] / (1 - nl2_results2.beta[1])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }