pyblp.Problem.solve

Problem.solve(sigma=None, pi=None, rho=None, beta=None, gamma=None, sigma_bounds=None, pi_bounds=None, rho_bounds=None, beta_bounds=None, gamma_bounds=None, delta=None, W=None, method='2s', optimization=None, check_optimality='both', error_behavior='revert', error_punishment=1, delta_behavior='first', iteration=None, fp_type='safe_linear', costs_type='linear', costs_bounds=None, center_moments=True, W_type='robust', se_type='robust')
Solve the problem.
The problem is solved in one or more GMM steps. During each step, any parameters in \(\hat{\theta}\) are optimized to minimize the GMM objective value. If there are no parameters in \(\hat{\theta}\) (for example, in the logit model there are no nonlinear parameters and all linear parameters can be concentrated out), the objective is evaluated once during the step.
If there are nonlinear parameters, the mean utility, \(\delta(\hat{\theta})\), is computed market-by-market with fixed point iteration. Otherwise, it is computed analytically according to the solution of the logit model. If a supply side is to be estimated, marginal costs, \(c(\hat{\theta})\), are also computed market-by-market. Linear parameters are then estimated, which are used to recover structural error terms, which in turn are used to form the objective value. By default, the objective gradient is computed as well.
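The market-by-market fixed point computation can be illustrated with a minimal, pure-Python sketch. It is not how pyblp implements the routine. It assumes a single market, a small hypothetical matrix mu of agent-specific utility deviations with equal agent weights, and plain (unaccelerated) iteration of the standard contraction \(\delta \leftarrow \delta + \log s - \log s(\delta)\), started at the logit solution \(\log s_j - \log s_0\). All function and variable names are illustrative.

```python
import math


def predicted_shares(delta, mu):
    """Average logit choice probabilities over agents (equal weights)."""
    shares = [0.0] * len(delta)
    for mu_i in mu:  # one row of utility deviations per agent
        exp_v = [math.exp(d + m) for d, m in zip(delta, mu_i)]
        denom = 1.0 + sum(exp_v)  # the outside good has utility zero
        for j, e in enumerate(exp_v):
            shares[j] += e / denom / len(mu)
    return shares


def solve_delta(observed, mu, atol=1e-12, max_iterations=5000):
    """Iterate delta <- delta + log(s) - log(s(delta)) until the step is tiny."""
    outside = 1.0 - sum(observed)
    # Start at the logit solution, log(s_j) - log(s_0).
    delta = [math.log(s) - math.log(outside) for s in observed]
    for _ in range(max_iterations):
        predicted = predicted_shares(delta, mu)
        step = [math.log(s) - math.log(p) for s, p in zip(observed, predicted)]
        delta = [d + c for d, c in zip(delta, step)]
        if max(abs(c) for c in step) < atol:
            return delta
    raise RuntimeError("fixed point iteration did not converge")


# Hypothetical market: two products, three agents.
true_delta = [0.5, -0.3]
mu = [[0.4, -0.2], [-0.1, 0.3], [0.0, 0.0]]
observed = predicted_shares(true_delta, mu)
recovered = solve_delta(observed, mu)
```

Here the observed shares are generated from a known mean utility, so recovered should match true_delta up to the tolerance. When every row of mu is zero, the starting values already solve the problem, which mirrors the analytic logit case described above.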
Note
This method supports parallel() processing. If multiprocessing is used, market-by-market computation of \(\delta(\hat{\theta})\) (and \(\tilde{c}(\hat{\theta})\) if a supply side is estimated), along with associated Jacobians, will be distributed among the processes.
Parameters
sigma (array-like, optional) –
Configuration for which elements in the Cholesky root of the covariance matrix for unobserved taste heterogeneity, \(\Sigma\), are fixed at zero and starting values for the other elements, which, if not fixed by sigma_bounds, are in the vector of unknown elements, \(\theta\).
Rows and columns correspond to columns in \(X_2\), which is formulated according to product_formulations in Problem. If \(X_2\) was not formulated, this should not be specified, since the logit model will be estimated.
Values below the diagonal are ignored. Zeros are assumed to be zero throughout estimation and nonzeros are, if not fixed by sigma_bounds, starting values for unknown elements in \(\theta\). If any columns are fixed at zero, only the first few columns of integration nodes (specified in Problem) will be used.
pi (array-like, optional) –
Configuration for which elements in the matrix of parameters that measures how agent tastes vary with demographics, \(\Pi\), are fixed at zero and starting values for the other elements, which, if not fixed by pi_bounds, are in the vector of unknown elements, \(\theta\).
Rows correspond to the same product characteristics as in sigma. Columns correspond to columns in \(d\), which is formulated according to agent_formulation in Problem. If \(d\) was not formulated, this should not be specified.
Zeros are assumed to be zero throughout estimation and nonzeros are, if not fixed by pi_bounds, starting values for unknown elements in \(\theta\).
rho (array-like, optional) –
Configuration for which elements in the vector of parameters that measure within nesting group correlation, \(\rho\), are fixed at zero and starting values for the other elements, which, if not fixed by rho_bounds, are in the vector of unknown elements, \(\theta\).
If this is a scalar, it corresponds to all groups defined by the nesting_ids field of product_data in Problem. If this is a vector, it must have \(H\) elements, one for each nesting group. Elements correspond to group IDs in the sorted order of Problem.unique_nesting_ids. If nesting IDs were not specified, this should not be specified either.
Zeros are assumed to be zero throughout estimation and nonzeros are, if not fixed by rho_bounds, starting values for unknown elements in \(\theta\).
beta (array-like, optional) –
Configuration for which elements in the vector of demand-side linear parameters, \(\beta\), are concentrated out of the problem. Usually, this is left unspecified, unless there is a supply side, in which case parameters on endogenous product characteristics cannot be concentrated out of the problem. Values specify which elements are fixed at zero and starting values for the other elements, which, if not fixed by beta_bounds, are in the vector of unknown elements, \(\theta\).
Elements correspond to columns in \(X_1\), which is formulated according to product_formulations in Problem.
Both None and numpy.nan indicate that the parameter should be concentrated out of the problem. That is, it will be estimated, but does not have to be included in \(\theta\). Zeros are assumed to be zero throughout estimation and nonzeros are, if not fixed by beta_bounds, starting values for unknown elements in \(\theta\).
gamma (array-like, optional) –
Configuration for which elements in the vector of supply-side linear parameters, \(\gamma\), are concentrated out of the problem. Usually, this is left unspecified. Values specify which elements are fixed at zero and starting values for the other elements, which, if not fixed by gamma_bounds, are in the vector of unknown elements, \(\theta\).
Elements correspond to columns in \(X_3\), which is formulated according to product_formulations in Problem. If \(X_3\) was not formulated, this should not be specified.
Both None and numpy.nan indicate that the parameter should be concentrated out of the problem. That is, it will be estimated, but does not have to be included in \(\theta\). Zeros are assumed to be zero throughout estimation and nonzeros are, if not fixed by gamma_bounds, starting values for unknown elements in \(\theta\).
sigma_bounds (tuple, optional) –
Configuration for \(\Sigma\) bounds of the form (lb, ub), in which both lb and ub are of the same size as sigma. Each element in lb and ub determines the lower and upper bound for its counterpart in sigma. If optimization does not support bounds, these will be ignored.
By default, if bounds are supported, the diagonal of sigma is bounded from below by zero. Conditional on \(X_2\), \(\mu\), and an initial estimate of \(\mu\), default bounds for off-diagonal parameters are chosen to reduce the need for overflow safety precautions.
Values below the diagonal are ignored. Lower and upper bounds corresponding to zeros in sigma are set to zero. Setting a lower bound equal to an upper bound fixes the corresponding element, removing it from \(\theta\). Both None and numpy.nan are converted to -numpy.inf in lb and to numpy.inf in ub.
pi_bounds (tuple, optional) –
Configuration for \(\Pi\) bounds of the form (lb, ub), in which both lb and ub are of the same size as pi. Each element in lb and ub determines the lower and upper bound for its counterpart in pi. If optimization does not support bounds, these will be ignored.
By default, if bounds are supported, conditional on \(X_2\), \(d\), and an initial estimate of \(\mu\), default bounds are chosen to reduce the need for overflow safety precautions.
Lower and upper bounds corresponding to zeros in pi are set to zero. Setting a lower bound equal to an upper bound fixes the corresponding element, removing it from \(\theta\). Both None and numpy.nan are converted to -numpy.inf in lb and to numpy.inf in ub.
rho_bounds (tuple, optional) –
Configuration for \(\rho\) bounds of the form (lb, ub), in which both lb and ub are of the same size as rho. Each element in lb and ub determines the lower and upper bound for its counterpart in rho. If optimization does not support bounds, these will be ignored.
By default, if bounds are supported, all elements are bounded from below by 0, which corresponds to the simple logit model. Conditional on an initial estimate of \(\mu\), upper bounds are chosen to reduce the need for overflow safety precautions, and are less than 1 because larger values are inconsistent with utility maximization.
Lower and upper bounds corresponding to zeros in rho are set to zero. Setting a lower bound equal to an upper bound fixes the corresponding element, removing it from \(\theta\). Both None and numpy.nan are converted to -numpy.inf in lb and to numpy.inf in ub.
beta_bounds (tuple, optional) –
Configuration for \(\beta\) bounds of the form (lb, ub), in which both lb and ub are of the same size as beta. Each element in lb and ub determines the lower and upper bound for its counterpart in beta. If optimization does not support bounds, these will be ignored.
Usually, this is left unspecified, unless there is a supply side, in which case parameters on endogenous product characteristics cannot be concentrated out of the problem. It is generally a good idea to constrain such parameters to be nonzero so that the intra-firm Jacobian of shares with respect to prices does not become singular.
By default, all non-concentrated out parameters are unbounded. Bounds should only be specified for parameters that are included in \(\theta\); that is, those with initial values specified in beta.
Lower and upper bounds corresponding to zeros in beta are set to zero. Setting a lower bound equal to an upper bound fixes the corresponding element, removing it from \(\theta\). Both None and numpy.nan are converted to -numpy.inf in lb and to numpy.inf in ub.
gamma_bounds (tuple, optional) –
Configuration for \(\gamma\) bounds of the form (lb, ub), in which both lb and ub are of the same size as gamma. Each element in lb and ub determines the lower and upper bound for its counterpart in gamma. If optimization does not support bounds, these will be ignored.
By default, all non-concentrated out parameters are unbounded. Bounds should only be specified for parameters that are included in \(\theta\); that is, those with initial values specified in gamma.
Lower and upper bounds corresponding to zeros in gamma are set to zero. Setting a lower bound equal to an upper bound fixes the corresponding element, removing it from \(\theta\). Both None and numpy.nan are converted to -numpy.inf in lb and to numpy.inf in ub.
delta (array-like, optional) – Initial values for the mean utility, \(\delta\). If there are any nonlinear parameters, these are the values at which the fixed point iteration routine will start during the first objective evaluation. By default, the solution to the logit model in (36) is used. If \(\rho\) is specified, the solution to the nested logit model in (37) under the initial rho is used instead.
W (array-like, optional) – Starting values for the weighting matrix, \(W\). By default, the 2SLS weighting matrix in (23) is used.
method (str, optional) –
The estimation routine that will be used. The following methods are supported:
'1s' - One-step GMM.
'2s' (default) - Two-step GMM.
Iterated GMM can be manually implemented by executing single GMM steps in a loop, in which after the first iteration, nonlinear parameters and weighting matrices from the last ProblemResults are passed as arguments.
optimization (Optimization, optional) –
Optimization configuration for how to solve the optimization problem in each GMM step, which is only used if there are unfixed nonlinear parameters over which to optimize. By default, Optimization('l-bfgs-b') is used. If available, Optimization('knitro') may be preferable. Generally, it is recommended to consider a number of different optimization routines and starting values, verifying that \(\hat{\theta}\) satisfies both the first and second order conditions. Routines that do not support bounds will ignore sigma_bounds and pi_bounds. Choosing a routine that does not use analytic gradients will often slow down estimation.
check_optimality (str, optional) –
How to check for optimality (first and second order conditions) after the optimization routine finishes. The following configurations are supported:
'gradient' - Analytically compute the gradient after optimization finishes, but do not compute the Hessian. Since Jacobians needed to compute standard errors will already be computed, gradient computation will not take a long time. This option may be useful if Hessian computation takes a long time when, for example, there are a large number of parameters.
'both' (default) - Also compute the Hessian with central finite differences after optimization finishes. Specifically, analytically compute the gradient \(2P\) times, perturbing each of the \(P\) parameters by \(\pm\epsilon / 2\) where \(\epsilon\) is the square root of the machine precision.
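The central finite difference scheme described for 'both' can be sketched in a few lines of pure Python. This is an illustration under stated assumptions rather than pyblp's implementation: the gradient callable, the example objective, and all names are hypothetical, and the step \(\epsilon\) is taken to be the square root of double-precision machine epsilon.

```python
import sys


def finite_difference_hessian(gradient, theta):
    """Approximate the Hessian by central differences of an analytic gradient."""
    epsilon = sys.float_info.epsilon ** 0.5  # square root of machine precision
    count = len(theta)
    hessian = [[0.0] * count for _ in range(count)]
    for p in range(count):
        upper = list(theta)
        lower = list(theta)
        upper[p] += epsilon / 2
        lower[p] -= epsilon / 2
        # Two gradient evaluations per parameter, so 2P evaluations in total.
        g_upper = gradient(upper)
        g_lower = gradient(lower)
        for row in range(count):
            hessian[row][p] = (g_upper[row] - g_lower[row]) / epsilon
    return hessian


def example_gradient(theta):
    """Gradient of the hypothetical objective x**2 + 3*x*y + 2*y**2."""
    x, y = theta
    return [2 * x + 3 * y, 3 * x + 4 * y]


hessian = finite_difference_hessian(example_gradient, [0.7, -1.2])
```

For this quadratic objective the result should be close to [[2, 3], [3, 4]], up to floating point rounding. The cost grows linearly in the number of parameters, which is why the 'gradient' option can be attractive for large problems.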
error_behavior (str, optional) –
How to handle any errors. For example, there can sometimes be overflow or underflow when computing \(\delta(\hat{\theta})\) at a large \(\hat{\theta}\). The following behaviors are supported:
'revert' (default) - Revert problematic \(\delta(\hat{\theta})\) elements to their last computed values and use reverted values to compute \(\frac{\partial\xi}{\partial\theta}\), and, if there is a supply side, to compute both \(\tilde{c}(\hat{\theta})\) and \(\frac{\partial\omega}{\partial\theta}\) as well. If there are problematic elements in \(\frac{\partial\xi}{\partial\theta}\), \(\tilde{c}(\hat{\theta})\), or \(\frac{\partial\omega}{\partial\theta}\), revert these to their last computed values as well. If there are problematic elements after the first objective evaluation, revert values in \(\delta(\hat{\theta})\) to their starting values; in \(\tilde{c}(\hat{\theta})\), to prices; and in Jacobians, to zeros. In the unlikely event that the objective or its gradient have problematic elements, revert them as well, and if this happens during the first objective evaluation, revert the objective to 1e10 and its gradient to zeros.
'punish' - Set the objective to 1 and its gradient to all zeros. This option along with a large error_punishment can be helpful for routines that do not use analytic gradients.
'raise' - Raise an exception.
error_punishment (float, optional) – How to scale the GMM objective value after an error. By default, the objective value is not scaled.
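To make the interaction between 'punish' and error_punishment concrete, the following pure-Python sketch wraps an objective so that a numerical error produces a punished value. It is a loose illustration, not pyblp's code: the wrapper, the example objective, and the assumption that the punished objective equals 1 multiplied by error_punishment are all hypothetical readings of the descriptions above.

```python
import math


def punish_on_error(objective, parameter_count, error_punishment=1.0):
    """Wrap an objective so numerical errors yield a punished value."""
    def wrapped(theta):
        try:
            value, gradient = objective(theta)
            if not math.isfinite(value):
                raise ArithmeticError("non-finite objective")
            return value, gradient
        except ArithmeticError:  # OverflowError is a subclass
            # Assumed behavior: objective of 1 scaled by the punishment,
            # with a gradient of all zeros.
            return 1.0 * error_punishment, [0.0] * parameter_count
    return wrapped


def fragile_objective(theta):
    """Hypothetical objective that overflows for large parameter values."""
    value = math.exp(theta[0]) ** 2
    return value, [2.0 * value]


safe_objective = punish_on_error(fragile_objective, 1, error_punishment=1e6)
small = safe_objective([0.0])     # evaluates normally
large = safe_objective([1000.0])  # math.exp overflows, so the value is punished
```

A large punished value steers a derivative-free routine away from the problematic region, while the zero gradient gives a gradient-based routine no direction to follow, which is consistent with the advice that this option suits routines that do not use analytic gradients.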
delta_behavior (str, optional) –
Configuration for the values at which the fixed point computation of \(\delta(\hat{\theta})\) in each market will start. This configuration is only relevant if there are unfixed nonlinear parameters over which to optimize. The following behaviors are supported:
'first' (default) - Start at the values configured by delta during the first GMM step, and at the values computed by the last GMM step for each subsequent step.
'last' - Start at the values of \(\delta(\hat{\theta})\) computed during the last objective evaluation, or, if this is the first evaluation, at the values configured by delta. This behavior tends to speed up computation but may introduce some instability into estimation.
iteration (Iteration, optional) –
Iteration configuration for how to solve the fixed point problem used to compute \(\delta(\hat{\theta})\) in each market. This configuration is only relevant if there are nonlinear parameters, since \(\delta\) can be estimated analytically in the logit model. By default, Iteration('squarem', {'atol': 1e-14}) is used. Newton-based routines such as Iteration('lm') that compute the Jacobian can often be faster (especially when there are nesting parameters), but the non-Jacobian SQUAREM routine is used by default because its speed is often comparable and in practice it can be slightly more stable.
fp_type (str, optional) –
Configuration for the type of contraction mapping used to compute \(\delta(\hat{\theta})\). The following types are supported:
'safe_linear' (default) - The standard linear contraction mapping in (13) (or (35) when there is nesting) with safeguards against numerical overflow. Specifically, \(\max_j V_{jti}\) (or \(\max_j V_{jti} / (1 - \rho_{h(j)})\) when there is nesting) is subtracted from \(V_{jti}\) and the logit expression for choice probabilities in (5) (or (33)) is rescaled accordingly. Such rescaling is known as the log-sum-exp trick.
'linear' - The standard linear contraction mapping without safeguards against numerical overflow. This option may be preferable to 'safe_linear' if utilities are reasonably small and unlikely to create overflow problems.
'nonlinear' - Iteration over \(\exp(\delta_{jt})\) instead of \(\delta_{jt}\). This can be faster than 'linear' because it involves fewer logarithms. Also, following Brunner, Heiss, Romahn, and Weiser (2017), the \(\exp(\delta_{jt})\) term can be cancelled out of the expression because it also appears in the numerator of (5) in the definition of \(s_{jt}(\delta, \hat{\theta})\). This second trick only works when there are no nesting parameters.
'safe_nonlinear' - Exponentiated version with minimal safeguards against numerical overflow. Specifically, \(\max_j \mu_{jti}\) is subtracted from \(\mu_{jti}\). This helps with stability but is less helpful than subtracting from the full \(V_{jti}\), so this version is less stable than 'safe_linear'.
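The rescaling behind 'safe_linear' can be illustrated with a short pure-Python sketch of the log-sum-exp trick for one agent's choice probabilities, without nesting. The function names and utility values are hypothetical, and the outside good is assumed to have utility zero. This is not pyblp's implementation.

```python
import math


def naive_probabilities(utilities):
    """Logit choice probabilities with an outside good of utility zero."""
    exp_v = [math.exp(v) for v in utilities]
    denom = 1.0 + sum(exp_v)
    return [e / denom for e in exp_v]


def safe_probabilities(utilities):
    """Same probabilities after subtracting the largest utility."""
    shift = max(utilities)
    exp_v = [math.exp(v - shift) for v in utilities]
    # The outside good's exp(0) becomes exp(-shift) after rescaling.
    denom = math.exp(-shift) + sum(exp_v)
    return [e / denom for e in exp_v]


large = [1000.0, 999.0]
try:
    naive_probabilities(large)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True  # math.exp(1000.0) exceeds the float range

safe = safe_probabilities(large)
moderate = [1.0, -0.5]
```

With moderate utilities both functions agree, so the safeguard changes only numerical behavior, not the probabilities. The sketch shifts by the largest inside utility, as in the description above. If every utility were extremely negative, exp(-shift) could itself overflow, a case this sketch does not handle.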
This option is only relevant if sigma or pi are specified because \(\delta\) can be estimated analytically in the logit model with (36) and in the nested logit model with (37).
costs_type (str, optional) –
Specification of the marginal cost function \(\tilde{c} = f(c)\) in (9). The following specifications are supported:
'linear' (default) - Linear specification: \(\tilde{c} = c\).
'log' - Log-linear specification: \(\tilde{c} = \log c\).
This specification is only relevant if \(X_3\) was formulated by product_formulations in Problem.
costs_bounds (tuple, optional) –
Configuration for \(c\) bounds of the form (lb, ub), in which both lb and ub are floats. This is only relevant if \(X_3\) was formulated by product_formulations in Problem. By default, marginal costs are unbounded.
When costs_type is 'log', nonpositive \(c(\hat{\theta})\) values can create problems when computing \(\tilde{c}(\hat{\theta}) = \log c(\hat{\theta})\). One solution is to set lb to a small positive number. Rows in Jacobians associated with clipped marginal costs will be zero.
Both None and numpy.nan are converted to -numpy.inf in lb and to numpy.inf in ub.
center_moments (bool, optional) – Whether to center each column of the sample moments \(g\) before updating the weighting matrix \(W\). By default, sample moments are centered. This has no effect if W_type is 'unadjusted'.
W_type (str, optional) –
How to update the weighting matrix. This has no effect if method is '1s'. Often, se_type should be the same. The following types are supported:
'robust' (default) - Heteroskedasticity-robust weighting matrix defined in (24) and (25).
'clustered' - Clustered weighting matrix defined in (24) and (26). Clusters must be defined by the clustering_ids field of product_data in Problem.
'unadjusted' - Homoskedastic weighting matrix defined in (24) and (28).
se_type (str, optional) –
How to compute standard errors. Typically, W_type should be the same. The following types are supported:
'robust' (default) - Heteroskedasticity-robust standard errors defined in (29) and (25).
'clustered' - Clustered standard errors defined in (29) and (26). Clusters must be defined by the clustering_ids field of product_data in Problem.
'unadjusted' - Homoskedastic standard errors defined in (30), which are computed under the assumption that the weighting matrix is optimal.
Returns
ProblemResults of the solved problem.
Return type
ProblemResults
Examples