pyblp.Formulation
class pyblp.Formulation(formula, absorb=None, absorb_method=None, absorb_options=None)
Configuration for designing matrices and absorbing fixed effects.
Internally, the patsy package is used to convert data and R-style formulas into matrices. All of the standard binary operators can be used to design complex matrices of factor interactions:
+
- Set union of terms.
-
- Set difference of terms.
*
- Short-hand. The formula a * b is the same as a + b + a:b.
/
- Short-hand. The formula a / b is the same as a + a:b.
:
- Interactions between two sets of terms.
**
- Interactions up to an integer degree.
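The operator shorthands above can be checked with a toy model of the term algebra. The sketch below is not how patsy is implemented; it assumes a formula is a set of terms and each term is a set of factor names. All helper names (term, interact, star, slash, power, show) are hypothetical and exist only for this illustration.

```python
def term(*factors):
    """A term is modelled as a frozenset of factor names (toy assumption)."""
    return frozenset(factors)


def interact(left, right):
    """':' pairs every term on the left with every term on the right."""
    return {s | t for s in left for t in right}


def star(left, right):
    """'*' is short-hand: a * b is a + b + a:b."""
    return left | right | interact(left, right)


def slash(left, right):
    """'/' is short-hand: a / b is a + a:b.

    Only the single-term case a / b described above is modelled here.
    """
    return left | interact(left, right)


def power(terms, degree):
    """'**' collects interactions among the terms up to an integer degree."""
    result = set(terms)
    current = set(terms)
    for _ in range(degree - 1):
        current = interact(current, terms)
        result |= current
    return result


def show(terms):
    """Render a set of terms as a formula string, lowest order first."""
    names = (":".join(sorted(t)) for t in terms)
    return " + ".join(sorted(names, key=lambda s: (s.count(":"), s)))


a, b, c = {term("a")}, {term("b")}, {term("c")}

print(show(a | b))                # '+' is set union: a + b
print(show((a | b | c) - b))      # '-' is set difference: a + c
print(show(star(a, b)))           # a + b + a:b
print(show(slash(a, b)))          # a + a:b
print(show(power(a | b | c, 2)))  # a + b + c + a:b + a:c + b:c
```

Because terms are sets, interacting a factor with itself collapses back to the same term, which is why the degree-2 expansion contains no a:a entry.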
However, since factors need to be differentiated (for example, when computing elasticities), only the most essential functions are supported:
C
- Mark a variable as categorical. See patsy.builtins.C(). Arguments are not supported.
I
- Encapsulate mathematical operations. See patsy.builtins.I().
log
- Natural logarithm function.
exp
- Natural exponential function.
Data associated with variables should generally already be transformed. However, when encapsulated by I(), these operators function like normal mathematical operators on numeric variables: + adds, - subtracts, * multiplies, / divides, and ** exponentiates. Internally, mathematical operations are parsed and evaluated by the SymPy package, which is also used to symbolically differentiate terms when derivatives are needed.
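To make the distinction concrete, the following plain-Python sketch computes by hand the columns that terms such as I(prices / income), I(prices ** 2), and log(prices) would stand for, and checks the derivative of the squared term numerically. It does not call pyblp or SymPy; the variable names prices and income and the helper central_difference are assumptions made for this illustration.

```python
import math

# Hypothetical numeric data for two variables.
prices = [1.0, 2.0, 4.0]
income = [2.0, 4.0, 16.0]

# Inside I(), '/' divides and '**' exponentiates, element by element.
ratio = [p / y for p, y in zip(prices, income)]   # I(prices / income)
squared = [p ** 2 for p in prices]                # I(prices ** 2)

# log is the natural logarithm.
log_prices = [math.log(p) for p in prices]        # log(prices)


def central_difference(f, x, h=1e-6):
    """Numerical derivative of f at x, used here only as a sanity check."""
    return (f(x + h) - f(x - h)) / (2 * h)


# The derivative of prices ** 2 with respect to prices is 2 * prices.
# A symbolic engine produces this expression directly; here it is
# confirmed numerically at each observation.
for p in prices:
    analytic = 2 * p
    numeric = central_difference(lambda v: v ** 2, p)
    assert abs(analytic - numeric) < 1e-6

print(ratio)
print(squared)
```

Outside of I(), the same symbols are formula operators on terms, so prices ** 2 without the wrapper would request interactions up to degree two instead of a squared column.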
Parameters
formula (str) – R-style formula used to design a matrix. Variable names will be validated when this formulation and data are passed to a function that uses them. By default, an intercept is included, which can be removed with 0 or -1. If absorb is specified, intercepts are ignored.
absorb (str, optional) – R-style formula used to design a matrix of categorical variables representing fixed effects, which will be absorbed into the matrix designed by formula by the PyHDFE package. Fixed effect absorption is only supported for some matrices. Unlike formula, intercepts are ignored. Only categorical variables are supported.
absorb_method (str, optional) – Method by which fixed effects will be absorbed. For a full list of supported methods, refer to the residualize_method argument of pyhdfe.create().
By default, the simplest methods are used: simple de-meaning for a single fixed effect and simple iterative de-meaning by way of the method of alternating projections (MAP) for multiple dimensions of fixed effects. For multiple dimensions, non-accelerated MAP is unlikely to be the fastest algorithm. If fixed effect absorption seems to be taking a long time, consider using a different method such as 'lsmr', using absorb_options to specify a MAP acceleration method, or configuring other options such as termination tolerances.
absorb_options (dict, optional) – Configuration options for the chosen method, which will be passed to the options argument of pyhdfe.create().
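The default absorption strategy described above (de-meaning for one fixed effect, non-accelerated MAP for several) can be sketched in a few lines of plain Python. This is a minimal illustration of the idea, not PyHDFE's implementation; the function names demean and absorb_map, the tol and iteration_limit arguments, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def demean(values, groups):
    """Subtract each group's mean: absorbs a single fixed effect."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def absorb_map(values, group_lists, tol=1e-12, iteration_limit=10_000):
    """Non-accelerated method of alternating projections (MAP).

    De-means by each fixed effect in turn and repeats the sweep until
    the largest change between sweeps falls below the tolerance.
    """
    current = list(values)
    for _ in range(iteration_limit):
        previous = current
        for groups in group_lists:
            current = demean(current, groups)
        change = max(abs(c - p) for c, p in zip(current, previous))
        if change < tol:
            return current
    raise RuntimeError("MAP did not converge within the iteration limit")


def max_abs_group_mean(values, groups):
    """Largest absolute group mean, which is near zero after absorption."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return max(abs(sums[g] / counts[g]) for g in sums)


# Hypothetical unbalanced data with two dimensions of fixed effects.
prices = [1.0, 2.0, 3.0, 5.0, 4.0]
product_ids = ["A", "A", "B", "B", "B"]
market_ids = [1, 2, 1, 2, 3]

residuals = absorb_map(prices, [product_ids, market_ids])
print(max_abs_group_mean(residuals, product_ids))
print(max_abs_group_mean(residuals, market_ids))
```

With unbalanced groups a single sweep is generally not enough, which is why the loop repeats until convergence. Each extra sweep costs a full pass over the data, so plain MAP can be slow when the fixed effects are strongly correlated.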
Examples