pyblp.parallel

pyblp.parallel(processes, use_pathos=False)

Context manager that enables parallel processing within a with statement.

This manager creates a context in which a pool of Python processes will be used by any method that requires market-by-market computation. These methods will distribute their work among the processes. After the context created by the with statement ends, all worker processes in the pool will be terminated. Outside this context, such methods will not use multiprocessing.

Importantly, multiprocessing will only improve speed if gains from parallelization outweigh overhead from serializing and passing data between processes. For example, if computation for a single market is very fast and there is a lot of data in each market that must be serialized and passed between processes, using multiprocessing may reduce overall speed.
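One rough way to judge this trade-off is to compare the time it takes to serialize a market's data with the time it takes to compute on it. The following is a minimal sketch using only the standard library. It is not how pyblp measures or schedules anything. The helper names `serialization_cost` and `compute_cost` and the `market` list are hypothetical stand-ins, and real overhead also includes the cost of moving bytes between processes.

```python
import pickle
import time


def serialization_cost(payload, repeats=5):
    """Return (size in bytes, best seconds) for one pickle round trip."""
    best = float("inf")
    blob = b""
    for _ in range(repeats):
        start = time.perf_counter()
        blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)
        pickle.loads(blob)
        best = min(best, time.perf_counter() - start)
    return len(blob), best


def compute_cost(work, payload, repeats=5):
    """Return the best time in seconds for calling work(payload)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        work(payload)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in for one market: a lot of data, very cheap computation.
market = [float(i) for i in range(50_000)]
size, transfer = serialization_cost(market)
compute = compute_cost(sum, market)
print(f"payload: {size} bytes, round trip: {transfer:.6f}s, compute: {compute:.6f}s")
```

If the round-trip time is comparable to or larger than the compute time, distributing that work across processes is unlikely to help.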

Parameters
  • processes (int) – Number of Python processes that will be created and used by any method that supports parallel processing.

  • use_pathos (bool, optional) – Whether to use pathos (which must be installed separately) instead of the built-in multiprocessing module. Because pathos serializes objects with dill rather than the standard pickle module, it can pass more kinds of objects between processes. However, dill can be much slower, so using pathos can increase the overhead of passing data between processes.
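The difference in supported objects can be illustrated with the standard pickle module alone. The sketch below checks whether an object can be pickled. The helper `can_pickle` is a hypothetical name introduced for illustration, not part of pyblp or pathos. A function importable by name, such as `math.sqrt`, pickles by reference, while a lambda does not pickle with the standard module.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Functions that can be looked up by module and name are pickled by reference.
print(can_pickle(math.sqrt))

# A lambda has no importable name, so the standard pickle module rejects it.
print(can_pickle(lambda x: x * x))
```

If objects passed to worker processes fail a check like this, for example custom functions defined as lambdas, that is a case where use_pathos=True may be worth its extra overhead.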

Examples