Composite Distribution
The composite distribution is the staple distribution of dmx-learn: it defines distributions over heterogeneous tuples of data. Assume we have observed a d-dimensional tuple \(x=(x_1, x_2, \dots, x_d)\) with component-wise data types \((T_1, T_2, \dots, T_d)\). The composite distribution models the tuple with the likelihood
\[p(x \vert \theta) = \prod_{i=1}^{d} f(x_i \vert \theta_i),\]
where each \(f(x_i \vert \theta_i)\) is a distribution compatible with the component data type \(T_i\).
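As an illustration, assume the composite likelihood factorizes over components, so the density of a tuple is the product of the component terms \(f(x_i \vert \theta_i)\) and the log-density is their sum. The sketch below uses only the Python standard library. The Gaussian and categorical helpers, and the names `composite_density` and `composite_log_density`, are hypothetical stand-ins and not part of dmx-learn.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of a univariate Gaussian with mean mu and variance sigma2."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)

def categorical_pmf(x, probs):
    """Probability of label x under a categorical distribution (0 if unseen)."""
    return probs.get(x, 0.0)

# One density function per tuple component (hypothetical stand-ins).
components = (
    lambda x: gaussian_pdf(x, mu=0.0, sigma2=1.0),          # T_1: float
    lambda x: categorical_pmf(x, {"a": 0.25, "b": 0.75}),   # T_2: str
)

def _check_length(x, components):
    if len(x) != len(components):
        raise ValueError("tuple length must match the number of components")

def composite_density(x, components):
    """Product of component densities, one per tuple entry."""
    _check_length(x, components)
    return math.prod(f(xi) for f, xi in zip(components, x))

def composite_log_density(x, components):
    """Sum of component log-densities; -inf if any component density is zero."""
    _check_length(x, components)
    total = 0.0
    for f, xi in zip(components, x):
        p = f(xi)
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total

x = (0.0, "b")
density = composite_density(x, components)
log_density = composite_log_density(x, components)
```

Each component only ever sees its own entry of the tuple, which is what lets the components have unrelated data types.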
CompositeDistribution
- class dmx.stats.composite.CompositeDistribution(dists, name=None, keys=None)
CompositeDistribution for modeling tuples of heterogeneous data.
- dists
Distributions for each component.
- Type:
Tuple[SequenceEncodableProbabilityDistribution, …]
- count
Number of components (i.e. len(dists)).
- Type:
int
- name
Name of object.
- Type:
Optional[str]
- keys
Key for marking shared parameters.
- Type:
Optional[str]
- __init__(dists, name=None, keys=None)
Create an instance of CompositeDistribution.
- Parameters:
dists (Sequence[SequenceEncodableProbabilityDistribution]) – Component distributions.
name (Optional[str], optional) – Name of object. Defaults to None.
keys (Optional[str], optional) – Key for marking shared parameters. Defaults to None.
- density(x)
Evaluate density of CompositeDistribution for a single observation tuple x.
- Parameters:
x (Tuple[Any, ...]) – Tuple of length len(dists); the data type of the k-th entry must be consistent with dists[k].
- Returns:
Density value.
- Return type:
float
- dist_to_encoder()
Return a CompositeDataEncoder for this distribution.
- Returns:
Encoder object.
- Return type:
CompositeDataEncoder
- estimator(pseudo_count=None)
Create CompositeEstimator for estimating CompositeDistribution.
- Parameters:
pseudo_count (Optional[float], optional) – Used to inflate sufficient statistics in estimation.
- Returns:
Estimator object.
- Return type:
CompositeEstimator
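How a pseudo count inflates sufficient statistics depends on the component estimator, and this page does not spell it out. As one common interpretation only, the sketch below applies additive smoothing to categorical counts: the pseudo count is spread evenly over the observed labels before normalizing. The function `estimate_categorical` is hypothetical and is not the dmx-learn implementation.

```python
from collections import Counter

def estimate_categorical(counts, pseudo_count=None):
    """Estimate label probabilities from counts, optionally smoothed.

    Assumption: pseudo_count is split evenly across the observed labels.
    """
    labels = sorted(counts)
    pc = 0.0 if pseudo_count is None else float(pseudo_count)
    extra = pc / len(labels)              # share of the pseudo count per label
    total = sum(counts.values()) + pc     # inflated normalizer
    return {k: (counts[k] + extra) / total for k in labels}

counts = Counter(["a", "a", "a", "b"])
mle = estimate_categorical(counts)                         # no smoothing
smoothed = estimate_categorical(counts, pseudo_count=2.0)  # pulled toward uniform
```

With `pseudo_count=None` the estimate is the plain relative frequency. A positive pseudo count moves the estimate toward the uniform distribution over the observed labels.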
- log_density(x)
Evaluate log-density of CompositeDistribution for a single observation tuple x.
- Parameters:
x (Tuple[Any, ...]) – Tuple of length len(dists); the data type of the k-th entry must be consistent with dists[k].
- Returns:
Log-density value.
- Return type:
float
- sampler(seed=None)
Create CompositeSampler for sampling from CompositeDistribution instance.
- Parameters:
seed (Optional[int], optional) – Seed to set for sampling with RandomState. Defaults to None.
- Returns:
Sampler object.
- Return type:
CompositeSampler
- seq_log_density(x)
Vectorized evaluation of log density for CompositeEncodedDataSequence.
- Parameters:
x (CompositeEncodedDataSequence) – EncodedDataSequence for Composite Distribution.
- Returns:
Log-density evaluated at all encoded data points.
- Return type:
np.ndarray
- Raises:
Exception – If input is not a CompositeEncodedDataSequence.
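To illustrate the idea behind sequence evaluation, the sketch below assumes an encoding that transposes a list of tuples into one column per component. Each column is scored by its own component, and the per-component log-densities are summed row by row. It uses plain Python lists to stay dependency-free, whereas the method above returns an np.ndarray. The names `encode` and `seq_composite_log_density` are hypothetical and do not describe the internals of CompositeDataEncoder.

```python
import math

def encode(data):
    """Transpose a list of d-tuples into d per-component columns."""
    return tuple(list(col) for col in zip(*data))

def seq_gaussian_log_density(col, mu=0.0, sigma2=1.0):
    """Gaussian log-density for every value in a column."""
    c = -0.5 * math.log(2.0 * math.pi * sigma2)
    return [c - 0.5 * (v - mu) ** 2 / sigma2 for v in col]

def seq_categorical_log_density(col, probs):
    """Categorical log-probability for every label in a column."""
    return [math.log(probs[v]) if probs.get(v, 0.0) > 0.0 else -math.inf
            for v in col]

def seq_composite_log_density(encoded, seq_fns):
    """Score each column with its component, then sum across components."""
    columns = [fn(col) for fn, col in zip(seq_fns, encoded)]
    return [sum(vals) for vals in zip(*columns)]

data = [(0.0, "a"), (1.0, "b"), (-1.0, "b")]
encoded = encode(data)  # ([0.0, 1.0, -1.0], ["a", "b", "b"])

probs = {"a": 0.25, "b": 0.75}
seq_fns = (
    seq_gaussian_log_density,
    lambda col: seq_categorical_log_density(col, probs),
)
log_densities = seq_composite_log_density(encoded, seq_fns)
```

Encoding once and scoring whole columns avoids re-dispatching on every tuple, which is the usual motivation for a separate encoded-sequence type.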
CompositeEstimator
- class dmx.stats.composite.CompositeEstimator(estimators, keys=None, name=None)
Estimator for CompositeDistribution.
- estimators
Estimators for each component.
- Type:
Sequence[ParameterEstimator]
- keys
Keys used for merging sufficient statistics.
- Type:
Optional[str]
- count
Number of components.
- Type:
int
- name
Name of the object.
- Type:
Optional[str]
- __init__(estimators, keys=None, name=None)
Initialize CompositeEstimator.
- Parameters:
estimators (Sequence[ParameterEstimator]) – Estimators for each component.
keys (Optional[str], optional) – Keys used for merging sufficient statistics. Defaults to None.
name (Optional[str], optional) – Name of the object. Defaults to None.
- Raises:
TypeError – If keys is not a string or None.
- accumulator_factory()
Return a CompositeAccumulatorFactory for this estimator.
- Returns:
Factory object.
- Return type:
CompositeAccumulatorFactory
- estimate(nobs, suff_stat)
Estimate a CompositeDistribution from aggregated sufficient statistics.
- Parameters:
nobs (Optional[float]) – Weighted number of observations used to form suff_stat.
suff_stat (Tuple[Any, ...]) – Tuple of sufficient statistics for each estimator.
- Returns:
Estimated distribution.
- Return type:
CompositeDistribution
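The following sketch shows one way estimation from a tuple of sufficient statistics can work, assuming each component keeps its own statistic and the k-th statistic is passed to the k-th component estimator. It uses weighted moments for a Gaussian component and weighted counts for a categorical one. All function names are hypothetical stand-ins rather than the dmx-learn accumulator or estimator API, and the estimate is returned as plain parameter dictionaries.

```python
def gaussian_update(stat, x, weight):
    """Accumulate weighted count, sum, and sum of squares."""
    s0, s1, s2 = stat
    return (s0 + weight, s1 + weight * x, s2 + weight * x * x)

def gaussian_estimate(stat):
    """Mean and (biased) variance from the accumulated moments."""
    s0, s1, s2 = stat
    mu = s1 / s0
    return {"mu": mu, "sigma2": s2 / s0 - mu * mu}

def categorical_update(stat, x, weight):
    """Accumulate weighted label counts."""
    new = dict(stat)
    new[x] = new.get(x, 0.0) + weight
    return new

def categorical_estimate(stat):
    """Normalize weighted counts into probabilities."""
    total = sum(stat.values())
    return {k: v / total for k, v in stat.items()}

updates = (gaussian_update, categorical_update)
estimates = (gaussian_estimate, categorical_estimate)

def accumulate(data, weights):
    """Return (nobs, suff_stat) with one statistic per component."""
    stats = [(0.0, 0.0, 0.0), {}]
    nobs = 0.0
    for x, w in zip(data, weights):
        stats = [u(s, xi, w) for u, s, xi in zip(updates, stats, x)]
        nobs += w
    return nobs, tuple(stats)

def composite_estimate(nobs, suff_stat):
    """Delegate the k-th statistic to the k-th component estimator.

    nobs is unused here because each statistic carries its own total weight.
    """
    return tuple(est(s) for est, s in zip(estimates, suff_stat))

data = [(1.0, "a"), (3.0, "b"), (2.0, "b"), (2.0, "b")]
nobs, suff_stat = accumulate(data, [1.0] * len(data))
params = composite_estimate(nobs, suff_stat)
```

Because the components are estimated separately, the composite estimate is just the tuple of per-component estimates.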
CompositeSampler
- class dmx.stats.composite.CompositeSampler(dist, seed=None)
CompositeSampler used to generate samples from CompositeDistribution.
- dist
CompositeDistribution to draw samples from.
- Type:
CompositeDistribution
- rng
RandomState with seed set if provided.
- Type:
RandomState
- dist_samplers
List of DistributionSamplers for each component.
- Type:
List[DistributionSampler]
- sample(size=None)
Generate independent samples from a CompositeDistribution.
If size is None, draw one sample and return it as a tuple of length len(dists). If size > 0, draw size samples and return a list of length size containing tuples of length len(dists).
- Parameters:
size (Optional[int], optional) – If None, draw 1 sample. Else, draw size number of iid samples.
- Returns:
A tuple of length = len(dists) or a list of length size containing tuples of length = len(dists).
- Return type:
Union[List[Tuple[Any, …]], Tuple[Any, …]]
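The return-shape convention of sample can be sketched with a small stand-in sampler that draws each component independently from a shared seeded generator. The class `SketchCompositeSampler` is hypothetical. It uses Python's `random.Random` in place of the RandomState named above, and it makes no claim about the order in which dmx-learn draws component values.

```python
import random

class SketchCompositeSampler:
    """Hypothetical stand-in, not dmx-learn's CompositeSampler."""

    def __init__(self, component_samplers, seed=None):
        self.rng = random.Random(seed)            # seeded generator
        self.component_samplers = component_samplers

    def _draw_one(self):
        # One independent draw per component, packed into a tuple.
        return tuple(draw(self.rng) for draw in self.component_samplers)

    def sample(self, size=None):
        if size is None:
            return self._draw_one()               # a single tuple
        return [self._draw_one() for _ in range(size)]  # list of tuples

# Component samplers: a standard Gaussian and a two-label categorical.
component_samplers = (
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.choices(["a", "b"], weights=[0.25, 0.75])[0],
)

single = SketchCompositeSampler(component_samplers, seed=1).sample()
batch = SketchCompositeSampler(component_samplers, seed=1).sample(size=5)
```

Fixing the seed makes the draws reproducible, so two samplers built with the same seed produce the same sequence of tuples in this sketch.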