Multinomial Distribution
Data Type: Sequence[Tuple[int, str]]
The multinomial distribution generalizes the binomial distribution to \(k\) classes. It gives the probability of observing \(x_i\) counts of class object \(v_i\), for \(i = 1, \dots, k\), in \(n=\sum_{i} x_i\) trials. The probability mass function is given by
\[ P(x_1, \dots, x_k) = \frac{n!}{x_1! \cdots x_k!} \prod_{i=1}^{k} p_i^{x_i}, \]
where \(p_i\) is the probability of class \(v_i\), \(\sum_{i=1}^{k} p_i = 1\), and \(\sum_{i=1}^{k} x_i = n\). Here the classes may be represented by any object/string \(v_i\) in the set \(V=\{v_1, \dots, v_k\}\). If the user maps the objects to the set of integers, the Integer Multinomial Distribution can be used instead.
For more info see Multinomial Distribution.
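The probability mass function described above can be sketched in plain Python. This is an illustration of the textbook formula only, not the library's implementation: the function name `multinomial_log_pmf` and the `probs` dictionary are hypothetical, and observations are assumed to be (value, count) pairs as in the `density(x)` parameter description.

```python
import math
from typing import Dict, Hashable, Sequence, Tuple


def multinomial_log_pmf(x: Sequence[Tuple[Hashable, int]],
                        probs: Dict[Hashable, float]) -> float:
    """Textbook multinomial log-PMF for a sequence of (value, count) pairs."""
    n = sum(count for _, count in x)
    # Log of the multinomial coefficient n! / (x_1! ... x_k!).
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for _, c in x)
    # Sum of x_i * log(p_i); classes with zero count contribute nothing.
    log_prob = 0.0
    for value, count in x:
        if count == 0:
            continue
        p = probs.get(value, 0.0)
        if p <= 0.0:
            return -math.inf  # observed a class that has zero probability
        log_prob += count * math.log(p)
    return log_coef + log_prob


# Hypothetical class probabilities and a single observation with n = 4 trials.
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
obs = [("a", 2), ("b", 1), ("c", 1)]
pmf = math.exp(multinomial_log_pmf(obs, probs))
```

Working in log space with `math.lgamma` avoids overflow in the factorials when the number of trials is large.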
MultinomialDistribution
- class dmx.stats.catmultinomial.MultinomialDistribution(dist, len_dist=None, len_normalized=False, name=None, keys=None)
Multinomial distribution over a countable support with optional distribution for number of trials.
- Parameters:
dist (SequenceEncodableProbabilityDistribution) – Distribution with at most a countable support.
len_dist (Optional[SequenceEncodableProbabilityDistribution], optional) – Distribution for the number of trials. Defaults to NullDistribution().
len_normalized (bool, optional) – Take geometric mean of the density of observation. Defaults to False.
name (Optional[str], optional) – Name for the object instance. Defaults to None.
keys (Optional[str], optional) – Keys for merging sufficient statistics. Defaults to None.
- dist
Distribution with at most a countable support.
- Type:
SequenceEncodableProbabilityDistribution
- len_dist
Distribution for the number of trials.
- Type:
Optional[SequenceEncodableProbabilityDistribution]
- len_normalized
Take geometric mean of the density of observation.
- Type:
bool
- name
Name for the object instance.
- Type:
Optional[str]
- keys
Keys for merging sufficient statistics.
- Type:
Optional[str]
- __init__(dist, len_dist=None, len_normalized=False, name=None, keys=None)
Initializes a MultinomialDistribution object.
- Parameters:
dist (SequenceEncodableProbabilityDistribution) – Distribution with at most a countable support.
len_dist (Optional[SequenceEncodableProbabilityDistribution], optional) – Distribution for the number of trials. Defaults to NullDistribution().
len_normalized (bool, optional) – Take geometric mean of the density of observation. Defaults to False.
name (Optional[str], optional) – Name for the object instance. Defaults to None.
keys (Optional[str], optional) – Keys for merging sufficient statistics. Defaults to None.
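One possible reading of the `len_normalized` option ("take geometric mean of the density") is that the log-density is divided by the total number of trials, which is equivalent to taking the geometric mean of the per-trial probabilities. The sketch below illustrates that reading only. It is an assumption, not confirmed library behavior: the helper name `per_trial_log_density` is hypothetical, and the sketch ignores both the multinomial coefficient and any `len_dist` term.

```python
import math
from typing import Dict, Hashable, Sequence, Tuple


def per_trial_log_density(x: Sequence[Tuple[Hashable, float]],
                          probs: Dict[Hashable, float],
                          len_normalized: bool = False) -> float:
    """Sum of count * log(p); optionally divided by the total count."""
    total = 0.0
    n = 0.0
    for value, count in x:
        total += count * math.log(probs[value])
        n += count
    # Assumed meaning of len_normalized: average the log-probability over
    # trials, i.e. the log of the geometric mean of per-trial probabilities.
    if len_normalized and n > 0:
        total /= n
    return total


# Hypothetical probabilities chosen so the result is easy to check by hand.
probs = {"a": 0.5, "b": 0.125}
obs = [("a", 2.0), ("b", 2.0)]
raw = math.exp(per_trial_log_density(obs, probs))
geo = math.exp(per_trial_log_density(obs, probs, len_normalized=True))
```

Under this reading, normalization makes densities of observations with different numbers of trials comparable on a per-trial scale.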
- density(x)
Returns the multinomial density evaluated at observation x.
- Parameters:
x (Sequence[Tuple[T, float]]) – Tuples of observed multinomial values and successes such that the successes sum to the number of trials.
- Returns:
Density evaluated at x.
- Return type:
float
- dist_to_encoder()
Get a data encoder for this distribution.
- Returns:
Encoder for this distribution.
- Return type:
MultinomialDataEncoder
- estimator(pseudo_count=None)
Create a MultinomialEstimator object from this distribution.
- Parameters:
pseudo_count (Optional[float], optional) – Re-weight member sufficient statistics when estimating from aggregated data.
- Returns:
Estimator for this distribution.
- Return type:
MultinomialEstimator
- log_density(x)
Returns the multinomial log-density evaluated at observation x.
- Parameters:
x (Sequence[Tuple[T, float]]) – Tuples of observed multinomial values and successes such that the successes sum to the number of trials.
- Returns:
Log-density evaluated at x.
- Return type:
float
- sampler(seed=None)
Create a MultinomialSampler object from this distribution.
- Parameters:
seed (Optional[int], optional) – Seed for sampling. Defaults to None.
- Returns:
Sampler for this distribution.
- Return type:
MultinomialSampler
- Raises:
Exception – If len_dist is a NullDistribution.
- seq_log_density(x)
Vectorized log-density for encoded data.
- Parameters:
x (MultinomialEncodedDataSequence) – Encoded sequence.
- Returns:
Log-densities for each observation.
- Return type:
np.ndarray
- Raises:
Exception – If input is not a MultinomialEncodedDataSequence.
MultinomialEstimator
- class dmx.stats.catmultinomial.MultinomialEstimator(estimator, len_estimator=<dmx.stats.null_dist.NullEstimator object>, pseudo_count=None, len_dist=None, len_normalized=False, name=None, keys=None)
MultinomialEstimator object for estimating MultinomialDistribution objects from aggregated data.
- estimator
ParameterEstimator for distribution of values.
- Type:
ParameterEstimator
- len_estimator
ParameterEstimator for the number of trials. Defaults to NullEstimator if None is passed.
- Type:
ParameterEstimator
- pseudo_count
Regularizer for estimator and len_estimator.
- Type:
Optional[float]
- len_dist
If None, distribution for number of trials will be estimated from ‘len_estimator’.
- Type:
Optional[SequenceEncodableProbabilityDistribution]
- len_normalized
Take geometric mean of density.
- Type:
Optional[bool]
- name
Name of object instance.
- Type:
Optional[str]
- keys
Keys of object instance for merging sufficient statistics.
- Type:
Optional[str]
- __init__(estimator, len_estimator=<dmx.stats.null_dist.NullEstimator object>, pseudo_count=None, len_dist=None, len_normalized=False, name=None, keys=None)
Initializes a MultinomialEstimator object.
- Parameters:
estimator (ParameterEstimator) – ParameterEstimator for distribution of values.
len_estimator (Optional[ParameterEstimator]) – Optional ParameterEstimator for the number of trials.
pseudo_count (Optional[float]) – Regularizer for estimator and len_estimator.
len_dist (Optional[SequenceEncodableProbabilityDistribution]) – Set distribution for the number of trials.
len_normalized (Optional[bool]) – Take geometric mean of density.
name (Optional[str]) – Name for the object instance.
keys (Optional[str]) – Keys for merging sufficient statistics.
- accumulator_factory()
Create SequenceEncodableStatisticAccumulator object.
- Return type:
MultinomialAccumulatorFactory
- estimate(nobs, suff_stat)
Estimate a MultinomialDistribution object from aggregated data contained in arg ‘suff_stat’.
- Parameters:
nobs (Optional[float]) – Number of observations used in aggregation of ‘suff_stat’.
suff_stat (Tuple[SS1, Optional[SS2]]) – Tuple of sufficient statistics for distribution of values and trial distribution.
- Returns:
Estimate from sufficient statistics.
- Return type:
MultinomialDistribution
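Estimating from aggregated data can be illustrated with a small standalone sketch. It assumes the sufficient statistic for the value distribution is a table of per-class counts and that `pseudo_count` acts as additive smoothing spread evenly over the classes. Both are assumptions made for illustration: the library's actual sufficient-statistic layout and regularization may differ, and `estimate_probs` is a hypothetical helper.

```python
from typing import Dict, Hashable, Optional


def estimate_probs(counts: Dict[Hashable, float],
                   pseudo_count: Optional[float] = None) -> Dict[Hashable, float]:
    """Estimate class probabilities from aggregated per-class counts."""
    k = len(counts)
    if k == 0:
        raise ValueError("at least one class is required")
    # Assumed regularization: spread pseudo_count evenly across the k classes.
    extra = 0.0 if pseudo_count is None else pseudo_count
    total = sum(counts.values()) + extra
    if total <= 0.0:
        raise ValueError("total count must be positive")
    return {value: (c + extra / k) / total for value, c in counts.items()}


# Hypothetical aggregated counts over three classes.
counts = {"a": 6.0, "b": 3.0, "c": 1.0}
mle = estimate_probs(counts)                         # plain relative frequencies
smoothed = estimate_probs(counts, pseudo_count=3.0)  # pulled toward uniform
```

With no pseudo-count the estimate is the maximum-likelihood relative frequency; a positive pseudo-count keeps classes with few or no observations from receiving a probability near zero.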
MultinomialSampler
- class dmx.stats.catmultinomial.MultinomialSampler(dist, seed=None)
MultinomialSampler object for sampling from multinomial distribution.
- dist
An instance of a MultinomialDistribution object.
- Type:
MultinomialDistribution
- rng
RandomState with seed set if passed.
- Type:
RandomState
- dist_sampler
DistributionSampler object for sampling category values.
- Type:
DistributionSampler
- len_sampler
DistributionSampler object for sampling number of trials in multinomial.
- Type:
DistributionSampler
- sample(size=None)
Draw samples from multinomial distribution.
Note: If len_sampler draws n=0 for a sample, an empty list is returned for that sample.
- Parameters:
size (Optional[int]) – Number of iid samples to draw from multinomial.
- Returns:
Sequence of ‘size’ iid observations if size is not None, else a single multinomial sample.
- Return type:
Union[Sequence[Sequence[Tuple[Any, float]]], Sequence[Tuple[Any, float]]]
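The shape of a single multinomial sample can be illustrated without the library. The sketch below is a stand-in, not the MultinomialSampler implementation: it uses Python's standard `random` module rather than a NumPy RandomState, takes a fixed number of trials `n` in place of a draw from len_sampler, and the name `sample_multinomial` is hypothetical. It returns (value, count) pairs and an empty list when n=0, matching the behavior described in the note above.

```python
import random
from collections import Counter
from typing import Any, Dict, List, Tuple


def sample_multinomial(probs: Dict[Any, float], n: int,
                       rng: random.Random) -> List[Tuple[Any, float]]:
    """Draw n category values and aggregate them into (value, count) pairs."""
    if n == 0:
        return []  # zero trials yields an empty observation
    values = list(probs)
    weights = [probs[v] for v in values]
    # Draw n independent category values, then count the occurrences of each.
    draws = rng.choices(values, weights=weights, k=n)
    counts = Counter(draws)
    # Only classes that were actually observed appear in the result.
    return [(v, float(counts[v])) for v in values if counts[v] > 0]


# Hypothetical class probabilities; the seed is fixed for repeatability.
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
rng = random.Random(0)
one = sample_multinomial(probs, n=10, rng=rng)
empty = sample_multinomial(probs, n=0, rng=rng)
```

The counts in each returned sample sum to the number of trials, so the output is a valid input for a (value, count) log-density evaluation.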