Abstract Classes
dmx-learn supports most distributions in the exponential family. For a detailed walkthrough of defining a custom distribution class, see User Defined Classes. The abstract classes in dmx-learn are listed below.
ProbabilityDistribution
- class dmx.stats.pdist.ProbabilityDistribution
Defines the ProbabilityDistribution abstract class.
Note
This class generally serves as the base class for SequenceEncodableProbabilityDistribution.
- __init__()
- abstract estimator(pseudo_count=None)
Create a ParameterEstimator for the corresponding SequenceEncodableProbabilityDistribution.
- Parameters:
pseudo_count (Optional[float]) – Regularizes the sufficient statistics in the estimation step.
- Return type:
ParameterEstimator
- Returns:
ParameterEstimator
- abstract log_density(x)
Evaluate the log-density of the distribution.
- Return type:
float
- Returns:
float
- abstract sampler(seed=None)
Create a DistributionSampler object for a given ProbabilityDistribution.
- Parameters:
seed (Optional[int]) – Set seed for drawing samples from distribution.
- Return type:
DistributionSampler
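To make the interface above concrete, here is a minimal sketch of a subclass that supplies the three abstract methods. It does not import dmx-learn: `ProbabilityDistributionSketch` and `ExponentialSketch` are hypothetical stand-ins that only mirror the method names listed above, and the sampler and estimator are deliberately left unimplemented.

```python
import math
from abc import ABC, abstractmethod
from typing import Optional


class ProbabilityDistributionSketch(ABC):
    """Hypothetical stand-in mirroring the abstract methods listed above."""

    @abstractmethod
    def log_density(self, x) -> float: ...

    @abstractmethod
    def sampler(self, seed: Optional[int] = None): ...

    @abstractmethod
    def estimator(self, pseudo_count: Optional[float] = None): ...


class ExponentialSketch(ProbabilityDistributionSketch):
    """Hypothetical exponential distribution with rate `lam`."""

    def __init__(self, lam: float):
        self.lam = lam

    def log_density(self, x: float) -> float:
        # log p(x) = log(lam) - lam * x for x >= 0, and -inf otherwise.
        if x < 0:
            return -math.inf
        return math.log(self.lam) - self.lam * x

    def sampler(self, seed: Optional[int] = None):
        raise NotImplementedError("sampler omitted in this sketch")

    def estimator(self, pseudo_count: Optional[float] = None):
        raise NotImplementedError("estimator omitted in this sketch")


dist = ExponentialSketch(lam=2.0)
value = dist.log_density(0.5)  # equals log(2) - 1
```

Because all three methods are abstract, the base class cannot be instantiated directly; only a subclass that defines every one of them can.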
SequenceEncodableProbabilityDistribution
- class dmx.stats.pdist.SequenceEncodableProbabilityDistribution
Extends ProbabilityDistribution to handle vectorized calls.
- abstract dist_to_encoder()
Create a DataSequenceEncoder object for the SequenceEncodableProbabilityDistribution instance.
- Return type:
DataSequenceEncoder
- Returns:
DataSequenceEncoder
- abstract seq_log_density(x)
Vectorized evaluation of the log density.
- Parameters:
x (EncodedDataSequence) – EncodedDataSequence for the corresponding SequenceEncodableProbabilityDistribution.
- Return type:
ndarray
- Returns:
np.ndarray
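The difference between log_density and seq_log_density can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not dmx-learn classes: `EncodedSketch` plays the role of an EncodedDataSequence by wrapping a NumPy array, and the exponential density is an assumed example.

```python
import numpy as np


class EncodedSketch:
    """Hypothetical stand-in for an EncodedDataSequence wrapping an array."""

    def __init__(self, data: np.ndarray):
        self.data = data


class ExponentialSketch:
    """Hypothetical exponential distribution with rate `lam`."""

    def __init__(self, lam: float):
        self.lam = lam

    def log_density(self, x: float) -> float:
        # Scalar evaluation for a single observation.
        return float(np.log(self.lam) - self.lam * x)

    def seq_log_density(self, x: EncodedSketch) -> np.ndarray:
        # One array expression replaces a Python loop over observations.
        return np.log(self.lam) - self.lam * x.data


dist = ExponentialSketch(2.0)
enc = EncodedSketch(np.asarray([0.1, 0.5, 2.0], dtype=float))
vectorized = dist.seq_log_density(enc)
looped = np.array([dist.log_density(v) for v in enc.data])
```

Both paths give the same numbers; the vectorized call returns them as a single ndarray with one entry per encoded observation.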
DistributionSampler
- class dmx.stats.pdist.DistributionSampler(dist, seed=None)
DistributionSampler is an abstract class for distribution samplers.
- dist
Distribution to sample from.
- rng
Random number generator.
- Type:
RandomState
- __init__(dist, seed=None)
Initialize DistributionSampler.
- Parameters:
dist (SequenceEncodableProbabilityDistribution) – Distribution to sample from.
seed (Optional[int]) – Used to set seed on rng.
- new_seed()
Generate a new seed from rng.
- Return type:
int
- abstract sample(size=None)
Generate samples from distribution.
- Parameters:
size (Optional[int]) – Number of samples to generate.
- Return type:
Any
- Returns:
Samples from distribution.
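A sampler with the attributes and methods described above might look like the following sketch. The class names are hypothetical, the distribution is an assumed exponential, and the way `new_seed` draws its integer from `rng` is an assumption rather than the library's documented behavior.

```python
from typing import Optional

import numpy as np


class RateHolder:
    """Hypothetical distribution object exposing only a rate parameter."""

    def __init__(self, lam: float):
        self.lam = lam


class ExponentialSamplerSketch:
    """Hypothetical sampler following the dist / rng layout described above."""

    def __init__(self, dist: RateHolder, seed: Optional[int] = None):
        self.dist = dist
        self.rng = np.random.RandomState(seed)

    def new_seed(self) -> int:
        # Assumption: draw a fresh integer seed from this sampler's own rng,
        # so that dependent samplers stay reproducible given the parent seed.
        return int(self.rng.randint(0, 2**31 - 1))

    def sample(self, size: Optional[int] = None):
        # size=None yields a single float; otherwise an array of `size` draws.
        draws = self.rng.exponential(scale=1.0 / self.dist.lam, size=size)
        return float(draws) if size is None else draws


sampler = ExponentialSamplerSketch(RateHolder(2.0), seed=7)
batch = sampler.sample(size=5)
single = sampler.sample()
```

Constructing two samplers with the same seed yields identical draws, which is the main reason to route all randomness through the stored `rng`.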
ConditionalSampler
- class dmx.stats.pdist.ConditionalSampler
Abstract class for conditional samplers.
Note
This is only implemented for samplers of conditional distributions.
- abstract sample_given(x)
Sample from the distribution conditioned on a given value.
- Parameters:
x (Any) – Value to condition on when sampling from dist.
- Returns:
Sample from conditional distribution.
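As an illustration of sample_given, the sketch below draws from a conditional distribution p(y | x) stored as a lookup table. Everything here is hypothetical (the class names, the table layout, and the use of the standard library `random` module); it only mirrors the single abstract method listed above.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ConditionalSamplerSketch(ABC):
    """Hypothetical stand-in with the one abstract method listed above."""

    @abstractmethod
    def sample_given(self, x: Any) -> Any: ...


class TableConditionalSampler(ConditionalSamplerSketch):
    """Hypothetical sampler for p(y | x) stored as a nested dict."""

    def __init__(self, table: Dict[Any, Dict[Any, float]],
                 seed: Optional[int] = None):
        self.table = table
        self.rng = random.Random(seed)

    def sample_given(self, x: Any) -> Any:
        # Look up the row for the conditioning value, then draw one outcome
        # with probability proportional to its weight.
        outcomes = list(self.table[x].keys())
        probs = list(self.table[x].values())
        return self.rng.choices(outcomes, weights=probs, k=1)[0]


sampler = TableConditionalSampler(
    {"rain": {"umbrella": 0.9, "none": 0.1}, "sun": {"none": 1.0}},
    seed=0,
)
draw = sampler.sample_given("sun")  # only one outcome has positive probability
```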
StatisticAccumulator
- class dmx.stats.pdist.StatisticAccumulator
- abstract combine(suff_stat)
Method for combining aggregated sufficient statistics.
- Parameters:
suff_stat (SS) – Sufficient statistics.
- Return type:
- Returns:
None
- abstract from_value(x)
Set sufficient statistics equal to passed value.
- Parameters:
x (SS) – Generic sufficient statistic for instance of StatisticAccumulator.
- Return type:
- initialize(x, weight, rng)
Initialize sufficient statistics for a single data observation.
Note
Used for debugging only.
- Parameters:
x (Any) – Data type corresponding to StatisticAccumulator object.
weight (float) – Weight associated with single observation.
rng (np.random.RandomState) – Set seed for initialization.
- Return type:
None
- abstract key_merge(stats_dict)
Merge sufficient statistics with matching keys.
- Parameters:
stats_dict (Dict[str, Any]) – Dict mapping keys to sufficient statistic value or accumulator.
- Return type:
None
- abstract key_replace(stats_dict)
Set the sufficient statistics of the accumulator instance to the keyed values.
- Parameters:
stats_dict (Dict[str, Any]) – Dict mapping keys to sufficient statistic value or accumulator.
- Return type:
None
- update(x, weight, estimate)
Accumulate sufficient statistics for a single data observation.
Note
Used for debugging only.
- Parameters:
x (Any) – Data type corresponding to StatisticAccumulator object.
weight (float) – Weight associated with single observation.
estimate (SequenceEncodableProbabilityDistribution) – Previous estimate of distribution.
- Return type:
None
- abstract value()
Return sufficient statistics of StatisticAccumulator.
- Return type:
TypeVar(SS)
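The accumulator methods above fit together roughly as in the following sketch. It is a hypothetical stand-in, not the dmx-learn class: the sufficient statistics are assumed to be a (weighted count, weighted sum) pair, `combine` returns None as documented above, and the shared dictionary is assumed to hold plain statistic values rather than accumulators.

```python
from typing import Any, Dict, Optional, Tuple

SS = Tuple[float, float]  # assumed layout: (weighted count, weighted sum)


class ExponentialAccumulatorSketch:
    """Hypothetical accumulator for an exponential distribution."""

    def __init__(self, key: Optional[str] = None):
        self.count = 0.0
        self.total = 0.0
        self.key = key

    def update(self, x: float, weight: float, estimate: Any = None) -> None:
        # Add one weighted observation to the running statistics.
        self.count += weight
        self.total += weight * x

    def combine(self, suff_stat: SS) -> None:
        # Fold in statistics aggregated elsewhere (e.g. another partition).
        self.count += suff_stat[0]
        self.total += suff_stat[1]

    def value(self) -> SS:
        return self.count, self.total

    def from_value(self, x: SS) -> None:
        self.count, self.total = x

    def key_merge(self, stats_dict: Dict[str, Any]) -> None:
        # Pool this accumulator's statistics under its key, if it has one.
        if self.key is None:
            return
        old = stats_dict.get(self.key, (0.0, 0.0))
        stats_dict[self.key] = (old[0] + self.count, old[1] + self.total)

    def key_replace(self, stats_dict: Dict[str, Any]) -> None:
        # Overwrite local statistics with the pooled ones for this key.
        if self.key is not None and self.key in stats_dict:
            self.from_value(stats_dict[self.key])


a, b = ExponentialAccumulatorSketch(), ExponentialAccumulatorSketch()
a.update(1.0, 1.0)
a.update(2.0, 1.0)
b.update(3.0, 2.0)
a.combine(b.value())
merged = a.value()  # (4.0, 9.0)
```

In this sketch, accumulators that share a key end up with identical pooled statistics after every one of them has called key_merge and then key_replace on the same dictionary.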
SequenceEncodableStatisticAccumulator
- class dmx.stats.pdist.SequenceEncodableStatisticAccumulator
- abstract acc_to_encoder()
Create DataSequenceEncoder object for SequenceEncodableStatisticAccumulator instance.
- Return type:
DataSequenceEncoder
- abstract seq_initialize(x, weights, rng)
Vectorized initialization of sufficient statistics.
- Parameters:
x (EncodedDataSequence) – EncodedDataSequence for given SequenceEncodableStatisticAccumulator type.
weights (np.ndarray) – Weights for observations.
rng (np.random.RandomState) – RandomState object for setting seed on initialization.
- Return type:
None
- abstract seq_update(x, weights, estimate)
Vectorized accumulation of sufficient statistics for EM updates.
- Parameters:
x (EncodedDataSequence) – EncodedDataSequence for given SequenceEncodableStatisticAccumulator type.
weights (np.ndarray) – Weights for observations.
estimate (Optional[SequenceEncodableProbabilityDistribution]) – Optional previous estimate of distribution.
- Return type:
None
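A vectorized seq_update should accumulate the same statistics as calling update once per observation. The sketch below checks that for a hypothetical accumulator; it is not the dmx-learn class, and for brevity it passes a plain NumPy array where the real method would receive an EncodedDataSequence.

```python
from typing import Any, Tuple

import numpy as np


class ExponentialAccumulatorSketch:
    """Hypothetical accumulator tracking (weighted count, weighted sum)."""

    def __init__(self):
        self.count = 0.0
        self.total = 0.0

    def update(self, x: float, weight: float, estimate: Any = None) -> None:
        # Scalar path: one observation at a time.
        self.count += weight
        self.total += weight * x

    def seq_update(self, x: np.ndarray, weights: np.ndarray,
                   estimate: Any = None) -> None:
        # Vectorized path: reduce the whole batch with array operations.
        self.count += float(weights.sum())
        self.total += float(np.dot(weights, x))

    def value(self) -> Tuple[float, float]:
        return self.count, self.total


data = np.array([0.5, 1.5, 1.0])
weights = np.array([1.0, 0.5, 2.0])  # e.g. posterior weights in an EM step

vec_acc = ExponentialAccumulatorSketch()
vec_acc.seq_update(data, weights)

loop_acc = ExponentialAccumulatorSketch()
for x_i, w_i in zip(data, weights):
    loop_acc.update(float(x_i), float(w_i))
```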
ParameterEstimator
- class dmx.stats.pdist.ParameterEstimator(*args)
Abstract class for ParameterEstimator objects.
- abstract __init__(*args)
Subclasses must implement a constructor for ParameterEstimator.
- abstract accumulator_factory()
Create a StatisticAccumulatorFactory that produces SequenceEncodableStatisticAccumulator objects.
- Return type:
StatisticAccumulatorFactory
- abstract estimate(nobs, suff_stat)
Estimate a SequenceEncodableProbabilityDistribution from sufficient statistics.
- Parameters:
nobs (Optional[float]) – Weighted number of observations.
suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics for Dirichlet distribution.
- Return type:
SequenceEncodableProbabilityDistribution
- Returns:
SequenceEncodableProbabilityDistribution
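The estimator, its accumulator factory, and the resulting distribution can be tied together as in this sketch. All classes are hypothetical stand-ins for an assumed exponential model; in particular, the factory method name `make` is an assumption for illustration and is not taken from the listing above.

```python
from typing import Any, Optional, Tuple


class ExponentialSketch:
    """Hypothetical fitted distribution holding only its rate."""

    def __init__(self, lam: float):
        self.lam = lam


class ExponentialAccumulatorSketch:
    """Hypothetical accumulator tracking (weighted count, weighted sum)."""

    def __init__(self):
        self.count = 0.0
        self.total = 0.0

    def update(self, x: float, weight: float, estimate: Any = None) -> None:
        self.count += weight
        self.total += weight * x

    def value(self) -> Tuple[float, float]:
        return self.count, self.total


class ExponentialAccumulatorFactorySketch:
    def make(self) -> ExponentialAccumulatorSketch:
        # `make` is a hypothetical name for "build a fresh accumulator".
        return ExponentialAccumulatorSketch()


class ExponentialEstimatorSketch:
    def accumulator_factory(self) -> ExponentialAccumulatorFactorySketch:
        return ExponentialAccumulatorFactorySketch()

    def estimate(self, nobs: Optional[float],
                 suff_stat: Tuple[float, float]) -> ExponentialSketch:
        count, total = suff_stat
        if total <= 0.0:
            raise ValueError("need a positive weighted sum to estimate a rate")
        # Maximum-likelihood rate for an exponential: count / sum.
        return ExponentialSketch(count / total)


est = ExponentialEstimatorSketch()
acc = est.accumulator_factory().make()
for x in [0.5, 1.5, 1.0]:
    acc.update(x, 1.0)
fitted = est.estimate(None, acc.value())  # rate = 3 / 3.0
```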
DataSequenceEncoder
EncodedDataSequence
- class dmx.stats.pdist.EncodedDataSequence(data)
EncodedDataSequence is the data structure output by DataSequenceEncoder. The object is used for vectorized functions and type checks.
- __init__(data)
Create instance of EncodedDataSequence.
- Parameters:
data (Any) – Encoded data stored for vectorized calls.
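One way to picture the "type checks" role is a vectorized function that rejects input that was not produced by the matching encoder. The sketch below uses hypothetical stand-in classes throughout; the encoder method name `seq_encode` and the exponential density are assumptions made for illustration.

```python
import math
from typing import Any, List, Sequence


class EncodedDataSequenceSketch:
    """Hypothetical stand-in: a thin wrapper around encoded data."""

    def __init__(self, data: Any):
        self.data = data


class FloatEncodedSketch(EncodedDataSequenceSketch):
    """Hypothetical encoded type for sequences of floats."""


class FloatEncoderSketch:
    def seq_encode(self, x: Sequence[float]) -> FloatEncodedSketch:
        # `seq_encode` is a hypothetical name for the encoding step.
        return FloatEncodedSketch([float(v) for v in x])


def seq_log_density_sketch(lam: float,
                           enc: EncodedDataSequenceSketch) -> List[float]:
    # The wrapper type lets the function verify it got compatible encoded
    # data before touching the contents.
    if not isinstance(enc, FloatEncodedSketch):
        raise TypeError("expected a FloatEncodedSketch")
    return [math.log(lam) - lam * v for v in enc.data]


encoded = FloatEncoderSketch().seq_encode([0.5, 1])
densities = seq_log_density_sketch(2.0, encoded)
```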