Abstract Classes

dmx-learn covers most distributions in the exponential family. For a detailed walkthrough of defining a custom distribution class, see User Defined Classes. The abstract classes in dmx-learn are listed below.

ProbabilityDistribution

class dmx.stats.pdist.ProbabilityDistribution

Defines the ProbabilityDistribution abstract class.

Note

This class is generally used as the base class of SequenceEncodableProbabilityDistribution.

__init__()

abstract estimator(pseudo_count=None)

Create a ParameterEstimator for the corresponding SequenceEncodableProbabilityDistribution.

Parameters:

pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.

Return type:

ParameterEstimator

Returns:

ParameterEstimator

abstract log_density(x)

Evaluate the log-density of the distribution at x.

Return type:

float

Returns:

float

abstract sampler(seed=None)

Create a DistributionSampler object for a given ProbabilityDistribution.

Parameters:

seed (Optional[int]) – Set seed for drawing samples from distribution.

Return type:

DistributionSampler
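The following minimal sketch illustrates how the three abstract methods above fit together. It does not import dmx-learn: `ProbabilityDistributionSketch` is a local stand-in that mirrors only the documented method names, and `ExponentialSketch` and `ExponentialSampler` are hypothetical classes invented for illustration. The `estimator` method is left as a placeholder here.

```python
import math
import random
from abc import ABC, abstractmethod
from typing import Optional


class ProbabilityDistributionSketch(ABC):
    """Local stand-in mirroring the documented method names (not dmx-learn code)."""

    @abstractmethod
    def log_density(self, x) -> float: ...

    @abstractmethod
    def sampler(self, seed: Optional[int] = None): ...

    @abstractmethod
    def estimator(self, pseudo_count: Optional[float] = None): ...


class ExponentialSampler:
    """Hypothetical sampler that owns its random number generator."""

    def __init__(self, dist: "ExponentialSketch", seed: Optional[int] = None):
        self.dist = dist
        self.rng = random.Random(seed)

    def sample(self, size: Optional[int] = None):
        # A single draw when size is None, otherwise a list of draws.
        if size is None:
            return self.rng.expovariate(self.dist.rate)
        return [self.rng.expovariate(self.dist.rate) for _ in range(size)]


class ExponentialSketch(ProbabilityDistributionSketch):
    """Exponential distribution with density rate * exp(-rate * x) for x >= 0."""

    def __init__(self, rate: float):
        self.rate = rate

    def log_density(self, x: float) -> float:
        if x < 0:
            return -math.inf
        return math.log(self.rate) - self.rate * x

    def sampler(self, seed: Optional[int] = None) -> ExponentialSampler:
        return ExponentialSampler(self, seed)

    def estimator(self, pseudo_count: Optional[float] = None):
        # Placeholder: estimation is outside the scope of this sketch.
        raise NotImplementedError


dist = ExponentialSketch(rate=2.0)
log_p = dist.log_density(0.5)            # equals log(2) - 1
draws = dist.sampler(seed=1).sample(3)   # reproducible for a fixed seed
```

Because the base class declares all three methods abstract, a subclass that omits any of them cannot be instantiated, which is the main practical effect of the abstract interface.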

SequenceEncodableProbabilityDistribution

class dmx.stats.pdist.SequenceEncodableProbabilityDistribution

Extends ProbabilityDistribution to handle vectorized calls.

abstract dist_to_encoder()

Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.

Return type:

DataSequenceEncoder

Returns:

DataSequenceEncoder

abstract seq_log_density(x)

Vectorized evaluation of the log density.

Parameters:

x (EncodedDataSequence) – EncodedDataSequence for the corresponding SequenceEncodableProbabilityDistribution.

Return type:

ndarray

Returns:

np.ndarray
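As a rough illustration of the vectorized path, the sketch below encodes a list of observations once and then evaluates all log densities in a single NumPy expression. The classes `EncodedSketch`, `ExponentialEncoderSketch`, and `ExponentialSeqSketch` are hypothetical stand-ins defined locally; only the method names `dist_to_encoder`, `seq_encode`, `seq_log_density`, and `log_density` come from this page.

```python
import numpy as np


class EncodedSketch:
    """Hypothetical stand-in for EncodedDataSequence: it only stores the encoded data."""

    def __init__(self, data: np.ndarray):
        self.data = data


class ExponentialEncoderSketch:
    """Hypothetical stand-in for a DataSequenceEncoder."""

    def seq_encode(self, x) -> EncodedSketch:
        # Encoding here is just conversion to a float array.
        return EncodedSketch(np.asarray(x, dtype=float))


class ExponentialSeqSketch:
    def __init__(self, rate: float):
        self.rate = rate

    def log_density(self, x: float) -> float:
        return float(np.log(self.rate) - self.rate * x) if x >= 0 else float("-inf")

    def dist_to_encoder(self) -> ExponentialEncoderSketch:
        return ExponentialEncoderSketch()

    def seq_log_density(self, x: EncodedSketch) -> np.ndarray:
        xs = x.data
        out = np.full(xs.shape, -np.inf)   # log density is -inf outside the support
        valid = xs >= 0
        out[valid] = np.log(self.rate) - self.rate * xs[valid]
        return out


dist = ExponentialSeqSketch(rate=2.0)
observations = [0.0, 0.5, 3.0]
encoded = dist.dist_to_encoder().seq_encode(observations)
vectorized = dist.seq_log_density(encoded)
looped = np.array([dist.log_density(v) for v in observations])
```

In this sketch the vectorized result agrees element by element with calling `log_density` in a loop; the encoding step exists so that the conversion cost is paid once rather than on every evaluation.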

DistributionSampler

class dmx.stats.pdist.DistributionSampler(dist, seed=None)

DistributionSampler is an abstract class for distribution samplers.

dist

Distribution to sample from.

Type:

SequenceEncodableProbabilityDistribution

rng

Random number generator.

Type:

RandomState

__init__(dist, seed=None)

Initialize DistributionSampler.

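Parameters:
  • dist (SequenceEncodableProbabilityDistribution) – Distribution to sample from.

  • seed (Optional[int]) – Seed for the random number generator.

new_seed()

Generates a new seed from rng.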

Return type:

int

abstract sample(size=None)

Generate samples from distribution.

Parameters:

size (Optional[int]) – Number of samples to generate.

Return type:

Any

Returns:

Samples from distribution.
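A sampler of this shape can be sketched as follows. This is an illustration under stated assumptions, not the dmx-learn implementation: `SamplerSketch`, `CoinSketch`, and `CoinSamplerSketch` are hypothetical local classes, and the body of `new_seed` (drawing an integer from the sampler's own `RandomState`) is one plausible reading of "generates a new seed from rng".

```python
from abc import ABC, abstractmethod
from typing import Optional

import numpy as np


class CoinSketch:
    """Hypothetical distribution: a biased coin with success probability p."""

    def __init__(self, p: float):
        self.p = p


class SamplerSketch(ABC):
    """Local stand-in mirroring the documented attributes dist and rng."""

    def __init__(self, dist, seed: Optional[int] = None):
        self.dist = dist
        self.rng = np.random.RandomState(seed)

    def new_seed(self) -> int:
        # Assumption: a fresh integer seed is drawn from this sampler's rng.
        return int(self.rng.randint(0, 2**31 - 1))

    @abstractmethod
    def sample(self, size: Optional[int] = None): ...


class CoinSamplerSketch(SamplerSketch):
    def sample(self, size: Optional[int] = None):
        # size=None yields a single draw; an int yields an array of draws.
        return self.rng.binomial(1, self.dist.p, size=size)


coin = CoinSketch(p=0.3)
first = CoinSamplerSketch(coin, seed=7).sample(size=5)
second = CoinSamplerSketch(coin, seed=7).sample(size=5)
child_seed = CoinSamplerSketch(coin, seed=7).new_seed()
```

Two samplers built with the same seed produce the same draws. A method like `new_seed` could be used to seed the samplers of nested component distributions so that a single top-level seed makes the whole hierarchy reproducible.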

ConditionalSampler

class dmx.stats.pdist.ConditionalSampler

Abstract class for conditional samplers.

Note

This is only implemented for samplers of conditional distributions.

abstract sample_given(x)

Draw a sample conditioned on the given value.

Parameters:

x (Any) – Value to condition on when sampling from the distribution.

Returns:

Sample from conditional distribution.
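To make the idea of `sample_given` concrete, the sketch below samples an outcome y from a lookup table of conditional probabilities p(y | x). Everything other than the method name `sample_given` is hypothetical: `ConditionalSamplerSketch` is a local stand-in, and `TableConditionalSampler` and its `table` argument are invented for illustration.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ConditionalSamplerSketch(ABC):
    """Local stand-in mirroring the documented sample_given method."""

    @abstractmethod
    def sample_given(self, x: Any): ...


class TableConditionalSampler(ConditionalSamplerSketch):
    """Hypothetical sampler for p(y | x) stored as nested dictionaries."""

    def __init__(self, table: Dict[Any, Dict[Any, float]], seed: Optional[int] = None):
        self.table = table
        self.rng = random.Random(seed)

    def sample_given(self, x: Any):
        # Look up the row for the conditioning value, then draw one outcome.
        outcomes, probs = zip(*self.table[x].items())
        return self.rng.choices(outcomes, weights=probs, k=1)[0]


table = {
    "rain": {"umbrella": 0.9, "none": 0.1},
    "sun": {"none": 1.0},
}
sampler = TableConditionalSampler(table, seed=0)
given_rain = sampler.sample_given("rain")
given_sun = sampler.sample_given("sun")
```

The conditioning value selects which distribution is sampled, so `sample_given("sun")` can only return `"none"` in this table, while `sample_given("rain")` returns one of the two outcomes in its row.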

StatisticAccumulator

class dmx.stats.pdist.StatisticAccumulator

abstract combine(suff_stat)

Method for combining aggregated sufficient statistics.

Parameters:

suff_stat (SS) – Sufficient statistics.

Return type:

StatisticAccumulator

Returns:

StatisticAccumulator

abstract from_value(x)

Set sufficient statistics equal to passed value.

Parameters:

x (SS) – Generic sufficient statistic for instance of StatisticAccumulator.

Return type:

SequenceEncodableStatisticAccumulator

initialize(x, weight, rng)

Initialize sufficient statistics for a single data observation.

Note

Used for debugging only.

Parameters:
  • x (Any) – Data type corresponding to StatisticAccumulator object.

  • weight (float) – Weight associated with single observation.

  • rng (np.random.RandomState) – Random number generator used for initialization.

Return type:

None

abstract key_merge(stats_dict)

Merge sufficient statistics with matching keys.

Parameters:

stats_dict (Dict[str, Any]) – Dict mapping keys to sufficient statistic value or accumulator.

Return type:

None

abstract key_replace(stats_dict)

Set sufficient statistics of accumulator instance to keyed values.

Parameters:

stats_dict (Dict[str, Any]) – Dict mapping keys to sufficient statistic value or accumulator.

Return type:

None

update(x, weight, estimate)

Accumulate sufficient statistics for a single data observation.

Note

Used for debugging only.

Parameters:
  • x (Any) – Data type corresponding to StatisticAccumulator object.

  • weight (float) – Weight associated with single observation.

  • estimate (SequenceEncodableProbabilityDistribution) – Previous estimate of distribution.

Return type:

None

abstract value()

Return sufficient statistics of StatisticAccumulator.

Return type:

TypeVar(SS)
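The accumulator methods above can be illustrated with a small sketch that tracks a weighted sum and a weighted count. `MeanAccumulatorSketch` is hypothetical and defined locally; its `key` argument and the bodies of `key_merge` and `key_replace` are assumptions about how keyed sharing might work (accumulators with the same key pool their statistics), not a description of the dmx-learn internals.

```python
from typing import Any, Dict, Optional, Tuple


class MeanAccumulatorSketch:
    """Hypothetical accumulator tracking (weighted sum, weighted count)."""

    def __init__(self, key: Optional[str] = None):
        self.sum = 0.0
        self.count = 0.0
        self.key = key  # assumption: accumulators with equal keys share statistics

    def update(self, x: float, weight: float, estimate: Any = None) -> None:
        self.sum += weight * x
        self.count += weight

    def combine(self, suff_stat: Tuple[float, float]) -> "MeanAccumulatorSketch":
        # Add statistics aggregated elsewhere (for example on another data shard).
        self.sum += suff_stat[0]
        self.count += suff_stat[1]
        return self

    def value(self) -> Tuple[float, float]:
        return self.sum, self.count

    def from_value(self, x: Tuple[float, float]) -> "MeanAccumulatorSketch":
        self.sum, self.count = x
        return self

    def key_merge(self, stats_dict: Dict[str, Any]) -> None:
        if self.key is None:
            return
        if self.key in stats_dict:
            stats_dict[self.key].combine(self.value())
        else:
            stats_dict[self.key] = self

    def key_replace(self, stats_dict: Dict[str, Any]) -> None:
        if self.key is not None and self.key in stats_dict:
            self.from_value(stats_dict[self.key].value())


# Combine statistics gathered on two shards.
a, b = MeanAccumulatorSketch(), MeanAccumulatorSketch()
a.update(1.0, 1.0)
a.update(2.0, 1.0)
b.update(4.0, 0.5)
merged = a.combine(b.value()).value()

# Pool statistics of two accumulators that share a key.
c, d = MeanAccumulatorSketch(key="k"), MeanAccumulatorSketch(key="k")
c.update(1.0, 1.0)
d.update(3.0, 1.0)
stats: Dict[str, Any] = {}
c.key_merge(stats)
d.key_merge(stats)
d.key_replace(stats)
```

Because `combine` only needs the output of `value`, statistics can be aggregated independently on separate shards and merged afterwards. In the keyed part of the example, both accumulators end up holding the pooled statistics.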

SequenceEncodableStatisticAccumulator

class dmx.stats.pdist.SequenceEncodableStatisticAccumulator

abstract acc_to_encoder()

Create DataSequenceEncoder object for SequenceEncodableStatisticAccumulator instance.

Return type:

DataSequenceEncoder

abstract seq_initialize(x, weights, rng)

Vectorized initialization of sufficient statistics.

Parameters:
  • x (EncodedDataSequence) – EncodedDataSequence for given SequenceEncodableStatisticAccumulator type.

  • weights (np.ndarray) – Weights for observations.

  • rng (np.random.RandomState) – RandomState object for setting seed on initialization.

Return type:

None

abstract seq_update(x, weights, estimate)

Vectorized accumulation of sufficient statistics for EM updates.

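Parameters:
  • x (EncodedDataSequence) – EncodedDataSequence for given SequenceEncodableStatisticAccumulator type.

  • weights (np.ndarray) – Weights for observations.

  • estimate (SequenceEncodableProbabilityDistribution) – Previous estimate of distribution.

Return type: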

None
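The relationship between the per-observation `update` and the vectorized `seq_update` can be sketched as below. `VecMeanAccumulatorSketch` is a hypothetical local class, and the encoded input is assumed to be a plain float array; in an EM step the weights would typically be posterior responsibilities, which are simply fixed numbers here.

```python
import numpy as np


class VecMeanAccumulatorSketch:
    """Hypothetical accumulator tracking (weighted sum, weighted count)."""

    def __init__(self):
        self.sum = 0.0
        self.count = 0.0

    def update(self, x, weight, estimate=None):
        # One observation at a time.
        self.sum += weight * x
        self.count += weight

    def seq_update(self, x, weights, estimate=None):
        # Assumption: x is an already-encoded float array aligned with weights.
        self.sum += float(np.dot(weights, x))
        self.count += float(np.sum(weights))

    def seq_initialize(self, x, weights, rng):
        # This sketch has no random component, so initialization is a plain update.
        self.seq_update(x, weights, None)


x = np.array([1.0, 2.0, 4.0])
weights = np.array([0.2, 0.5, 1.0])

looped = VecMeanAccumulatorSketch()
for xi, wi in zip(x, weights):
    looped.update(xi, wi)

vectorized = VecMeanAccumulatorSketch()
vectorized.seq_update(x, weights)
```

Both accumulators hold the same statistics up to floating-point rounding; the vectorized form replaces the Python loop with one dot product and one sum.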

ParameterEstimator

class dmx.stats.pdist.ParameterEstimator(*args)

Abstract class for ParameterEstimator object.

abstract __init__(*args)

Subclasses must implement the ParameterEstimator constructor.

abstract accumulator_factory()

Create a StatisticAccumulatorFactory that produces SequenceEncodableStatisticAccumulator objects.

Return type:

StatisticAccumulatorFactory

abstract estimate(nobs, suff_stat)

Estimate a SequenceEncodableProbabilityDistribution from sufficient statistics.

Parameters:
  • nobs (Optional[float]) – Weighted number of observations.

  • suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics for Dirichlet distribution.

Return type:

SequenceEncodableProbabilityDistribution

Returns:

SequenceEncodableProbabilityDistribution
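A minimal sketch of the estimation step follows, using an exponential distribution whose sufficient statistics are assumed to be the pair (weighted sum, weighted count). `ExponentialEstimatorSketch`, `ExponentialFitSketch`, and the `prior_mean` argument are hypothetical, and the way `pseudo_count` enters (as extra pseudo-observations with mean `prior_mean`) is just one possible regularization scheme. `accumulator_factory` is omitted to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ExponentialFitSketch:
    """Hypothetical result object standing in for an estimated distribution."""
    rate: float


class ExponentialEstimatorSketch:
    def __init__(self, pseudo_count: Optional[float] = None, prior_mean: float = 1.0):
        self.pseudo_count = pseudo_count
        self.prior_mean = prior_mean  # hypothetical: mean of the pseudo-observations

    def estimate(self, nobs: Optional[float], suff_stat: Tuple[float, float]) -> ExponentialFitSketch:
        # nobs is unused here because the weighted count is part of suff_stat.
        total, count = suff_stat
        if self.pseudo_count is not None:
            # Assumed regularization: add pseudo_count observations at prior_mean.
            total += self.pseudo_count * self.prior_mean
            count += self.pseudo_count
        # Maximum-likelihood rate of an exponential: count / sum of observations.
        return ExponentialFitSketch(rate=count / total)


suff_stat = (4.0, 2.0)  # two observations summing to 4.0
plain = ExponentialEstimatorSketch().estimate(2.0, suff_stat)
smoothed = ExponentialEstimatorSketch(pseudo_count=2.0, prior_mean=1.0).estimate(2.0, suff_stat)
```

Without regularization the estimated rate is 2 / 4 = 0.5. With two pseudo-observations of mean 1.0 the statistics become (6.0, 4.0), which pulls the rate toward the prior value, giving 4 / 6.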

DataSequenceEncoder

class dmx.stats.pdist.DataSequenceEncoder

abstract seq_encode(x)

Create an EncodedDataSequence from iid observations of the corresponding SequenceEncodableProbabilityDistribution.

Parameters:

x (Any) – Sequence of observations from corresponding distribution.

Return type:

EncodedDataSequence

Returns:

EncodedDataSequence

EncodedDataSequence

class dmx.stats.pdist.EncodedDataSequence(data)

EncodedDataSequence is the data structure output by DataSequenceEncoder. The object is used for vectorized functions and type checks.

__init__(data)

Create instance of EncodedDataSequence.

Parameters:

data (Any) – Stores the data encoded for vectorized calls.
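The two roles mentioned above, carrying encoded data for vectorized functions and enabling type checks, can be sketched as follows. All class and function names in the block are hypothetical local stand-ins; the encoding shown (string labels mapped to integer codes with `np.unique`) is only an example of what an encoder might precompute.

```python
import numpy as np


class EncodedSketch:
    """Hypothetical stand-in for EncodedDataSequence: a thin wrapper around data."""

    def __init__(self, data):
        self.data = data


class CategoricalEncodedSketch(EncodedSketch):
    """Marker subclass so consumers can verify the encoding they were given."""


class CategoricalEncoderSketch:
    """Hypothetical stand-in for a DataSequenceEncoder."""

    def seq_encode(self, x) -> CategoricalEncodedSketch:
        # Map each label to an integer code once, up front.
        labels, codes = np.unique(np.asarray(x), return_inverse=True)
        return CategoricalEncodedSketch((labels, codes))


def count_labels(enc: EncodedSketch) -> dict:
    # Type check: refuse data that was encoded for a different distribution.
    if not isinstance(enc, CategoricalEncodedSketch):
        raise TypeError("expected a CategoricalEncodedSketch")
    labels, codes = enc.data
    return dict(zip(labels.tolist(), np.bincount(codes).tolist()))


encoded = CategoricalEncoderSketch().seq_encode(["b", "a", "b"])
counts = count_labels(encoded)
```

Passing a wrapper of the wrong type to `count_labels` raises `TypeError` instead of silently misreading the data, which is the kind of guard the type check makes possible.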