Spike and Slab Distribution

Data Type: int

The spike and slab distribution places probability p (the spike) on a single integer k in the range [min_val, max_val]. The remaining n-1 integers in the range share the leftover probability 1-p uniformly (the slab). It suits integer-valued data in which one value occurs far more often than the others in the range.

\[\begin{split}f(x|k, p) = \left\{ \begin{array}{ll} p, & x=k \\ \frac{1-p}{n-1}, & x \neq k \;\& \; x \in [min\_val, max\_val] \\ 0, & else \end{array} \right.\end{split}\]

Here n is the number of integers in [min_val, max_val], that is, n = max_val - min_val + 1. It corresponds to num_vals in the class below.
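The piecewise definition can be checked numerically with a few lines of plain Python. This is a minimal sketch written for illustration; spike_and_slab_pmf is a hypothetical standalone function, not part of dmx.stats.

```python
def spike_and_slab_pmf(x, k, p, min_val, max_val):
    """Probability mass of integer x under the piecewise definition above."""
    n = max_val - min_val + 1  # number of integers in [min_val, max_val]
    if x < min_val or x > max_val:
        return 0.0  # outside the range
    if x == k:
        return p  # the spike
    return (1.0 - p) / (n - 1)  # the slab: 1-p shared by the other n-1 values


# Spike of 0.7 on k=2 over the range [0, 4]; x runs from -1 to 5 so that
# one value on each side of the range is included.
probs = [spike_and_slab_pmf(x, k=2, p=0.7, min_val=0, max_val=4) for x in range(-1, 6)]
```

The values in probs sum to 1 up to floating-point rounding, since the four slab values carry 0.3 between them.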

SpikeAndSlabDistribution

class dmx.stats.int_spike.SpikeAndSlabDistribution(k, num_vals, p, min_val=0, name=None, keys=None)

SpikeAndSlabDistribution object for creating a uniform integer distribution with a spike on k.

p

Probability of drawing k.

Type:

float

min_val

Lower bound for the range.

Type:

int

max_val

Max value for the range.

Type:

int

k

Integer to place the spike on.

Type:

int

log_p

Log of p.

Type:

float

log_1p

Log of 1-p.

Type:

float

num_vals

Total number of integers in range.

Type:

int

name

Name for object instance.

Type:

Optional[str]

keys

Key for parameters.

Type:

Optional[str]

__init__(k, num_vals, p, min_val=0, name=None, keys=None)

SpikeAndSlabDistribution object.

Parameters:
  • k (int) – Integer value to place the spike on. Must lie in [min_val, min_val + num_vals).

  • num_vals (int) – Number of integers in the range.

  • p (float) – Probability of drawing k. Every other integer in the range is drawn with probability (1-p)/(num_vals-1).

  • min_val (Optional[int]) – Lowest integer in the range. Defaults to 0.

  • name (Optional[str]) – Set name for object.

  • keys (Optional[str]) – Key for parameters.
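The constraints on the constructor arguments and the derived attributes (max_val, log_p, log_1p) can be sketched as follows. This is an illustration only: derive_attributes is a hypothetical helper, and the requirements num_vals >= 2 and 0 < p < 1 are assumptions made here so that the slab probability and both logarithms are finite. The library's own validation may differ.

```python
import math


def derive_attributes(k, num_vals, p, min_val=0):
    """Hypothetical helper: check the arguments and compute derived values."""
    if num_vals < 2:
        # Assumption: (1-p)/(num_vals-1) needs at least one non-spike value.
        raise ValueError("num_vals must be at least 2")
    if not (min_val <= k < min_val + num_vals):
        raise ValueError("k must lie in [min_val, min_val + num_vals)")
    if not (0.0 < p < 1.0):
        # Assumption of this sketch: keeps log(p) and log(1-p) finite.
        raise ValueError("p must satisfy 0 < p < 1")
    return {
        "max_val": min_val + num_vals - 1,  # top of the integer range
        "log_p": math.log(p),
        "log_1p": math.log(1.0 - p),
    }


attrs = derive_attributes(k=5, num_vals=4, p=0.25, min_val=3)
```

With min_val=3 and num_vals=4 the range is [3, 6], so attrs["max_val"] is 6.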

dist_to_encoder()

Create DataSequenceEncoder object for SequenceEncodableProbabilityDistribution instance.

Return type:

SpikeAndSlabDataEncoder

Returns:

DataSequenceEncoder

estimator(pseudo_count=None)

Create a ParameterEstimator for corresponding SequenceEncodableProbabilityDistribution.

Parameters:

pseudo_count (Optional[float]) – Regularize sufficient statistics in estimation step.

Return type:

SpikeAndSlabEstimator

Returns:

ParameterEstimator

log_density(x)

Evaluate the log-density of the distribution at x.

Return type:

float

Returns:

float
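For a single integer, the log-density follows directly from the piecewise definition of f(x|k, p). The sketch below is a standalone function written for illustration, not the library's implementation. Returning negative infinity outside the range is an assumption that follows from the density being 0 there.

```python
import math


def log_density(x, k, num_vals, p, min_val=0):
    """Log of the spike and slab mass at integer x (illustrative sketch)."""
    max_val = min_val + num_vals - 1
    if x < min_val or x > max_val:
        return -math.inf  # zero probability outside the range
    if x == k:
        return math.log(p)  # log_p
    # log_1p minus the log of the number of non-spike values
    return math.log(1.0 - p) - math.log(num_vals - 1)


spike = log_density(2, k=2, num_vals=5, p=0.7)
slab = log_density(4, k=2, num_vals=5, p=0.7)
outside = log_density(9, k=2, num_vals=5, p=0.7)
```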

sampler(seed=None)

Create a DistributionSampler object for a given ProbabilityDistribution.

Parameters:

seed (Optional[int]) – Set seed for drawing samples from distribution.

Return type:

SpikeAndSlabSampler

seq_log_density(x)

Vectorized evaluation of the log density.

Parameters:

x (EncodedDataSequence) – EncodedDataSequence for corresponding SequenceEncodedProbabilityDistribution.

Return type:

ndarray

Returns:

np.ndarray
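The vectorized evaluation can be sketched with NumPy masks. For simplicity this hypothetical function takes a plain integer array rather than an EncodedDataSequence, whose internal layout is not described here.

```python
import numpy as np


def seq_log_density(x, k, num_vals, p, min_val=0):
    """Log-density for every entry of an integer array (illustrative sketch)."""
    x = np.asarray(x)
    max_val = min_val + num_vals - 1
    # Start from the slab value everywhere, then overwrite the special cases.
    out = np.full(x.shape, np.log1p(-p) - np.log(num_vals - 1), dtype=float)
    out[x == k] = np.log(p)
    out[(x < min_val) | (x > max_val)] = -np.inf
    return out


values = seq_log_density(np.array([2, 0, 4, 7]), k=2, num_vals=5, p=0.7)
```

Exponentiating values recovers the probabilities 0.7, 0.075, 0.075 and 0 for the four inputs.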

SpikeAndSlabEstimator

class dmx.stats.int_spike.SpikeAndSlabEstimator(min_val=None, max_val=None, pseudo_count=None, suff_stat=None, name=None, keys=None)

SpikeAndSlabEstimator object instance for estimating SpikeAndSlabDistribution objects.

pseudo_count

Regularize value k.

Type:

Optional[float]

min_val

Smallest integer value in the range. Defaults to 0.

Type:

int

max_val

Largest integer value in the range: min_val plus the number of values, minus 1.

Type:

int

suff_stat

Tuple of k to regularize and optional value of p for k.

Type:

Optional[Tuple[int, Optional[float]]]

name

Set name for object instance.

Type:

Optional[str]

keys

Set keys for object instance.

Type:

Optional[str]

__init__(min_val=None, max_val=None, pseudo_count=None, suff_stat=None, name=None, keys=None)

SpikeAndSlabEstimator object.

Parameters:
  • min_val (Optional[int]) – Smallest integer value in the range.

  • max_val (Optional[int]) – Largest integer value in the range.

  • pseudo_count (Optional[float]) – Regularize value k.

  • suff_stat (Optional[Tuple[int, Optional[float]]]) – Tuple of k to regularize and optional value of p for k.

  • name (Optional[str]) – Set name for object instance.

  • keys (Optional[str]) – Set keys for object instance.

accumulator_factory()

Create SequenceEncodableStatisticAccumulator object.

Return type:

SpikeAndSlabAccumulatorFactory

estimate(nobs, suff_stat)

Estimate SequenceEncodableProbabilityDistribution for sufficient statistics.

Parameters:
  • nobs (Optional[float]) – Weighted number of observations.

  • suff_stat (Tuple[int, np.ndarray, np.ndarray, np.ndarray]) – Sufficient statistics for the spike and slab distribution.

Return type:

SpikeAndSlabDistribution

Returns:

SequenceEncodableProbabilityDistribution

SpikeAndSlabSampler

class dmx.stats.int_spike.SpikeAndSlabSampler(dist, seed=None)

SpikeAndSlabSampler object for sampling from spike and slab distribution on integers.

rng

RandomState for seeding samples.

Type:

RandomState

dist

SpikeAndSlabDistribution to sample from.

Type:

SpikeAndSlabDistribution

non_k

All integers in the range other than the spiked value ‘k’.

Type:

np.ndarray

sample(size=None)

Generate samples from distribution.

Parameters:

size (Optional[int]) – Number of samples to generate.

Return type:

Union[int, array]

Returns:

Samples from distribution.
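Sampling can be sketched in two steps: decide for each draw whether it hits the spike, then fill the remaining draws uniformly from the non-spike integers, which is the role the non_k attribute suggests. The function below is a hypothetical standalone illustration built on numpy.random.RandomState, not the library's sampler.

```python
import numpy as np


def sample_spike_and_slab(k, num_vals, p, min_val=0, size=None, seed=None):
    """Draw from a spike and slab distribution on integers (sketch)."""
    rng = np.random.RandomState(seed)
    support = np.arange(min_val, min_val + num_vals)
    non_k = support[support != k]  # every integer in range except the spike
    n = 1 if size is None else size
    hit_spike = rng.rand(n) < p  # True with probability p
    # Use k where the spike was hit, otherwise a uniform draw from non_k.
    draws = np.where(hit_spike, k, rng.choice(non_k, size=n))
    return int(draws[0]) if size is None else draws


samples = sample_spike_and_slab(k=2, num_vals=5, p=0.7, size=10000, seed=0)
```

With size=None the function returns a single int, and otherwise an array, mirroring the Union[int, array] return type above. The share of samples equal to 2 should be close to 0.7.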