API Reference

Base Classes

class normix.exponential_family.ExponentialFamily[source]

Bases: Module

Abstract base class for exponential family distributions.

Concrete subclasses must implement:

_log_partition_from_theta, natural_params, sufficient_statistics, log_base_measure
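As a standalone illustration of these four pieces (plain functions, not the normix classes), the Exponential(λ) distribution fits the template with θ = −λ, t(x) = x, h(x) = 1, ψ(θ) = −log(−θ); the identity log p(x|θ) = log h(x) + θᵀt(x) − ψ(θ) then reproduces the familiar density:

```python
import numpy as np
from scipy import stats

# Hypothetical standalone sketch of the four pieces a subclass supplies,
# using Exponential(rate=lam): theta = -lam, t(x) = x, h(x) = 1,
# psi(theta) = -log(-theta).
def natural_params(lam):
    return np.array([-lam])

def sufficient_statistics(x):
    return np.array([x])

def log_base_measure(x):
    return 0.0

def log_partition(theta):
    return -np.log(-theta[0])

def log_prob(x, lam):
    theta = natural_params(lam)
    return log_base_measure(x) + theta @ sufficient_statistics(x) - log_partition(theta)

# log p(x) = log(lam) - lam * x, matching scipy's expon logpdf
x, lam = 1.3, 2.0
assert np.isclose(log_prob(x, lam), stats.expon(scale=1 / lam).logpdf(x))
```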

abstractmethod natural_params()[source]

θ from stored classical parameters.

Return type:

Array

abstractmethod static sufficient_statistics(x)[source]

t(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

abstractmethod static log_base_measure(x)[source]

log h(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

log_partition()[source]

ψ(θ) at current parameters.

Return type:

Array

expectation_params(backend='jax')[source]

η = ∇ψ(θ).

Parameters:

backend ('jax' (default, JIT-able) or 'cpu' (numpy/scipy))

Return type:

Array

fisher_information(backend='jax')[source]

I(θ) = ∇²ψ(θ).

Parameters:

backend ('jax' (default, JIT-able) or 'cpu' (numpy/scipy))

Return type:

Array

log_prob(x)[source]

log p(x|θ) = log h(x) + θᵀt(x) − ψ(θ), single observation.

Parameters:

x (Array)

Return type:

Array

pdf(x)[source]

p(x|θ), single observation. Batch via jax.vmap.

Parameters:

x (Array)

Return type:

Array
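The single-observation convention means batching is always explicit. A minimal sketch of the `jax.vmap` pattern, using a stand-in standard-normal log-density rather than a normix class:

```python
import jax
import jax.numpy as jnp

# pdf / log_prob operate on one observation; batching is done with jax.vmap.
# Stand-in single-observation log-density: standard normal.
def log_prob(x):
    return -0.5 * (x**2 + jnp.log(2 * jnp.pi))

X = jnp.array([-1.0, 0.0, 2.5])
batched = jax.vmap(log_prob)(X)   # one log-density per entry of X
assert batched.shape == (3,)
```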

mean()[source]

E[X]. Subclasses should override with analytical formulas.

Return type:

Array

var()[source]

Var[X]. Subclasses should override with analytical formulas.

Return type:

Array

std()[source]

Std[X] = √Var[X].

Return type:

Array

cdf(x)[source]

CDF F(x). Subclasses should override with analytical formulas.

Parameters:

x (Array)

Return type:

Array

rvs(n, seed=42)[source]

Sample n observations. Uses numpy/scipy (not JIT-able).

Parameters:
  • n (int — number of samples)

  • seed (int — RNG seed)

Return type:

ndarray

classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

ExponentialFamily

classmethod bregman_divergence(theta, eta)[source]

Bregman divergence ψ(θ) − θ·η (conjugate dual).

Minimising over θ yields ∇ψ(θ*) = η, i.e. the natural parameters corresponding to expectation parameters η.

Parameters:
  • theta (Array)

  • eta (Array)
Return type:

Array

classmethod from_expectation(eta, *, theta0=None, maxiter=500, tol=1e-10, backend='jax', method='lbfgs', verbose=0)[source]

Construct from expectation parameters η by solving ∇ψ(θ) = η.

Minimises the Bregman divergence ψ(θ) − θ·η via solve_bregman. Subclasses can override for closed-form inverses.

Parameters:
  • backend ('jax' (default, JIT-able) or 'cpu' (scipy, more robust))

  • method ('lbfgs' (default), 'bfgs', or 'newton')

  • verbose (0 = silent, >= 1 = print solver summary)

  • eta (Array)

  • theta0 (Array | None)

  • maxiter (int)

  • tol (float)

Return type:

ExponentialFamily

classmethod fit_mle(X, *, theta0=None, maxiter=500, tol=1e-10, verbose=0)[source]

MLE via exponential family identity: η̂ = mean_i t(xᵢ).

Batches over X using jax.vmap, then calls from_expectation(η̂).

Parameters:
  • X ((n, ...) array of observations)

  • theta0 (optional initial natural parameters θ₀ for the η→θ solver)

  • maxiter (maximum iterations for the η→θ solver)

  • tol (convergence tolerance for the η→θ solver)

  • verbose (0 = silent, >= 1 = print solver summary)

Return type:

ExponentialFamily
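The identity can be checked end to end with plain numpy/scipy for the Gamma family (a standalone re-derivation, not the normix code path): η̂ = [mean(log x), mean(x)] pins down the MLE through digamma(α) − log α = mean(log x) − log(mean x) and β = α/mean(x).

```python
import numpy as np
from scipy import stats, special, optimize

# eta_hat = mean_i t(x_i) determines the Gamma MLE.
rng = np.random.default_rng(0)
X = rng.gamma(shape=3.0, scale=1 / 2.0, size=50_000)   # true alpha=3, beta=2

rhs = np.mean(np.log(X)) - np.log(np.mean(X))
alpha = optimize.brentq(lambda a: special.digamma(a) - np.log(a) - rhs, 1e-3, 1e3)
beta = alpha / np.mean(X)

# Agrees with scipy's direct numerical MLE (loc fixed at 0)
a_scipy, _, scale_scipy = stats.gamma.fit(X, floc=0)
assert abs(alpha - a_scipy) < 1e-2 and abs(beta - 1 / scale_scipy) < 1e-2
```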

fit(X, *, maxiter=500, tol=1e-10, verbose=0, **kwargs)[source]

Fit using self as initialization (warm start).

Computes η̂ = mean_i t(xᵢ) and solves from_expectation(η̂) using self.natural_params() as the initial theta0.

Parameters:
  • X ((n, ...) array of observations)

  • maxiter (maximum iterations for the η→θ solver)

  • tol (convergence tolerance for the η→θ solver)

  • verbose (0 = silent, >= 1 = print solver summary)

Return type:

ExponentialFamily

classmethod default_init(X)[source]

Moment-based initialisation from data.

Computes η̂ = mean_i t(xᵢ) and inverts to get an initial model. For distributions with closed-form from_expectation (Gamma, InverseGamma, InverseGaussian), this gives the MLE directly.

Parameters:

X (Array)

Return type:

ExponentialFamily

Univariate Distributions

Gamma

Gamma distribution as an exponential family.

PDF: p(x|α,β) = β^α/Γ(α) · x^{α-1} · exp(-βx), x > 0

Exponential family:

h(x) = 1
t(x) = [log x, x]
θ = [α-1, -β]   (θ₁ > -1, θ₂ < 0)
ψ(θ) = log Γ(θ₁+1) − (θ₁+1) log(−θ₂)
η = [ψ(α) − log β, α/β]   (digamma, mean)
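A quick numerical check of this parametrisation against scipy (h(x) = 1, so log p = θᵀt(x) − ψ(θ)):

```python
import numpy as np
from scipy import stats, special

# theta = [alpha-1, -beta], psi = logGamma(theta1+1) - (theta1+1)*log(-theta2).
alpha, beta, x = 2.5, 1.7, 0.8
theta = np.array([alpha - 1, -beta])
t_x = np.array([np.log(x), x])
psi = special.gammaln(theta[0] + 1) - (theta[0] + 1) * np.log(-theta[1])

log_p = theta @ t_x - psi              # h(x) = 1, so log h = 0
assert np.isclose(log_p, stats.gamma(a=alpha, scale=1 / beta).logpdf(x))
```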

class normix.distributions.gamma.Gamma(alpha, beta)[source]

Bases: ExponentialFamily

Gamma(α, β) distribution — shape α > 0, rate β > 0.

Parameters:
alpha: Array
beta: Array
natural_params()[source]

θ from stored classical parameters.

Return type:

Array

static sufficient_statistics(x)[source]

t(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

static log_base_measure(x)[source]

log h(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

mean()[source]

E[X]. Subclasses should override with analytical formulas.

Return type:

Array

var()[source]

Var[X]. Subclasses should override with analytical formulas.

Return type:

Array

cdf(x)[source]

CDF F(x). Subclasses should override with analytical formulas.

Parameters:

x (Array)

Return type:

Array

rvs(n, seed=42)[source]

Sample n observations. Uses numpy/scipy (not JIT-able).

Parameters:
  • n (int)

  • seed (int)

Return type:

ndarray
classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

Gamma

classmethod from_expectation(eta, *, theta0=None, maxiter=100, tol=1e-12, **kwargs)[source]

Closed-form η → θ via Newton’s method on ψ(α) − log(α) = η₁ − log(η₂).

η = [E[log X], E[X]] → α from digamma inversion, β = α/η₂.

Parameters:
  • eta (Array)

  • theta0 (Array | None)

  • maxiter (int)

  • tol (float)
Return type:

Gamma
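A sketch of that Newton iteration in plain numpy/scipy (hypothetical helper mirroring, but not calling, the normix implementation):

```python
import numpy as np
from scipy import special

# Newton on g(alpha) = digamma(alpha) - log(alpha) - (eta1 - log(eta2)) = 0,
# then beta = alpha / eta2.
def gamma_from_expectation(eta1, eta2, alpha0=1.0, maxiter=100, tol=1e-12):
    rhs = eta1 - np.log(eta2)
    alpha = alpha0
    for _ in range(maxiter):
        g = special.digamma(alpha) - np.log(alpha) - rhs
        g_prime = special.polygamma(1, alpha) - 1 / alpha   # trigamma - 1/alpha > 0
        step = g / g_prime
        alpha = alpha - step
        if abs(step) < tol:
            break
    return alpha, alpha / eta2

# Round trip: eta for Gamma(3, 2) is [digamma(3) - log 2, 3/2]
eta1, eta2 = special.digamma(3.0) - np.log(2.0), 1.5
alpha, beta = gamma_from_expectation(eta1, eta2)
assert abs(alpha - 3.0) < 1e-8 and abs(beta - 2.0) < 1e-8
```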

Inverse Gamma

InverseGamma distribution as an exponential family.

PDF: p(x|α,β) = β^α/Γ(α) · x^{-α-1} · exp(-β/x), x > 0

Exponential family:

h(x) = 1
t(x) = [-1/x, log x]
θ = [β, -(α+1)]   (θ₁ > 0, θ₂ < -1)
ψ(θ) = log Γ(−θ₂−1) − (−θ₂−1) log(θ₁) = log Γ(α) − α log β
η = [-α/β, log β − ψ(α)]   (E[-1/X], E[log X])

class normix.distributions.inverse_gamma.InverseGamma(alpha, beta)[source]

Bases: ExponentialFamily

InverseGamma(α, β) — shape α > 0, rate β > 0.

Parameters:
alpha: Array
beta: Array
natural_params()[source]

θ from stored classical parameters.

Return type:

Array

static sufficient_statistics(x)[source]

t(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

static log_base_measure(x)[source]

log h(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

mean()[source]

E[X]. Subclasses should override with analytical formulas.

Return type:

Array

var()[source]

Var[X]. Subclasses should override with analytical formulas.

Return type:

Array

cdf(x)[source]

CDF F(x). Subclasses should override with analytical formulas.

Parameters:

x (Array)

Return type:

Array

rvs(n, seed=42)[source]

Sample n observations. Uses numpy/scipy (not JIT-able).

Parameters:
  • n (int)

  • seed (int)

Return type:

ndarray
classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

InverseGamma

classmethod from_expectation(eta, *, theta0=None, maxiter=100, tol=1e-12, **kwargs)[source]

η = [-α/β, log β − ψ(α)].

β = α/(-η₁), solve ψ(α) − log α = −η₂ − log(-η₁) via Newton.

Parameters:
  • eta (Array)

  • theta0 (Array | None)

  • maxiter (int)

  • tol (float)
Return type:

InverseGamma

Inverse Gaussian

Inverse Gaussian (Wald) distribution as an exponential family.

PDF: f(x|μ,λ) = √(λ/(2π)) · x^{-3/2} · exp(−λ(x−μ)²/(2μ²x)), x > 0

Exponential family:

h(x) = (2π)^{-1/2} · x^{-3/2}
t(x) = [x, 1/x]
θ = [−λ/(2μ²), −λ/2]   (θ₁ < 0, θ₂ < 0)
ψ(θ) = −½log(−2θ₂) − √((−2θ₁)(−2θ₂)) = −½log λ − λ/μ

(the ½log(2π) is absorbed into log h(x) = −½log(2π) − (3/2) log x)

η = [E[X], E[1/X]] = [μ, 1/μ + 1/λ]

class normix.distributions.inverse_gaussian.InverseGaussian(mu, lam)[source]

Bases: ExponentialFamily

InverseGaussian(μ, λ) — mean μ > 0, shape λ > 0.

Parameters:
mu: Array
lam: Array
natural_params()[source]

θ from stored classical parameters.

Return type:

Array

static sufficient_statistics(x)[source]

t(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

static log_base_measure(x)[source]

log h(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

mean()[source]

E[X]. Subclasses should override with analytical formulas.

Return type:

Array

var()[source]

Var[X]. Subclasses should override with analytical formulas.

Return type:

Array

cdf(x)[source]

CDF F(x). Subclasses should override with analytical formulas.

Parameters:

x (Array)

Return type:

Array

rvs(n, seed=42)[source]

Sample n observations. Uses numpy/scipy (not JIT-able).

Parameters:
  • n (int)

  • seed (int)

Return type:

ndarray

classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

InverseGaussian

classmethod from_expectation(eta, *, theta0=None, maxiter=100, tol=1e-12, **kwargs)[source]
Closed-form from η = [E[X], E[1/X]] = [μ, 1/μ + 1/λ]:

μ = η₁, 1/λ = η₂ - 1/η₁ → λ = 1/(η₂ - 1/η₁)

Parameters:
  • eta (Array)

  • theta0 (Array | None)

  • maxiter (int)

  • tol (float)
Return type:

InverseGaussian
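This closed form is easy to sanity-check on simulated data. Note the assumed scipy mapping: `stats.invgauss(mu, scale)` has mean = mu·scale and λ = scale, which the assertions verify indirectly:

```python
import numpy as np
from scipy import stats

# Closed-form eta -> (mu, lam): mu = eta1, lam = 1 / (eta2 - 1/eta1).
rng = np.random.default_rng(1)
mu_true, lam_true = 2.0, 5.0
X = stats.invgauss(mu_true / lam_true, scale=lam_true).rvs(100_000, random_state=rng)

eta1, eta2 = X.mean(), (1 / X).mean()   # [E[X], E[1/X]]
mu_hat = eta1
lam_hat = 1 / (eta2 - 1 / eta1)
assert abs(mu_hat - mu_true) < 0.05 and abs(lam_hat - lam_true) < 0.2
```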

Generalized Inverse Gaussian

Generalized Inverse Gaussian (GIG) distribution as an exponential family.

PDF: f(x|p,a,b) = (a/b)^{p/2} / (2 K_p(√(ab))) · x^{p-1} · exp(-(ax+b/x)/2)

Exponential family:

h(x) = 1
t(x) = [log x, 1/x, x]
θ = [p-1, -b/2, -a/2]   (θ₂ ≤ 0, θ₃ ≤ 0)
ψ(θ) = log 2 + log K_p(√(ab)) + (p/2) log(b/a)

where p = θ₁+1, a = -2θ₃, b = -2θ₂

η = [E[log X], E[1/X], E[X]]

Special cases:

  • b → 0, p > 0: GIG → Gamma(p, a/2)

  • a → 0, p < 0: GIG → InverseGamma(-p, b/2)

  • p = -1/2: GIG → InverseGaussian
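The p = −1/2 case can be checked numerically against scipy. The assumed convention mapping (which the assertion verifies) is that `stats.geninvgauss(p, √(ab), scale=√(b/a))` equals GIG(p, a, b) in this document's parametrisation:

```python
import numpy as np
from scipy import stats

# GIG(-1/2, a, b) equals InverseGaussian(mu, lam) with a = lam/mu**2, b = lam.
mu, lam = 2.0, 5.0
a, b = lam / mu**2, lam
gig = stats.geninvgauss(-0.5, np.sqrt(a * b), scale=np.sqrt(b / a))
ig = stats.invgauss(mu / lam, scale=lam)

x = np.array([0.5, 1.0, 3.0])
assert np.allclose(gig.logpdf(x), ig.logpdf(x))
```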

η→θ optimization uses η-rescaling to reduce the Fisher condition number:

s = √(η₂/η₃)
η̃ = (η₁ + ½log(η₂/η₃), √(η₂η₃), √(η₂η₃))

Solve η̃→θ̃ with the symmetric GIG (ã = b̃), then unscale.

Log-Partition Triad Overrides

_log_partition_from_theta : JAX, uses log_kv(backend='jax')
_grad_log_partition : inherits default jax.grad (custom JVP on log_kv)
_hessian_log_partition : analytical 7-Bessel Hessian in θ-space
_log_partition_cpu : numpy + log_kv(backend='cpu')
_grad_log_partition_cpu : analytical Bessel ratios via scipy.kve
_hessian_log_partition_cpu : central FD on _log_partition_cpu

class normix.distributions.generalized_inverse_gaussian.GeneralizedInverseGaussian(p, a, b)[source]

Bases: ExponentialFamily

Generalized Inverse Gaussian distribution.

Stored: p (shape, any real), a > 0, b > 0.

Parameters:
p: Array
a: Array
b: Array
natural_params()[source]

θ from stored classical parameters.

Return type:

Array

static sufficient_statistics(x)[source]

t(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

static log_base_measure(x)[source]

log h(x) for a single unbatched observation.

Parameters:

x (Array)

Return type:

Array

static expectation_params_batch(p, a, b, backend='jax')[source]

Vectorized η for arrays of (p, a, b), each shape (N,). Returns (N, 3) array where columns are [E_log_X, E_inv_X, E_X].

backend='jax' : vmap over scalar JAX grad
backend='cpu' : vectorized scipy.kve (6 C-level array calls)

Parameters:

backend (str)

Return type:

Array

mean()[source]

E[X] = η₃ from expectation parameters.

Return type:

Array

var()[source]

Var[X] = ∂²ψ/∂θ₃² = Fisher information [2,2].

Return type:

Array

rvs(n, seed=42)[source]

Sample n observations. Uses numpy/scipy (not JIT-able).

Parameters:
  • n (int)

  • seed (int)
Return type:

ndarray

classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

GeneralizedInverseGaussian

classmethod from_expectation(eta, *, theta0=None, maxiter=500, tol=1e-10, backend='jax', method='newton', verbose=0)[source]

η → θ via η-rescaling + optimization.

Rescaling makes the Fisher matrix symmetric (ã = b̃), reducing condition number by up to \(10^{30}\) for extreme a/b ratios.

Parameters:
  • theta0 (warm-start point; required for the JAX solvers) – if None, falls back to the multi-start CPU solver with Gamma/InvGamma/InvGauss seeds

  • backend ('jax' (default, JIT-able) or 'cpu' (scipy, no JAX dispatch))

  • method ('newton', 'lbfgs', 'bfgs')

  • eta (Array)

  • maxiter (int)

  • tol (float)

  • verbose (int)

Return type:

GeneralizedInverseGaussian

normix.distributions.generalized_inverse_gaussian.GIG

alias of GeneralizedInverseGaussian

Multivariate Distributions

Multivariate Normal distribution.

Stored: mu (d,), L_Sigma (d,d) lower-triangular Cholesky of Σ. All linear algebra via L_Sigma — never form Σ⁻¹ explicitly.

PDF: f(x) = (2π)^{-d/2} |Σ|^{-1/2} exp(-½(x-μ)ᵀΣ⁻¹(x-μ))

Exponential family:

\[t(x) = [x, \operatorname{vec}(xx^\top)],\quad \theta = [\Sigma^{-1}\mu,\; -\tfrac{1}{2}\operatorname{vec}(\Sigma^{-1})],\quad \psi(\theta) = \tfrac{1}{2}\mu^\top\Sigma^{-1}\mu + \tfrac{1}{2}\log|\Sigma| + \tfrac{d}{2}\log(2\pi)\]

In practice we use the Cholesky log_prob formula directly.
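The Cholesky formula in question is log f(x) = −(d/2) log(2π) − Σᵢ log Lᵢᵢ − ½‖L⁻¹(x−μ)‖², which never forms Σ⁻¹ explicitly. A numpy check against scipy:

```python
import numpy as np
from scipy import stats
from scipy.linalg import solve_triangular

# Cholesky log-density with Sigma = L L^T.
rng = np.random.default_rng(2)
d = 3
A = rng.normal(size=(d, d))
sigma = A @ A.T + d * np.eye(d)       # symmetric positive definite
mu = rng.normal(size=d)
L = np.linalg.cholesky(sigma)

x = rng.normal(size=d)
z = solve_triangular(L, x - mu, lower=True)   # z = L^{-1}(x - mu)
log_f = -0.5 * d * np.log(2 * np.pi) - np.log(np.diag(L)).sum() - 0.5 * z @ z
assert np.isclose(log_f, stats.multivariate_normal(mu, sigma).logpdf(x))
```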

class normix.distributions.normal.MultivariateNormal(mu, L_Sigma)[source]

Bases: Module

Multivariate Normal with Cholesky parametrization.

Parameters:
  • mu ((d,) array)

  • L_Sigma ((d,d) lower-triangular Cholesky factor of Σ)

mu: Array
L_Sigma: Array
classmethod from_classical(mu, sigma)[source]

Construct from mean μ and covariance matrix Σ.

Return type:

MultivariateNormal

log_prob(x)[source]

log f(x) for a single observation x of shape (d,).

Parameters:

x (Array)

Return type:

Array

sample(key, shape=())[source]

Draw samples of shape (*shape, d).

Parameters:
Return type:

Array

property dim: int
property sigma: Array

Covariance matrix Σ = L_Sigma L_Sigmaᵀ.

Mixture Base Classes

JointNormalMixture

JointNormalMixture — abstract exponential family for normal variance-mean mixtures.

Joint distribution f(x,y):

X|Y ~ N(μ + γy, Σy)
Y ~ subordinator (GIG, Gamma, InverseGamma, InverseGaussian)

Sufficient statistics:

t(x,y) = [log y, 1/y, y, x, x/y, vec(xxᵀ/y)]

Natural parameters:

θ₁ = p_sub - 1 - d/2   (GIG p; scalar, depends on subordinator)
θ₂ = -(b_sub + ½μᵀΣ⁻¹μ) < 0
θ₃ = -(a_sub + ½γᵀΣ⁻¹γ) < 0
θ₄ = Σ⁻¹γ   (d-vector)
θ₅ = Σ⁻¹μ   (d-vector)
θ₆ = -½vec(Σ⁻¹)   (d²-vector)

Log partition:

ψ = ψ_sub(p, a, b) + ½log|Σ| + μᵀΣ⁻¹γ

Expectation parameters (EM E-step quantities):

η₁ = E[log Y]
η₂ = E[1/Y]
η₃ = E[Y]
η₄ = E[X] = μ + γ E[Y]
η₅ = E[X/Y] = μ E[1/Y] + γ
η₆ = E[XXᵀ/Y] = Σ + μμᵀ E[1/Y] + γγᵀ E[Y] + μγᵀ + γμᵀ

EM M-step closed-form (from η):

Let D = 1 - E[1/Y]·E[Y]
μ = (E[X] - E[Y]·E[X/Y]) / D
γ = (E[X/Y] - E[1/Y]·E[X]) / D
Σ = E[XXᵀ/Y] - E[X/Y]μᵀ - μE[X/Y]ᵀ + E[1/Y]μμᵀ - E[Y]γγᵀ
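The μ and γ updates amount to an exact 2×2 solve in the moment identities η₄ = μ + γE[Y] and η₅ = μE[1/Y] + γ. A round-trip check in numpy, with synthetic moments rather than a real E-step:

```python
import numpy as np

# Given mu, gamma and subordinator moments, form E[X] and E[X/Y],
# then invert with the M-step formulas; recovery should be exact.
mu, gamma = np.array([1.0, -2.0]), np.array([0.5, 0.3])
E_Y, E_inv_Y = 1.4, 0.9          # any moments with 1 - E[1/Y]*E[Y] != 0

E_X = mu + gamma * E_Y
E_X_over_Y = mu * E_inv_Y + gamma

D = 1 - E_inv_Y * E_Y
mu_hat = (E_X - E_Y * E_X_over_Y) / D
gamma_hat = (E_X_over_Y - E_inv_Y * E_X) / D
assert np.allclose(mu_hat, mu) and np.allclose(gamma_hat, gamma)
```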

class normix.mixtures.joint.JointNormalMixture(mu, gamma, L_Sigma)[source]

Bases: ExponentialFamily

Abstract joint distribution f(x, y) for normal variance-mean mixtures.

Stored: mu (d,), gamma (d,), L_Sigma (d,d lower Cholesky of Σ) Subordinator parameters defined by concrete subclasses.

Parameters:
mu: Array
gamma: Array
L_Sigma: Array
abstractmethod subordinator()[source]

Return the fitted subordinator distribution.

Return type:

ExponentialFamily

property d: int
sigma()[source]

Covariance matrix Σ = L_Sigma L_Sigmaᵀ.

Return type:

Array

log_det_sigma()[source]

log|Σ| = 2 Σᵢ log Lᵢᵢ, via Cholesky diagonal.

Return type:

Array

rvs(n, seed=42)[source]

Sample (X, Y) from the joint distribution.

Returns:

  • X ((n, d) array)

  • Y ((n,) array)

Parameters:
  • n (int)

  • seed (int)
Return type:

Tuple[ndarray, ndarray]

log_prob_joint(x, y)[source]

log f(x, y) = log f(x|y) + log f_Y(y).

log f(x|y) = -d/2 log(2π) - ½ log|Σy| - ½(x-μ-γy)ᵀ(Σy)⁻¹(x-μ-γy)
           = -d/2 log(2π) - ½ log|Σ| - d/2 log y
             - 1/(2y) ‖L⁻¹(x-μ)‖² + γᵀΣ⁻¹(x-μ) - y/2 γᵀΣ⁻¹γ

log f_Y(y) from subordinator.

Parameters:
  • x (Array)

  • y (Array)
Return type:

Array

conditional_expectations(x)[source]

Compute E[g(Y)|X=x] for EM E-step.

The posterior Y|X follows a GIG-like distribution with parameters:

p_post = p_eff - d/2
a_post = a_eff + γᵀΣ⁻¹γ
b_post = b_eff + (x-μ)ᵀΣ⁻¹(x-μ)

Returns dict with keys: E_log_Y, E_inv_Y, E_Y. These are then used in the M-step.

Parameters:

x (Array)

Return type:

Dict[str, Array]

static sufficient_statistics(xy)[source]

t(x,y) = [log y, 1/y, y, x, x/y, vec(xxᵀ/y)]

Input: flat vector [x…, y] where x is d-dimensional.

Parameters:

xy (Array)

Return type:

Array

static log_base_measure(xy)[source]

log h(x) for a single unbatched observation.

Parameters:

xy (Array)

Return type:

Array

NormalMixture

NormalMixture — marginal f(x) = ∫ f(x,y) dy.

Owns a JointNormalMixture. Provides:

log_prob(x) — closed-form marginal log-density
e_step(X) — jax.vmap over conditional_expectations
m_step(X, expectations) — returns a new NormalMixture
fit(X, …) — convenience EM fitting with multi-start

class normix.mixtures.marginal.NormalMixture(joint)[source]

Bases: Module

Marginal f(x) = ∫₀^∞ f(x,y) dy.

Not an exponential family. Owns a JointNormalMixture (which is).

property joint
property d: int
log_prob(x)[source]

Marginal log f(x). Subclasses provide closed-form; default raises.

Parameters:

x (Array)

Return type:

Array

pdf(x)[source]

Marginal f(x), single observation.

Parameters:

x (Array)

Return type:

Array

mean()[source]

E[X] = μ + γ E[Y].

Return type:

Array

cov()[source]

Cov[X] = E[Y] Σ + Var[Y] γγᵀ.

Return type:

Array

rvs(n, seed=42)[source]

Sample X from the marginal distribution.

Parameters:
  • n (int)

  • seed (int)
Return type:

ndarray

e_step(X, backend='jax')[source]

E-step: compute conditional expectations E[g(Y)|X=xᵢ] for each i.

Returns dict of arrays with shape (n, …) for each expectation.

backend='jax' (default): jax.vmap over conditional_expectations. JIT-able, differentiable.

backend='cpu': quad forms in JAX (vmapped) + GIG Bessel on CPU. Faster for large N; not JIT-able.

Parameters:
  • X ((n, d) data array)

  • backend (str)
Return type:

Dict[str, Array]

m_step(X, expectations, **kwargs)[source]

M-step: update model parameters from sufficient statistics.

Returns a NEW NormalMixture with updated parameters. Subclasses must override _m_step_subordinator.

Parameters:
  • X ((n, d) data array)

  • expectations (dict of arrays from e_step)
Return type:

NormalMixture

regularize_det_sigma_one()[source]

Enforce |Σ| = 1 by rescaling.

Σ → Σ/s, γ → γ/s, subordinator params scaled via _scale_subordinator. s = det(Σ)^{1/d}.

Return type:

NormalMixture

marginal_log_likelihood(X)[source]

Mean log-likelihood over dataset.

Parameters:

X (Array)

Return type:

Array

fit(X, *, verbose=0, max_iter=200, tol=0.001, regularization='none', e_step_backend='jax', m_step_backend='cpu', m_step_method='newton')[source]

Fit using self as initialization. Returns EMResult.

Parameters:
  • X ((n, d) data array)

  • verbose (0 = silent, 1 = summary, 2 = per-iteration table)

  • max_iter (EM convergence parameters)

  • tol (EM convergence parameters)

  • regularization ('det_sigma_one' or 'none')

  • e_step_backend ('jax' or 'cpu')

  • m_step_backend ('jax' or 'cpu')

  • m_step_method ('newton', 'lbfgs', or 'bfgs')

Return type:

EMResult with .model, .log_likelihoods, .param_changes, etc.

classmethod default_init(X)[source]

Moment-based initialisation from data.

Returns a model with:

  • mu = sample mean

  • gamma = zeros

  • Sigma = empirical covariance (regularized)

  • subordinator = distribution-specific defaults

Useful as a starting point for model.fit(X).

Parameters:

X (Array)

Return type:

NormalMixture

Mixture Distributions

Variance Gamma

Variance Gamma (VG) distribution.

Special case of GH with GIG → Gamma subordinator (b → 0, p > 0). Y ~ Gamma(α, β), i.e. GIG(p = α, a = 2β, b → 0).

Stored: μ, γ, L_Σ (Cholesky of Σ), α (shape), β (rate) of Gamma.

class normix.distributions.variance_gamma.JointVarianceGamma(mu, gamma, L_Sigma, alpha, beta)[source]

Bases: JointNormalMixture

Joint f(x,y): X|Y ~ N(μ+γy, Σy), Y ~ Gamma(α, β).

GIG limit: p = α, a = 2β, b → 0.

Parameters:
alpha: Array
beta: Array
subordinator()[source]

Return the fitted subordinator distribution.

Return type:

ExponentialFamily

natural_params()[source]

θ = [α-1-d/2, -½μᵀΛμ, -(β+½γᵀΛγ), Λγ, Λμ, -½vec(Λ)] (Gamma subordinator: p=α, a=2β, b→0).

Return type:

Array

classmethod from_classical(*, mu, gamma, sigma, alpha, beta)[source]
classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

JointVarianceGamma

class normix.distributions.variance_gamma.VarianceGamma(joint)[source]

Bases: NormalMixture

Marginal Variance Gamma distribution f(x).

Parameters:

joint (JointVarianceGamma)

classmethod from_classical(*, mu, gamma, sigma, alpha, beta)[source]
Return type:

VarianceGamma

log_prob(x)[source]

Marginal VG log-density (own formula, no GH delegation).

f(x) = C · (q/(2c))^{ν/2} · K_ν(√(2qc)) · exp(γᵀΛ(x-μ)), where ν = α - d/2, c = β + ½γᵀΛγ, q = (x-μ)ᵀΛ(x-μ), Λ = Σ⁻¹.

Parameters:

x (Array)

Return type:

Array

fit(X, *, verbose=0, max_iter=200, tol=0.001, regularization='none', e_step_backend='cpu', m_step_backend='cpu', m_step_method='newton')[source]

Fit VG using EM. Defaults to CPU E-step (faster than JAX vmap for the degenerate-GIG posterior arising from the Gamma subordinator).

Normal Inverse Gamma

Normal-Inverse Gamma (NInvG) distribution.

Special case of GH with GIG → InverseGamma subordinator (a → 0, p < 0). Y ~ InverseGamma(α, β), i.e. GIG(p = −α, a → 0, b = 2β).

Stored: μ, γ, L_Σ (Cholesky of Σ), α (shape), β (rate) of InverseGamma.

class normix.distributions.normal_inverse_gamma.JointNormalInverseGamma(mu, gamma, L_Sigma, alpha, beta)[source]

Bases: JointNormalMixture

Joint f(x,y): X|Y ~ N(μ+γy, Σy), Y ~ InverseGamma(α, β).

GIG limit: p = −α, a → 0, b = 2β.

Parameters:
alpha: Array
beta: Array
subordinator()[source]

Return the fitted subordinator distribution.

Return type:

ExponentialFamily

natural_params()[source]

θ = [-(α+1)-d/2, -(β+½μᵀΛμ), -½γᵀΛγ, Λγ, Λμ, -½vec(Λ)] (InverseGamma subordinator: p=-α, a→0, b=2β).

Return type:

Array

classmethod from_classical(*, mu, gamma, sigma, alpha, beta)[source]
classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

class normix.distributions.normal_inverse_gamma.NormalInverseGamma(joint)[source]

Bases: NormalMixture

Marginal Normal-Inverse Gamma distribution f(x).

Parameters:

joint (JointNormalInverseGamma)

classmethod from_classical(*, mu, gamma, sigma, alpha, beta)[source]
log_prob(x)[source]

Marginal NInvG log-density (own formula, no GH delegation).

GIG params: p=-α, a=γᵀΛγ, b=2β+Q(x). The normalising integral is 2(b/a)^{p/2} K_p(sqrt(ab)). For the symmetric case (γ≈0, a→0), uses Γ-function closed form.

Parameters:

x (Array)

Return type:

Array

fit(X, *, verbose=0, max_iter=200, tol=0.001, regularization='none', e_step_backend='cpu', m_step_backend='cpu', m_step_method='newton')[source]

Fit NInvG using EM. Defaults to CPU E-step (faster than JAX vmap for the degenerate-GIG posterior arising from the InverseGamma subordinator).

Normal Inverse Gaussian

Normal-Inverse Gaussian (NIG) distribution.

Special case of GH with GIG → InverseGaussian subordinator (p = −½). Y ~ InverseGaussian(μ_IG, λ), i.e. GIG(p = −½, a = λ/μ_IG², b = λ).

Stored: μ, γ, L_Σ (Cholesky of Σ), μ_IG (IG mean), λ (IG shape).

class normix.distributions.normal_inverse_gaussian.JointNormalInverseGaussian(mu, gamma, L_Sigma, mu_ig, lam)[source]

Bases: JointNormalMixture

Joint f(x,y): X|Y ~ N(μ+γy, Σy), Y ~ InverseGaussian(μ_IG, λ).

Stored: μ_IG (IG mean) and λ (IG shape) directly. GIG params: p = −½, a = λ/μ_IG², b = λ.

Parameters:
mu_ig: Array
lam: Array
subordinator()[source]

Return the fitted subordinator distribution.

Return type:

ExponentialFamily

natural_params()[source]

θ = [-3/2-d/2, -(b+½μᵀΛμ), -(a+½γᵀΛγ), Λγ, Λμ, -½vec(Λ)] where p=-½, a=λ/μ_IG², b=λ.

Return type:

Array

classmethod from_classical(*, mu, gamma, sigma, mu_ig, lam)[source]
classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

class normix.distributions.normal_inverse_gaussian.NormalInverseGaussian(joint)[source]

Bases: NormalMixture

Marginal Normal-Inverse Gaussian distribution f(x).

Parameters:

joint (JointNormalInverseGaussian)

classmethod from_classical(*, mu, gamma, sigma, mu_ig, lam)[source]
log_prob(x)[source]

Marginal NIG log-density.

Uses K_{-1/2}(z) = sqrt(pi/(2z)) exp(-z) for the normalisation, leaving only one log_kv call at order nu = -1/2 - d/2.

Parameters:

x (Array)

Return type:

Array

Generalized Hyperbolic

Generalized Hyperbolic (GH) distribution.

Joint: X|Y ~ N(μ+γy, Σy), Y ~ GIG(p, a, b) Marginal: GH(μ, γ, Σ, p, a, b)

Marginal log-density (closed form via Bessel functions):

Let Q(x) = (x-μ)ᵀΣ⁻¹(x-μ) and A = a + γᵀΣ⁻¹γ. Then

log f(x) = const + (p - d/2)/2 · log((Q(x)+b)/A)
           + log K_{p-d/2}(√(A(Q(x)+b)))
           - log K_p(√(ab))
           + γᵀΣ⁻¹(x-μ)   (skewness term)

Full formula:

f(x) = C · ((Q(x)+b)/A)^{(p-d/2)/2} · K_{p-d/2}(√(A(Q(x)+b))) · exp(γᵀΣ⁻¹(x-μ))

C = (2π)^{-d/2} |Σ|^{-1/2} (a/b)^{p/2} / K_p(√(ab))

(see GeneralizedHyperbolic.log_prob)

Posterior Y|X = x ~ GIG(p - d/2, a + γᵀΣ⁻¹γ, b + (x-μ)ᵀΣ⁻¹(x-μ)) → conditional expectations computed from GIG.
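The posterior parameters are simple quadratic forms, computable from the stored Cholesky factor without forming Σ⁻¹. A minimal numpy sketch with hypothetical parameter values:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Posterior subordinator parameters for Y | X = x:
# p_post = p - d/2, a_post = a + gamma' Sigma^{-1} gamma, b_post = b + Q(x).
p, a, b = 0.5, 1.0, 2.0
mu = np.array([0.0, 1.0])
gamma = np.array([0.3, -0.1])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.array([1.5, 0.2])

c = cho_factor(sigma, lower=True)            # quadratic forms via Cholesky
a_post = a + gamma @ cho_solve(c, gamma)
b_post = b + (x - mu) @ cho_solve(c, x - mu)
p_post = p - len(x) / 2
assert p_post == -0.5 and a_post > a and b_post > b
```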

class normix.distributions.generalized_hyperbolic.JointGeneralizedHyperbolic(mu, gamma, L_Sigma, p, a, b)[source]

Bases: JointNormalMixture

Joint f(x,y): X|Y~N(μ+γy, Σy), Y~GIG(p,a,b).

Stored: mu, gamma, L (from JointNormalMixture) + p, a, b (GIG parameters).

Parameters:
p: Array
a: Array
b: Array
subordinator()[source]

Return the fitted subordinator distribution.

Return type:

ExponentialFamily

natural_params()[source]

θ = [p-1-d/2, -(b+½μᵀΣ⁻¹μ), -(a+½γᵀΣ⁻¹γ), Σ⁻¹γ, Σ⁻¹μ, -½vec(Σ⁻¹)]

Return type:

Array

classmethod from_classical(*, mu, gamma, sigma, p, a, b)[source]

Construct from classical parameters.

classmethod from_natural(theta)[source]

Construct from natural parameters θ. Subclasses must override.

Parameters:

theta (Array)

Return type:

JointGeneralizedHyperbolic

class normix.distributions.generalized_hyperbolic.GeneralizedHyperbolic(joint)[source]

Bases: NormalMixture

Marginal Generalized Hyperbolic distribution f(x).

Stores a JointGeneralizedHyperbolic. Provides:

log_prob(x) — closed-form Bessel expression
e_step, m_step — for EM fitting
fit(X, …) — convenience EM fitting

Parameters:

joint (JointGeneralizedHyperbolic)

classmethod from_classical(*, mu, gamma, sigma, p, a, b)[source]
Return type:

GeneralizedHyperbolic

log_prob(x)[source]

Marginal log f(x).

f(x) ∝ ((Q(x)+b)/A)^{(p-d/2)/2} · K_{p-d/2}(√(A(Q(x)+b))) · exp(γᵀΣ⁻¹(x-μ))

where Q(x) = (x-μ)ᵀΣ⁻¹(x-μ), A = a + γᵀΣ⁻¹γ.

Full normalizing constant:

C = (2π)^{-d/2} |Σ|^{-1/2} · (a/b)^{p/2} / K_p(√(ab))

Parameters:

x (Array)

Return type:

Array

classmethod default_init(X)[source]

Warm-start from the best of NIG / VG / NInvG sub-model fits.

Runs 5 EM iterations for each special case, converts the winner to GH parametrisation, and uses it as the starting point. Falls back to moment-based default if all sub-models fail.

Parameters:

X (Array)

Return type:

GeneralizedHyperbolic

fit(X, *, verbose=0, max_iter=200, tol=0.001, regularization='det_sigma_one', e_step_backend='cpu', m_step_backend='cpu', m_step_method='newton')[source]

Fit GH distribution using EM.

Defaults to CPU backends and det_sigma_one regularization (GH has scale non-identifiability requiring |Sigma| = 1).

Fitting

EM Fitters

EM fitters for normix distributions.

Model knows math, fitter knows iteration.

BatchEMFitter — standard batch EM with dual-loop architecture:
  lax.scan (JIT-able) or Python for-loop (CPU-compatible)
OnlineEMFitter — online EM, one sample at a time
MiniBatchEMFitter — mini-batch EM, Robbins-Monro averaging

class normix.fitting.em.EMResult(model, log_likelihoods, param_changes, n_iter, converged, elapsed_time)[source]

Bases: object

Result of an EM fitting procedure.

Parameters:
model: Any
log_likelihoods: Array | None
param_changes: Array
n_iter: int
converged: bool
elapsed_time: float
class normix.fitting.em.BatchEMFitter(*, max_iter=200, tol=0.001, verbose=0, regularization='none', e_step_backend='jax', m_step_backend='cpu', m_step_method='newton')[source]

Bases: object

Batch EM algorithm with dual-loop architecture.

Runs E-step (all data) -> M-step -> regularize until convergence. Convergence is measured by relative parameter change in the normal parameters (mu, gamma, L_Sigma), excluding subordinator (GIG) parameters.

Loop selection is automatic:
  • lax.scan when both backends are ‘jax’ and verbose <= 1

  • Python for-loop otherwise (CPU backends, or verbose >= 2)

Parameters:
  • max_iter (int) – Maximum number of EM iterations.

  • tol (float) – Convergence tolerance on max relative parameter change.

  • verbose (int) – 0 = silent, 1 = summary, 2 = per-iteration table.

  • regularization (str) – Regularization strategy (‘det_sigma_one’ or ‘none’).

  • e_step_backend (str) – ‘jax’ (default) or ‘cpu’.

  • m_step_backend (str) – ‘jax’ or ‘cpu’ (default, faster for GIG).

  • m_step_method (str) – ‘newton’ (default), ‘lbfgs’, or ‘bfgs’.

fit(model, X)[source]

Run batch EM. Auto-selects lax.scan or Python loop.

Parameters:
  • model (NormalMixture subclass (used as initial parameters))

  • X ((n, d) data array)

Return type:

EMResult with fitted model, convergence diagnostics, and timing.

class normix.fitting.em.OnlineEMFitter(*, tau0=10.0, max_epochs=5, verbose=0, regularization='none')[source]

Bases: object

Online EM algorithm (Robbins-Monro stochastic approximation).

Updates running sufficient statistics with step size 1/(tau0 + t). One epoch = one full pass through data in random order.
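The step-size schedule alone can be illustrated on a scalar statistic: Robbins-Monro averaging with γ_t = 1/(τ₀ + t) converges to the sample mean (a toy sketch, not the fitter itself):

```python
import numpy as np

# Running statistic s_t = (1 - step) * s_{t-1} + step * x_t with
# step = 1 / (tau0 + t); converges to the mean of the stream.
rng = np.random.default_rng(3)
x = rng.normal(loc=4.0, size=20_000)

tau0, s = 10.0, 0.0
for t, xi in enumerate(x, start=1):
    step = 1.0 / (tau0 + t)
    s = (1 - step) * s + step * xi
assert abs(s - 4.0) < 0.1
```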

Parameters:
fit(model, X, *, key)[source]

Online EM. Returns EMResult.

Parameters:
  • model (NormalMixture subclass, used as initial parameters)

  • X ((n, d) data array)

  • key (JAX PRNG key)
Return type:

EMResult

class normix.fitting.em.MiniBatchEMFitter(*, batch_size=256, max_iter=200, tol=0.001, tau0=10.0, verbose=0, regularization='none')[source]

Bases: object

Mini-batch EM with Robbins-Monro averaging of sufficient statistics.

Parameters:
fit(model, X, *, key)[source]

Mini-batch EM. Returns EMResult.

Parameters:
  • model (NormalMixture subclass, used as initial parameters)

  • X ((n, d) data array)

  • key (JAX PRNG key)
Return type:

EMResult

Solvers

Bregman divergence solvers.

Minimises f(θ) − θ·η over θ, where f is any convex function (e.g. the log-partition ψ for an exponential family). At the minimum ∇f(θ*) = η.
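The principle can be demonstrated with scipy directly, bypassing solve_bregman: minimising ψ(θ) − θ·η for the Gamma log-partition recovers θ* = [α−1, −β] from η = [digamma(α) − log β, α/β].

```python
import numpy as np
from scipy import optimize, special

# Bregman principle check (scipy, not the normix solver): at the minimum
# of psi(theta) - theta.eta the gradient condition grad psi(theta*) = eta holds.
def psi(theta):
    return special.gammaln(theta[0] + 1) - (theta[0] + 1) * np.log(-theta[1])

alpha, beta = 3.0, 2.0
eta = np.array([special.digamma(alpha) - np.log(beta), alpha / beta])

res = optimize.minimize(
    lambda th: psi(th) - th @ eta,
    x0=np.array([0.5, -1.0]),
    bounds=[(-1 + 1e-6, None), (None, -1e-6)],   # theta1 > -1, theta2 < 0
    options={"ftol": 1e-15, "gtol": 1e-12},
)
assert np.allclose(res.x, [alpha - 1, -beta], atol=1e-4)
```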

Public API

solve_bregman — single starting point
solve_bregman_multistart — multiple starting points (vmap for JAX Newton; for-loop for quasi-Newton and CPU)
bregman_objective — utility: f(θ) − θ·η

Backends × methods

backend='jax', method='newton' — custom lax.scan Newton, autodiff or analytical Hessian
backend='jax', method='lbfgs' — jaxopt LBFGSB (bounds native) or LBFGS (reparam)
backend='jax', method='bfgs' — jaxopt BFGS with reparameterization for bounds
backend='cpu', method='lbfgs' — scipy L-BFGS-B
backend='cpu', method='bfgs' — scipy BFGS
backend='cpu', method='newton' — scipy trust-exact with Hessian

Gradient / Hessian sources

grad_fn : θ → ∇f(θ).
  For backend='cpu': pure CPU (numpy) gradient.
  For backend='jax', method='newton': JAX-traceable ∇ψ(θ).
  If None with backend='cpu': hybrid — jax.grad compiled → NumPy callbacks.

hess_fn : θ → ∇²f(θ). Required for method='newton'.
  For backend='cpu': pure CPU (numpy) Hessian.
  For backend='jax', method='newton': JAX-traceable ∇²ψ(θ).
  If None with method='newton': jax.hessian of the full objective.

Both grad_fn and hess_fn operate in theta-space only. The solver handles all reparameterization internally via the chain rule.

class normix.fitting.solvers.BregmanResult(theta, fun, grad_norm, num_steps, converged, elapsed_time=0.0)[source]

Bases: object

Result of a Bregman divergence minimization.

Scalar fields accept both Python and JAX types so the result can live inside a lax.scan carry without concretization errors.

Parameters:
theta: Array
fun: Any
grad_norm: Any
num_steps: int
converged: Any
elapsed_time: float = 0.0
normix.fitting.solvers.bregman_objective(theta, eta, f)[source]

f(θ) − θ·η — convex dual whose minimum gives ∇f(θ*) = η.

Parameters:
  • theta (Array)

  • eta (Array)

  • f (convex function θ → scalar)
Return type:

Array

normix.fitting.solvers.solve_bregman(f, eta, theta0, *, backend='jax', method='lbfgs', bounds=None, max_steps=500, tol=1e-10, grad_fn=None, hess_fn=None, verbose=0)[source]

Minimise f(θ) − θ·η over θ.

Parameters:
  • f (convex function θ → scalar (e.g. log-partition ψ))

  • eta (target vector (e.g. expectation parameters η))

  • theta0 (initial guess)

  • backend ('jax' (JIT-able) or 'cpu' (scipy, not JIT-able))

  • method ('lbfgs', 'bfgs', or 'newton')

  • bounds (list of (lo, hi) per dimension; None → unconstrained.) – For backend=’jax’: enforced via reparameterization (except method=’lbfgs’ which uses jaxopt.LBFGSB natively). For backend=’cpu’: passed directly as scipy bounds.

  • max_steps (iteration budget)

  • tol (convergence tolerance on ‖∇f(θ) − η‖∞)

  • grad_fn (θ → ∇f(θ).) – For backend=’cpu’: must accept and return numpy arrays. For backend=’jax’, method=’newton’: must be JAX-traceable. If None with backend=’cpu’: falls back to jax.grad (hybrid mode).

  • hess_fn (θ → ∇²f(θ).) – Required for method=’newton’. For backend=’cpu’: must accept numpy arrays and return numpy array. For backend=’jax’, method=’newton’: must be JAX-traceable. If None with method=’newton’: jax.hessian of the full objective is used.

  • verbose (int) – 0 = silent, >= 1 = print summary after solve.

Return type:

BregmanResult

normix.fitting.solvers.solve_bregman_multistart(f, eta, theta0_batch, *, backend='jax', method='lbfgs', bounds=None, max_steps=500, tol=1e-10, grad_fn=None, hess_fn=None, verbose=0)[source]

Run solve_bregman from multiple starting points; return the best result.

Parameters:
  • theta0_batch ((k, dim) array of starting points)

  • (remaining parameters as in solve_bregman)
Return type:

BregmanResult

Utilities

Bessel Functions

JAX-compatible log modified Bessel function of the second kind.

log_kv(v, z) = log K_v(z), fully pure-JAX with zero scipy callbacks.

Regime-specific methods, selected via lax.cond (only one branch executes):
  1. Hankel asymptotic (DLMF 10.40.2) — large z

  2. Olver uniform expansion (DLMF 10.41.4) — large v

  3. Small-z leading asymptotic (DLMF 10.30.2) — z → 0

  4. Gauss-Legendre quadrature (Takekawa 2022) — moderate z, v

Using lax.cond (not jnp.where) means only the selected branch executes at runtime. lax.cond requires scalar conditions, so the core scalar function _log_kv_scalar is vmapped over array inputs.
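The lax.cond-plus-vmap pattern, shown on a toy two-regime function rather than the real Bessel branches:

```python
import jax
import jax.numpy as jnp

# Scalar core with lax.cond (only the selected branch executes),
# then vmap over array inputs, as described above.
def _f_scalar(z):
    return jax.lax.cond(
        z > 1.0,
        lambda z: jnp.log(z),    # stand-in "large z" branch
        lambda z: z - 1.0,       # stand-in "small z" branch
        z,
    )

f = jax.jit(jax.vmap(_f_scalar))
out = f(jnp.array([0.5, 2.0]))
assert jnp.allclose(out, jnp.array([-0.5, jnp.log(2.0)]))
```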

Custom JVP for full autodiff:
  • ∂/∂z : exact recurrence K’_v = −(K_{v−1}+K_{v+1})/2

  • ∂/∂v : central FD on log_kv itself (ε = BESSEL_EPS_V)
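The ∂/∂z recurrence is easy to verify against scipy with a central finite difference:

```python
import numpy as np
from scipy import special

# d/dz K_v(z) = -(K_{v-1}(z) + K_{v+1}(z)) / 2, checked against central FD.
v, z, eps = 1.3, 2.7, 1e-6
exact = -(special.kv(v - 1, z) + special.kv(v + 1, z)) / 2
fd = (special.kv(v, z + eps) - special.kv(v, z - eps)) / (2 * eps)
assert np.isclose(exact, fd, rtol=1e-6)
```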

backend='jax' (default) : pure-JAX, JIT-able, differentiable.
backend='cpu' : scipy.special.kve, fully vectorized numpy; not JIT-able, fast for the EM hot path.

normix.utils.bessel.log_kv(v, z, backend='jax')[source]

log K_v(z) — log modified Bessel function of the second kind.

Parameters:
  • v (scalar or array — order (any real; K_v = K_{-v}))

  • z (scalar or array — argument (must be > 0))

  • backend ('jax' (default) or 'cpu') –

    'jax' — pure-JAX, lax.cond regime selection, custom JVP.
    JIT-able, differentiable. Default for log_prob, pdf, etc.

    'cpu' — scipy.special.kve, fully vectorized numpy.
    Not JIT-able. Fast for the EM hot path.

Return type:

scalar or array of same broadcast shape as (v, z).

Constants

Shared numerical constants for normix.