API Reference
Base Classes
Base class for probability distributions with scipy-like API.
This module provides an abstract base class that defines the standard interface
for probability distributions, similar to scipy.stats.
The API includes:
Density functions: pdf(), logpdf()
Cumulative distribution: cdf(), sf() (survival function)
Quantile functions: ppf(), isf() (inverse survival)
Random sampling: rvs()
Fitting: fit() (returns self for method chaining)
Moments: mean(), var(), std(), stats()
Cache infrastructure
All distributions support lazy caching of derived quantities via
functools.cached_property. The cache is invalidated by calling
_invalidate_cache(), which removes all cached_property entries
listed in the class-level _cached_attrs tuple.
Subclasses extend _cached_attrs to register their own cached
properties:
class MyDist(ExponentialFamily):
    _cached_attrs = ExponentialFamily._cached_attrs + ('log_det_Sigma',)

    @cached_property
    def log_det_Sigma(self) -> float:
        ...
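A minimal sketch of the invalidation mechanism described above (assumed from the documented behavior, not the library's exact source): functools.cached_property stores computed values in the instance __dict__, so removing an entry forces recomputation on the next access.

def _invalidate_cache(self) -> None:
    # cached_property caches in the instance __dict__; popping a name
    # forces the property to be recomputed on next access.
    for name in self._cached_attrs:
        self.__dict__.pop(name, None)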
- class normix.base.distribution.Distribution(*args, **kwargs)[source]
Bases: ABC
Abstract base class for probability distributions.
This class defines the standard API for probability distributions, similar to scipy.stats distributions. All concrete distributions should inherit from this class.
The API includes:
pdf/pmf: Probability density/mass function
logpdf/logpmf: Log of the probability density/mass function
cdf: Cumulative distribution function
logcdf: Log of the cumulative distribution function
sf: Survival function (1 - CDF)
logsf: Log of the survival function
ppf: Percent point function (inverse of CDF)
isf: Inverse survival function
rvs: Random variate sampling
fit: Fit distribution parameters to data
stats: Return moments (mean, variance, skewness, kurtosis)
entropy: Differential entropy of the distribution
moment: Non-central moment of order n
median: Median of the distribution
mean: Mean of the distribution
var: Variance of the distribution
std: Standard deviation of the distribution
interval: Confidence interval
- _fitted
Whether parameters have been set (via factory methods, setters, or fit).
- Type:
bool
- _cached_attrs
Names of cached_property attributes that _invalidate_cache() will clear. Subclasses extend this via tuple concatenation.
- Type:
tuple of str
- cdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Cumulative distribution function.
- Parameters:
x (array_like) – Points at which to evaluate the CDF.
- Returns:
cdf – Cumulative probability at each point.
- Return type:
ndarray
- entropy() float[source]
Differential entropy of the distribution.
- Returns:
entropy – Entropy value.
- Return type:
float
- abstractmethod fit(data: ArrayLike, *args, **kwargs) Distribution[source]
Fit distribution parameters to data (sklearn-style).
This method should fit the distribution parameters to the given data and return self for method chaining.
- Parameters:
data (array_like) – Data to fit the distribution to.
*args – Additional parameters for fitting.
**kwargs – Additional parameters for fitting.
- Returns:
self – The fitted distribution instance (for method chaining).
- Return type:
Distribution
- interval(alpha: float) tuple[ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[floating]]][source]
Confidence interval with equal areas around the median.
- Parameters:
alpha (float) – Confidence level (between 0 and 1).
- Returns:
a, b – Lower and upper bounds of the confidence interval.
- Return type:
tuple of ndarrays
- isf(q: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Inverse survival function (inverse of SF).
- Parameters:
q (array_like) – Probabilities at which to evaluate the ISF.
- Returns:
isf – Quantiles corresponding to the given survival probabilities.
- Return type:
ndarray
- kurtosis() ndarray[tuple[Any, ...], dtype[floating]][source]
Kurtosis of the distribution.
- Returns:
kurtosis – Kurtosis value(s).
- Return type:
ndarray
- logcdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Log of the cumulative distribution function.
- Parameters:
x (array_like) – Points at which to evaluate the log CDF.
- Returns:
logcdf – Log cumulative probability at each point.
- Return type:
ndarray
- logpdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Log of the probability density function.
Default implementation computes log(pdf(x)); subclasses should override for numerical stability.
- Parameters:
x (array_like) – Points at which to evaluate the log PDF.
- Returns:
logpdf – Log probability density at each point.
- Return type:
ndarray
- logsf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Log of the survival function.
- Parameters:
x (array_like) – Points at which to evaluate the log survival function.
- Returns:
logsf – Log survival probability at each point.
- Return type:
ndarray
- mean() ndarray[tuple[Any, ...], dtype[floating]][source]
Mean of the distribution.
- Returns:
mean – Mean value(s).
- Return type:
ndarray
- median() ndarray[tuple[Any, ...], dtype[floating]][source]
Median of the distribution.
- Returns:
median – Median value(s).
- Return type:
ndarray
- moment(n: int) ndarray[tuple[Any, ...], dtype[floating]][source]
Non-central moment of order n.
- Parameters:
n (int) – Order of the moment.
- Returns:
moment – n-th moment.
- Return type:
ndarray
- abstractmethod pdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Probability density function.
- Parameters:
x (array_like) – Points at which to evaluate the PDF.
- Returns:
pdf – Probability density at each point.
- Return type:
ndarray
- ppf(q: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Percent point function (inverse of CDF).
- Parameters:
q (array_like) – Probabilities at which to evaluate the PPF.
- Returns:
ppf – Quantiles corresponding to the given probabilities.
- Return type:
ndarray
- abstractmethod rvs(size: int | tuple | None = None, random_state: int | Generator | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Random variate sampling.
- Parameters:
size (int or tuple of ints, optional) – Shape of the output. If None, returns a scalar.
random_state (int or numpy.random.Generator, optional) – Random state for reproducibility.
- Returns:
rvs – Random variates.
- Return type:
ndarray or scalar
- score(X: ArrayLike, y: ArrayLike | None = None) float[source]
Compute the mean log-likelihood (sklearn-style scoring).
This method follows the sklearn convention where higher scores are better.
- Parameters:
X (array_like) – Data samples.
y (array_like, optional) – Ignored. Present for sklearn API compatibility.
- Returns:
score – Mean log-likelihood.
- Return type:
float
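In other words, score(X) should coincide with the mean of logpdf(X). A hedged sketch, using the Exponential subclass documented later in this reference:

import numpy as np

dist = Exponential.from_classical_params(rate=2.0)
X = dist.rvs(size=100, random_state=0)
print(np.isclose(dist.score(X), dist.logpdf(X).mean()))  # True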
- sf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[floating]][source]
Survival function (1 - CDF).
- Parameters:
x (array_like) – Points at which to evaluate the survival function.
- Returns:
sf – Survival probability at each point.
- Return type:
ndarray
- skewness() ndarray[tuple[Any, ...], dtype[floating]][source]
Skewness of the distribution.
- Returns:
skewness – Skewness value(s).
- Return type:
ndarray
- stats(moments: str = 'mv') ndarray[tuple[Any, ...], dtype[_ScalarT]] | tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ...][source]
Return moments of the distribution.
- Parameters:
moments (str, optional) – Composed of letters [‘mvsk’] defining which moments to compute: ‘m’ = mean, ‘v’ = variance, ‘s’ = skewness, ‘k’ = kurtosis. Default is ‘mv’.
- Returns:
stats – Requested moments.
- Return type:
ndarray or tuple
Base class for exponential family distributions.
Exponential families have the canonical form:
\[p(x|\theta) = h(x) \exp(\theta^T t(x) - \psi(\theta))\]
where:
\(\theta\): natural parameters (d-dimensional vector)
\(t(x)\): sufficient statistics (d-dimensional vector)
\(\psi(\theta)\): log partition function (cumulant generating function)
\(h(x)\): base measure
The log partition function satisfies:
\[\nabla\psi(\theta) = \eta = E[t(X)]\]
where \(\eta\) are the expectation parameters.
Supports three parametrizations:
Classical: Domain-specific parameters (e.g., \(\mu\), \(\sigma\) for Normal)
Natural: \(\theta\) in the exponential family form (numpy array)
Expectation: \(\eta = \nabla\psi(\theta) = E[t(X)]\) (numpy array)
The Fisher information matrix equals the Hessian of the log partition:
\[I(\theta) = \nabla^2\psi(\theta) = \text{Cov}[t(X)]\]
- class normix.base.exponential_family.ExponentialFamily[source]
Bases: Distribution
Abstract base class for exponential family distributions.
Exponential family distributions have the probability density:
\[p(x|\theta) = h(x) \exp(\theta^T t(x) - \psi(\theta))\]
where:
\(\theta\) are the natural parameters (d-dimensional vector)
\(t(x)\) are the sufficient statistics (d-dimensional vector)
\(\psi(\theta)\) is the log partition function (scalar)
\(h(x)\) is the base measure (scalar)
Three parametrizations:
Classical: Domain-specific (e.g., \(\mu\), \(\sigma\) for Normal; \(\lambda\) for Exponential)
Natural: \(\theta\) in exponential family form (numpy array)
Expectation: \(\eta = \nabla\psi(\theta) = E[t(X)]\) (numpy array)
Subclasses must implement the following state management methods:
_set_from_classical(**kwargs): Set internal state from classical params.
_set_from_natural(theta): Set internal state from natural params.
_compute_natural_params(): Derive natural params from internal state.
_compute_classical_params(): Derive classical params from internal state.
Parameters should be stored as named internal attributes (e.g., _rate, _mu, _L) for efficient direct access.
Both interfaces use functools.cached_property for lazy caching of derived parametrizations, with _invalidate_cache() clearing all entries when state changes.
- _natural_params
Legacy storage for natural parameters (tuple for hashability). Set by the default _set_from_natural implementation. Migrated subclasses may not use this.
- Type:
tuple or None
- _fitted
Whether parameters have been set (inherited from Distribution).
- Type:
bool
Examples
Use factory methods to create instances:
>>> # From classical parameters
>>> dist = Exponential.from_classical_params(rate=2.0)
>>> # From natural parameters (array)
>>> dist = Exponential.from_natural_params(np.array([-2.0]))
>>> # From expectation parameters (array)
>>> dist = Exponential.from_expectation_params(np.array([0.5]))
>>> # Fit from data (returns self for method chaining)
>>> dist = Exponential().fit(data)
Notes
The log partition function \(\psi(\theta)\) is convex, and its gradient and Hessian give important quantities:
Gradient: \(\nabla\psi(\theta) = E[t(X)] = \eta\) (expectation parameters)
Hessian: \(\nabla^2\psi(\theta) = \text{Cov}[t(X)] = I(\theta)\) (Fisher information)
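Both identities are easy to check numerically. The sketch below uses the Exponential member documented later in this reference, whose log partition is \(\psi(\theta) = -\log(-\theta)\) with \(\theta = -\lambda\); central finite differences recover \(E[X] = 1/\lambda\) and the Fisher information \(1/\lambda^2\).

import numpy as np

def psi(theta):
    # Log partition of the Exponential distribution: psi(theta) = -log(-theta)
    return -np.log(-theta)

theta, eps = -2.0, 1e-5  # theta = -lambda, so lambda = 2
grad = (psi(theta + eps) - psi(theta - eps)) / (2 * eps)
hess = (psi(theta + eps) - 2 * psi(theta) + psi(theta - eps)) / eps**2
print(grad)  # ~0.50 = 1/lambda   = E[X]  (expectation parameter eta)
print(hess)  # ~0.25 = 1/lambda^2         (Fisher information I(theta))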
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families in statistical theory.
- property classical_params
Classical parameters as a frozen dataclass or dict (cached).
Computed lazily from internal state via _compute_classical_params. For migrated subclasses, returns a frozen dataclass with attribute access (e.g., params.rate). For legacy subclasses, returns a dict.
- Returns:
params – Classical parameters.
- Return type:
dataclass or dict
- property expectation_params: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Expectation parameters \(\eta = E[t(X)]\) (cached).
Computed lazily from internal state via _compute_expectation_params.
- Returns:
eta – Expectation parameter vector.
- Return type:
ndarray
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Compute Fisher information matrix: I(theta) = Hessian of psi(theta).
The Fisher information equals:
\[I(\theta) = \nabla^2\psi(\theta) = \text{Cov}[t(X)]\]
It is the Hessian of the log partition function (always positive semi-definite since \(\psi\) is convex), and equals the covariance matrix of the sufficient statistics.
Default implementation uses scipy.differentiate.hessian. Override with an analytical Hessian when available (recommended for performance).
- Parameters:
theta (ndarray, optional) – Natural parameter vector \(\theta\). If None, uses current parameters.
- Returns:
fisher – Fisher information matrix (positive semi-definite).
- Return type:
ndarray, shape (d, d)
Examples
>>> dist = Exponential.from_classical_params(rate=2.0)
>>> fisher = dist.fisher_information()
>>> print(fisher)  # [[0.25]] for rate=2
- fit(X: ArrayLike, y: ArrayLike | None = None, **kwargs) ExponentialFamily[source]
Fit distribution parameters using Maximum Likelihood Estimation.
For exponential families, the MLE has a closed form in expectation parameters:
\[\hat{\eta} = \frac{1}{n} \sum_{i=1}^n t(x_i)\]
That is, the MLE of the expectation parameters equals the sample mean of the sufficient statistics; the result is then converted to natural parameters using _expectation_to_natural.
- Parameters:
X (array_like) – Training data.
y (array_like, optional) – Ignored (for sklearn API compatibility).
**kwargs – Additional options (currently unused).
- Returns:
self – Returns self for method chaining (sklearn convention).
- Return type:
ExponentialFamily
Examples
>>> data = np.random.exponential(scale=0.5, size=1000)
>>> dist = Exponential().fit(data)
>>> print(dist.classical_params)
{'rate': 2.01...}
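The closed form is easy to see for the Exponential, where \(t(x) = x\): the fitted expectation parameter is the sample mean, and the rate follows from \(\eta = 1/\lambda\). A hedged sketch of the computation (not the method's source):

import numpy as np

data = np.random.default_rng(0).exponential(scale=0.5, size=1000)
eta_hat = data.mean()      # MLE of eta = E[t(X)] = E[X]
rate_hat = 1.0 / eta_hat   # eta = 1/lambda for the Exponential, so ~2.0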
- classmethod from_classical_params(**kwargs) ExponentialFamily[source]
Create distribution from classical parameters.
- Parameters:
**kwargs – Distribution-specific classical parameters. E.g., rate=2.0 for Exponential, mu=0.0, sigma=1.0 for Normal.
- Returns:
dist – Distribution instance with parameters set.
- Return type:
ExponentialFamily
Examples
>>> dist = Exponential.from_classical_params(rate=2.0)
>>> dist = Normal.from_classical_params(mu=0.0, sigma=1.0)
- classmethod from_expectation_params(eta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) ExponentialFamily[source]
Create distribution from expectation parameters.
- Parameters:
eta (ndarray) – Expectation parameter vector (E[t(X)]).
- Returns:
dist – Distribution instance with parameters set.
- Return type:
ExponentialFamily
Examples
>>> dist = Exponential.from_expectation_params(np.array([0.5]))
>>> dist = Normal.from_expectation_params(np.array([0.0, 1.0]))
- classmethod from_natural_params(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) ExponentialFamily[source]
Create distribution from natural parameters.
- Parameters:
theta (ndarray) – Natural parameter vector.
- Returns:
dist – Distribution instance with parameters set.
- Return type:
ExponentialFamily
Examples
>>> dist = Exponential.from_natural_params(np.array([-2.0]))
>>> dist = Normal.from_natural_params(np.array([0.0, -0.5]))
- logpdf(x: ArrayLike) float | ndarray[tuple[Any, ...], dtype[floating]][source]
Log probability density using exponential family form.
Computes:
\[\log p(x|\theta) = \log h(x) + \theta^T t(x) - \psi(\theta)\]
- Parameters:
x (array_like) – Points at which to evaluate log PDF.
- Returns:
logpdf – Log probability density at each point.
- Return type:
float or ndarray
- property natural_params: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Natural parameters \(\theta\) as numpy array (cached).
Computed lazily from internal state via _compute_natural_params. Cache is invalidated when parameters change.
- Returns:
theta – Natural parameter vector.
- Return type:
ndarray
- pdf(x: ArrayLike) float | ndarray[tuple[Any, ...], dtype[floating]][source]
Probability density: p(x|θ) = exp(logpdf(x)).
- Parameters:
x (array_like) – Points at which to evaluate PDF.
- Returns:
pdf – Probability density at each point.
- Return type:
float or ndarray
- score(X: ArrayLike, y: ArrayLike | None = None) float[source]
Compute mean log-likelihood (sklearn-style scoring).
Higher scores are better (sklearn convention).
- Parameters:
X (array_like) – Data samples.
y (array_like, optional) – Ignored (for sklearn API compatibility).
- Returns:
score – Mean log-likelihood.
- Return type:
float
- set_classical_params(**kwargs) ExponentialFamily[source]
Set parameters from classical parametrization.
- Parameters:
**kwargs – Distribution-specific classical parameters.
- Returns:
self – Returns self for method chaining.
- Return type:
ExponentialFamily
- set_expectation_params(eta: ndarray[tuple[Any, ...], dtype[_ScalarT]], theta0: ndarray[tuple[Any, ...], dtype[_ScalarT]] | List[ndarray[tuple[Any, ...], dtype[_ScalarT]]] | None = None) ExponentialFamily[source]
Set parameters from expectation parametrization.
- Parameters:
eta (ndarray) – Expectation parameter vector (E[t(X)]).
theta0 (ndarray or list of ndarray, optional) – Initial guess(es) for natural parameters in optimization. If provided, these are used instead of the default initial points.
- Returns:
self – Returns self for method chaining.
- Return type:
ExponentialFamily
- set_natural_params(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) ExponentialFamily[source]
Set parameters from natural parametrization.
- Parameters:
theta (ndarray) – Natural parameter vector.
- Returns:
self – Returns self for method chaining.
- Return type:
ExponentialFamily
Base classes for normal mixture distributions.
Normal mixture distributions have the form:
\[X = \mu + \gamma Y + \sqrt{Y} Z\]
where \(Z \sim N(0, \Sigma)\) is a Gaussian random vector independent of \(Y\), and \(Y\) follows some positive mixing distribution (GIG, Gamma, Inverse Gaussian, etc.).
This module provides two abstract base classes:
JointNormalMixture: The joint distribution \(f(x, y)\), which IS an exponential family. This class extends ExponentialFamily.
NormalMixture: The marginal distribution \(f(x) = \int f(x, y) dy\), which is NOT an exponential family. This class extends Distribution and owns a JointNormalMixture instance accessible via the .joint property.
The key insight is that while the marginal distribution of \(X\) (with \(Y\) integrated out) does not belong to the exponential family, the joint distribution \((X, Y)\) does belong to the exponential family when both variables are observed.
Method naming convention:
pdf(x): Marginal PDF \(f(x)\) (on NormalMixture)
joint.pdf(x, y): Joint PDF \(f(x, y)\) (via .joint property)
pdf_joint(x, y): Convenience alias for joint.pdf(x, y)
rvs(size): Sample from marginal (returns X only)
rvs_joint(size): Sample from joint (returns X, Y tuple)
- class normix.base.mixture.JointNormalMixture(d: int | None = None)[source]
Bases: ExponentialFamily, ABC
Abstract base class for joint normal mixture distributions \(f(x, y)\).
The joint distribution of \((X, Y)\) where:
\[\begin{aligned}X \mid Y &\sim N(\mu + \gamma Y, \Sigma Y)\\ Y &\sim \text{Mixing distribution}\end{aligned}\]
This joint distribution belongs to the exponential family with sufficient statistics:
\[t(x, y) = \begin{pmatrix} \log y \\ y^{-1} \\ y \\ x \\ x y^{-1} \\ x x^T y^{-1} \end{pmatrix}\]
The natural parameters are derived from the classical parameters \((\mu, \gamma, \Sigma, \text{mixing params})\).
Subclasses must implement:
_get_mixing_distribution_class(): Return the mixing distribution class
_mixing_natural_params(): Extract mixing distribution natural params
_mixing_sufficient_statistics(): Compute mixing dist sufficient stats
_mixing_log_partition(): Compute mixing distribution log partition
_classical_to_natural(): Convert classical to natural parameters
_natural_to_classical(): Convert natural to classical parameters
- _d
Dimension of the observed variable \(X\).
- Type:
int or None
- _natural_params
Internal storage for natural parameters.
- Type:
tuple or None
See also
NormalMixture
Marginal distribution (owns a JointNormalMixture)
ExponentialFamily
Parent class providing exponential family methods
- property L_Sigma_inv: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Inverse of the lower Cholesky factor of Sigma (cached).
\(L_\Sigma^{-1}\) such that \(\Sigma^{-1} = L_\Sigma^{-T} L_\Sigma^{-1}\).
- Returns:
L_inv – Lower triangular matrix.
- Return type:
ndarray, shape (d, d)
- conditional_cov_x_given_y(y: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Conditional covariance \(\text{Cov}[X | Y = y] = \Sigma y\).
- Parameters:
y (array_like) – Conditioning value(s) of Y.
- Returns:
cov – Conditional covariance of X given Y.
- Return type:
ndarray
- conditional_mean_x_given_y(y: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Conditional mean \(E[X | Y = y] = \mu + \gamma y\).
- Parameters:
y (array_like) – Conditioning value(s) of Y.
- Returns:
mean – Conditional mean of X given Y.
- Return type:
ndarray
- cov() Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], float][source]
Covariance of the joint distribution.
\[\text{Cov}[X] = E[Y] \Sigma + \text{Var}[Y] \gamma \gamma^T\]
- Returns:
Cov_X (ndarray) – Covariance matrix of X, shape (d, d)
Var_Y (float) – Variance of Y from the mixing distribution
- property d: int
Dimension of the observed variable X.
- property gamma_mahal_sq: float
Mahalanobis norm of gamma: \(\gamma^T \Sigma^{-1} \gamma\) (cached).
Used in marginal PDF computations.
- Returns:
mahal_sq
- Return type:
float
- get_L_Sigma() Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], float][source]
Get the lower Cholesky factor of Sigma and its log determinant.
- Returns:
L_Sigma (ndarray) – Lower Cholesky factor of Sigma, shape (d, d).
log_det_Sigma (float) – Log determinant of Sigma.
Notes
This method delegates to the internal _L_Sigma attribute and the log_det_Sigma cached property.
- property log_det_Sigma: float
Log-determinant of the covariance scale matrix (cached).
\[\log|\Sigma| = 2 \sum_{i=1}^d \log (L_\Sigma)_{ii}\]
- Returns:
log_det
- Return type:
float
- logpdf(x: ArrayLike, y: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Log joint probability density function \(\log f(x, y)\).
\[\log f(x, y) = \log h(x, y) + \theta^T t(x, y) - \psi(\theta)\]
- Parameters:
x (array_like) – Observed values of X. Shape (d,) or (n, d).
y (array_like) – Observed values of Y. Shape () or (n,).
- Returns:
logpdf – Log joint density values.
- Return type:
ndarray
- mean() Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], float][source]
Mean of the joint distribution.
- Returns:
E_X (ndarray) – \(E[X] = \mu + \gamma E[Y]\)
E_Y (float) – \(E[Y]\) from the mixing distribution
- pdf(x: ArrayLike, y: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Joint probability density function \(f(x, y)\).
- Parameters:
x (array_like) – Observed values of X. Shape (d,) or (n, d).
y (array_like) – Observed values of Y. Shape () or (n,).
- Returns:
pdf – Joint density values.
- Return type:
ndarray
- rvs(size: int | Tuple[int, ...] | None = None, random_state: int | Generator | None = None) Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]][source]
Generate random samples from the joint distribution.
Samples \((X, Y)\) pairs where:
\(Y \sim\) mixing distribution
\(X | Y \sim N(\mu + \gamma Y, \Sigma Y)\)
- Parameters:
size (int or tuple of ints, optional) – Number of samples to generate. If None, returns single sample.
random_state (int or Generator, optional) – Random number generator or seed.
- Returns:
X (ndarray) – Sampled X values. Shape (d,) if size is None, (n, d) otherwise.
Y (ndarray) – Sampled Y values. Shape () if size is None, (n,) otherwise.
- set_L_Sigma(L_Sigma: ndarray[tuple[Any, ...], dtype[_ScalarT]], lower: bool = True) None[source]
Set the Cholesky factor of Sigma.
Used by the M-step of the EM algorithm.
- Parameters:
L_Sigma (ndarray) – Cholesky factor of Sigma, shape (d, d).
lower (bool, default True) – If True, L_Sigma is lower triangular. If False, converts to lower triangular.
- class normix.base.mixture.NormalMixture[source]
Bases: Distribution, ABC
Abstract base class for marginal normal mixture distributions \(f(x)\).
The marginal distribution of \(X\) where:
\[f(x) = \int_0^\infty f(x, y) \, dy = \int_0^\infty f(x | y) f(y) \, dy\]
This marginal distribution is NOT an exponential family. However, the joint distribution \(f(x, y)\) IS an exponential family and can be accessed via the joint property.
Method naming convention:
pdf(x): Marginal PDF \(f(x)\) (default)
joint.pdf(x, y): Joint PDF \(f(x, y)\) (via .joint property)
pdf_joint(x, y): Convenience alias for joint.pdf(x, y)
rvs(size): Sample from marginal (returns X only)
rvs_joint(size): Sample from joint (returns X, Y tuple)
Subclasses must implement:
_create_joint_distribution(): Factory method to create the joint dist
_marginal_logpdf(): Compute the marginal log PDF
_conditional_expectation_y_given_x(): E[Y | X] for EM algorithm
- _joint
The underlying joint distribution instance.
- Type:
JointNormalMixture or None
See also
JointNormalMixture
Joint distribution (exponential family)
Distribution
Parent class defining the scipy-like API
Examples
>>> # Access marginal distribution methods (default)
>>> gh = GeneralizedHyperbolic.from_classical_params(mu=0, gamma=1, ...)
>>> gh.pdf(x)  # Marginal PDF f(x)
>>> gh.rvs(size=100)  # Sample X values only
>>> # Access joint distribution methods via .joint property
>>> gh.joint.pdf(x, y)  # Joint PDF f(x, y)
>>> gh.joint.sufficient_statistics(x, y)  # Exponential family stats
>>> gh.joint.natural_params  # Natural parameters
>>> # Convenience aliases for joint methods
>>> gh.pdf_joint(x, y)  # Same as gh.joint.pdf(x, y)
>>> gh.rvs_joint(size=100)  # Returns (X, Y) tuple
- property classical_params: Dict[str, Any]
Classical parameters of the distribution.
Delegates to the joint distribution’s classical parameters.
- Returns:
params – Dictionary of classical parameters.
- Return type:
dict
- cov() ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Covariance matrix of the marginal distribution.
\[\text{Cov}[X] = E[Y] \Sigma + \text{Var}[Y] \gamma \gamma^T\]
- Returns:
cov – Covariance matrix, shape (d, d).
- Return type:
ndarray
- property d: int
Dimension of the distribution.
- property expectation_params: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Expectation parameters \(\eta = E[t(X, Y)]\) of the joint.
- Returns:
eta – Expectation parameter vector.
- Return type:
ndarray
- fit(X: ArrayLike, y: ArrayLike | None = None, **kwargs) NormalMixture[source]
Fit distribution to data using EM algorithm.
When only X is observed (Y is latent), uses the EM algorithm. For complete data (both X and Y observed), use fit_complete().
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d).
y (array_like, optional) – Ignored.
**kwargs – Additional fitting parameters.
- Returns:
self – Fitted distribution.
- Return type:
NormalMixture
Note
Not yet implemented. Placeholder for EM algorithm.
- fit_complete(X: ArrayLike, Y: ArrayLike, **kwargs) NormalMixture[source]
Fit distribution from complete data (both X and Y observed).
Since the joint distribution is an exponential family, this uses closed-form MLE.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d).
Y (array_like) – Observed Y data, shape (n_samples,).
**kwargs – Additional fitting parameters.
- Returns:
self – Fitted distribution.
- Return type:
NormalMixture
Note
Not yet implemented.
- classmethod from_classical_params(**kwargs) NormalMixture[source]
Create distribution from classical parameters.
- Parameters:
**kwargs – Distribution-specific classical parameters. Common parameters:
mu : Location parameter (d,)
gamma : Skewness parameter (d,)
sigma : Covariance scale matrix (d, d)
Plus mixing distribution parameters.
- Returns:
dist – Distribution instance with parameters set.
- Return type:
NormalMixture
Examples
>>> gh = GeneralizedHyperbolic.from_classical_params(
...     mu=0.0, gamma=0.5, sigma=1.0, p=-0.5, a=1.0, b=1.0
... )
- classmethod from_expectation_params(eta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) NormalMixture[source]
Create distribution from expectation parameters.
The expectation parameters are those of the joint distribution.
- Parameters:
eta (ndarray) – Expectation parameter vector for the joint distribution.
- Returns:
dist – Distribution instance with parameters set.
- Return type:
NormalMixture
- classmethod from_natural_params(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) NormalMixture[source]
Create distribution from natural parameters.
The natural parameters are those of the joint distribution.
- Parameters:
theta (ndarray) – Natural parameter vector for the joint distribution.
- Returns:
dist – Distribution instance with parameters set.
- Return type:
NormalMixture
- property joint: JointNormalMixture
Access the joint distribution \(f(x, y)\).
The joint distribution is an ExponentialFamily with all associated methods (natural parameters, sufficient statistics, Fisher information, etc.).
- Returns:
joint – The underlying joint distribution instance.
- Return type:
JointNormalMixture
Examples
>>> gh = GeneralizedHyperbolic.from_classical_params(...)
>>> gh.joint.pdf(x, y)  # Joint PDF
>>> gh.joint.natural_params  # Natural parameters
>>> gh.joint.sufficient_statistics(x, y)  # Sufficient stats
- logpdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Log marginal probability density function \(\log f(x)\).
- Parameters:
x (array_like) – Points at which to evaluate the log PDF.
- Returns:
logpdf – Log probability density values.
- Return type:
ndarray
- logpdf_joint(x: ArrayLike, y: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Log joint probability density function \(\log f(x, y)\).
Convenience alias for self.joint.logpdf(x, y).
- Parameters:
x (array_like) – Observed X values.
y (array_like) – Observed Y values.
- Returns:
logpdf – Log joint density values.
- Return type:
ndarray
- mean() ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Mean of the marginal distribution.
\[E[X] = \mu + \gamma E[Y]\]
- Returns:
mean – Mean vector of X, shape (d,).
- Return type:
ndarray
- property mixing_distribution: ExponentialFamily
Access the mixing distribution of Y.
- Returns:
mixing – The mixing distribution (e.g., GIG, Gamma, InverseGaussian).
- Return type:
ExponentialFamily
- property natural_params: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Natural parameters \(\theta\) of the joint distribution.
Note: The marginal distribution itself is not an exponential family, so these are the natural parameters of the joint distribution.
- Returns:
theta – Natural parameter vector.
- Return type:
ndarray
- pdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Marginal probability density function \(f(x)\).
\[f(x) = \int_0^\infty f(x, y) \, dy\]
- Parameters:
x (array_like) – Points at which to evaluate the PDF.
- Returns:
pdf – Probability density values.
- Return type:
ndarray
- pdf_joint(x: ArrayLike, y: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Joint probability density function \(f(x, y)\).
Convenience alias for self.joint.pdf(x, y).
- Parameters:
x (array_like) – Observed X values.
y (array_like) – Observed Y values.
- Returns:
pdf – Joint density values.
- Return type:
ndarray
- rvs(size: int | Tuple[int, ...] | None = None, random_state: int | Generator | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Generate random samples from the marginal distribution.
Samples X values only (Y is integrated out).
- Parameters:
size (int or tuple of ints, optional) – Number of samples to generate.
random_state (int or Generator, optional) – Random number generator or seed.
- Returns:
X – Sampled X values. Shape (d,) if size is None, else (n, d) or (*size, d).
- Return type:
ndarray
- rvs_joint(size: int | Tuple[int, ...] | None = None, random_state: int | Generator | None = None) Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]][source]
Generate random samples from the joint distribution.
Samples both X and Y values.
- Parameters:
size (int or tuple of ints, optional) – Number of samples to generate.
random_state (int or Generator, optional) – Random number generator or seed.
- Returns:
X (ndarray) – Sampled X values.
Y (ndarray) – Sampled Y values (latent mixing variable).
- set_classical_params(**kwargs) NormalMixture[source]
Set parameters from classical parametrization.
- Parameters:
**kwargs – Distribution-specific classical parameters.
- Returns:
self – Returns self for method chaining.
- Return type:
NormalMixture
- set_expectation_params(eta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) NormalMixture[source]
Set parameters from expectation parametrization.
- Parameters:
eta (ndarray) – Expectation parameter vector for the joint distribution.
- Returns:
self – Returns self for method chaining.
- Return type:
NormalMixture
- set_natural_params(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]]) NormalMixture[source]
Set parameters from natural parametrization.
- Parameters:
theta (ndarray) – Natural parameter vector for the joint distribution.
- Returns:
self – Returns self for method chaining.
- Return type:
NormalMixture
Univariate Distributions
Exponential distribution as an exponential family.
The Exponential distribution has PDF:
\[p(x|\lambda) = \lambda e^{-\lambda x}\]
for \(x \geq 0\), where \(\lambda > 0\) is the rate parameter.
Exponential family form:
\(h(x) = 1\) for \(x \geq 0\) (base measure)
\(t(x) = x\) (sufficient statistic)
\(\theta = -\lambda\) (natural parameter)
\(\psi(\theta) = -\log(-\theta)\) (log partition function)
Parametrizations:
Classical: \(\lambda\) (rate), \(\lambda > 0\)
Natural: \(\theta = -\lambda\), \(\theta < 0\)
Expectation: \(\eta = E[X] = 1/\lambda\), \(\eta > 0\)
Note: scipy uses scale = 1/rate parametrization.
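The correspondence is quick to verify against scipy (a hedged check relying only on the relations stated above):

import numpy as np
from scipy.stats import expon

rate = 2.0
x = np.linspace(0.0, 3.0, 50)
# scipy's scale is 1/rate, so expon(scale=1/rate) matches lambda * exp(-lambda * x)
print(np.allclose(expon(scale=1.0 / rate).pdf(x), rate * np.exp(-rate * x)))  # True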
- class normix.distributions.univariate.exponential.Exponential[source]
Bases: ExponentialFamily
Exponential distribution in exponential family form.
The Exponential distribution has PDF:
\[p(x|\lambda) = \lambda e^{-\lambda x}\]
for \(x \geq 0\), where \(\lambda\) is the rate parameter.
- Parameters:
rate (float, optional) – Rate parameter \(\lambda > 0\). Use from_classical_params(rate=...).
- _natural_params
Internal storage for natural parameters \(\theta = -\lambda\).
- Type:
tuple or None
Examples
>>> # Create from rate parameter
>>> dist = Exponential.from_classical_params(rate=2.0)
>>> dist.mean()
0.5
>>> # Create from natural parameters
>>> dist = Exponential.from_natural_params(np.array([-2.0]))
>>> # Fit from data
>>> data = np.random.exponential(scale=0.5, size=1000)
>>> dist = Exponential().fit(data)
See also
Gamma
Generalization of Exponential (Exponential is Gamma with \(\alpha = 1\))
Notes
The Exponential distribution is a special case of the Gamma distribution with shape parameter \(\alpha = 1\):
\[\text{Exponential}(\lambda) = \text{Gamma}(1, \lambda)\]
Exponential family form:
Sufficient statistic: \(t(x) = x\)
Natural parameter: \(\theta = -\lambda\)
Log partition: \(\psi(\theta) = -\log(-\theta)\)
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families.
- cdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Cumulative distribution function: F(x) = 1 - exp(-λx) for x ≥ 0.
- Parameters:
x (array_like) – Points at which to evaluate CDF.
- Returns:
cdf – CDF values.
- Return type:
ndarray
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Analytical Fisher information: I(theta) = 1/theta^2.
The Fisher information matrix (Hessian of log partition) is:
\[I(\theta) = \frac{1}{\theta^2} = \frac{1}{\lambda^2}\]
- Parameters:
theta (ndarray, optional) – Natural parameter vector. If None, uses current parameters.
- Returns:
fisher – Fisher information matrix.
- Return type:
ndarray, shape (1, 1)
- mean() float[source]
Mean of Exponential distribution: E[X] = 1/lambda.
\[E[X] = \frac{1}{\lambda}\]
- Returns:
mean – Mean of the distribution.
- Return type:
float
- rvs(size=None, random_state=None)[source]
Generate random samples from the exponential distribution.
- Parameters:
size (int or tuple of ints, optional) – Shape of samples to generate.
random_state (int or Generator, optional) – Random number generator seed or instance.
- Returns:
samples – Random samples from the distribution.
- Return type:
float or ndarray
Gamma distribution as an exponential family.
The Gamma distribution has PDF:
\[p(x|\alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}\]
for \(x > 0\), where \(\alpha > 0\) is the shape parameter and \(\beta > 0\) is the rate parameter.
Exponential family form:
\(h(x) = 1\) for \(x > 0\) (base measure)
\(t(x) = [\log x, x]\) (sufficient statistics)
\(\theta = [\alpha - 1, -\beta]\) (natural parameters)
\(\psi(\theta) = \log\Gamma(\theta_1 + 1) - (\theta_1 + 1)\log(-\theta_2)\) (log partition)
Parametrizations:
Classical: \(\alpha\) (shape), \(\beta\) (rate), \(\alpha > 0, \beta > 0\)
Natural: \(\theta = [\alpha - 1, -\beta]\), \(\theta_1 > -1, \theta_2 < 0\)
Expectation: \(\eta = [\psi(\alpha) - \log\beta, \alpha/\beta]\), where \(\psi\) is the digamma function
Note: scipy uses scale = 1/rate parametrization.
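For example, the expectation parameters above can be evaluated directly with scipy's digamma (a hedged illustration of the formulas, not library code):

import numpy as np
from scipy.special import digamma

alpha, beta = 2.0, 1.0  # shape, rate
eta = np.array([digamma(alpha) - np.log(beta),  # E[log X]
                alpha / beta])                  # E[X]
print(eta)  # [~0.4228, 2.0]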
- class normix.distributions.univariate.gamma.Gamma[source]
Bases: ExponentialFamily
Gamma distribution in exponential family form.
The Gamma distribution has PDF:
\[p(x|\alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}\]
for \(x > 0\), where \(\alpha\) is the shape parameter and \(\beta\) is the rate parameter.
- Parameters:
shape (float, optional) – Shape parameter \(\alpha > 0\). Use from_classical_params(shape=..., rate=...).
rate (float, optional) – Rate parameter \(\beta > 0\). Use from_classical_params(shape=..., rate=...).
- _natural_params
Internal storage for natural parameters \(\theta = [\alpha - 1, -\beta]\).
- Type:
tuple or None
Examples
>>> # Create from shape and rate parameters
>>> dist = Gamma.from_classical_params(shape=2.0, rate=1.0)
>>> dist.mean()
2.0
>>> # Create from natural parameters
>>> dist = Gamma.from_natural_params(np.array([1.0, -1.0]))
>>> # Fit from data
>>> data = np.random.gamma(shape=2.0, scale=1.0, size=1000)
>>> dist = Gamma().fit(data)
See also
InverseGamma
Inverse of Gamma distribution
GeneralizedInverseGaussian
Generalization including Gamma as special case
Exponential
Special case of Gamma with \(\alpha = 1\)
Notes
The Gamma distribution belongs to the exponential family with:
Sufficient statistics: \(t(x) = [\log x, x]\)
Natural parameters: \(\theta = [\alpha - 1, -\beta]\)
Log partition: \(\psi(\theta) = \log\Gamma(\theta_1 + 1) - (\theta_1 + 1)\log(-\theta_2)\)
The expectation parameters are:
\[\eta = [\psi(\alpha) - \log\beta, \alpha/\beta]\]
where \(\psi\) is the digamma function.
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families.
- cdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Cumulative distribution function using scipy’s implementation.
- Parameters:
x (array_like) – Points at which to evaluate CDF.
- Returns:
cdf – CDF values.
- Return type:
ndarray or float
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Analytical Fisher information: I(theta) = Hessian of psi(theta).
The Fisher information matrix (Hessian of log partition) is:
\[I(\theta) = \begin{pmatrix} \psi'(\alpha) & 1/\beta \\ 1/\beta & \alpha/\beta^2 \end{pmatrix}\]
where \(\psi'\) is the trigamma function, \(\alpha = \theta_1 + 1\), and \(\beta = -\theta_2\).
- Parameters:
theta (ndarray, optional) – Natural parameter vector. If None, uses current parameters.
- Returns:
fisher – Fisher information matrix (positive definite).
- Return type:
ndarray, shape (2, 2)
- mean() float[source]
Mean of Gamma distribution: E[X] = alpha/beta.
\[E[X] = \frac{\alpha}{\beta}\]
- Returns:
mean – Mean of the distribution.
- Return type:
float
- rvs(size=None, random_state=None)[source]
Generate random samples from the gamma distribution.
- Parameters:
size (int or tuple of ints, optional) – Shape of samples to generate.
random_state (int or Generator, optional) – Random number generator seed or instance.
- Returns:
samples – Random samples from the distribution.
- Return type:
float or ndarray
Inverse Gamma distribution as an exponential family.
The Inverse Gamma distribution has PDF:
\[p(x|\alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{-\alpha-1} e^{-\beta/x}\]
for \(x > 0\), where \(\alpha > 0\) is the shape parameter and \(\beta > 0\) is the rate parameter.
Exponential family form:
\(h(x) = 1\) for \(x > 0\) (base measure)
\(t(x) = [-1/x, \log x]\) (sufficient statistics)
\(\theta = [\beta, -(\alpha+1)]\) (natural parameters)
\(\psi(\theta) = \log\Gamma(-\theta_2-1) - (-\theta_2-1)\log(\theta_1)\) (log partition)
Parametrizations:
Classical: \(\alpha\) (shape), \(\beta\) (rate), \(\alpha > 0, \beta > 0\)
Natural: \(\theta = [\beta, -(\alpha+1)]\), \(\theta_1 > 0, \theta_2 < -1\)
Expectation: \(\eta = [-\alpha/\beta, \log\beta - \psi(\alpha)]\), where \(\psi\) is the digamma function
Note: We use rate \(\beta\) (like Gamma), while scipy uses scale = \(\beta\).
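Under that mapping, the mean formula \(E[X] = \beta/(\alpha-1)\) from the Notes below can be checked against scipy (a hedged sketch):

from scipy.stats import invgamma

alpha, beta = 3.0, 1.5
# scipy's scale corresponds to our rate beta, so the mean is beta/(alpha - 1)
print(invgamma(a=alpha, scale=beta).mean())  # 0.75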
- class normix.distributions.univariate.inverse_gamma.InverseGamma[source]
Bases: ExponentialFamily
Inverse Gamma distribution in exponential family form.
The Inverse Gamma distribution has PDF:
\[p(x|\alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{-\alpha-1} e^{-\beta/x}\]
for \(x > 0\), where \(\alpha\) is the shape parameter and \(\beta\) is the rate parameter.
If \(X \sim \text{Gamma}(\alpha, \beta)\), then \(1/X \sim \text{InvGamma}(\alpha, \beta)\).
- Parameters:
shape (float, optional) – Shape parameter \(\alpha > 0\). Use from_classical_params(shape=..., rate=...).
rate (float, optional) – Rate parameter \(\beta > 0\). Use from_classical_params(shape=..., rate=...).
- _natural_params
Internal storage for natural parameters \(\theta = [\beta, -(\alpha+1)]\).
- Type:
tuple or None
Examples
>>> # Create from shape and rate parameters
>>> dist = InverseGamma.from_classical_params(shape=3.0, rate=1.5)
>>> dist.mean()
0.75
>>> # Create from natural parameters
>>> dist = InverseGamma.from_natural_params(np.array([1.5, -4.0]))
>>> # Fit from data
>>> from scipy.stats import invgamma
>>> data = invgamma.rvs(a=3.0, scale=1.5, size=1000)
>>> dist = InverseGamma().fit(data)
See also
Gamma
Inverse of InverseGamma distribution
GeneralizedInverseGaussian
Generalization including InverseGamma as special case
Notes
The Inverse Gamma distribution belongs to the exponential family with:
Sufficient statistics: \(t(x) = [-1/x, \log x]\)
Natural parameters: \(\theta = [\beta, -(\alpha+1)]\)
Log partition: \(\psi(\theta) = \log\Gamma(-\theta_2-1) - (-\theta_2-1)\log(\theta_1)\)
The mean exists only for \(\alpha > 1\):
\[E[X] = \frac{\beta}{\alpha - 1}\]
The variance exists only for \(\alpha > 2\):
\[\text{Var}[X] = \frac{\beta^2}{(\alpha-1)^2(\alpha-2)}\]
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families.
- cdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Cumulative distribution function using scipy’s implementation.
- Parameters:
x (array_like) – Points at which to evaluate CDF.
- Returns:
cdf – CDF values.
- Return type:
ndarray or float
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Analytical Fisher information: I(θ) = ∇²ψ(θ).
The Hessian entries are:
\[I_{11} = \alpha/\beta^2, \quad I_{12} = I_{21} = 1/\beta, \quad I_{22} = \psi'(\alpha)\]
where \(\psi'\) is the trigamma function, \(\alpha = -\theta_2 - 1\), and \(\beta = \theta_1\).
- mean() float[source]
Mean of Inverse Gamma distribution: E[X] = beta/(alpha-1) for alpha > 1.
\[E[X] = \frac{\beta}{\alpha - 1} \quad \text{for } \alpha > 1\]
Returns infinity if \(\alpha \leq 1\).
- Returns:
mean – Mean of the distribution, or infinity if undefined.
- Return type:
float
- rvs(size=None, random_state=None)[source]
Generate random samples from the inverse gamma distribution.
- Parameters:
size (int or tuple of ints, optional) – Shape of samples to generate.
random_state (int or Generator, optional) – Random number generator seed or instance.
- Returns:
samples – Random samples from the distribution.
- Return type:
float or ndarray
Inverse Gaussian distribution (Wald distribution).
Special case of GIG with \(p = -1/2\). Also belongs to the exponential family.
The Inverse Gaussian distribution has PDF:
\[p(x|\mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\left(-\frac{\lambda(x-\mu)^2}{2\mu^2 x}\right)\]
for \(x > 0\), where \(\mu > 0\) is the mean and \(\lambda > 0\) is the shape parameter.
Exponential family form:
\(h(x) = 1/\sqrt{2\pi x^3}\) for \(x > 0\) (base measure)
\(t(x) = [x, 1/x]\) (sufficient statistics)
\(\theta = [-\lambda/(2\mu^2), -\lambda/2]\) (natural parameters)
\(\psi(\theta) = -2\sqrt{\theta_1\theta_2} - \frac{1}{2}\log(-2\theta_2)\) (log partition)
Parametrizations:
Classical: \(\mu\) (mean), \(\lambda\) (shape), \(\mu > 0, \lambda > 0\)
Natural: \(\theta = [-\lambda/(2\mu^2), -\lambda/2]\), \(\theta_1 < 0, \theta_2 < 0\)
Expectation: \(\eta = [\mu, 1/\mu + 1/\lambda]\)
Note: scipy uses (mu, scale) where:
scipy_mu = \(\mu/\lambda\) (shape parameter, confusingly named)
scipy_scale = \(\lambda\) (our shape parameter)
Relationship: \(\mu = \text{scipy\_mu} \times \text{scipy\_scale}\), \(\lambda = \text{scipy\_scale}\)
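A quick numerical check of this mapping (hedged; it relies only on the relations stated above):

from scipy.stats import invgauss

mu, lam = 1.5, 2.0
rv = invgauss(mu=mu / lam, scale=lam)  # scipy_mu = mu/lambda, scipy_scale = lambda
print(rv.mean())  # 1.5 == mu, the mean of InverseGaussian(mu, lambda)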
- class normix.distributions.univariate.inverse_gaussian.InverseGaussian[source]
Bases: ExponentialFamily
Inverse Gaussian distribution in exponential family form.
Also known as the Wald distribution. Special case of GIG with \(p = -1/2\).
The Inverse Gaussian distribution has PDF:
\[p(x|\mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\left(-\frac{\lambda(x-\mu)^2}{2\mu^2 x}\right)\]
for \(x > 0\), where \(\mu\) is the mean and \(\lambda\) is the shape parameter.
- Parameters:
mean (float, optional) – Mean parameter \(\mu > 0\). Use from_classical_params(mean=..., shape=...).
shape (float, optional) – Shape parameter \(\lambda > 0\). Use from_classical_params(mean=..., shape=...).
- _natural_params
Internal storage for natural parameters \(\theta = [-\lambda/(2\mu^2), -\lambda/2]\).
- Type:
tuple or None
Examples
>>> # Create from mean and shape parameters
>>> dist = InverseGaussian.from_classical_params(mean=1.0, shape=1.0)
>>> dist.mean()
1.0
>>> # Create from natural parameters
>>> dist = InverseGaussian.from_natural_params(np.array([-0.5, -0.5]))
>>> # Fit from data
>>> from scipy.stats import invgauss
>>> data = invgauss.rvs(mu=1.0, scale=1.0, size=1000)
>>> dist = InverseGaussian().fit(data)
See also
GeneralizedInverseGaussian
Generalization with parameter \(p\)
Notes
The Inverse Gaussian distribution belongs to the exponential family with:
Sufficient statistics: \(t(x) = [x, 1/x]\)
Natural parameters: \(\theta = [-\lambda/(2\mu^2), -\lambda/2]\)
Log partition: \(\psi(\theta) = -2\sqrt{\theta_1\theta_2} - \frac{1}{2}\log(-2\theta_2)\)
It is a special case of the Generalized Inverse Gaussian (GIG) with \(p = -1/2\):
\[\text{InvGauss}(\mu, \lambda) = \text{GIG}(p=-1/2, a=\lambda/\mu^2, b=\lambda)\]
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families.
Chhikara, R. S. & Folks, J. L. (1989). The Inverse Gaussian Distribution.
- cdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Cumulative distribution function using scipy’s implementation.
- Parameters:
x (array_like) – Points at which to evaluate CDF.
- Returns:
cdf – CDF values.
- Return type:
ndarray or float
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Analytical Fisher information: I(θ) = ∇²ψ(θ).
- mean() float[source]
Mean of Inverse Gaussian distribution: E[X] = mu.
\[E[X] = \mu\]
- Returns:
mean – Mean of the distribution.
- Return type:
float
- rvs(size=None, random_state=None)[source]
Generate random samples from the inverse Gaussian distribution.
Uses numpy’s wald distribution which has the same parameterization.
- Parameters:
size (int or tuple of ints, optional) – Shape of samples to generate.
random_state (int or Generator, optional) – Random number generator seed or instance.
- Returns:
samples – Random samples from the distribution.
- Return type:
float or ndarray
Generalized Inverse Gaussian (GIG) distribution.
The GIG distribution belongs to the exponential family and serves as a unifying framework for many important distributions.
The GIG distribution has PDF (Wikipedia parameterization):
\[f(x|p,a,b) = \frac{(a/b)^{p/2}}{2 K_p(\sqrt{ab})} x^{p-1} \exp\left(-\frac{ax + b/x}{2}\right)\]
for \(x > 0\), where:
\(p\): real-valued shape parameter (any real number)
\(a > 0\): rate parameter (coefficient of \(x\))
\(b > 0\): rate parameter (coefficient of \(1/x\))
\(K_p\) is the modified Bessel function of the second kind
Exponential family form:
\(h(x) = 1\) for \(x > 0\) (base measure)
\(t(x) = [\log x, 1/x, x]\) (sufficient statistics)
\(\theta = [p-1, -b/2, -a/2]\) (natural parameters)
\(\psi(\theta) = \log(2) + \log K_p(\sqrt{ab}) + \frac{p}{2}\log(b/a)\) (log partition)
Natural parameters: \(\theta = (\theta_1, \theta_2, \theta_3)\):
\(\theta_1 = p - 1\) (unbounded)
\(\theta_2 = -b/2 < 0\)
\(\theta_3 = -a/2 < 0\)
Scipy parametrization (scipy.stats.geninvgauss):
scipy uses \((p, b_{scipy}, \text{scale})\) where \(b_{scipy} = \sqrt{ab}\) and \(\text{scale} = \sqrt{b/a}\)
Special cases:
Gamma: \(b \to 0\) gives \(\text{Gamma}(p, a/2)\) for \(p > 0\)
Inverse Gamma: \(a \to 0\) gives \(\text{InvGamma}(-p, b/2)\) for \(p < 0\)
Inverse Gaussian: \(p = -1/2\)
References
https://en.wikipedia.org/wiki/Generalized_inverse_Gaussian_distribution
- normix.distributions.univariate.generalized_inverse_gaussian.GIG
alias of GeneralizedInverseGaussian
- class normix.distributions.univariate.generalized_inverse_gaussian.GeneralizedInverseGaussian[source]
Bases: ExponentialFamily
Generalized Inverse Gaussian (GIG) distribution in exponential family form.
The GIG distribution has PDF (Wikipedia parameterization):
\[f(x|p,a,b) = \frac{(a/b)^{p/2}}{2 K_p(\sqrt{ab})} x^{p-1} \exp\left(-\frac{ax + b/x}{2}\right)\]
for \(x > 0\), where \(K_p\) is the modified Bessel function of the second kind.
- Parameters:
p (float, optional) – Shape parameter (any real number). Use from_classical_params(p=..., a=..., b=...).
a (float, optional) – Rate parameter \(a > 0\) (coefficient of \(x\)).
b (float, optional) – Rate parameter \(b > 0\) (coefficient of \(1/x\)).
- _natural_params
Internal storage for natural parameters \(\theta = [p-1, -b/2, -a/2]\).
- Type:
tuple or None
Examples
>>> # Create from classical parameters
>>> dist = GeneralizedInverseGaussian.from_classical_params(p=1.0, a=1.0, b=1.0)
>>> dist.mean()
>>> # Create from natural parameters
>>> dist = GeneralizedInverseGaussian.from_natural_params(np.array([0.0, -0.5, -0.5]))
>>> # Fit from data
>>> from scipy.stats import geninvgauss
>>> data = geninvgauss.rvs(p=1.0, b=1.0, size=1000)
>>> dist = GeneralizedInverseGaussian().fit(data)
See also
Gamma
Special case when \(b \to 0\) for \(p > 0\)
InverseGamma
Special case when \(a \to 0\) for \(p < 0\)
InverseGaussian
Special case when \(p = -1/2\)
Notes
The GIG distribution belongs to the exponential family with:
Sufficient statistics: \(t(x) = [\log x, 1/x, x]\)
Natural parameters: \(\theta = [p-1, -b/2, -a/2]\)
Log partition: \(\psi(\theta) = \log(2) + \log K_p(\sqrt{ab}) + \frac{p}{2}\log(b/a)\)
Moments are given by:
\[E[X^\alpha] = \left(\frac{b}{a}\right)^{\alpha/2} \frac{K_{p+\alpha}(\sqrt{ab})}{K_p(\sqrt{ab})}\]
Special cases:
\(b \to 0, p > 0\): \(\text{GIG} \to \text{Gamma}(p, a/2)\)
\(a \to 0, p < 0\): \(\text{GIG} \to \text{InvGamma}(-p, b/2)\)
\(p = -1/2\): \(\text{GIG} \to \text{InverseGaussian}\)
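The \(p = -1/2\) case can be verified numerically by combining the scipy mappings stated above for GIG and for the Inverse Gaussian (a hedged check, not library code):

import numpy as np
from scipy.stats import geninvgauss, invgauss

mu, lam = 1.0, 2.0
x = np.linspace(0.1, 5.0, 50)
# GIG(p=-1/2, a=lam/mu^2, b=lam) in scipy form: b_scipy = sqrt(ab) = lam/mu,
# scale = sqrt(b/a) = mu. InverseGaussian(mu, lam) in scipy form: (mu/lam, lam).
gig = geninvgauss(p=-0.5, b=lam / mu, scale=mu)
ig = invgauss(mu=mu / lam, scale=lam)
print(np.allclose(gig.pdf(x), ig.pdf(x)))  # True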
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families.
https://en.wikipedia.org/wiki/Generalized_inverse_Gaussian_distribution
- cdf(x: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Cumulative distribution function.
Uses scipy.stats.geninvgauss.
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Fisher information matrix \(I(\theta) = \nabla^2\psi(\theta)\).
Uses numerical differentiation since the analytical form is complex.
- classmethod from_scipy_params(p: float, b: float, scale: float = 1.0) GeneralizedInverseGaussian[source]
Create from scipy.stats.geninvgauss parameters.
- Conversion:
p = p_scipy
a = b_scipy / scale
b = b_scipy * scale
- Parameters:
p (float) – Shape parameter.
b (float) – Shape parameter b > 0 (this is √(ab) in our notation).
scale (float, optional) – Scale parameter > 0. Default is 1.0.
- Returns:
dist – Distribution instance.
- Return type:
GeneralizedInverseGaussian
- mean() float[source]
Mean of GIG distribution using Bessel function ratio.
\[E[X] = \sqrt{\frac{b}{a}} \frac{K_{p+1}(\sqrt{ab})}{K_p(\sqrt{ab})}\]
- Returns:
mean – Mean of the distribution.
- Return type:
float
- moment_alpha(alpha: float) float[source]
Compute the α-th moment E[X^α].
\[E[X^\alpha] = \left(\frac{b}{a}\right)^{\alpha/2} \frac{K_{p+\alpha}(\sqrt{ab})}{K_p(\sqrt{ab})}\]
- Parameters:
alpha (float) – Order of the moment.
- Returns:
moment – The α-th moment.
- Return type:
float
- ppf(q: ArrayLike) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Percent point function (inverse of CDF).
Uses scipy.stats.geninvgauss.
- rvs(size=None, random_state=None, method='scipy')[source]
Generate random samples from the GIG distribution.
- Parameters:
size (int or tuple of ints, optional) – Shape of samples to generate.
random_state (int or Generator, optional) – Random number generator seed or instance.
method (str, optional) – Sampling method: ‘scipy’ uses scipy.stats.geninvgauss (default), ‘naive’ uses inverse CDF method.
- Returns:
samples – Random samples from the distribution.
- Return type:
float or ndarray
- to_scipy_params()[source]
Convert to scipy.stats.geninvgauss parameters (p, b, scale).
- scipy parameterization:
f(x|p,b,scale) ∝ (x/scale)^{p-1} exp(-b((x/scale) + scale/x)/2)
- Conversion:
p_scipy = p
b_scipy = √(ab)
scale = √(b/a)
- Returns:
params – Dictionary with keys ‘p’, ‘b’, ‘scale’.
- Return type:
dict
Multivariate Distributions
Multivariate Normal distribution as an exponential family.
The multivariate Normal distribution has PDF:
\[p(x|\mu,\Sigma) = (2\pi)^{-d/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right)\]
for \(x \in \mathbb{R}^d\), where \(\mu\) is the mean vector and \(\Sigma\) is the covariance matrix.
Exponential family form:
\(h(x) = 1\) (base measure, with \((2\pi)^{-d/2}\) absorbed into \(\psi\))
\(t(x) = [x, \text{vec}(xx^T)]\) (sufficient statistics)
\(\theta = [\Lambda\mu, -\frac{1}{2}\text{vec}(\Lambda)]\) where \(\Lambda = \Sigma^{-1}\)
\(\psi(\theta) = \frac{1}{2}\mu^T\Lambda\mu - \frac{1}{2}\log|\Lambda| + \frac{d}{2}\log(2\pi)\)
Parametrizations:
Classical: \(\mu\) (mean, d-vector), \(\Sigma\) (covariance, d×d positive definite)
Natural: \(\theta = [\eta, \text{vec}(\Lambda_{half})]\) where \(\eta = \Lambda\mu\), \(\Lambda_{half} = -\frac{1}{2}\Lambda\)
Expectation: \(\eta = [E[X], E[XX^T]] = [\mu, \Sigma + \mu\mu^T]\)
Supports both univariate (d=1) and multivariate (d>1) cases. For d=1: \(\mu\) is scalar, \(\Sigma\) is scalar (variance \(\sigma^2\)).
Internal storage
After migration, the distribution stores the Cholesky decomposition of the covariance matrix rather than the full covariance or its inverse:
_mu: mean vector, shape (d,)
_L: lower Cholesky factor of \(\Sigma\), shape (d, d)
Derived quantities log_det_Sigma and L_inv are cached properties,
computed on demand and invalidated when parameters change.
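A minimal sketch of the Cholesky-based computation this storage enables (a hypothetical standalone helper mirroring the logpdf method documented below, not the class's actual code):

import numpy as np
from scipy.linalg import solve_triangular

def mvn_logpdf(x, mu, L):
    # Log density of N(mu, L @ L.T) at a single point x.
    d = mu.shape[0]
    z = solve_triangular(L, x - mu, lower=True)  # z = L^{-1}(x - mu)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))   # log|Sigma| = 2 sum_i log L_ii
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)

L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))
print(mvn_logpdf(np.array([1.0, 2.0]), np.array([1.0, 2.0]), L))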
- normix.distributions.multivariate.normal.MVN
alias of MultivariateNormal
- class normix.distributions.multivariate.normal.MultivariateNormal(d: int | None = None)[source]
Bases: ExponentialFamily
Multivariate Normal distribution in exponential family form.
The Multivariate Normal distribution has PDF:
\[p(x|\mu,\Sigma) = (2\pi)^{-d/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right)\]
Supports both univariate (d=1) and multivariate (d>1) cases.
- Parameters:
d (int, optional) – Dimension of the distribution. Inferred from parameters if not provided.
- _mu
Mean vector, shape (d,).
- Type:
ndarray or None
- _L
Lower Cholesky factor of \(\Sigma\), shape (d, d). \(\Sigma = L L^T\).
- Type:
ndarray or None
- _d
Dimension of the distribution.
- Type:
int or None
Examples
>>> # 1D case (univariate normal)
>>> dist = MultivariateNormal.from_classical_params(mu=0.0, sigma=np.array([[1.0]]))
>>> dist.mean()
array([0.])
>>> # 2D case
>>> mu = np.array([1.0, 2.0])
>>> sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
>>> dist = MultivariateNormal.from_classical_params(mu=mu, sigma=sigma)
>>> dist.mean()
array([1., 2.])
>>> # Fit from data
>>> data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=1000)
>>> dist = MultivariateNormal(d=2).fit(data)
Notes
The Multivariate Normal distribution belongs to the exponential family with:
Sufficient statistics: \(t(x) = [x, \text{vec}(xx^T)]\)
Natural parameters: \(\theta = [\Lambda\mu, -\frac{1}{2}\text{vec}(\Lambda)]\)
Log partition: \(\psi(\theta) = \frac{1}{2}\eta^T\Sigma\eta - \frac{1}{2}\log|\Lambda| + \frac{d}{2}\log(2\pi)\)
where \(\Lambda = \Sigma^{-1}\) is the precision matrix.
The internal storage uses the Cholesky decomposition \(L\) of the covariance \(\Sigma = LL^T\), which:
Avoids repeated matrix inversions in logpdf and rvs
Provides numerically stable log-determinant: \(\log|\Sigma| = 2\sum_i \log L_{ii}\)
Enables efficient Mahalanobis distance via solve_triangular
References
Barndorff-Nielsen, O. E. (1978). Information and exponential families.
- property L_inv: ndarray[tuple[Any, ...], dtype[_ScalarT]]
Inverse of the lower Cholesky factor (cached).
\(L^{-1}\) such that \(\Sigma^{-1} = L^{-T} L^{-1}\).
- Returns:
L_inv – Lower triangular matrix.
- Return type:
ndarray, shape (d, d)
- cdf(x: ArrayLike) float | ndarray[tuple[Any, ...], dtype[floating]][source]
Cumulative distribution function.
For d=1, uses the univariate normal CDF. For d>1, uses scipy’s multivariate_normal.cdf.
- Parameters:
x (array_like) – Points at which to evaluate CDF.
- Returns:
cdf – CDF values.
- Return type:
float or ndarray
- property d: int
Dimension of the distribution.
- entropy() float[source]
Differential entropy.
\[H(X) = \frac{d}{2}(1 + \log(2\pi)) + \frac{1}{2}\log|\Sigma|\]
- fisher_information(theta: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Fisher information matrix.
Uses numerical differentiation from base class.
- fit(X: ArrayLike, y: ArrayLike | None = None, **kwargs) MultivariateNormal[source]
Fit distribution parameters using Maximum Likelihood Estimation.
For the MVN, the MLE is:
\(\hat{\mu}\) = sample mean
\(\hat{\Sigma}\) = sample covariance
- Parameters:
X (array_like) – Training data. Shape (n_samples, d) or (n_samples,) for 1D.
y (array_like, optional) – Ignored.
- Returns:
self – Fitted distribution.
- Return type:
MultivariateNormal
- classmethod from_scipy(rv) MultivariateNormal[source]
Create from scipy.stats.multivariate_normal.
- property log_det_Sigma: float
Log-determinant of the covariance matrix (cached).
\[\log|\Sigma| = 2 \sum_{i=1}^d \log L_{ii}\]
- Returns:
log_det
- Return type:
float
- logpdf(x: ArrayLike) float | ndarray[tuple[Any, ...], dtype[floating]][source]
Log probability density using Cholesky-based computation.
Uses solve_triangular instead of np.linalg.inv for the Mahalanobis distance:
\[\log p(x|\mu,\Sigma) = -\frac{d}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(x-\mu)^T \Sigma^{-1}(x-\mu)\]
where \(\Sigma^{-1}(x-\mu) = L^{-T}(L^{-1}(x-\mu))\).
- Parameters:
x (array_like) – Points at which to evaluate log PDF. Shape (d,) for a single sample, (n, d) for n samples.
- Returns:
logpdf – Log probability density at each point.
- Return type:
float or ndarray
- pdf(x: ArrayLike) float | ndarray[tuple[Any, ...], dtype[floating]][source]
Probability density: p(x|θ) = exp(logpdf(x)).
- rvs(size=None, random_state=None) ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Generate random samples using \(X = \mu + L Z\) where \(Z \sim N(0, I)\).
- Parameters:
size (int or tuple of ints, optional) – Number of samples to generate.
random_state (int or Generator, optional) – Random number generator.
- Returns:
samples – Shape (size, d) when size is given, (d,) when size is None.
- Return type:
ndarray
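A minimal sketch of this sampling rule (illustrative names, not the library API):

import numpy as np

rng = np.random.default_rng(42)
mu = np.array([1.0, 2.0])
L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))

Z = rng.standard_normal((1000, 2))   # Z ~ N(0, I), one row per sample
samples = mu + Z @ L.T               # each row ~ N(mu, L L^T)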
Mixture Distributions
Variance Gamma (VG) marginal distribution \(f(x)\).
The marginal distribution of \(X\) obtained by integrating out \(Y\):
\[f(x) = \int_0^\infty f(x \mid y)\, f(y)\, dy\]
where:
\(X \mid Y = y \sim N(\mu + y\gamma,\, y\Sigma)\)
\(Y \sim \text{Gamma}(\alpha, \beta)\) (shape \(\alpha\), rate \(\beta\))
The marginal PDF has a closed form involving modified Bessel functions:
\[f(x) = \frac{2\beta^\alpha}{(2\pi)^{d/2} |\Sigma|^{1/2} \Gamma(\alpha)}\, e^{(x-\mu)^T\Sigma^{-1}\gamma} \left(\frac{q(x)}{2c}\right)^{(\alpha - d/2)/2} K_{\alpha - d/2}\!\left(\sqrt{2 c\, q(x)}\right)\]
where \(q(x) = (x-\mu)^T\Sigma^{-1}(x-\mu)\) is the squared Mahalanobis distance and \(c = \beta + \frac{1}{2}\gamma^T\Sigma^{-1}\gamma\).
Note: The marginal distribution is NOT an exponential family, but the joint
distribution \(f(x, y)\) IS an exponential family (accessible via .joint).
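The mixture representation gives a direct way to draw marginal VG samples; a minimal sketch under the parametrization above (illustrative, not the library's rvs implementation):

import numpy as np

rng = np.random.default_rng(0)
mu, gamma = np.array([0.0]), np.array([0.5])
L = np.linalg.cholesky(np.array([[1.0]]))  # Sigma = L L^T
alpha, beta = 2.0, 1.0                     # Gamma shape and rate

Y = rng.gamma(shape=alpha, scale=1.0 / beta, size=5000)  # mixing variable
Z = rng.standard_normal((5000, 1))
X = mu + Y[:, None] * gamma + np.sqrt(Y)[:, None] * (Z @ L.T)  # marginal VG draws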
- class normix.distributions.mixtures.variance_gamma.VarianceGamma[source]
Bases:
NormalMixture
Variance Gamma (VG) marginal distribution.
The Variance Gamma distribution is a normal variance-mean mixture where the mixing distribution is Gamma. It is a special case of the Generalized Hyperbolic distribution with GIG parameter \(b \to 0\).
The marginal distribution \(f(x)\) is NOT an exponential family. The joint distribution \(f(x, y)\) IS an exponential family and can be accessed via the
.joint property.
- Parameters:
None. Use factory methods to create instances.
- _joint
The underlying joint distribution.
- Type:
JointVarianceGamma
Examples
>>> # 1D case
>>> vg = VarianceGamma.from_classical_params(
...     mu=np.array([0.0]),
...     gamma=np.array([0.5]),
...     sigma=np.array([[1.0]]),
...     shape=2.0,
...     rate=1.0
... )
>>> x = vg.rvs(size=1000, random_state=42)  # Samples X only
>>> # Access joint distribution
>>> vg.joint.pdf(x, y)  # Joint PDF f(x, y)
>>> vg.joint.natural_params  # Natural parameters (exponential family)
>>> # 2D case
>>> vg = VarianceGamma.from_classical_params(
...     mu=np.array([0.0, 0.0]),
...     gamma=np.array([0.5, -0.3]),
...     sigma=np.array([[1.0, 0.3], [0.3, 1.0]]),
...     shape=2.0,
...     rate=1.0
... )
See also
JointVarianceGamma
Joint distribution (exponential family)
GeneralizedHyperbolic
General case with GIG mixing
NormalInverseGaussian
Special case with Inverse Gaussian mixing
Notes
The Variance Gamma distribution has heavier tails than the normal distribution, controlled by the Gamma shape parameter \(\alpha\). Smaller \(\alpha\) leads to heavier tails.
The skewness is controlled by the \(\gamma\) parameter:
\(\gamma = 0\): symmetric distribution
\(\gamma > 0\): right-skewed
\(\gamma < 0\): left-skewed
- fit(X: ArrayLike, y: ArrayLike | None = None, *, max_iter: int = 100, tol: float = 0.001, verbose: int = 0, random_state: int | Generator | None = None) VarianceGamma[source]
Fit distribution to data using the EM algorithm.
When only X is observed (Y is latent), uses the EM algorithm described in the theory documentation. The E-step computes conditional expectations of Y given X, and the M-step updates parameters using closed-form formulas.
Convergence is checked based on the relative change in the normal parameters \((\mu, \gamma, \Sigma)\). Log-likelihood is optionally displayed per iteration when verbose >= 1.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d) or (n_samples,) for d=1.
y (array_like, optional) – Ignored (for sklearn API compatibility).
max_iter (int, optional) – Maximum number of EM iterations. Default is 100.
tol (float, optional) – Convergence tolerance for relative parameter change. Default is 1e-3.
verbose (int, optional) – Verbosity level. 0 = silent, 1 = progress with llh, 2 = detailed with parameter norms. Default is 0.
random_state (int or Generator, optional) – Random state for initialization.
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
VarianceGamma
Examples
>>> # Generate data from known distribution
>>> true_dist = VarianceGamma.from_classical_params(
...     mu=np.array([0.0]), gamma=np.array([0.5]),
...     sigma=np.array([[1.0]]), shape=2.0, rate=1.0
... )
>>> X = true_dist.rvs(size=5000, random_state=42)
>>>
>>> # Fit new distribution
>>> fitted = VarianceGamma().fit(X)
>>> print(fitted.classical_params)
Notes
The EM algorithm iterates between:
E-step: Compute conditional expectations
\[E[Y | X = x_j], \quad E[1/Y | X = x_j], \quad E[\log Y | X = x_j]\]
M-step: Update parameters using closed-form formulas
\[\begin{split}\mu &= \frac{\bar{X} - \overline{E[Y]} \cdot \overline{X/Y}}{1 - \overline{E[1/Y]} \cdot \overline{E[Y]}} \\ \gamma &= \frac{\overline{X/Y} - \overline{E[1/Y]} \cdot \bar{X}}{1 - \overline{E[1/Y]} \cdot \overline{E[Y]}}\end{split}\]
The Gamma parameters \((\alpha, \beta)\) are updated via Newton’s method.
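A minimal sketch of the closed-form \((\mu, \gamma)\) part of this M-step, given precomputed E-step arrays (illustrative; function and variable names are not the library's internals):

import numpy as np

def m_step_mu_gamma(x, e_inv_y, e_y):
    """Closed-form (mu, gamma) update from E-step expectations (sketch).

    x : (n, d) observations; e_inv_y, e_y : (n,) arrays of
    E[1/Y | X = x_j] and E[Y | X = x_j].
    """
    x_bar = x.mean(axis=0)
    a_bar = e_inv_y.mean()                         # mean of E[1/Y]
    b_bar = e_y.mean()                             # mean of E[Y]
    xw_bar = (e_inv_y[:, None] * x).mean(axis=0)   # mean of E[1/Y] * x_j
    denom = 1.0 - a_bar * b_bar
    mu = (x_bar - b_bar * xw_bar) / denom
    gamma = (xw_bar - a_bar * x_bar) / denom
    return mu, gamma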
- fit_complete(X: ArrayLike, Y: ArrayLike, **kwargs) VarianceGamma[source]
Fit distribution from complete data (both X and Y observed).
Since the joint distribution is an exponential family, this uses the closed-form MLE via the joint distribution’s fit method.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d) or (n_samples,) for d=1.
Y (array_like) – Observed Y data (mixing variable), shape (n_samples,).
**kwargs – Additional fitting parameters passed to joint.fit().
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
VarianceGamma
Examples
>>> # Generate complete data
>>> true_dist = VarianceGamma.from_classical_params(
...     mu=np.array([0.0]), gamma=np.array([0.5]),
...     sigma=np.array([[1.0]]), shape=2.0, rate=1.0
... )
>>> X, Y = true_dist.rvs_joint(size=5000, random_state=42)
>>>
>>> # Fit with complete data
>>> fitted = VarianceGamma().fit_complete(X, Y)
Normal Inverse Gamma (NInvG) marginal distribution \(f(x)\).
The marginal distribution of \(X\) obtained by integrating out \(Y\):
\[f(x) = \int_0^\infty f(x \mid y)\, f(y)\, dy\]
where:
\(X \mid Y = y \sim N(\mu + y\gamma,\, y\Sigma)\)
\(Y \sim \text{InverseGamma}(\alpha, \beta)\) (shape \(\alpha\), rate \(\beta\))
The marginal PDF has a closed form involving modified Bessel functions, where \(q(x) = (x-\mu)^T\Sigma^{-1}(x-\mu)\) is the squared Mahalanobis distance and \(c\) depends on the skewness parameter \(\gamma\).
Note: The marginal distribution is NOT an exponential family, but the joint
distribution \(f(x, y)\) IS an exponential family (accessible via .joint).
This is a special case of the Generalized Hyperbolic distribution with GIG parameter \(a \to 0\) (inverse gamma mixing).
- class normix.distributions.mixtures.normal_inverse_gamma.NormalInverseGamma[source]
Bases:
NormalMixture
Normal Inverse Gamma (NInvG) marginal distribution.
The Normal Inverse Gamma distribution is a normal variance-mean mixture where the mixing distribution is InverseGamma. It is a special case of the Generalized Hyperbolic distribution with GIG parameter \(a \to 0\).
The marginal distribution \(f(x)\) is NOT an exponential family. The joint distribution \(f(x, y)\) IS an exponential family and can be accessed via the
.joint property.
- Parameters:
None. Use factory methods to create instances.
- _joint
The underlying joint distribution.
- Type:
JointNormalInverseGamma
Examples
>>> # 1D case
>>> nig = NormalInverseGamma.from_classical_params(
...     mu=np.array([0.0]),
...     gamma=np.array([0.5]),
...     sigma=np.array([[1.0]]),
...     shape=3.0,
...     rate=1.0
... )
>>> x = nig.rvs(size=1000, random_state=42)  # Samples X only
>>> # Access joint distribution
>>> nig.joint.pdf(x, y)  # Joint PDF f(x, y)
>>> nig.joint.natural_params  # Natural parameters (exponential family)
>>> # 2D case
>>> nig = NormalInverseGamma.from_classical_params(
...     mu=np.array([0.0, 0.0]),
...     gamma=np.array([0.5, -0.3]),
...     sigma=np.array([[1.0, 0.3], [0.3, 1.0]]),
...     shape=3.0,
...     rate=1.0
... )
See also
JointNormalInverseGamma
Joint distribution (exponential family)
GeneralizedHyperbolic
General case with GIG mixing
VarianceGamma
Special case with Gamma mixing
Notes
The Normal Inverse Gamma distribution has heavier tails than the normal distribution, controlled by the InverseGamma shape parameter \(\alpha\). Smaller \(\alpha\) leads to heavier tails.
The skewness is controlled by the \(\gamma\) parameter:
\(\gamma = 0\): symmetric distribution
\(\gamma > 0\): right-skewed
\(\gamma < 0\): left-skewed
For the mean to exist, we need \(\alpha > 1\). For the variance to exist, we need \(\alpha > 2\).
- fit(X: ArrayLike, y: ArrayLike | None = None, *, max_iter: int = 100, tol: float = 0.001, verbose: int = 0, random_state: int | Generator | None = None) NormalInverseGamma[source]
Fit distribution to data using the EM algorithm.
When only X is observed (Y is latent), uses the EM algorithm described in the theory documentation. The E-step computes conditional expectations of Y given X, and the M-step updates parameters using closed-form formulas.
Convergence is checked based on the relative change in the normal parameters \((\mu, \gamma, \Sigma)\). Log-likelihood is optionally displayed per iteration when verbose >= 1.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d) or (n_samples,) for d=1.
y (array_like, optional) – Ignored (for sklearn API compatibility).
max_iter (int, optional) – Maximum number of EM iterations. Default is 100.
tol (float, optional) – Convergence tolerance for relative parameter change. Default is 1e-3.
verbose (int, optional) – Verbosity level. 0 = silent, 1 = progress with llh, 2 = detailed with parameter norms. Default is 0.
random_state (int or Generator, optional) – Random state for initialization.
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
NormalInverseGamma
Notes
The EM algorithm iterates between:
E-step: Compute conditional expectations
\[E[Y | X = x_j], \quad E[1/Y | X = x_j], \quad E[\log Y | X = x_j]\]
M-step: Update parameters using closed-form formulas
The InverseGamma parameters \((\alpha, \beta)\) are updated via Newton’s method.
For valid estimates, we need \(\alpha > 1\) for the mean to exist.
- fit_complete(X: ArrayLike, Y: ArrayLike, **kwargs) NormalInverseGamma[source]
Fit distribution from complete data (both X and Y observed).
Since the joint distribution is an exponential family, this uses the closed-form MLE via the joint distribution’s fit method.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d) or (n_samples,) for d=1.
Y (array_like) – Observed Y data (mixing variable), shape (n_samples,).
**kwargs – Additional fitting parameters passed to joint.fit().
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
NormalInverseGamma
Normal Inverse Gaussian (NIG) marginal distribution \(f(x)\).
The marginal distribution of \(X\) obtained by integrating out \(Y\):
\[f(x) = \int_0^\infty f(x \mid y)\, f(y)\, dy\]
where:
\(X \mid Y = y \sim N(\mu + y\gamma,\, y\Sigma)\)
\(Y \sim \text{InverseGaussian}(\delta, \eta)\) (mean \(\delta\), shape \(\eta\))
The marginal PDF has a closed form involving the modified Bessel function \(K_1\) in the univariate case, where \(q(x) = (x-\mu)^T \Sigma^{-1}(x-\mu)\).
This is a special case of the Generalized Hyperbolic distribution with GIG parameter \(p = -1/2\).
Note: The marginal distribution is NOT an exponential family, but the joint
distribution \(f(x, y)\) IS an exponential family (accessible via .joint).
- class normix.distributions.mixtures.normal_inverse_gaussian.NormalInverseGaussian[source]
Bases:
NormalMixture
Normal Inverse Gaussian (NIG) marginal distribution.
The Normal Inverse Gaussian distribution is a normal variance-mean mixture where the mixing distribution is Inverse Gaussian. It is a special case of the Generalized Hyperbolic distribution with GIG parameter \(p = -1/2\).
The marginal distribution \(f(x)\) is NOT an exponential family. The joint distribution \(f(x, y)\) IS an exponential family and can be accessed via the
.joint property.
- Parameters:
None. Use factory methods to create instances.
- _joint
The underlying joint distribution.
- Type:
JointNormalInverseGaussian
Examples
>>> # 1D case
>>> nig = NormalInverseGaussian.from_classical_params(
...     mu=np.array([0.0]),
...     gamma=np.array([0.5]),
...     sigma=np.array([[1.0]]),
...     delta=1.0,
...     eta=1.0
... )
>>> x = nig.rvs(size=1000, random_state=42)  # Samples X only
>>> # Access joint distribution
>>> nig.joint.pdf(x, y)  # Joint PDF f(x, y)
>>> nig.joint.natural_params  # Natural parameters (exponential family)
>>> # 2D case
>>> nig = NormalInverseGaussian.from_classical_params(
...     mu=np.array([0.0, 0.0]),
...     gamma=np.array([0.5, -0.3]),
...     sigma=np.array([[1.0, 0.3], [0.3, 1.0]]),
...     delta=1.0,
...     eta=1.0
... )
See also
JointNormalInverseGaussian
Joint distribution (exponential family)
GeneralizedHyperbolic
General case with GIG mixing
VarianceGamma
Special case with Gamma mixing
NormalInverseGamma
Special case with InverseGamma mixing
Notes
The NIG distribution is widely used in finance due to its semi-heavy tails and ability to model both skewness and excess kurtosis. It was introduced by Barndorff-Nielsen (1977) and provides a good fit for asset returns.
The skewness is controlled by the \(\gamma\) parameter:
\(\gamma = 0\): symmetric distribution
\(\gamma > 0\): right-skewed
\(\gamma < 0\): left-skewed
The tail heaviness is controlled by \(\eta\): smaller \(\eta\) gives heavier tails.
- fit(X: ArrayLike, y: ArrayLike | None = None, *, max_iter: int = 100, tol: float = 0.001, verbose: int = 0, random_state: int | Generator | None = None) NormalInverseGaussian[source]
Fit distribution to data using the EM algorithm.
When only X is observed (Y is latent), uses the EM algorithm described in the theory documentation. The E-step computes conditional expectations of Y given X, and the M-step updates parameters using closed-form formulas.
Convergence is checked based on the relative change in the normal parameters \((\mu, \gamma, \Sigma)\). Log-likelihood is optionally displayed per iteration when verbose >= 1.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d) or (n_samples,) for d=1.
y (array_like, optional) – Ignored (for sklearn API compatibility).
max_iter (int, optional) – Maximum number of EM iterations. Default is 100.
tol (float, optional) – Convergence tolerance for relative parameter change. Default is 1e-3.
verbose (int, optional) – Verbosity level. 0 = silent, 1 = progress with llh, 2 = detailed with parameter norms. Default is 0.
random_state (int or Generator, optional) – Random state for initialization.
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
NormalInverseGaussian
Notes
The EM algorithm iterates between:
E-step: Compute conditional expectations
\[E[Y | X = x_j], \quad E[1/Y | X = x_j], \quad E[\log Y | X = x_j]\]
M-step: Update parameters using closed-form formulas
The Inverse Gaussian parameters \((\delta, \eta)\) are updated analytically:
\(\delta = E[Y]\)
\(\eta = 1 / (E[1/Y] - 1/\delta)\)
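A minimal sketch of this analytic update, given the per-sample E-step arrays (illustrative function and variable names):

import numpy as np

def ig_m_step(e_y, e_inv_y):
    """Analytic (delta, eta) update from E-step expectations (sketch)."""
    delta = np.mean(e_y)                          # delta = mean of E[Y | x_j]
    eta = 1.0 / (np.mean(e_inv_y) - 1.0 / delta)  # eta from E[1/Y | x_j]
    return delta, eta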
- fit_complete(X: ArrayLike, Y: ArrayLike, **kwargs) NormalInverseGaussian[source]
Fit distribution from complete data (both X and Y observed).
Since the joint distribution is an exponential family, this uses the closed-form MLE via the joint distribution’s fit method.
- Parameters:
X (array_like) – Observed X data, shape (n_samples, d) or (n_samples,) for d=1.
Y (array_like) – Observed Y data (mixing variable), shape (n_samples,).
**kwargs – Additional fitting parameters passed to joint.fit().
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
NormalInverseGaussian
Generalized Hyperbolic (GH) marginal distribution \(f(x)\).
The marginal distribution of \(X\) obtained by integrating out \(Y\):
\[f(x) = \int_0^\infty f(x \mid y)\, f(y)\, dy\]
where:
\(X \mid Y = y \sim N(\mu + y\gamma,\, y\Sigma)\)
\(Y \sim \text{GIG}(p, a, b)\)
The marginal PDF has a closed form involving modified Bessel functions.
Note: The marginal distribution is NOT an exponential family, but the joint
distribution \(f(x, y)\) IS an exponential family (accessible via .joint).
Special cases:
Variance Gamma (VG): \(b \to 0\) (GIG → Gamma)
Normal-Inverse Gaussian (NIG): \(p = -1/2\) (GIG → InverseGaussian)
Normal-Inverse Gamma (NInvG): \(a \to 0\) (GIG → InverseGamma)
Hyperbolic: \(p = 1\)
- class normix.distributions.mixtures.generalized_hyperbolic.GeneralizedHyperbolic[source]
Bases:
NormalMixture
Generalized Hyperbolic (GH) marginal distribution.
The Generalized Hyperbolic distribution is a normal variance-mean mixture where the mixing distribution is Generalized Inverse Gaussian (GIG). It is the most general form of the normal variance-mean mixture family.
The marginal distribution \(f(x)\) is NOT an exponential family. The joint distribution \(f(x, y)\) IS an exponential family and can be accessed via the
.joint property.
- Parameters:
None. Use factory methods to create instances.
- _joint
The underlying joint distribution.
- Type:
JointGeneralizedHyperbolic
Examples
>>> # 1D case
>>> gh = GeneralizedHyperbolic.from_classical_params(
...     mu=np.array([0.0]),
...     gamma=np.array([0.5]),
...     sigma=np.array([[1.0]]),
...     p=1.0,
...     a=1.0,
...     b=1.0
... )
>>> x = gh.rvs(size=1000, random_state=42)  # Samples X only
>>> # Access joint distribution
>>> gh.joint.pdf(x, y)  # Joint PDF f(x, y)
>>> gh.joint.natural_params  # Natural parameters (exponential family)
>>> # Fit with EM algorithm and regularization
>>> gh = GeneralizedHyperbolic().fit(
...     X, max_iter=100,
...     regularization='det_sigma_one'  # or 'sigma_diagonal_one', 'fix_p', 'none'
... )
See also
JointGeneralizedHyperbolic
Joint distribution (exponential family)
GeneralizedInverseGaussian
The mixing distribution for Y
VarianceGamma
Special case with Gamma mixing (b → 0)
NormalInverseGaussian
Special case with p = -1/2
NormalInverseGamma
Special case with a → 0
Notes
The GH distribution is widely used in finance for modeling asset returns. It includes many important distributions as special cases:
Variance Gamma (VG): \(b \to 0, p > 0\) (Gamma mixing)
Normal-Inverse Gaussian (NIG): \(p = -1/2\) (IG mixing)
Normal-Inverse Gamma (NInvG): \(a \to 0, p < 0\) (InvGamma mixing)
Hyperbolic: \(p = 1\)
Student-t: \(p = -\nu/2, a \to 0, b = \nu\) gives Student-t with ν d.f.
The model is not identifiable: for any \(c > 0\), the parameter sets \((\mu, \gamma/c, \Sigma/c, p, c \cdot b, a/c)\) and \((\mu, \gamma, \Sigma, p, b, a)\) give the same distribution. Use regularization to fix this.
Available regularization methods:
'det_sigma_one': Fix \(|\Sigma| = 1\) (recommended)
'sigma_diagonal_one': Fix \(\Sigma_{11} = 1\)
'fix_p': Fix \(p\) to a specific value (e.g., -0.5 for NIG)
'none': No regularization
- classmethod as_normal_inverse_gamma(mu: ArrayLike, gamma: ArrayLike, sigma: ArrayLike, shape: float, rate: float) GeneralizedHyperbolic[source]
Create a GH distribution equivalent to Normal-Inverse Gamma.
NInvG is GH with a → 0 (Inverse Gamma mixing).
- Parameters:
mu, gamma, sigma (array_like) – Normal parameters.
shape (float) – InverseGamma shape parameter α.
rate (float) – InverseGamma rate parameter β.
- Returns:
gh – GH distribution equivalent to NInvG.
- Return type:
GeneralizedHyperbolic
- classmethod as_normal_inverse_gaussian(mu: ArrayLike, gamma: ArrayLike, sigma: ArrayLike, delta: float, eta: float) GeneralizedHyperbolic[source]
Create a GH distribution equivalent to Normal-Inverse Gaussian.
NIG is GH with p = -1/2 (Inverse Gaussian mixing).
- Parameters:
mu, gamma, sigma (array_like) – Normal parameters.
delta (float) – IG mean parameter.
eta (float) – IG shape parameter.
- Returns:
gh – GH distribution equivalent to NIG.
- Return type:
GeneralizedHyperbolic
- classmethod as_variance_gamma(mu: ArrayLike, gamma: ArrayLike, sigma: ArrayLike, shape: float, rate: float) GeneralizedHyperbolic[source]
Create a GH distribution equivalent to Variance Gamma.
VG is GH with b → 0 (Gamma mixing).
- Parameters:
mu, gamma, sigma (array_like) – Normal parameters.
shape (float) – Gamma shape parameter α.
rate (float) – Gamma rate parameter β.
- Returns:
gh – GH distribution equivalent to VG.
- Return type:
GeneralizedHyperbolic
- cov() ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Covariance matrix of the marginal distribution.
\[\text{Cov}[X] = E[Y] \Sigma + \text{Var}[Y] \gamma\gamma^T\]
- Returns:
cov – Covariance matrix, shape (d, d).
- Return type:
ndarray
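A minimal sketch of this formula (illustrative; e_y and var_y stand for the mean and variance of the GIG mixing variable \(Y\)):

import numpy as np

def gh_cov(sigma, gamma, e_y, var_y):
    """Cov[X] = E[Y] * Sigma + Var[Y] * gamma gamma^T (sketch)."""
    return e_y * sigma + var_y * np.outer(gamma, gamma)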
- fit(X: ArrayLike, *, max_iter: int = 100, tol: float = 0.01, verbose: int = 0, regularization: str | Callable = 'det_sigma_one', regularization_params: Dict | None = None, random_state: int | None = None, fix_tail: bool = False) GeneralizedHyperbolic[source]
Fit distribution to marginal data X using the EM algorithm.
Since the marginal distribution is not an exponential family, we use the EM algorithm treating Y as latent. The joint distribution (X, Y) is an exponential family, which makes the EM algorithm tractable.
Convergence is checked based on the relative change in the normal parameters \((\mu, \gamma, \Sigma)\), which are more stable than the GIG parameters. Log-likelihood is optionally displayed per iteration when verbose >= 1.
- Parameters:
X (array_like) – Observed data, shape (n_samples, d) or (n_samples,) for d=1.
max_iter (int, optional) – Maximum number of EM iterations. Default is 100.
tol (float, optional) – Convergence tolerance for relative parameter change (based on \(\mu, \gamma, \Sigma\)). Default is 1e-2.
verbose (int, optional) – Verbosity level. 0 = silent, 1 = progress with llh, 2 = detailed with parameter norms. Default is 0.
regularization (str or callable, optional) –
Regularization method to use. Options:
'det_sigma_one': Fix |Σ| = 1 (recommended, default)
'sigma_diagonal_one': Fix Σ₁₁ = 1
'fix_p': Fix p to a specific value (use regularization_params)
'none': No regularization
Or provide a custom callable with signature:
f(mu, gamma, sigma, p, a, b, d) -> dict
regularization_params (dict, optional) – Additional parameters for the regularization method. For 'fix_p': {'p_fixed': -0.5} to fix p = -0.5 (NIG).
random_state (int, optional) – Random seed for initialization.
fix_tail (bool, optional) – If True, do not update GIG parameters (p, a, b) during fitting. Useful for fitting special cases like VG, NIG, NInvG. Default False.
- Returns:
self – Fitted distribution (returns self for method chaining).
- Return type:
GeneralizedHyperbolic
Notes
The EM algorithm alternates between:
E-step: Compute conditional expectations given current parameters:
\[E[Y^{-1}|X], \quad E[Y|X], \quad E[\log Y|X]\]
M-step: Update parameters using closed-form solutions for normal parameters and numerical optimization for GIG parameters.
After each M-step, regularization is applied to ensure identifiability. The recommended regularization is \(|\Sigma| = 1\), which doesn’t affect EM convergence (see em_algorithm.rst for proof).
The initial parameters are obtained by running a short NIG EM (5 iterations) to get a reasonable starting point for \((\mu, \gamma, \Sigma)\) and the GIG parameters.
Examples
>>> # Default regularization (|Σ| = 1)
>>> gh = GeneralizedHyperbolic().fit(X)
>>> # Fix p = -0.5 (NIG)
>>> gh = GeneralizedHyperbolic().fit(
...     X,
...     regularization='fix_p',
...     regularization_params={'p_fixed': -0.5}
... )
>>> # Custom regularization
>>> def my_regularize(mu, gamma, sigma, p, a, b, d):
...     return {'mu': mu, 'gamma': gamma, 'sigma': sigma,
...             'p': p, 'a': a, 'b': 1.0}  # Fix b = 1
>>> gh = GeneralizedHyperbolic().fit(X, regularization=my_regularize)
- normix.distributions.mixtures.generalized_hyperbolic.regularize_det_sigma_one(mu: ndarray[tuple[Any, ...], dtype[_ScalarT]], gamma: ndarray[tuple[Any, ...], dtype[_ScalarT]], sigma: ndarray[tuple[Any, ...], dtype[_ScalarT]], p: float, a: float, b: float, d: int, log_det_sigma: float | None = None) Dict[str, Any][source]
Regularize GH parameters by enforcing \(|\Sigma| = 1\).
This is the recommended regularization that doesn’t affect convergence of the EM algorithm. Following the legacy implementation:
\[(\mu, \gamma, \Sigma, p, a, b) \to (\mu, |\Sigma|^{-1/d} \gamma, |\Sigma|^{-1/d} \Sigma, p, |\Sigma|^{-1/d} a, |\Sigma|^{1/d} b)\]
- Parameters:
mu, gamma, sigma (ndarray) – Classical GH parameters.
p, a, b (float) – Classical GH parameters.
d (int) – Dimension of X.
log_det_sigma (float, optional) – Precomputed log determinant of Sigma. If provided, avoids recomputing the determinant.
- Returns:
params – Regularized parameters with |Σ| = 1.
- Return type:
dict
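A minimal sketch of this rescaling (illustrative, not the library code):

import numpy as np

def regularize_det_sigma_one_sketch(mu, gamma, sigma, p, a, b, d):
    """Rescale parameters so that |Sigma| = 1, following the mapping above."""
    c = np.exp(np.linalg.slogdet(sigma)[1] / d)  # c = |Sigma|^{1/d}
    return {'mu': mu, 'gamma': gamma / c, 'sigma': sigma / c,
            'p': p, 'a': a / c, 'b': c * b}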
- normix.distributions.mixtures.generalized_hyperbolic.regularize_fix_p(mu: ndarray[tuple[Any, ...], dtype[_ScalarT]], gamma: ndarray[tuple[Any, ...], dtype[_ScalarT]], sigma: ndarray[tuple[Any, ...], dtype[_ScalarT]], p: float, a: float, b: float, d: int, p_fixed: float = -0.5) Dict[str, Any][source]
Regularize by fixing the GIG parameter \(p\).
This converts GH to a special case:
\(p = -0.5\): Normal-Inverse Gaussian
\(p > 0\) with \(b \to 0\): Variance Gamma (approximately)
- Parameters:
mu, gamma, sigma (ndarray) – Classical GH parameters.
p, a, b (float) – Classical GH parameters.
d (int) – Dimension of X.
p_fixed (float, optional) – Fixed value for p. Default is -0.5 (NIG).
- Returns:
params – Regularized parameters with fixed p.
- Return type:
dict
- normix.distributions.mixtures.generalized_hyperbolic.regularize_none(mu: ndarray[tuple[Any, ...], dtype[_ScalarT]], gamma: ndarray[tuple[Any, ...], dtype[_ScalarT]], sigma: ndarray[tuple[Any, ...], dtype[_ScalarT]], p: float, a: float, b: float, d: int) Dict[str, Any][source]
No regularization (identity function).
- Parameters:
mu, gamma, sigma (ndarray) – Classical GH parameters.
p, a, b (float) – Classical GH parameters.
d (int) – Dimension of X.
- Returns:
params – Unchanged parameters.
- Return type:
dict
- normix.distributions.mixtures.generalized_hyperbolic.regularize_sigma_diagonal_one(mu: ndarray[tuple[Any, ...], dtype[_ScalarT]], gamma: ndarray[tuple[Any, ...], dtype[_ScalarT]], sigma: ndarray[tuple[Any, ...], dtype[_ScalarT]], p: float, a: float, b: float, d: int) Dict[str, Any][source]
Regularize by enforcing the first diagonal element of \(\Sigma\) to be 1.
\[\Sigma_{11} = 1\]
This is an alternative regularization useful in some applications.
- Parameters:
mu, gamma, sigma (ndarray) – Classical GH parameters.
p, a, b (float) – Classical GH parameters.
d (int) – Dimension of X.
- Returns:
params – Regularized parameters with Σ₁₁ = 1.
- Return type:
dict
Utilities
Bessel function utilities for numerical stability.
This module provides numerically stable implementations of log Bessel functions that are essential for the Generalized Inverse Gaussian distribution.
The modified Bessel function of the second kind \(K_\nu(z)\) appears in the normalizing constant of the GIG distribution:
\[f(y \mid p, a, b) = \frac{(a/b)^{p/2}}{2 K_p(\sqrt{ab})}\, y^{p-1} \exp\left(-\frac{1}{2}\left(a y + \frac{b}{y}\right)\right), \quad y > 0\]
For numerical stability, we work with \(\log K_\nu(z)\) instead of \(K_\nu(z)\) directly.
- normix.utils.bessel.kv_ratio(v1: float, v2: float, z: float | ndarray[tuple[Any, ...], dtype[_ScalarT]]) float | ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Compute the ratio K_{v1}(z) / K_{v2}(z) in a numerically stable way.
Uses log space to avoid overflow/underflow:
K_{v1}(z) / K_{v2}(z) = exp(log K_{v1}(z) - log K_{v2}(z))
- Parameters:
v1 (float) – Order of numerator Bessel function.
v2 (float) – Order of denominator Bessel function.
z (float or ndarray) – Argument (must be > 0).
- Returns:
ratio – The ratio K_{v1}(z) / K_{v2}(z).
- Return type:
float or ndarray
- normix.utils.bessel.log_kv(v: float, z: float | ndarray[tuple[Any, ...], dtype[_ScalarT]]) float | ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Log modified Bessel function of the second kind: log K_v(z).
Computes \(\log K_\nu(z)\) in a numerically stable way.
This is an optimized version where \(\nu\) is a scalar and \(z\) can be vectorized. For the fully vectorized version (both \(\nu\) and \(z\) arrays), use log_kv_vectorized().
Uses the exponentially scaled function \(K_\nu^e(z) = K_\nu(z) e^z\) for numerical stability when \(z\) is large:
\[\log K_\nu(z) = \log K_\nu^e(z) - z\]
When the result is inf (overflow for small \(z\)), uses asymptotic approximations:
If \(|\nu| > \varepsilon\): \(\log K_\nu(z) \approx \log\Gamma(|\nu|) - \log 2 + |\nu|(\log 2 - \log z)\)
If \(|\nu| \approx 0\): \(\log K_0(z) \approx \log(-\log(z/2) - \gamma)\) where \(\gamma\) is Euler’s constant
- Parameters:
v (float) – Order of Bessel function \(\nu\) (scalar).
z (float or ndarray) – Argument (must be > 0), can be vectorized.
- Returns:
log_kv – \(\log K_\nu(z)\).
- Return type:
float or ndarray
Examples
>>> log_kv(1.0, 2.0)
-1.9670713500717493
>>> log_kv(1.5, np.array([0.5, 1.0, 2.0]))
array([ 0.68024015, -0.38318674, -1.71531713])
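The core of the computation can be sketched with scipy.special.kve, which returns \(K_\nu(z) e^z\); the small-\(z\) asymptotic fallback described above is omitted here (illustrative, not the library code):

import numpy as np
from scipy.special import kve

def log_kv_sketch(v, z):
    """log K_v(z) via the exponentially scaled Bessel function (sketch)."""
    return np.log(kve(v, z)) - z  # log(K_v(z) * e^z) - z = log K_v(z)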
- normix.utils.bessel.log_kv_derivative_z(v: float, z: float | ndarray[tuple[Any, ...], dtype[_ScalarT]]) float | ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Derivative of log(K_v(z)) with respect to z.
Using the recurrence relation:
K'_v(z) = -(K_{v-1}(z) + K_{v+1}(z)) / 2
we have:
d/dz log K_v(z) = K'_v(z) / K_v(z)
= -(K_{v-1}(z) + K_{v+1}(z)) / (2 K_v(z))
= -1/2 (exp(log K_{v-1}(z) - log K_v(z)) + exp(log K_{v+1}(z) - log K_v(z)))
- Parameters:
v (float) – Order of Bessel function (scalar).
z (float or ndarray) – Argument (must be > 0), can be vectorized.
- Returns:
deriv – Derivative of log K_v(z) with respect to z.
- Return type:
float or ndarray
Examples
>>> log_kv_derivative_z(1.0, 2.0)
-1.3143079478654318
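A minimal sketch of this formula evaluated in log space (illustrative, reusing the kve-based computation from above):

import numpy as np
from scipy.special import kve

def log_kv_derivative_z_sketch(v, z):
    """d/dz log K_v(z) via the Bessel recurrence, evaluated in log space."""
    log_kv = lambda nu: np.log(kve(nu, z)) - z
    r_minus = np.exp(log_kv(v - 1) - log_kv(v))  # K_{v-1}(z) / K_v(z)
    r_plus = np.exp(log_kv(v + 1) - log_kv(v))   # K_{v+1}(z) / K_v(z)
    return -0.5 * (r_minus + r_plus)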
- normix.utils.bessel.log_kv_vectorized(v: float | ndarray[tuple[Any, ...], dtype[_ScalarT]], z: float | ndarray[tuple[Any, ...], dtype[_ScalarT]]) float | ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Log modified Bessel function of the second kind: log(K_v(z)).
Fully vectorized version where both v and z can be arrays. This is slower than log_kv when v is a scalar.
- Parameters:
v (float or ndarray) – Order of Bessel function.
z (float or ndarray) – Argument (must be > 0).
- Returns:
log_kv – Log of the modified Bessel function of the second kind.
- Return type:
float or ndarray