This module provides the classes GaussianModelBase, DummyModel, and GaussianMixtureModel.
Bases: onyx.am.gaussian.GaussianModelBase
DummyModel - a model that always returns a constant score
>>> dm = DummyModel(3)
>>> dm2 = DummyModel(3)
>>> dm == dm2
True
>>> x = np.array([0, 0, 0])
>>> dm.score(x)
0.0
>>> dm.set_value(3.141)
>>> dm == dm2
False
>>> dm.score(x)
3.141
>>> dm.score_components(x)
array([ 3.141])
Return a numpy vector of weight * likelihood products for datapoint x.
Use a seed other than None for reproducible randomness in the sample() function.
Bases: onyx.am.gaussian.GaussianModelBase
Gaussian mixture models
>>> m0 = GaussianMixtureModel(3, GaussianModelBase.DIAGONAL_COVARIANCE, 2)
>>> m1 = GaussianMixtureModel(3, GaussianModelBase.DIAGONAL_COVARIANCE, 2)
>>> m0.set_weights(np.array((0.75, 0.25)))
>>> m1.set_weights(np.array((0.75, 0.25)))
>>> mu = np.array(((1, 1, 1), (3, 3, 3)))
>>> m0.set_means(mu)
>>> m1.set_means(mu)
>>> v = np.array(((1, 1, 1), (1, 1, 1)))
>>> m0.set_vars(v)
>>> m1.set_vars(v)
>>> print m0
Gmm: (Type = diagonal, Dim = 3, NumComps = 2)
Weights Models
0.7500 Means: 1.0000 1.0000 1.0000 Vars: 1.0000 1.0000 1.0000
0.2500 Means: 3.0000 3.0000 3.0000 Vars: 1.0000 1.0000 1.0000
>>> print m1
Gmm: (Type = diagonal, Dim = 3, NumComps = 2)
Weights Models
0.7500 Means: 1.0000 1.0000 1.0000 Vars: 1.0000 1.0000 1.0000
0.2500 Means: 3.0000 3.0000 3.0000 Vars: 1.0000 1.0000 1.0000
>>> m0 == m1
True
>>> m2 = m0.copy()
>>> m0 == m2
True
>>> s0 = m0.score([0,0,0])
>>> float_to_readable_string(s0)
'+(-0007)0x5c2d69462ba21'
>>> s1 = m0.score([1,1,1])
>>> float_to_readable_string(s1)
'+(-0005)0x866d5e87388e3'
>>> s2 = m0.score([2,2,2])
>>> float_to_readable_string(s2)
'+(-0007)0xd03c4e0dff270'
>>> comp_scores = m0.get_likelihoods([2,2,2])
>>> [float_to_readable_string(x) for x in comp_scores]
['+(-0007)0xd03c4e0dff270', '+(-0007)0xd03c4e0dff270']
>>> weighted_scores = comp_scores * m0.weights
>>> [float_to_readable_string(x) for x in weighted_scores]
['+(-0007)0x5c2d3a8a7f5d4', '+(-0009)0xd03c4e0dff270']
>>> comps = m0.score_components([2,2,2])
>>> [float_to_readable_string(x) for x in comps]
['+(-0007)0x5c2d3a8a7f5d4', '+(-0009)0xd03c4e0dff270']
>>> comps.sum() == s2
True
>>> seq = np.array(([0,0,0],[1,1,1],[2,2,2],[3,3,3]))
>>> seq.shape
(4, 3)
>>> seq = np.rollaxis(seq, 1)
>>> seq.shape
(3, 4)
>>> comp_scores2 = m0.get_likelihoods_for_sequence(seq)
>>> comp_scores2.shape
(2, 4)
>>> (comp_scores2[:,2] == comp_scores).all()
True
>>> comp_scores3 = m0.score_components_for_sequence(seq)
>>> comp_scores3.shape
(2, 4)
>>> (comp_scores3[:,2] == comps).all()
True
>>> s3 = m0.get_log_score_for_sequence(np.array(([0,0,0],[1,1,1])).transpose())
>>> float_to_readable_string(s3)
'-(+0002)0xe5a488d2645a4'
>>> s3 == np.log(s0) + np.log(s1)
True
>>> s3 = m0.get_log_score_for_sequence(np.array(([0,0,0],[100,100,100])).transpose())
>>> float_to_readable_string(s3)
'-(+0009)0x5c92619f2f2f1'
Here’s an example with priming of the means and variances. A 1-component GaussianMixtureModel is provided as the primer; its mean and variance are used to initialize every component.
>>> mu_primer = np.array((10, 20, 1000))
>>> var_primer = np.array((1, 22, 10000))
>>> GaussianModelBase.seed(0)
>>> primer = GaussianMixtureModel(3, GaussianModelBase.DIAGONAL_COVARIANCE, 1)
>>> primer.set_model(mu_primer, var_primer)
>>> m3 = GaussianMixtureModel(3, GaussianModelBase.DIAGONAL_COVARIANCE, 2)
>>> m3.init_models(primer)
>>> print m3
Gmm: (Type = diagonal, Dim = 3, NumComps = 2)
Weights Models
0.5000 Means: 9.9081 20.0762 1034.9414 Vars: 1.0000 22.0000 10000.0000
0.5000 Means: 9.9518 23.3150 923.3682 Vars: 1.0000 22.0000 10000.0000
>>> m4 = m3.copy()
>>> print m4
Gmm: (Type = diagonal, Dim = 3, NumComps = 2)
Weights Models
0.5000 Means: 9.9081 20.0762 1034.9414 Vars: 1.0000 22.0000 10000.0000
0.5000 Means: 9.9518 23.3150 923.3682 Vars: 1.0000 22.0000 10000.0000
Test that invalid mixture weights are rejected.
>>> bad_weights1 = np.array([0.2, -0.8])
>>> m4.set_weights(bad_weights1)
Traceback (most recent call last):
...
ValueError: Bad argument to set_weights: expected all non-negative values, but got [ 0.2 -0.8]
>>> bad_weights2 = np.array([0.0, 0.0])
>>> m4.set_weights(bad_weights2)
Traceback (most recent call last):
...
ValueError: Bad argument to set_weights: expected positive weight sum, but got 0.0
Test that, under interpolation, linearly scaled weights behave the same as the unscaled weights.
>>> m5 = GaussianMixtureModel(3, GaussianModelBase.FULL_COVARIANCE, 2)
>>> m6 = GaussianMixtureModel(3, GaussianModelBase.FULL_COVARIANCE, 2)
>>> weights1 = np.array([0.25, 0.75])
>>> weights2 = np.array([0.75, 0.25])
>>> scaled1 = weights1 * 10
>>> scaled2 = weights2 * 10
>>> m5.set_weights(weights1)
>>> m5.set_weights(weights2, 0.5)
>>> m5.weights
array([ 0.5, 0.5])
>>> m6.set_weights(scaled1)
>>> m6.set_weights(scaled2, 0.5)
>>> (m5.weights == m6.weights).all()
True
>>> (np.array(m6.covar_range) == np.array((2.0E-20, 2.0E+20))).all()
True
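The scaling test above suggests that set_weights normalizes its argument, which is why a weight vector and any positive multiple of it behave identically. A minimal sketch of that normalization and of 0.5-interpolation (an assumption about the internal behaviour, not the library's actual code):

```python
import numpy as np

def normalized(w):
    # Sketch of the normalization assumed to happen inside set_weights:
    # reject negative entries and zero sums, then divide by the total.
    w = np.asarray(w, dtype=float)
    if (w < 0).any():
        raise ValueError("expected all non-negative values")
    total = w.sum()
    if total <= 0:
        raise ValueError("expected positive weight sum")
    return w / total

# A weight vector and a scaled copy normalize to the same thing.
w1 = normalized(np.array([0.25, 0.75]))
w2 = normalized(np.array([0.25, 0.75]) * 10)

# Interpolating halfway between (0.25, 0.75) and (0.75, 0.25)
# gives (0.5, 0.5), matching the doctest above.
w_interp = 0.5 * normalized(np.array([0.25, 0.75])) + 0.5 * normalized(np.array([0.75, 0.25]))
```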
Return a deep copy of this model.
Return an array of likelihoods, one for each component, for datapoint x.
seq should be a Numpy array of datapoints with shape (dim, N), where N is the length of the sequence. Return a 2-d array of likelihoods, one for each component in the model and each datapoint in seq.
Get the log likelihood for the points in a data iterable. The model assumption is that the points can be treated independently, so it is sufficient to sum their log-likelihoods.
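The independence assumption can be sketched directly: the log score of a sequence is just the sum of the per-point log scores. The values below are illustrative, not tied to any particular model:

```python
import numpy as np

# Hypothetical per-point log-likelihoods for a 4-point sequence.
# Independence means log p(x_1, ..., x_N) = sum_n log p(x_n).
log_likelihoods = np.array([-2.3, -1.7, -4.1, -0.9])
total_log_score = log_likelihoods.sum()
```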
Randomly sample from a GMM. A component is chosen with p(i) = w_i, then sampled.
>>> gmm = GaussianMixtureModel(2, GaussianModelBase.DIAGONAL_COVARIANCE, 2)
>>> gmm.set_means(np.array([[1.0, -1.0], [-1.0, 1.0]]))
>>> gmm.set_vars(np.array([[1.0, 1.0], [1.0, 1.0]]))
>>> gmm.set_weights(np.array([0.6, 0.4]))
>>> gmm.seed(0)
>>> gmm.sample()
array([-0.23626814, 0.15374746])
>>> gmm.sample()
array([ 1.69882769, -1.09636785])
>>> gmm.sample()
array([-0.98880416, 2.14990452])
>>> gmm = GaussianMixtureModel(2, GaussianModelBase.FULL_COVARIANCE, 2)
>>> gmm.set_means(np.array([[1.0, -1.0], [-1.0, 1.0]]))
>>> gmm.set_vars(np.array([[[1.0, 0.3], [0.3, 1.0]], [[1.0, -0.1], [-0.1, 1.0]]]))
>>> gmm.set_weights(np.array([0.6, 0.4]))
>>> gmm.seed(0)
>>> gmm.sample()
array([-0.23626814, 0.08161617])
>>> gmm.sample()
array([ 1.69882769, -0.88228076])
>>> gmm.sample()
array([-0.98880416, 2.14302097])
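The two-step sampling procedure described above (pick component i with probability w_i, then draw from that component's Gaussian) can be sketched as follows; the helper name and the use of numpy's Generator are illustrative, not the library's implementation:

```python
import numpy as np

def gmm_sample(weights, means, covars, rng):
    # Step 1: choose a component index with p(i) = w_i.
    i = rng.choice(len(weights), p=weights)
    # Step 2: draw from that component's Gaussian.
    return rng.multivariate_normal(means[i], covars[i])

rng = np.random.default_rng(0)
weights = np.array([0.6, 0.4])
means = np.array([[1.0, -1.0], [-1.0, 1.0]])
covars = np.array([[[1.0, 0.3], [0.3, 1.0]],
                   [[1.0, -0.1], [-0.1, 1.0]]])
x = gmm_sample(weights, means, covars, rng)
```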
Score the GMM for datapoint x. See also score_components() and get_likelihoods().
Return a numpy vector of weight * likelihood products for datapoint x.
seq should be a Numpy array of datapoints with shape (dim, N), where N is the length of the sequence. Return a 2-d array of weight * likelihood products, one for each component in the model and each datapoint in seq.
Score the GMM for a sequence of datapoints. seq should be a Numpy array of datapoints with shape (dim, N), where N is the length of the sequence. Return a 1-d array of total scores, one for each datapoint in seq. See also score(), score_components_for_sequence(), and get_likelihoods_for_sequence().
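For the diagonal-covariance case, the relationship among these three scoring methods can be sketched explicitly: get_likelihoods() gives per-component densities, score_components() multiplies them by the weights, and score() sums the products. The helper below writes out the diagonal Gaussian density; it is a sketch of that relationship, not the library's internals:

```python
import numpy as np

def diag_gaussian_likelihoods(x, means, variances):
    # Per-component diagonal-covariance Gaussian density at x,
    # computed in log space for stability and then exponentiated.
    diff = x - means                                        # (num_components, dim)
    log_norm = -0.5 * np.log(2 * np.pi * variances).sum(axis=1)
    log_exp = -0.5 * (diff ** 2 / variances).sum(axis=1)
    return np.exp(log_norm + log_exp)

# Same parameters as the m0 doctest above.
weights = np.array([0.75, 0.25])
means = np.array([[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]])
variances = np.ones((2, 3))
x = np.array([2.0, 2.0, 2.0])

likelihoods = diag_gaussian_likelihoods(x, means, variances)  # cf. get_likelihoods()
components = weights * likelihoods                            # cf. score_components()
score = components.sum()                                      # cf. score()
```

Since x = (2, 2, 2) is equidistant from both means, the two raw likelihoods coincide, matching the doctest above where both entries of get_likelihoods([2,2,2]) are equal.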
Use a seed other than None for reproducible randomness in the sample() function.
m is an np array of mean vectors with shape (num_components, dimension). If this object has only one component, m can be any reasonable point.
Set relevances for adaptation - see adapt()
values must be a tuple of three non-negative numbers: the first is used for weights, the second for means, and the third for variances.
v is an np array of (co)variance matrices with shape (num_components, dimension) or (num_components, dimension, dimension). If this object has only one component and diagonal covariance, v can be any reasonable point. If this object has only one component and full covariance, v can be a single matrix with shape (dimension, dimension).
Bases: object
>>> m = GaussianModelBase(4, GaussianModelBase.DUMMY_COVARIANCE)
>>> try:
... m.dimension = 5
... except AttributeError:
... print "OK, dimension not settable"
... else:
... print "Problem! dimension was settable"
OK, dimension not settable
>>> try:
... m.covariance_type = GaussianModelBase.FULL_COVARIANCE
... except AttributeError:
... print "OK, covariance_type not settable"
... else:
... print "Problem! covariance_type was settable"
OK, covariance_type not settable
Use a seed other than None for reproducible randomness in the sample() function.
Bases: object
Internal class used for accumulation for GMM training - you shouldn’t be making these yourself.
>>> g0 = GmmAccumSet(10, 33, GaussianModelBase.DIAGONAL_COVARIANCE)
Compute weights, means, and vars from accum set.
Merge another accum set into this one.
Make a GMM with the given dimension and number of components. The means of the components will be set, in all dimensions, at (mean_base * i) for the (1-based) i’th component.
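The mean layout described above (mean_base * i in every dimension for the 1-based i'th component) can be sketched as follows; the helper name is illustrative, not part of the library:

```python
import numpy as np

def make_means(dimension, num_components, mean_base):
    # 1-based component indices: component i gets mean_base * i
    # replicated across all dimensions.
    i = np.arange(1, num_components + 1)
    return np.tile((mean_base * i)[:, np.newaxis], (1, dimension))

# With dimension=3, num_components=2, mean_base=10.0 the means are
# [[10, 10, 10], [20, 20, 20]].
means = make_means(3, 2, 10.0)
```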