onyx.am.classifier – Classifiers using GaussianMixtureModels

class onyx.am.classifier.AdaptingClassProcessor(classifier, sendee=None, sending=True, bypass_types=())

Bases: onyx.dataflow.streamprocess.ProcessorBase

Process labelled data events and adapt a classifier with them.

classifier may be either an AdaptingGmmClassifier or an
AdaptingSimpleClassifier.

Events to be processed should be pairs of the form (label, data), where data is a Numpy array with shape features x frames. The label may be None, in which case no adaptation will be performed. This processor sends events of the form (label, ((s0, l0), (s1, l1), ...)), where label is the label of the incoming event and the (s, l) pairs are the scores and labels of the various classes in order of decreasing score (so l0 is the most likely label according to the classifier). One such event is generated for each data point (frame) in the incoming event, so this processor can generate many outgoing events for each incoming event.
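The event shapes described above can be sketched in plain Python 3 without onyx. This is only an illustration: emit_events, toy_score and centers are hypothetical names, and the scoring rule (negative squared distance to a class center) is a stand-in for whatever the real classifier computes.

```python
def emit_events(label, data, score_fn, class_labels):
    """Yield one (label, ranked) event per frame of a features-by-frames array."""
    num_features = len(data)
    num_frames = len(data[0]) if num_features else 0
    for t in range(num_frames):
        # Column t of the features-by-frames array is one data point.
        frame = [data[f][t] for f in range(num_features)]
        # (score, label) pairs in order of decreasing score.
        ranked = sorted(((score_fn(c, frame), c) for c in class_labels),
                        reverse=True)
        yield (label, tuple(ranked))


# Hypothetical class centers and score function, for illustration only.
centers = {'A': (0.0, 0.0), 'B': (1.0, 1.0)}


def toy_score(c, frame):
    return -sum((x - m) ** 2 for x, m in zip(frame, centers[c]))


data = [[0.0, 1.0, 0.9],   # feature 0 across three frames
        [0.1, 1.0, 0.2]]   # feature 1 across three frames
events = list(emit_events('A', data, toy_score, ('A', 'B')))
```

One incoming event with three frames produces three outgoing events, each carrying the original label and the full ranked list of classes.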

dc

A debug context for this processor. This attribute is an object returned by dcheck() in the onyx.util.debugprint module, and may be used for debug output. The tag for turning on such output is available as debug_tag.

debug_tag

Activate this tag to get debugging information; see onyx.util.debugprint.DebugPrint.

graph

Return a graph for this processor. By default this is just a single node whose label is the label of the processor; derived classes may wish to override this property.

label

Return a label for this processor. By default this is just the name of the class; derived classes may wish to override this property by providing a different label to __init__().

process(event)

Process labelled data events.

event is a labelled-data pair (label, data), where data is a 2-d Numpy array with features as the first dimension and frames as the last.

send(result)

Internal function that pushes result into the sendee. Implementations of process() must call this to push results. To set up the sendee (the target of the push), clients of the processor must either initialize the object with a sendee or call set_sendee(). Processors created with a sendee of False will never send, but will not raise an error if send() is called.
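The send/sendee contract can be sketched with a small stand-in class in plain Python 3. SketchProcessor is hypothetical and is not the onyx ProcessorBase; in particular, raising ValueError when no sendee has been set is an assumption made for this sketch.

```python
class SketchProcessor(object):
    """Hypothetical stand-in illustrating the sendee pattern."""

    def __init__(self, sendee=None, sending=True):
        self._sendee = sendee
        self._sending = sending

    def set_sendee(self, sendee):
        self._sendee = sendee

    def set_sending(self, sending):
        self._sending = sending

    def send(self, result):
        # A sendee of False means "never send, never complain".
        if self._sendee is False or not self._sending:
            return
        if self._sendee is None:
            # Assumed behaviour for this sketch only.
            raise ValueError("no sendee has been set")
        self._sendee(result)

    def process(self, event):
        # A process() implementation pushes its results through send().
        self.send(event * 2)


received = []
p = SketchProcessor(sendee=received.append)
p.process(3)            # delivered to received
p.set_sending(False)
p.process(4)            # dropped: sending is off
quiet = SketchProcessor(sendee=False)
quiet.process(5)        # dropped silently: sendee is False
```

Any callable that accepts one argument can serve as the sendee, which is what lets processors be chained by passing the next processor's process method.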

sendee

The callable this processor will use to send events; see set_sendee()

sending

Whether this processor will currently send events at all; see set_sending()

set_sendee(sendee)

Clients call this to set up the callable where the processor will send its results.

set_sending(sending)

Clients call this to turn sending from a processor on or off.

static std_process_prologue(process_function)

Subclasses may use this decorator on their process function to implement the usual bypass and process semantics and to set up the debug context available as dc.

class onyx.am.classifier.AdaptingGmmClassifier(gmm_mgr, label_index_pairs)

Bases: onyx.am.classifier.ClassifierBase

A classifier based on gaussian.GaussianMixtureModels

>>> dimension = 2
>>> covar_type = gaussian.GaussianModelBase.DIAGONAL_COVARIANCE
>>> labels = ('A', 'B', 'C')
>>> ncomps = (3, 2, 4)
>>> gaussian.GaussianMixtureModel.seed(0)
>>> init_dict = dict(dimension=dimension, covar_type=covar_type, num_comps=ncomps)
>>> gmm_mgr0 = modelmgr.GmmMgr(init_dict)
>>> c0 = AdaptingGmmClassifier(gmm_mgr0, izip(labels, count()))
>>> print c0
AdaptingGmmClassifier (num_classes = 3, dimension = 2)
   Labels/Models:
Label: A  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 3)
Weights   Models
 0.3333       Means: -0.1839 0.0325    Vars: 0.7000 0.7000
 0.3333       Means: 0.6988 -0.0964    Vars: 0.7000 0.7000
 0.3333       Means: 1.4135 -1.5326    Vars: 0.7000 0.7000
Label: B  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 2)
Weights   Models
 0.5000       Means: 0.2709 -1.2055    Vars: 0.7000 0.7000
 0.5000       Means: -0.0531 -0.2902    Vars: 0.7000 0.7000
Label: C  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 4)
Weights   Models
 0.2500       Means: -0.2928 -2.1074    Vars: 0.7000 0.7000
 0.2500       Means: 0.0847 0.6270    Vars: 0.7000 0.7000
 0.2500       Means: 1.6793 0.8341    Vars: 0.7000 0.7000
 0.2500       Means: -2.3151 -1.2254    Vars: 0.7000 0.7000

Here’s an example with priming. The priming value is an iterable with as many elements as there are labels. Each element should be a gaussian.GaussianMixtureModel. The model for each class will be initialized from the corresponding priming model; see gaussian.GaussianMixtureModel for details on how this is done.

>>> means = np.array((1, 5))
>>> vars = np.array((5, 5))
>>> priming = tuple([gaussian.GaussianMixtureModel(dimension, covar_type, 1) for _ in xrange(3)])
>>> priming[0].set_model(means, vars)
>>> priming[1].set_model(means*10, vars*2)
>>> priming[2].set_model(means*2, vars*0.5)
>>> print priming[2]
Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: 2.0000 10.0000    Vars: 2.5000 2.5000
>>> init_dict = dict(dimension=dimension, covar_type=covar_type, num_comps=ncomps, priming=priming)
>>> gmm_mgr0 = modelmgr.GmmMgr(init_dict)
>>> c0 = AdaptingGmmClassifier(gmm_mgr0, izip(labels, count()))
>>> print c0
AdaptingGmmClassifier (num_classes = 3, dimension = 2)
   Labels/Models:
Label: A  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 3)
Weights   Models
 0.3333       Means: 2.0534 3.8165    Vars: 5.0000 5.0000
 0.3333       Means: 1.2268 3.3290    Vars: 5.0000 5.0000
 0.3333       Means: 1.8754 6.3120    Vars: 5.0000 5.0000
Label: B  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 2)
Weights   Models
 0.5000       Means: 10.4282 50.4246    Vars: 10.0000 10.0000
 0.5000       Means: 10.2892 48.9640    Vars: 10.0000 10.0000
Label: C  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 4)
Weights   Models
 0.2500       Means: 2.4455 9.9650    Vars: 2.5000 2.5000
 0.2500       Means: 2.0905 10.3844    Vars: 2.5000 2.5000
 0.2500       Means: 3.3061 11.4513    Vars: 2.5000 2.5000
 0.2500       Means: 1.8027 10.4335    Vars: 2.5000 2.5000
>>> c0.set_num_em_iterations(5)
>>> f = StringIO()
>>> write_gmm_classifier(c0, f)
>>> #print f.getvalue()
>>> f.seek(0)
>>> c1 = read_gmm_classifier(f)
>>> str(c1) == str(c0)
True

adapt_all_classes(labeled_data)

Adapt the models for several classes on given data.

labeled_data is a sequence of (label, point) pairs. All the data for a given label will be agglomerated before adapting.
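The agglomeration step can be pictured as a group-by on the label. The following plain Python 3 sketch uses a hypothetical helper, group_by_label, and does not show how onyx implements it internally.

```python
from collections import OrderedDict


def group_by_label(labeled_data):
    """Collect all points that share a label, keeping first-seen label order."""
    grouped = OrderedDict()
    for label, point in labeled_data:
        grouped.setdefault(label, []).append(point)
    return grouped


labeled_data = [('A', (0.0, 0.0)), ('B', (1.0, 1.0)), ('A', (1.0, 1.0))]
grouped = group_by_label(labeled_data)
# Each label would then be adapted once on all of its points, for example
# via adapt_one_class(label, points).
```

Grouping first means each class model is adapted once on its full set of points rather than once per point.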

adapt_one_class(label, points)

Adapt one class on a set of data points.

Arguments are a single label and an iterable of points.

classify_one(datum)

Classify a single point.

Returns a tuple of (score, label) items sorted by score.

dimension
get_model(label)
labels
num_classes
set_ll_change_threshold(change_threshold)
set_num_em_iterations(num_iters)
set_relevance(relevance)

Set the relevances for all models in this classifier

Note that we use just one number for all models and for all parameters (i.e. for weights, means, and vars).

class onyx.am.classifier.AdaptingSimpleClassifier(labels, dimension)

Bases: onyx.am.classifier.ClassifierBase

A classifier based on 1-component gaussian.GaussianMixtureModels

>>> labels = ('A', 'B', 'C')
>>> c = AdaptingSimpleClassifier(labels, 2)
>>> print c
AdaptingSimpleClassifier (num_classes = 3, dimension = 2)
   Labels/Models:
Label: A  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: Not set    Vars: Not set
Label: B  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: Not set    Vars: Not set
Label: C  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: Not set    Vars: Not set
>>> p0 = np.zeros(2)
>>> p1 = np.ones(2)
>>> p2 = p1 * 0.1
>>> data = (('A', p0), ('B', p1), ('A', p1), ('B', 2*p1), ('C', 4*p1), ('C', 5*p1))
>>> c.adapt_all_classes(data)
>>> print c
AdaptingSimpleClassifier (num_classes = 3, dimension = 2)
   Labels/Models:
Label: A  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: 0.5000 0.5000    Vars: 0.2500 0.2500
Label: B  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: 1.5000 1.5000    Vars: 0.2500 0.2500
Label: C  Model:Gmm: (Type = diagonal, Dim = 2, NumComps = 1)
Weights   Models
 1.0000       Means: 4.5000 4.5000    Vars: 0.2500 0.2500
>>> result = c.classify_one(p0)
>>> print tuple(((label, floatutils.float_to_readable_string(score)) for score, label in result))
(('A', '+(-0003)0xdfa3e572aa124'), ('B', '+(-0014)0x4986a82011d6e'), ('C', '+(-0118)0x6796d08b3cfa2'))

adapt_all_classes(labeled_data)

Adapt the models for several classes on given data.

labeled_data is a sequence of (label, point) pairs. All the data for a given label will be agglomerated before adapting.
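The means and variances printed in the doctest above are consistent with a per-label maximum-likelihood estimate (sample mean and population variance). The plain Python 3 sketch below reproduces those numbers under that assumption; ml_diagonal_estimate is a hypothetical helper, and the sketch ignores relevance settings entirely.

```python
def ml_diagonal_estimate(points):
    """Per-dimension mean and population variance of a list of points."""
    n = float(len(points))
    dim = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dim)]
    variances = [sum((p[d] - means[d]) ** 2 for p in points) / n
                 for d in range(dim)]
    return means, variances


# Same labelled points as in the doctest, written as plain tuples.
data = (('A', (0, 0)), ('B', (1, 1)), ('A', (1, 1)),
        ('B', (2, 2)), ('C', (4, 4)), ('C', (5, 5)))

by_label = {}
for label, point in data:
    by_label.setdefault(label, []).append(point)

estimates = {label: ml_diagonal_estimate(pts)
             for label, pts in by_label.items()}
```

For label A the two points (0, 0) and (1, 1) give a mean of 0.5 and a variance of 0.25 in each dimension, matching the printed model.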

adapt_one_class(label, data)

Adapt the model for one class on given data.

label is a single label; data is a sequence of data points.

classify_one(datum)

Classify a single point.

Returns a tuple of (score, label) items sorted by score.
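The scores in the doctest above appear to be diagonal-covariance Gaussian densities rather than log-likelihoods. The plain Python 3 sketch below evaluates that density for the adapted models and ranks the classes. Two things are assumptions here: the helper diag_gaussian_density is hypothetical, and the readable string '+(-0003)0xdfa3e572aa124' is interpreted as the float 0x1.dfa3e572aa124p-3.

```python
import math


def diag_gaussian_density(point, means, variances):
    """Density of a Gaussian with diagonal covariance at a single point."""
    density = 1.0
    for x, m, v in zip(point, means, variances):
        density *= math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
    return density


# Means and variances as printed for the adapted classifier in the doctest.
models = {
    'A': ((0.5, 0.5), (0.25, 0.25)),
    'B': ((1.5, 1.5), (0.25, 0.25)),
    'C': ((4.5, 4.5), (0.25, 0.25)),
}

p0 = (0.0, 0.0)
# (score, label) pairs sorted by decreasing score, as classify_one returns.
ranked = tuple(sorted(((diag_gaussian_density(p0, m, v), label)
                       for label, (m, v) in models.items()),
                      reverse=True))

# Assumed decoding of the doctest's readable float string for class A.
doc_score_a = float.fromhex('0x1.dfa3e572aa124p-3')
```

Under these assumptions the ranking is A, B, C, and the density computed for A agrees with the decoded doctest value to within rounding.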

dimension
get_distance_dict()
get_max_distance()

Get maximum distance between any two distinct models

get_mean_distance()

Get mean distance between distinct models

get_min_distance()

Get minimum distance between any two distinct models

get_model(label)
labels
num_classes
set_relevance(relevance)

Set the relevances for all models in this classifier

Note that we use just one number for all models and for both means and vars.

class onyx.am.classifier.ClassifierBase(labels, dimension)

Bases: object

Base class for classifiers

classify_one(datum)

Classify a single point.

Returns a tuple of (score, label) items sorted by score.

dimension
labels
num_classes
class onyx.am.classifier.HmmClassifier(hmm_mgr, label_index_pairs)

Bases: onyx.am.classifier.ClassifierBase

A classifier using HMMs

adapt_all_classes(labeled_data)

Adapt the models for several classes on given data.

labeled_data is a sequence of (label, instance) pairs. All the data for a given label will be agglomerated before adapting.

adapt_one_class(label, data)

Adapt the model for one class on given data.

Arguments are a single label and a sequence of adapting instances.

begin_training()
classify_one(datum)

Classify a single point.

Returns a tuple of (score, label) items sorted by score.

dimension
get_model(label)
labels
num_classes
onyx.am.classifier.logreftest()
onyx.am.classifier.make_target(dimension, num_comps, weights, means, vars)
onyx.am.classifier.read_gmm_classifier(file)
onyx.am.classifier.test0()
onyx.am.classifier.test1()
onyx.am.classifier.write_gmm_classifier(classifier, file)