Onyx logo

Previous topic

onyx.graph.graphsandbox – Sandbox where new graph stuff is developed.

Next topic

onyx.graph.randomgraph – This module has tools that support the generation of random graphs.

This Page

onyx.graph.graphtools – Basic tools for building and using graphs

The classes in this module support common operations using graphs.

They are used to construct graphs, serialize and unserialize graphs, iterate through the nodes or arcs of a graph, sort nodes and arcs topologically, search for paths, etc.

A graph consists of a set of nodes and a set of directed arcs. Each arc connects two nodes in the graph. Each node and each arc is identified in the graph by a non-negative integer ID. Each node and each arc has an associated label of user-supplied data. Many of the supported graph operations depend only upon the structure presented by the set of nodes and arcs, and do not depend upon the labels. For instance, whether a directed graph is acyclic or not depends only upon the structure of the graph; acyclicity does not depend on the labels associated with the nodes and arcs.

The factoring here focusses on structural properties of graphs and performing operations that depend on these properties. Subclassing of these objects is not recommended. Rather, they should be used as members of user-specific classes. That is, these classes support “has a” usage rather than “is a” usage.

To support user-semantics in a graph, each node and each arc has a user-supplied label. These labels permit the user to attach arbitrary data to each node or arc. Although it is not required, best practices suggest that each label should be an immutable object such as an integer, a string, or a tuple of integers and strings. For greatest portability of graph-based objects, e.g. in a text representation, the use of non-negative integers for labels is recommended.

The use of a single label to capture all user-specific semantics of a node or an arc is a simplifying factorization of the general graph representation problem. It means that the graph tools do not have to specify or verify semantic constraints or subtleties on nodes and arcs; the graph tools focus on the semantics associated with the graph structure. For the most part graph methods do not apply any semantics to the labels. The iteration constructs typically yield the label associated with each node or arc they visit. The few methods that make semantic use of the labels typically do so by calling a user-supplied callback to generate a numerical metric based on the label.

Furthermore, this use of a single label on each node or arc encourages the implementor of a specific graph usage to focus on the semantics of what a node is and what an arc is. That is, the implementor makes node and arc objects with whatever properties they need, and the implementor also makes an object to manage the collection of nodes and arcs. This management object would also be responsible for maintaining the graph structure using the graphtools objects, and it would use the labels in the graph to identifiy the instances of the node and arc objects in the graph. E.g. each node label could be a unique unsigned integer that is an index into the table of data that the management object maintains for the nodes.

The tools implement several types of graph object. For instance, one type of a graph is used to build the graph structure. Due to its mutability (new nodes and new arcs are getting added) the builder graph does not support many graph semantics. Once the graph is done being built, an immutable (frozen) graph is generated. Because it is immutable, this form of the graph can collect global structural information at construction time and then readily supports read-only queries and traversals without having to solve any of the difficult semantic issues that arise while a graph is mutable.

There are of course cases when a graph need to be mutable and to maintain non-local semantics. To help simplify the semantic issues in such cases, support for a streaming model of graph usage has been adopted. The underlying constraint imposed by a streaming model is that it should be possible to specify a graph building and using algorithm that can exist in a bounded amount of memory while processing an unbounded amount of data. This implies an algorithm that will remove nodes and arcs from the graph at the same average rate that they are being added to the graph. Such an algorithm thus imposes constraints on the lifetimes of nodes and arcs. Adding and removing nodes and arcs according to a topologically ordered constraint is one simplification that can help to achieve bounded memory for an unbounded stream.

onyx.graph.graphtools.A_tutorial()

This is a tutorial on using graphtools. It’s also a doctest-based test of the graphtools module.

This tutorial demonstrates use of the graphtools module. It shows how to build and access a graph using GraphBuilder. It shows how to create a FrozenGraph and the operations and invariants associatated with a FrozenGraph.

Play around with points on the compass and connections between them.

Let’s build this graph:

       --------------------      /                      V
West -------> North ------> South --------> East
  \              \                          ^ ^
   \              \------------------------/  /
    \                                        /
     \--------------------------------------/

First some points:

>>> builder = GraphBuilder()
>>> north = builder.new_node("North")
>>> east =  builder.new_node("East")
>>> south =  builder.new_node("South")
>>> west =  builder.new_node("West")
>>> builder.num_nodes
4

Now make some directed connections:

>>> for start_node, end_node in (
...     (west, east),
...     (north, south),
...     (west, south),
...     (south, east),
...     (north, east),
...     (west, north)):
...   _ = builder.new_arc(start_node, end_node, builder.get_node_label(start_node) + 'To' + builder.get_node_label(end_node))
>>> builder.num_arcs
6

Get an immutable version of the graph and print it out:

>>> frozen = FrozenGraph(builder)
>>> for text in frozen.text_iter():
...   print text,
num_nodes 4
0   North
1   East
2   South
3   West
num_arcs 6
0   3  1   WestToEast
1   0  2   NorthToSouth
2   3  2   WestToSouth
3   2  1   SouthToEast
4   0  1   NorthToEast
5   3  0   WestToNorth

It’s a directed acyclic graph (DAG).

>>> frozen.has_cycle
False

See which nodes are global start nodes (have no arcs that end on them) and global end nodes (have no arcs that start on them):

>>> frozen.terminals
((3,), (1,))

Note:

>>> frozen.terminals == ((west,), (east,))
True

Iteration by node label with secondary iteration of in- and out-arcs, done in topological order since that’s well defined for this graph:

>>> for (node_label, in_arc_iter, out_arc_iter) in frozen.iter_nodes():
...    print('For node with label %s:' % (node_label,))
...    print('In-arcs are:')
...    for arc_label, other_node_label in in_arc_iter:
...        print('%s in from node %s' % (arc_label, other_node_label))
...    print('Out-arcs are:')
...    for arc_label, other_node_label in out_arc_iter:
...        print('%s out to node %s' % (arc_label, other_node_label))
For node with label West:
In-arcs are:
Out-arcs are:
WestToNorth out to node North
WestToEast out to node East
WestToSouth out to node South
For node with label North:
In-arcs are:
WestToNorth in from node West
Out-arcs are:
NorthToEast out to node East
NorthToSouth out to node South
For node with label South:
In-arcs are:
NorthToSouth in from node North
WestToSouth in from node West
Out-arcs are:
SouthToEast out to node East
For node with label East:
In-arcs are:
NorthToEast in from node North
SouthToEast in from node South
WestToEast in from node West
Out-arcs are:

And the same thing, but in reverse topological order

>>> for (node_label, in_arc_iter, out_arc_iter) in frozen.iter_nodes(reverse_order=True):
...    print('For node with label %s:' % (node_label,))
...    print('In-arcs are:')
...    for arc_label, other_node_label in in_arc_iter:
...        print('%s in from node %s' % (arc_label, other_node_label))
...    print('Out-arcs are:')
...    for arc_label, other_node_label in out_arc_iter:
...        print('%s out to node %s' % (arc_label, other_node_label))
For node with label East:
In-arcs are:
NorthToEast in from node North
SouthToEast in from node South
WestToEast in from node West
Out-arcs are:
For node with label South:
In-arcs are:
NorthToSouth in from node North
WestToSouth in from node West
Out-arcs are:
SouthToEast out to node East
For node with label North:
In-arcs are:
WestToNorth in from node West
Out-arcs are:
NorthToEast out to node East
NorthToSouth out to node South
For node with label West:
In-arcs are:
Out-arcs are:
WestToNorth out to node North
WestToEast out to node East
WestToSouth out to node South
>>> for (node_id, in_arc_iter, out_arc_iter) in frozen.iter_nodes_by_id():
...    print('For node with id %s:' % (node_id,))
...    print('In-arcs are:')
...    for arc_id, other_node_id in in_arc_iter:
...        print('%s in from node %s' % (arc_id, other_node_id))
...    print('Out-arcs are:')
...    for arc_id, other_node_id in out_arc_iter:
...        print('%s out to node %s' % (arc_id, other_node_id))
For node with id 3:
In-arcs are:
Out-arcs are:
5 out to node 0
0 out to node 1
2 out to node 2
For node with id 0:
In-arcs are:
5 in from node 3
Out-arcs are:
4 out to node 1
1 out to node 2
For node with id 2:
In-arcs are:
1 in from node 0
2 in from node 3
Out-arcs are:
3 out to node 1
For node with id 1:
In-arcs are:
4 in from node 0
3 in from node 2
0 in from node 3
Out-arcs are:

A helper for looking at paths between nodes, ranked by cost.

>>> def printpaths(pathiter, verbose=False):
...   print "score",  "((arc_id, (arc_start_node, arc_end_node, arc_label)), ...)" if verbose else "(arc_label, ...)"
...   for path_score, path, graph in pathiter:
...     print path_score, tuple( ((arcid, graph.get_arc(arcid),) if verbose else graph.get_arc_label(arcid)) for arcid in path)

There are two paths from north to east. Note that by using len as the arc label costing function, the score is just the sum of the lengths of the arcs’ label strings.

>>> printpaths(frozen.iterpaths(len, north, east))
score (arc_label, ...)
11 ('NorthToEast',)
23 ('NorthToSouth', 'SouthToEast')

Given that it’s a DAG and there are paths from north to east, we should expect no path from east to north.

>>> printpaths(frozen.iterpaths(len, east, north))
score (arc_label, ...)

Use the two terminals to get all paths through the graph. Verbose setting shows details of arc and node ids on each path.

>>> printpaths(frozen.iterpaths(len, west, east), True)
score ((arc_id, (arc_start_node, arc_end_node, arc_label)), ...)
10 ((0, (3, 1, 'WestToEast')),)
22 ((2, (3, 2, 'WestToSouth')), (3, (2, 1, 'SouthToEast')))
22 ((5, (3, 0, 'WestToNorth')), (4, (0, 1, 'NorthToEast')))
34 ((5, (3, 0, 'WestToNorth')), (1, (0, 2, 'NorthToSouth')), (3, (2, 1, 'SouthToEast')))

Get a permutation of node ids and arc ids maps node and arc ids into a topological order.

>>> maps = frozen.get_topological_reorderings()
>>> maps
((3, 0, 2, 1), (0, 2, 5, 1, 4, 3))
>>> (frozen.toponodes, frozen.topoarcs)
((3, 0, 2, 1), (5, 1, 2, 0, 3, 4))

E.g. node 3 (West) would be the first node in this topological ordering and node 1 (East) would be the last node in this topological ordering. Similarly, arc 0 (WestToEast) would remain the first arc in this topological ordering and arc 3 (SouthToEast) would be the last arc in this topological ordering.

Let’s make such a graph.

>>> topo = FrozenGraph(frozen.get_mapped_tables(maps))
>>> for text in topo.text_iter():
...   print text,
num_nodes 4
0   West
1   North
2   South
3   East
num_arcs 6
0   0  3   WestToEast
1   0  2   WestToSouth
2   0  1   WestToNorth
3   1  2   NorthToSouth
4   1  3   NorthToEast
5   2  3   SouthToEast

In this graph the topological permutations are identities.

>>> topo.get_topological_reorderings()
((0, 1, 2, 3), (0, 1, 2, 3, 4, 5))
>>> topo.toponodes, topo.topoarcs
((0, 1, 2, 3), (2, 1, 3, 0, 4, 5))

So the two terminal nodes have smallest and largest node ids.

>>> (start_node,), (end_node,) = terminals = topo.terminals
>>> terminals
((0,), (3,))

Despite the reordering of ids, the scores of paths and the sequence of arc labels are the same as for the non-topologically sorted graph.

>>> printpaths(topo.iterpaths(len, start_node, end_node))
score (arc_label, ...)
10 ('WestToEast',)
22 ('WestToSouth', 'SouthToEast')
22 ('WestToNorth', 'NorthToEast')
34 ('WestToNorth', 'NorthToSouth', 'SouthToEast')

A more lexically-involved cost metric can be used to break length ties:

>>> def lexlen(label):
...   accum = 0
...   for item in label:
...     accum = accum * 0x100 + ord(item)
...   return accum

No ties:

>>> printpaths(topo.iterpaths(lexlen, start_node, end_node))
score (arc_label, ...)
412717324528352340570996 ('WestToEast',)
200478143005669390070048732 ('WestToNorth', 'NorthToEast')
206522827443974778547267548 ('WestToSouth', 'SouthToEast')
24481084856605229727991028804 ('WestToNorth', 'NorthToSouth', 'SouthToEast')

Show that randomly shuffling the arcs and nodes doesn’t change the paths’ scores or sequences of arc-labels on each path.

>>> random.seed(123)
>>> maps = tuple(map(list, topo.get_topological_reorderings()))

Fiddle the elements from the original frozen graph:

>>> _ = map(random.shuffle, maps)
>>> maps
([1, 2, 3, 0], [2, 3, 1, 5, 4, 0])
>>> rand1 = FrozenGraph(frozen.get_mapped_tables(maps))
>>> (start_node1,), (end_node1,) = rand1.terminals
>>> printpaths(rand1.iterpaths(lexlen, start_node1, end_node1))
score (arc_label, ...)
412717324528352340570996 ('WestToEast',)
200478143005669390070048732 ('WestToNorth', 'NorthToEast')
206522827443974778547267548 ('WestToSouth', 'SouthToEast')
24481084856605229727991028804 ('WestToNorth', 'NorthToSouth', 'SouthToEast')

Now differently fiddle the elements from the topo graph:

>>> _ = map(random.shuffle, maps)
>>> maps
([2, 3, 1, 0], [0, 5, 4, 2, 3, 1])
>>> rand2 = FrozenGraph(topo.get_mapped_tables(maps))
>>> (start_node2,), (end_node2,) = rand2.terminals
>>> printpaths(rand2.iterpaths(lexlen, start_node2, end_node2))
score (arc_label, ...)
412717324528352340570996 ('WestToEast',)
200478143005669390070048732 ('WestToNorth', 'NorthToEast')
206522827443974778547267548 ('WestToSouth', 'SouthToEast')
24481084856605229727991028804 ('WestToNorth', 'NorthToSouth', 'SouthToEast')

But the underlying node and arc ids are different:

>>> printpaths(rand1.iterpaths(lexlen, start_node1, end_node1), True)
score ((arc_id, (arc_start_node, arc_end_node, arc_label)), ...)
412717324528352340570996 ((5, (2, 0, 'WestToEast')),)
200478143005669390070048732 ((3, (2, 3, 'WestToNorth')), (4, (3, 0, 'NorthToEast')))
206522827443974778547267548 ((0, (2, 1, 'WestToSouth')), (1, (1, 0, 'SouthToEast')))
24481084856605229727991028804 ((3, (2, 3, 'WestToNorth')), (2, (3, 1, 'NorthToSouth')), (1, (1, 0, 'SouthToEast')))
>>> printpaths(rand2.iterpaths(lexlen, start_node2, end_node2), True)
score ((arc_id, (arc_start_node, arc_end_node, arc_label)), ...)
412717324528352340570996 ((0, (3, 1, 'WestToEast')),)
200478143005669390070048732 ((3, (3, 2, 'WestToNorth')), (2, (2, 1, 'NorthToEast')))
206522827443974778547267548 ((5, (3, 0, 'WestToSouth')), (1, (0, 1, 'SouthToEast')))
24481084856605229727991028804 ((3, (3, 2, 'WestToNorth')), (4, (2, 0, 'NorthToSouth')), (1, (0, 1, 'SouthToEast')))
class onyx.graph.graphtools.FrozenGraph(source)

Bases: onyx.graph.graphtools.GraphBase

An immutable graph object suitable for graph operations that depend on global structural properties of the graph.

This graph contains two disconnected subgraphs, both of which are cyclic, one with a self loop.

>>> a = FrozenGraph(GraphTables(((0, 1, 2, 3), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> a.has_cycle, a.is_connected, a.has_self_loop, a.has_multiarcs, a.is_lattice, a.node_labels_are_node_ids, a.arc_labels_are_arc_ids
(True, False, True, False, False, True, False)

Reverse each of the conditions

>>> b = FrozenGraph(GraphTables(((-1, -2), (0, 0), (1, 1), (0, 1))))
>>> b.has_cycle, b.is_connected, b.has_self_loop, b.has_multiarcs, b.is_lattice, b.node_labels_are_node_ids, b.arc_labels_are_arc_ids
(False, True, False, True, True, False, True)

XXX [Fix this documentation] Case of a cyclic graph that is not bakis_cyclic, i.e. the cyclicity is only due to self loops

>>> c = FrozenGraph(GraphTables((('a', 'b'), (0, 1), (1, 1), ('c', 'd'))))
>>> c.has_cycle, c.has_self_loop, c.strict_is_cyclic
(False, True, True)
>>> b = SetGraphBuilder()
>>> _ = b.add_arc(0, 1)
>>> _ = b.add_arc(0, 2)
>>> d = FrozenGraph(b)
>>> d.has_join, d.is_tree, d.is_tree_strict
(False, True, True)
>>> _ = b.add_arc(2, 2)
>>> e = FrozenGraph(b)
>>> e.has_join, e.is_tree, e.is_tree_strict
(False, True, False)
>>> _ = b.add_arc(3, 2)
>>> f = FrozenGraph(b)
>>> f.has_join, f.is_tree, f.is_tree_strict
(True, False, False)

Creates a frozen graph which is a snapshot of the current state of source. source must be an instance of GraphTables or GraphBase.

arc_labels_are_arc_ids

Return True if each arc label is the index of the arc.

clone_graph(node_xform=<function <lambda> at 0x1d06fb18>, arc_xform=<function <lambda> at 0x1d06fb90>)

Returns a new FrozenGraph based on the contents of self. The new graph will be isomorphic to the original, with node labels and/or arc labels transformed by the callables node_xform and arc_xform, respectively. node_xform should be a callable taking one argument, which will be the label of the node being transformed. arc_xform should be a callable of 3 arguments, which will be the label of the arc being transformed, the label of the start node for the arc, and the label of the end node for the arc, respectively. By default, each transform function is the identity function on the label of the arc or node. Thus, if no arguments are provided, an exact copy is returned.

>>> g0 = FrozenGraph(GraphTables(((1, 2, 3), (0, 1, 0), (1, 2, 2), ('123', '234', '345'))))
>>> g1 = g0.clone_graph()
>>> g1
FrozenGraph(GraphTables(((1, 2, 3), (0, 1, 0), (1, 2, 2), ('123', '234', '345'))))
>>> g0 == g1
True
>>> g2 = g0.clone_graph(str, lambda l, s, e: int(l))
>>> g2
FrozenGraph(GraphTables((('1', '2', '3'), (0, 1, 0), (1, 2, 2), (123, 234, 345))))
>>> g3 = g0.clone_graph(str, lambda l, s, e: int(l) * s * e)
>>> g3
FrozenGraph(GraphTables((('1', '2', '3'), (0, 1, 0), (1, 2, 2), (246, 1404, 1035))))
depth_first_callback(pre_callback=None, post_callback=None, join_callback=None, cycle_callback=None)

Performs a depth-first traversal of the graph, making callbacks for all structural events involving the nodes in the graph. The sequence of callbacks represent a structure-revealing serialization of events from which the graph could be constructed.

See also depth_first_iter(), an advanced, lower-level method that returns a generator that yields items from the callbacks.

The optional pre_callback and post_callback arguments are functions of two arguments, (graph, node_id), where graph is the FrozenGraph instance making the callback and node_id is the identifier of the node in question. These two callbacks are each called exactly once with the node identifier of each node in the graph. The interleaving of these calls reveals the topological, tree, and spanning-tree structure in the graph.

The pre_callback function is called each time a new node_id is first encountered in the depth traversal. It is guaranteed to be called with a given value for node_id prior to being called for any node that is reachable from the given value of node_id.

The post_callback function is called after all nodes reachable from node_id have been encountered and processed by pre_callback and post_callback. For DAGs, the reverse of the sequence of node_id in the calls to post_callback is a topological ordering of the nodes. For example, toponodes is constructed this way. Also, for any subtree in the graph, the node_id values appearing in the pre_callback and post_callback calls are nested.

The cycle_callback and join_callback arguments are functions of three arguments, (graph, start_node_id, end_node_id), where graph is the FrozenGraph instance making the callback and start_node_id and end_node_id are two nodes that are connected by an arc. These two callbacks reveal the joins and cycles in the graph.

The join_callback is called whenever a join is encountered that does not complete a cycle. That is, end_node_id will already have been used in calls to both pre_callback and post_callback. In each call to join_callback, it is guaranteed that start_node_id and end_node_id are distinct.

The cycle_callback is called whenever a join is encountered that does complete a cycle. In this case, end_node_id will already have been used in a pre_callback, but it will not have been used in a post_callback. If start_node_id and end_node_id are the same, the cycle is a self loop. Otherwise, start_node_id and end_node_id are distinct and the cycle includes multiple nodes. The set of arcs appearing in the calls to cycle_callback is a set of arcs which, if removed or reversed, would convert the graph a DAG. Put another way, if cycle_callback is not ever called, the graph is a DAG.

depth_first_iter(pre_callback=None, post_callback=None, join_callback=None, cycle_callback=None)

Performs a depth-first traversal of the graph, making callbacks to iterator-returning functions for all structural events involving the nodes in the graph. The sequence of callbacks represent a structure-revealing serialization of events from which the graph could be constructed. The iterator yields the items from each of the iterables/generators that each iterator-returning callback returns.

See also depth_first_callback(), a simpler, high-level method that just calls the callbacks.

The optional pre_callback and post_callback arguments are iterator-returning functions of two arguments, (graph, node_id), where graph is the FrozenGraph instance making the callback and node_id is the identifier of the node in question. These two callbacks are each called exactly once with the node identifier of each node in the graph. The interleaving of these calls reveals the topological, tree, and spanning-tree structure in the graph.

The pre_callback iterator-returning function is called each time a new node_id is first encountered in the depth traversal. It is guaranteed to be called with a given value for node_id prior to being called for any node that is reachable from the given value of node_id.

The post_callback iterator-returning function is called after all nodes reachable from node_id have been encountered and processed by pre_callback and post_callback. For DAGs, the reverse of the sequence of node_id in the calls to post_callback is a topological ordering of the nodes. For example, toponodes is constructed this way. Also, for any subtree in the graph, the node_id values appearing in the pre_callback and post_callback calls are nested.

The cycle_callback and join_callback arguments are iterator-returning functions of three arguments, (graph, start_node_id, end_node_id), where graph is the FrozenGraph instance making the callback and start_node_id and end_node_id are two nodes that are connected by an arc. These two callbacks reveal the joins and cycles in the graph.

The join_callback is called whenever a join is encountered that does not complete a cycle. That is, end_node_id will already have been used in calls to both pre_callback and post_callback. In each call to join_callback, it is guaranteed that start_node_id and end_node_id are distinct.

The cycle_callback is called whenever a join is encountered that does complete a cycle. In this case, end_node_id will already have been used in a pre_callback, but it will not have been used in a post_callback. If start_node_id and end_node_id are the same, the cycle is a self loop. Otherwise, start_node_id and end_node_id are distinct and the cycle includes multiple nodes. The set of arcs appearing in the calls to cycle_callback is a set of arcs which, if removed or reversed, would convert the graph a DAG. Put another way, if cycle_callback is not ever called, the graph is a DAG.

dot_display(temp_file_prefix='graphtools_', display_tool_format='open -a /Applications/Graphviz.app %s', **kwargs)

Display a dot-generated representation of the graph. Returns the name of the temporary file that the display tool is working from. The caller is responsible for removing this file. See also dot_iter().

Optional temp_file_prefix is the prefix used for the temporary filename.

Optional display_tool_format is formatting string, with %s where the filename goes, used to generate the command that will display the file. By default it assumes you’re on a Mac and have Graphviz.app installed in the /Applications directory.

Remaining keyword arguments are handed to the dot_iter function.

dot_iter(graph_label='', graph_type='digraph', globals=(), node_label_callback=<function <lambda> at 0x1d06a5f0>, node_attributes_callback=None, arc_label_callback=<function <lambda> at 0x1d06a5f0>, arc_attributes_callback=None, arc_ports_callback=None)

Returns a generator that yields lines of text, including newlines. The text represents the graph in the DOT language for which there are many displayers. See also dot_display().

Optional argument graph_label is a string label for the graph. Optional argument graph_type defaults to ‘digraph’, can also be ‘graph’.

Optional globals is an iterable that yields strings to put in the globals section of the DOT file.

Optional node_label_callback is a function that will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs). The callback function should return a string which will be used as the text label for the node in the figure, or None or the empty string for no label. If this argument is not given, each node with a non-None label will be labelled with the str value of its label. If this argument is None, nodes will not be labelled.

Optional arc_label_callback is a function that will be called with the arc label for each arc. The callback function should return a string which will be used as the text label for the arc in the figure, or None or the empty string for no label. If this argument is not given, each arc with a non-None label will be labelled with the str value of its label. If this argument is None, arcs will not be labelled.

Optional node_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the node. It will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs).

Optional arc_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the arc. It will be called with the arc label for each arc in the graph.

Optional arc_ports_callback is a function of two arguments that is called for each arc in the graph. The arguments are the labels of the start and end nodes of the arc. The function returns a pair of strings that name the respective ports of the nodes to which the arc should attach. The strings should be empty if no port is implied.

>>> graph = FrozenGraph(GraphTables(((1, 2, 'a'), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> print ''.join(graph.dot_iter())
digraph  { 
  n0  [label="1"];
  n1  [label="2"];
  n2  [label="a"];
  n0 -> n1  [label="3"];
  n1 -> n0;
  n2 -> n2  [label="5"];
}
<BLANKLINE>
>>> print ''.join(graph.dot_iter(graph_label='Foo',
...                              globals=['size="5,6.125";', 'ordering=out;'],
...                              node_label_callback=lambda label, is_start, is_end: ('<' if is_start else '') + str(label) + ('>' if is_end else ''),
...                              arc_label_callback=str,
...                              node_attributes_callback=lambda label, is_start, is_end: ['shape=%s' % ('octagon' if is_start or is_end else 'ellipse',), 'style=%s' % ('bold' if isinstance(label, int) else 'normal',)],
...                              arc_attributes_callback=lambda label: ['style=%s' % ('bold' if label is None else 'normal',)]))
digraph Foo { 
  label="Foo";
  size="5,6.125";
  ordering=out;
  n0  [label="1", shape=ellipse, style=bold];
  n1  [label="2", shape=ellipse, style=bold];
  n2  [label="<a>", shape=octagon, style=normal];
  n0 -> n1  [label="3", style=normal];
  n1 -> n0  [label="None", style=bold];
  n2 -> n2  [label="5", style=normal];
}
<BLANKLINE>
endnodes

A tuple of the indices of the ending-node ids. These are nodes with no outgoing arcs.

get_arc(arc_id)

Return a tuple of the (start_node, end_node, arc_label) for the arc associated with the arc_id.

get_arc_label(arc_id)

Return the arc label associated with the arc_id.

get_backward_transitive_closures(seed_node_id_iter, force_reflexive=False, max_depth=None)

Find the part of the graph that is reachable going backward from the set of nodes in seed_node_id_iter. I.e., find the part of the graph that can reach the nodes in seed_node_id_iter when going forward.

Return a tuple of two sets: the ids of the nodes and the ids of the arcs in the backward-reachable part of the graph. If optional force_reflexive is True then include all the nodes from seed_node_id_iter in the set of returned node ids regardless of whether the graph has cycles that include them. If optional max_depth is not None, it should be a non-negative integer and the subgraph returned will be limited to those nodes that are at most max_depth from any node in seed_node_id_iter.

>>> g0 = FrozenGraph(GraphTables(((-10, -11, -12), (2, 1, 1), (0, 2, 0,), (-20, -21, -22))))
>>> g0.get_backward_transitive_closures((2,))
(set([1]), set([1]))
>>> g0.get_backward_transitive_closures((2,), force_reflexive=True)
(set([1, 2]), set([1]))
>>> g0.get_backward_transitive_closures((2,), max_depth=0)
(set([]), set([]))
>>> g0.get_backward_transitive_closures((2,), force_reflexive=True, max_depth=0)
(set([2]), set([]))
>>> g0.get_backward_transitive_closures((2,), max_depth=1)
(set([1]), set([1]))
>>> g0.get_backward_transitive_closures((0,), max_depth=1)
(set([1, 2]), set([0, 2]))
>>> g0.get_backward_transitive_closures((0,), max_depth=2)
(set([1, 2]), set([0, 1, 2]))
>>> FrozenGraph(GraphTables(((10, 20), (0, 0, 1, 1), (0, 1, 0, 1), ('00', '01', '10', '11')))).get_backward_transitive_closures((0,))
(set([0, 1]), set([0, 1, 2, 3]))
get_canonical_DAG()

Returns a new FrozenGraph based on the contents of self. In the new graph the nodes and arcs are rearranged to be indexed in a topological order.

Raises a ValueError if has_cycle() is True.

>>> FrozenGraph(GraphTables(((-10, -11, -12), (2, 1, 1), (0, 2, 0,), (-20, -21, -22)))).get_canonical_DAG()
FrozenGraph(GraphTables(((-11, -12, -10), (0, 1, 0), (1, 2, 2), (-21, -20, -22))))
get_forward_transitive_closures(seed_node_id_iter, force_reflexive=False, max_depth=None)

Find the part of the graph that is reachable going forward from the set of nodes in seed_node_id_iter.

Return a tuple of two sets: the ids of the nodes and the ids of the arcs in the forward-reachable part of the graph. If optional force_reflexive is True then include all the nodes from seed_node_id_iter in the set of returned node ids regardless of whether the graph has cycles that include them. If optional max_depth is not None, it should be a non-negative integer and the subgraph returned will be limited to those nodes that are at most max_depth from any node in seed_node_id_iter.

>>> g0 = FrozenGraph(GraphTables(((-10, -11, -12), (2, 1, 1), (0, 2, 0,), (-20, -21, -22))))
>>> g0.get_forward_transitive_closures((1,))
(set([0, 2]), set([0, 1, 2]))
>>> g0.get_forward_transitive_closures((1,), force_reflexive=True)
(set([0, 1, 2]), set([0, 1, 2]))
>>> g0.get_forward_transitive_closures((1,), max_depth=0)
(set([]), set([]))
>>> g0.get_forward_transitive_closures((1,), force_reflexive=True, max_depth=0)
(set([1]), set([]))
>>> g0.get_forward_transitive_closures((1,), max_depth=1)
(set([0, 2]), set([1, 2]))
>>> g0.get_forward_transitive_closures((1,), max_depth=2)
(set([0, 2]), set([0, 1, 2]))
>>> FrozenGraph(GraphTables(((10, 20), (0, 0, 1, 1), (0, 1, 0, 1), ('00', '01', '10', '11')))).get_forward_transitive_closures((0,))
(set([0, 1]), set([0, 1, 2, 3]))
get_induced_subgraph(start_node_iter, end_node_iter)

Returns a FrozenGraph of the induced subgraph between nodes in start_node_iter and nodes in end_node_iter, where each of the iterables yields node ids. If either iterator is empty, an empty graph is returned.

A DAG example

>>> g = FrozenGraph(GraphTables((xrange(12),
...                             (0, 0, 2, 2, 3, 3, 5, 6, 8, 8, 9, 10, 11),
...                             (2, 3, 4, 5, 6, 7, 8, 9, 1, 1, 10, 1, 2),
...                             xrange(13))))
>>> #g.dot_display()
>>> i = g.get_induced_subgraph([0],[1])
>>> i
FrozenGraph(GraphTables(((0, 1, 2, 3, 5, 6, 8, 9, 10), (0, 0, 2, 3, 4, 5, 6, 6, 7, 8), (2, 3, 4, 5, 6, 7, 1, 1, 8, 1), (0, 1, 3, 4, 6, 7, 8, 9, 10, 11))))
>>> #i.dot_display()

Examples involving cycles

>>> j = FrozenGraph(GraphTables((xrange(4),
...                              (0, 0, 1, 0, 3),
...                              (0, 1, 0, 2, 1),
...                              xrange(5))))
>>> #j.dot_display()
>>> j.get_induced_subgraph([3],[2]) == j
True
>>> j.get_induced_subgraph([2],[3])
FrozenGraph(GraphTables(((), (), (), ())))
>>> j.get_induced_subgraph([0],[1])
FrozenGraph(GraphTables(((0, 1), (0, 0, 1), (0, 1, 0), (0, 1, 2))))

If either iterator is empty, an empty graph is returned.

>>> j.get_induced_subgraph([],[1])
FrozenGraph(GraphTables(((), (), (), ())))
>>> j.get_induced_subgraph([0],[])
FrozenGraph(GraphTables(((), (), (), ())))
get_line_graph(node_labeler=None, arc_labeler=None)

Returns a new FrozenGraph based on the contents of self. The new graph will be the line-graph of the original, also called the interchange or edge-dual graph.

The line-graph (V*, E*) of a directed graph (V,E) has nodes V* which correspond to the arcs E of the original graph. Two nodes v1 and v2 in V* have an arc between them in E* if and only if the arcs e1 and e2 in E that they correspond to form a path of length two in V, that is, if and only if there is a node v in the original graph such that e1 is an in-arc of v and e2 is an out-arc of v.

By default, nodes in the resulting line-graph will have labels which are triples (pred_node_label, arc_label, succ_node_label) where these labels come from the original graph; arcs in the resulting graph will have labels which are triples (in_arc_label, node_label, out_arc_label), again with labels from the original graph. Callers may provide their own labeling functions for either nodes or arcs, each will be called with three arguments which are exactly the respective triples shown above.

>>> g0 = FrozenGraph(GraphTables(((0, 1, 2, 3), (0, 1, 2), (1, 2, 3), ('A', 'B', 'C'))))
>>> lg0 = g0.get_line_graph()
>>> lg0
FrozenGraph(GraphTables((((0, 'A', 1), (1, 'B', 2), (2, 'C', 3)), (0, 1), (1, 2), (('A', 1, 'B'), ('B', 2, 'C')))))

Here’s an example with a simpler labeler, showing that a diamond-shaped lattice is converted to a disconnected graph. Repeated applications of the line-graph operator result in the empty graph.

>>> diamond = FrozenGraph(GraphTables((('w', 'x', 'y', 'z'), (0, 0, 1, 2), (1, 2, 3, 3), ('A', 'B', 'C', 'D'))))
>>> labeler = lambda left, middle, right: middle
>>> lg1 = diamond.get_line_graph(labeler, labeler)
>>> lg1
FrozenGraph(GraphTables((('A', 'B', 'C', 'D'), (0, 1), (2, 3), ('x', 'y'))))
>>> lg2 = lg1.get_line_graph(labeler, labeler)
>>> lg2
FrozenGraph(GraphTables((('x', 'y'), (), (), ())))
>>> lg3 = lg2.get_line_graph(labeler, labeler)
>>> lg3
FrozenGraph(GraphTables(((), (), (), ())))

Cycle graphs are idempotent under two applications of the line-graph operator.

>>> cycle = FrozenGraph(GraphTables((('a', 'b', 'c', 'd'), (0, 1, 2, 3), (1, 2, 3, 0), ('A', 'B', 'C', 'D'))))
>>> cycle_lg1 = cycle.get_line_graph(labeler, labeler)
>>> cycle_lg1
FrozenGraph(GraphTables((('A', 'B', 'C', 'D'), (3, 0, 1, 2), (0, 1, 2, 3), ('a', 'b', 'c', 'd'))))
>>> cycle_lg2 = cycle_lg1.get_line_graph(labeler, labeler)
>>> cycle_lg2 == cycle
True

Here’s an example with two unconnected cycles of circumferences 3 and 4

>>> two_cycles = FrozenGraph(GraphTables((('a', 'b', 'c', 'd', 'x', 'y', 'z'),
...                                       (0, 1, 2, 3, 4, 5, 6),
...                                       (1, 2, 3, 0, 5, 6, 4),
...                                       ('A', 'B', 'C', 'D', 'X', 'Y', 'Z'))))
>>> two_cycles_lg1 = two_cycles.get_line_graph(labeler, labeler)
>>> two_cycles_lg2 = two_cycles_lg1.get_line_graph(labeler, labeler)
>>> two_cycles_lg2 == two_cycles
True
>>> debruijn_labeler = lambda left, middle, right: str(left) + str(right)[-1]
>>> empty_labeler = lambda a,b,c: None
>>> debruijn0 = FrozenGraph(GraphTables((('0', '1'), (0, 1, 0, 1), (0, 0, 1, 1), (None, None, None, None))))
>>> debruijn1 = debruijn0.get_line_graph(debruijn_labeler, empty_labeler)
>>> debruijn2 = debruijn1.get_line_graph(debruijn_labeler, empty_labeler)
>>> debruijn3 = debruijn2.get_line_graph(debruijn_labeler, empty_labeler)
>>> debruijn3.num_nodes
16
get_mapped_tables(maps)

Using the supplied id maps, return a GraphTables object with mapped graph tables.

XXX the funcgtionality of this method and get_topological_reorderings() are largely obsoleted by get_canonical_DAG()

get_nocycles()

Returns a new FrozenGraph based on the contents of self. The set of nodes is unchanged. Cycles appearing in the original graph are broken in the new graph by removing cycle-completing arcs. The set of cycle-completing arcs is not canonical; that is, a different ordering of nodes and arcs in the original graph can result in different set of arcs being removed.

XXX note: an alternative algorithm would just reverse each of the cycle-completing arcs

>>> g = FrozenGraph(GraphTables(((-10, -11, -12), (2, 1, 0, 1), (0, 2, 2, 0), (-20, -21, -22, None))))
>>> g.get_nocycles()
FrozenGraph(GraphTables(((-10, -11, -12), (2, 1, 1), (0, 2, 0), (-20, -21, None))))
get_node_in_arcs(node_id)

Returns a list of the arc_ids of the incident arcs on node_id.

Query a graph with a single node with two self loops

>>> FrozenGraph(GraphTables(((-10,), (0, 0), (0, 0,), (-20, -21)))).get_node_in_arcs(0)
[0, 1]
get_node_label(node_id)

Return the node label associated with the node_id.

get_node_out_arcs(node_id)

Returns a list of the arc_ids of the incident arcs from node_id.

Query a graph with a single node with two self loops

>>> FrozenGraph(GraphTables(((-10,), (0, 0), (0, 0,), (-20, -21)))).get_node_out_arcs(0)
[0, 1]
get_one_best(arc_cost_function, random_tie_breaks=False)

Using arc_cost_function to generate the cost for each arc, return a pair, (cost, arc_ids), giving the total cost of the cheapest path through the lattice and the sequence of arc_ids making up that path.

If optional random_tie_breaks is not False, use the value to seed randomness that’s used to break scoring ties. Use a value other than None for reproducible randomness. Clients using the random_tie_breaks option are strongly encouraged to use an arc_cost_function that returns integer values.

See also iterpaths() for a more general path-finding interface.

Both properties is_lattice and is_strict_dag must be True.

Here’s a latice, arc labels are unique non-negative ints

>>> lat = make_random_lattice(50, 3, 10, 400, node_labeler=2, seed=1)
>>> lat.num_nodes, lat.num_arcs
(21, 23)
>>> 

Make the cost just be a modulus plus one, get the one-best

>>> def cost_func(label): return label % 4 + 1
>>> cost, arc_ids = lat.get_one_best(cost_func)
>>> print cost, arc_ids
13 [3, 6, 14, 18, 21, 22]

To visualize, uncomment the dot_display

>>> arc_labels = tuple(lat.get_arc_label(arc_id) for arc_id in arc_ids)
>>> assert len(set(arc_labels)) == len(arc_ids)
>>> def alc(label, *_): return (label, label % 4 + 1)
>>> def aac(label):
...   if label in arc_labels: yield 'style=bold'
>>> #_ = lat.dot_display(globals=['rankdir=LR', 'label="Calvary"', 'labelloc=top', 'fontsize=24'], arc_label_callback=alc, arc_attributes_callback=aac)

A big one

>>> lat2 = make_random_lattice(500, 3, 10, 700, node_labeler=4, seed=1)
>>> lat2.num_nodes, lat2.num_arcs
(1401, 3258)
>>> cost_one_lat2, arc_ids = lat2.get_one_best(cost_func)        
>>> cost_one_lat2, len(arc_ids)
(62, 40)

Verify same score for top choice from iterpaths

>>> start_node, = lat2.startnodes
>>> end_node, = lat2.endnodes
>>> iterpaths = lat2.iterpaths(cost_func, start_node, end_node)
>>> cost_iter_lat2, path, graph = iterpaths.next()
>>> cost_iter_lat2, len(path)
(62, 40)

Even bigger

>>> lat3 = make_random_lattice(500, 4, 7, 700, node_labeler=8, seed=1)
>>> lat3.num_nodes, lat3.num_arcs
(3314, 10249)
>>> cost_one_lat3, arc_ids3 = lat3.get_one_best(cost_func)        
>>> cost_one_lat3, len(arc_ids3)
(68, 55)

Verify same score for top choice from iterpaths, even though the best paths chosen by the two algorithms are different!

>>> start_node, = lat3.startnodes
>>> end_node, = lat3.endnodes
>>> iterpaths = lat3.iterpaths(cost_func, start_node, end_node)
>>> cost_iter_lat3, path, graph = iterpaths.next()
>>> cost_iter_lat3, len(path)
(68, 54)
>>> cost_iter_lat3 == cost_one_lat3
True
>>> tuple(iterpaths.next()[0] for _ in xrange(20))
(68, 68, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69)

Verify random selection

>>> cost_one_lat3_rand, arc_ids_rand3 = lat3.get_one_best(cost_func, random_tie_breaks=0)
>>> arc_ids_rand3 == arc_ids3
False

Lattice with more identical top choices; show that random one_best finds them all

>>> def cost_func2(label): return label % 2 + 1
>>> lat4 = make_random_lattice(500, 8, 8, 800, node_labeler=6, seed=1)
>>> lat4.num_nodes, lat4.num_arcs
(2887, 19093)

The non-random one-best

>>> cost_one_lat4, arc_ids4 = lat4.get_one_best(cost_func2)
>>> cost_one_lat4
37

Count number of tied top choices

>>> start4, = lat4.startnodes
>>> end4, = lat4.endnodes
>>> iterpaths4 = lat4.iterpaths(cost_func2, start4, end4)
>>> for num_same in itertools.count():
...   cost, path, graph = iterpaths4.next()
...   assert cost >= cost_one_lat4
...   if cost > cost_one_lat4: break
>>> num_same
8

See how many times it takes get_one_best with random tie breaks to enumerate all the tied top choices

>>> paths = set()
>>> paths.add(tuple(arc_ids4))
>>> for num_tries in itertools.count():
...   cost_one_lat4_rand, arc_ids_rand4 = lat4.get_one_best(cost_func2, random_tie_breaks=num_tries)
...   assert cost_one_lat4_rand == cost_one_lat4
...   paths.add(tuple(arc_ids_rand4))
...   assert len(paths) <= num_same
...   if len(paths) == num_same: break
>>> num_tries
13

See how iterpaths_rea does

>>> iterpaths_rea4 = lat4.iterpaths_rea(cost_func2)
>>> for num_same2 in itertools.count():
...   cost2, path, graph = iterpaths_rea4.next()
...   assert cost2 >= cost_one_lat4
...   if cost2 > cost_one_lat4: break
>>> num_same2, cost2
(8, 38)
>>> for num_same3 in itertools.count():
...   cost3, path, graph = iterpaths_rea4.next()
...   assert cost3 >= cost2
...   if cost3 > cost2: break
>>> num_same3, cost3
(13261, 39)

Test the errors

>>> make_random_DAG(1, 15, 2, 4, 2, 600, seed=0).get_one_best(lambda x: 1)
Traceback (most recent call last):
  ...
ValueError: expected both is_lattice and is_strict_dag to be True, got False True
>>> FrozenGraph(GraphTables(([None], [0], [0], [None]))).get_one_best(lambda x: 1)
Traceback (most recent call last):
  ...
ValueError: expected both is_lattice and is_strict_dag to be True, got True False
get_reversed()

Returns a new FrozenGraph based on the contents of self. In the new graph the directions of all the arcs are reversed.

>>> g = FrozenGraph(GraphTables(((-10, -11, -12), (2, 1, 1), (0, 2, 0,), (-20, -21, -22))))
>>> g.get_reversed()
FrozenGraph(GraphTables(((-10, -11, -12), (0, 2, 0), (2, 1, 1), (-20, -21, -22))))
>>> g.get_reversed().get_reversed() == g
True
get_topological_reorderings()

Find a topological ordering for the nodes and arcs in an acyclic graph.

Return two sequences, each of which maps old ids to new ids. The first is the map of node ids, the second is the map of arc ids.

XXX the funcgtionality of this method and get_mapped_tables() are largely obsoleted by get_canonical_DAG()

has_cycle

Return True if there are is any multi-node cycle in the graph, False otherwise.

has_join

True if there are any joins in the graph, False otherwise. A join occurs when two or more arcs end on the same node.

has_multiarcs

Return True if there are any two distinct nodes with parallel arcs between them, False if there is either zero or one arc between each pair of distinct nodes. This measure is of more interest for undirected graphs than directed graphs.

has_self_loop

Return True if there is any node with a self loop, False otherwise.

is_confusion_network

Returns True if the graph has the structure of a confusion network (also called a consensus network, or a sausage, or a linear grammar).

A confusion network is a strict DAG (no cycles or self loops) with a single start node and a single end node. The essential structural property is that for each node in the graph, u, all the arcs that start from u end on a single node, v, that is distinct from u.

>>> FrozenGraph(GraphTables((tuple(xrange(3)), (0,0,1,1,1,1), (1,1,2,2,2,2), tuple(xrange(6))))).is_confusion_network
True
>>> print FrozenGraph(GraphTables((tuple(xrange(3)), (0,0,1,1,1,1), (1,2,2,2,2,2), tuple(xrange(6))))).is_confusion_network
False
is_connected

Return True if the nodes of the graph are weakly connected, False if there are disconnected subsets of nodes.

is_lattice

Return True if the graph has no cycles and has a single start node and a single end node.

is_strict_dag

Return True if the graph has no self loops and no cycles.

is_tree

True if the graph is a tree, False otherwise. There are many ways to decide if a graph is a tree. For our purposes, a graph is a tree if it has a single startnode and no joins.

is_tree_strict

True if the graph is strictly a tree, False otherwise. There are many ways to define a strict a tree. For our purposes, a graph is a strict tree if it has a single startnode, no joins, and no self-loops.

iter_arc_ngrams(ngram_order)

Returns a generator that yields information about the arc-ngram-sequences of length ngram_order in the graph.

Each iteration yields a pair of tuples for an ngram in the graph. The first tuple in the pair is the sequence of ngram_order + 1 node labels for the nodes that define the ngram. The second tuple in the pair is the sequence of ngram_order arc labels for the arcs in the ngram.

Raises ValueError if self.is_strict_dag() is not True.

An example showing all ngrams of order 2 or more For readability purposes, we convert the node labels into node indicies.

>>> g = make_random_DAG(1, 15, 2, 4, 2, 600, seed=0)
>>> nodelabels, arcstartnodes, arcendnodes, arclabels = g.graphseqs
>>> g = FrozenGraph(GraphTables((tuple(xrange(len(nodelabels))), arcstartnodes, arcendnodes, arclabels)))
>>> for ngram in g.iter_arc_ngrams(2): print ngram
((0, 1, 6), (0, 6))
((0, 1, 6), (1, 6))
((0, 3, 4), (3, 4))
((1, 6, 9), (6, 10))
((3, 4, 5), (4, 5))
((3, 4, 7), (4, 7))
((3, 4, 8), (4, 8))
((4, 5, 9), (5, 9))
>>> for ngram in g.iter_arc_ngrams(3): print ngram
((0, 1, 6, 9), (0, 6, 10))
((0, 1, 6, 9), (1, 6, 10))
((0, 3, 4, 5), (3, 4, 5))
((0, 3, 4, 7), (3, 4, 7))
((0, 3, 4, 8), (3, 4, 8))
((3, 4, 5, 9), (4, 5, 9))
>>> for ngram in g.iter_arc_ngrams(4): print ngram
((0, 3, 4, 5, 9), (3, 4, 5, 9))
>>> for ngram in g.iter_arc_ngrams(5): print ngram
iter_nodes(reverse_order=False)

Returns a generator over the nodes in the graph. The generator yields a three-item tuple: the first item is the label of the node, the second item is a generator over the in-arcs of the node, and the third item is a generator over the out-arcs of the node. Each of these arc generators yields a pair (arc_label, node_label) where the node label is the label of the node at the other end of the arc relative to the node currently being yielded. When a topological ordering of nodes is possible, the nodes will be iterated in forward topological order by default, and in reverse topological order if reverse_order is True. If there is no topological ordering because the graph has cycles, then the order of iteration is undefined.

iter_nodes_by_id(reverse_order=False)

Returns a generator over the nodes in the graph. The generator yields a three-item tuple: the first item is the id of the node, the second item is a generator over the in-arcs of the node, and the third item is a generator over the out-arcs of the node. Each of these arc generators yields a pair (arc_id, node_id) where the node id is the id of the node at the other end of the arc relative to the node currently being yielded. When a topological ordering of nodes is possible, the nodes will be iterated in forward topological order by default, and in reverse topological order if reverse_order is True. If there is no topological ordering because the graph has cycles, then the order of iteration is undefined.

>>> fg0 = FrozenGraph(GraphTables(((10, 11, 12, 13, 14), (0, 0, 1, 2, 3), (1, 2, 3, 3, 4), ('A', 'B', 'C', 'D', 'E'))))
>>> for node_id, in_arc_iter, out_arc_iter in fg0.iter_nodes_by_id():
...     print('Node with id %d and label %s' % (node_id, fg0.get_node_label(node_id)))
...     for arc_id, pred_node_id in in_arc_iter:
...         print('In-arc from node %d with id %d and label %s' % (pred_node_id, arc_id, fg0.get_arc_label(arc_id)))
...     for arc_id, succ_node_id in out_arc_iter:
...         print('Out-arc to node %d with id %d and label %s' % (succ_node_id, arc_id, fg0.get_arc_label(arc_id)))
Node with id 0 and label 10
Out-arc to node 1 with id 0 and label A
Out-arc to node 2 with id 1 and label B
Node with id 1 and label 11
In-arc from node 0 with id 0 and label A
Out-arc to node 3 with id 2 and label C
Node with id 2 and label 12
In-arc from node 0 with id 1 and label B
Out-arc to node 3 with id 3 and label D
Node with id 3 and label 13
In-arc from node 1 with id 2 and label C
In-arc from node 2 with id 3 and label D
Out-arc to node 4 with id 4 and label E
Node with id 4 and label 14
In-arc from node 3 with id 4 and label E
iterpaths(cost_func, start_node, end_node, _optimize=True)

Returns a generator that yields paths between start_node and end_node based on the summed cost of the arcs on the path. The cost_func will be called with a single argument, an arc label, and it must return the cost for the arc.

See also get_one_best() for use when is_lattice is True and you just need a single best-path through the lattice.

The generator yields a three-item tuple, (cost, arc_ids, self). The first item is the cost for the path, the second item is a tuple of arc-ids for the path, the third item is the graph object itself. The costs are non-decreasing for successive yields. See get_arc() for getting the information associated with an arc_id.

It is an error to call this if the graph contains cycles. See strict_is_cyclic().

Because this DAG is not a lattice, this graph exercises one internal optimization

>>> dag = make_random_DAG(1, 15, 2, 4, 2, 600, seed=0)
>>> for cost, arcs, graph in dag.iterpaths(lambda x: 1, 0, 6): print cost, arcs
2 (0, 6)
2 (1, 6)
>>> dag._last_iterpaths_count
4
>>> for cost, arcs, graph in dag.iterpaths(lambda x: 1, 0, 6, _optimize=False): print cost, arcs
2 (0, 6)
2 (1, 6)
>>> dag._last_iterpaths_count
11
iterpaths_rea(arc_cost_func)

Returns a generator that yields paths between the start nodes and end nodes of a DAG. The arc_cost_func will be called with a single argument, an arc label, and it must return the cost of the arc.

The generator yields a three-item tuple, (cost, arc_ids, self). The first item is the cost for the path, the second item is a tuple of arc-ids for the path, the third item is the graph object itself. The costs are non-decreasing for successive yields. See get_arc() for getting the information associated with an arc_id.

It is an error to call this if the graph contains non-positve cycles.

>>> def cost_func(label): return label
>>> dag = make_random_DAG(15, 15, 3, 3, 5, 300, seed=0)
>>> dag.startnodes, dag.endnodes
((0, 1, 2, 3, 4), (4, 5, 8, 12, 13, 14, 19, 20, 22, 29, 30, 31, 32, 33, 34))
>>> [x for x in dag.iterpaths_rea(cost_func)]
Traceback (most recent call last):
  ...
ValueError: FrozenGraph must not have nodes that have no input or output arcs. Encountered empty nodes set([4])
>>> dag = make_random_DAG(5, 15, 5, 3, 5, 400, seed=0)
>>> dag.startnodes
(0, 1)
>>> dag.endnodes
(7, 11, 16, 18, 19, 22, 25, 42, 50, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74)
>>> for num_choice, (cost, arc_seq, _) in enumerate(dag.iterpaths_rea(cost_func)):
...     print 'Choice %d  Cost %d  Arcs %s' %(num_choice, cost, arc_seq)
...     if num_choice > 10: break
Choice 0  Cost 5  Arcs (5,)
Choice 1  Cost 15  Arcs (4, 11)
Choice 2  Cost 20  Arcs (20,)
Choice 3  Cost 25  Arcs (6, 19)
Choice 4  Cost 25  Arcs (4, 21)
Choice 5  Cost 26  Arcs (7, 19)
Choice 6  Cost 28  Arcs (12, 16)
Choice 7  Cost 30  Arcs (4, 26)
Choice 8  Cost 31  Arcs (4, 8, 19)
Choice 9  Cost 33  Arcs (3, 30)
Choice 10  Cost 46  Arcs (15, 31)
Choice 11  Cost 55  Arcs (1, 22, 32)
>>> #dag.dot_display(globals=['rankdir=LR'])
>>> ssdag = make_random_dag(50, 3, 10, 400, node_labeler=2, seed=1)
>>> ssdag.num_nodes, ssdag.num_arcs
(47, 54)
>>> ssdag.endnodes
(7, 10, 11, 12, 19, 21, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46)
>>> #ssdag.dot_display(globals=['rankdir=LR'])

Make the cost just be a modulus plus one, get the one-best

>>> for num_choice, (cost, arc_seq, _) in enumerate(ssdag.iterpaths_rea(cost_func)):
...     print 'Choice %d  Cost %d  Arcs %s' %(num_choice, cost, arc_seq)
...     label_seq = tuple(ssdag.get_arc_label(arc_id) for arc_id in arc_seq)
...     # uncomment to get lattice and path displayed.
...     #def alc(label, *_): return (label, label)
...     #def aac(label):
...     #    yield 'style=bold' if label in label_seq else 'style=dotted'
...     #_ = ssdag.dot_display(globals=['rankdir=LR', 'label="Choice %d"' %(num_choice), 'labelloc=top', 'fontsize=24'], arc_label_callback=alc, arc_attributes_callback=aac)
Choice 0  Cost 7  Arcs (0, 1, 6)
Choice 1  Cost 9  Arcs (0, 2, 7)
Choice 2  Cost 11  Arcs (0, 11)
Choice 3  Cost 17  Arcs (0, 5, 12)
Choice 4  Cost 23  Arcs (0, 2, 3, 8, 10)
Choice 5  Cost 32  Arcs (0, 2, 3, 8, 19)
Choice 6  Cost 37  Arcs (4, 13, 20)
Choice 7  Cost 67  Arcs (0, 2, 3, 8, 14, 18, 22)
Choice 8  Cost 81  Arcs (4, 13, 23, 41)
Choice 9  Cost 89  Arcs (0, 2, 3, 17, 25, 42)
Choice 10  Cost 90  Arcs (0, 2, 3, 17, 25, 43)
Choice 11  Cost 99  Arcs (0, 5, 9, 15, 31, 39)
Choice 12  Cost 101  Arcs (0, 2, 3, 17, 30, 49)
Choice 13  Cost 106  Arcs (0, 5, 9, 15, 31, 46)
Choice 14  Cost 110  Arcs (0, 5, 9, 15, 16, 27, 38)
Choice 15  Cost 111  Arcs (0, 5, 9, 15, 31, 51)
Choice 16  Cost 114  Arcs (4, 13, 23, 34, 40)
Choice 17  Cost 117  Arcs (0, 2, 3, 17, 25, 26, 44)
Choice 18  Cost 121  Arcs (4, 13, 23, 34, 47)
Choice 19  Cost 122  Arcs (4, 13, 23, 34, 48)
Choice 20  Cost 130  Arcs (4, 13, 28, 35, 50)
Choice 21  Cost 132  Arcs (4, 13, 28, 35, 52)
Choice 22  Cost 133  Arcs (4, 13, 28, 35, 53)
Choice 23  Cost 138  Arcs (0, 2, 3, 17, 30, 36, 50)
Choice 24  Cost 140  Arcs (0, 2, 3, 17, 30, 36, 52)
Choice 25  Cost 141  Arcs (0, 2, 3, 17, 30, 36, 53)
Choice 26  Cost 143  Arcs (0, 5, 9, 15, 16, 27, 32, 39)
Choice 27  Cost 150  Arcs (0, 5, 9, 15, 16, 27, 32, 46)
Choice 28  Cost 151  Arcs (0, 2, 3, 17, 25, 26, 33, 45)
Choice 29  Cost 155  Arcs (0, 5, 9, 15, 16, 27, 32, 51)
Choice 30  Cost 193  Arcs (0, 2, 3, 17, 25, 26, 33, 37, 50)
Choice 31  Cost 195  Arcs (0, 2, 3, 17, 25, 26, 33, 37, 52)
Choice 32  Cost 196  Arcs (0, 2, 3, 17, 25, 26, 33, 37, 53)
Choice 33  Cost 204  Arcs (0, 5, 9, 15, 16, 21, 24, 29, 35, 50)
Choice 34  Cost 206  Arcs (0, 5, 9, 15, 16, 21, 24, 29, 35, 52)
Choice 35  Cost 207  Arcs (0, 5, 9, 15, 16, 21, 24, 29, 35, 53)
>>> def cost_func1(label): return label % 4 + 1
>>> def countem(lat):
...   #return lat.num_nodes, lat.num_arcs
...   min_cost, max_cost = sys.maxint, 0
...   min_len, max_len = sys.maxint, 0
...   prev_cost = 0
...   for count, (cost, arcs, _) in enumerate(lat.iterpaths_rea(cost_func1)):
...     assert cost >= prev_cost
...     prev_cost = cost
...     if cost < min_cost: min_cost = cost
...     if cost > max_cost: max_cost = cost
...     if len(arcs) < min_len: min_len = len(arcs)
...     if len(arcs) > max_len: max_len = len(arcs)
...   return min_cost, max_cost, min_len, max_len, count
>>> lat = make_random_lattice(55, 2, 3, 750, node_labeler=1, seed=13)
>>> #lat.dot_display(globals=['rankdir=LR'])
>>> countem(lat)
(32, 57, 14, 22, 1535)
>>> countem(make_random_lattice(150, 2, 10, 900, node_labeler=2, seed=0))
(31, 72, 13, 24, 3926)
>>> countem(make_random_lattice(100, 2, 4, 600, node_labeler=2, seed=0))
(43, 85, 18, 32, 8779)
>>> countem(make_random_lattice(100, 3, 5, 500, node_labeler=2, seed=0))
(33, 77, 15, 28, 47151)
>>> countem(make_random_lattice(40, 7, 10, 800, node_labeler=4, seed=0))
(10, 45, 5, 17, 2994)
>>> countem(make_random_lattice(50, 2, 15, 1000, node_labeler=2, seed=1))
(14, 21, 6, 9, 3)
>>> countem(make_random_lattice(50, 2, 15, 1000, seed=1))
(13, 23, 7, 10, 7)
>>> countem(make_random_lattice(50, 3, 10, 400, node_labeler=2, seed=1))
(13, 24, 6, 11, 3)
>>> countem(make_random_lattice(60, 2, 3, 700, node_labeler=2, seed=0))
(33, 65, 16, 24, 4057)
>>> countem(make_random_lattice(60, 2, 3, 700, node_labeler=2, seed=0))
(33, 65, 16, 24, 4057)
>>> countem(make_random_lattice(500, 3, 10, 700, node_labeler=4, seed=1)) 
>>> countem(make_random_lattice(250, 6, 25, 600, node_labeler=50, seed=0)) 
>>> countem(make_random_lattice(500, 4, 7, 700, node_labeler=8, seed=1)) 
>>> countem(make_random_lattice(500, 8, 8, 800, node_labeler=6, seed=1)) 
>>> lat = FrozenGraph(GraphTables(((16, 17, 10, 11, 12, 14, 22, 4, 5, 6, 0, 23, 3),
...                                (0, 2, 4, 0, 7, 3, 9, 10, 3, 8, 1, 3, 10, 6, 10, 12, 5),
...                                (1, 3, 5, 6, 8, 4, 2, 8, 6, 9, 6, 5, 7, 11, 12, 7, 0),
...                                (100, 100, 25, 100, 100, 100, 10, 10, 10, 100, 100, 10, 14, 30, 14, 100, 100))))
>>> #lat.dot_display(globals=['rankdir=LR'])
>>> for num_choice, (cost, arc_seq, _) in enumerate(lat.iterpaths_rea(cost_func1)):
...     print 'Choice %d  Cost %d  Arcs %s' %(num_choice, cost, arc_seq)
...     label_seq = tuple(lat.get_arc_label(arc_id) for arc_id in arc_seq)
Choice 0  Cost 14  Arcs (7, 9, 6, 1, 8, 13)
Choice 1  Cost 15  Arcs (12, 4, 9, 6, 1, 8, 13)
Choice 2  Cost 16  Arcs (7, 9, 6, 1, 11, 16, 3, 13)
Choice 3  Cost 16  Arcs (7, 9, 6, 1, 5, 2, 16, 3, 13)
Choice 4  Cost 16  Arcs (14, 15, 4, 9, 6, 1, 8, 13)
Choice 5  Cost 17  Arcs (12, 4, 9, 6, 1, 5, 2, 16, 3, 13)
Choice 6  Cost 17  Arcs (12, 4, 9, 6, 1, 11, 16, 3, 13)
Choice 7  Cost 17  Arcs (7, 9, 6, 1, 11, 16, 0, 10, 13)
Choice 8  Cost 17  Arcs (7, 9, 6, 1, 5, 2, 16, 0, 10, 13)
Choice 9  Cost 18  Arcs (14, 15, 4, 9, 6, 1, 5, 2, 16, 3, 13)
Choice 10  Cost 18  Arcs (14, 15, 4, 9, 6, 1, 11, 16, 3, 13)
Choice 11  Cost 18  Arcs (12, 4, 9, 6, 1, 5, 2, 16, 0, 10, 13)
Choice 12  Cost 18  Arcs (12, 4, 9, 6, 1, 11, 16, 0, 10, 13)
Choice 13  Cost 19  Arcs (14, 15, 4, 9, 6, 1, 5, 2, 16, 0, 10, 13)
Choice 14  Cost 19  Arcs (14, 15, 4, 9, 6, 1, 11, 16, 0, 10, 13)
node_labels_are_node_ids

Return True if each node label is the index of the node.

num_arcs

The number of arcs in the graph.

num_endnodes

Number of end nodes in the graph.

num_nodes

The number of nodes in the graph.

num_startnodes

Number of start nodes in the graph.

startnodes

A tuple of the indices of the starting-node ids. These are nodes with no incoming arcs.

strict_is_cyclic

Return True if the graph contains one or more cycle of any kind, False if the graph is strictly acyclic (strictly a DAG). For the purposes of this function, a multinode cycle or a self loop constitutes cyclicity. See also has_self_loop() and has_cycle().

terminals

A pair of tuples, the startnodes and the endnodes.

text_iter()

Returns a generator that yields lines of text, including newlines. The text represents the graph in a human-readable, serializable form.

verify()
class onyx.graph.graphtools.FrozenGraphArcNodeStructure(num_nodes=None, arc_data=None)

Bases: onyx.graph.graphtools.GraphArcNodeStructure

static create(num_nodes, arc_data)
end_nodes

A sorted tuple of the of end nodes in the graph. A end node is a node that has no arc leading from it to another node.

has_self_loop

True if there is any node in the graph with an arc to itself, False if there is no node in the graph with an arc to itself.

iter_arc_ids

An iterator that enumerates valid arc identifiers in the graph. It’s like xrange(graph.num_arcs), except that for mutable subclasses, the iterator correctly iterates over identifiers for arcs that are added while the iteration is active.

iter_arcs

An iterator that enumerates the arcs in the graph in their arc identifier order. It yields a pair, (start_node_id, end_node_id), for each arc in the graph. For mutable subclasses, the iterator correctly iterates over arcs that are added while the iteration is active.

iter_node_ids

An iterator that enumerates valid node identifiers in the graph. It’s like xrange(graph.num_nodes), except that for mutable subclasses, the iterator correctly iterates over identifiers for nodes that are added while the iteration is active.

num_arcs

The number of arcs in the graph.

num_end_nodes

Number of end nodes in the graph. A end node is a node that has no arc leading from it to another node.

num_nodes

The number of nodes in the graph.

num_start_nodes

Number of start nodes in the graph. A start node is a node that has no arc from another node ending on it.

self_loops

A sorted tuple of the nodes in the graph, each of which has one or more arcs to itself.

start_nodes

A sorted tuple of the start nodes in the graph. A start node is a node that has no arc from another node ending on it.

verify()
class onyx.graph.graphtools.FrozenSubgraphs(subgraphs)

Bases: onyx.graph.graphtools._SubgraphsGetter

An immutable collection of subgraph specifiers.

>>> sb = SubgraphsBuilder()
>>> sb.add_subgraph(xrange(10), xrange(2), xrange(3, 5))
0
>>> sb.add_subgraph((47, 49, 51, 55), (47,), ())
1
>>> FrozenSubgraphs(sb).get_subgraph(0)
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1), (3, 4))
get_subgraph(index)

Returns a triple of tuples: (nodes, starts, ends) representing the subgraph at index.

Raises ValueError if index is not a valid indentifier for a subgraph.

class onyx.graph.graphtools.GraphArcNodeStructure

Bases: onyx.graph.graphtools.GraphArcStructure

A baseclass for graphical objects. In addition to being a GraphArcStructure it provides property access to derived information about distinguished nodes in the graph.

Subclasses must initialize self._node_data with three sets: a set of the identifiers for nodes that have self loops, a set of the identifiers for the starting nodes, and a set of the identifiers for the ending nodes of the graph. If the subclass is mutable, it must maintain these three sets.

static create(num_nodes, arc_data)
end_nodes

A sorted tuple of the of end nodes in the graph. A end node is a node that has no arc leading from it to another node.

has_self_loop

True if there is any node in the graph with an arc to itself, False if there is no node in the graph with an arc to itself.

iter_arc_ids

An iterator that enumerates valid arc identifiers in the graph. It’s like xrange(graph.num_arcs), except that for mutable subclasses, the iterator correctly iterates over identifiers for arcs that are added while the iteration is active.

iter_arcs

An iterator that enumerates the arcs in the graph in their arc identifier order. It yields a pair, (start_node_id, end_node_id), for each arc in the graph. For mutable subclasses, the iterator correctly iterates over arcs that are added while the iteration is active.

iter_node_ids

An iterator that enumerates valid node identifiers in the graph. It’s like xrange(graph.num_nodes), except that for mutable subclasses, the iterator correctly iterates over identifiers for nodes that are added while the iteration is active.

num_arcs

The number of arcs in the graph.

num_end_nodes

Number of end nodes in the graph. A end node is a node that has no arc leading from it to another node.

num_nodes

The number of nodes in the graph.

num_start_nodes

Number of start nodes in the graph. A start node is a node that has no arc from another node ending on it.

self_loops

A sorted tuple of the nodes in the graph, each of which has one or more arcs to itself.

start_nodes

A sorted tuple of the start nodes in the graph. A start node is a node that has no arc from another node ending on it.

verify()

Verify the object’s invariants.

class onyx.graph.graphtools.GraphArcStructure

Bases: object

A baseclass for graphical objects. It provides property access to the lowest-level structural information about the graph.

Subclasses must initialize self._num_nodes with the number of nodes, and self._arc_data with a pair of sequences representing the starting and ending nodes of the arcs in the graph. This baseclass is agnostic as to the types of the sequences. However, if the subclass is mutable, e.g. the sequences are lists, it must maintain the invariants involving self._num_nodes and self._arc_data that the private _verify method successfully checks the invariants.

static create(num_nodes=None, arc_data=None)
iter_arc_ids

An iterator that enumerates valid arc identifiers in the graph. It’s like xrange(graph.num_arcs), except that for mutable subclasses, the iterator correctly iterates over identifiers for arcs that are added while the iteration is active.

iter_arcs

An iterator that enumerates the arcs in the graph in their arc identifier order. It yields a pair, (start_node_id, end_node_id), for each arc in the graph. For mutable subclasses, the iterator correctly iterates over arcs that are added while the iteration is active.

iter_node_ids

An iterator that enumerates valid node identifiers in the graph. It’s like xrange(graph.num_nodes), except that for mutable subclasses, the iterator correctly iterates over identifiers for nodes that are added while the iteration is active.

num_arcs

The number of arcs in the graph.

num_nodes

The number of nodes in the graph.

verify()

Verify the object’s invariants.

class onyx.graph.graphtools.GraphBase(seqs)

Bases: object

Base class for graph work, contains utility and verification mixins.

dot_display(temp_file_prefix='graphtools_', display_tool_format='open -a /Applications/Graphviz.app %s', **kwargs)

Display a dot-generated representation of the graph. Returns the name of the temporary file that the display tool is working from. The caller is responsible for removing this file. See also dot_iter().

Optional temp_file_prefix is the prefix used for the temporary filename.

Optional display_tool_format is formatting string, with %s where the filename goes, used to generate the command that will display the file. By default it assumes you’re on a Mac and have Graphviz.app installed in the /Applications directory.

Remaining keyword arguments are handed to the dot_iter function.

dot_iter(graph_label='', graph_type='digraph', globals=(), node_label_callback=<function <lambda> at 0x1d06a5f0>, node_attributes_callback=None, arc_label_callback=<function <lambda> at 0x1d06a5f0>, arc_attributes_callback=None, arc_ports_callback=None)

Returns a generator that yields lines of text, including newlines. The text represents the graph in the DOT language for which there are many displayers. See also dot_display().

Optional argument graph_label is a string label for the graph. Optional argument graph_type defaults to ‘digraph’, can also be ‘graph’.

Optional globals is an iterable that yields strings to put in the globals section of the DOT file.

Optional node_label_callback is a function that will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs). The callback function should return a string which will be used as the text label for the node in the figure, or None or the empty string for no label. If this argument is not given, each node with a non-None label will be labelled with the str value of its label. If this argument is None, nodes will not be labelled.

Optional arc_label_callback is a function that will be called with the arc label for each arc. The callback function should return a string which will be used as the text label for the arc in the figure, or None or the empty string for no label. If this argument is not given, each arc with a non-None label will be labelled with the str value of its label. If this argument is None, arcs will not be labelled.

Optional node_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the node. It will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs).

Optional arc_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the arc. It will be called with the arc label for each arc in the graph.

Optional arc_ports_callback is a function of two arguments that is called for each arc in the graph. The arguments are the labels of the start and end nodes of the arc. The function returns a pair of strings that name the respective ports of the nodes to which the arc should attach. The strings should be empty if no port is implied.

>>> graph = FrozenGraph(GraphTables(((1, 2, 'a'), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> print ''.join(graph.dot_iter())
digraph  { 
  n0  [label="1"];
  n1  [label="2"];
  n2  [label="a"];
  n0 -> n1  [label="3"];
  n1 -> n0;
  n2 -> n2  [label="5"];
}
<BLANKLINE>
>>> print ''.join(graph.dot_iter(graph_label='Foo',
...                              globals=['size="5,6.125";', 'ordering=out;'],
...                              node_label_callback=lambda label, is_start, is_end: ('<' if is_start else '') + str(label) + ('>' if is_end else ''),
...                              arc_label_callback=str,
...                              node_attributes_callback=lambda label, is_start, is_end: ['shape=%s' % ('octagon' if is_start or is_end else 'ellipse',), 'style=%s' % ('bold' if isinstance(label, int) else 'normal',)],
...                              arc_attributes_callback=lambda label: ['style=%s' % ('bold' if label is None else 'normal',)]))
digraph Foo { 
  label="Foo";
  size="5,6.125";
  ordering=out;
  n0  [label="1", shape=ellipse, style=bold];
  n1  [label="2", shape=ellipse, style=bold];
  n2  [label="<a>", shape=octagon, style=normal];
  n0 -> n1  [label="3", style=normal];
  n1 -> n0  [label="None", style=bold];
  n2 -> n2  [label="5", style=normal];
}
<BLANKLINE>
get_arc(arc_id)

Return a tuple of the (start_node, end_node, arc_label) for the arc associated with the arc_id.

get_arc_label(arc_id)

Return the arc label associated with the arc_id.

get_node_label(node_id)

Return the node label associated with the node_id.

num_arcs

The number of arcs in the graph.

num_nodes

The number of nodes in the graph.

text_iter()

Returns a generator that yields lines of text, including newlines. The text represents the graph in a human-readable, serializable form.

verify()
class onyx.graph.graphtools.GraphBuilder(init_graph=GraphTables(((), (), (), ())))

Bases: onyx.graph.graphtools.GraphBase

A lightweight object used to build a directed graph. An instance of GraphBuilder is used to build a representation of a graph by creating nodes and by creating arcs between these nodes.

Each node and each arc that is added to the graph can have a label. These labels allow the client to arrange the semantic content graphically. Each label can be any immutable (i.e. hashable) Python object. The user of the GraphBuilder applies semantics to the labels, often using the labels as indices or keys into other client data structures. Best practices in data portability are to use either a string or an integer for each label.

If optional init_graph is given its structure and labels will be used to initialize the builder. It must be an instance of GraphBase or GraphTables, otherwise a TypeError is raised.

Once a graph has been built, it is typically used to initialize a FrozenGraph which has many methods for examining structural properties of the graph and for performing iterations and transformations.

>>> gb = GraphBuilder()
>>> n1 = gb.new_node('A')
>>> n2 = gb.new_node('B')
>>> n3 = gb.new_node('C')
>>> a1 = gb.new_arc(n1, n2, 'X')
>>> a2 = gb.new_arc(n2, n3, 'Y')
>>> fg0 = FrozenGraph(gb)
>>> fg0
FrozenGraph(GraphTables((('A', 'B', 'C'), (0, 1), (1, 2), ('X', 'Y'))))
>>> all_nodes, starts, ends = mapped_nodes = gb.add_graph(fg0, lambda x: x+x, lambda x: 3*x)
>>> all_nodes
(3, 4, 5)
>>> starts, ends
((3,), (5,))
>>> fg1 = FrozenGraph(gb)
>>> fg1
FrozenGraph(GraphTables((('A', 'B', 'C', 'AA', 'BB', 'CC'), (0, 1, 3, 4), (1, 2, 4, 5), ('X', 'Y', 'XXX', 'YYY'))))
add_graph(subgraph, node_label_callback=<function <lambda> at 0x1d06bb18>, arc_label_callback=<function <lambda> at 0x1d06bb90>)

Add an entire subgraph to this builder’s graph.

Add all the nodes and edges and their respective labels from subgraph into the builder’s graph.

Returns a triple of tuples representing the nodes of the subgraph within the builder’s graph. The items in the triple are the node indices in the builder of all the nodes from subgraph, the indices of the subgraph’s startnodes, and the indices of the subgraph’s endnodes respectively. Note that these node indices are not necessarily in the same order as they are in subgraph.

add_graph_label_is_id(subgraph)
dot_display(temp_file_prefix='graphtools_', display_tool_format='open -a /Applications/Graphviz.app %s', **kwargs)

Display a dot-generated representation of the graph. Returns the name of the temporary file that the display tool is working from. The caller is responsible for removing this file. See also dot_iter().

Optional temp_file_prefix is the prefix used for the temporary filename.

Optional display_tool_format is formatting string, with %s where the filename goes, used to generate the command that will display the file. By default it assumes you’re on a Mac and have Graphviz.app installed in the /Applications directory.

Remaining keyword arguments are handed to the dot_iter function.

dot_iter(graph_label='', graph_type='digraph', globals=(), node_label_callback=<function <lambda> at 0x1d06a5f0>, node_attributes_callback=None, arc_label_callback=<function <lambda> at 0x1d06a5f0>, arc_attributes_callback=None, arc_ports_callback=None)

Returns a generator that yields lines of text, including newlines. The text represents the graph in the DOT language for which there are many displayers. See also dot_display().

Optional argument graph_label is a string label for the graph. Optional argument graph_type defaults to ‘digraph’, can also be ‘graph’.

Optional globals is an iterable that yields strings to put in the globals section of the DOT file.

Optional node_label_callback is a function that will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs). The callback function should return a string which will be used as the text label for the node in the figure, or None or the empty string for no label. If this argument is not given, each node with a non-None label will be labelled with the str value of its label. If this argument is None, nodes will not be labelled.

Optional arc_label_callback is a function that will be called with the arc label for each arc. The callback function should return a string which will be used as the text label for the arc in the figure, or None or the empty string for no label. If this argument is not given, each arc with a non-None label will be labelled with the str value of its label. If this argument is None, arcs will not be labelled.

Optional node_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the node. It will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs).

Optional arc_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the arc. It will be called with the arc label for each arc in the graph.

Optional arc_ports_callback is a function of two arguments that is called for each arc in the graph. The arguments are the labels of the start and end nodes of the arc. The function returns a pair of strings that name the respective ports of the nodes to which the arc should attach. The strings should be empty if no port is implied.

>>> graph = FrozenGraph(GraphTables(((1, 2, 'a'), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> print ''.join(graph.dot_iter())
digraph  { 
  n0  [label="1"];
  n1  [label="2"];
  n2  [label="a"];
  n0 -> n1  [label="3"];
  n1 -> n0;
  n2 -> n2  [label="5"];
}
<BLANKLINE>
>>> print ''.join(graph.dot_iter(graph_label='Foo',
...                              globals=['size="5,6.125";', 'ordering=out;'],
...                              node_label_callback=lambda label, is_start, is_end: ('<' if is_start else '') + str(label) + ('>' if is_end else ''),
...                              arc_label_callback=str,
...                              node_attributes_callback=lambda label, is_start, is_end: ['shape=%s' % ('octagon' if is_start or is_end else 'ellipse',), 'style=%s' % ('bold' if isinstance(label, int) else 'normal',)],
...                              arc_attributes_callback=lambda label: ['style=%s' % ('bold' if label is None else 'normal',)]))
digraph Foo { 
  label="Foo";
  size="5,6.125";
  ordering=out;
  n0  [label="1", shape=ellipse, style=bold];
  n1  [label="2", shape=ellipse, style=bold];
  n2  [label="<a>", shape=octagon, style=normal];
  n0 -> n1  [label="3", style=normal];
  n1 -> n0  [label="None", style=bold];
  n2 -> n2  [label="5", style=normal];
}
<BLANKLINE>
get_arc(arc_id)

Return a tuple of the (start_node, end_node, arc_label) for the arc associated with the arc_id.

get_arc_label(arc_id)

Return the arc label associated with the arc_id.

get_node_label(node_id)

Return the node label associated with the node_id.

new_arc(start_node, end_node, arclabel=None)

Creates a new arc in the graph connecting the start_node with the end_node, and with an optional arclabel. It is an error if start_node or end_node are not valid node ids created by new_node(). Returns the id of the arc, a non-negative integer.

new_arc_label_is_id(start_node, end_node)

Creates a new arc in the graph, with arclabel being the id of the arc. Returns the id of the arc, a non-negative integer.

new_node(nodelabel=None)

Creates a new node in the graph, with optional nodelabel. Returns the id of the node, a non-negative integer.

new_node_label_is_id()

Creates a new node in the graph, with nodelabel being the id of the node. Returns the id of the node, a non-negative integer.

num_arcs

The number of arcs in the graph.

num_nodes

The number of nodes in the graph.

text_iter()

Returns a generator that yields lines of text, including newlines. The text represents the graph in a human-readable, serializable form.

verify()
class onyx.graph.graphtools.GraphStructureBuilder

Bases: onyx.graph.graphtools.GraphArcNodeStructure

Low level builder for the core structure of a graph.

>>> builder = GraphStructureBuilder()
>>> builder.num_nodes, builder.num_arcs, builder.num_start_nodes, builder.start_nodes, builder.num_end_nodes, builder.end_nodes
(0, 0, 0, (), 0, ())
>>> n0 = builder.new_node()
>>> n1, n2, n3 = builder.new_nodes(3)
>>> builder.num_nodes, builder.num_arcs, builder.num_start_nodes, builder.start_nodes, builder.num_end_nodes, builder.end_nodes
(4, 0, 4, (0, 1, 2, 3), 4, (0, 1, 2, 3))
>>> n0, n2
(0, 2)
>>> a0 = builder.new_arc(n0, n2)
>>> builder.num_nodes, builder.num_arcs, builder.num_start_nodes, builder.start_nodes, builder.num_end_nodes, builder.end_nodes
(4, 1, 3, (0, 1, 3), 3, (1, 2, 3))
>>> a1 = builder.new_arc_safe(n0, 5)
>>> builder.num_nodes, builder.num_arcs, builder.num_start_nodes, builder.start_nodes, builder.num_end_nodes, builder.end_nodes, builder.has_self_loop, builder.self_loops
(6, 2, 4, (0, 1, 3, 4), 5, (1, 2, 3, 4, 5), False, ())
>>> for node in builder.iter_node_ids:
...   print 'arc', n1, '->', node, '', builder.new_arc(n1, node)
...   if node < 3: print 'node', builder.new_node()
arc 1 -> 0  2
node 6
arc 1 -> 1  3
node 7
arc 1 -> 2  4
node 8
arc 1 -> 3  5
arc 1 -> 4  6
arc 1 -> 5  7
arc 1 -> 6  8
arc 1 -> 7  9
arc 1 -> 8  10
>>> builder.num_nodes, builder.num_arcs, builder.num_start_nodes, builder.start_nodes, builder.num_end_nodes, builder.end_nodes, builder.has_self_loop, builder.self_loops
(9, 11, 1, (1,), 7, (2, 3, 4, 5, 6, 7, 8), True, (1,))
>>> assert len(tuple(builder.iter_arc_ids)) == builder.num_arcs
>>> for index, (start_node, end_node) in enumerate(builder.iter_arcs):
...   print 'arc', index, '', start_node, '->', end_node
...   if end_node == 5: print 'arc', builder.new_arc(start_node, end_node - 1)
arc 0  0 -> 2
arc 1  0 -> 5
arc 11
arc 2  1 -> 0
arc 3  1 -> 1
arc 4  1 -> 2
arc 5  1 -> 3
arc 6  1 -> 4
arc 7  1 -> 5
arc 12
arc 8  1 -> 6
arc 9  1 -> 7
arc 10  1 -> 8
arc 11  0 -> 4
arc 12  1 -> 4
>>> builder.verify()
True
>>> builder.start_nodes, builder.end_nodes
((1,), (2, 3, 4, 5, 6, 7, 8))
>>> builder.num_nodes
9
>>> new_starts, new_ends = builder.add_graph(builder)
>>> tuple(sorted(new_starts)), tuple(sorted(new_ends))
((10,), (11, 12, 13, 14, 15, 16, 17))
>>> builder.num_nodes
18
add_graph(other)

Adds another graph, other, to this graph.

Returns a pair, the sets of node_ids in this graph of the start_nodes and end_nodes of other.

Raises TypeError if other is not an instace of GraphArcNodeStructure.

static create(num_nodes, arc_data)
end_nodes

A sorted tuple of the of end nodes in the graph. A end node is a node that has no arc leading from it to another node.

has_self_loop

True if there is any node in the graph with an arc to itself, False if there is no node in the graph with an arc to itself.

iter_arc_ids

An iterator that enumerates valid arc identifiers in the graph. It’s like xrange(graph.num_arcs), except that for mutable subclasses, the iterator correctly iterates over identifiers for arcs that are added while the iteration is active.

iter_arcs

An iterator that enumerates the arcs in the graph in their arc identifier order. It yields a pair, (start_node_id, end_node_id), for each arc in the graph. For mutable subclasses, the iterator correctly iterates over arcs that are added while the iteration is active.

iter_node_ids

An iterator that enumerates valid node identifiers in the graph. It’s like xrange(graph.num_nodes), except that for mutable subclasses, the iterator correctly iterates over identifiers for nodes that are added while the iteration is active.

new_arc(start_node, end_node)

Adds a new arc to the graph from the node identified by start_node to the node identified by end_node.

Returns an identifier for the new arc, a non-negative integer.

Raises ValueError if either start_node or end_node is not a valid node identifier for the graph.

new_arc_safe(start_node, end_node)

Adds a new arc to the graph from the node identified by start_node to the node identified by end_node.

Automatically adds nodes to the graph in order that non-negative start_node and end_node are valid node identifiers.

Returns an identifier for the new arc, a non-negative integer.

Raises ValueError if either of start_node or end_node is not a non-negative integer.

new_node()

Adds a new node to the graph.

Returns an identifier for the new node, a non-negative integer.

new_nodes(count)

Adds count new nodes to the graph.

Returns a tuple of identifiers for the new nodes, where each identifier is a non-negative integer.

num_arcs

The number of arcs in the graph.

num_end_nodes

Number of end nodes in the graph. A end node is a node that has no arc leading from it to another node.

num_nodes

The number of nodes in the graph.

num_start_nodes

Number of start nodes in the graph. A start node is a node that has no arc from another node ending on it.

self_loops

A sorted tuple of the nodes in the graph, each of which has one or more arcs to itself.

start_nodes

A sorted tuple of the start nodes in the graph. A start node is a node that has no arc from another node ending on it.

verify()

Verify the object’s invariants.

class onyx.graph.graphtools.GraphTables(*args)

Bases: tuple

A fixed-size tuple – used for exchanging graph data in a simple, tabular form. Subclassing is discouraged.

With no arguments, constructs an empty set of immutable tables.

>>> GraphTables()
GraphTables(((), (), (), ()))

Otherwise, a single argument is required, and it must be a sequence of four sequences. These four sequences describe the structure of the graph and the labels on the elements of the graph.

Conceptually, there are two tables that make up the description of the graph: - a single column table of node labels - a three column table of arcs: arc startnodes, arc endnodes, arc labels

The identifiers (ids) for the elements of the graph are implicit in these tables; they are the indices into the rows of these two tables. A node id is the index into the column of node labels. An arc id is the index into the rows of arc descriptions.

Each arc startnode and each arc endnode is the index of a node at the end of the arc. That is, each startnode and endnode is a non-negative integer that is less than the number of items in the node-labels column. Each node-label and each arc-label is an arbitrary object.

Consider the following informal description of a graph:
four nodes with labels:
node 0 label 1 node 1 label 2 node 2 label ‘a’ node 3 label ‘b’
three arcs with labels:
arc 0 from node 0 to node 1 label 3 arc 1 from node 1 to node 0 label None arc 2 from node 2 to node 2 label 5

This graph contains two disconnected subgraphs, both of which are cyclic. Here’s the GraphTables for that graph

>>> GraphTables(((1, 2, 'a', 'b'), (0, 1, 2), (1, 0, 2), (3, None, 5)))
GraphTables(((1, 2, 'a', 'b'), (0, 1, 2), (1, 0, 2), (3, None, 5)))

The constructor performs consistency checks on the contents of the argument to the constructor.

Constructor takes zero or one arguments, just like tuple. A common mistake is to unpack the argument:

>>> GraphTables((1, 2, 'a', 'b'), (0, 1, 2), (1, 0, 2), (3, None, 5))
Traceback (most recent call last):
  ...
ValueError: GraphTables() expected 0 or 1 constructor arguments, got 4

The single argument must contain four elements

>>> GraphTables(((1, 2, 'a', 'b'), (0, 1, 2), (1, 0, 2)))
Traceback (most recent call last):
  ...
ValueError: GraphTables(seq) expected 4 items in seq, got 3

The lengths of the columns for the arcs must all be the same:

>>> GraphTables(((1, 2, 3), (0, 1), (1, 0), (3, 4, 4)))
Traceback (most recent call last):
  ...
ValueError: expected equal lengths for arcstartnodes 2, andarcendnodes 2, and arclabels 3
>>> GraphTables(((1, 2, 3), (0,), (1, 0), (3, 4)))
Traceback (most recent call last):
  ...
ValueError: expected equal lengths for arcstartnodes 1, andarcendnodes 2, and arclabels 2
>>> GraphTables(((1, 2, 3), (0, 1), (1, 0, 2), (3, 4)))
Traceback (most recent call last):
  ...
ValueError: expected equal lengths for arcstartnodes 2, andarcendnodes 3, and arclabels 2

The node indices for the arcs must be valid indices into the node-labels column:

>>> GraphTables(((1, 2, 3, 4), (0, 1, 4), ('a', -1, 5), (3, 4, 5)))
Traceback (most recent call last):
  ...
ValueError: expected arcstartnodes and arcendodes to contain non-negative ids less than 4, but also got (-1, 4, 5, 'a')
count

T.count(value) -> integer – return number of occurrences of value

index

T.index(value, [start, [stop]]) -> integer – return first index of value. Raises ValueError if the value is not present.

verify()
class onyx.graph.graphtools.NodeLabelIdGraphBuilder(*_)

Bases: onyx.graph.graphtools.GraphBuilder

Graph builder that uses node labels as node ids. This supports the common situation of building a graph in which each node has a unique label that is easy for the client to use when creating arcs.

Each node label will be a tuple of a single integer

>>> builder = NodeLabelIdGraphBuilder()
>>> nodes = tuple(builder.new_node((x,)) for x in xrange(10))
>>> nodes
((0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,), (9,))

Link the nodes into a chain

>>> i1 = iter(nodes); i1.next()
(0,)
>>> arcs = tuple(builder.new_arc(x, y) for x, y in itertools.izip(nodes, i1))
>>> arcs
(0, 1, 2, 3, 4, 5, 6, 7, 8)

Exercise some error messages

>>> builder.new_node((1,))
Traceback (most recent call last):
  ...
ValueError: NodeLabelIdGraphBuilder: unexpected duplicate node label: (1,)
>>> builder.new_arc((1,), 2)
Traceback (most recent call last):
  ...
ValueError: NodeLabelIdGraphBuilder: expected end_node argument to be an existing node label, got 2
add_graph(subgraph, node_label_callback=<function <lambda> at 0x1d06bb18>, arc_label_callback=<function <lambda> at 0x1d06bb90>)

Add an entire subgraph to this builder’s graph.

Add all the nodes and edges and their respective labels from subgraph into the builder’s graph.

Returns a triple of tuples representing the nodes of the subgraph within the builder’s graph. The items in the triple are the node indices in the builder of all the nodes from subgraph, the indices of the subgraph’s startnodes, and the indices of the subgraph’s endnodes respectively. Note that these node indices are not necessarily in the same order as they are in subgraph.

add_graph_label_is_id(subgraph)
dot_display(temp_file_prefix='graphtools_', display_tool_format='open -a /Applications/Graphviz.app %s', **kwargs)

Display a dot-generated representation of the graph. Returns the name of the temporary file that the display tool is working from. The caller is responsible for removing this file. See also dot_iter().

Optional temp_file_prefix is the prefix used for the temporary filename.

Optional display_tool_format is formatting string, with %s where the filename goes, used to generate the command that will display the file. By default it assumes you’re on a Mac and have Graphviz.app installed in the /Applications directory.

Remaining keyword arguments are handed to the dot_iter function.

dot_iter(graph_label='', graph_type='digraph', globals=(), node_label_callback=<function <lambda> at 0x1d06a5f0>, node_attributes_callback=None, arc_label_callback=<function <lambda> at 0x1d06a5f0>, arc_attributes_callback=None, arc_ports_callback=None)

Returns a generator that yields lines of text, including newlines. The text represents the graph in the DOT language for which there are many displayers. See also dot_display().

Optional argument graph_label is a string label for the graph. Optional argument graph_type defaults to ‘digraph’, can also be ‘graph’.

Optional globals is an iterable that yields strings to put in the globals section of the DOT file.

Optional node_label_callback is a function that will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs). The callback function should return a string which will be used as the text label for the node in the figure, or None or the empty string for no label. If this argument is not given, each node with a non-None label will be labelled with the str value of its label. If this argument is None, nodes will not be labelled.

Optional arc_label_callback is a function that will be called with the arc label for each arc. The callback function should return a string which will be used as the text label for the arc in the figure, or None or the empty string for no label. If this argument is not given, each arc with a non-None label will be labelled with the str value of its label. If this argument is None, arcs will not be labelled.

Optional node_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the node. It will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs).

Optional arc_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the arc. It will be called with the arc label for each arc in the graph.

Optional arc_ports_callback is a function of two arguments that is called for each arc in the graph. The arguments are the labels of the start and end nodes of the arc. The function returns a pair of strings that name the respective ports of the nodes to which the arc should attach. The strings should be empty if no port is implied.

>>> graph = FrozenGraph(GraphTables(((1, 2, 'a'), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> print ''.join(graph.dot_iter())
digraph  { 
  n0  [label="1"];
  n1  [label="2"];
  n2  [label="a"];
  n0 -> n1  [label="3"];
  n1 -> n0;
  n2 -> n2  [label="5"];
}
<BLANKLINE>
>>> print ''.join(graph.dot_iter(graph_label='Foo',
...                              globals=['size="5,6.125";', 'ordering=out;'],
...                              node_label_callback=lambda label, is_start, is_end: ('<' if is_start else '') + str(label) + ('>' if is_end else ''),
...                              arc_label_callback=str,
...                              node_attributes_callback=lambda label, is_start, is_end: ['shape=%s' % ('octagon' if is_start or is_end else 'ellipse',), 'style=%s' % ('bold' if isinstance(label, int) else 'normal',)],
...                              arc_attributes_callback=lambda label: ['style=%s' % ('bold' if label is None else 'normal',)]))
digraph Foo { 
  label="Foo";
  size="5,6.125";
  ordering=out;
  n0  [label="1", shape=ellipse, style=bold];
  n1  [label="2", shape=ellipse, style=bold];
  n2  [label="<a>", shape=octagon, style=normal];
  n0 -> n1  [label="3", style=normal];
  n1 -> n0  [label="None", style=bold];
  n2 -> n2  [label="5", style=normal];
}
<BLANKLINE>
get_arc(arc_id)

Return a tuple of the (start_node, end_node, arc_label) for the arc associated with the arc_id.

get_arc_label(arc_id)

Return the arc label associated with the arc_id.

get_node_label(node_id)

Return the node label associated with the node_id.

new_arc(start_node, end_node, arclabel=None)

Creates a new arc in the graph connecting the start_node with the end_node, and with an optional arclabel. It is an error if start_node or end_node are not valid node labels for nodes created by new_node(). Returns the id of the arc, a non-negative integer.

new_arc_label_is_id()
new_node(nodelabel)

Creates a new node in the graph, with nodelabel, which must be immutable and unique across the labels of all nodes. Returns the label.

new_node_label_is_id()
num_arcs

The number of arcs in the graph.

num_nodes

The number of nodes in the graph.

text_iter()

Returns a generator that yields lines of text, including newlines. The text represents the graph in a human-readable, serializable form.

verify()
class onyx.graph.graphtools.SerializedGraphTables(*args)

Bases: onyx.graph.graphtools.GraphTables

GraphTables object built from the Yaml representation in a stream.

>>> doc = '''
... - __onyx_yaml__meta_version : '1'
...   __onyx_yaml__stream_type : SerializedGraphTables
...   __onyx_yaml__stream_version : '0'
...
... -
...   # canonic format for SerializedGraphTables
...   - num_nodes 3
...   # fields are: node-id node-label
...   - 0  -1
...   - 1  nil
...   - 2  end
...   - num_arcs 3
...   # fields are: arc-id start-node-id end-node-id arc-label
...   - 0  0 1  -1
...   - 1  0 2  -2
...   - 2  1 2  -3
... '''
>>> SerializedGraphTables(doc)
GraphTables((('-1', 'nil', 'end'), (0, 0, 1), (1, 2, 2), ('-1', '-2', '-3')))
>>> # FrozenGraph(SerializedGraphTables(doc)).dot_display(globals=['rankdir=LR'])
>>> SerializedGraphTables(doc) == GraphTables((('-1', 'nil', 'end'), (0, 0, 1), (1, 2, 2), ('-1', '-2', '-3')))
True
count

T.count(value) -> integer – return number of occurrences of value

index

T.index(value, [start, [stop]]) -> integer – return first index of value. Raises ValueError if the value is not present.

verify()
class onyx.graph.graphtools.SerializedGraphTables2(*args)

Bases: onyx.graph.graphtools.GraphTables

GraphTables object built from the Yaml representation in a stream.

>>> doc = '''
... - __onyx_yaml__meta_version : '1'
...   __onyx_yaml__stream_type : OnyxText
...   __onyx_yaml__stream_version : '0'
...
... -
...   # canonic OnyxText format for SerializedGraphTables2
...   - stream_type OnyxText stream_version 0 data_type SerializedGraphTables2 data_version 0
...   - num_nodes 3
...   - Nodes IndexedCollection Node 3
...   # fields are: 'Node' node-id node-label
...   - Node 0  -1
...   - Node 1  nil
...   - Node 2  end
...   - num_arcs 3
...   - Arcs IndexedCollection Arc 3
...   # fields are: 'Arc' arc-id start-node-id end-node-id arc-label
...   - Arc 0  0 1  -1
...   - Arc 1  0 2  -2
...   - Arc 2  1 2  -3
... '''
>>> sgt0 = SerializedGraphTables2(doc)
>>> sgt0
GraphTables((('-1', 'nil', 'end'), (0, 0, 1), (1, 2, 2), ('-1', '-2', '-3')))
>>> SerializedGraphTables2(doc) == GraphTables((('-1', 'nil', 'end'), (0, 0, 1), (1, 2, 2), ('-1', '-2', '-3')))
True
>>> out_stream = cStringIO.StringIO()
>>> sgt0.serialize(out_stream)
>>> print(out_stream.getvalue())
---
- __onyx_yaml__stream_version: "0"
  __onyx_yaml__meta_version: "1"
  __onyx_yaml__stream_type: "OnyxText"
- - stream_type OnyxText stream_version 0 data_type SerializedGraphTables2 data_version 0
  - num_nodes 3
  - Nodes IndexedCollection Node 3
  - Node 0 -1
  - Node 1 nil
  - Node 2 end
  - num_arcs 3
  - Arcs IndexedCollection Arc 3
  - Arc 0 0 1 -1
  - Arc 1 0 2 -2
  - Arc 2 1 2 -3
<BLANKLINE>
>>> out_stream.seek(0)
>>> sgt1 = SerializedGraphTables2(out_stream)
>>> sgt1 == sgt0 == GraphTables((('-1', 'nil', 'end'), (0, 0, 1), (1, 2, 2), ('-1', '-2', '-3')))
True
count

T.count(value) -> integer – return number of occurrences of value

index

T.index(value, [start, [stop]]) -> integer – return first index of value. Raises ValueError if the value is not present.

serialize(stream)

Serialize this graph into a YAML stream. The resulting text is human-readable and matches the format used to create a SerializedGraphTables2.

tuple_iter()

Returns a generator that yields tuples of tokens. The tuples represent the graph in a form serializable by YamldataWriter.

verify()
class onyx.graph.graphtools.SetGraphBuilder(*_)

Bases: onyx.graph.graphtools.GraphBuilder

A specialization of GraphBuilder with set semantics for nodes and arcs. A Node is unique based on its label. An Arc is unique based on its end nodes and its label. As with set objects, adding a node or arc more than once does not change the graph.

The constructor takes and optional init_graph argument, a subclass of GraphBase or GraphTables. If given, init_graph is used to initialize the instance. It is an error if init_graph has multiple nodes with the same label or multiple (start_node, end_node, arc_label) triples.

>>> b = SetGraphBuilder()
>>> b.add_node('A')
'A'
>>> b.add_node('B')
'B'
>>> b.num_nodes
2
>>> b.add_node('A')
'A'
>>> b.num_nodes
2
>>> b.add_arc('A', 'B', 'X')
0
>>> b.add_arc('A', 'B', 'Y')
1
>>> b.num_arcs
2
>>> b.add_arc('A', 'B', 'X')
0
>>> b.num_arcs
2

Nodes are added automatically by add_arc.

>>> b.add_arc('AA', 'BB', 'XX')
2
>>> b.num_nodes
4
>>> b.num_arcs
3

Use update_arcs() to add a bunch of arcs

>>> b.update_arcs(itertools.izip(xrange(20), xrange(5, 25), itertools.count()))
>>> b.num_nodes, b.num_arcs
(29, 23)
>>> g = FrozenGraph(b)
>>> #g.dot_display(globals=('label="SetGraphBuilder";', 'labelloc=top;',))
>>> g.get_forward_transitive_closures(['A'])
(set(['B']), set([0, 1]))
>>> g.get_backward_transitive_closures(['A'], force_reflexive=True)
(set(['A']), set([]))

Initialize from a FrozenGraph and continue building

>>> b1 = SetGraphBuilder(FrozenGraph(g))
>>> b == b1
True

Comparisons

>>> b1.add_arc(1,2,3)
23
>>> b1 == b
False
>>> b1 != b
True
>>> b.add_arc(0,1,2)
23
>>> b1 == b
False
>>> b.add_arc(1,2,3)
24
>>> b1 == b
False
>>> b1.add_arc(0,1,2)
24
>>> b1 == b
True
>>> b1 != b
False

It’s an error to try to initialize from a graph with non-unique node labels:

>>> gb = GraphBuilder(g)
>>> id = gb.new_node('A')
>>> b = SetGraphBuilder(FrozenGraph(gb))
Traceback (most recent call last):
...
ValueError: SetGraphBuilder must be initialized with a graph having unique node labels

It’s an error to try to initialize from a graph with non-unique (start_node, end_node, arc_label) triples:

>>> gb = GraphBuilder(g)
>>> id = gb.new_arc(2, 3, 'XX')
>>> b = SetGraphBuilder(FrozenGraph(gb))
Traceback (most recent call last):
...
ValueError: SetGraphBuilder must be initialized with a graph having distinct (start_node_label, end_node_label, arc_label) triples
add_arc(start_node, end_node, arclabel=None)

Add an arc to the graph connecting the start_node with the end_node, and with an optional arclabel. If not already present in the graph, each of start_node or end_node will be added to the graph as if add_node() had been called.

Returns the id of the arc, a non-negative integer. The arcs in the graph are a set, so add_arc() is idempotent; additional calls to add_arc() with the same arguments will not create additional arcs.

add_graph(subgraph, node_label_callback=<function <lambda> at 0x1d06bb18>, arc_label_callback=<function <lambda> at 0x1d06bb90>)

Add an entire subgraph to this builder’s graph.

Add all the nodes and edges and their respective labels from subgraph into the builder’s graph.

Returns a triple of tuples representing the nodes of the subgraph within the builder’s graph. The items in the triple are the node indices in the builder of all the nodes from subgraph, the indices of the subgraph’s startnodes, and the indices of the subgraph’s endnodes respectively. Note that these node indices are not necessarily in the same order as they are in subgraph.

add_graph_label_is_id(subgraph)
add_node(node_label)

Adds a node in the graph, with node_label, which must be immutable.

Returns node_label as the node id for reference to the node in calls to add_arc(). The nodes in the graph are a set, so add_node() is idempotent; additional calls to add_node() with the same node_label will not create additional nodes.

dot_display(temp_file_prefix='graphtools_', display_tool_format='open -a /Applications/Graphviz.app %s', **kwargs)

Display a dot-generated representation of the graph. Returns the name of the temporary file that the display tool is working from. The caller is responsible for removing this file. See also dot_iter().

Optional temp_file_prefix is the prefix used for the temporary filename.

Optional display_tool_format is formatting string, with %s where the filename goes, used to generate the command that will display the file. By default it assumes you’re on a Mac and have Graphviz.app installed in the /Applications directory.

Remaining keyword arguments are handed to the dot_iter function.

dot_iter(graph_label='', graph_type='digraph', globals=(), node_label_callback=<function <lambda> at 0x1d06a5f0>, node_attributes_callback=None, arc_label_callback=<function <lambda> at 0x1d06a5f0>, arc_attributes_callback=None, arc_ports_callback=None)

Returns a generator that yields lines of text, including newlines. The text represents the graph in the DOT language for which there are many displayers. See also dot_display().

Optional argument graph_label is a string label for the graph. Optional argument graph_type defaults to ‘digraph’, can also be ‘graph’.

Optional globals is an iterable that yields strings to put in the globals section of the DOT file.

Optional node_label_callback is a function that will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs). The callback function should return a string which will be used as the text label for the node in the figure, or None or the empty string for no label. If this argument is not given, each node with a non-None label will be labelled with the str value of its label. If this argument is None, nodes will not be labelled.

Optional arc_label_callback is a function that will be called with the arc label for each arc. The callback function should return a string which will be used as the text label for the arc in the figure, or None or the empty string for no label. If this argument is not given, each arc with a non-None label will be labelled with the str value of its label. If this argument is None, arcs will not be labelled.

Optional node_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the node. It will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs).

Optional arc_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the arc. It will be called with the arc label for each arc in the graph.

Optional arc_ports_callback is a function of two arguments that is called for each arc in the graph. The arguments are the labels of the start and end nodes of the arc. The function returns a pair of strings that name the respective ports of the nodes to which the arc should attach. The strings should be empty if no port is implied.

>>> graph = FrozenGraph(GraphTables(((1, 2, 'a'), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> print ''.join(graph.dot_iter())
digraph  { 
  n0  [label="1"];
  n1  [label="2"];
  n2  [label="a"];
  n0 -> n1  [label="3"];
  n1 -> n0;
  n2 -> n2  [label="5"];
}
<BLANKLINE>
>>> print ''.join(graph.dot_iter(graph_label='Foo',
...                              globals=['size="5,6.125";', 'ordering=out;'],
...                              node_label_callback=lambda label, is_start, is_end: ('<' if is_start else '') + str(label) + ('>' if is_end else ''),
...                              arc_label_callback=str,
...                              node_attributes_callback=lambda label, is_start, is_end: ['shape=%s' % ('octagon' if is_start or is_end else 'ellipse',), 'style=%s' % ('bold' if isinstance(label, int) else 'normal',)],
...                              arc_attributes_callback=lambda label: ['style=%s' % ('bold' if label is None else 'normal',)]))
digraph Foo { 
  label="Foo";
  size="5,6.125";
  ordering=out;
  n0  [label="1", shape=ellipse, style=bold];
  n1  [label="2", shape=ellipse, style=bold];
  n2  [label="<a>", shape=octagon, style=normal];
  n0 -> n1  [label="3", style=normal];
  n1 -> n0  [label="None", style=bold];
  n2 -> n2  [label="5", style=normal];
}
<BLANKLINE>
get_arc(arc_id)

Return a tuple of the (start_node, end_node, arc_label) for the arc associated with the arc_id.

get_arc_label(arc_id)

Return the arc label associated with the arc_id.

get_node_label(node_id)

Return the node label associated with the node_id.

static init_from_arcs(arcs_iter)

Create and initialize a SetGraphBuilder instance using the arc specs in arcs_iter. See also method update_arcs().

new_arc(start_node, end_node, arclabel=None)

Raises NotImplementedError. Use add_arce() instead.

new_arc_label_is_id(start_node, end_node)

Creates a new arc in the graph, with arclabel being the id of the arc. Returns the id of the arc, a non-negative integer.

new_node(node_label=None)

Raises NotImplementedError. Use add_node() instead.

new_node_label_is_id()

Creates a new node in the graph, with nodelabel being the id of the node. Returns the id of the node, a non-negative integer.

num_arcs

The number of arcs in the graph.

num_nodes

The number of nodes in the graph.

text_iter()

Returns a generator that yields lines of text, including newlines. The text represents the graph in a human-readable, serializable form.

update_arcs(arc_iter)

Update the graph with the arcs from arc_iter. Each item from arc_iter is a sequence that specifies an arc, either a (start_node, end_node) pair or a (start_node, end_node, arc_label) triple. Each arc is added to the graph. See method add_arc() for futher details about adding an arc to the graph.

verify()
class onyx.graph.graphtools.SubgraphsBuilder(init_subgraphs=())

Bases: onyx.graph.graphtools._SubgraphsGetter

Lightweight object for keeping track of subgraphs of a graph.

A given subgraph is specified by a triple of sets, (nodes, starts, ends): where nodes is the set of immutable identifiers for all the nodes in the subgraph, starts is the set identifiers for what are to be considered the starting nodes of the subgraph, and ends is the set identifiers for what are to be considered the ending nodes of the subgraph. An instance of SubgraphsBuilder manages a growable collection of such triples, making sure for each subgraph that starts and ends are subsets of nodes.

If optional init_subgraphs is provided it must either be an instance of SubgraphsBuilder or an iterable that yields subgraph-specifying triples. It will be used to initialize the subgraphs in this instance.

>>> sb = SubgraphsBuilder()
>>> sb.add_subgraph(xrange(10), xrange(2), xrange(3, 5))
0
>>> sb.add_subgraph((47, 49, 51, 55), (47,), ())
1
>>> sb.get_subgraph(0)
((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1), (3, 4))
>>> sb2 = SubgraphsBuilder(sb)
>>> id = sb2.add_subgraph('abcdef', 'ac', 'de')
>>> id + 1 == len(sb2)
True
>>> sb2.get_subgraph(id)
(('a', 'b', 'c', 'd', 'e', 'f'), ('a', 'c'), ('d', 'e'))
>>> sb.add_subgraph(xrange(10), xrange(2), xrange(8, 12))
Traceback (most recent call last):
  ...
ValueError: at index 2, expected starts and ends to be subsets of nodes, but got these too: 10 11
add_subgraph(nodes, starts, ends)

Add a new subgraph specification. Each of nodes, starts, and ends is an iterable collection of immutable node specifiers for, respectively, all the nodes in the subgraph, the starting nodes of the subgraph, and the ending nodes of the subgraph.

Returns an identifier that can be used in a call to get_subgraph() to retrieve the subgraph specification.

Raises TypeError if any item in starts, ends, or nodes is a mutable object. Raises ValueError if starts or ends are not a subset of nodes.

get_subgraph(index)

Returns a triple of tuples: (nodes, starts, ends) representing the subgraph at index.

Raises ValueError if index is not a valid indentifier for a subgraph.

class onyx.graph.graphtools.TopologicalGraphBuilder(init=GraphTables(((), (), (), ())), allow_self_loops=False)

Bases: onyx.graph.graphtools.GraphBuilder

Used to build an directed acyclic graph (DAG) by requiring a topologically ordered creation of the arcs in the graph.

The user is responsible for ensuring that the arcs are created according to a topological ordering relation for each of the nodes and for each of the arcs. For nodes this means that the start node for an arc must have been created prior to the creation of the end node for an arc. For arcs this means that all of the arcs that end on a given node must have been created prior to the creation of any arc that leaves that node.

Each of these constraints means that there will be no self loops.

This is a subclass of GraphBuilder and has the same interface.

Raises ValueError if an attempt is made to create an arc that violates the topological ordering constraints for nodes or arcs.

>>> builder = TopologicalGraphBuilder()

Do stuff topologically

>>> start = builder.new_node()
>>> mid1 = builder.new_node()
>>> mid2 = builder.new_node()
>>> end = builder.new_node()
>>> assert start < mid1 < mid2 < end
>>> l1 = builder.new_arc(start, mid2)
>>> l2 = builder.new_arc(start, end)
>>> l3 = builder.new_arc(mid2, end)
>>> assert l1 < l2 < l3
>>> g = FrozenGraph(builder)
>>> g.has_self_loop, g.has_cycle, g.is_strict_dag
(False, False, True)

Copy construction examples. These verify the constraints on the constructor argument.

>>> TopologicalGraphBuilder(builder)
TopologicalGraphBuilder(GraphTables(([None, None, None, None], [0, 0, 2], [2, 3, 3], [None, None, None])))
>>> TopologicalGraphBuilder(GraphTables(([None, None, None, None], [0, 0, 2], [2, 3, 3], [None, None, None])))
TopologicalGraphBuilder(GraphTables(([None, None, None, None], [0, 0, 2], [2, 3, 3], [None, None, None])))

Allow self-loops.

>>> builder = TopologicalGraphBuilder(allow_self_loops=True)
>>> num_nodes = 5
>>> nodes = tuple(builder.new_node_label_is_id() for i in xrange(num_nodes))

Self loop must be the first outgoing arc on a node. These are OK.

>>> tuple(builder.new_arc_label_is_id(node, node) for node in nodes)
(0, 1, 2, 3, 4)

This one isn’t.

>>> builder.new_arc_label_is_id(0, 0)
Traceback (most recent call last):
  ...
ValueError: arc violates the topological ordering because the end node, 0, already has at least one outgoing arc: failed arc (0, 0, <int>)

Another example showing the failure if the self loop is not the first outgoing arc.

>>> node1, node2 = builder.new_node_label_is_id(), builder.new_node_label_is_id()
>>> builder.new_arc_label_is_id(node1, node2)
5
>>> builder.new_arc_label_is_id(node2, node2)
6
>>> builder.new_arc_label_is_id(node1, node1)
Traceback (most recent call last):
  ...
ValueError: arc violates the topological ordering because the end node, 5, already has at least one outgoing arc: failed arc (5, 5, <int>)

Try to violate the constraint with an arc from a higher to a lower node

>>> builder.new_arc(end, mid1)
Traceback (most recent call last):
  ...
ValueError: arc violates the topological ordering because the start node is greater than or equal to the end node: failed arc (3, 1, <NoneType>)

Try to violate the constraint with an arc that ends on a node that has an outgoing arc

>>> builder.new_arc(mid1, mid2, 'foo')
Traceback (most recent call last):
  ...
ValueError: arc violates the topological ordering because the end node, 2, already has at least one outgoing arc: failed arc (1, 2, <str>)

Verify that the topological constraints are checked on copy construction

>>> TopologicalGraphBuilder(GraphTables(([None, None, None, None], [0, 0, 2, 3], [2, 3, 3, 3], [None, None, None, None])))
Traceback (most recent call last):
  ...
ValueError: arc violates the topological ordering because the start node is greater than or equal to the end node: failed arc (3, 3, <NoneType>)
>>> TopologicalGraphBuilder(GraphTables(([None, None, None, None], [0, 0, 2, 1], [2, 3, 3, 2], [None, None, None, 'bar'])))
Traceback (most recent call last):
  ...
ValueError: arc violates the topological ordering because the end node, 2, already has at least one outgoing arc: failed arc (1, 2, <str>)

Creates a buildable graph that requires topologically ordered construction. If allow_self_loops is True, then a single self loop is allowed on a node. The self loop must be the first outgoing arc for the node.

add_graph(subgraph, node_label_callback=<function <lambda> at 0x1d06bb18>, arc_label_callback=<function <lambda> at 0x1d06bb90>)

Add an entire subgraph to this builder’s graph.

Add all the nodes and edges and their respective labels from subgraph into the builder’s graph.

Returns a triple of tuples representing the nodes of the subgraph within the builder’s graph. The items in the triple are the node indices in the builder of all the nodes from subgraph, the indices of the subgraph’s startnodes, and the indices of the subgraph’s endnodes respectively. Note that these node indices are not necessarily in the same order as they are in subgraph.

add_graph_label_is_id(subgraph)
dot_display(temp_file_prefix='graphtools_', display_tool_format='open -a /Applications/Graphviz.app %s', **kwargs)

Display a dot-generated representation of the graph. Returns the name of the temporary file that the display tool is working from. The caller is responsible for removing this file. See also dot_iter().

Optional temp_file_prefix is the prefix used for the temporary filename.

Optional display_tool_format is formatting string, with %s where the filename goes, used to generate the command that will display the file. By default it assumes you’re on a Mac and have Graphviz.app installed in the /Applications directory.

Remaining keyword arguments are handed to the dot_iter function.

dot_iter(graph_label='', graph_type='digraph', globals=(), node_label_callback=<function <lambda> at 0x1d06a5f0>, node_attributes_callback=None, arc_label_callback=<function <lambda> at 0x1d06a5f0>, arc_attributes_callback=None, arc_ports_callback=None)

Returns a generator that yields lines of text, including newlines. The text represents the graph in the DOT language for which there are many displayers. See also dot_display().

Optional argument graph_label is a string label for the graph. Optional argument graph_type defaults to ‘digraph’, can also be ‘graph’.

Optional globals is an iterable that yields strings to put in the globals section of the DOT file.

Optional node_label_callback is a function that will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs). The callback function should return a string which will be used as the text label for the node in the figure, or None or the empty string for no label. If this argument is not given, each node with a non-None label will be labelled with the str value of its label. If this argument is None, nodes will not be labelled.

Optional arc_label_callback is a function that will be called with the arc label for each arc. The callback function should return a string which will be used as the text label for the arc in the figure, or None or the empty string for no label. If this argument is not given, each arc with a non-None label will be labelled with the str value of its label. If this argument is None, arcs will not be labelled.

Optional node_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the node. It will be called with three arguments for each node: (node_label, is_start, is_end), where node_label is the node’s label, is_start is True if the node is a start node (has no incoming arcs), and is_end is True if the node is an end node (has no outgoing arcs).

Optional arc_attributes_callback is a function that returns an iterable that will yield a string for each attribute assignment for the arc. It will be called with the arc label for each arc in the graph.

Optional arc_ports_callback is a function of two arguments that is called for each arc in the graph. The arguments are the labels of the start and end nodes of the arc. The function returns a pair of strings that name the respective ports of the nodes to which the arc should attach. The strings should be empty if no port is implied.

>>> graph = FrozenGraph(GraphTables(((1, 2, 'a'), (0, 1, 2), (1, 0, 2), (3, None, 5))))
>>> print ''.join(graph.dot_iter())
digraph  { 
  n0  [label="1"];
  n1  [label="2"];
  n2  [label="a"];
  n0 -> n1  [label="3"];
  n1 -> n0;
  n2 -> n2  [label="5"];
}
<BLANKLINE>
>>> print ''.join(graph.dot_iter(graph_label='Foo',
...                              globals=['size="5,6.125";', 'ordering=out;'],
...                              node_label_callback=lambda label, is_start, is_end: ('<' if is_start else '') + str(label) + ('>' if is_end else ''),
...                              arc_label_callback=str,
...                              node_attributes_callback=lambda label, is_start, is_end: ['shape=%s' % ('octagon' if is_start or is_end else 'ellipse',), 'style=%s' % ('bold' if isinstance(label, int) else 'normal',)],
...                              arc_attributes_callback=lambda label: ['style=%s' % ('bold' if label is None else 'normal',)]))
digraph Foo { 
  label="Foo";
  size="5,6.125";
  ordering=out;
  n0  [label="1", shape=ellipse, style=bold];
  n1  [label="2", shape=ellipse, style=bold];
  n2  [label="<a>", shape=octagon, style=normal];
  n0 -> n1  [label="3", style=normal];
  n1 -> n0  [label="None", style=bold];
  n2 -> n2  [label="5", style=normal];
}
<BLANKLINE>
get_arc(arc_id)

Return a tuple of the (start_node, end_node, arc_label) for the arc associated with the arc_id.

get_arc_label(arc_id)

Return the arc label associated with the arc_id.

get_node_label(node_id)

Return the node label associated with the node_id.

new_arc(start_node, end_node, arclabel=None)

Creates a new arc in the graph connecting the start_node with the end_node, and with an optional arclabel. It is an error if start_node or end_node are not valid node ids created by new_node() or if the topological construction constraints are violated. Returns the id of the arc, a non-negative integer.

new_arc_label_is_id(start_node, end_node)

Creates a new arc in the graph, with arclabel being the id of the arc. Returns the id of the arc, a non-negative integer.

new_node(nodelabel=None)

Creates a new node in the graph, with optional nodelabel. Returns the id of the node, a non-negative integer.

new_node_label_is_id()

Creates a new node in the graph, with nodelabel being the id of the node. Returns the id of the node, a non-negative integer.

num_arcs

The number of arcs in the graph.

num_nodes

The number of nodes in the graph.

text_iter()

Returns a generator that yields lines of text, including newlines. The text represents the graph in a human-readable, serializable form.

verify()
class onyx.graph.graphtools.Transducer(*_)

Bases: tuple

Transducer element, a three-item tuple, (left, right, label), where left is the input label, right is the output label, and trans is the transducing element.

By convention of the clients that use instances of Transducer, left or right can be None to indicate an epsilon, and trans can be a number to indicate a transduction cost. By convention, a pure epsilon transducer is a tuple of three None items; this common case generated by calling the class with no arguments.

>>> Transducer()
Transducer((None, None, None))
>>> t = Transducer('a', 'alpha', 23)
>>> t
Transducer(('a', 'alpha', 23))
>>> t is Transducer(t)
True
>>> t == Transducer(('a', 'alpha', 23)) == Transducer(['a', 'alpha', 23])
True
>>> t.left
'a'
>>> t.right
'alpha'
>>> t.trans
23
>>> str(t)
'a:alpha/23'
>>> str(Transducer('b', None, 0))
'b:-/0'
>>> str(Transducer())
'-:-/-'
>>> assert Transducer() is Transducer.epsilon_transducer
static as_label(item)
count

T.count(value) -> integer – return number of occurrences of value

index

T.index(value, [start, [stop]]) -> integer – return first index of value. Raises ValueError if the value is not present.

left

itemgetter(item, ...) –> itemgetter object

Return a callable object that fetches the given item(s) from its operand. After, f=itemgetter(2), the call f(r) returns r[2]. After, g=itemgetter(2,5,3), the call g(r) returns (r[2], r[5], r[3])

right

itemgetter(item, ...) –> itemgetter object

Return a callable object that fetches the given item(s) from its operand. After, f=itemgetter(2), the call f(r) returns r[2]. After, g=itemgetter(2,5,3), the call g(r) returns (r[2], r[5], r[3])

trans

itemgetter(item, ...) –> itemgetter object

Return a callable object that fetches the given item(s) from its operand. After, f=itemgetter(2), the call f(r) returns r[2]. After, g=itemgetter(2,5,3), the call g(r) returns (r[2], r[5], r[3])

onyx.graph.graphtools.compose_DAGs(dag1, dag2)

Composes two DAGs into a new graph. Returns a FrozenGraph of the composition. Each node label and each arc label in the new graph is a tuple of (label1, label2) from the corresponding nodes or arcs being composed.

>>> dag1 = FrozenGraph(GraphTables(((-11, -12, -10), (0, 1, 0), (1, 2, 2), (-21, -20, -22))))
>>> dag2 = FrozenGraph(GraphTables(((-111, -112, -110), (0, 1, 0), (1, 2, 2), (-121, -120, -122))))
>>> compose_DAGs(dag1, dag2)
Traceback (most recent call last):
  ...
NotImplementedError: early stages of work in progress

FrozenGraph(GraphTables(((), (), (), ())))

Must use cycle-free graphs

>>> compose_DAGs(dag1, FrozenGraph(GraphTables(((-1,), (0,), (0,), (-2,)))))
Traceback (most recent call last):
  ...
ValueError: expected strict cycle checks to be (False, False), got (False, True)
onyx.graph.graphtools.invert_index_permutation(index_permutation)

Given index_permutation, an ndarray that is a permutation of zero-based indices, return the inverse of the permutation.

Raises ValueError if index_permutation is not actually a permutation.

>>> p = np.array((1,2,4,3,0))
>>> invert_index_permutation(p)
array([4, 0, 1, 3, 2])
>>> (p == invert_index_permutation(invert_index_permutation(p))).all()
True
>>> invert_index_permutation(np.array((1,2,2,5,0)))
Traceback (most recent call last):
  ...
ValueError: expected a permutation of zero-based indices of 5 items: but missing 3, 4; and invalid 5
onyx.graph.graphtools.make_initialized_set_graph_builder(arc_iterator)

Make a new SetGraphBuilder and populate it with arcs specified by items in arc_iterator. Each item in arc_iterator is either a pair or a triple of immutable objects. The first and second objects are the labels for the start and end nodes of the arc. The optional third object (default None) is the arc label. The set-based semantics of SetGraphBuilder hold for the nodes and arcs.

Returns the SetGraphBuilder that has been initialized with the arc specifications in arc_iterator.

Function returns an iterator that yields a multigraph that also shows the set semantics:

>>> def arc_yielder():
...   yield 'A1', 'A2'
...   yield 'A1', 'A2', 'a'
...   yield 'A2', 'B1', 'ab'
...   yield 'A2', 'B1', 'ab'
>>> b = make_initialized_set_graph_builder(arc_yielder())
>>> for line in b.text_iter():
...   print line,
num_nodes 3
0   A1
1   A2
2   B1
num_arcs 3
0   0  1   None
1   0  1   a
2   1  2   ab

Add a new arc:

>>> b.add_arc('X1', 'X2', 'x')
3
>>> for line in b.text_iter():
...   print line,
num_nodes 5
0   A1
1   A2
2   B1
3   X1
4   X2
num_arcs 4
0   0  1   None
1   0  1   a
2   1  2   ab
3   3  4   x
>>> g = FrozenGraph(b)
>>> #g.dot_display(globals=['label="make_initialized_set_graph_builder";', 'labelloc=top;', 'rankdir=LR;'])
onyx.graph.graphtools.make_random_DAG(maxnumstarts, endtime, averageoutorder, averagearclength, numnodetokens, averagearcsuccesspermil, seed=None)

Generate a DAG that has nice structural properties for streaming models of speech-centric graph processing.

Returns the FrozenGraph corresponding to the generated DAG.

Raises ValueError if the lattice dies out, which can happen if averagearcsuccesspermil and averageoutorder are both small.

>>> def printit(g): print 'numnodes', g.num_nodes, 'numarcs', g.num_arcs, 'self_loop', g.has_self_loop, 'cyclic', g.has_cycle, 'connected', g.is_connected, 'strict_dag', g.is_strict_dag
>>> a = make_random_DAG(5, 20, 7, 10, 4, 800, seed=0)
>>> printit(a)
numnodes 136 numarcs 411 self_loop False cyclic False connected True strict_dag True
>>> assert not a.has_cycle
>>> a._verify_topologicals()

The transpose has same acyclic stats

>>> nodelabels, starts, ends, arclabels = a.graphseqs
>>> b = FrozenGraph(GraphTables((nodelabels, ends, starts, arclabels)))
>>> printit(b)
numnodes 136 numarcs 411 self_loop False cyclic False connected True strict_dag True
>>> assert not b.has_cycle
>>> a.graphseqs == FrozenGraph(GraphTables((nodelabels, starts, ends, arclabels))).graphseqs
True

A larger example

>>> a = make_random_DAG(1, 250, 6, 25, 50, 600, seed=0)
>>> printit(a)
numnodes 10006 numarcs 31767 self_loop False cyclic False connected True strict_dag True

Some decent ones to plot. Uncomment to display it with Graphviz

>>> # _ = make_random_DAG(1, 100, 3, 5, 2, 500, seed=0).dot_display(globals=('rankdir=LR;',))
>>> # _ = make_random_DAG(1, 60, 2, 3, 2, 700, seed=0).dot_display(globals=('rankdir=LR;',))
>>> testDAG = make_random_DAG(1, 60, 2, 3, 2, 700, seed=0)
>>> # _ = testDAG._random_dag_lattice().dot_display(globals=('rankdir=LR;',), arc_label_callback=None)

# work in progress

>>> endtime = 80
>>> make_random_DAG(1, endtime, 2, 4, 2, 600, seed=0)._lattice_work(endtime) and None

A graph that dies out due to low-branching factor (averageoutorder) and low arc success rate (averagearcsuccesspermil)

>>> make_random_DAG(1, 320, 3, 10, 5, 300, seed=0)
Traceback (most recent call last):
 ...
ValueError: lattice died out: make_random_DAG(1, 320, 3, 10, 5, 300, 0)
onyx.graph.graphtools.make_random_confusion_network(min_links, max_links, min_arcs_per_link, max_arcs_per_link, arc_labeller, seed=None)

Generate and return a random confusion network. The number of links will be uniformly chosen between min_links and max_links. The number of arcs in each link will be uniformly chosen (separately for each link) between min_arcs_per_link and max_arcs_per_link. arc_labeller should be a callable taking no arguments which provides arc labels; the nodes will all have the label None. seed will be used to seed the random number generator used.

>>> rand = random.Random()
>>> rand.seed(0)
>>> labeller = lambda: rand.sample(('the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog'), 1)[0]
>>> g = make_random_confusion_network(3, 20, 1, 10, labeller, seed=0)
>>> g.num_nodes
19
>>> g.num_arcs
116
>>> # g.dot_display()
onyx.graph.graphtools.make_random_dag(dag_endtime, average_out_order, average_arc_length, average_arc_success_per_mil=1000, node_labeler=1, arc_labeler=None, seed=None)

Generate a DAG that has some speech-like structural properties for streaming graph processing. See also make_random_lattice(), which is very similar but returns a lattice.

Times are integers. Start node is at time 0. No arc will be created that starts after dag_endtime. Nodes will have an average of average_out_order * average_arc_success_per_mil / 1000 outgoing arcs. If average_arc_success_per_mil is less than 1000, then there is a chance that the DAG will die out – which some algorithms need to handle.

Arcs are between 1 and average_arc_length long, as reflected in their terminal node times.

If optional node_labeler is an integer, N, user_labels will be chosen at random from range(N). If it is callable() it will be called with no arguments in order to generate the user_label portion of each node label. The default value is 1, meaning all user_labels will be 0. The node label itself is a pair (time, user_label) where time is a non-negative integer in the average_arc_length space. The purpose of node_labeler is to generate distinct nodes with the same timestamp.

If optional arc_labeler is not None it will be called with no arguments in order to generate arc labels respectively. The default behavior is to use non-negative integers for the arc labels.

Returns the FrozenGraph corresponding to the generated DAG.

Raises ValueError if the lattice dies out, which can happen if average_arc_success_per_mil and average_out_order are both small. Arguably the DAG should be returned and the user should notice that no arc reached the endtime.

>>> def printit(g): print 'numnodes', g.num_nodes, 'numarcs', g.num_arcs, 'self_loop', g.has_self_loop, 'cyclic', g.has_cycle, 'connected', g.is_connected, 'strict_dag', g.is_strict_dag
>>> a = make_random_dag(50, 2, 15, 1000, seed=1)
>>> printit(a)
numnodes 42 numarcs 58 self_loop False cyclic False connected True strict_dag True
>>> #_ = a.dot_display(globals=['rankdir=LR'])
>>> b = make_random_dag(50, 2, 15, 1000, node_labeler=2, seed=1)
>>> printit(b)
numnodes 55 numarcs 74 self_loop False cyclic False connected True strict_dag True
>>> #_ = b.dot_display(globals=['rankdir=LR'])
>>> c = make_random_dag(40, 7, 10, 800, node_labeler=4, seed=0)
>>> printit(c)
numnodes 181 numarcs 722 self_loop False cyclic False connected True strict_dag True

The transpose has same acyclic stats

>>> nodelabels, starts, ends, arclabels = c.graphseqs
>>> cc = FrozenGraph(GraphTables((nodelabels, ends, starts, arclabels)))
>>> printit(cc)
numnodes 181 numarcs 722 self_loop False cyclic False connected True strict_dag True
>>> assert not c.has_cycle
>>> c.graphseqs == FrozenGraph(GraphTables((nodelabels, starts, ends, arclabels))).graphseqs
True

A larger example

>>> d = make_random_dag(250, 6, 25, 600, node_labeler=50, seed=0)
>>> printit(d)
numnodes 10006 numarcs 31767 self_loop False cyclic False connected True strict_dag True
>>> #_ = d.dot_display(globals=['rankdir=LR'])

Some other decent ones to plot. Uncomment to display it with Graphviz

>>> #_ = make_random_dag(100, 3, 5, 500, node_labeler=2, seed=0).dot_display(globals=('rankdir=LR;',))
>>> #_ = make_random_dag(60, 2, 3, 700, node_labeler=2, seed=0).dot_display(globals=('rankdir=LR;',))
>>> testDAG = make_random_dag(60, 2, 3, 700, node_labeler=2, seed=0)
>>> printit(testDAG)
numnodes 92 numarcs 152 self_loop False cyclic False connected True strict_dag True
>>> #_ = testDAG._random_dag_lattice().dot_display(globals=('rankdir=LR;',), arc_label_callback=None)

# work in progress

>>> dag_endtime = 80
>>> make_random_dag(dag_endtime, 2, 4, 600, node_labeler=2, seed=0)._lattice_work(dag_endtime) and None

A graph that dies out due to low-branching factor (low average_out_order) and low arc success rate (low average_arc_success_per_mil)

>>> make_random_dag(320, 3, 10, 300, node_labeler=5, seed=0)
Traceback (most recent call last):
 ...
ValueError: dag died out: make_random_dag(320, 3, 10, 300, node_labeler=5, seed=0, arc_labeler=None)
onyx.graph.graphtools.make_random_graph(numnodes, numarcs, seed=None)

Return a FrozenGraph with numnodes nodes and numarcs arcs. Each arc’s start node and end node are chosen at random (with replacement) from the set of nodes. Optional seed is used to seed the randomness for repeatable behavior.

Each arc label will be the arc index, and each node label will be the node index. This facilitates using these labels to index into external sequences of node-specific or arc-specific information.

With this particular set of 100 nodes and 120 random arcs, most of the nodes are in a large weakly-connected set, but there are a few other isolated nodes and a couple of very small weakly-connected sets. There are no self loops.

>>> a = make_random_graph(100, 120, seed=6)
>>> a.has_self_loop, a.has_cycle, a.has_multiarcs, a.is_connected, a.is_lattice, a.is_strict_dag, a.has_join
(False, True, True, False, False, False, True)
>>> tuple(len(x) for x in a.connected_sets)
(89, 1, 1, 1, 3, 1, 2, 1, 1)

This holds by construction, regardless of the randomness

>>> a.node_labels_are_node_ids, a.arc_labels_are_arc_ids
(True, True)

Using a different randomness, show that the nodes are indeed chosen with replacement, giving rise to self loops in this case.

>>> b = make_random_graph(100, 120, seed=0)
>>> b.has_self_loop
True

This holds by construction, regardless of the randomness

>>> b.node_labels_are_node_ids, b.arc_labels_are_arc_ids
(True, True)

Example of plotting a graph. Uncomment it to display it with Graphviz

>>> # _ = b.dot_display()
onyx.graph.graphtools.make_random_lattice(dag_endtime, average_out_order, average_arc_length, average_arc_success_per_mil=1000, arc_labeler=None, node_labeler=1, seed=None)

Generate a lattice that has some speech-like structural properties for streaming graph processing.

The arguments are documented in make_random_dag().

>>> def printit(g): print ' numnodes', g.num_nodes, ' numarcs', g.num_arcs, ' is_strict_dag', g.is_strict_dag, ' is_lattice', g.is_lattice
>>> dag = make_random_dag(50, 3, 10, 400, node_labeler=2, seed=1)
>>> printit(dag)
 numnodes 47  numarcs 54  is_strict_dag True  is_lattice False
>>> lat = make_random_lattice(50, 3, 10, 400, node_labeler=2, seed=1)
>>> printit(lat)
 numnodes 21  numarcs 23  is_strict_dag True  is_lattice True
>>> #_ = dag.dot_display(globals=['rankdir=LR'])
>>> #_ = lat.dot_display(globals=['rankdir=LR'])
onyx.graph.graphtools.tutorial()