Onyx logo

Previous topic

onyx.graph.randomgraph – This module has tools that support the generation of random graphs.

Next topic

Grid Tools

This Page

onyx.graph.trie – Trie structure and iterators

The Trie supports building trie structures and various iterations over them.

class onyx.graph.trie.Trie

Bases: object

A trie structure with methods for adding to the trie and iterating over its contents.

Each element in a trie is stored based on a sequence of keys, the path, via add_element(). The set of paths in the tree can be iterated by method iter_paths() which, for each path in the tree, yields the sequence of keys that address the path’s final node in the trie and the set of elements stored at that node. The set of elements in the trie can be iterated via iter_elements(). An element can appear at the end of more than one path in the trie; the paths that address a given element can be iterated via iter_element_paths().

Containment of a path having a final node with one or more elements is determined via the in operator. has_prefix() is used to determine the existence of a path prefix in the trie regardless of whether there are elements at the final node of the prefix.

Experimental support for ‘dense’ trie structures in dense_trie().

>>> trie = Trie()
>>> 'abcd' in trie
False
>>> trie.add_element('abcd', 'ABCD')
>>> 'abcd' in trie
True
>>> 'ab' in trie
False
>>> trie.add_element('a', 'A')
>>> 'ab' in trie
False
>>> trie.add_element('ab', 'AB')
>>> 'ab' in trie
True
>>> trie.add_element('adaada', 'ADAADA')
>>> trie.add_element('adacad', 'ADACAD')
>>> trie.add_element('cat', 'CAT2')
>>> trie.add_element('cat', 'CAT')
>>> trie.add_element('kat', 'CAT')
>>> trie.add_element('c', 'C')
>>> trie.add_element('car', 'CAR')
>>> trie.add_element('car', 'CAR2')
>>> trie.add_element('cart', 'CART')
>>> trie.add_element('boondoggle', 'BOONDOGGLE')

This empty iteration is a regression test

>>> for _ in trie.iter_paths(): pass
>>> trie.add_element((), '[Empty]')
>>> for seq, elements in trie.iter_paths(): print seq, sorted(elements)
('a', 'b', 'c', 'd') ['ABCD']
('a', 'b') ['AB']
('a', 'd', 'a', 'a', 'd', 'a') ['ADAADA']
('a', 'd', 'a', 'c', 'a', 'd') ['ADACAD']
('a',) ['A']
('c', 'a', 'r', 't') ['CART']
('c', 'a', 'r') ['CAR', 'CAR2']
('c', 'a', 't') ['CAT', 'CAT2']
('c',) ['C']
('b', 'o', 'o', 'n', 'd', 'o', 'g', 'g', 'l', 'e') ['BOONDOGGLE']
('k', 'a', 't') ['CAT']
() ['[Empty]']
>>> for element in sorted(trie.iter_elements()): print element
A
AB
ABCD
ADAADA
ADACAD
BOONDOGGLE
C
CAR
CAR2
CART
CAT
CAT2
[Empty]
>>> trie.has_element('ABCD')
True
>>> trie.has_element('ABCDE')
False
>>> for seq, elements in trie.iter_element_paths('ABCD'): print seq, sorted(elements)
('a', 'b', 'c', 'd') ['ABCD']
>>> for seq, elements in trie.iter_element_paths('CAT'): print seq, sorted(elements)
('c', 'a', 't') ['CAT', 'CAT2']
('k', 'a', 't') ['CAT']
>>> trie.is_dense_trie
False
>>> dense_trie = trie.dense_trie()
>>> dense_trie.is_dense_trie
True
>>> for seq, elements in dense_trie.iter_paths(): print seq, sorted(elements)
(('c',), ('a',), ('t',)) ['CAT', 'CAT2']
(('c',), ('a',), ('r',), ('t',)) ['CART']
(('c',), ('a',), ('r',)) ['CAR', 'CAR2']
(('c',),) ['C']
(('k', 'a', 't'),) ['CAT']
(('a',), ('b',), ('c', 'd')) ['ABCD']
(('a',), ('b',)) ['AB']
(('a',), ('d', 'a'), ('c', 'a', 'd')) ['ADACAD']
(('a',), ('d', 'a'), ('a', 'd', 'a')) ['ADAADA']
(('a',),) ['A']
(('b', 'o', 'o', 'n', 'd', 'o', 'g', 'g', 'l', 'e'),) ['BOONDOGGLE']
() ['[Empty]']
>>> dense_trie2 = dense_trie.dense_trie()
>>> dense_trie2.is_dense_trie
True
>>> for seq, elements in dense_trie2.iter_paths(): print seq, sorted(elements)
((('k', 'a', 't'),),) ['CAT']
((('a',),), (('d', 'a'),), (('c', 'a', 'd'),)) ['ADACAD']
((('a',),), (('d', 'a'),), (('a', 'd', 'a'),)) ['ADAADA']
((('a',),), (('b',),), (('c', 'd'),)) ['ABCD']
((('a',),), (('b',),)) ['AB']
((('a',),),) ['A']
((('c',),), (('a',),), (('t',),)) ['CAT', 'CAT2']
((('c',),), (('a',),), (('r',),), (('t',),)) ['CART']
((('c',),), (('a',),), (('r',),)) ['CAR', 'CAR2']
((('c',),),) ['C']
((('b', 'o', 'o', 'n', 'd', 'o', 'g', 'g', 'l', 'e'),),) ['BOONDOGGLE']
() ['[Empty]']
>>> trie2 = Trie()
>>> for _ in trie2.iter_paths(): assert False
>>> trie2.add_element('boondoggle', 'BOONDOGGLE')
>>> for _ in trie2.iter_paths(): pass
>>> trie3 = Trie()
>>> for _ in trie3.iter_paths(): assert False
>>> trie3.add_element((), None)
>>> for seq, elements in trie3.iter_paths(): print seq, sorted(elements)
() [None]
>>> trie3.add_element((), ())
>>> for seq, elements in trie3.iter_paths(): print seq, sorted(elements)
() [None, ()]
>>> for _ in dense_trie.iter_paths(): dense_trie.add_element((), ())
Traceback (most recent call last):
  ...
ValueError: multiple attempts to iterate and/or mutate
add_element(seq_iter, element)

Ensure that the path of keys in seq_iter exists in the trie by creating nodes as necessary. Add element to the set of elements on the node at the end of the path.

dense_trie()

Returns a new Trie that has isomorphic branching structure with self, but has keys that are sequences of the non-branching, non-element-containing sequences of keys in self. Resulting trie will have is_dense_trie is True.

It is an error to call this if is_empty is True.

>>> trie = Trie()
>>> trie.add_element('abc', 'ABC')
>>> trie.add_element('abcd', 'ABCD')
>>> trie.add_element('abx', 'ABCX')
>>> trie.update_elements('abex', ['ABCX', 'ABEX'])
>>> trie.add_element('abex', 'FOO')
>>> trie.add_element('abexfoo', 'ABEXFOO')
>>> trie.add_element('ab', 'AB')
>>> trie.add_element((), None)
>>> trie.is_dense_trie
False
>>> for key_seq, elements_iter in trie.iter_paths():
...   print key_seq, ' ', sorted(elements_iter)
('a', 'b', 'x')   ['ABCX']
('a', 'b', 'c', 'd')   ['ABCD']
('a', 'b', 'c')   ['ABC']
('a', 'b', 'e', 'x', 'f', 'o', 'o')   ['ABEXFOO']
('a', 'b', 'e', 'x')   ['ABCX', 'ABEX', 'FOO']
('a', 'b')   ['AB']
()   [None]
>>> dense_trie = trie.dense_trie()
>>> dense_trie.is_dense_trie
True
>>> for key_seq, elements_iter in dense_trie.iter_paths():
...   print key_seq, ' ', sorted(elements_iter)
(('a', 'b'), ('c',), ('d',))   ['ABCD']
(('a', 'b'), ('c',))   ['ABC']
(('a', 'b'), ('e', 'x'), ('f', 'o', 'o'))   ['ABEXFOO']
(('a', 'b'), ('e', 'x'))   ['ABCX', 'ABEX', 'FOO']
(('a', 'b'), ('x',))   ['ABCX']
(('a', 'b'),)   ['AB']
()   [None]

Empty trie is the only trie that cannot be made dense

>>> Trie().dense_trie()
Traceback (most recent call last):
  ...
ValueError: cannot create a dense trie from an is_empty trie.
has_element(element)

Returns True if element appears at one or more nodes in the trie, False otherwise. See also iter_elements() and iter_element_paths().

>>> trie = Trie()
>>> trie.add_element('abc', 'ABC')
>>> trie.add_element('abx', 'ABCX')
>>> trie.update_elements('abex', ['ABCX', 'ABEX'])
>>> trie.has_element('ABCX')
True
>>> trie.has_element('ABCD')
False
has_prefix(seq_iter)

Returns True if the sequence of keys in seq_iter is a prefix of some path in the trie, False otherwise.

>>> trie = Trie()
>>> trie.add_element('cart', 'CART')
>>> trie.has_prefix('car')
True
>>> 'car' in trie
False
>>> trie.has_prefix('cart')
True
>>> 'cart' in trie
True
>>> trie.has_prefix('carts')
False
>>> 'carts' in trie
False
>>> trie.add_element('cartsing', 'CARTSING')
>>> trie.has_prefix('carts')
True
>>> 'carts' in trie
False
is_dense_trie

True if the trie has the property that every node in the trie has elements or more than one child (or both), False otherwise. See also dense_trie().

Empty trie is not dense

>>> trie = Trie()
>>> trie.is_dense_trie
False
>>> trie.add_element((), None)
>>> trie.is_dense_trie
True
is_empty

True if the trie has no paths and no elements, False otherwise.

>>> trie = Trie()
>>> trie.is_empty
True

Minimal thing you can add, no path, just an element at the root

>>> trie.add_element((), None)
>>> trie.is_empty
False
>>> trie2 = Trie()
>>> trie2.add_element(xrange(100), None)
>>> trie2.is_empty
False
iter_element_paths(element)

A generator yielding the paths in the trie to nodes that contain element. For each path to a node containing element, the generator yields a pair, (key_seq, elements_iter), where key_seq is a tuple of the keys for the path to the node, and elements_iter is an iterator over the all the elements at that node.

It is an error if has_element() of element is False.

>>> trie = Trie()
>>> trie.add_element('abc', 'ABC')
>>> trie.add_element('abx', 'ABCX')
>>> trie.update_elements('abex', ['ABCX', 'ABEX'])

Simple case

>>> for key_seq, elements_iter in trie.iter_element_paths('ABC'):
...   print key_seq, ' ', sorted(elements_iter)
('a', 'b', 'c')   ['ABC']

Showing multiple paths, and all the elements at each node

>>> for key_seq, elements_iter in trie.iter_element_paths('ABCX'):
...   print key_seq, ' ', sorted(elements_iter)
('a', 'b', 'x')   ['ABCX']
('a', 'b', 'e', 'x')   ['ABCX', 'ABEX']
iter_elements()

A generator that yields each element in the trie.

>>> trie = Trie()
>>> trie.add_element('abc', 'ABC')
>>> trie.add_element('abx', 'ABCX')
>>> trie.update_elements('abex', ['ABCX', 'ABEX'])
>>> sorted(trie.iter_elements())
['ABC', 'ABCX', 'ABEX']
iter_paths()

A generator yielding paths and elements in the trie. For each path to a node with elements, the generator yields a pair, (key_seq, elements_iter), where key_seq is a tuple of the keys for the path to the node, and elements_iter is an iterator over the elements at that node.

>>> trie = Trie()
>>> trie.add_element('abc', 'ABC')
>>> trie.add_element('abcd', 'ABCD')
>>> trie.add_element('abx', 'ABCX')
>>> trie.update_elements('abex', ['ABCX', 'ABEX'])
>>> trie.add_element('abex', 'FOO')

Of note: ABCX is an element at more than one node; path a b e x has multiple elements at its node.

>>> for key_seq, elements_iter in trie.iter_paths():
...   print key_seq, ' ', sorted(elements_iter)
('a', 'b', 'x')   ['ABCX']
('a', 'b', 'c', 'd')   ['ABCD']
('a', 'b', 'c')   ['ABC']
('a', 'b', 'e', 'x')   ['ABCX', 'ABEX', 'FOO']
update_elements(seq_iter, elements_iter)

Ensure that the path of keys in seq_iter exists in the trie by creating nodes as necessary. Update the set of elements on the node at the end of the path with the items from elements_iter.

class onyx.graph.trie.TrieNode

Bases: onyx.builtin._rst_clean_dict

A node for a trie. This is a helper class for Trie.

It’s a subclass of dict with (key, value) pairs for child nodes, and each value is also a TrieNode. It has the defaultdict-like property that a new TrieNode is created whenever an item is looked up and doesn’t exist. Each instance that’s created via the defaultdict behaviour knows its parent node and the key by which it’s looked up in its parent. Nodes are hashable. Distinct nodes never compare equal.

>>> x, y = TrieNode(), TrieNode()
>>> hash(x) == hash(y)
False
>>> for i in xrange(5): (x[i], y[i]) and None
>>> x
{0: {}, 1: {}, 2: {}, 3: {}, 4: {}}
>>> y
{0: {}, 1: {}, 2: {}, 3: {}, 4: {}}
>>> x == y
False
>>> for i in xrange(5): x[i].has_elements
False
False
False
False
False
>>> for i in xrange(5):
...   if i%2 == 0: x[i][i].add_element(-i-1)
>>> for i in xrange(5): x[i][i].has_elements
True
False
True
False
True
>>> for i in xrange(5): print x[i][i].parent, x[i][i].key, x[i][i].elements
{0: {}} 0 (-1,)
{1: {}} 1 ()
{2: {}} 2 (-3,)
{3: {}} 3 ()
{4: {}} 4 (-5,)
>>> for i in xrange(5): x[i][2*i].update_elements(xrange(i)) and None
>>> for i in xrange(5): print x[i][2*i].parent, x[i][2*i].key, sorted(x[i][2*i].iter_elements())
{0: {}} 0 [-1]
{1: {}, 2: {}} 2 [0]
{2: {}, 4: {}} 4 [0, 1]
{3: {}, 6: {}} 6 [0, 1, 2]
{8: {}, 4: {}} 8 [0, 1, 2, 3]
add_element(element)

Add element to the set of elments associated with the node.

clear

D.clear() -> None. Remove all items from D.

copy

D.copy() -> a shallow copy of D

elements

A new tuple of the elements, if any, associated with the node.

static fromkeys()

dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v. v defaults to None.

get

D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.

has_elements

A bool indicating whether there are elements associated with the node.

has_key

D.has_key(k) -> True if D has a key k, else False

items

D.items() -> list of D’s (key, value) pairs, as 2-tuples

iter_elements()

A generator of the elements, if any, associated with the node.

iteritems

D.iteritems() -> an iterator over the (key, value) items of D

iterkeys

D.iterkeys() -> an iterator over the keys of D

itervalues

D.itervalues() -> an iterator over the values of D

key

The node’s key in its parent’s namespace, or None.

keys

D.keys() -> list of D’s keys

parent

The node’s parent node, or None.

pop

D.pop(k[,d]) -> v, remove specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised

popitem

D.popitem() -> (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault

D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D

update(E, **kwargs)

D.update(E, **F) -> None. Update D from dict/iterable E and F. If E has a .keys() method, does: for k in E: D[k] = E[k] If E lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

update_elements(elements_iter)

Add items from elements_iter to the set of elments associated with the node.

values

D.values() -> list of D’s values

viewitems

D.viewitems() -> a set-like object providing a view on D’s items

viewkeys

D.viewkeys() -> a set-like object providing a view on D’s keys

viewvalues

D.viewvalues() -> an object providing a view on D’s values