`onyx.util.process` – Utilities for running processes.

onyx.util.process.str_as_fd(str_data)

Takes str_data, a string, and returns a context manager. The target of the context manager is a file descriptor (readable end of a pipe) from which the bytes of the string can be read. The context manager closes the file descriptor as the context is exited, so the user should not close it. This function is useful for piping data into a subprocess (via a file descriptor) without creating a file.
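The mechanism described above can be sketched with os.pipe() and a writer thread. This is a minimal Python 3 sketch under stated assumptions, not the onyx implementation: the name bytes_as_fd is hypothetical, and it takes a bytes object rather than a Python 2 str.

```python
import contextlib
import os
import threading


@contextlib.contextmanager
def bytes_as_fd(data):
    """Yield the readable end of a pipe that delivers ``data`` (hypothetical helper)."""
    read_fd, write_fd = os.pipe()

    def feed():
        # Write from a separate thread so that a payload larger than the
        # pipe buffer does not block the code that is reading from the pipe.
        try:
            view = memoryview(data)
            while view:
                written = os.write(write_fd, view)
                view = view[written:]
        except OSError:
            pass  # the reader closed its end before consuming everything
        finally:
            os.close(write_fd)  # closing the write end signals end-of-file

    writer = threading.Thread(target=feed)
    writer.start()
    try:
        yield read_fd
    finally:
        # The context manager owns the descriptor, so the caller must not close it.
        os.close(read_fd)
        writer.join()


with bytes_as_fd(b"foo bar baz") as fd:
    first = os.read(fd, 3)
```

Closing the write end in the writer thread is what lets a reader loop terminate on an empty read.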

Test str_as_fd() on its own doc string, reading it from the fd in chunks of up to 12 bytes.

>>> res = list()
>>> with str_as_fd(str_as_fd.__doc__) as fd:
...   while True:
...     x = os.read(fd, 12)
...     if not x: break
...     res.append(x)
>>> max(len(x) for x in res) == 12
True
>>> ''.join(res) == str_as_fd.__doc__
True

Test it out on a subprocess

>>> with str_as_fd(_random_string) as fd:
...   stdout, stderr, cmd = subprocess(['md5sum', '-b'], stdin=fd)
>>> stdout
'807b2053419d6ad71e58ccee7197aa5c *-\n'
>>> import hashlib
>>> stdout.split()[0] == hashlib.md5(_random_string).hexdigest()
True

onyx.util.process.str_iter_as_fd(*args, **kwds)

Given str_iter, an iterable that yields strings, return a context manager. The target of the context manager is a file descriptor (readable end of a pipe) from which the byte stream of the yielded strings can be read. The context manager closes the file descriptor as the context is exited, so the user should not close it. This function is useful for piping data into a subprocess (via a file descriptor) without creating a file.

Note

os.read(fd, nbytes) returns at most nbytes bytes and may return fewer, so code that reads from the fd must check the length of the returned byte sequence. We use safe_os_read(fd, nbytes) when we need to be sure of getting everything we asked for.

>>> def safe_os_read(fd, nbytes):
...   parts, num_read = list(), 0
...   while True:
...     got = os.read(fd, nbytes - num_read)
...     if got == '': return ''.join(parts)
...     parts.append(got)
...     num_read += len(got)

We don’t need the safe function to be sure of getting 1 byte back, but for anything more we do:

>>> with str_iter_as_fd('foo bar baz'.split()) as fd:
...   os.read(fd, 1)
...   safe_os_read(fd, 5)
'f'
'oobar'
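For readers on Python 3, where os.read() returns bytes and a comparison against '' never matches, the same short-read handling might look like the sketch below. The name read_exactly is hypothetical, and a plain os.pipe() stands in for the fd produced by str_iter_as_fd.

```python
import os


def read_exactly(fd, nbytes):
    """Read until nbytes have arrived or the writer closes the pipe (hypothetical helper)."""
    parts, remaining = [], nbytes
    while remaining > 0:
        got = os.read(fd, remaining)
        if not got:  # an empty bytes object means end-of-file
            break
        parts.append(got)
        remaining -= len(got)
    return b"".join(parts)


# Two separate writes, then end-of-file, stand in for an iterable of strings.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"foo")
os.write(write_fd, b"bar")
os.close(write_fd)

first = read_exactly(read_fd, 1)
rest = read_exactly(read_fd, 1 << 20)
os.close(read_fd)
```

Asking for far more than is available is safe here because the loop stops at end-of-file rather than waiting for the full count.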

Test str_iter_as_fd() on a per-line iterated version of its own doc string, reading it back from the fd in chunks of up to 16 bytes.

>>> import cStringIO
>>> doc_iter = cStringIO.StringIO(str_iter_as_fd.__doc__)
>>> res = list()
>>> with str_iter_as_fd(doc_iter) as fd:
...   while True:
...     x = os.read(fd, 16)
...     if not x: break
...     res.append(x)
>>> max(len(x) for x in res) == 16
True
>>> ''.join(res) == str_iter_as_fd.__doc__
True

Send data to a subprocess

>>> with str_iter_as_fd(cStringIO.StringIO(str_iter_as_fd.__doc__)) as fd:
...   stdout, stderr, cmd = subprocess(['cat'], stdin=fd)
>>> stdout == str_iter_as_fd.__doc__
True
>>> stderr
''
>>> cmd
'cat'

Edge cases, small and large

>>> with str_iter_as_fd([]) as fd:
...   os.read(fd, 1)
''
>>> with str_iter_as_fd(['X']) as fd:
...   os.read(fd, 0)
''
>>> with str_iter_as_fd(['X']) as fd:
...   os.read(fd, 1)
'X'
>>> with str_iter_as_fd(['X']) as fd:
...   safe_os_read(fd, 1<<20)
'X'

Work with a couple of long strings (defined in the module when it is being tested):

>>> import hashlib
>>> len(_long_string)
1114112
>>> hashlib.md5(_long_string).hexdigest()
'6e683a33919f2cb5784fa87df03f3e34'
>>> len(_random_string)
1048576
>>> hashlib.md5(_random_string).hexdigest()
'807b2053419d6ad71e58ccee7197aa5c'

Read from the fd directly

>>> parts = list()
>>> with str_iter_as_fd([_long_string, _random_string]) as fd:
...   while True:
...     parts.append(os.read(fd, (1 << 16)))
...     if not parts[-1]: break
>>> ''.join(parts) == _long_string + _random_string
True

Make a much longer list of strings, send it through cat

>>> big_list_o_strings = list()
>>> big_list_o_strings.append(_long_string)
>>> big_list_o_strings.extend(_long_string.split())
>>> big_list_o_strings.extend(_random_string.split())
>>> big_list_o_strings.append(_long_string)
>>> len(big_list_o_strings)
89543
>>> with str_iter_as_fd(big_list_o_strings) as fd:
...   stdout, stderr, cmd = subprocess(['cat'], stdin=fd)
>>> stdout == ''.join(big_list_o_strings)
True
>>> len(stdout)
4300800

Errors from bad arguments to str_iter_as_fd:

str_iter must be iterable

>>> with str_iter_as_fd(None): pass
Traceback (most recent call last):
  ...
TypeError: 'NoneType' object is not iterable

str_iter must yield strings

>>> with str_iter_as_fd(['A', 1]) as fd:
...   os.read(fd, 2)
Traceback (most recent call last):
  ...
TypeError: expected a str, got a int

Error from the with-block:

>>> with str_iter_as_fd(()) as fd: raise Exception('testing')
Traceback (most recent call last):
  ...
Exception: testing

An exception based on bad arguments to str_iter_as_fd trumps a concurrent error from the with-block:

>>> did_block_work = [False]
>>> with str_iter_as_fd(None) as fd:
...   did_block_work[0] = Exception('will be trumped')
...   raise did_block_work[0]
Traceback (most recent call last):
  ...
TypeError: 'NoneType' object is not iterable
>>> did_block_work
[Exception('will be trumped',)]

>>> did_block_work = [False]
>>> with str_iter_as_fd(['f', None]) as fd:
...   did_block_work[0] = Exception('will be trumped')
...   os.read(fd, 1)
...   os.read(fd, 1)
...   raise did_block_work[0]
Traceback (most recent call last):
  ...
TypeError: expected a str, got a NoneType
>>> did_block_work
[Exception('will be trumped',)]

The with-block exception propagates if the bogus item in str_iter is never processed, e.g. if the preceding strings from str_iter are long enough to fill the fd’s pipe.
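How much data it takes to fill a pipe depends on the operating system. The following Python 3 sketch measures it on a Unix-like system by writing to a non-blocking pipe until the kernel refuses more data; the function name pipe_capacity is hypothetical and is not part of onyx.

```python
import os


def pipe_capacity():
    """Return how many bytes a fresh pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # In non-blocking mode a full pipe raises BlockingIOError
        # instead of suspending the writer.
        os.set_blocking(write_fd, False)
        total = 0
        chunk = b"\0" * 4096
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


capacity = pipe_capacity()
```

If the measured capacity is smaller than the strings that precede the bad item, the writer is still blocked on those strings when the with-block exits, so the bad item is never examined.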

>>> did_block_work = [False]
>>> with str_iter_as_fd([_random_string, None]) as fd:
...   did_block_work[0] = Exception('str_iter_as_fd never got to bogus item')
...   raise did_block_work[0]
Traceback (most recent call last):
  ...
Exception: str_iter_as_fd never got to bogus item
>>> did_block_work
[Exception('str_iter_as_fd never got to bogus item',)]

Regression test against a problem with writing more than (1 << 16) bytes to a pipe and closing it from separate threads on Mac OS X 10.5 and 10.6. The symptom is that Python hangs!

>>> def reg1():
...   for i in xrange(250):
...     with str_iter_as_fd([_random_string]) as fd: pass
>>> reg1()

onyx.util.process.subprocess(args, stdin=-1)

Execute a subprocess using the strings in args. Optional stdin will be used for the subprocess’s stdin. It can be an open file object or, as in the examples below, a file descriptor. See the documentation for Python’s subprocess.Popen() for full details.

Returns a triple, (stdout, stderr, cmd), where stdout is the bytes from the subprocess standard out, stderr is the bytes from the subprocess standard error, and cmd is a single string giving the command and its arguments.

Raises onyx.SubprocessError if the command fails to execute or if the command exits with a non-zero return code.
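The contract described above (return a triple, raise on a launch failure or a non-zero exit) can be approximated with the standard library alone. This is a Python 3 sketch under assumptions, not the onyx source: run_command and CommandError are hypothetical stand-ins for subprocess() and onyx.SubprocessError, and the message wording only loosely follows the examples in this section.

```python
import subprocess as sp
import sys


class CommandError(Exception):
    """Hypothetical stand-in for onyx.SubprocessError."""


def run_command(args, stdin=None):
    """Run args and return (stdout, stderr, cmd), raising CommandError on failure."""
    cmd = " ".join(args)
    try:
        proc = sp.Popen(args, stdin=stdin, stdout=sp.PIPE, stderr=sp.PIPE)
    except OSError as err:
        # Raised when the executable cannot be found or started.
        raise CommandError(
            "got %r while trying to execute %r" % (err.strerror, cmd))
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise CommandError(
            "command failed: %r : %s" % (cmd, stderr.decode(errors="replace")))
    return stdout, stderr, cmd


# The running interpreter is used as the child so the example needs no other tools.
stdout, stderr, cmd = run_command([sys.executable, "-c", "print('foo bar baz')"])
```

Passing stdin=fd with a pipe's read descriptor works here as well, because subprocess.Popen accepts an integer file descriptor for stdin.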

>>> args_string = 'foo bar baz'
>>> stdout, stderr, cmd = subprocess(['echo', args_string])
>>> stdout
'foo bar baz\n'

>>> with str_as_fd(args_string) as fd:
...   stdout, stderr, cmd = subprocess(['wc'], stdin=fd)
>>> ' '.join(stdout.split())
'0 3 11'

>>> stdout, stderr, cmd = subprocess('true unused'.split())
>>> stdout
''
>>> stderr
''
>>> cmd
'true unused'

>>> subprocess('no_such_command --version'.split())
Traceback (most recent call last):
  ...
SubprocessError: [Errno 2] got 'No such file or directory' while trying to execute 'no_such_command --version'

>>> subprocess('false'.split())
Traceback (most recent call last):
  ...
SubprocessError: [Errno 1] command failed: 'false' : 
