Wednesday, November 28, 2012

efficient(-ish) bit reversal of small-ish integers

import struct

BYTE_REV = "".join([chr(int("{0:08b}".format(n)[::-1], 2)) for n in range(256)])
QUAD = struct.Struct('>Q')
QUAD_R = struct.Struct('<Q')

def int_reverse(n, base=64):
    if n >= 2**64:
        raise ValueError("n must be less than 2**64; received "+str(n))
    return QUAD_R.unpack(QUAD.pack(n).translate(BYTE_REV))[0] >> (64-base)

>>> int_reverse(1, 3)
4L
>>> int_reverse(1, 4)
8L
>>> int_reverse(1, 5)
16L
>>> int_reverse(5, 3)
5L
>>> int_reverse(3, 3)
6L

This is useful, for example, for visiting the points of a space in binary-search order.

>>> [int_reverse(n, 10) for n in range(2**10)][:16]
[0, 512L, 256L, 768L, 128L, 640L, 384L, 896L, 64L, 576L, 320L, 832L, 192L, 704L, 448L, 960L]

The sequence starts at 0, then jumps to 1/2 of the way through the range, then 1/4, then 3/4, and so on.  Sometimes this is a useful search order.
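The code above targets Python 2, where str.translate accepts a 256-character string as its table.  As a rough sketch of the same idea on Python 3 (the name int_reverse3 and the width parameter are made up for illustration, and int.to_bytes/int.from_bytes stand in for the two Struct objects):

```python
# Table mapping each byte value to the same byte with its 8 bits reversed.
BYTE_REV = bytes(int("{0:08b}".format(n)[::-1], 2) for n in range(256))


def int_reverse3(n, width=64):
    """Reverse the low `width` bits of n (hypothetical Python 3 port)."""
    if not 0 <= n < 2**64:
        raise ValueError("n must be in range(2**64); received " + str(n))
    if not 0 < width <= 64:
        raise ValueError("width must be between 1 and 64")
    # Write n big-endian, bit-reverse every byte, then read the bytes back
    # little-endian: together these two steps reverse all 64 bits.
    reversed64 = int.from_bytes(n.to_bytes(8, "big").translate(BYTE_REV), "little")
    # Keep only the top `width` bits of the 64-bit reversal.
    return reversed64 >> (64 - width)


order = [int_reverse3(n, 10) for n in range(2**10)]
print(order[:8])  # [0, 512, 256, 768, 128, 640, 384, 896]
```

Every value in range(2**10) appears exactly once in order, so iterating over it visits the whole range, just coarsely first and finely later.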

Monday, November 19, 2012

back-porting non-local

There have been discussions about back-porting the nonlocal keyword from Python 3 to Python 2.
But, is there some crazy, misguided hack we can use to edit closures in Python 2?

Let's make a closure!

>>> def foo():
...    a = 2
...    def bar():
...       print a
...    return bar
>>> foo().func_closure
(<cell at 0x02515FF0: int object at 0x023AA6CC>,)

A closure apparently consists of a tuple of cell objects.  The tuple is immutable.  What about the cell itself?

>>> foo().func_closure[0].cell_contents
2
>>> foo().func_closure[0].cell_contents = 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: attribute 'cell_contents' of 'cell' objects is not writable

Well played, Python 2.7 runtime.  Time to go nuclear: the Python C API.

>>> b = foo()
>>> b()
2
>>> import ctypes
>>> ctypes.pythonapi.PyCell_Set(ctypes.py_object(b.func_closure[0]), ctypes.py_object(3))
0
>>> b()
3

So, there you have it.  The contents of a closure can be changed in Python 2.
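For comparison, here is a sketch of the same trick under Python 3, assuming CPython (ctypes.pythonapi is CPython-specific).  The closure tuple is spelled __closure__ there instead of func_closure, and declaring argtypes lets ctypes pass real object pointers, which matters on 64-bit builds.  Python 3 has nonlocal, so this is purely a curiosity:

```python
import ctypes

# int PyCell_Set(PyObject *cell, PyObject *value) from CPython's C API.
# Declaring the argument types makes ctypes pass object pointers, not C ints.
PyCell_Set = ctypes.pythonapi.PyCell_Set
PyCell_Set.argtypes = (ctypes.py_object, ctypes.py_object)
PyCell_Set.restype = ctypes.c_int


def foo():
    a = 2

    def bar():
        return a

    return bar


b = foo()
before = b()                               # 2: the value captured by the closure
status = PyCell_Set(b.__closure__[0], 3)   # 0 signals success
after = b()                                # 3: the cell now holds the new value
print(before, status, after)
```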

Wednesday, November 14, 2012

python working directory

import inspect
import sys
import os.path

def caller_directory():
    mod_name = inspect.currentframe().f_back.f_back.f_globals.get('__name__')
    module = sys.modules[mod_name]
    if not hasattr(module, '__file__'): #probably REPL
        return os.getcwd()
    path = module.__file__
    #now, where does this path meet the python path?
    if os.path.isabs(path): #already absolute path? hurray, we are done
        return os.path.dirname(path)
    #otherwise, need to go down the python path finding where it joins
    for p in sys.path:
        p = os.path.join(p, path)
        if os.path.exists(p):
            return os.path.dirname(os.path.abspath(p))
    raise Exception("could not find caller directory")

When called from a function, caller_directory() returns the directory of the module containing that function's caller.

This allows for directory normalization.  A relative path passed to open() is always resolved against the current working directory.  With caller_directory(), a path can instead be resolved relative to the __file__ of the caller.  In some cases, this is what a user would intuitively expect.
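A condensed sketch of how this might be used, written for Python 3.  It reads __file__ straight from the calling frame's globals rather than walking sys.path, so it is a simplification of the function above; the depth parameter and the open_relative helper are hypothetical names added for illustration.

```python
import inspect
import os


def caller_directory(depth=2):
    """Directory of the module `depth` frames up the call stack."""
    frame = inspect.currentframe()
    for _ in range(depth):
        if frame.f_back is None:  # ran out of stack; use the outermost frame
            break
        frame = frame.f_back
    filename = frame.f_globals.get("__file__")
    if filename is None:  # probably a REPL or exec'd code
        return os.getcwd()
    return os.path.dirname(os.path.abspath(filename))


def open_relative(name, mode="r"):
    """Open `name` relative to the module that called open_relative."""
    # Two frames up from caller_directory is whoever called open_relative.
    return open(os.path.join(caller_directory(), name), mode)


def where_was_i_called():
    return caller_directory()


result = where_was_i_called()
print(result)
```

A library could then load a data file that ships next to the calling module with open_relative("data.txt"), regardless of the process's working directory.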

Monday, November 12, 2012

lambda default parameters

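Making lambdas behave nicely by using default parameters.  A lambda looks up its free variables when it is called, not when it is defined, so every lambda in the first example sees the final value of i.  Binding i as a default parameter captures the value at definition time.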

>>> [c() for c in [(lambda: i) for i in range(10)]]
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

>>> [c() for c in [(lambda a=i: a) for i in range(10)]]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
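The same contrast, spelled out as a small sketch that also runs on Python 3.  functools.partial is shown as one alternative way to bind the value early; it is not part of the original example.

```python
from functools import partial

# Late binding: each lambda looks up i when called, after the loop has ended.
late = [(lambda: i) for i in range(3)]

# Early binding: the default value a=i is evaluated when the lambda is created.
early = [(lambda a=i: a) for i in range(3)]

# Alternative: partial stores the argument at creation time as well.
bound = [partial(lambda a: a, i) for i in range(3)]

late_results = [f() for f in late]     # [2, 2, 2]
early_results = [f() for f in early]   # [0, 1, 2]
bound_results = [f() for f in bound]   # [0, 1, 2]
print(late_results, early_results, bound_results)
```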

Friday, November 9, 2012

Thursday, January 5, 2012

__dict__ and vars()

vars() is the lesser-known and arguably more Pythonic way to get at the __dict__ attribute (available on instances of all classes that do not define __slots__). From the docs:

Without an argument, act like locals().

With a module, class or class instance object as argument (or anything else that has a __dict__ attribute), return that attribute.
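A short sketch of what those two sentences mean in practice.  The classes Point and SlottedPoint and the function local_names are made-up names for illustration.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint(object):
    __slots__ = ("x", "y")  # instances of this class have no __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


def local_names():
    x = 1
    return sorted(vars())  # no argument: behaves like locals()


p = Point(1, 2)
same_object = vars(p) is p.__dict__  # True: vars() returns the very same dict

try:
    vars(SlottedPoint(1, 2))
    slotted_ok = True
except TypeError:  # no __dict__ to return
    slotted_ok = False

print(vars(p), same_object, slotted_ok, local_names())
```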

As core Python contributor Raymond Hettinger points out on Twitter:
"Most folks prefer len(s) to s.__len__(), prefer next(it) to it.next(), but forget to use vars(s) rather than s.__dict__"

However, I think there's room to argue that s.__dict__ may be a little clearer than vars(s). After all, (almost) everything in Python is an object, and very nearly all objects are backed by dicts. Furthermore, s.__dict__ is faster:

>>> from timeit import timeit
>>> timeit('object.__dict__;object.__dict__')
>>> timeit('vars(object);vars(object)')

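Four million lookups later (timeit defaults to one million iterations, and each of the two statements does two lookups), we see that grabbing the __dict__ is over four and a half times faster. This should come as no surprise: vars(s) has to look up the name vars among the builtins and then make a function call, which is generally slower than a plain attribute access.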
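To repeat the measurement on an ordinary instance rather than on the object type itself, something along these lines should work on Python 3.5 or later (the globals argument to timeit is assumed to be available; Thing is a made-up class).  No timings are quoted here because they vary by machine and interpreter version.

```python
from timeit import timeit


class Thing(object):
    def __init__(self):
        self.value = 1


thing = Thing()
namespace = {"thing": thing}
n = 100000  # far fewer than timeit's default of 1,000,000, to keep it quick

# Each call returns the total seconds taken to run the statement n times.
attr_time = timeit("thing.__dict__", globals=namespace, number=n)
vars_time = timeit("vars(thing)", globals=namespace, number=n)

print("thing.__dict__: %.4fs" % attr_time)
print("vars(thing):    %.4fs" % vars_time)
```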