Sunday, December 25, 2011

Detecting new- and old-style classes

Be it typos, old libraries, or just backwards-compatibility, old-style classes are Python relics that have a tendency to creep into your shiny new programs. Given that new-style classes are almost certainly what you want, it can't hurt to know how to detect the difference.

>>> # Using Python 2
...
>>> class old_style(): # doesn't inherit from object
...     pass
...
>>> class new_style(object): # does inherit from object
...     pass
...
>>> old_object = old_style()
>>> new_object = new_style()
>>> type(old_object) == old_object.__class__
False
>>> type(new_object) == new_object.__class__
True


For the inquisitive, new-style classes unified the concept of class and type in Python 2.2. For a complete description, check out Guido's essay on the topic, circa 2002.
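
Another way to see the unification at work (still on Python 2) is to ask for the type of the class itself: an old-style class is an instance of classobj, while a new-style class is an instance of type.

>>> type(old_style)
<type 'classobj'>
>>> type(new_style)
<type 'type'>
>>> isinstance(new_style, type)
True
>>> isinstance(old_style, type)
False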

Merry Guido-mas, and may you only find new-style classes under your AST.

Thursday, December 22, 2011

Easily import a dynamically created module

Python modules are typically Python source files that are used like simple namespaces for variables. Take this simple module:

mod.py:

x = "Hi, I'm a module variable."

Provided mod.py is on the import path (which usually includes the current working directory), you can run Python and see:

>>> import mod
>>> print mod.x
Hi, I'm a module variable.


Of course, this is Python and Python modules are objects of type 'module'. You can dynamically create an object of this type easily enough:

>>> from types import ModuleType
>>> mymodule = ModuleType('mymodule')
>>> mymodule.x = "Hi, I'm in a dynamically-created module."
>>> print mymodule.x
Hi, I'm in a dynamically-created module.

Pretty much what you'd expect. ModuleType is the same thing you'd get back from import sys; type(sys). That's actually what the types module uses.
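
A quick check:

>>> import sys
>>> from types import ModuleType
>>> type(sys) is ModuleType
True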

But, you can't import mymodule because it's not a real, source-based module on the import path. If you want other Pythonauts to be able to use standard import syntax to import your dynamic module, a little more trickery is involved.

dynmod.py:

import sys
from types import ModuleType

not_present = "Hi, I was masked by the dynamic module."

dm = ModuleType(__name__)
dm.x = 5

sys.modules[__name__] = dm


If you're doing this you probably already know, but __name__ is just a string containing the name of the current module, based on the filename ('dynmod'). Using __name__ just keeps things a bit more consistent.

Now, look what happens when you import dynmod:

>>> import dynmod
>>> dynmod.x
5
>>> dynmod.not_present
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'not_present'
>>> dir(dynmod)
['__doc__', '__name__', 'x']


All the normal import syntax should work, including from dynmod import x, even though the module isn't strictly what's in the source file.

The dynamic behavior is only executed once, on initial import. When imported, Python modules create a name->module_object entry in the sys.modules dictionary. From then on, if the module name is found in sys.modules, the associated module object is used. That's how normal modules work, and that's how we want our dynamic module to work, too.
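
You can see the cache in action: importing the module a second time, under any name, just hands back the object already sitting in sys.modules.

>>> import sys
>>> import dynmod
>>> sys.modules['dynmod'] is dynmod
True
>>> import dynmod as dynmod2   # the second import is just a sys.modules lookup
>>> dynmod2 is dynmod
True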

The key here is that the sys.modules entry seems to be created before the module code is run, allowing the module code to modify its own entry. Be careful with this, because Python's import system has some complex locking behavior that can make certain manipulations not work:

dynmod_broken.py:

# Breaks some import locks or something
import sys
from types import ModuleType

not_present = "masked"

sys.modules[__name__] = ModuleType(__name__)
sys.modules[__name__].x = 5

(In fact, according to my testing, this left every single variable in the namespace set to None. A likely culprit: replacing the sys.modules entry drops the last reference to the original module object, and when CPython tears down a dying module it sets everything in its namespace to None. Pretty PDW, if you ask me.)

If that's not interesting enough for you, then stay tuned for the follow-up post on more advanced dynamic modules, where we show you how to use modules' simple and familiar nature to do your darkest bidding.

(Note: if you're looking for the right way to simply mask certain variables, consider using __all__. Also, you could probably do more fancy things with Import Hooks, but this is tested to work on Python 2.7 and 3.1, so maybe keep it simple, if simple's all you need.)
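
For reference, __all__ only controls what a star-import pulls in. A minimal sketch, with a file name chosen just for illustration:

masked.py:

__all__ = ['x']     # only x comes through "from masked import *"
x = 5
_hidden = "still reachable as masked._hidden, just not star-imported"

Explicit access to masked._hidden still works; __all__ is a courtesy, not access control.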

Tuesday, December 13, 2011

when is a no-op not a no-op?

When is a = a not a no-op?  When a is a class variable accessed through self.a: the read finds the attribute on the class, and the assignment copies it into the instance's own __dict__.  This can be useful when you want a dynamic per-instance default defined on the class, for example a time_created attribute (see the sketch after the example).

>>> class A(object):
...    w = 4
...    def __init__(self):
...       self.w = self.w
...
>>> A()
<__main__.A object at 0x00DAE770>
>>> A().__dict__
{'w': 4}
>>> a = A()
>>> a.__dict__
{'w': 4}
>>> a.w = 5
>>> A.w
4
>>> a.w
5
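
For the time_created example, one way to wire it up (a sketch; the Now descriptor and Record class here are purely illustrative) is a non-data descriptor on the class: it defines no __set__, so the self.time_created = self.time_created line in __init__ reads the live value and then shadows the descriptor with a plain instance attribute.

import datetime

class Now(object):
    # non-data descriptor: no __set__, so instance attributes can shadow it
    def __get__(self, obj, objtype=None):
        return datetime.datetime.now()

class Record(object):
    time_created = Now()                        # dynamic default, lives on the class
    def __init__(self):
        self.time_created = self.time_created   # snapshot into the instance __dict__

After construction, record.time_created stays frozen at its creation time, while Record.time_created keeps returning the current time.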

Monday, November 21, 2011

metaclass depth > call depth

How deep can a metaclass tree go?
type(A) = B, type(B) = C, type(C) = D...?
Can it go further than the function call depth?

def tower(n=100, sofar=type):
   if n == 0:
      return sofar
   class next(sofar):
      __metaclass__ = sofar
   return tower(n-1, next)

>>> tower(997)
<class '__main__.next'>
>>> tower(998)
#...
  File "<stdin>", line 3, in tower
RuntimeError: maximum recursion depth exceeded while calling a Python object

Friday, October 21, 2011

Class is Set of Instances

This shows how to make a class also be a set of its own instances, a conceptually simpler way of maintaining a global registry of instances.

>>> class SetMetaClass(type):
...    def __new__(mcls, *a, **kw):
...       cls = type.__new__(mcls, *a, **kw)
...       cls.all = set()   # keep the registry on the class itself, not the metaclass
...       return cls
...    def __iter__(cls): return cls.all.__iter__()
...    def __len__(cls): return len(cls.all)
...    def __contains__(cls, e): return e in cls.all
...    def add(cls, e): cls.all.add(e)
...    def discard(cls, e): cls.all.discard(e)
...
>>> class InstanceSet(object):
...    __metaclass__ = SetMetaClass
...    def __init__(self, *a, **kw):
...       self.__class__.add(self)
...
>>> InstanceSet()
<__main__.InstanceSet object at 0x00DAB0F0>
>>> InstanceSet()
<__main__.InstanceSet object at 0x00D23610>
>>> [a for a in InstanceSet]
[<__main__.InstanceSet object at 0x00DAB0F0>, <__main__.InstanceSet object at 0x
00D23610>]
>>> len(InstanceSet)
2

Tuesday, October 11, 2011

Decoration? Instantiation? Decorstantiation

>>> class Decorstantiator(object):
...    def __init__(self, func): self.func = func
...    def __call__(self): print "foo"; self.func()
...
>>> @Decorstantiator
... def bar(): print "bar"
...
>>> bar()
foo
bar

Sunday, October 9, 2011

I can DoAnything

Sometimes you just want to test a piece of code that does deep calls into other objects in isolation.
For example:
def _change(self, op, val):
    self.parent.context.change(self.parent.model, self.attr, op, val)
How can we create an object to stand in for parent in the above code?
class DoAnything(object):
   def __call__(self, *a, **kw): return self
   __getattr__ = __add__ = __sub__ = __mul__ = __div__ = __call__
   #overload more operators to taste/requirements

>>> parent = DoAnything()
>>> parent.context.change(parent.model, "foo", "bar", "baz")
<__main__.DoAnything object at 0x02169B10>

Friday, September 30, 2011

__slots__ cannot be updated

>>> class Slots(object):  __slots__ = ()
...
>>> Slots.__slots__ = ('t',)
>>> Slots.__slots__
('t',)
>>> s2 = Slots()
>>> s2.t = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Slots' object has no attribute 't'
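
The reason: __slots__ is only consulted once, when type() builds the class; that is when the slot descriptors are created. Assigning to __slots__ afterwards just rebinds an ordinary class attribute and creates no descriptor, so there is still nowhere to store t. Compare with a class that declared the slot up front:

>>> class Slots2(object):  __slots__ = ('t',)
...
>>> Slots2.t
<member 't' of 'Slots2' objects>
>>> Slots.t
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'Slots' has no attribute 't'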

namedtupledict

With three kinds of member access (attribute, index, and key), it is easier to use than ever.


>>> import collections
>>> def namedtupledict(*a, **kw):
...    namedtuple = collections.namedtuple(*a, **kw)
...    def getitem(self, key):
...       if isinstance(key, str):
...           return getattr(self, key)
...       return tuple.__getitem__(self, key)
...    namedtuple.__getitem__ = getitem
...    return namedtuple
...
>>> t = namedtupledict('t', 'a b c')
>>> t1 = t(1,2,3)
>>> t1.a
1
>>> t1[0]
1
>>> t1['a']
1

Wednesday, September 21, 2011

I'm my own grandpa.

A fellow Pythonaut had a simple problem...
How to test code with dependencies on the date?  Specifically, a Django project with models that have DateTimeFields with auto_now=True.

The chosen approach was to monkey-patch datetime.date.today() in order to return a constant value.
The first snag: datetime.date is a C type, so its methods can't be monkey-patched in place (attributes of built-in types are read-only).

This can be solved by monkey-patching the entire datetime.date class.  The most straightforward thing to replace datetime.date with is a new subclass of datetime.date, whose only change is to override the today() function.

The second problem: there are now two kinds of datetime.date floating around the system: those returned by the patched datetime.date.today(), and those created by every other function, which, being C code, still produces instances of the original class.

Specifically, DateTimeField validates isinstance(data, datetime.date) before saving to the database.

A class which is simultaneously a superclass and a subclass of datetime.date is required.
Override __subclasshook__ so that instances of the original datetime.date still pass isinstance checks against the replacement.

>>> import abc, datetime
>>> class DateProxy(datetime.date):
...    __metaclass__ = abc.ABCMeta
...    @classmethod
...    def __subclasshook__(cls, C): return True
...    @classmethod
...    def today(cls): return datetime.date(2011,9,20)
...
>>> datetime.date = DateProxy
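
A quick sanity check, assuming the DateProxy above has been installed: dates still produced by untouched C code are not instances of the proxy type, but the isinstance test Django relies on keeps passing.

>>> real_date = datetime.datetime.now().date()   # built by the original C type
>>> type(real_date) is datetime.date
False
>>> isinstance(real_date, datetime.date)
True
>>> datetime.date.today() == datetime.date(2011, 9, 20)
True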


In other words, the superclass is also a subclass.  The parent of the parent being the grandparent, this class is its own grandpa.

Saturday, August 27, 2011

Incrementing and Decrementing

Depending on the language/style you're coming from, you may have run into this very early or not at all:
>>> x = 1
>>> ++x
1
>>> x
1
>>> x++
  File "<stdin>", line 1
    x++
       ^
SyntaxError: invalid syntax

Somewhat confusing! The supposed pre-increment operator doesn't generate an error, but it doesn't work, either. The post-increment raises a syntax error. Same thing happens with the decrement operator (--).

The reason for this madness is that Python doesn't have increment and decrement operators like those in C/C++/PHP. The "increment operator" is actually being interpreted as two identity operators, i.e., vanilla plus signs:
+(+x)
This usually has no effect on Python number types. The "decrement operator" double-negates the number, again, having no apparent effect.


So why doesn't Python have increment/decrement operators? Probably to keep the language and its parser simple. Guido says simple is better than complex, so Python's grammar stays no more complicated than LL(1) requires. Not that an LL(1) grammar couldn't express this; the extra complexity just isn't necessary when you can do:
>>> x = x + 1 # or
>>> x += 1
You only need one right way to do things, and here you have two. Don't get greedy.



Metablog note: Gist embeds somewhat nicely in Blogger, but doesn't show up in the RSS, so we're back to monospace blockquotes.

Wednesday, August 24, 2011

Range woes

If you've ever iterated, you probably know about the built-in range() and xrange() functions. They usually do the job, but neither is perfect. xrange() can't deal with numbers larger than the system max integer, while range() chokes on very long ranges:
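
Roughly, something like this (a sketch; the exact exception types and messages depend on the platform and Python version):

>>> import sys
>>> xrange(sys.maxint + 1)   # raises OverflowError: xrange is limited to C-long-sized values
>>> range(10 ** 10)          # raises MemoryError trying to build the entire list at once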

If you simply must have numbers larger than sys.maxint, use range() and keep the resulting list short by passing a start argument close to the stop.

Thursday, August 18, 2011

Passing Parameters 4 Ways at Once


>>> def foo(a,b,c,d): print a, b, c, d
...
>>> foo(1, *(2,), c=3, **{'d':4} )   # positional, *-unpacked, keyword, **-unpacked
1 2 3 4

Friday, August 5, 2011

Stealth Metaclass

>>> class StealthMeta(type):
...    def __new__(cls, name, bases, attrs):
...        del attrs['__metaclass__']
...        return type.__new__(cls, name, bases, attrs)
...
>>> class A(object):
...    __metaclass__ = StealthMeta
...
>>> A.__metaclass__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'A' has no attribute '__metaclass__'
>>> isinstance(A, StealthMeta)
True

Decorate then Meta, or Meta then Decorate?

Which is applied second, a meta-class or a class decorator?

>>> def add_a(cls):
...    cls.a = "decorator!"
...    return cls
...
>>> class MetaA(type):
...    def __new__(cls, name, bases, attrs):
...       attrs["a"] = "metaclass!"
...       return type.__new__(cls, name, bases, attrs)
...
>>> @add_a
... class Test(object):
...    __metaclass__=MetaA
...
>>> Test.a
'decorator!'
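
So the decorator wins: the metaclass runs first, as part of creating the class object, and the decorator is applied afterwards to the finished class, so its assignment to a is the one that sticks.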

Monday, July 18, 2011

String Concatenation

Basic introduction to Python:
"Two string literals next to each other are automatically concatenated"


>>> "string" "ing"
'stringing'
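
One place this is genuinely useful: splitting a long literal across lines inside parentheses, with no backslashes or + needed.

>>> msg = ("adjacent literals are glued together at compile time, "
...        "which is handy for long messages")
>>> msg
'adjacent literals are glued together at compile time, which is handy for long messages'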

Thursday, June 23, 2011

Dict Destroyer!

Let's make a class that returns a random number for its hash.

>>> import random
>>> class DictDestroyer(object):
...    def __hash__(self):  return random.randint(0, 2**31)
...
>>> d = DictDestroyer()
>>> hash(d)
520880499
>>> hash(d)
793941724

Now, DictDestroyer lives up to its name:

>>> a = {d:1, d:2, d:3}
>>> import pprint
>>> pprint.pprint(a)
{<__main__.DictDestroyer object at 0x00CE8E90>: 1,
 <__main__.DictDestroyer object at 0x00CE8E90>: 2,
 <__main__.DictDestroyer object at 0x00CE8E90>: 3}

What is this?  One key has multiple values?  What happens when data is fetched?

>>> a[d]
1
>>> a[d]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: <__main__.DictDestroyer object at 0x00CE8E90>

Inconsistent behavior.  Some times it returns one of the values, other times it gives a key error.
The C source for Python dictionaries (lookdict in dictobject.c) holds the answer.  Each insertion of d probes the table starting from a fresh random hash and only notices an existing entry if the probe happens to pass over a slot holding the identical key object, so the three d:value pairs usually land in separate slots.  Lookups do the same dance: a[d] probes from yet another random hash, returns the first slot it visits whose key is d, and raises KeyError if it reaches an empty slot first.

Wednesday, June 22, 2011

cron-like functionality (sched module)

The Python standard library includes a handy little scheduler that gives you cron-like behavior.  The following code prints the current time every 5 seconds, forever.

>>> import sched
>>> import datetime
>>> import time
>>> s = sched.scheduler(time.time, time.sleep)
>>> def print_time():
...    print datetime.datetime.now()
...    s.enter(5, 1, print_time, ())
...
>>> print_time()
2011-06-22 17:56:41.262000
>>> s.run()
2011-06-22 17:56:46.668000
2011-06-22 17:56:51.669000
2011-06-22 17:56:56.669000
2011-06-22 17:57:01.669000

This is a simple example of continuation passing style -- where functions "schedule" each other to run at a later time rather than calling directly.

Sunday, May 29, 2011

inner functions, how do they work?

When a higher-order function returns a new function, a new context (a closure) has been wrapped around an existing code object.  That code object is created once, at compile time; no compiler runs and no new bytecode is generated when the outer function is called.

>>> def foo(n):
...    def bar():
...       print n
...    return bar
...
>>> a,b = foo(1), foo(2)
>>> a == b
False
>>> a()
1
>>> b()
2
>>> a.__code__ == b.__code__
True

Wednesday, May 18, 2011

__class__ is special

>>> class Classless(object):
...    def __getattr__(self, name): raise AttributeError(name)
...
>>> Classless().__class__
<class '__main__.Classless'>
>>> c = Classless()
>>> c.__class__ = None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __class__ must be set to new-style class, not 'NoneType' object

Monday, May 16, 2011

the keys to the kingdom

Most of the low-level guts of an operating system are only exposed to C.  Enter ctypes, part of the Python standard library, which lets Python call functions in C shared libraries directly.

For example, say you are trying to access some function of open-ssl that is not exposed through the ssl module.

>>> import ctypes
>>> libssl = ctypes.cdll.LoadLibrary('libssl.so')
>>> libssl.PEM_read_bio_PrivateKey
<_FuncPtr object at 0x7fc001ba3a10>

With ctypes, your code is a full peer of C/C++ in how it can interact with the OS.
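
Calling into a library usually means declaring the C signature first. Here is a minimal sketch against libc (the library and function chosen here are just for illustration; adjust for your platform):

import ctypes
import ctypes.util

# find_library resolves the platform-specific file name ('libc.so.6', 'msvcrt', ...)
libc = ctypes.CDLL(ctypes.util.find_library('c'))

# declare argument and return types so ctypes converts values correctly
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print libc.strlen("hello")   # prints 5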

Thursday, May 5, 2011

reloading a module does not regenerate the module object

When a module is reloaded, the module object is not removed from memory.  Old data which is not overwritten will stick around.

>>> import json
>>> oldjson = json
>>> reload(json)
<module 'json' from 'C:\Python26\lib\json\__init__.pyc'>
>>> json is oldjson
True
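
One way to see the leftovers: hang an attribute on the module that the source never assigns, and it survives the reload.

>>> json.stale = "left over"
>>> reload(json)
<module 'json' from 'C:\Python26\lib\json\__init__.pyc'>
>>> json.stale
'left over'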

To truly clean up modules in memory is a tricky process.

Wednesday, April 20, 2011

making dict(myobject) do something useful

>>> class T(object):
...    def __iter__(self):
...       return self.__dict__.items().__iter__()
...
>>> t = T()
>>> t.a = "cat"; t.b = "dog"
>>> dict(t)
{'a': 'cat', 'b': 'dog'}

Thursday, March 31, 2011

every string is an iterable of strings

>>> 'a'[0][0][0][0][0]
'a'
Python does not actually create a bunch of string objects when you do this.

>>> a = 'a'
>>> id(a) == id(a[0][0][0][0])
True

Thursday, March 17, 2011

Negative Integer Gotchas (mod and div)

Quick!  What are the values of these expressions: -5%100, -5/100?
If you said -5 and 0, you'd be right... if this were C or Java.
>>> print -5%100, -5/100
95 -1
Some more fun with negative modulos.
>>> 3%-1
0
>>> 3%-2
-1
>>> 3%-3
0
>>> 3%-4
-1
>>> 3%-5
-2
>>> 3%-6
-3
>>> 3%-7
-4
Why does Python have a different behavior than C, Java, Fortran et al?
The decree of the Benevolent Dictator For Life!
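
The rule behind the numbers: Python's integer division floors (rounds toward negative infinity) and the remainder takes the sign of the divisor, which keeps the identity (a // b) * b + a % b == a intact.

>>> a, b = 3, -7
>>> a // b, a % b
(-1, -4)
>>> (a // b) * b + a % b == a
True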

Monday, March 14, 2011

Local Variable Performance & dis

Local variables are faster in Python.  They are read by index out of a fixed-size array on the frame, rather than looked up in a dictionary the way globals are.

Here is an example showing the use of local variables as a performance optimization.  http://wiki.python.org/moin/PythonSpeed/PerformanceTips#Local_Variables

Using the dis module, we can see the difference.  Note LOAD_DEREF (a closure-cell lookup) versus LOAD_FAST (the indexed local array).

>>> import dis
>>> def foo():
...   a = 1
...   def bar():
...      return a
...   return bar
...
>>> b = foo()
>>> dis.dis(b)
  4           0 LOAD_DEREF               0 (a)
              3 RETURN_VALUE
>>> def loc():
...    a = 1
...    return a
...
>>> dis.dis(loc)
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (a)


  3           6 LOAD_FAST                0 (a)
              9 RETURN_VALUE
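
For comparison, a plain module-level global goes through LOAD_GLOBAL, the dictionary-lookup path:

>>> x = 1
>>> def glob():
...    return x
...
>>> dis.dis(glob)
  2           0 LOAD_GLOBAL              0 (x)
              3 RETURN_VALUE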

Monday, February 28, 2011

PHP-like string concatenation (almost)

class PHPstr(str):
   def __getattr__(self, attr):
      return PHPstr(self+attr)

>>> a = PHPstr("test")
>>> a.test
'testtest'

Almost, but not quite the concatenation operator of PHP strings.

Friday, January 21, 2011

custom string class

Here is how to make a subclass of string which has all the same operations available, but which returns instances of that subclass:

import inspect
import functools

class MyStr(str):
    def __radd__(self, other): 
        return MyStr(str.__add__(other, self))
    def __repr__(self): 
        return self.__class__.__name__ + str.__repr__(self)

def _mystr_wrap(f):
    @functools.wraps(f, ('__name__','__doc__'))
    def g(*a, **kw):
        res = f(*a, **kw)
        if res.__class__ == str:
            return MyStr(res)
        return res
    return g

for name, f in inspect.getmembers(str, callable):
    #skip some special cases
    if name not in ['__class__', '__new__', '__str__', '__init__', '__repr__']:
        setattr(MyStr, name, _mystr_wrap(f))

>>> MyStr("cat")
MyStr'cat'
>>> MyStr("cat")+"dog"
MyStr'catdog'
>>> MyStr("cat")[1:]
MyStr'at'


Note: if you find yourself actually doing this, it is probably a bad code smell :-)

Wednesday, January 5, 2011

making your class work with built-in math functions

Just add __float__ and/or __int__.  Then the built-in math functions, or anything that sanitizes its input via int() or float(), will see your objects as a numeric type.


>>> import math

>>> class B(object):
...    def __float__(self): return 0.5
...
>>> math.sin(B())
0.47942553860420301
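
Likewise, anything that normalizes its input through int() picks up __int__:

>>> class C(object):
...    def __int__(self): return 7
...
>>> int(C())
7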