Consider a REPL with two tuples, a and b.

>>> type(a), type(b)

(<type 'tuple'>, <type 'tuple'>)

>>> a == b

True

So far, so good. But let's dig deeper...

>>> a[0] == b[0]

False

The tuples are equal, but their contents are not.

>>> a is b

True

In fact, there was only ever one tuple.

What is this madness?

>>> a

(nan,)

Welcome to the float zone.

Many parts of Python assume that a is b implies a == b, but floats break this assumption, because NaN is not equal even to itself. They also break the assumption that hash(a) == hash(b) implies a == b.

>>> hash(float('nan')) == hash(float('nan'))

True

Dicts handle this pretty elegantly:

>>> n = float('nan')

>>> {n: 1}[n]

1

>>> a = {float('nan'): 1, float('nan'): 2}

>>> a

{nan: 1, nan: 2}
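For anyone who wants to reproduce the session, here is a minimal sketch. The post never shows how a and b were created, so the setup lines are an assumption, and the names lookup and two_nans are made up for illustration. It sticks to behavior that should also hold on current CPython 3.

```python
# Assumed setup: the post never shows how a and b were created.
nan = float('nan')
a = (nan,)
b = a  # a second name for the same tuple, not a copy

# NaN is never equal to anything, itself included.
assert nan != nan
assert a[0] != b[0]

# Container comparisons check identity before equality,
# so the very same NaN object counts as a match.
assert a is b
assert a == b
assert nan in [nan]

# Dict lookup uses the same shortcut: the identical object is found,
# while two distinct NaN objects remain two separate keys.
lookup = {nan: 1}
assert lookup[nan] == 1
two_nans = {float('nan'): 1, float('nan'): 2}
assert len(two_nans) == 2
```

The identity check is what lets the tuple compare equal to itself even though its only element does not.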

## Thursday, September 12, 2019

## Monday, June 3, 2019

### They say a python tuple can't contain itself...

... but here at PDW we abhor that kind of defeatism!

>>> import ctypes

>>> tup = (None,)

>>> ctypes.pythonapi.PyTuple_SetItem.argtypes = ctypes.c_void_p, ctypes.c_ssize_t, ctypes.c_void_p

>>> ctypes.pythonapi.PyTuple_SetItem(id(tup), 0, id(tup))

0

Showing the tuple itself is a little problematic:

>>> tup

# ... hundreds of lines of parens ...

(((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

(((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

(((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

(((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

(((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

((Segmentation fault
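A gentler route to a similar result, offered as a sketch rather than anything from the original session: a tuple can hold a mutable object that is later made to refer back to it, with no ctypes involved. The names loop and inner are made up for illustration.

```python
# A list can contain itself directly, and repr() copes with the cycle.
loop = []
loop.append(loop)
assert loop[0] is loop
print(repr(loop))  # [[...]]

# A tuple cannot be changed after creation, but it can hold a list
# that is later made to point back at the tuple.
inner = []
tup = (inner,)
inner.append(tup)
assert tup[0][0] is tup
```

Here the tuple contains itself only indirectly, through the list, which is as far as the language goes without reaching into the C API.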


## Wednesday, January 23, 2019

### So a list and a tuple walk into a sum()

As a direct side effect of glom's 19.1.0 release, the authors here at PDW got to re-experience one of the more surprising behaviors of three of Python's most basic constructs: list, tuple, and sum().


Most experienced developers know the quickest way to combine a short list of short lists:

list_of_lists = [[1], [2], [3, 4]]

sum(list_of_lists, [])

# [1, 2, 3, 4]

Ah, nice and flat, much better.

But what happens when we throw a tuple into the mix:

list_of_seqs = [[1], [2], (3, 4)]

sum(list_of_seqs, [])

# TypeError: can only concatenate list (not "tuple") to list

This is kind of surprising! Especially when you consider this:

seq = [1, 2]

seq += (3, 4)

# [1, 2, 3, 4]

Why should sum() fail when addition succeeds?! We'll get to that.

new_list = [1, 2] + (3, 4)

# TypeError: can only concatenate list (not "tuple") to list

There's that error again!

The trick here is that Python has two addition operators. The simple "+" or "add" operator, used by sum(), and the more nuanced "+=" or "iadd" operator, add's inplace variant.

But why is it ok for one addition to error and the other to succeed?

Symmetry. And maybe commutativity if you remember that math class.

"+" in Python is symmetric in type: A + B and B + A should always yield the same kind of result. To do otherwise would be more surprising than any of the surprises above. list and tuple cannot be added with this operator because in a mixed-type situation, the return type would change based on ordering.

Meanwhile, "+=" is asymmetric. The left side of the statement determines the type of the return completely. A += B keeps A's type. A straightforward, Pythonic reason if there ever was one.
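The two operators can be poked at directly through the operator module. This is a small sketch; the variable names (alias, add_error, tuple_error) are made up for illustration.

```python
import operator

# "+" maps to operator.add and refuses to mix list and tuple.
try:
    operator.add([1, 2], (3, 4))
except TypeError as exc:
    add_error = exc
else:
    add_error = None

# "+=" maps to operator.iadd; for a list it extends in place
# with any iterable, so the left operand keeps its type and identity.
seq = [1, 2]
alias = seq
seq += (3, 4)
assert seq == [1, 2, 3, 4]
assert alias is seq

# The asymmetry cuts both ways: a tuple on the left rejects a list.
tup = (1, 2)
try:
    tup += [3, 4]
except TypeError as exc:
    tuple_error = exc
else:
    tuple_error = None
```

Both failures are TypeErrors, and in the list case the left side stays the very same object after the "+=".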

Going back to the start of our story, by building on operator.iadd, glom's new flatten() function avoids sum()'s error-raising behavior and works wonders on all manner of nested iterables.
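To make the idea concrete, here is a rough sketch of an iadd-based flattener. It is not glom's implementation, and flatten_one_level is a hypothetical helper name; it only shows how swapping "+" for "+=" sidesteps the TypeError.

```python
import functools
import operator

def flatten_one_level(seqs):
    """Concatenate one level of nested iterables into a new list.

    Hypothetical helper for illustration; not glom's implementation.
    """
    # Start from a fresh list so no input is mutated; operator.iadd
    # then extends that accumulator in place with each item.
    return functools.reduce(operator.iadd, seqs, [])

list_of_seqs = [[1], [2], (3, 4)]
flat = flatten_one_level(list_of_seqs)
assert flat == [1, 2, 3, 4]
assert list_of_seqs == [[1], [2], (3, 4)]
```

Passing a fresh empty list as the initial value matters: without it, reduce would use the first input as the accumulator and "+=" would modify it in place.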
