Wednesday, January 23, 2019

So a list and a tuple walk into a sum()

As a direct side effect of glom's 19.1.0 release, the authors here at PDW got to re-experience one of the more surprising behaviors of three of Python's most basic constructs:
Most experienced developers know the quickest way to combine a short list of short lists:
list_of_lists = [[1], [2], [3, 4]]
sum(list_of_lists, [])
# [1, 2, 3, 4]
Ah, nice and flat, much better.

But what happens when we throw a tuple into the mix:
list_of_seqs = [[1], [2], (3, 4)]
sum(list_of_seqs, [])
# TypeError: can only concatenate list (not "tuple") to list
This is kind of surprising! Especially when you consider this:
seq = [1, 2]
seq += (3, 4)
# [1, 2, 3, 4]
Why should sum() fail when addition succeeds?! We'll get to that.
new_list = [1, 2] + (3, 4)
# TypeError: can only concatenate list (not "tuple") to list
There's that error again!

The trick here is that Python has two addition operators. The simple "+" or "add" operator, used by sum(), and the more nuanced "+=" or "iadd" operator, add's inplace variant.

But why is ok for one addition to error and the other to succeed?

Symmetry. And maybe commutativity if you remember that math class.

"+" in Python is symmetric: A + B and B + A should always yield the same result. To do otherwise would be more surprising than any of the surprises above. list and tuple cannot be added with this operator because in a mixed-type situation, the return type would change based on ordering.

Meanwhile, "+=" is asymmetric. The left side of the statement determines the type of the return completely. A += B keeps A's type. A straightforward, Pythonic reason if there ever was one.

Going back to the start of our story, by building on operator.iadd, glom's new flatten() function avoids sum()'s error-raising behavior and works wonders on all manner of nesting iterable.

13 comments:

  1. Wow, this is new to me. Thanks for sharing it.

    ReplyDelete
  2. "+" is not very symmetric when you're adding strings... (or lists, for that matter, [1] + [2] != [2] + [1]).

    ReplyDelete
  3. >>> a=[4,6]
    >>> a += (i**2 for i in range(4))
    >>> a
    [4, 6, 0, 1, 4, 9]

    ReplyDelete
  4. Positive site, where did u come up with the information on this posting?I have read a few of the articles on your website now, and I really like your style. Thanks a million and please keep up the effective work. satta king

    ReplyDelete
  5. A round of applause for your article.Much thanks again. Really Cool. Satta king

    ReplyDelete
  6. Might as well look at itertools.chain

    ReplyDelete
  7. Such a very useful article. Very interesting to read this article. I would like to thank you for the efforts you had made for writing this awesome article.
    Data Science Course in Pune
    Data Science Training in Pune

    ReplyDelete
  8. All of these posts were incredible perfect. It would be great if you’ll post more updates.
    Data Science Course in Bangalore

    ReplyDelete
  9. These thoughts just blew my mind. I am glad you have posted this.
    Data Science Training in Bangalore

    ReplyDelete
  10. I am really enjoying reading your well written articles. It looks like you spend a lot of effort and time on your blog. I have bookmarked it and I am looking forward to reading new articles. Keep up the good work.
    Data Science Institute in Bangalore

    ReplyDelete
  11. Such a very useful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article.
    Data Science Certification in Bangalore

    ReplyDelete