Thursday, May 5, 2016

String optimization in Python

Strings are terribly important in programming. A program without some form of string input, manipulation, and output is a rarity.

Of course this means that speed and sanity surrounding string features is important. One important feature of Python is string immutability. This opens up dozens of features, such as using strings as dictionary keys, but there are some downsides.

Immutable strings means that any string manipulation, such as splitting or appending, is making a copy of that string. This can become a performance problem, especially in a world where zero-copy is one of the favorite general optimization techniques. If you've done enough string mutation, you're probably aware of the following techniques:
But in some cases Python uses the immutability to avoid making copies:
>>> a = 'a' * 1024 * 1024  # a 1 megabyte string
>>> z = '' + a
>>> z is a
True
Here, because adding an empty string does not change the value, z is the same exact string object as a. And it doesn't matter how many times you append an empty string:
>>> z = '' + '' + '' + a
>>> z is a
True
It even works when a is the only item in a list:
>>> z = ''.join([a])
>>> z is a
True
But it falls apart when you put an empty string in the list with a:
>>> z = ''.join(['', a])
>>> z is a
False
And unfortunately even the first example seems to make a copy on PyPy:
>>>> a = 'a' * 1024 * 1024  # a 1 megabyte string again
>>>> z = '' + a
>>>> z is a 
False 
Although something more advanced may be going on under the covers, as is often the case with PyPy.

I'm almost done stringing you along, but as a corollary reminder:
Never rely on "is" checks with ints, floats, and strings. "==" and other value checks are what you need. As a general rule, "is" is for objects, None, and sometimes True/False.

Keep on stringifying!

Mahmoud
http://sedimental.org/
https://github.com/mahmoud
https://twitter.com/mhashemi

33 comments:

  1. I think the cpython behaviour is an optimisation possible because of reference counting. Cpython can tell when adding strings if it's the only reference, and can reuse the memory rather than copying in cases like above. Pypy doesn't use reference counting, so can't do the same trick, aiui

    ReplyDelete
    Replies
    1. While it's true that PyPy does not use CPython-style reference counting, I don't think that implies that PyPy can't have this optimization, per se. PyPy still has immutable strings, as that's a property of the Python language, not the runtime.

      Delete
  2. This could be related to interned strings: you can force any string to be unique and in a global table using the intern() function.

    ReplyDelete
    Replies
    1. You are correct about intern() and that may be the culprit with shorter strings, but you'll notice I created a 1MB string to start with. Very much hope our relatively lightweight CPython doesn't magically intern that :)

      Delete
  3. Thanks for sharing the information about the Python and keep updating us.This information is really useful to me.

    ReplyDelete
  4. I believe there are many more pleasurable opportunities ahead for individuals that looked at your site.
    python training in chennai

    ReplyDelete


  5. Australia Best Tutor is offering assignment help Services for all universities learners or students at an affordable price. Here All students are joining for best grades and better results.

    Visit Here

    Australia Best Tutor
    Sydney, NSW, Australia
    Call @ +61-730-407-305
    Live Chat @ https://www.australiabesttutor.com

    Our Services

    Online assignment help
    my assignment help
    assignment help
    help with assignment
    instant assignment help
    Assignment help Services

    ReplyDelete
  6. Good Post! Thank you so much for sharing this pretty post, it was so good to read and useful to improve my knowledge as updated one, keep blogging.
    Devops training in tambaram|Devops training in velachery|Devops training in annanagar|Devops training in sholinganallur

    ReplyDelete
  7. Great Article… I love to read your articles because your writing style is too good, its is very very helpful for all of us. Do check Six Sigma Training in Bangalore | Six Sigma Training in Dubai & Get trained by an expert who will enrich you with the latest trends.

    ReplyDelete
  8. Nice blog.Thank you for sharing your information and keep going on. See more: Python Online Training

    ReplyDelete
  9. This is very good content you share on this blog. it's very informative and provide me future related information.
    java training in annanagar | java training in chennai

    java training in chennai | java training in electronic city

    ReplyDelete
  10. Whoa! I’m enjoying the template/theme of this website. It’s simple, yet effective. A lot of times it’s very hard to get that “perfect balance” between superb usability and visual appeal. I must say you’ve done a very good job with this.


    Best AWS Training in Marathahalli | AWS Training in Marathahalli

    Amazon Web Services Training in Anna Nagar, Chennai |Best AWS Training in Anna Nagar, Chennai

    AWS Training in Velachery | Best AWS Course in Velachery,Chennai

    Best AWS Training in Chennai | AWS Training Institutes |Chennai,Velachery

    ReplyDelete
  11. Cool post, thanks. Speaking about online technologies. I suggest to keep your paperwork stored online.
    http://www.mac-torrent-download.website/

    ReplyDelete
  12. I really enjoy simply reading all of your weblogs. Simply wanted to inform you that you have people like me who appreciate your work. Definitely a great post I would like to read this
    angularjs online Training

    angularjs Training in marathahalli

    angularjs interview questions and answers

    angularjs Training in bangalore

    angularjs Training in bangalore

    angularjs online Training

    ReplyDelete
  13. I appreciate your efforts because it conveys the message of what you are trying to say. It's a great skill to make even the person who doesn't know about the subject could able to understand the subject . Your blogs are understandable and also elaborately described. I hope to read more and more interesting articles from your blog. All the best.
    Java training in Chennai | Java training institute in Chennai | Java course in Chennai

    Java training in Bangalore | Java training in Electronic city

    Java training in Bangalore | Java training in Marathahalli

    Java training in Bangalore | Java training in Btm layout

    ReplyDelete
  14. Wow it is really wonderful and awesome thus it is very much useful for me to understand many concepts and helped me a lot. it is really explainable very well and i got more information from your blog.
    python training in chennai
    python course institute in chennai

    ReplyDelete
  15. This is such a good post. One of the best posts that I\'ve read in my whole life. I am so happy that you chose this day to give me this. Please, continue to give me such valuable posts. Cheers!

    Data Science training in rajaji nagar | Data Science Training in Bangalore
    Data Science with Python training in chennai
    Data Science training in electronic city
    Data Science training in USA
    Data science training in pune

    ReplyDelete
  16. I always enjoy reading quality articles by an individual who is obviously knowledgeable on their chosen subject. Ill be watching this post with much interest. Keep up the great work, I will be back..Best Python Training Institute In Hyderabad | Best Python Online Training Institute

    ReplyDelete
  17. Great post! I am actually getting ready to across this information, It’s very helpful for this blog.Also great with all of the valuable information you have Keep up the good work you are doing well...Best Python Online Training Institute In Chennai

    ReplyDelete
  18. Softhax
    Find best android iOS apps free now. Install them on your phone with easily simple steps. Free android games, free applications.
    softhax

    ReplyDelete
  19. All are saying the same thing repeatedly, but in your blog I had a chance to get some useful and unique information, I love your writing style very much, I would like to suggest your blog in my dude circle, so keep on updates.
    Best Devops training in sholinganallur
    Devops training in velachery
    Devops training in annanagar
    Devops training in tambaram

    ReplyDelete
  20. Thank you so much for the post you do. I like your post and all you share with us is up to date and quite informative, schweiz vpn

    ReplyDelete
  21. Oceanofmobile
    Do you want to crack your softwares or want any generator and activator for personal computers and mac. Get serial keys and mode of any software to activate Stardock WindowBlinds Crack With Serial Key

    ReplyDelete

  22. Hello! This is my first visit to your blog! We are a team of volunteers and starting a new initiative in a community in the same niche. Your blog provided us useful information to work on. You have done an outstanding job.
    Advanced AWS Training in Chennai | Best Amazon Web Services Training in Chennai
    Best Amazon Web Services Training Course in Bangalore | AWS Training in Bangalore
    AWS Online Training and Certification | AWS Certification Course

    ReplyDelete
  23. This is an awesome post.Really very informative and creative contents. These concept is a good way to enhance the knowledge.I like it and help me to development very well.Thank you for this brief explanation and very nice information.Well, got a good knowledge.

    angularjs Training in bangalore

    angularjs Training in btm

    angularjs Training in electronic-city

    angularjs online Training

    angularjs Training in marathahalli

    ReplyDelete