<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-36456651</id><updated>2012-01-31T14:57:43.868-08:00</updated><category term='gil'/><category term='threads'/><category term='introduction'/><category term='python'/><category term='essential reference'/><title type='text'>Dabeaz</title><subtitle type='html'>Dave Beazley's  mondo computer blog.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>41</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-36456651.post-1178596837625720513</id><published>2012-01-31T05:10:00.000-08:00</published><updated>2012-01-31T05:13:11.980-08:00</updated><title type='text'>Drunk Tweeting in Chicago</title><content type='html'>&lt;p&gt;
Lately, I've been messing around with the &lt;a href="http://pypi.python.org/pypi/requests"&gt;requests&lt;/a&gt; and &lt;a href="http://pypi.python.org/pypi/regex"&gt;regex&lt;/a&gt; libraries.  They are awesome.  So, without any further explanation, I present this short script that uses both in an attempt to identify people drunk-tweeting in Chicago.  Enjoy.&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
# drunktweet.py
'''
Print out possible drunk tweets from the city of Chicago.
'''

import regex
import requests
import json

# Terms for being "wasted"
terms = { 'drunk','wasted','buzzed','hammered','plastered' }

# A fuzzy regex for people who can't type
pat = regex.compile(r"(?:\L&amp;lt;terms&gt;){i,d,s,e&amp;lt;=2}$", regex.I, terms=terms)

# Connect to the Twitter streaming API
url   = "https://stream.twitter.com/1/statuses/filter.json"
parms = {
    'locations' : "-87.94,41.644,-87.523,42.023"    # Chicago
    }
auth  = ('username','password')
r = requests.post(url, data=parms, auth=auth)

# Print possible candidates
for line in r.iter_lines():
    if line:
        tweet = json.loads(line)
        status = tweet.get('text',u'')
        words = status.split()
        if any(pat.match(word) for word in words):
            print(tweet['user']['screen_name'], status)
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
It's left as an exercise to the reader to filter out false positives and have the script call a cab.  By the way, you should check out some of my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python Classes in Chicago&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-1178596837625720513?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/1178596837625720513/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=1178596837625720513' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/1178596837625720513'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/1178596837625720513'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2012/01/drunk-tweeting-in-chicago.html' title='Drunk Tweeting in Chicago'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4228375977808455279</id><published>2012-01-03T05:22:00.000-08:00</published><updated>2012-01-12T06:52:52.459-08:00</updated><title type='text'>The Compiler Experiment Begins</title><content type='html'>&lt;p&gt;&lt;b&gt;January 13, 2012 Update:&lt;/b&gt; There are still a few seats left in the compilers class for January 17-20, 2012.  More details &lt;a href="http://www.dabeaz.com/chicago/compiler.html"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;
In the spring of 1995, I took a course on compiler design.  At the time, I was just a first-year Ph.D. Computer Science student making my way through various course requirements.  Before that, I was a mathematician working on computational physics software--writing a lot of finely tuned C code for solving differential equations on big supercomputers.  Although I already considered myself to be a pretty knowledgeable programmer, I think compilers was probably the one course that had the most profound impact on my later work.  In fact, this is the course that inspired me to look at the use of scripting languages for controlling scientific software.  It also directly led to the &lt;a href="http://www.swig.org"&gt;Swig&lt;/a&gt; project, first implemented in the summer of 1995.  Last, but not least, this is how I ultimately ended up in the world of Python.&lt;/p&gt;

&lt;p&gt;
I think the great thing about compilers was how it simply tied so many topics together all in one place.  Everything from mathematical theory, clever algorithms, programming language semantics, computer architecture, software design, clever hacking, and even the nature of computation itself.  As part of that course, we had to write our own compiler--a tangled mess of C code that turned a subset of Pascal into executable code that would actually run on a Sun Sparcstation.  To be sure, the code was a horrible disaster.  However, simply having written a working compiler was definitely one of the most memorable parts of graduate school.&lt;/p&gt;

&lt;p&gt;
In 2001, I had an opportunity to revisit the topic of compilers.  At the time, I was an assistant professor at the University of Chicago and an opportunity to teach compilers came up.  I jumped at it.  I also used the opportunity to try an experiment of what it might be like to write a compiler in Python instead of C.   As a bit of context, a lot of people had been asking me about the idea of rewriting Swig in Python (instead of C++).  I wasn't so sure.  In fact, I really didn't even know how to do it given doubts about Python's performance as well as a general lack of sufficiently powerful parsing tools at the time.  Long story short--this is how the &lt;a href="http://www.dabeaz.com/ply/index.html"&gt;PLY&lt;/a&gt; project came into existence.  I used it in the class and had about 25 students write a compiler for an even more powerful subset of Pascal, creating runnable code for the Sparc.&lt;/p&gt;

&lt;p&gt;
Fast forward 11 years. I've long since left the University, but I still continue to teach quite a few classes--especially various sorts of Python classes.  Over the past year or so, students and I have often discussed the idea of having some kind of advanced project course.  Something that would be quite a bit harder and involve much more coding.   I think you might see where this is going.&lt;/p&gt;

&lt;h3&gt;Write a Compiler (in Python)&lt;/h3&gt;
&lt;p&gt;
So, today is the first day of another compiler experiment.   Over the course of 4 days, I'm going to attempt to take six students through a compiler writing project similar to the one at the University. It's basically a nine-stage project:
&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lexing and tokenizing.&lt;/li&gt;
&lt;li&gt;Parsing and parse trees.&lt;/li&gt;
&lt;li&gt;Type checking.&lt;/li&gt;
&lt;li&gt;Intermediate code generation.&lt;/li&gt;
&lt;li&gt;Simple optimization (constant folding, etc.).&lt;/li&gt;
&lt;li&gt;Relations&lt;/li&gt;
&lt;li&gt;Control flow&lt;/li&gt;
&lt;li&gt;Functions&lt;/li&gt;
&lt;li&gt;Output code in RPython (from the PyPy project)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
One interesting thing about using Python for such a project is that you can use the internals of Python itself to explore important concepts. For example, if you want to see what happens when you compile a regular expression, you can just try it:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;import sre_parse&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;sre_parse.parse(r"[a-zA-Z_][a-zA-Z0-9_]*")&lt;/b&gt;
[('in', [('range', (97, 122)), ('range', (65, 90)), 
 ('literal', 95)]), ('max_repeat', (0, 65535, [('in', 
 [('range', (97, 122)), ('range', (65, 90)), 
('range', (48, 57)), ('literal', 95)])]))]
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Or, if you want to look at how Python makes an AST:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;import ast &lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;node = ast.parse("a = x + 2*y")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;ast.dump(node)&lt;/b&gt;
"Module(body=[Assign(targets=[Name(id='a', ctx=Store())], value=BinOp(left=Name(id='x', ctx=Load()), op=Add(), right=BinOp(left=Num(n=2), op=Mult(), right=Name(id='y', ctx=Load()))))])"
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Or, if you want to see what kind of code Python generates:
&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;def fact(n):
...     if n &lt;= 1:
...             return 1
...     else:
...             return n*fact(n-1)&lt;/b&gt;
... 
&gt;&gt;&gt; &lt;b&gt;import dis&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;dis.dis(fact)&lt;/b&gt;
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (1)
              6 COMPARE_OP               1 (&amp;lt;=)
              9 POP_JUMP_IF_FALSE       16

  3          12 LOAD_CONST               1 (1)
             15 RETURN_VALUE        

  5     &gt;&gt;   16 LOAD_FAST                0 (n)
             19 LOAD_GLOBAL              0 (fact)
             22 LOAD_FAST                0 (n)
             25 LOAD_CONST               1 (1)
             28 BINARY_SUBTRACT     
             29 CALL_FUNCTION            1
             32 BINARY_MULTIPLY     
             33 RETURN_VALUE        
             34 LOAD_CONST               0 (None)
             37 RETURN_VALUE        
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
I think that looking at what Python itself does can be related back to the work the students will be doing on their own project, and might be an interesting way to explore important concepts without getting completely bogged down in a theory-heavy exposition (as one might find in a compilers textbook).  I don't have any grand illusions about the students running off afterwards to do research in compilers.  However, I think it will be an interesting experiment where everyone still learns a lot.&lt;/p&gt;
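&lt;p&gt;
To make that kind of exploration concrete, here is a minimal sketch of constant folding (stage 5 in the list above) written against Python's own AST.  It is not the course's implementation: it assumes Python 3.9 or newer (for &lt;tt&gt;ast.Constant&lt;/tt&gt; and &lt;tt&gt;ast.unparse&lt;/tt&gt;), it only handles three arithmetic operators, and the names &lt;tt&gt;ConstantFolder&lt;/tt&gt; and &lt;tt&gt;fold&lt;/tt&gt; are made up for illustration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import ast
import operator

# Map AST operator node types to the functions that evaluate them
# (a deliberately small, illustrative subset)
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}

def is_number(node):
    # True only for literal int/float constants
    return isinstance(node, ast.Constant) and isinstance(node.value, (int, float))

class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        # Fold the children first so nested constants collapse bottom-up
        self.generic_visit(node)
        func = OPS.get(type(node.op))
        if func is not None and is_number(node.left) and is_number(node.right):
            value = func(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value), node)
        return node

def fold(source):
    # Parse, fold constants, and turn the tree back into source text
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

print(fold("a = x + 2*3*y"))    # a = x + 6 * y
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Because &lt;tt&gt;2*3*y&lt;/tt&gt; parses as &lt;tt&gt;(2*3)*y&lt;/tt&gt;, the inner product is folded to &lt;tt&gt;6&lt;/tt&gt; while the multiplication by &lt;tt&gt;y&lt;/tt&gt; is left alone.&lt;/p&gt;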

&lt;h3&gt;Follow the Project&lt;/h3&gt;

&lt;p&gt;
Due to time constraints of the project, I won't be blogging during the week.  However, you can &lt;a href="http://www.twitter.com/dabeaz"&gt;follow me on Twitter&lt;/a&gt; for updates to see how it's going.  I will be posting a more detailed followup describing the project and how it worked out after it's over.
&lt;/p&gt;

&lt;p&gt;
If you would like to write a compiler yourself, there are still some seats left in a second running of the project, January 17-20. Click &lt;a href="http://www.dabeaz.com/chicago/compiler.html"&gt;here&lt;/a&gt; for more information.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4228375977808455279?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4228375977808455279/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4228375977808455279' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4228375977808455279'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4228375977808455279'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2012/01/compiler-experiment-begins.html' title='The Compiler Experiment Begins'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-5220215951565676318</id><published>2011-12-21T20:13:00.000-08:00</published><updated>2011-12-28T04:05:01.754-08:00</updated><title type='text'>Python Courses for 2012</title><content type='html'>&lt;p&gt;
I'm excited to announce my new &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python training courses&lt;/a&gt; for the first part of 2012.  These are intense hands-on classes that are strictly limited to 6 attendees.  Unlike an online course, you'll get to escape work, your family, and friends for several days while you become completely immersed in the topic at hand.  Needless to say, you won't be disappointed.
&lt;/p&gt;

&lt;center&gt;
&lt;img src="http://www.dabeaz.com/chicago/class_small.jpg"&gt;&lt;br&gt;
&lt;em&gt;A Python course in action&lt;/em&gt;
&lt;/center&gt;

&lt;p&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/compiler.html"&gt;Write a Compiler (In Python) : January 17-20, 2012&lt;/a&gt;&lt;/b&gt;
&lt;blockquote&gt;
So you never got to take a compilers course in college or you're simply a masochist looking to take your programming skills up a few levels?   Then this is the course for you.  Come to Chicago in January and spend 4 days writing a compiler for your own programming language and have it run on top of rpython, the implementation language used by PyPy.  In this course, you'll learn about all of the major parts of what makes a compiler work, see a bunch of advanced Python programming techniques, and dive into all sorts of low-level black magic.  Why?  Because it's fun.  Update: seats are still available!
&lt;/blockquote&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/science.html"&gt;Python for Scientists and Engineers : February 27-March 2, 2012&lt;/a&gt;&lt;/b&gt;
&lt;blockquote&gt;
Join special guest Mike Müller, founder of &lt;a href="http://www.python-academy.com/"&gt;Python Academy&lt;/a&gt; for an exclusive 5-day in-depth course on using Python for Science and Engineering. Topics include numerical computing with numpy/scipy, plotting, working with large data files, Python extensions, testing, version control, and more. I'm pleased to have Mike join me in Chicago for this special course before he heads to PyCon'2012. 
&lt;/blockquote&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/concurrent.html"&gt;Python Concurrency and Distributed Computing Workshop : March 19-22, 2012&lt;/a&gt;&lt;/b&gt;
&lt;blockquote&gt;
A 4-day in-depth exploration of everything you could possibly want to know about concurrency and distributed computing in Python.  Major topics include threads, multiprocessing, event-driven I/O (async), message passing, and coroutines.  If you must know, this is the same workshop that spawned my &lt;a href="http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2010-understanding-the-python-gil-82-3273690"&gt;infamous talks on the Python GIL&lt;/a&gt;.  Past participants have described this workshop as combining the contents of about three different university systems courses crammed into the span of four days.
&lt;/blockquote&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/mastery.html"&gt;Advanced Python Mastery : April 2-6, 2012&lt;/a&gt;&lt;/b&gt;
&lt;blockquote&gt;
Go far beyond the basic tutorial and learn the secret Python programming techniques used by the authors of frameworks and libraries. Topics include some of Python's most advanced features including object implementation, descriptors, decorators, metaclasses, packaging, optimization, and more.  Simply stated, the aim of this course is to cover the entire Python programming language, leaving no stone unturned.
&lt;/blockquote&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/practical.html"&gt;Practical Python Programming : May 14-18, 2012&lt;/a&gt;&lt;/b&gt;
&lt;blockquote&gt;
A thorough introduction to Python for software developers, scientists, and engineers who already know how to program in another programming language. A major focus of this course is on using Python to process various kinds of datasets, especially those associated with systems scripting and open data sources.  In addition to Python, this course includes a basic introduction to some of the scientific tools including numpy and matplotlib.  If you're looking to improve your Python programming skills after completing a more basic tutorial, this is a great course to take.
&lt;/blockquote&gt;
&lt;p&gt;
2012 marks the start of my fourth year of offering Python courses to small groups in Chicago.  One of the best parts of these classes is the interaction that results from putting programmers with different backgrounds together in the same room.  Everyone who attends wants to be there and conversations are likely to cover just about any topic imaginable (not just computers).  I learn new things with every class and think it's a lot of fun.  Hopefully I'll see you in a future class!
&lt;/p&gt;
&lt;p&gt;
Cheers,
Dave
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-5220215951565676318?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/5220215951565676318/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=5220215951565676318' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5220215951565676318'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5220215951565676318'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/12/python-courses-for-2012.html' title='Python Courses for 2012'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4336792360497631397</id><published>2011-09-18T20:27:00.000-07:00</published><updated>2011-09-18T20:27:28.657-07:00</updated><title type='text'>Three Python Courses for Fall</title><content type='html'>&lt;p&gt;
As the leaves start to turn, I'm finally pleased to announce the dates for my fall Python courses in Chicago. &lt;/p&gt;
&lt;p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/concurrent.html"&gt;Python Concurrency and Distributed Computing Workshop (Nov 1-4).&lt;/a&gt;&lt;/b&gt; 
&lt;p&gt;The concurrency workshop is back, but is now expanded to four full days.  Since its start in 2009, the concurrency workshop has been my favorite place to try out new material and explore some really cutting edge Python topics.  This is the same workshop that spawned the infamous &lt;a href="http://blip.tv/carlfk/mindblowing-python-gil-2243379"&gt;Mindblowing GIL&lt;/a&gt; talk and subsequent &lt;a href="http://www.dabeaz.com/GIL"&gt;GIL presentation&lt;/a&gt; at PyCon.  More recent editions of the workshop have expanded to include Python 3, messaging frameworks such as ZeroMQ, and the use of NoSQL databases.  Past participants have described the workshop as covering about the same amount of material as three college courses.  If you like geeking out with other programmers and learning new things, you'll have a great time.&lt;/p&gt;
&lt;/li&gt;
&lt;P&gt;
&lt;li&gt;
&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/mastery.html"&gt;Advanced Python Mastery (Nov 14-18).&lt;/a&gt;&lt;/b&gt;   
&lt;p&gt;
If you just want to learn Python basics, you can find about a million free on-line tutorials and videos to get you going.  However, if you want to understand all of the deep Python magic used by various application frameworks, then this is the class for you.  This is one of the only truly advanced training courses around that deeply explores the internals of Python's built-in objects, underlying object model, functions, and metaprogramming features.  Topics include most of Python's advanced features including cooperative inheritance (super, mixins, etc.), descriptors, decorators, context managers, metaclasses, generators, coroutines, packages, and more. Even seasoned Python programmers will learn new tricks.
&lt;/p&gt;
&lt;/li&gt;

&lt;p&gt;
&lt;li&gt;&lt;b&gt;&lt;a href="http://www.dabeaz.com/chicago/practical.html"&gt;Practical Python Programming (Dec. 12-16).&lt;/a&gt;&lt;/b&gt;
&lt;P&gt;
If you're new to Python and want to learn more in the company of other enthusiastic programmers, then this is the class for you. A major theme of this course is using Python to analyze data.  Over the course of the week, you'll learn how to use Python to analyze datafiles, extract information from public web services (REST APIs), and use popular extensions such as numpy and matplotlib.  Most of the exercises in this course involve open data published by various government and city sources.  So, not only will you get to learn Python, you'll get to do all sorts of neat stuff such as analyze crime data, locate huge rats, make maps, and more.  It should be great fun.
&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
More information about these courses can be found &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;here&lt;/a&gt;.  In the meantime, be sure to catch my talks at &lt;a href="http://py.codeconf.com"&gt;PyCodeConf&lt;/a&gt;, Oct 6-7, in Miami and at &lt;a href="http://rupy.eu"&gt;RuPy 2011&lt;/a&gt;, Oct 14-16, in Poznan, Poland.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4336792360497631397?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4336792360497631397/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4336792360497631397' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4336792360497631397'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4336792360497631397'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/09/three-python-courses-for-fall.html' title='Three Python Courses for Fall'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4731544515209926406</id><published>2011-08-12T15:19:00.000-07:00</published><updated>2011-08-16T20:09:00.432-07:00</updated><title type='text'>An Inside Look at the GIL Removal Patch of Lore</title><content type='html'>&lt;p&gt;
As most Python programmers know, people love to hate
the Global Interpreter Lock (GIL).  Why can't it
simply be removed?  What's the problem?   However, if you've been around the
Python community long enough, you might also know that the
GIL was already removed once before--specifically, by Greg Stein who created
a patch against Python 1.4 in 1996. In fact, here's a link:&lt;/p&gt;
&lt;p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a
  href="http://www.python.org/ftp/python/contrib-09-Dec-1999/System/threading.tar.gz"&gt;http://www.python.org/ftp/python/contrib-09-Dec-1999/System/threading.tar.gz&lt;/a&gt;
  &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
This patch often gets mentioned in discussions regarding the
GIL--especially as justification for keeping the GIL.  For
example, see the &lt;a
href="http://docs.python.org/faq/library#can-t-we-get-rid-of-the-global-interpreter-lock"&gt;Python
FAQ&lt;/a&gt;, the forum post &lt;a
href="http://www.artima.com/forums/flat.jsp?forum=106&amp;thread=214235&amp;start=0&amp;msRange=15"&gt;It
Isn't Easy to Remove the GIL&lt;/a&gt; or this mailing list discussion on 
&lt;a
href="http://mail.python.org/pipermail/python-dev/2001-August/017099.html"&gt;Free
Threading&lt;/a&gt;.  These discussions usually point out that the patch
made the performance of single-threaded applications much worse--so
much so that the patch couldn't be adopted.  However, beyond that,
technical details about the patch are somewhat sketchy.&lt;/p&gt;
&lt;p&gt;
Despite using Python since early 1996, I will
freely admit that I never really knew the details of Greg's GIL
removal patch.  For the most part, I just vaguely knew that someone
had attempted to remove the GIL, that it apparently killed the performance
of single-threaded apps, and that it subsequently faded into
oblivion.   However, given my recent interest in making the GIL better, I
thought it might be a fun challenge to turn on the time machine, 
see if I could actually find the patch, and compile a version of GIL-less Python in order to take a peek
under the covers.&lt;/p&gt;
&lt;p&gt;
So, in this post, I'll do just that and try to offer some commentary on some of
the more interesting and subtle aspects of the patch--in particular,
aspects of it that seem especially tricky or problematic.
Given the increased interest in concurrency, the GIL, and other matters, I hope
that this information might be of some use to others, or at the very
least, help explain why Python still has a GIL.  Plus, as the saying
goes, those who don't study the past are doomed to repeat it.  So,
let's jump in.&lt;/p&gt;
&lt;h3&gt;Python 1.4&lt;/h3&gt;
&lt;p&gt;
In order to play with the patch, you must first download and build
Python-1.4. You can find it on &lt;a href="http://www.python.org/download/releases/src/"&gt;python.org&lt;/a&gt; if
you look long enough.  After a bit of Makefile twiddling, I was
able to build it and try a few things out on a Linux system.&lt;/p&gt;
&lt;p&gt;
Using Python-1.4 is a big reminder of how much Python has
changed. It has none of the nice features you're used to using now (list
comprehensions, sets, string methods, &lt;tt&gt;sum()&lt;/tt&gt;,
&lt;tt&gt;enumerate()&lt;/tt&gt;, etc.).  In playing with it, I realize that about
half of everything I type results in some kind of error. I
digress.&lt;/p&gt;
&lt;p&gt; Thread support in Python-1.4 is equally minimal.  The
&lt;a href="http://docs.python.org/library/threading.html"&gt;&lt;tt&gt;threading.py&lt;/tt&gt;&lt;/a&gt; module that most people know doesn't yet
exist. Instead, there is just the lower-level &lt;a href="http://docs.python.org/library/thread.html"&gt;&lt;tt&gt;thread&lt;/tt&gt;&lt;/a&gt; module
which simply lets you launch a thread, allocate a mutex lock, and not much
else.  Here's a small sample:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
import thread
import time

def countdown(n):
    while n &gt; 0:
        print "T-minus", n
        n = n - 1
        time.sleep(1)

thread.start_new_thread(countdown,(10,))
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Under the covers though, the implementation is
remarkably similar to modern Python. Each thread consists of a C
function that runs the specified Python callable and there is a GIL that
guards access to some critical global interpreter state. &lt;/p&gt;
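&lt;p&gt;
For readers without a Python 1.4 build handy, the same two primitives (launch a thread, allocate a mutex lock) survive in Python 3 as the low-level &lt;tt&gt;_thread&lt;/tt&gt; module.  The following is a minimal sketch under that assumption; the worker function, the counter, and the polling loop are made up for illustration and are not part of the patch or of Python 1.4.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import _thread
import time

counter = 0
counter_lock = _thread.allocate_lock()   # mutex guarding counter
done = []                                # worker ids, appended on completion

def worker(ident, n):
    global counter
    for _ in range(n):
        # Hold the lock around the read-modify-write of the shared counter
        with counter_lock:
            counter += 1
    done.append(ident)

for i in range(4):
    _thread.start_new_thread(worker, (i, 10000))

# The low-level module has no join(), so poll until every worker reports in
while len(done) != 4:
    time.sleep(0.01)

print(counter)    # 40000
```
&lt;/pre&gt;
&lt;/blockquote&gt;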
&lt;h3&gt;A Reentrant Interpreter?&lt;/h3&gt;
&lt;p&gt;If you read the &lt;tt&gt;threading.README&lt;/tt&gt; file included in the patch,
you will find this description:&lt;/p&gt;
&lt;p&gt;
&lt;blockquote&gt;
&lt;em&gt;These patches enable Python to be "free threaded" or, in other words,
fully reentrant across multiple threads.  This is particularly important
when Python is embedded within a C program.&lt;/em&gt;
&lt;/blockquote&gt;
&lt;/p&gt;
&lt;p&gt;
The stated goal of making Python "fully reentrant" is important so
let's discuss.&lt;/p&gt;
&lt;p&gt;
All Python code gets compiled down to an intermediate "machine
language" before it executes.  For example, consider a simple function
like this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
def countdown(n):
    while n &gt; 0:
        print "T-minus", n
        n -= 1
        time.sleep(1)
    print "Blastoff!"
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
You can view the underlying low-level instructions using the &lt;a
href=""&gt;dis&lt;/a&gt; module.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;import dis&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;dis.dis(countdown)&lt;/b&gt;
  2           0 SETUP_LOOP              48 (to 51)
        &gt;&gt;    3 LOAD_FAST                0 (n)
              6 LOAD_CONST               1 (0)
              9 COMPARE_OP               4 (&gt;)
             12 POP_JUMP_IF_FALSE       50

  3          15 LOAD_CONST               2 ('T-minus')
             18 PRINT_ITEM          
             19 LOAD_FAST                0 (n)
             22 PRINT_ITEM          
             23 PRINT_NEWLINE       

  4          24 LOAD_FAST                0 (n)
             27 LOAD_CONST               3 (1)
             30 INPLACE_SUBTRACT    
             31 STORE_FAST               0 (n)

  5          34 LOAD_GLOBAL              0 (time)
             37 LOAD_ATTR                1 (sleep)
             40 LOAD_CONST               3 (1)
             43 CALL_FUNCTION            1
             46 POP_TOP             
             47 JUMP_ABSOLUTE            3
        &gt;&gt;   50 POP_BLOCK           

  6     &gt;&gt;   51 LOAD_CONST               4 ('Blastoff!')
             54 PRINT_ITEM          
             55 PRINT_NEWLINE       
             56 LOAD_CONST               0 (None)
             59 RETURN_VALUE        
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Now, here's the critical bit--as a general rule, most low-level
interpreter instructions tend to execute atomically. That is, while
executing an instruction, the GIL is held, and no preemption is
possible until completion.&lt;/p&gt;
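&lt;p&gt;
Note that this atomicity is per instruction, not per statement.  In the listing above, the single statement &lt;tt&gt;n -= 1&lt;/tt&gt; on line 4 takes four instructions.  A quick way to check this on whatever interpreter you have is sketched below; it assumes Python 3.4 or newer (for &lt;tt&gt;dis.get_instructions&lt;/tt&gt;), the function &lt;tt&gt;decrement&lt;/tt&gt; is made up for illustration, and the exact opcode names vary between Python versions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import dis

def decrement(n):
    n -= 1
    return n

# Collect the opcode names that make up the function body
ops = [instr.opname for instr in dis.get_instructions(decrement)]

# More than one instruction means the statement is not a single atomic step
print(len(ops), ops)
```
&lt;/pre&gt;
&lt;/blockquote&gt;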
&lt;p&gt; For a large majority of interpreter instructions, the lack of
preemption is rarely a concern because they execute almost
immediately.  However, every now and then, a program may execute a
long-running operation. Typically this happens if an operation is
performed on a huge amount of data or if Python calls out to
long-running C code that doesn't release the GIL.&lt;/p&gt;
&lt;p&gt;
Here is a simple experiment you can try to see this for yourself.  Launch the
above countdown function in its own thread like this (using the legacy
thread module).&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;import thread&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;thread.start_new_thread(countdown,(20,))&lt;/b&gt;
T-minus 20
T-minus 19
...
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Now, while that is running, do this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;max(xrange(1000000000))&lt;/b&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
If you do this, everything should just grind to a halt.  You will see
no output from the countdown thread at all and Ctrl-C will be frozen.
In a nutshell, this is the issue with preemption (or lack thereof).
Due to the GIL, it's possible for one thread to temporarily block the progress of
all other threads on the system.&lt;/p&gt;
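&lt;p&gt;
For readers without a Python 2 interpreter at hand, the same experiment can be approximated in Python 3.  The following is only a sketch under stated assumptions: the name &lt;tt&gt;longest_stall&lt;/tt&gt; and its parameters are made up for illustration, and the size of the stall depends on the interpreter build and the machine.  A background thread records a timestamp every few milliseconds while the main thread runs one long C-level call, and the longest pause between two timestamps is reported.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import threading
import time

def longest_stall(n, tick=0.005):
    """Run max(range(n)) while a ticker thread records timestamps.

    Returns (result, longest_gap): the value computed by max() and the
    longest pause, in seconds, between two consecutive ticks.
    """
    stamps = []
    stop = threading.Event()

    def ticker():
        # Record a timestamp roughly every `tick` seconds until told to stop.
        while not stop.is_set():
            stamps.append(time.perf_counter())
            time.sleep(tick)

    t = threading.Thread(target=ticker)
    t.start()
    time.sleep(10 * tick)          # let the ticker settle into its rhythm
    result = max(range(n))         # one long call that runs entirely in C
    time.sleep(10 * tick)          # collect a few more ticks afterwards
    stop.set()
    t.join()

    gaps = [b - a for a, b in zip(stamps, stamps[1:])]
    return result, max(gaps)

if __name__ == "__main__":
    result, gap = longest_stall(20000000)
    print("max() returned", result)
    print("longest gap between ticks: %.3f s" % gap)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
On a CPython build that has a GIL, one would expect the longest gap to be close to the running time of the &lt;tt&gt;max()&lt;/tt&gt; call rather than the few milliseconds the ticker asks for, because the ticker cannot run until that call returns.&lt;/p&gt;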
&lt;p&gt;
The lack of preemption together with the GIL presents certain challenges when Python is
embedded into a multithreaded C program.  In such applications, the
Python interpreter might be used for high-level scripting, but also for
things such as event handling and callbacks. For example, maybe you have some C
thread that's part of a game or visualization package that's calling
into the Python interpreter to trigger event-handler methods in real
time.  In such code, control flow might pass from Python, to C, back
to Python, and so forth.&lt;/p&gt;
&lt;p&gt; Needless to say, it's possible that in such an environment, you might
have a collection of C or C++ threads that compete for use of the
Python interpreter and are forced to synchronize on the GIL. This means that the interpreter might
become a bottleneck of the whole system.  If, somehow, you could get
rid of the GIL, then any thread would be allowed to use the
interpreter without worrying about other threads.  For example, a C++
program triggering a Python event callback wouldn't have to concern
itself with other Python threads---the callback would simply run
without being blocked.   This is what you get by making the
interpreter fully reentrant.
&lt;/p&gt;
&lt;p&gt;
It is in this context of embedding that the GIL removal patch should
probably be viewed. At the time it was created,
significantly more Python users were involved with integrating Python
with C/C++ applications. In my own area of
scientific computing, people were using Python to build interactive
data visualization tools which often involved heavy amounts
of CPU computation.  I knew others who were trying to use
Python for internal scripting of commercial PC video games. For all of
these groups, removal of the GIL (if possible) was viewed as desirable
because doing so would simplify programming and improve the user experience (better
responsiveness of the GUI, fewer stalls, etc.).  If you've ever been
sitting there staring at the spinning beachball on your Mac wishing it
would just go away, well, then you get
the general idea.&lt;/p&gt;
&lt;h3&gt;Exploring the Patch&lt;/h3&gt;
&lt;p&gt;
If you download the free-threading patch, you will find that it is a
relatively small set of files that replace and add functionality to Python-1.4. Here is
a complete list of the modified files included in the patch:&lt;/p&gt;
&lt;p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
./Include:
listobject.h	pymutex.h	sysmodule.h
object.h	pypooledlock.h	threadstate.h

./Modules:
signalmodule.c	threadmodule.c

./Objects:
complexobject.c	intobject.c	longobject.c	stringobject.c
frameobject.c	listobject.c	mappingobject.c	tupleobject.c

./Python:
Makefile.in	importdl.c	pythonrun.c	traceback.c
ceval.c		pymutex.c	sysmodule.c
errors.c	pypooledlock.c	threadstate.c
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
As you can see, it's certainly not a rewrite of the entire
interpreter.  In fact, if you run a diff across the Python-1.4 source
and the patch, you'll find that the changes amount to about 1000 lines
of code (in contrast, the complete source code to Python-1.4 is about
82000 lines as measured by 'wc').&lt;/p&gt;
&lt;p&gt;I won't go into the details of applying or compiling the patch 
except to say that detailed instructions are included in the README should you want
to build it yourself.&lt;/p&gt;
&lt;h3&gt;Initial Impressions&lt;/h3&gt;
&lt;p&gt;
With the patch applied, I tried to do a few rough performance tests (note: I
ran these under Ubuntu 8.10 on a dual-core VMware Fusion instance
running on my Mac). First, let's just write a simple spin-loop and see what happens:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
import time
def countdown(n):
    while n &gt; 0:
        n = n - 1

start = time.time()
countdown(10000000)
end = time.time()
print end-start
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Using the original version of Python-1.4 (with the GIL), this code runs
in about 1.9 seconds.  Using the patched GIL-less version, it runs in
about 12.7 seconds.  That's about 6.7 times slower.  Yow!&lt;/p&gt;
&lt;p&gt;
Just to further confirm that finding, I ran the included
&lt;tt&gt;Tools/scripts/pystone.py&lt;/tt&gt; benchmark (modified to run slightly
longer in order to get more accurate timings).  First, with the GIL:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
$ python1.4g Tools/scripts/pystone.py
Pystone(1.0) time for 100000 passes = 3.09
This machine benchmarks at 32362.5 pystones/second
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Now, without the GIL:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
$ python1.4ng Tools/scripts/pystone.py
Pystone(1.0) time for 100000 passes = 12.73
This machine benchmarks at 7855.46 pystones/second
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here, the GIL-less Python is only about 4 times slower.  Now, I'm just
slightly more impressed.&lt;/p&gt;
&lt;p&gt;
To test threads, I wrote a small sample that subdivided the work
across two worker threads in an embarrassingly parallel manner (note: this
code is a little wonky due to the fact that Python-1.4 doesn't
implement thread joining--meaning that you have to do it yourself with
the included binary-semaphore lock).
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
import thread
import time
import sys
sys.setcheckinterval(1000)

def countdown(n,lck):
    while n &gt; 0:
        n = n - 1
    lck.release()     # Signal termination

lck_1 = thread.allocate_lock()
lck_2 = thread.allocate_lock()
lck_1.acquire()
lck_2.acquire()
start = time.time()
thread.start_new_thread(countdown,(5000000,lck_1))
thread.start_new_thread(countdown,(5000000,lck_2))
lck_1.acquire()      # Wait for termination
lck_2.acquire()
end = time.time()
print end-start
&lt;/pre&gt;
&lt;/blockquote&gt;
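&lt;p&gt;
The "join with a lock" trick used above may look odd, so here is the same pattern as a small Python 3 sketch.  It is an illustration only: &lt;tt&gt;run_workers&lt;/tt&gt; is a hypothetical helper, and modern code would normally just call &lt;tt&gt;Thread.join()&lt;/tt&gt;.  Each worker is handed a lock that is already held, and releasing it is the worker's way of saying "I'm done".&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import threading

def countdown(n, done):
    while n > 0:
        n -= 1
    done.release()            # signal termination

def run_workers(counts):
    """Start one countdown thread per entry and wait for all of them."""
    locks = []
    for n in counts:
        done = threading.Lock()
        done.acquire()        # held until the worker releases it
        locks.append(done)
        threading.Thread(target=countdown, args=(n, done)).start()
    for done in locks:
        done.acquire()        # blocks until that worker has finished
    return len(locks)

if __name__ == "__main__":
    print(run_workers([500000, 500000]), "workers finished")
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
This relies on the fact that a plain &lt;tt&gt;threading.Lock&lt;/tt&gt; may be released by a thread other than the one that acquired it, which is exactly what makes it usable as a binary semaphore.&lt;/p&gt;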
&lt;p&gt;If you run the two-thread Python-1.4 test above with the GIL, the execution time is about
2.5 seconds or approximately 1.3 times slower than the single-threaded
version (1.9 seconds).  Using the GIL-less Python, the execution time is
18.5 seconds or approximately 1.45 times slower than the
single-threaded version (12.7 seconds). Just to emphasize, the GIL-less Python
running with two threads is more than 7 times slower than the
version with a GIL.&lt;/p&gt;
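&lt;p&gt;
To keep the various "times slower" figures straight, the ratios can be recomputed directly from the four timings quoted in the text.  This is plain arithmetic on those numbers, not a new measurement; the dictionary layout and the &lt;tt&gt;slowdown&lt;/tt&gt; helper are just a convenient way to write it down.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
# Timings (seconds) as reported in the text above.
timings = {
    "gil":    {"single": 1.9,  "threaded": 2.5},
    "no_gil": {"single": 12.7, "threaded": 18.5},
}

def slowdown(slow, fast):
    """How many times longer `slow` takes than `fast`, rounded to 2 places."""
    return round(slow / fast, 2)

# Cost of removing the GIL, for one thread and for two threads.
single_penalty = slowdown(timings["no_gil"]["single"], timings["gil"]["single"])
threaded_penalty = slowdown(timings["no_gil"]["threaded"], timings["gil"]["threaded"])

# Cost of splitting the work across two threads, with and without the GIL.
gil_thread_cost = slowdown(timings["gil"]["threaded"], timings["gil"]["single"])
no_gil_thread_cost = slowdown(timings["no_gil"]["threaded"], timings["no_gil"]["single"])

print(single_penalty, threaded_penalty, gil_thread_cost, no_gil_thread_cost)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
The results agree with the figures quoted above to within rounding.&lt;/p&gt;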
&lt;p&gt;
Ah, but what about preemption you ask?   If you return to the example
above in the section about reentrancy, you will find that removing the
GIL does indeed allow free threading and long-running calculations
to be preempted.  Success!&lt;/p&gt;
&lt;p&gt;
Needless to say, there might be a few reasons why the patch quietly
disappeared.&lt;/p&gt;
&lt;h3&gt;Under the Covers&lt;/h3&gt;
&lt;p&gt;Okay, the performance is terrible, but what is actually going on
inside?  Are there any lessons to be learned?  A look at the source
code and related documentation reveals all.&lt;/p&gt;
&lt;h3&gt;Capturing Thread State&lt;/h3&gt;
&lt;p&gt;
For free-threading to work, each thread has to isolate its
interpreter state and not rely on C global variables.  The threading patch does this by defining a new
data structure such as the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Include/threadstate.h */

typedef struct PyThreadState_s
{
    PyFrameObject *		current_frame;		/* ceval.c */
    int				recursion_depth;	/* ceval.c */
    int				interp_ticker;		/* ceval.c */
    int				tracing;		/* ceval.c */

    PyObject *			sys_profilefunc;	/* sysmodule.c */
    PyObject *			sys_tracefunc;		/* sysmodule.c */
    int				sys_checkinterval;	/* sysmodule.c */

    PyObject *			last_exception;		/* errors.c */
    PyObject *			last_exc_val;		/* errors.c */
    PyObject *			last_traceback;		/* traceback.c */

    PyObject *			sort_comparefunc;	/* listobject.c */

    char			work_buf[120];		/* &amp;lt;anywhere&gt; */
    int				c_error;		/* complexobject.c */
} PyThreadState;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Essentially, all global variables in the interpreter have been picked
up and moved into a per-thread data structure.  Some of these values are obvious
candidates such as exception information, tracing, and profiling
hooks.  Other parts are semi-random.  For example, there is storage
for the compare callback function used by list sorting and a global
error handling variable (c_error) used to propagate errors across
internal functions in the implementation of complex numbers.&lt;/p&gt;
&lt;p&gt;
To manage multiple threads, the interpreter builds a linked-list of
all active threads.  This linked list contains the thread-identifier
along with the corresponding &lt;tt&gt;PyThreadState&lt;/tt&gt;
structure.  For example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Python/threadstate.c */
...

typedef struct PyThreadStateLL_s
{
    long                        thread_id;
    struct PyThreadStateLL_s *  next;
    PyThreadState               state;
} PyThreadStateLL;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Whenever a thread wants to get its per-thread state information, it
simply calls a function &lt;tt&gt;PyThreadState_Get()&lt;/tt&gt;.  This function
scans the linked-list searching for the caller's thread-identifier.
When found, the matching thread state structure is moved to the front
of the linked list and the value returned. Here is a short excerpt
that illustrates its use, with the relevant bits highlighted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Objects/listobject.c */
...
static int
cmp(v, w)
	const ANY *v, *w;
{
	object *t, *res;
	long i;
&lt;font color="#0000ff"&gt;	PyThreadState *pts = PyThreadState_Get();&lt;/font&gt;


	if (err_occurred())
		return 0;

	if (&lt;font color="#0000ff"&gt;pts-&gt;sort_comparefunc&lt;/font&gt; == NULL)
		return cmpobject(* (object **) v, * (object **) w);

	/* Call the user-supplied comparison function */
	t = mkvalue("(OO)", * (object **) v, * (object **) w);
	if (t == NULL)
		return 0;
	res = call_object(&lt;font color="#0000ff"&gt;pts-&gt;sort_comparefunc&lt;/font&gt;, t);
	DECREF(t);
        ...
}
&lt;/pre&gt;
&lt;/blockquote&gt;
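&lt;p&gt;
The scan-then-move-to-front lookup described above can be modelled in a few lines of Python.  This is a sketch of the idea, not a transcription of &lt;tt&gt;threadstate.c&lt;/tt&gt;: the class names are hypothetical, the per-thread state is reduced to a dictionary, creating a fresh state for an unknown thread is an assumption, and any locking the real list would need is left out.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
class Node:
    """One entry of the list: a thread id, its state, and a next pointer."""
    def __init__(self, thread_id, state, next=None):
        self.thread_id = thread_id
        self.state = state
        self.next = next

class ThreadStateList:
    def __init__(self):
        self.head = None

    def get(self, thread_id):
        """Return the state for thread_id, moving its node to the front."""
        prev, node = None, self.head
        while node is not None and node.thread_id != thread_id:
            prev, node = node, node.next
        if node is None:
            # Assumption: an unknown thread gets a fresh, empty state.
            node = Node(thread_id, {})
        elif prev is None:
            return node.state              # already at the front
        else:
            prev.next = node.next          # unlink from its current position
        node.next = self.head              # relink at the front
        self.head = node
        return node.state

    def order(self):
        """Thread ids from front to back (for inspection only)."""
        ids, node = [], self.head
        while node is not None:
            ids.append(node.thread_id)
            node = node.next
        return ids

if __name__ == "__main__":
    states = ThreadStateList()
    for tid in (1, 2, 3):
        states.get(tid)
    states.get(1)                          # thread 1 asks again
    print(states.order())
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
The appeal of move-to-front is that a thread which asks for its state repeatedly finds it after a single comparison, at the price of a linear scan whenever a different thread takes over.&lt;/p&gt;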
&lt;p&gt;
Bits and pieces of the thread state code live on in Python 3.2 today.  In
particular, per-thread state is captured in a similar data structure and
there are functions for obtaining the state.  In fact, there is even a
function called &lt;tt&gt;PyThreadState_Get()&lt;/tt&gt;.&lt;/p&gt;
&lt;h3&gt;Fine-Grained Locking of Reference Counting&lt;/h3&gt;
&lt;p&gt;
Memory management of Python objects relies on reference counting. In
the C API, there are macros for manipulating reference counts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Include/object.h */
...
#define Py_INCREF(op) (_Py_RefTotal++, (op)-&gt;ob_refcnt++)
#define Py_DECREF(op) \
        if (--_Py_RefTotal, --(op)-&gt;ob_refcnt != 0) \
                ; \
        else \
                _Py_Dealloc(op)
...
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
These macros, along with their NULL-pointer safe variants &lt;tt&gt;Py_XINCREF&lt;/tt&gt; and
&lt;tt&gt;Py_XDECREF&lt;/tt&gt;, are used throughout the Python source. A quick
search of the Python-1.4 source reveals about 250 uses. &lt;/p&gt;
&lt;p&gt;
With free-threading, reference counting operations lose their
thread-safety.  Thus, the patch introduces a global reference-counting mutex
lock along with atomic operations for updating the count. On Unix,
locking is implemented using a standard
&lt;tt&gt;pthread_mutex_t&lt;/tt&gt; lock (wrapped inside a &lt;tt&gt;PyMutex&lt;/tt&gt; structure) and the following functions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Python/pymutex.c */
...
PyMutex * _Py_RefMutex;
...
int _Py_SafeIncr(pint)
    int *pint;
{
    int result;

    PyMutex_Lock(_Py_RefMutex);
    result = ++*pint;
    PyMutex_Unlock(_Py_RefMutex);
    return result;
}

int _Py_SafeDecr(pint)
    int *pint;
{
    int result;

    PyMutex_Lock(_Py_RefMutex);
    result = --*pint;
    PyMutex_Unlock(_Py_RefMutex);
    return result;
}
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;tt&gt;Py_INCREF&lt;/tt&gt; and &lt;tt&gt;Py_DECREF&lt;/tt&gt; macros are then
redefined to use these thread-safe functions.&lt;/p&gt;
&lt;p&gt;
On Windows, fine-grained locking is achieved by redefining
&lt;tt&gt;Py_INCREF&lt;/tt&gt; and &lt;tt&gt;Py_DECREF&lt;/tt&gt; to use 
&lt;tt&gt;InterlockedIncrement&lt;/tt&gt; and
&lt;tt&gt;InterlockedDecrement&lt;/tt&gt; calls (see &lt;a
href="http://msdn.microsoft.com/en-us/library/ms684122(v=vs.85).aspx"&gt;Interlocked
Variable Access&lt;/a&gt; [MSDN]).&lt;/p&gt;
&lt;p&gt;
On Unix, it must be emphasized that simple reference count
manipulation has been replaced by no fewer than three function calls,
plus the overhead of the actual locking.  It's far more expensive.&lt;/p&gt;
&lt;p&gt;
As a performance experiment, I decided to comment out the
&lt;tt&gt;PyMutex_Lock&lt;/tt&gt; and &lt;tt&gt;PyMutex_Unlock&lt;/tt&gt; calls and run the
interpreter in an unsafe mode.  With this change, the performance of my
single-threaded 'spin-loop' dropped from 12.7 seconds to about 3.9 seconds.  The threaded version dropped from 18.5 seconds to about 4 seconds. [ Note: corrected due to an unnoticed build-error when trying this experiment initially ].
&lt;/p&gt;
&lt;p&gt; Clearly fine-grained locking of reference counts is the major
culprit behind the poor performance, but even if you take away the locking, the reference counting performance is still very sensitive to any kind of extra overhead (e.g., function call, etc.).  In this case, the performance is still about twice as slow as Python with the GIL. &lt;/p&gt;
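&lt;p&gt;
The shape of that experiment can be imitated in Python itself.  The sketch below is only an analogy with made-up names (&lt;tt&gt;Counter&lt;/tt&gt;, &lt;tt&gt;safe_incr&lt;/tt&gt;, &lt;tt&gt;plain_incr&lt;/tt&gt;, &lt;tt&gt;time_calls&lt;/tt&gt;): it compares a counter updated through a function call against the same update wrapped in one global lock, mirroring &lt;tt&gt;_Py_SafeIncr&lt;/tt&gt; and &lt;tt&gt;_Py_SafeDecr&lt;/tt&gt;.  The absolute numbers say nothing about the C implementation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import threading
import time

_ref_mutex = threading.Lock()      # stands in for _Py_RefMutex

class Counter:
    def __init__(self):
        self.value = 0

def safe_incr(c):
    with _ref_mutex:               # lock, update, unlock: as in _Py_SafeIncr
        c.value += 1
        return c.value

def safe_decr(c):
    with _ref_mutex:
        c.value -= 1
        return c.value

def plain_incr(c):
    c.value += 1                   # same call overhead, but no locking
    return c.value

def time_calls(func, n):
    """Call func on a fresh Counter n times; return (seconds, final value)."""
    c = Counter()
    start = time.perf_counter()
    for _ in range(n):
        func(c)
    return time.perf_counter() - start, c.value

if __name__ == "__main__":
    n = 1000000
    for func in (plain_incr, safe_incr):
        elapsed, total = time_calls(func, n)
        print("%-10s %d calls in %.3f s" % (func.__name__, total, elapsed))
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Even with a single thread and no contention, the locked version pays for an acquire and a release on every update, which is the same kind of overhead the patch adds to every reference count change.&lt;/p&gt;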

&lt;h3&gt;Locking of Mutable Builtins&lt;/h3&gt;
&lt;p&gt;
Mutable builtins such as lists and dicts need their own locking to
synchronize modifications. Thus, these objects grow an extra lock
attribute per instance.  For example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Include/listobject.h */
...
typedef struct {
	PyObject_VAR_HEAD
&lt;font color="#0000ff"&gt;	Py_DECLARE_POOLED_LOCK&lt;/font&gt;
	PyObject **ob_item;
} PyListObject;
...
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Virtually all underlying methods (append, insert, setitem, getitem,
repr, etc.) then use the per-instance lock under the covers.&lt;/p&gt;
&lt;p&gt;
An interesting aspect of the implementation is the way that locks are
allocated.  Instead of allocating a dedicated mutex lock for each
list or dictionary, the interpreter simply keeps a small pool of
available locks. When a list or dict first needs to be locked, a lock
is taken from the pool and used as long as it is needed (until the instance
is no longer being manipulated by any threads).  At this point, the
lock is simply released back to the pool.&lt;/p&gt;
&lt;p&gt;
This scheme greatly reduces the number of needed locks. Generally
speaking, the number of locks is about the same as the number of
active threads.  Deeply nested data structures (e.g., lists
of lists of lists) may also increase the number of locks needed if certain
recursive operations are invoked.  For example, printing a deeply
nested data structure might cause a lock to be allocated for each level
of nesting.&lt;/p&gt;
&lt;p&gt;
Locking of mutable containers does not involve anything more
sophisticated than a mutex lock.   For example, no attempt has been
made to utilize reader-writer locks.
&lt;/p&gt;
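&lt;p&gt;
To make the pooling idea more concrete, here is a rough Python model of it.  Everything in it is an assumption made for illustration (the &lt;tt&gt;LockPool&lt;/tt&gt; and &lt;tt&gt;PooledList&lt;/tt&gt; names, the user count, the guard lock); the patch's &lt;tt&gt;pypooledlock.c&lt;/tt&gt; may well be organized differently.  An object borrows a lock the first time it needs one and hands it back once no thread is using it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import threading

class LockPool:
    """Hands out locks on demand and takes them back when idle."""
    def __init__(self):
        self._guard = threading.Lock()   # protects the pool and the bookkeeping
        self._free = []                  # idle locks waiting to be reused
        self.created = 0                 # how many locks were ever allocated

    def lock(self, obj):
        with self._guard:
            if obj.lock is None:         # first user: borrow a lock from the pool
                if self._free:
                    obj.lock = self._free.pop()
                else:
                    obj.lock = threading.Lock()
                    self.created += 1
            obj.users += 1
            borrowed = obj.lock
        borrowed.acquire()

    def unlock(self, obj):
        obj.lock.release()
        with self._guard:
            obj.users -= 1
            if obj.users == 0:           # last user: hand the lock back
                self._free.append(obj.lock)
                obj.lock = None

class PooledList:
    """A list wrapper that only holds a lock while it is being manipulated."""
    def __init__(self, pool):
        self.pool = pool
        self.lock = None
        self.users = 0
        self.items = []

    def append(self, item):
        self.pool.lock(self)
        try:
            self.items.append(item)
        finally:
            self.pool.unlock(self)

if __name__ == "__main__":
    pool = LockPool()
    lists = [PooledList(pool) for _ in range(100)]
    for lst in lists:
        lst.append(1)
    print("lists:", len(lists), "locks allocated:", pool.created)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
In this model, a hundred lists touched one after another by a single thread share one lock, whereas holding two lists locked at the same time (as a nested operation would) forces a second lock to be allocated.&lt;/p&gt;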
&lt;h3&gt;Locking of Various Internal Operations&lt;/h3&gt;
&lt;p&gt;
In various places throughout the interpreter, there are low-level
book-keeping operations, often related to memory management and optimization.
Their implementation often relies upon the use of unsafe C static variables.
&lt;/p&gt;
&lt;p&gt;
For this, the patch defines a mutex lock dedicated generally to
executing critical sections along with a pair of C macros for
locking.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Include/pymutex.h */
...
#define Py_CRIT_LOCK()          PyMutex_Lock(_Py_CritMutex)
#define Py_CRIT_UNLOCK()        PyMutex_Unlock(_Py_CritMutex)
...
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
As an example of use, consider the &lt;tt&gt;int&lt;/tt&gt; object.  Due to the fact that
integers are used so frequently, the underlying C code uses a custom
memory allocator and other tricks to avoid excessive calls to C's
&lt;tt&gt;malloc&lt;/tt&gt; and &lt;tt&gt;free&lt;/tt&gt; functions.  Here is an example of
the kind of thing you see in the patch (changes highlighted):&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
/* Objects/intobject.c */

...
static intobject *&lt;font color="#0000ff"&gt;volatile&lt;/font&gt; free_list = NULL;
...
object *
newintobject(ival)
	long ival;
{
     ...
&lt;font color="#0000ff"&gt;	Py_CRIT_LOCK();&lt;/font&gt;
	if (free_list == NULL) {
		if ((free_list = fill_free_list()) == NULL) {
&lt;font color="#0000ff"&gt;			Py_CRIT_UNLOCK();&lt;/font&gt;
			return err_nomem();
		}
	}
	v = free_list;
	free_list = *(intobject **)free_list;
&lt;font color="#0000ff"&gt;	Py_CRIT_UNLOCK();&lt;/font&gt;
    ...
}
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
A quick analysis of the patch shows that there are about 20 such
critical sections that have to be protected.  This includes low-level
code in the implementation of ints, tuples, dicts as well as code
generally related to the runtime of the interpreter (e.g., module
imports, signal handling, sys module, etc.).&lt;/p&gt;
&lt;p&gt;
Although all of these sections share the same lock, the associated
overhead appears to be negligible compared to that of reference
counting.
&lt;/p&gt;
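&lt;p&gt;
The free-list pattern guarded by a single critical-section lock looks roughly like this when transplanted into Python.  Treat it as a sketch with hypothetical names (&lt;tt&gt;IntBox&lt;/tt&gt;, &lt;tt&gt;new_box&lt;/tt&gt;, &lt;tt&gt;release_box&lt;/tt&gt;): in ordinary CPython the list operations used here are already protected by the GIL, so the explicit lock is there purely to mirror the structure of the C code.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import threading

_crit_mutex = threading.Lock()     # stands in for _Py_CritMutex
_free_list = []                    # recycled boxes, shared by all threads

class IntBox:
    """A tiny stand-in for an int object that can be recycled."""
    def __init__(self):
        self.value = 0

def new_box(value):
    with _crit_mutex:              # Py_CRIT_LOCK() ... Py_CRIT_UNLOCK()
        box = _free_list.pop() if _free_list else IntBox()
    box.value = value
    return box

def release_box(box):
    with _crit_mutex:
        _free_list.append(box)

def worker(rounds):
    for i in range(rounds):
        release_box(new_box(i))

if __name__ == "__main__":
    threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("boxes left on the free list:", len(_free_list))
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Note how short the locked region is: only the pop or append on the shared list is protected, while filling in the value happens outside the lock.&lt;/p&gt;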
&lt;h3&gt;Other Tricky Bits&lt;/h3&gt;
&lt;p&gt;
Although the patch addresses some fundamental issues needed to make
free-threading work, it is only a small start.  In particular, no
effort has been made to verify the thread-safety of any standard
library modules.  This includes a large body of C code that would have
to be audited in detail to identify and fix potential race
conditions.&lt;/p&gt;
&lt;p&gt;
Certain parts of the Python implementation also remain problematic.
For example, certain low-level C API functions such as
&lt;tt&gt;PyList_GetItem()&lt;/tt&gt; and &lt;tt&gt;PyDict_GetItem()&lt;/tt&gt; return
borrowed references (i.e., objects without an increased reference
count).  Although it seems remote, there is a possibility that such
functions could return a reference to an object that then gets destroyed by
another thread.
&lt;/p&gt;
&lt;h3&gt;Final Words&lt;/h3&gt;
&lt;p&gt;Looking at the patch has been an interesting trip through history,
but is there anything to learn from it? This is by no means an
exhaustive list, but a few thoughts come to mind:&lt;/p&gt;
&lt;ul&gt;
  &lt;p&gt;
  &lt;li&gt;Reference counting is a really lousy memory-management technique
  for free-threading.  This was already widely known, but the performance
  numbers put a more concrete figure on it.  This will definitely be
  the most challenging issue for anyone attempting a GIL removal patch.&lt;/li&gt;
  &lt;/p&gt;
  &lt;P&gt;
  &lt;li&gt;For mutable types, you need per-instance locking.   However,
  through clever lock management, you can probably do it with a
  relatively small number of locks (proportional to the number
  of threads) as opposed to actually putting a dedicated lock on every
  instance.  The performance impact of such locking warrants further
  study--especially given the heavy use of dictionaries throughout the
  interpreter.
 &lt;/li&gt;
  &lt;/p&gt;
  &lt;p&gt;
  &lt;li&gt;Various internal parts of the interpreter will need locking, but
  such locking doesn't appear to have as much of an impact as one might
  expect (at least I wasn't able to measure a huge performance
  hit due to it).&lt;/li&gt;
  &lt;/p&gt;
  &lt;p&gt;
  &lt;li&gt;Python 3 already includes a few critical pieces needed to make
  free-threading work.  In particular, there are data structures that
  capture per-thread state and functions for obtaining that state.
  Because of that, if you were to eliminate the GIL, most of the
  effort would tend to focus on locking as opposed to isolating
  state.&lt;/li&gt;
  &lt;/p&gt;
 &lt;p&gt;
  &lt;li&gt;Even though you might be able to patch the interpreter with a
  small amount of code, verifying the thread safety of all standard library
  modules (both Python and C code) will probably be a daunting
  (and possibly never-ending) endeavor.&lt;/li&gt;
  &lt;/p&gt;
  &lt;p&gt; &lt;li&gt;Despite removing the GIL, I was unable to produce any
  performance experiment that showed a noticeable improvement on
  multiple cores.  Really, the only benefit (ignoring the horrible
  performance) seen in pure Python code, was having preemptible
  instructions.&lt;/li&gt; &lt;/p&gt;
&lt;/ul&gt;
&lt;p&gt; That's about it for now. I hope you found this trip through time
interesting.  In a future installment, I'll explore the problem of
pushing locked reference counting as far as it can possibly go.  As a
preview, a simple patch involving fewer than a dozen lines of code
makes the whole GIL-less Python run more than twice as fast.  However,
can it go even faster?  Stay tuned.  &lt;/p&gt;
&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4731544515209926406?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4731544515209926406/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4731544515209926406' title='25 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4731544515209926406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4731544515209926406'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html' title='An Inside Look at the GIL Removal Patch of Lore'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>25</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-3874537759043264915</id><published>2011-05-27T14:44:00.000-07:00</published><updated>2011-05-27T14:44:29.113-07:00</updated><title type='text'>Class decorators might also be super!</title><content type='html'>&lt;p&gt;
Recently Raymond Hettinger posted an amazing article &lt;a href="http://rhettinger.wordpress.com/2011/05/26/super-considered-super/"&gt;Python's super() considered super!&lt;/a&gt;.  Even if you think you know what &lt;tt&gt;super()&lt;/tt&gt; does, you should go read it.&lt;/p&gt;

&lt;p&gt;
A commonly cited application of &lt;tt&gt;super()&lt;/tt&gt; is using it to implement a kind of cooperative inheritance as is sometimes found with mixin classes. Consider this code, which is a slight variation of Raymond's example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
import logging
logging.basicConfig(level=logging.INFO)   # so INFO messages are actually shown

class LoggedSetItemMixin:
    def __setitem__(self,index,value):
        logging.info('Setting %r to %r', index,value)
        super().__setitem__(index,value)
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Using this class, you could add logging to any class that implements &lt;tt&gt;__setitem__()&lt;/tt&gt; by combining classes via multiple inheritance.  For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
class LoggingDict(LoggedSetItemMixin,dict):
    pass

class LoggingList(LoggedSetItemMixin,list):
    pass
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Here's some sample output:
&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;d = LoggingDict()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;d['a'] = 1&lt;/b&gt;
INFO:root:Setting 'a' to 1
&gt;&gt;&gt; &lt;b&gt;e = LoggingList([0,1,2])&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;e[0] = 99&lt;/b&gt;
INFO:root:Setting 0 to 99
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
The whole reason that this works is that &lt;tt&gt;super()&lt;/tt&gt; delegates to the next class in the method resolution order (MRO).  Thus, the &lt;tt&gt;__setitem__()&lt;/tt&gt; call in &lt;tt&gt;LoggedSetItemMixin&lt;/tt&gt; actually steps over to the next class in the MRO of whatever kind of instance is being used.  If you find this amazing, consider the fact that &lt;tt&gt;LoggedSetItemMixin&lt;/tt&gt; is using &lt;tt&gt;super()&lt;/tt&gt; even though it doesn't even specify a base class! It's pretty cool--maybe even a slight bit diabolical.&lt;/p&gt;

&lt;p&gt;
As amazing as this is, I've recently been thinking about a completely different approach to these kinds of problems based on class decorators. Consider this function:
&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
def LoggedSetItem(cls):
    orig_setitem = cls.__setitem__
    def __setitem__(self, index, value):
        logging.info('Setting %r to %r', index, value)
        return orig_setitem(self,index,value)
    cls.__setitem__ = __setitem__
    return cls
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;This function is meant to be used as a decorator to class definitions.  For example:
&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
@LoggedSetItem
class LoggingDict(dict):
    pass

@LoggedSetItem
class LoggingList(list):
    pass
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Carefully study the implementation of &lt;tt&gt;LoggedSetItem&lt;/tt&gt;.  As input, it receives a class object.  It then looks up the unbound &lt;tt&gt;__setitem__&lt;/tt&gt; method and stores it in a variable.  This lookup, as it turns out, is doing exactly the same work as &lt;tt&gt;super()&lt;/tt&gt;.   That is, it simply finds the implementation of the method being used by the class regardless of where it is actually located.   After that, the function simply defines a replacement for &lt;tt&gt;__setitem__&lt;/tt&gt; with added logging and attaches it back to the class object.  References to the original implementation of &lt;tt&gt;__setitem__&lt;/tt&gt; are held inside a closure so it all works out.&lt;/p&gt;

&lt;p&gt;
The class decorator approach has several notable features.  First, it doesn't even involve the use of &lt;tt&gt;super()&lt;/tt&gt; (or multiple inheritance for that matter).  Second, as with &lt;tt&gt;super()&lt;/tt&gt;, you don't have to hard-code any class names--the class is simply passed in as an argument. Third, it has very good runtime performance.   This is because the work normally performed by &lt;tt&gt;super()&lt;/tt&gt; is only performed once, at the time of class decoration.  Finally, there is a kind of built-in error checking.  For example, if you try to apply the decorator to a class that doesn't support the required method, you will immediately get an error:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;@LoggedSetItem
class loggedint(int): pass&lt;/b&gt;

Traceback (most recent call last):
  File "&amp;lt;stdin&gt;", line 2, in &amp;lt;module&gt;
  File "logsetitem.py", line 9, in LoggedSetItem
    orig_setitem = cls.__setitem__
AttributeError: type object 'loggedint' has no attribute '__setitem__'
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
As interesting as this is, I have no idea if using class decorators in this manner would be considered to be good practice or not.  One potential problem is that by putting the code in a decorator, a lot of the work is performed just once at the time of class definition.  If a program were playing sneaky tricks like dynamically changing method definitions at runtime, it clearly wouldn't work.   There's also a certain risk that this approach is just too clever for its own good.&lt;/p&gt;
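&lt;p&gt;
The "changing methods at runtime" caveat is easy to demonstrate.  The following sketch uses hypothetical names (&lt;tt&gt;Base&lt;/tt&gt;, &lt;tt&gt;Decorated&lt;/tt&gt;, &lt;tt&gt;Mixed&lt;/tt&gt;, &lt;tt&gt;upper_setitem&lt;/tt&gt;) and records calls in a list instead of using &lt;tt&gt;logging&lt;/tt&gt; so that the result is easy to inspect.  A base class has its &lt;tt&gt;__setitem__&lt;/tt&gt; replaced after both the decorated class and the mixin-based class have been defined.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
calls = []

def logged_setitem(cls):
    orig = cls.__setitem__             # looked up once, at decoration time
    def __setitem__(self, key, value):
        calls.append(("decorator", key))
        return orig(self, key, value)
    cls.__setitem__ = __setitem__
    return cls

class LoggedMixin:
    def __setitem__(self, key, value):
        calls.append(("mixin", key))
        super().__setitem__(key, value)   # looked up again on every call

class Base(dict):
    pass

@logged_setitem
class Decorated(Base):
    pass

class Mixed(LoggedMixin, Base):
    pass

def upper_setitem(self, key, value):
    dict.__setitem__(self, key.upper(), value)

# The "sneaky trick": change the base class after both classes exist.
Base.__setitem__ = upper_setitem

d = Decorated()
d["a"] = 1
m = Mixed()
m["a"] = 1
print(sorted(d), sorted(m))
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
The decorated class keeps calling the implementation it captured in its closure and never sees &lt;tt&gt;upper_setitem&lt;/tt&gt;, while the mixin version picks up the replacement because &lt;tt&gt;super()&lt;/tt&gt; performs its lookup at call time.&lt;/p&gt;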
&lt;p&gt;
Do you see any other downsides?  I'd love to get your feedback.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-3874537759043264915?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/3874537759043264915/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=3874537759043264915' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3874537759043264915'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3874537759043264915'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/05/class-decorators-might-also-be-super.html' title='Class decorators might also be super!'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-5622924048685242226</id><published>2011-04-27T12:45:00.000-07:00</published><updated>2011-04-28T05:05:01.188-07:00</updated><title type='text'>Practical Python with Raymond Hettinger</title><content type='html'>&lt;p&gt;Raymond Hettinger is coming to Chicago May 16-20 to put his unique spin on my &lt;a href="http://www.dabeaz.com/chicago/practical.html"&gt;Practical Python Programming&lt;/a&gt; course.  Although that is coming up soon, there is still time to register and a few slots are still available.  Needless to say, if you've been looking for a class where you can learn more about Python and improve your skills, you won't find a better class anywhere!&lt;/p&gt;

&lt;p&gt;Raymond Hettinger is the same core developer whose name can be found on no fewer than 13 &lt;a href="http://www.python.org/dev/peps/"&gt;PEPs&lt;/a&gt; including a variety of very useful features of modern Python programming.  For example, the &lt;a href="http://www.python.org/dev/peps/pep-0279/"&gt;&lt;tt&gt;enumerate()&lt;/tt&gt;&lt;/a&gt; function that lets you keep track of where you are in iteration such as this example that gives you a line number when reading a file:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; f = open("data.dat")
&gt;&gt;&gt; for lineno, line in enumerate(f,1):
        ...
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Or maybe you like reversing things with the &lt;a href="http://www.python.org/dev/peps/pep-0322/"&gt;&lt;tt&gt;reversed()&lt;/tt&gt;&lt;/a&gt; function:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; for x in reversed(seq):
        ...
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Or what about putting a &lt;a href="http://www.python.org/dev/peps/pep-0378/"&gt;thousands separator&lt;/a&gt; on numbers?&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; x = 123456789
&gt;&gt;&gt; format(x,",")
'123,456,789'
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Or &lt;a href="http://www.python.org/dev/peps/pep-0218/"&gt;sets&lt;/a&gt;?&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; a = set(['a','b','c'])
&gt;&gt;&gt; b = set(['c','d','e'])
&gt;&gt;&gt; a &amp; b
set(['c'])
&gt;&gt;&gt; a | b
set(['a', 'b', 'c', 'd', 'e'])
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
All of these features contain some of Raymond's handiwork.  However, that's really only scratching the surface.  Maybe you've used various features in the &lt;tt&gt;collections&lt;/tt&gt; or &lt;a href="http://svn.python.org/view/python/trunk/Modules/itertoolsmodule.c?view=markup"&gt;&lt;tt&gt;itertools&lt;/tt&gt;&lt;/a&gt; modules. Or maybe you've used &lt;a href="http://www.python.org/dev/peps/pep-0289/"&gt;generator expressions&lt;/a&gt;, one of my favorite Python features.  Again, Raymond's work.&lt;/p&gt;
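&lt;p&gt;
In the same spirit as the examples above, here is a minimal sketch that touches all three.  The word list is made up, and the choice of &lt;tt&gt;Counter&lt;/tt&gt; and &lt;tt&gt;islice&lt;/tt&gt; is just one possible illustration of those modules:&lt;/p&gt;

```python
from collections import Counter
from itertools import islice

# Made-up sample data, purely for illustration
words = ["spam", "eggs", "spam", "ham", "spam", "eggs"]

# Generator expression: lengths are produced one at a time, no temporary list
total_chars = sum(len(w) for w in words)

# collections.Counter tallies how often each item occurs
counts = Counter(words)

# itertools.islice takes the first few items of any iterable
first_two = list(islice(words, 2))

print(total_chars, counts["spam"], first_two)
```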
&lt;p&gt;
Last, but not least, Raymond is a well-known speaker and presenter.  I distinctly remember seeing him give one of the most amazing presentations at PyCon UK in 2008 about the inner secrets of Python containers--a talk that left me thinking "I had no idea Python worked like that."  At PyCon'2011 Raymond gave a well-received talk about &lt;a href="http://blip.tv/file/4883290"&gt;API Design&lt;/a&gt;.  &lt;b&gt;Update:&lt;/b&gt;  Raymond is giving no fewer than 6 talks at &lt;a href="http://ep2011.europython.eu/conference/speakers/raymond-hettinger"&gt;EuroPython&lt;/a&gt; including an &lt;a href="http://ep2011.europython.eu/conference/talks/what-makes-python-so-awesome"&gt;invited keynote talk&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;
So, if you're thinking about learning more about Python, you could certainly read an online tutorial, watch a video, or take a class where an instructor shows up.   Or, you can join five other developers for an in-depth class created by the author who wrote one of the most &lt;a href="http://www.amazon.com/Python-Essential-Reference-David-Beazley/dp/0672329786"&gt;in-depth Python books&lt;/a&gt; and presented by a core developer who knows Python inside-out.  Needless to say,  you won't be disappointed.&lt;/p&gt;
&lt;p&gt;
As a bonus, if you stick around for Friday afternoon, you can have your head completely exploded by signing up for my &lt;a href="http://www.dabeaz.com/chicago/hardpython.html"&gt;Learn Hard Python&lt;/a&gt; seminar--a 3 hour tour through some of Python's most advanced features including descriptors, super(), function objects, closures, decorators, context managers, and metaclasses.&lt;/p&gt;
&lt;p&gt;
Hopefully you'll join Raymond and myself for a great week of Python.  More information is available at &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;http://www.dabeaz.com/chicago/index.html&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-5622924048685242226?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/5622924048685242226/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=5622924048685242226' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5622924048685242226'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5622924048685242226'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/04/practical-python-with-raymond-hettinger.html' title='Practical Python with Raymond Hettinger'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-3003461230942627231</id><published>2011-04-04T07:23:00.001-07:00</published><updated>2011-04-24T13:18:05.973-07:00</updated><title type='text'>Learn Python from Raymond Hettinger in Chicago</title><content type='html'>&lt;p&gt;
In one of the hallway tracks at Pycon, &lt;a href="http://rhettinger.wordpress.com/"&gt;Raymond Hettinger&lt;/a&gt; came up to me and said "I've been thinking about teaching a Python class."  Needless to say, I couldn't pass on that kind of opportunity.  So, I'm pleased to announce that Raymond is coming to Chicago, May 16-20 to put his unique spin on my &lt;a href="http://www.dabeaz.com/chicago/practical.html"&gt;Practical Python Programming&lt;/a&gt; course along with an assortment of his own material.  The course is being held in my Python lair so I'll stop by to say "hi" before leaving you in the hands of one of Python's foremost experts. In short, this might be one of the most fantastic Python courses ever offered--and as with past courses, space is limited to just six students.&lt;/p&gt;
&lt;p&gt;
In case you're not so familiar with Raymond's work, let's just say that it's hard to escape it if you've done any kind of Python programming at all.  Not only is Raymond a Python core developer responsible for numerous features such as collections, itertools, sets, generator expressions, and the peephole optimizer, he is a well-known Pycon speaker and board member of the Python Software Foundation.   In short, if you take this class, you'll not only learn about features of the Python language, you'll be learning from the person who contributed many of them in the first place.&lt;/p&gt;
&lt;p&gt;
I should emphasize that this class is really designed for new Python programmers who want to get off to a great start.  As long as you know about general programming concepts, no prior Python experience is required.  Of course, even if you know some Python, you are still going to learn a wide variety of new and interesting things.&lt;/p&gt;

&lt;p&gt;
More information about this and other courses is available &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;here&lt;/a&gt;.   Hopefully you'll join Raymond in May!&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Update (April 24, 2011)&lt;/b&gt; There is just one slot left!  What are you waiting for?&lt;/p&gt;

&lt;p&gt;
-- Dave
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-3003461230942627231?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/3003461230942627231/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=3003461230942627231' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3003461230942627231'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3003461230942627231'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/04/learn-python-from-raymond-hettinger-in.html' title='Learn Python from Raymond Hettinger in Chicago'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-2863507789180218103</id><published>2011-03-15T13:59:00.001-07:00</published><updated>2011-03-15T13:59:04.165-07:00</updated><title type='text'>The Superboard Takes Pycon!</title><content type='html'>&lt;p&gt;
Well, the Superboard and I are back in Chicago after surviving PyCon. What a great conference--it's always exciting to see 1400 enthusiastic Python programmers in one place!&lt;/p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/images/osi_small.jpg"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;p&gt;In case you missed it, you can now watch the video of my &lt;a href="http://pycon.blip.tv/file/4878868/"&gt;Building a Cloud Computing Service for my Superboard II&lt;/a&gt; presentation.  In this post, I just briefly wanted to fill in more details about the talk, including some links to prior blog posts, ported libraries, code, etc.&lt;/p&gt;
&lt;p&gt;
First, as background, you might check out some of my earlier blog posts that describe audio encoding/decoding as well as the problem of building an emulated version of the Superboard using Py65.  Here are some links:&lt;/p&gt;
&lt;p&gt;
&lt;ul&gt;
&lt;li&gt;19 Jan 2011. &lt;a href="http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html"&gt;Porting Py65 (and my Superboard) to Python 3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;04 Sep 2010. &lt;a href="http://dabeaz.blogspot.com/2010/09/using-telnet-to-access-my-superboard-ii.html"&gt;Using telnet to access my Superboard II (via Python and cassette ports)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;29 Aug 2010. &lt;a href="http://dabeaz.blogspot.com/2010/08/decoding-superboard-ii-cassette-audio.html"&gt;Decoding Superboard II Cassette Audio Using Python 3, Two Generators, and a Deque&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;22 Aug 2010. &lt;a href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;Using Python to Encode Cassette Recordings for my Superboard II&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
An &lt;a href="http://blip.tv/file/4639616/"&gt;earlier talk&lt;/a&gt; about the Superboard was given at the January, 2011 Chipy meeting.  This talk was quite a bit different than the Pycon presentation and focused more on the problem of building an emulated Superboard.  It also includes some live demos and more general history about the Superboard.&lt;/p&gt;

&lt;p&gt;
In the Pycon talk, I described how I built a 6502 assembler from scratch.  At one point, I was planning on writing a separate blog post about that, but for now, you can just look at the raw code &lt;a href="http://www.dabeaz.com/superboard/asm6502.py"&gt;here&lt;/a&gt;.   Related to that, you can also see the assembly code for the Superboard &lt;a href="http://www.dabeaz.com/superboard/msgdrv.asm"&gt;messaging driver&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://zeromq.org"&gt;ZeroMQ&lt;/a&gt; played a big role in the project--specifically, I used it to build all sorts of client applications on the Macintosh.  The starting point for that code was a program &lt;a href="http://www.dabeaz.com/superboard/aciamsg.py"&gt;aciamsg.py&lt;/a&gt; that implemented the binary messaging link to the Superboard and bridged it to clients via a set of ZeroMQ sockets.  Client services were supported by a class defined in &lt;a href="http://www.dabeaz.com/superboard/msgservice.py"&gt;msgservice.py&lt;/a&gt;.  For example, &lt;a href="http://www.dabeaz.com/superboard/divmod.py"&gt;divmod.py&lt;/a&gt; computes the divmod of two variables and &lt;a href="http://www.dabeaz.com/superboard/fibo.py"&gt;fibo.py&lt;/a&gt; computes fibonacci numbers.&lt;/p&gt;
&lt;p&gt;
An emulated Superboard was created using Py65.  An earlier &lt;a href="http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html"&gt;blog post&lt;/a&gt; describes that project, but the version I used for my Pycon talk is in a file &lt;a href="http://www.dabeaz.com/superboard/superboard2.py"&gt;superboard2.py&lt;/a&gt;.  Essentially, it emulates a Superboard in a VT100-compatible terminal window.  Operations on the video RAM are translated into VT100-compatible terminal commands.  You might be shocked at the size of the emulator--it's only around 220 lines.&lt;/p&gt;
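&lt;p&gt;
To give a rough idea of what translating a video RAM write into a terminal command can look like, here is a minimal sketch.  It is not the code from &lt;tt&gt;superboard2.py&lt;/tt&gt;: the function name, the &lt;tt&gt;0xD000&lt;/tt&gt; base address, and the 32-column layout are assumptions made for illustration.  The only standard piece is the VT100 cursor-position sequence (ESC [ row ; col H, with 1-based coordinates).&lt;/p&gt;

```python
VIDEO_BASE = 0xD000   # assumed start of video RAM (illustrative)
COLUMNS = 32          # assumed number of characters per row (illustrative)

def vt100_write(address, value):
    """Translate one write to video RAM into a VT100 escape sequence."""
    offset = address - VIDEO_BASE
    row, col = divmod(offset, COLUMNS)
    # ESC [ row ; col H moves the cursor (coordinates are 1-based),
    # then the character itself is emitted at that position.
    return "\x1b[%d;%dH%s" % (row + 1, col + 1, chr(value))

# A write of 'A' to the first video cell moves the cursor to row 1, column 1
print(repr(vt100_write(0xD000, 0x41)))
```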
&lt;p&gt;
For the cloud service, a special &lt;a href="http://www.dabeaz.com/superboard/supercloud.py"&gt;supercloud.py&lt;/a&gt; service is used to listen for USR(0) requests. This service feeds work into a queue that is processed by a &lt;a href="http://www.dabeaz.com/superboard/superun.py"&gt;superrun.py&lt;/a&gt; program, which runs emulated Superboards in the background.  The code is actually written in a way that allows for different implementations of the job queue and program store.  In the talk, I described the use of &lt;a href="http://redis.io"&gt;Redis&lt;/a&gt;, but that's not the only option.&lt;/p&gt;
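&lt;p&gt;
One way to picture a swappable job queue is sketched below.  The class name, the &lt;tt&gt;put()&lt;/tt&gt;/&lt;tt&gt;get()&lt;/tt&gt; interface, and the &lt;tt&gt;run_jobs()&lt;/tt&gt; helper are all hypothetical and are not taken from &lt;tt&gt;supercloud.py&lt;/tt&gt;; the point is only that a worker written against a small interface does not care whether the queue lives in memory or in something like Redis.&lt;/p&gt;

```python
from collections import deque

class MemoryJobQueue:
    """In-memory stand-in for a job queue (hypothetical interface)."""
    def __init__(self):
        self._jobs = deque()

    def put(self, job):
        self._jobs.append(job)

    def get(self):
        # Return the oldest job, or None when the queue is empty
        return self._jobs.popleft() if self._jobs else None

def run_jobs(queue, handler):
    """Drain any object offering get() and collect the handler results."""
    results = []
    while True:
        job = queue.get()
        if job is None:
            break
        results.append(handler(job))
    return results

queue = MemoryJobQueue()
queue.put("job-1")
queue.put("job-2")
print(run_jobs(queue, str.upper))
```

&lt;p&gt;
A different backend would only need to offer the same two methods.&lt;/p&gt;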
&lt;p&gt;
There are a few other bits of code not shown, but the above fragments should give you enough of a general idea how things were put together.  I have to admit that some of the code was rather hastily written so don't expect too much from it.&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;When did you find time to do this?&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
This entire Superboard project was nothing more than an interesting hobby project. Most of the really hard work, including the audio encoding/decoding, the 6502 assembler, and the messaging device driver, was done over a couple of weekends in late August 2010. For a few months after that, I messed around with different possible designs of a "cloud service", but since I also had other work to do, progress was spread out and sporadic.  Initially, I thought the service was going to implement a kind of remote "gosub" service (i.e., BASIC programs on the Superboard could remotely GOSUB to code living elsewhere and that remote code would be able to see the BASIC workspace via shared memory), but that never really panned out. In January 2011, I was fooling around with Py65 and created the first emulated Superboard.  That work resulted in the final design of the system presented at Pycon (namely, having a cloud of emulated Superboard instances).  I have to admit that I liked this design much better than my original GOSUB idea.&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Ported Python3 libraries&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
You can find some of the libraries I ported to Python 3 on &lt;a href="http://github.com/dabeaz"&gt;Github&lt;/a&gt;.  Some of the other libraries are still just sitting on my machine. Eventually I'm hoping to have everything published online on my Github account as time allows.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;The Superboard's Favorite Pycon Talks&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
I wanted to mention a few really outstanding Pycon talks that I attended.  First, check out Richard Saunders's &lt;a href="http://pycon.blip.tv/file/4882867/"&gt;Everything You Wanted to Know About Pickling, But Were Afraid To Ask&lt;/a&gt;. I have to admit that to me, pickling is almost more mysterious than the GIL.  Richard did a great job peeling back the covers.   I also really enjoyed Van Lindberg's &lt;a href="http://pycon.blip.tv/file/4879824/"&gt;How to Kill a Patent with Python&lt;/a&gt;.  Van Lindberg is one diabolical lawyer indeed.&lt;/p&gt;
&lt;p&gt;
As always, I enjoyed meeting everyone at Pycon.   If you ever want to meet the Superboard in person, you should come to one of my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python courses&lt;/a&gt; in Chicago.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-2863507789180218103?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/2863507789180218103/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=2863507789180218103' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2863507789180218103'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2863507789180218103'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/03/superboard-takes-pycon.html' title='The Superboard Takes Pycon!'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-7249239556948612695</id><published>2011-02-04T11:12:00.000-08:00</published><updated>2011-02-09T00:46:05.542-08:00</updated><title type='text'>Does Anyone In Australia Want a Free Python3 PyCon Tutorial?</title><content type='html'>&lt;p&gt;&lt;b&gt;Update : Feb 9, 2011:&lt;/b&gt;  The tutorial is a go in Melbourne for Saturday, February 12, 2011 at 2pm!  Contact Steven.cyphers@gmail.com to RSVP.&lt;/p&gt;

&lt;p&gt;
Well, the title of this post just about says it all. I'm heading down under to do some Python training in Canberra, but I have a free weekend February 12-13, 2011.  So, I'm wondering if anyone might have an interest in attending a free preview of my PyCon'2011 tutorial on &lt;a href="http://us.pycon.org/2011/schedule/sessions/122/"&gt;Mastering Python 3 I/O&lt;/a&gt;.   Here are the ground rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You provide the space, supply a video projector, and deal with any logistics concerning the location.&lt;/li&gt;
&lt;li&gt;You tell me where it is.&lt;/li&gt;
&lt;li&gt;I show up.&lt;/li&gt;
&lt;li&gt;We have a great time talking about Python 3 for half a day.&lt;/li&gt;
&lt;li&gt;Beers to follow.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Although I'm staying in Canberra, I can travel anywhere nearby that is easy to get to by plane including Sydney and Melbourne (in fact, travel is preferable since I also want to play tourist).   Send me an &lt;a href="mailto:dave@dabeaz.com"&gt;email&lt;/a&gt; if you're interested.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-7249239556948612695?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/7249239556948612695/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=7249239556948612695' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7249239556948612695'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7249239556948612695'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/02/does-anyone-in-australia-want-free.html' title='Does Anyone In Australia Want a Free Python3 PyCon Tutorial?'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-5523699269187039520</id><published>2011-01-19T15:38:00.000-08:00</published><updated>2011-01-19T15:38:36.797-08:00</updated><title type='text'>Porting Py65 (and my Superboard) to Python 3</title><content type='html'>&lt;p&gt;
One of my resolutions for 2011 is to write all of my software in
Python 3.  As a hardened Python 2 programmer, I think my initial reaction
to Python 3 was lukewarm at best--it felt foreign and it made life
painful in ways that I found irritating (looking at you, Unicode). However, as I have used it
more (and it has improved), I've really grown to like it.  Most
recently, I used Python 3 as the base language for my &lt;a
href="http://www.dabeaz.com/chicago/concurrent.html"&gt;Concurrency
Workshop&lt;/a&gt;.   I have also been using it as the language for my
various diabolical &lt;a
href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;Superboard
II&lt;/a&gt; projects.   Last, but not least, I find myself as one of the
editors working to update the O'Reilly Python Cookbook--which is going
to be &lt;a href="http://dabeaz.blogspot.com/2010/12/oreilly-python-cookbook-python-3-all.html"&gt;Python 3 only&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
If you're going to use Python 3, the first thing to know is that not
all libraries are going to work--not everyone has gotten around to
porting their code.   This means that you have to adopt a more
"pioneering" mindset.    In my case, I've simply decided to port the
libraries that I wanted to use as I go.  From a purely academic
viewpoint, taking someone else's code and porting it to Python 3 is an
interesting exercise.  Not only will you learn a lot simply by reading
someone else's code, you'll learn about all sorts of sneaky little gotchas
that aren't necessarily discussed in the Python 3 porting guides.
&lt;/p&gt;
&lt;p&gt;
Over the next few months, I intend to make a series of blog posts
about my experiences porting different libraries.  In this
installment, I port Py65, a Python emulation of the 6502.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Py65 - A 6502 Emulator in Python&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt; &lt;a href="https://github.com/mnaberez/py65"&gt;Py65&lt;/a&gt; is a pure
Python emulation of the 6502 microprocessor created by Mike Naberezny.
I don't really know what motivated Mike to create an emulated 6502 in
Python, but I became interested in Py65 because I suddenly had the
idea that I might be able to use it to create an emulated version of my
old &lt;a
 href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;Superboard
II&lt;/a&gt; entirely as a Python 3 program.  Why, you ask?  Because it
would be fun. Now, stop asking silly questions--the Superboard is
getting annoyed.&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Py65 - A Quick Overview&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
One of the main features of Py65 is a 6502 machine monitor
where you can load/save memory, step through programs, and try things
out.  For example, if you had an old 6502 ROM image sitting around,
you can load it, disassemble it, and step through parts of it like this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
bash % &lt;b&gt;py65mon&lt;/b&gt;
Py65 Monitor

       PC  AC XR YR SP NV-BDIZC
6502: 0000 00 00 00 ff 00110000
&lt;b&gt;.load rom.bin f800&lt;/b&gt;
Wrote +2048 bytes from $f800 to $ffff

       PC  AC XR YR SP NV-BDIZC
6502: 0000 00 00 00 ff 00110000
&lt;b&gt;.disassemble ff00:ff20&lt;/b&gt;
$ff00  d8        CLD
$ff01  a2 28     LDX #$28
$ff03  9a        TXS
$ff04  a0 0a     LDY #$0a
$ff06  b9 ef fe  LDA $feef,Y
$ff09  99 17 02  STA $0217,Y
$ff0c  88        DEY
$ff0d  d0 f7     BNE $ff06
$ff0f  20 a6 fc  JSR $fca6
$ff12  8c 12 02  STY $0212
$ff15  8c 03 02  STY $0203
$ff18  8c 05 02  STY $0205
$ff1b  8c 06 02  STY $0206
$ff1e  ad e0 ff  LDA $ffe0

       PC  AC XR YR SP NV-BDIZC
6502: 0000 00 00 00 ff 00110000
&lt;b&gt;.registers pc=ff00&lt;/b&gt;

       PC  AC XR YR SP NV-BDIZC
6502: ff00 00 00 00 ff 00110000
&lt;b&gt;.step&lt;/b&gt;
$ff01  a2 28     LDX #$28

       PC  AC XR YR SP NV-BDIZC
6502: ff01 00 00 00 ff 00110000
&lt;b&gt;.step&lt;/b&gt;
$ff03  9a        TXS

       PC  AC XR YR SP NV-BDIZC
6502: ff03 00 28 00 ff 00110000
...
&lt;/pre&gt;
&lt;/blockquote&gt;
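&lt;p&gt;
If the status column in that output looks cryptic, the eight digits line up one-for-one with the flag letters in the &lt;tt&gt;NV-BDIZC&lt;/tt&gt; header.  The helper below is a small sketch for reading it; &lt;tt&gt;decode_flags()&lt;/tt&gt; is a made-up name and is not part of Py65.&lt;/p&gt;

```python
FLAG_NAMES = "NV-BDIZC"   # header from the monitor: bit 7 on the left, bit 0 on the right

def decode_flags(bits):
    """Return the flag letters that are set in a status string such as '00110000'."""
    return [name for name, bit in zip(FLAG_NAMES, bits) if bit == "1"]

# The status shown in the session above: only the unused bit and B are set
print(decode_flags("00110000"))
```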
&lt;p&gt;
Of course, there are many other features described in the
&lt;a href="http://6502.org/users/mike/projects/py65/index.html"&gt;Py65 Documentation&lt;/a&gt;.
&lt;p&gt;
&lt;b&gt;Porting Py65 to Python 3&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Py65 consists of 27 &lt;tt&gt;.py&lt;/tt&gt; files and about 12000 lines of code.
More than half of the code consists of unit tests.&lt;/p&gt;
&lt;p&gt;
To start porting, I decided that I would just run all of the files
through &lt;tt&gt;2to3&lt;/tt&gt; to get a basic sense for what I might have to
change at a syntactic level.  Here is the complete output of doing that.  In a nutshell,
36 lines were identified.   Most of the changes were due to well-known
Python 3 changes such as changed exception handling syntax,
&lt;tt&gt;xrange()&lt;/tt&gt; and so forth.&lt;/p&gt;
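&lt;p&gt;
As a quick illustration of those well-known changes before the full output, here is a small Python 3 sketch.  The &lt;tt&gt;labels&lt;/tt&gt; dictionary is invented for the example and is not Py65 code.&lt;/p&gt;

```python
# Invented sample data, loosely modeled on a label table
labels = {"start": 0xC000, "loop": 0xC010}

# dict.iteritems() is gone; items() returns a view that can be iterated directly
pairs = sorted((addr, name) for name, addr in labels.items())

# map() and zip() return lazy iterators; wrap them in list() when a real list is needed
hex_addrs = sorted(map(hex, labels.values()))

# xrange() is gone; range() is now the lazy version
addresses = list(range(0xC000, 0xC001 + 1))

# "except KeyError, why" becomes "except KeyError as why".
# Exceptions are also no longer indexable in Python 3, so use why.args[0].
try:
    labels["missing"]
except KeyError as why:
    message = why.args[0]

print(pairs, hex_addrs, addresses, message)
```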
&lt;blockquote&gt;
&lt;pre&gt;
bash % &lt;b&gt;2to3 src&lt;/b&gt;
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: set_literal
RefactoringTool: Skipping implicit fixer: ws_comma
--- src/py65/monitor.py (original)
+++ src/py65/monitor.py (refactored)
@@ -32,7 +32,7 @@
             result = cmd.Cmd.onecmd(self, line)
         except KeyboardInterrupt:
             self._output("Interrupt")
-        except Exception,e:
+        except Exception as e:
             (file, fun, line), t, v, tbinfo = compact_traceback()
             error = 'Error: %s, %s: file: %s line: %s' % (t, v, file, line)
             self._output(error)
@@ -85,7 +85,7 @@
           line = self._shortcuts['~'] + ' ' + line[1:]
       
         # command shortcuts
-        for shortcut, command in self._shortcuts.iteritems():
+        for shortcut, command in self._shortcuts.items():
             if line == shortcut:
                 line = command
                 break
@@ -150,7 +150,7 @@
         mpus = {'6502': NMOS6502, '65C02': CMOS65C02}
         
         def available_mpus():
-            mpu_list = ', '.join(mpus.keys())
+            mpu_list = ', '.join(list(mpus.keys()))
             self._output("Available MPUs: %s" % mpu_list)            
         
         if args == '':                      
@@ -315,14 +315,14 @@
         if args != '':
             new = args[0].lower()
             changed = False
-            for name, radix in radixes.iteritems():
+            for name, radix in radixes.items():
                 if name[0].lower() == new:
                     self._address_parser.radix = radix
                     changed = True
             if not changed:
                 self._output("Illegal radix: %s" % args)
 
-        for name, radix in radixes.iteritems():
+        for name, radix in radixes.items():
             if self._address_parser.radix == radix:
                 self._output("Default radix is %s" % name)
 
@@ -364,7 +364,7 @@
                     if len(register) == 1:
                         intval &amp;amp;= 0xFF
                     setattr(self._mpu, register, intval)
-                except KeyError, why:
+                except KeyError as why:
                     self._output(why[0])
     
     def help_cd(self, args):
@@ -374,7 +374,7 @@
     def do_cd(self, args):
         try:
             os.chdir(args)
-        except OSError, why:
+        except OSError as why:
             msg = "Cannot change directory: [%d] %s" % (why[0], why[1])
             self._output(msg)
         self.do_pwd()
@@ -407,12 +407,12 @@
             f = open(filename, 'rb')
             bytes = f.read()
             f.close()
-        except (OSError, IOError), why:
+        except (OSError, IOError) as why:
             msg = "Cannot load file: [%d] %s" % (why[0], why[1])
             self._output(msg)
             return
 
-        self._fill(start, start, map(ord, bytes))
+        self._fill(start, start, list(map(ord, bytes)))
 
     def do_save(self, args):
         split = shlex.split(args)
@@ -430,7 +430,7 @@
             for byte in bytes:
                 f.write(chr(byte))
             f.close()
-        except (OSError, IOError), why:
+        except (OSError, IOError) as why:
             msg = "Cannot save file: [%d] %s" % (why[0], why[1])
             self._output(msg)
             return
@@ -455,7 +455,7 @@
             return
 
         start, end = self._address_parser.range(split[0])
-        filler = map(self._address_parser.number, split[1:])
+        filler = list(map(self._address_parser.number, split[1:]))
         
         self._fill(start, end, filler)
 
@@ -518,10 +518,10 @@
         self._output("Display current label mappings.")
 
     def do_show_labels(self, args):
-        values = self._address_parser.labels.values()
-        keys = self._address_parser.labels.keys()
+        values = list(self._address_parser.labels.values())
+        keys = list(self._address_parser.labels.keys())
       
-        byaddress = zip(values, keys)
+        byaddress = list(zip(values, keys))
         byaddress.sort()
         for address, label in byaddress:
             self._output("%04x: %s" % (address, label))
--- src/py65/tests/test_memory.py (original)
+++ src/py65/tests/test_memory.py (refactored)
@@ -56,7 +56,7 @@
         def read_subscriber(address, value):
             return 0xAB
         
-        mem.subscribe_to_read(xrange(0xC000, 0xC001+1), read_subscriber)
+        mem.subscribe_to_read(range(0xC000, 0xC001+1), read_subscriber)
     
         mem[0xC000] = 0xAB
         mem[0xC001] = 0xAB
@@ -141,7 +141,7 @@
             return 0xFF
         mem.subscribe_to_write([0xC000,0xC001], write_subscriber)
         
-        mem.write(0xC000, [0x01, 002])
+        mem.write(0xC000, [0x01, 0o02])
         self.assertEqual(0x01, subject[0xC000])
         self.assertEqual(0x02, subject[0xC001])
 
--- src/py65/tests/test_monitor.py (original)
+++ src/py65/tests/test_monitor.py (refactored)
@@ -4,7 +4,7 @@
 import os
 import tempfile
 from py65.monitor import Monitor
-from StringIO import StringIO
+from io import StringIO
 
 class MonitorTests(unittest.TestCase):
 
@@ -168,7 +168,7 @@
         mon = Monitor(stdout=stdout)
         mon._address_parser.labels['foo'] = 0xc000
         mon.do_delete_label('foo') 
-        self.assertFalse(mon._address_parser.labels.has_key('foo'))
+        self.assertFalse('foo' in mon._address_parser.labels)
         out = stdout.getvalue()
         self.assertEqual('', out)
 
--- src/py65/tests/devices/test_mpu6502.py (original)
+++ src/py65/tests/devices/test_mpu6502.py (refactored)
@@ -4979,8 +4979,7 @@
     self.assertEquals(0x0001, mpu.pc)
 
   def test_decorated_addressing_modes_are_valid(self):
-    valid_modes = map(lambda x: x[0], 
-                      py65.assembler.Assembler.Addressing)
+    valid_modes = [x[0] for x in py65.assembler.Assembler.Addressing]
     mpu = self._make_mpu()
     for name, mode in mpu.disassemble:
         self.assert_(mode in valid_modes)
@@ -5024,12 +5023,12 @@
   def _make_mpu(self, *args, **kargs):
     klass = self._get_target_class()
     mpu = klass(*args, **kargs)
-    if not kargs.has_key('memory'):
+    if 'memory' not in kargs:
         mpu.memory = 0x10000 * [0xAA]
     return mpu
   
   def _get_target_class(self):
-    raise NotImplementedError, "Target class not specified"
+    raise NotImplementedError("Target class not specified")
 
 
 class MPUTests(unittest.TestCase, Common6502Tests):      
--- src/py65/tests/utils/test_addressing.py (original)
+++ src/py65/tests/utils/test_addressing.py (refactored)
@@ -48,7 +48,7 @@
     try:
       parser.number('bad_label')
       self.fail()
-    except KeyError, why:
+    except KeyError as why:
       self.assertEqual('Label not found: bad_label', why[0])
 
   def test_number_label_hex_offset(self):
@@ -94,7 +94,7 @@
     try:
       parser.number('bad_label+3')
       self.fail()
-    except KeyError, why:
+    except KeyError as why:
       self.assertEqual('Label not found: bad_label', why[0])    
 
   def test_number_truncates_address_at_maxwidth_16(self):
--- src/py65/tests/utils/test_hexdump.py (original)
+++ src/py65/tests/utils/test_hexdump.py (refactored)
@@ -27,7 +27,7 @@
         try:
             Loader(text)
             self.fail()
-        except ValueError, why:
+        except ValueError as why:
             msg = 'Start address was not found in data'
             self.assert_(why[0].startswith('Start address'))
 
@@ -36,7 +36,7 @@
         try:
             Loader(text)
             self.fail()
-        except ValueError, why:                   
+        except ValueError as why:                   
             msg = 'Could not parse address: oops'
             self.assertEqual(msg, why[0])
 
@@ -45,7 +45,7 @@
         try:
             Loader(text)
             self.fail()
-        except ValueError, why:                   
+        except ValueError as why:                   
             msg = 'Expected address to be 2 bytes, got 1'
             self.assertEqual(msg, why[0])
 
@@ -54,7 +54,7 @@
         try:
             Loader(text)
             self.fail()
-        except ValueError, why:                   
+        except ValueError as why:                   
             msg = 'Expected address to be 2 bytes, got 3'
             self.assertEqual(msg, why[0])
 
@@ -63,7 +63,7 @@
         try:
             Loader(text)
             self.fail()
-        except ValueError, why:
+        except ValueError as why:
             msg = 'Non-contigous block detected.  Expected next ' \
                   'address to be $c001, label was $c002'
             self.assertEqual(msg, why[0])
@@ -73,7 +73,7 @@
         try:
             Loader(text)
             self.fail()
-        except ValueError, why:
+        except ValueError as why:
             msg = 'Could not parse data: foo'
             self.assertEqual(msg, why[0]) 
 
--- src/py65/utils/addressing.py (original)
+++ src/py65/utils/addressing.py (refactored)
@@ -26,7 +26,7 @@
     def label_for(self, address, default=None):
         """Given an address, return the corresponding label or a default.
         """
-        for label, label_address in self.labels.iteritems():
+        for label, label_address in self.labels.items():
             msg = "Expected address to be 2 bytes, got %d" % (
                                                     len(addr_bytes))
-            raise ValueError, msg
+            raise ValueError(msg)
 
         address = (addr_bytes[0] &amp;lt;&amp;lt; 8) + addr_bytes[1]
 
@@ -62,19 +62,19 @@
             msg = "Non-contigous block detected.  Expected next address " \
                   "to be $%04x, label was $%04x" % (self.current_address, 
                                                                     address)
-            raise ValueError, msg
+            raise ValueError(msg)
 
     def _parse_bytes(self, piece):
         if self.start_address is None:
             msg = "Start address was not found in data"
-            raise ValueError, msg
+            raise ValueError(msg)
 
         else:
             try:
                 bytes = [ ord(c) for c in a2b_hex(piece) ]  
             except (TypeError, ValueError):
                 msg = "Could not parse data: %s" % piece
-                raise ValueError, msg
+                raise ValueError(msg)
 
             self.current_address += len(bytes)
             self.data.extend(bytes)
RefactoringTool: Files that need to be modified:
RefactoringTool: src/py65/monitor.py
RefactoringTool: src/py65/tests/test_memory.py
RefactoringTool: src/py65/tests/test_monitor.py
RefactoringTool: src/py65/tests/devices/test_mpu6502.py
RefactoringTool: src/py65/tests/utils/test_addressing.py
RefactoringTool: src/py65/tests/utils/test_hexdump.py
RefactoringTool: src/py65/utils/addressing.py
RefactoringTool: src/py65/utils/hexdump.py
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Not seeing anything too critical, I decided to invoke &lt;tt&gt;2to3 -w&lt;/tt&gt; to
simply patch all of the code.   However, I must emphasize--using
&lt;tt&gt;2to3&lt;/tt&gt; is almost never enough to make a Python 3 port.    In
the next few sections, I discuss a few tricky porting problems
encountered in making the patched library work.  This is by no means an
exhaustive list.
&lt;/p&gt;
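&lt;p&gt;
For reference, most of what is visible in the diff above falls into a handful of
mechanical rewrites.  Here is a condensed sketch of the converted forms.  The values
are made up for illustration and are not Py65 data:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
kargs = {"memory": [0] * 4}
labels = {"start": 0xc000}

# kargs.has_key('memory') becomes the "in" operator
has_memory = "memory" in kargs

# labels.iteritems() becomes labels.items()
pairs = [(label, address) for label, address in labels.items()]

# "raise ValueError, msg" becomes a call and
# "except ValueError, why" becomes "except ValueError as why"
try:
    raise ValueError("Start address was not found in data")
except ValueError as why:
    caught = str(why)

# map(lambda x: x[0], seq) becomes a list comprehension
modes = [x[0] for x in [("imp", 1), ("acc", 2)]]

print(has_memory, pairs, caught, modes)
```
&lt;/pre&gt;
&lt;/blockquote&gt;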
&lt;p&gt;
&lt;b&gt;Python 3 Porting Issue : Exception Indexing&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
In several places, Py65 performs an indexed lookup on exception values.
For example, consider this fragment:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
try:
   f = open("somebadfile")
except IOError as why:
   msg = "Cannot open file: [%d] %s" % (why[0], why[1])
   print(msg)
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
If you try this code in Python 2, it works.  However, if you try it in
Python 3, it crashes with a &lt;tt&gt;TypeError&lt;/tt&gt; like this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
Traceback (most recent call last):
  File "&amp;lt;stdin&gt;", line 2, in &amp;lt;module&gt;
IOError: [Errno 2] No such file or directory: 'somebadfile'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "&amp;lt;stdin&gt;", line 4, in &amp;lt;module&gt;
TypeError: 'IOError' object is not subscriptable
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Under the covers, exceptions hold their value in an
&lt;tt&gt;args&lt;/tt&gt; tuple. In Python 2, operations such as &lt;tt&gt;why[0]&lt;/tt&gt;
and &lt;tt&gt;why[1]&lt;/tt&gt; would simply return &lt;tt&gt;why.args[0]&lt;/tt&gt; and
&lt;tt&gt;why.args[1]&lt;/tt&gt;.  Python 3 exceptions no longer support indexing, so you can't
rely on it.  The fix is to either refer to &lt;tt&gt;args&lt;/tt&gt; directly
or, better, to use the documented exception attributes.  For example: &lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
try:
   f = open("somebadfile")
except IOError as why:
   msg = "Cannot open file: [%d] %s" % (why.errno, why.strerror)
   print(msg)
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
In Py65, I identified 12 lines where exceptions are indexed in this
manner.  Most of those changes were in unit tests that checked for specific
exception messages and error codes.
&lt;/p&gt;
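&lt;p&gt;
Exceptions that carry only a message, such as the &lt;tt&gt;KeyError&lt;/tt&gt; and
&lt;tt&gt;ValueError&lt;/tt&gt; cases in the unit tests, have no named attribute to fall back
on, so &lt;tt&gt;args&lt;/tt&gt; is the way to go.  A minimal sketch, where
&lt;tt&gt;lookup()&lt;/tt&gt; is a made-up stand-in for the parser and not Py65 code:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
# Hypothetical helper that fails the way the address parser does
def lookup(labels, name):
    if name not in labels:
        raise KeyError("Label not found: %s" % name)
    return labels[name]

try:
    lookup({}, "bad_label")
except KeyError as why:
    # why[0] fails on Python 3; why.args[0] works on both versions
    message = why.args[0]

print(message)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
For a &lt;tt&gt;KeyError&lt;/tt&gt;, comparing against &lt;tt&gt;args[0]&lt;/tt&gt; is also safer than
comparing against &lt;tt&gt;str(why)&lt;/tt&gt;, which wraps the message in quotes.
&lt;/p&gt;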
&lt;p&gt; While we're on the subject of exceptions, it's also worth noting
that the scope of the &lt;tt&gt;why&lt;/tt&gt; variable in the above example is
different in Python 3.  Specifically, exception variables are only
defined for code inside the &lt;tt&gt;except&lt;/tt&gt; block; the name is deleted
when the block ends.  In Python 2, such
variables persist after the &lt;tt&gt;try-except&lt;/tt&gt; statement.&lt;/p&gt;
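&lt;p&gt;
A short sketch of that scoping rule as it behaves on Python 3.  If the exception
is needed later, one option is to bind it to a second name inside the block
(&lt;tt&gt;saved&lt;/tt&gt; is just an illustrative name):
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
try:
    int("not a number")
except ValueError as why:
    saved = why              # keep a reference under a different name

# On Python 3 the name "why" no longer exists at this point
try:
    why
    still_defined = True
except NameError:
    still_defined = False

print(still_defined)
print(type(saved).__name__)
```
&lt;/pre&gt;
&lt;/blockquote&gt;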
&lt;p&gt;
&lt;b&gt;Python 3 Porting Issue : Overloaded Slicing&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
One of the objects defined by Py65 is an observable memory buffer.
The precise implementation is not so important, but it's programmed to
be a list-like object that supports both indexing and slicing while
also invoking registered observer functions on
user-specified indices (see the project at the end of the post for an example).
&lt;/p&gt;
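&lt;p&gt;
To make that description concrete, here is a minimal sketch of such a buffer.  The
method names &lt;tt&gt;subscribe_to_read()&lt;/tt&gt; and &lt;tt&gt;subscribe_to_write()&lt;/tt&gt;
match the calls used in the emulator listing later in this post.  Everything else
(the class name &lt;tt&gt;ObservableBuffer&lt;/tt&gt;, the dictionaries, the callback
conventions) is an assumption for illustration and is not Py65's actual
implementation.  It only handles integer indices:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
class ObservableBuffer:
    """Toy list-like buffer with per-address read/write observers."""

    def __init__(self, size):
        self._data = [0] * size
        self._read_observers = {}
        self._write_observers = {}

    def subscribe_to_read(self, addresses, callback):
        for address in addresses:
            self._read_observers[address] = callback

    def subscribe_to_write(self, addresses, callback):
        for address in addresses:
            self._write_observers[address] = callback

    def __len__(self):
        return len(self._data)

    def __getitem__(self, address):
        callback = self._read_observers.get(address)
        if callback is not None:
            return callback(address)       # observer supplies the value
        return self._data[address]

    def __setitem__(self, address, value):
        callback = self._write_observers.get(address)
        if callback is not None:
            callback(address, value)       # observer sees the write
        self._data[address] = value

written = []
mem = ObservableBuffer(16)
mem.subscribe_to_write([3], lambda address, value: written.append((address, value)))
mem.subscribe_to_read([5], lambda address: 0xFF)

mem[3] = 42
print(written, mem[3], mem[5])
```
&lt;/pre&gt;
&lt;/blockquote&gt;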
&lt;p&gt;
In Python 2, you could use different methods for indexing and slicing by
implementing &lt;tt&gt;__getitem__()&lt;/tt&gt; and &lt;tt&gt;__getslice__()&lt;/tt&gt; like this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
class ListLike:
     def __getitem__(self,n):
         print("getitem",n)
     def __getslice__(self,start,stop=None,step=None):
         print("getslice", start,stop,step)
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt; The only problem is that in Python 3, &lt;tt&gt;__getslice__()&lt;/tt&gt; no
longer exists as a special method (in fact, it's deprecated in Python 2
as well, but is still supported for backwards compatibility).  So, if
you try the class above in Python 3, you'll see &lt;tt&gt;__getitem__()&lt;/tt&gt; being
called for both indexing and slicing. Here is what happens: &lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;s = ListLike()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s[2]&lt;/b&gt;
getitem 2
&gt;&gt;&gt; &lt;b&gt;s[2:4]&lt;/b&gt;
getitem slice(2, 4, None)
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Unless you've programmed &lt;tt&gt;__getitem__()&lt;/tt&gt; specifically to look
for &lt;tt&gt;slice&lt;/tt&gt; objects, you will run into trouble. For example,
when trying Py65, I started getting all sorts of errors about
incorrect use of &lt;tt&gt;slice&lt;/tt&gt; objects.  However,
here's a little bit of code that solves that problem:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
class ListLike:
     def __getitem__(self,n):
         if isinstance(n,slice):
            return [self[i] for i in range(*n.indices(len(self)))]
         # Return item n
         ...
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Or, if you're a little more sneaky, you might use &lt;tt&gt;itertools&lt;/tt&gt;:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
import itertools

class ListLike:
     def __getitem__(self,n):
         if isinstance(n,slice):
            return list(itertools.islice(self,*n.indices(len(self))))
         # Return item n
         ...
&lt;/pre&gt;
&lt;/blockquote&gt;
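&lt;p&gt;
One caveat with the &lt;tt&gt;itertools&lt;/tt&gt; version is that &lt;tt&gt;islice()&lt;/tt&gt; only
walks forward, so slices with a negative step are rejected.  A small sketch showing
both cases.  The constructor and the backing list are filler added so the class can
run on its own:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import itertools

class ListLike:
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, n):
        if isinstance(n, slice):
            # islice() iterates self, which falls back to integer indexing
            return list(itertools.islice(self, *n.indices(len(self))))
        return self._items[n]

s = ListLike("abcdef")
forward = s[1:5:2]
print(forward)

try:
    s[::-1]                  # indices() yields a negative stop and step
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```
&lt;/pre&gt;
&lt;/blockquote&gt;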
&lt;p&gt;
For slices, the value passed to &lt;tt&gt;__getitem__()&lt;/tt&gt; will be a
&lt;tt&gt;slice&lt;/tt&gt; object.  You can create these yourself.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;n = slice(2,4)&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;n&lt;/b&gt;
slice(2, 4, None)
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
The &lt;tt&gt;indices(size)&lt;/tt&gt; method of a slice returns a tuple &lt;tt&gt;(start,
stop, step)&lt;/tt&gt; that you can use should you decide to iterate over
the slice using &lt;tt&gt;range()&lt;/tt&gt; or some other function. For example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;n.indices(100)&lt;/b&gt;
(2, 4, 1)
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
You can use this result as input to &lt;tt&gt;range()&lt;/tt&gt; to generate the
needed sequence of indices associated with the slice.&lt;/p&gt;
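&lt;p&gt;
Putting the pieces together, here is a self-contained sketch of the
&lt;tt&gt;range()&lt;/tt&gt; approach.  The constructor and &lt;tt&gt;__len__()&lt;/tt&gt; are
filler added so the example runs by itself.  Note how &lt;tt&gt;indices()&lt;/tt&gt; fills in
defaults, clamps out-of-range values, and handles negative steps:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
class ListLike:
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, n):
        if isinstance(n, slice):
            # Expand the slice into individual integer lookups
            return [self[i] for i in range(*n.indices(len(self)))]
        return self._items[n]

s = ListLike(range(10))
print(s[2])                               # 2
print(s[2:4])                             # [2, 3]
print(s[::-3])                            # [9, 6, 3, 0]
print(slice(None, 100).indices(10))       # (0, 10, 1)
```
&lt;/pre&gt;
&lt;/blockquote&gt;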
&lt;p&gt;
&lt;b&gt;Python 3 Porting Issue: Treating bytes as character arrays&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
If you perform any kind of binary I/O in Python 3, be aware that data
will be read as &lt;tt&gt;bytes&lt;/tt&gt; objects and that those objects do not
have the same behavior as strings.&lt;/p&gt;
&lt;p&gt;
Consider this code fragment from Py65, in particular, the parts
highlighted in &lt;font color="#ff0000"&gt;red&lt;/font&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
try:
    f = open(filename, 'rb')
&lt;font color="#ff0000"&gt;    bytes = f.read()&lt;/font&gt;
    f.close()
except (OSError, IOError) as why:
     msg = "Cannot load file: [%d] %s" % (why[0], why[1])
     self._output(msg)
     return

self._fill(start, start, &lt;font color="#ff0000"&gt;list(map(ord, bytes))&lt;/font&gt;)
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
First complaint--don't use &lt;tt&gt;bytes&lt;/tt&gt; as the name of a variable.
&lt;tt&gt;bytes&lt;/tt&gt; is now the name of a built-in type.   However, that's
not the problem here.  Instead, the problem is with the &lt;tt&gt;map()&lt;/tt&gt;
operation at the end. Here is what happens in Python 2:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;s = "Hello"&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;list(map(ord,s))&lt;/b&gt;
[72, 101, 108, 108, 111]
&gt;&gt;&gt;
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
If you try it in Python 3, you get an error:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;s = b"Hello"&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;list(map(ord,s))&lt;/b&gt;
Traceback (most recent call last):
  File "&amp;lt;stdin&gt;", line 1, in &amp;lt;module&gt;
TypeError: ord() expected string of length 1, but int found
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
What's happening here?  Well, the answer is simple--&lt;tt&gt;bytes&lt;/tt&gt;
objects in Python 3 are already treated as arrays of integers so the
extra conversion using &lt;tt&gt;ord()&lt;/tt&gt; isn't needed.  For
example:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
&gt;&gt;&gt; &lt;b&gt;s = b"Hello"&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s[0]&lt;/b&gt;
72
&gt;&gt;&gt; &lt;b&gt;s[1]&lt;/b&gt;
101
&gt;&gt;&gt; &lt;b&gt;s[2]&lt;/b&gt;
108
&gt;&gt;&gt; 
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt; In the case of the above example, you can replace
&lt;tt&gt;list(map(ord,bytes))&lt;/tt&gt; with &lt;tt&gt;list(bytes)&lt;/tt&gt; or maybe even
just &lt;tt&gt;bytes&lt;/tt&gt; as it is already considered to be an array of
integer values.&lt;/p&gt;
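&lt;p&gt;
A short sketch of the replacement as it behaves on Python 3.  The literal stands in
for data read from a file opened in &lt;tt&gt;'rb'&lt;/tt&gt; mode, and the
&lt;tt&gt;bytearray&lt;/tt&gt; spelling is one option if the same line also has to run on
Python 2:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
data = b"Hello"                      # stand-in for the result of f.read()

values = list(data)                  # iterating bytes yields integers
print(values)                        # [72, 101, 108, 108, 111]

# Slicing still returns bytes, only indexing returns an integer
print(data[0], data[0:1])

# bytearray yields integers on both Python 2 and Python 3
portable = list(bytearray(data))
print(portable == values)
```
&lt;/pre&gt;
&lt;/blockquote&gt;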
&lt;p&gt;
&lt;b&gt;Porting Summary&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
All told, I don't think I spent more than about an hour porting Py65
so that I could use it with Python 3.  As part of this work, I must
emphasize that I ported all of the supplied unit tests and also ran them
under Python 3 until all reported test failures were resolved.
Although I can't claim that it is bug-free, it was good enough to do
the project described next.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Py65 Project: Creating an Emulated Superboard II&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
In previous blog posts, I've described a couple of projects involving
my old Superboard II system--my first computer.  Here is a picture of
it.
&lt;/p&gt;
&lt;p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/images/osi_small.jpg"/&gt;
&lt;/center&gt;
&lt;/p&gt;
&lt;p&gt;
To make an emulator, you need to know details about the underlying
hardware including memory map, ROMs, and hardware devices.   For this,
I referred to the Superboard II memory map taken straight from its user
manual.  Here it is:
&lt;/p&gt;
&lt;p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/images/sb_mmap.png"/&gt;
&lt;/center&gt;
&lt;/p&gt;
&lt;p&gt;
To capture the ROM images, I wrote two simple BASIC programs to dump the
ROM data out of the cassette port.  Here is one of them:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
5 REM DUMP THE BASIC ROM TO CASSETTE
10 FOR X = 40960 TO 49151
20 WAIT 61440, 2
30 B = PEEK(X)
40 POKE 61441, B
50 NEXT
&lt;/pre&gt;
&lt;/blockquote&gt;
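&lt;p&gt;
The decimal addresses in that listing are easier to check in hex.  Here is a quick
sanity check in Python.  The range for the 2K system ROM is inferred from its size
and the &lt;tt&gt;f800&lt;/tt&gt; load address used later in this post, since the second
BASIC program isn't shown:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
# Range dumped by the BASIC program in the listing
basic_start, basic_end = 40960, 49151
basic_size = basic_end - basic_start + 1
print(hex(basic_start), hex(basic_end), basic_size // 1024)   # 0xa000 0xbfff 8

# Assumption: a 2K ROM loaded at 0xf800 fills the top of the 64K address space
rom_start = 0xf800
rom_end = rom_start + 2 * 1024 - 1
print(rom_start, rom_end)                                      # 63488 65535

# The two addresses used by WAIT and POKE
print(hex(61440), hex(61441))                                  # 0xf000 0xf001
```
&lt;/pre&gt;
&lt;/blockquote&gt;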
&lt;p&gt;
By recording the audio stream using Audacity on my Mac and decoding
the resulting WAV files using the Python scripts described in a 
&lt;a href="http://dabeaz.blogspot.com/2010/08/decoding-superboard-ii-cassette-audio.html"&gt;previous
post&lt;/a&gt;, I was able to capture both the 8K BASIC ROM and the 2K system
ROM.  I put these in files &lt;tt&gt;&lt;a
href="http://www.dabeaz.com/basic.bin"&gt;basic.bin&lt;/a&gt;&lt;/tt&gt;
and &lt;tt&gt;&lt;a href="http://www.dabeaz.com/rom.bin"&gt;rom.bin&lt;/a&gt;&lt;/tt&gt;.
&lt;/p&gt;
&lt;p&gt;
Next up, you need to understand how hardware devices such as
the video RAM, polled keyboard, and 6850 ACIA serial port work.  For
example, you need to wrap your brain around everything that is going on in this figure:&lt;/p&gt;
&lt;p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/images/sb_poll.png"/&gt;
&lt;/center&gt;
&lt;/p&gt;
&lt;p&gt;
Once you understand that, you're ready to make an emulation.  To do
it, you need to address two basic problems. First, you need to load the captured ROM
images.  That's the easy part.  Next, you need to install observer functions on the memory
addresses mapped to different hardware devices and make those
functions imitate the actual hardware.  That's the tricky bit.
&lt;/p&gt;
&lt;p&gt;Here is an example of doing just that. The most notable part of
this code is found in the &lt;tt&gt;map_hardware()&lt;/tt&gt; function that maps
functions to certain memory addresses. If you look at these functions,
you can see how they capture memory access and use that to emulate
hardware devices.  Of course, figuring out all of the subtle details
of the Superboard II hardware is left as an exercise to the reader:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
#!/usr/bin/env python3 -u

import py65.monitor
import sys
import select

# Write to a specific video address (using VT100 cursor control)
def video_output(address,value):
    row = (address - 0xd000) // 32
    column = address % 32
    sys.stdout.write(('\x1b[7m\x1b[%d;%dH' % (row,column)) + chr(value) + '\x1b[0m')
    sys.stdout.flush()

# Keyboard mapping table (for polled keyboard)
keymap = {
    b'\x00' : {254:254, 253:255, 251:255, 247:255, 239:255, 223:255, 191:255, 127:255}, 
    b'\r' : {254:254, 223:247}, 
    b'\n' : {254:254, 223:247}, 
    b' ' : {254:254, 253:239}, 
    b'/' : {254:254, 253:247}, 
    b';' : {254:254, 253:251}, 
    b':' : {254:254, 191:239}, 
    b'-' : {254:254, 191:247}, 
    b'.' : {254:254, 223:127}, 
    b',' : {254:254, 251:253}, 
    b'A' : {254:254, 253:191}, 
    b'B' : {254:254, 251:239}, 
    b'C' : {254:254, 251:191}, 
    b'D' : {254:254, 247:191}, 
    b'E' : {254:254, 239:191}, 
    b'F' : {254:254, 247:223}, 
    b'G' : {254:254, 247:239}, 
    b'H' : {254:254, 247:247}, 
    b'I' : {254:254, 239:253}, 
    b'J' : {254:254, 247:251}, 
    b'K' : {254:254, 247:253}, 
    b'L' : {254:254, 223:191}, 
    b'M' : {254:254, 251:251}, 
    b'N' : {254:254, 251:247}, 
    b'O' : {254:254, 223:223}, 
    b'P' : {254:254, 253:253}, 
    b'Q' : {254:254, 253:127}, 
    b'R' : {254:254, 239:223}, 
    b'S' : {254:254, 247:127}, 
    b'T' : {254:254, 239:239}, 
    b'U' : {254:254, 239:251}, 
    b'V' : {254:254, 251:223}, 
    b'W' : {254:254, 239:127}, 
    b'X' : {254:254, 251:127}, 
    b'Y' : {254:254, 239:247}, 
    b'Z' : {254:254, 253:223}, 
    b'1' : {254:254, 127:127}, 
    b'2' : {254:254, 127:191}, 
    b'3' : {254:254, 127:223}, 
    b'4' : {254:254, 127:239}, 
    b'5' : {254:254, 127:247}, 
    b'6' : {254:254, 127:251}, 
    b'7' : {254:254, 127:253}, 
    b'8' : {254:254, 191:127}, 
    b'9' : {254:254, 191:191}, 
    b'0' : {254:254, 191:223}, 
    b'!' : {254:252, 127:127}, 
    b'"' : {254:252, 127:191}, 
    b'#' : {254:252, 127:223}, 
    b'$' : {254:252, 127:239}, 
    b'%' : {254:252, 127:247}, 
    b'&amp;' : {254:252, 127:251}, 
    b"'" : {254:252, 127:253}, 
    b'(' : {254:252, 191:127}, 
    b')' : {254:252, 191:191}, 
    b'*' : {254:252, 191:239}, 
    b'=' : {254:252, 191:247}, 
    b'&gt;' : {254:252, 223:127}, 
    b'&amp;lt;' : {254:252, 251:253}, 
    b'?' : {254:252, 253:247}, 
    b'+' : {254:252, 253:251}, 
}

# Raw file underlying stdin
raw_stdin = sys.stdin.buffer.raw

# State about what's being polled
kb_row = 0
kb_current = keymap[b'\x00']
kb_count = 0

# Read the row values for the polled row
def keyboard_read(address):
    global kb_count, kb_current
    if kb_count &gt; 0:
        kb_count -= 1
        if kb_count &amp;lt; 5:
            # Simulate key-release
            kb_current = keymap[b'\x00']
    else:
        kb_current = keymap[b'\x00']
        if kb_row == 254:
            # Poll stdin to see any input
            r,w,e = select.select([raw_stdin],[],[],0)
            if r:
                keyboard_press(raw_stdin.read(1))

    return kb_current.get(kb_row,255)

# Set the current keyboard poll row
def keyboard_write(address, val):
    global kb_row
    kb_row = val

# Initiate a keypress
def keyboard_press(ch):
    global kb_count, kb_current
    if ch in keymap:
        kb_current = keymap[ch]
        kb_count = 30 

def map_hardware(m):
    # Video RAM at 0xd000-0xd3ff
    m.subscribe_to_write(range(0xd000,0xd400),video_output)

    # Monitor the polled keyboard port
    m.subscribe_to_read([0xdf00], keyboard_read)
    m.subscribe_to_write([0xdf00], keyboard_write)

    # Bad memory address to force end to memory check
    m.subscribe_to_read([0x8000], lambda x: 0)

def main(args=None):
    c = py65.monitor.Monitor()
    map_hardware(c._mpu.memory)
    try:
        import readline
    except ImportError:
        pass

    # Load the ROMs and boot
    c.onecmd("load rom.bin f800")
    c.onecmd("load basic.bin a000")
    c.onecmd("goto ff00")
    try:
        c.onecmd('version')
        c.cmdloop()
    except KeyboardInterrupt:
        c._output('')

if __name__ == "__main__":
    main()
&lt;/pre&gt;
&lt;/blockquote&gt;
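&lt;p&gt;
To make the address arithmetic in &lt;tt&gt;video_output()&lt;/tt&gt; and the lookup in
&lt;tt&gt;keyboard_read()&lt;/tt&gt; a little more concrete, here is a small standalone
sketch.  &lt;tt&gt;video_position()&lt;/tt&gt; and &lt;tt&gt;poll()&lt;/tt&gt; are hypothetical
helper names, and the keymap is cut down to a single entry from the table above:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
VIDEO_BASE = 0xd000
COLUMNS = 32

def video_position(address):
    # Convert a video RAM address into (row, column) on a 32-column screen
    return divmod(address - VIDEO_BASE, COLUMNS)

print(video_position(0xd000))    # (0, 0)
print(video_position(0xd065))    # (3, 5)
print(video_position(0xd3ff))    # (31, 31)

# One entry copied from the keymap in the listing
keymap = {
    b'A': {254: 254, 253: 191},
}

def poll(key, row):
    # Rows that the key does not drive read back as 255
    return keymap[key].get(row, 255)

print(poll(b'A', 253))           # 191
print(poll(b'A', 251))           # 255
```
&lt;/pre&gt;
&lt;/blockquote&gt;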

&lt;p&gt;
&lt;b&gt;Running the Emulation&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Running the emulation in a VT100 compatible terminal window, you'll
get output that looks like this.  Yep, that's my Superboard II running
up in the upper left corner of the terminal window (click on the image
to see a video):
&lt;/p&gt;
&lt;center&gt;
&lt;a href="http://www.youtube.com/watch?v=unAKUE0fUnA"&gt;&lt;img src="http://www.dabeaz.com/images/sb_emul.png"/&gt;&lt;/a&gt;
&lt;/center&gt;
&lt;p&gt;
Admittedly, it's kind of a hack, but then again, that's the whole point.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Final Words&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
I've put my modified Py65 code online at &lt;a
href="http://github.com/dabeaz/py65"&gt;http://github.com/dabeaz/py65&lt;/a&gt;.
The distribution also includes a slightly different emulation example
that allows you to telnet to an emulated Superboard.&lt;/p&gt;
&lt;p&gt;
I gave a talk about this at the January 13, 2011 meeting of &lt;a
href="http://chipy.org"&gt;Chipy&lt;/a&gt;.  Check out the &lt;a href="http://carlfk.blip.tv/file/4639616"&gt;video&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-5523699269187039520?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/5523699269187039520/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=5523699269187039520' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5523699269187039520'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5523699269187039520'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html' title='Porting Py65 (and my Superboard) to Python 3'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-36190070114665604</id><published>2010-12-16T15:04:00.000-08:00</published><updated>2010-12-16T15:04:32.363-08:00</updated><title type='text'>O'Reilly Python Cookbook: Python 3 All The Way</title><content type='html'>&lt;p&gt;
I'm pleased to announce that Brian Jones and I have just signed on to be the editors/curators of the upcoming O'Reilly Python Cookbook, 3rd Edition--to appear sometime in late 2011.  Brian has posted some &lt;a href="http://www.protocolostomy.com/2010/12/16/good-things-come-in-threes-python-cookbook-third-edition/"&gt;details&lt;/a&gt; on his blog, but let's just say that I'm really excited to be working on this project.  I think it's going to be great!&lt;/p&gt;
&lt;p&gt;
I've had both prior editions of the Cookbook in my library for some time--in fact, I wrote the section introduction for the chapter on "Extending and Embedding."   One thing that I didn't remember until now was that my biographical sketch from the preface of the past edition included the following description:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;em&gt;"David Beazley is a fairly sick man (in a good way)"&lt;/em&gt;
&lt;/blockquote&gt;

&lt;p&gt;
I'm not sure who I have to thank for that, but I can say that Brian and I hope to put together the sickest, baddest, most useful Cookbook yet.&lt;/p&gt;

&lt;h3&gt;Python 3 - All The Way&lt;/h3&gt;
&lt;p&gt;
Yep. It's true. A major feature of the new edition will be an exclusive focus on Python 3. In fact, we simply won't include coverage of anything that doesn't work with it.&lt;/p&gt;
&lt;p&gt;
Now, I know what you're thinking, this is going to result in the smallest Cookbook ever--coming in just slightly more than 25 pages.  Wrong!&lt;/p&gt;
&lt;p&gt;
There are all sorts of new and exciting things about Python 3 worth writing about. For example, did you know that quite a few past Cookbook recipes are now simply built-in features or one-line Python 3 statements?   Moreover, Python 3 has all sorts of interesting new programming idioms--especially related to I/O handling, concurrency, metaprogramming, and more.&lt;/p&gt;
&lt;p&gt;
Thus, one of our main goals is to present useful recipes that take full advantage of new idioms and which do things the "Python 3" way.  In part, this will be welcome information for anyone who has decided to make Python 3 their primary programming environment.  However, we also hope that a set of idiomatic recipes will help anyone who is thinking about porting code from Python 2.&lt;/p&gt;
&lt;p&gt;
Of course, we obviously want to include useful recipes for modules that have already made the transition.&lt;/p&gt;

&lt;p&gt;
&lt;h3&gt;We Want Your Help and Feedback&lt;/h3&gt;
&lt;/p&gt;
&lt;p&gt;
Past editions of the Cookbook have always been a community effort.  The recipes themselves are drawn from submissions to the &lt;a href="http://code.activestate.com/recipes/langs/python/"&gt;ActiveState Python Recipes&lt;/a&gt; site and are fully attributed.  In fact, the folks at ActiveState are active participants in this project.&lt;/p&gt;
&lt;p&gt;
As editors, Brian and I play a number of roles.  First and foremost, we're simply going to work to put together a great set of recipes along with tests to make sure they work as advertised.  However, we also have the job of soliciting feedback and guiding the overall project.  As part of that, we'd really like to know more about what kinds of recipes to include.  Specific programming techniques?  More coverage of certain built-in libraries?  Information on important third-party extensions?  Everything is fair game.&lt;/p&gt;
&lt;p&gt;
Throughout the project, you can contact us by sending email to 'PythonCookbook' at 'oreilly.com' or writing comments on our blog posts.
&lt;/p&gt;
&lt;h3&gt;Stay Tuned&lt;/h3&gt;

&lt;p&gt;
Throughout the project, Brian and I hope to blog about our progress. 
You can also follow &lt;a href="http://twitter.com/#!/bkjones"&gt;@bkjones&lt;/a&gt; and &lt;a href="http://twitter.com/#!/dabeaz"&gt;@dabeaz&lt;/a&gt; on Twitter for updates. &lt;/p&gt;
&lt;p&gt;
-- Dave
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-36190070114665604?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/36190070114665604/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=36190070114665604' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/36190070114665604'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/36190070114665604'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/12/oreilly-python-cookbook-python-3-all.html' title='O&apos;Reilly Python Cookbook: Python 3 All The Way'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-5765736266635743072</id><published>2010-12-04T14:46:00.001-08:00</published><updated>2010-12-04T14:46:53.953-08:00</updated><title type='text'>Python Concurrency Workshop - 2011</title><content type='html'>&lt;p&gt;Well, January in Chicago can only mean one thing--that my &lt;a href="http://www.dabeaz.com/chicago/concurrent.html"&gt;Python Concurrency and Distributed Computing Workshop&lt;/a&gt; is back!  If you've wanted to learn more about concurrency, threads, messaging, and other related topics, then this is the workshop for you.   
There also promises to be a certain amount of insanity--after all, past editions of the workshop were responsible for my whole exploration into the &lt;a href="http://www.dabeaz.com/GIL"&gt;GIL&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;
Unlike a normal Python course, the concurrency workshop is more experimental in nature--tending to focus on cutting edge topics and exploration of lesser-known areas of Python programming.  However, no topic is off-limits as discussions might dive into facets of C programming, operating systems, and other programming languages.  Needless to say, a good time will be had by all.
&lt;/p&gt;
&lt;p&gt;
The &lt;a href="http://www.dabeaz.com/chicago/concurrent.html"&gt;course page&lt;/a&gt; has detailed information on the previous workshop.  This year, we'll cover much of that material, but here are some exciting new highlights for 2011:&lt;/p&gt;

&lt;ul&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;Python 3.&lt;/b&gt;  Want to know what Python 3 is all about?  You'll find out in a big way as this year's workshop is entirely based on Python 3, preferably Python-3.2.&lt;/li&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;Messaging.&lt;/b&gt; There will be significantly more material on messaging architectures.  As part of that, we'll look in some depth at 0MQ, distributed key-value stores, actors, and more.&lt;/li&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;Mondo Threads.&lt;/b&gt; A completely revised thread-programming section that will present Python thread programming as you've never seen it before. Prepare to be amazed.&lt;/li&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;Reliability.&lt;/b&gt; There's a great deal of added information on software design and debugging techniques for reliable concurrent programming.&lt;/li&gt;
&lt;/p&gt;
&lt;/ul&gt;

&lt;p&gt;
As usual, the course is strictly limited to 6 students and is being held in Chicago's Andersonville neighborhood.  Worried about the cold?  Well, in this course, there are far more scary things to be worried about than that.  Besides, the classroom is completely surrounded by coffee shops and places to get strong Belgian ales.  The cold is going to be the least of your problems.&lt;/p&gt;
&lt;p&gt;
Hopefully I'll see you in Chicago.  It's going to be great!
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-5765736266635743072?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/5765736266635743072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=5765736266635743072' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5765736266635743072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5765736266635743072'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/12/python-concurrency-workshop-2011.html' title='Python Concurrency Workshop - 2011'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-3762917410109995246</id><published>2010-09-25T18:25:00.000-07:00</published><updated>2010-09-25T18:25:55.576-07:00</updated><title type='text'>Putting all of my Past PyCon/IPC Presentations on Slideshare</title><content type='html'>&lt;p&gt;For the past few years, I've been making my PyCon tutorials and presentations available online.  For example, &lt;a href="http://www.dabeaz.com/generators"&gt;Generator Tricks for Systems Programmers&lt;/a&gt; from PyCon'2008, &lt;a href="http://www.dabeaz.com/coroutines"&gt;A Curious Course on Coroutines and Concurrency&lt;/a&gt; from PyCon'2009, and &lt;a href="http://www.dabeaz.com/python3io"&gt;Mastering Python 3 I/O&lt;/a&gt; from PyCon'2010.  
Although there have been many downloads, I've occasionally received requests to post material in a format more suitable for sharing online.&lt;/p&gt;
&lt;p&gt;
Thus, I'm pleased to announce that I've set up a &lt;a href="http://www.slideshare.net/dabeaz"&gt;Slideshare channel&lt;/a&gt; that has the slides from almost all my past presentations and tutorials from PyCon, the International Python Conference, USENIX, and a few other conferences, going all the way back to 1996.  All told, there are more than 1700 slides on Python programming, Swig, PLY, and other topics.&lt;/p&gt;
&lt;p&gt;
I hope someone finds this material useful. Enjoy!   I'm still going through my presentation archive and will probably add even more to Slideshare as I find time.&lt;/p&gt;
&lt;p&gt;
-- Dave
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-3762917410109995246?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/3762917410109995246/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=3762917410109995246' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3762917410109995246'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3762917410109995246'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/09/putting-all-of-my-past-pyconipc.html' title='Putting all of my Past PyCon/IPC Presentations on Slideshare'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-5039667672014613981</id><published>2010-09-15T09:05:00.000-07:00</published><updated>2010-09-23T07:52:56.650-07:00</updated><title type='text'>A few good reasons to take one of my Fall 2010 Python courses</title><content type='html'>&lt;p&gt;
This fall, I am offering three intense Python courses in Chicago:&lt;/p&gt;
&lt;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.dabeaz.com/chicago/practical.html"&gt;Practical Python Programming&lt;/a&gt;, October 25-28, 2010.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.dabeaz.com/chicago/mastery.html"&gt;Advanced Python Mastery&lt;/a&gt;,
November 8-11, 2010. (&lt;b&gt;Only two slots left!&lt;/b&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.dabeaz.com/chicago/django.html"&gt;Practical Python Programming plus Django&lt;/a&gt;, November 15-19, 2010.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
Here are some reasons you might want to attend:&lt;/p&gt;
&lt;p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;b&gt;Courses are held in a certifiably "evil" Python programming lair.&lt;/b&gt;  Aside from some occasional C and assembly hacking, this is where I do all of my Python programming.  Want to take a class to go get "certified" in some kind of "enterprise" software or Microsoft Office?  Bah. Better look elsewhere. Python is my only focus here.
&lt;p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/chicago/class_small.jpg"&gt;
&lt;/center&gt;
&lt;/p&gt;
&lt;/li&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;Be like a rocket scientist.&lt;/b&gt;  These are the same Python classes I regularly teach on-site to scientists, engineers, and yes, rocket scientists--who think Python is pretty useful by the way.  However, do you have to be an expert to attend?  Nope.  These courses are for anyone who wants to learn more--including programmers new to Python. &lt;/li&gt;&lt;/p&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;You'll learn some new tricks for making your code better.&lt;/b&gt; Even if you've been programming in Python for a while, you will learn some new techniques.  This is because I spend most of my free time exploring different ways to effectively use Python's various features--often in preparation for future writing projects, PyCon tutorials, or for use in my own coding projects.  And after you've mastered everything there is to know about Python, you can move on to mastering the &lt;a href="http://www.dabeaz.com/chicago/curta.html"&gt;Curta&lt;/a&gt;.&lt;/li&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;You'll be well fed.&lt;/b&gt; These courses aren't held in some sterile hotel or corporate training center.   The lair is surrounded by great restaurants, cafes, and bakeries.  For instance, you probably don't want to know how many calories are in this picture (from the bakery located immediately below the lair):
&lt;p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/chicago/pastry.jpg"&gt;
&lt;/center&gt;
&lt;/p&gt;
&lt;/li&gt;
&lt;p&gt;
&lt;li&gt;&lt;b&gt;All Python, All Day&lt;/b&gt;.  You're going to spend several days doing nothing but hacking and talking about Python with people who like Python as much as you do.  What's not to like about that?&lt;/li&gt;
&lt;/p&gt;
&lt;/ol&gt;
&lt;p&gt;
That is all for now.  Hopefully you'll join me for a future course!&lt;/p&gt;
&lt;p&gt;
--Dave
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-5039667672014613981?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/5039667672014613981/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=5039667672014613981' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5039667672014613981'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5039667672014613981'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/09/few-good-reasons-to-take-one-of-my-fall.html' title='A few good reasons to take one of my Fall 2010 Python courses'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-6649301725574666151</id><published>2010-09-04T09:26:00.000-07:00</published><updated>2010-09-04T10:55:27.175-07:00</updated><title type='text'>Using telnet to access my Superboard II (via Python and cassette ports)</title><content type='html'>&lt;P&gt;
Welcome to part 3 of my "Superboard II" trilogy.   For the first two parts, see these posts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;Using Python to Encode Cassette Recordings for my Superboard II&lt;/a&gt;
&lt;li&gt;&lt;a href="http://dabeaz.blogspot.com/2010/08/decoding-superboard-ii-cassette-audio.html"&gt;Decoding Superboard II Cassette Audio Using Python 3, Two Generators, and a Deque&lt;/a&gt;
&lt;/ul&gt;
&lt;p&gt;
&lt;center&gt;
&lt;img src="http://www.dabeaz.com/images/osi_small.jpg"&gt;&lt;br&gt;
&lt;em&gt;Dave's Superboard II&lt;/em&gt;
&lt;/center&gt;
&lt;p&gt;
First, a brief digression.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Why Bother?&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;Aside from the obvious nostalgia (the Superboard II being my first computer), why bother messing around with something like this?   After all, we're talking about a long-since-dead 1970s technology. Any sort of practical application certainly seems far-fetched.&lt;/p&gt;
&lt;p&gt;
The simple answer is that doing this sort of thing is fun--fun for the same reasons I got into programming in the first place.   When my family first got the Superboard, it was this magical device--a device where you could command it to do anything you wanted.  You could write programs to make it play games.  Or, more importantly, you could command it to do your math homework.  Not only that, everything about the machine was open.  It came with electrical schematics and memory maps.  You could directly input hex 6502 opcodes. There were no rules at all.  Although writing a game or doing your homework might be an end goal, the real fun was the process of figuring out how to do those things (to be honest, I think I learned much more about math by writing programs to do my math homework than I ever did by actually doing the homework, but that's a different story). &lt;/p&gt;
&lt;p&gt;
Flash forward about 30 years and I'm now doing most of my coding in Python.  However, Python (like most other dynamic languages) embodies everything that was great about my old Superboard II.   For instance, the instant gratification of using the interactive interpreter to try things out.  Or, the complete freedom to do almost anything you want in a program (first-class functions, duck-typing, metaprogramming, etc.).  Or, the ability to dig deep into the bowels of your system (ctypes, Swig, etc.).  Frankly, it's all great fun.  It's what programming should be about.  Clearly the designers of more "serious" languages (especially those designed for the "enterprise") never had anything like a Superboard.&lt;/p&gt;  
&lt;P&gt;
Anyways, getting back to my motivations, I don't really have any urgent need to access my Superboard from my Mac.  I'm mostly just interested in the problem of &lt;em&gt;how&lt;/em&gt; I would do it.  The fun is all in the process of figuring it out.&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Back to the Superboard Cassette Ports&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;
Getting back to topic, you will recall that in my prior posts, I was interested in the problem of &lt;a href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;encoding&lt;/a&gt; and &lt;a href="http://dabeaz.blogspot.com/2010/08/decoding-superboard-ii-cassette-audio.html"&gt;decoding&lt;/a&gt; the audio stream transmitted from the cassette input and output ports on my Superboard II.  In part, this was due to the fact that those are the only available I/O ports--forget about USB, Firewire, Ethernet, RS-232, or a parallel port.  Nope, cassette audio is all there is.&lt;/p&gt;
&lt;p&gt;
In those two parts, I wrote some Python scripts that &lt;a href="http://www.dabeaz.com/kcs_encode.py"&gt;encode&lt;/a&gt; and &lt;a href="http://www.dabeaz.com/kcs_decode.py"&gt;decode&lt;/a&gt; the cassette audio data to and from WAV files.  Although that is somewhat interesting, working with WAV files was never my real goal.  Instead, what I &lt;em&gt;really&lt;/em&gt; wanted to do was to set up a real-time bidirectional data communication channel between my Mac and the Superboard II.  Simply stated, I wanted to create the equivalent of a network connection using the cassette ports.  Would it even be possible? Who knows?&lt;/p&gt;
&lt;p&gt;
So far as I know, the cassette ports on the Superboard were never intended for this purpose.  Although there are commands to save a program and to load a program, driving both the cassette input and output simultaneously isn't something you would do.  It didn't even make any sense.  There certainly weren't any Superboard commands to do that.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Building a Soft-Modem Using PyAudio&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
To perform real-time communications, the Superboard needs to be connected to both the audio line-out and line-in ports of my Mac.  Using those connections, I would then need to write a program that operates as a soft-modem.  This program would simultaneously read and transmit audio data by encoding or decoding it as appropriate (see my past posts).&lt;/p&gt;
&lt;p&gt;
I've never written a program for manipulating audio on my Mac, but after some searching, I found the &lt;a href="http://people.csail.mit.edu/hubert/pyaudio/"&gt;PyAudio&lt;/a&gt; extension that seemed to provide the exact set of features I needed.
&lt;/p&gt;
&lt;p&gt;
To create a soft-modem, I defined reader and writer threads as follows:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
# Note : This is Python 2 due to the PyAudio dependency.
from __future__ import print_function   # print() as a function, as in Python 3
import pyaudio
import kcs_decode      # See prior posts
import kcs_encode      # See prior posts
from Queue import Queue

FORMAT    = pyaudio.paInt8
CHANNELS  = 1
RATE      = 9600
CHUNKSIZE = 1024

# Buffered data received and waiting to transmit
audio_write_buffer = Queue()
audio_read_buffer = Queue()

# Generate a sequence representing sign change bits on the real-time
# audio stream (needed as input for decoding)
def generate_sign_change_bits(stream):
    previous = 0
    while True:
        frames = stream.read(CHUNKSIZE)
        if not frames:
            break
        msbytes = bytearray(frames)
        # Emit a stream of sign-change bits
        for byte in msbytes:
            signbit = byte &amp; 0x80
            yield 1 if (signbit ^ previous) else 0
            previous = signbit

# Thread that reads and decodes KCS audio input
def audio_reader():
    print("Reader starting")
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                    channels = CHANNELS,
                    rate = RATE,
                    input=True,
                    frames_per_buffer=CHUNKSIZE)

    bits = generate_sign_change_bits(stream)
    byte_stream = kcs_decode.generate_bytes(bits, RATE)
    for b in byte_stream:
        audio_read_buffer.put(chr(b))

# Thread that writes KCS audio data
def audio_writer():
    print("Writer starting")
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                    channels = CHANNELS,
                    rate = RATE,
                    output=True)
    while True:
        if not audio_write_buffer.empty():
            msg = kcs_encode.kcs_encode_byte(ord(audio_write_buffer.get()))
            stream.write(buffer(msg))
        else:
            stream.write(buffer(kcs_encode.one_pulse))

if __name__ == '__main__':
    import threading

    # Launch the reader/writer threads
    reader_thr = threading.Thread(target=audio_reader)
    reader_thr.daemon = True
    reader_thr.name = "Reader"
    reader_thr.start()

    writer_thr = threading.Thread(target=audio_writer)
    writer_thr.daemon = True
    writer_thr.name = "Writer"
    writer_thr.start()    
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
The operation of this code is relatively straightforward.  There is a reader thread that constantly samples audio on the line-in port and decodes it into bytes which are stored in a
queue for later consumption.   There is a writer thread that encodes and transmits outgoing data (if any).  If there is no data, the writer transmits a constant carrier tone on the line out (a 2400 Hz wave).&lt;/p&gt;
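&lt;p&gt;
To make the carrier tone a little more concrete, here is a minimal sketch of what the raw samples might look like at the 9600 Hz sample rate used above.  It is written for Python 3 and is only an illustration: &lt;tt&gt;square_wave()&lt;/tt&gt;, &lt;tt&gt;sign_changes()&lt;/tt&gt;, and the &lt;tt&gt;HIGH&lt;/tt&gt;/&lt;tt&gt;LOW&lt;/tt&gt; sample values are hypothetical names chosen for this example, not the contents of the real &lt;tt&gt;kcs_encode&lt;/tt&gt; module.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
RATE = 9600               # samples per second, as in the soft-modem above
HIGH, LOW = 0x7F, 0x80    # signed 8-bit extremes, written as raw byte values

def square_wave(freq, cycles, rate=RATE):
    """Return `cycles` cycles of a square wave at `freq` Hz as 8-bit samples."""
    half = rate // (2 * freq)          # samples per half cycle
    return bytes(([HIGH] * half + [LOW] * half) * cycles)

ONE_BIT = square_wave(2400, 8)     # a 1 bit, which is also the idle carrier
ZERO_BIT = square_wave(1200, 4)    # a 0 bit

def sign_changes(samples):
    """Count positions where the sign bit differs from the previous sample."""
    signs = [s // 128 for s in samples]     # most significant bit of each byte
    return sum(a != b for a, b in zip(signs, signs[1:]))

print(len(ONE_BIT), sign_changes(ONE_BIT))
print(len(ZERO_BIT), sign_changes(ZERO_BIT))
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
At this sample rate both kinds of bit occupy 32 samples.  Taken in isolation, the 1 bit contains 15 sign changes and the 0 bit contains 7, because the final crossing of each falls on the boundary with whatever is sent next.  That gap is what lets the decoder get away with telling "about 16" from "about 8".&lt;/p&gt;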
&lt;p&gt;
These threads operate entirely in the background.  To read data from the Superboard, you simply check the contents of the audio read buffer.  To send data to the Superboard, you simply append outgoing data to the audio write buffer.&lt;/p&gt;
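&lt;p&gt;
As a rough sketch of that interaction, the two queues could be wrapped in a pair of small helpers.  The names &lt;tt&gt;send_text()&lt;/tt&gt; and &lt;tt&gt;read_available()&lt;/tt&gt; are hypothetical, the queues below are stand-ins for the ones defined earlier, and the example uses the Python 3 &lt;tt&gt;queue&lt;/tt&gt; module rather than Python 2's &lt;tt&gt;Queue&lt;/tt&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
from queue import Queue, Empty    # Python 3 spelling of the Queue module

audio_write_buffer = Queue()      # stand-ins for the queues defined above
audio_read_buffer = Queue()

def send_text(text):
    """Queue text, one character at a time, for the writer thread."""
    for ch in text:
        audio_write_buffer.put(ch)

def read_available():
    """Return whatever the reader thread has decoded so far, without blocking."""
    chars = []
    while True:
        try:
            chars.append(audio_read_buffer.get_nowait())
        except Empty:
            return "".join(chars)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
With the reader and writer threads running, something like &lt;tt&gt;send_text('PRINT "HELLO WORLD"\r')&lt;/tt&gt; followed a little later by &lt;tt&gt;read_available()&lt;/tt&gt; would be the whole conversation.  Ending the line with a carriage return is an assumption here, based on how the server code treats that character.&lt;/p&gt;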
&lt;p&gt;
&lt;b&gt;Creating a Network Server&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
To tie all of this together, you can now write a network server that connects the real-time audio streams to a network socket.   This can be done by defining a third thread like this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
import socket
import time

def server(addr):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR,1)
    s.bind(addr)
    s.listen(1)
    print("Server running on", addr)
    # Wait for the client to connect
    while True:
        c,a = s.accept()
        print("Got connection",a)
        c.setblocking(False)
        try:
            # Enter a loop where we try to transmit data back and forth between the client and the audio stream
            while True:
                # Check for incoming data
                try:
                    indata = c.recv(8192)
                    if not indata:
                        raise EOFError()
                    indata = indata.replace(b'\r',b'\r' + b'\x00'*20)
                    for b in indata:
                        audio_write_buffer.put(b)
                except socket.error:
                    pass
                # Check if there is any outgoing data to transmit (try to send it all)
                if not audio_read_buffer.empty():
                    while not audio_read_buffer.empty():
                        b = audio_read_buffer.get()
                        c.send(b)
                else:
                    # Sleep briefly if nothing is going on.  This is fine, the max
                    # data transfer rate of the Superboard is 300 baud
                    time.sleep(0.01)
        except EOFError:
            print("Connection closed")
            c.close()

if __name__ == '__main__':
    import threading

    # Launch the reader/writer threads
    # ... see above code ...

    # Launch the network server
    server_thr = threading.Thread(target=server,args=(("",15000),))
    server_thr.daemon = True
    server_thr.name = "Server"
    server_thr.start()

    # Have the main thread do something (so Ctrl-C works)
    while True:
        time.sleep(1)
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
This server operates as a simple polling loop over a client socket and the incoming audio data stream.   Data received on the socket is placed in the write buffer used by the audio writer thread.  Data received by the audio reader is sent back to the client.  This code could probably be cleaned up through the use of the &lt;tt&gt;select()&lt;/tt&gt; call, but I frankly don't know if &lt;tt&gt;select()&lt;/tt&gt; works with PyAudio and didn't investigate it.  Given that the maximum data rate of the Superboard is 300 baud, a "good enough" solution seemed to be just that.&lt;/p&gt;
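&lt;p&gt;
For the curious, here is one way the socket half of that loop might look with &lt;tt&gt;select()&lt;/tt&gt;.  This is an untested-on-hardware sketch for Python 3, not the code used in the video: &lt;tt&gt;pump_once()&lt;/tt&gt; is a hypothetical helper, the queues hold one-byte &lt;tt&gt;bytes&lt;/tt&gt; objects instead of 1-character strings, and the carriage-return padding is left out.  Note that &lt;tt&gt;select()&lt;/tt&gt; only watches the socket here, so the audio queue is still polled once per timeout.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
import select
import socket
from queue import Queue, Empty

def pump_once(conn, to_audio, from_audio, timeout=0.01):
    """Move pending data in both directions between a socket and two queues."""
    # Wait up to `timeout` seconds for the client socket to become readable
    readable, _, _ = select.select([conn], [], [], timeout)
    if readable:
        data = conn.recv(8192)
        if not data:
            raise EOFError("client closed the connection")
        for i in range(len(data)):
            to_audio.put(data[i:i + 1])     # queue one-byte bytes objects

    # Forward everything the audio reader has decoded so far
    pending = []
    while True:
        try:
            pending.append(from_audio.get_nowait())
        except Empty:
            break
    if pending:
        conn.sendall(b"".join(pending))
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Under those assumptions, the inner loop of the server could shrink to repeated calls of &lt;tt&gt;pump_once(c, audio_write_buffer, audio_read_buffer)&lt;/tt&gt; with the client socket left in blocking mode, since &lt;tt&gt;select()&lt;/tt&gt; takes over the job of deciding when there is something to read.&lt;/p&gt;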
&lt;p&gt;
&lt;b&gt;Putting it to the Test&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Now, the ultimate test--does it actually work?   To try it out, you first have to launch the above audio server process.  For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
bash % &lt;b&gt;python audioserv.py&lt;/b&gt;
Reader starting
Writer starting
Server running on ('', 15000)
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;
Next, make sure the Superboard II is plugged into the line-in and line-out ports on my Mac.  On the Superboard, I had to manually type two &lt;tt&gt;POKE&lt;/tt&gt; statements to make it send all output to the cassette output and to read all keyboard input from the cassette input.&lt;/p&gt;
&lt;p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
POKE 517, 128
POKE 515, 128
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Finally, use the &lt;tt&gt;telnet&lt;/tt&gt; command to connect to the audio server.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
bash $ &lt;b&gt;telnet localhost 15000&lt;/b&gt;
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
^]
telnet&gt; &lt;b&gt;mode character&lt;/b&gt;
&lt;b&gt;LIST&lt;/b&gt;

OK
&lt;b&gt;PRINT "HELLO WORLD"&lt;/b&gt;
HELLO WORLD

OK
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
Excellent! It seems to be working.  It's a little hard to appreciate with just a screenshot.  Therefore, you can check out the following &lt;a href="http://www.youtube.com/watch?v=FMGG33IHg_4"&gt;movie&lt;/a&gt; that shows it all in action:
&lt;/p&gt;
&lt;center&gt;
&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/FMGG33IHg_4?hl=en&amp;fs=1"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/FMGG33IHg_4?hl=en&amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;
&lt;/center&gt;
&lt;p&gt;
Again, it's important to emphasize that there is no connection between the two machines other than a pair of audio cables.&lt;/p&gt;  
&lt;p&gt;
&lt;b&gt;That is all (for now)&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Well, there you have it--using Python to implement a soft-modem that encodes/decodes cassette audio data in real-time, allowing me to remotely access my old Superboard using telnet.  At last, I can write old Microsoft Basic 1.0 programs from the comfort of my Aeron chair and a 23-inch LCD display--and there's nothing old-school about that!&lt;/p&gt;
&lt;p&gt;
Hope you enjoyed this series of posts.  Sadly, it's now time to get back to some "real work."  Of course, if you'd like to see all of this in person, you should sign up for one of my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python courses&lt;/a&gt;.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-6649301725574666151?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/6649301725574666151/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=6649301725574666151' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/6649301725574666151'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/6649301725574666151'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/09/using-telnet-to-access-my-superboard-ii.html' title='Using telnet to access my Superboard II (via Python and cassette ports)'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-2212420545471036146</id><published>2010-08-29T20:39:00.000-07:00</published><updated>2010-08-30T07:33:27.945-07:00</updated><title type='text'>Decoding Superboard II Cassette Audio Using Python 3, Two Generators, and a Deque</title><content type='html'>&lt;p&gt;Welcome to the second installment of using Python to encode/decode cassette audio data for use with my resurrected Superboard II system. Last time, I talked about the problem of &lt;a
 href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;encoding text files into WAV audio files&lt;/a&gt; for uploading via the Superboard cassette input.  In this post, I explore the opposite problem--namely using Python to decode WAV audio files recorded from the cassette output port back into the transmitted byte stream--in essence, writing a Python script that performs the same function as a modem.&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/osi_back.jpg"&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;The cassette ports of my Superboard II&lt;/em&gt;&lt;br /&gt;
&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Although decoding audio data from the cassette output sounds like it might be a tricky exercise involving sophisticated signal processing (e.g., FFTs), it turns out that you can easily solve this problem using nothing more than a few built-in objects (bytearrays, deques, etc.) and a couple of simple generator functions.  In fact, it's a neat exercise involving some of the lesser known, but quite useful data processing features of Python. Plus, it seems like a good excuse to further bang on the new &lt;a href="http://www.dabeaz.com/python3io"&gt;Python 3 I/O system&lt;/a&gt;. So, let's get started.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Audio Format&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;In my &lt;a href="http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html"&gt;earlier post&lt;/a&gt;, I described how the format used for cassette recordings is the &lt;a href="http://en.wikipedia.org/wiki/Kansas_City_standard"&gt;Kansas City Standard&lt;/a&gt; (KCS).   The encoding is really simple--8 cycles at 2400 Hz represent a 1-bit and 4 cycles at 1200 Hz represent a 0-bit. Individual bytes are encoded with 1 start bit (0) and two stop bits (1s). Here is a plot that shows some waveforms from a fragment of &lt;a href="http://www.dabeaz.com/images/osi_sample.wav"&gt;recorded audio&lt;/a&gt;.&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/osi_wave_small.png"&gt;&lt;br /&gt;
&lt;/center&gt;&lt;br /&gt;
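&lt;p&gt;The framing described above can be sketched in a few lines of Python.  This is only an illustration with hypothetical function names (&lt;tt&gt;kcs_frame()&lt;/tt&gt; and &lt;tt&gt;kcs_cycles()&lt;/tt&gt;), and it assumes the data bits are sent least significant bit first, which matches the order in which the decoder later in this post applies its bitmasks.&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
```python
def kcs_frame(byteval):
    """Return the 11 bits sent for one byte: start bit, 8 data bits, 2 stop bits."""
    data = [(byteval // (2 ** i)) % 2 for i in range(8)]   # least significant bit first
    return [0] + data + [1, 1]

def kcs_cycles(bits):
    """Return a (frequency in Hz, cycle count) pair for each bit in the frame."""
    return [(2400, 8) if bit else (1200, 4) for bit in bits]

frame = kcs_frame(ord("A"))
print(frame)

# Every bit lasts the same time: 8/2400 s equals 4/1200 s, which is 1/300 s
seconds = sum(cycles / freq for freq, cycles in kcs_cycles(frame))
print(seconds)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;For the letter "A" (65), the frame comes out as a 0 start bit, the data bits 1,0,0,0,0,0,1,0, and two 1 stop bits.  Because a 0 bit and a 1 bit take the same amount of time, a whole frame always lasts 11/300 of a second no matter what byte is being sent.&lt;/p&gt;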
&lt;p&gt;It is important to stress that this encoding is intentionally simple--designed to operate on systems of its era (1970s) and to be resistant to all sorts of problems associated with cassette tapes, such as noise, low fidelity, and variations in tape playback speed.  Needless to say, it's not especially fast.  Encoding a single byte of data requires 11 bits or 88 cycles of a 2400 Hz wave.  If you do the math, that works out to about 27 bytes per second or 300 baud.&lt;/p&gt;&lt;p&gt;&lt;b&gt;A Decoding Strategy (Big Picture)&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;KCS decoding is almost entirely based on counting cycles of two different wave frequencies. That is, to decode the data we simply sample the audio data and count the number of zero-crossings. At a high level, decoding a single bit works as follows:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Read a sample of N audio frames where N represents the number of frames required to represent an entire bit (8 cycles at 2400 Hz).&lt;/li&gt;
&lt;li&gt;Count the number of zero crossings found in the sample.&lt;/li&gt;
&lt;li&gt;If the number of crossings is near 16, then it represents a 1.&lt;/li&gt;
&lt;li&gt;If the number of crossings is near 8, then it represents a 0.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;From bits, it's relatively simple to make the transition to bytes. You simply have to recognize the start bit and sample the next 8 bits as data bits to form a byte.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Deconstructing a WAV File to Sign Bits&lt;/b&gt; &lt;/p&gt;&lt;p&gt;Python has a module &lt;a href="http://docs.python.org/library/wave.html"&gt;&lt;tt&gt;wave&lt;/tt&gt;&lt;/a&gt; that can be used to read WAV files. Here is an example of opening a WAV file and obtaining some useful metadata about the recorded audio.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;import wave&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;wf = wave.open("osi_sample.wav")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;wf.getnchannels()&lt;/b&gt;
2
&gt;&gt;&gt; &lt;b&gt;wf.getsampwidth()&lt;/b&gt;
2
&gt;&gt;&gt; &lt;b&gt;wf.getframerate()&lt;/b&gt;
44100
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;In the above example, the WAV file is a 44100 Hz stereo recording using 16-bit (2 byte) samples.&lt;/p&gt;&lt;p&gt;For our decoding, we are only interested in counting the number of zero-crossings in the audio data.  For a 16-bit WAV file, each sample is stored as a signed little-endian integer, so the most significant bit of a sample is its sign bit: it is 0 for a "positive" (or zero) wave sample and 1 for a "negative" wave sample.  Conveniently, this determination can be made by simply stripping all sample data away except for the most significant bit.&lt;/p&gt;&lt;p&gt;Here is a generator function that takes a sequence of WAV audio data and reduces it to a sequence of sign bits.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# Generate a sequence representing sign bits
def generate_wav_sign_bits(wavefile):
    samplewidth = wavefile.getsampwidth()
    nchannels = wavefile.getnchannels()
    while True:
        frames = wavefile.readframes(8192)
        if not frames:
            break

        # Extract most significant bytes from left-most audio channel
        msbytes = bytearray(frames[samplewidth-1::samplewidth*nchannels])

        # Emit a stream of sign bits
        for byte in msbytes:
            yield 1 if (byte &amp; 0x80) else 0
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;This generator works by reading a chunk of raw audio frames and using an extended slice &lt;tt&gt;frames[samplewidth-1::samplewidth*nchannels]&lt;/tt&gt; to extract the most significant byte from each sample of the left-most audio channel. The result is placed into a &lt;tt&gt;bytearray&lt;/tt&gt; object.  A &lt;tt&gt;bytearray&lt;/tt&gt; stores a sequence of bytes (like a string), but has the nice property that the stored data is presented as integers instead of 1-character strings. This makes it easy to perform numeric calculations on the data. The &lt;tt&gt;yield 1 if (byte &amp;amp; 0x80) else 0&lt;/tt&gt; simply yields the most significant bit of each byte.&lt;/p&gt;&lt;p&gt;The resulting output from this generator is simply a sequence of sign bits.  For example, the output will look similar to this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;import wave&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;wf = wave.open("sample.wav")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;for bit in generate_wav_sign_bits(wf):&lt;/b&gt;
...     &lt;b&gt;print(bit,end="")&lt;/b&gt;
...
11111111000000000111111111000000000011111111100000000011111111110000000001111111
11000000000011111111100000000011111111110000000001111111110000000000111111111000
00000011111111110000000001111111110000000000111111111000000000111111111100000000
01111111110000000000111111111000000000111111111100000000011111111100000000001111
...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;b&gt;From Sign Bits to Sign Changes&lt;/b&gt; &lt;/p&gt;&lt;p&gt;Although a sequence of wave sign bits is interesting, it's not really that useful. Instead, we're really more interested in zero-crossings or samples where the sign changes. Getting this information is actually pretty easy--simply compute the exclusive-or (XOR) of successive sign bits.  If you do this, you will always get 0 when the sign stays the same or a value 0x80 when the sign flips.  Here is a modified version of our generator function.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# Generate a sequence representing changes in sign
def generate_wav_sign_change_bits(wavefile):
    samplewidth = wavefile.getsampwidth()
    nchannels = wavefile.getnchannels()
    previous = 0
    while True:
        frames = wavefile.readframes(8192)
        if not frames:
            break

        # Extract most significant bytes from left-most audio channel
        msbytes = bytearray(frames[samplewidth-1::samplewidth*nchannels])

        # Emit a stream of sign-change bits
        for byte in msbytes:
            signbit = byte &amp; 0x80
            yield 1 if (signbit ^ previous) else 0
            previous = signbit
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;This slightly modified generator now produces a sequence of data with sign change pulses in it similar to this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;import wave&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;wf = wave.open("sample.wav")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;for bit in generate_wav_sign_change_bits(wf):&lt;/b&gt;
...     &lt;b&gt;print(bit,end="")&lt;/b&gt;
...
00000000100000000100000000100000000010000000010000000010000000001000000001000000
00100000000010000000010000000010000000001000000001000000001000000000100000000100
00000010000000001000000001000000001000000000100000000100000000100000000010000000
01000000001000000000100000000100000000100000000010000000010000000010000000001000
...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;b&gt;Bit Sampling&lt;/b&gt; &lt;/p&gt;&lt;p&gt;At this point, the WAV file has been deconstructed into a sequence of sign changes.  Now, all we have to do is sample the data and count the number of sign changes.   To do this, use a &lt;tt&gt;deque&lt;/tt&gt; and some clever iterator tricks. Here is some code:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;from itertools import islice
from collections import deque

# Base frequency (representing a 1)
BASE_FREQ = 2400

# Generate a sequence of data bytes by sampling the stream of sign change bits
def generate_bytes(bitstream,framerate):
    bitmasks = [0x1,0x2,0x4,0x8,0x10,0x20,0x40,0x80]

    # Compute the number of audio frames used to encode a single data bit
    frames_per_bit = int(round(float(framerate)*8/BASE_FREQ))

    # Queue of sampled sign bits
    sample = deque(maxlen=frames_per_bit)     

    # Fill the sample buffer with an initial set of data
    sample.extend(islice(bitstream,frames_per_bit-1))
    sign_changes = sum(sample)

    # Look for the start bit
    for val in bitstream:
        if val:
            sign_changes += 1
        if sample.popleft():
            sign_changes -= 1
        sample.append(val)

        # If a start bit detected, sample the next 8 data bits
        if sign_changes &lt;= 9:
            byteval = 0
            for mask in bitmasks:
                if sum(islice(bitstream,frames_per_bit)) &gt;= 12:
                    byteval |= mask
            yield byteval
            # Skip the final two stop bits and refill the sample buffer 
            sample.extend(islice(bitstream,2*frames_per_bit,3*frames_per_bit-1))
            sign_changes = sum(sample)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;This code might require some study, but the concept is simple. A sample &lt;tt&gt;deque&lt;/tt&gt; (the &lt;tt&gt;sample&lt;/tt&gt; variable) is created, the size of which corresponds to the number of audio frames needed to represent a single data bit.  It might be a little known fact, but if you create a &lt;tt&gt;deque&lt;/tt&gt; with a &lt;tt&gt;maxlen&lt;/tt&gt; setting, it turns into a kind of shift register.   That is, new items added at the end will automatically cause old items to fall off the front if the length is exceeded. It is also very fast.&lt;/p&gt;&lt;p&gt;Getting back to our algorithm, audio data is pushed into this deque and the number of sign changes updated. If no data is being transmitted, the number of sign changes in the sample will hover around 16.  However, if a start-bit is encountered, the number of sign changes in the sample will drop to around 8.  In our code, this is detected by checking for 9 or fewer sign changes in the sample.  Keep in mind that we don't really know when the start bit will appear--thus, the code proceeds frame-by-frame until the number of sign changes drops to a sufficiently low value. Once the start bit is detected, data bits are quickly sampled, one after the other, to form a complete byte. After the data bits are sampled, the two stop bits are skipped and the sample buffer refilled with the next potential start bit. &lt;/p&gt;&lt;p&gt;&lt;b&gt;Does it Work?&lt;/b&gt; &lt;/p&gt;&lt;p&gt;Hell yes it works.   Here is a short test script that ties it all together: &lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;if __name__ == '__main__':
    import wave
    import sys
    if len(sys.argv) != 2:
        print("Usage: %s infile" % sys.argv[0],file=sys.stderr)
        raise SystemExit(1)

    wf = wave.open(sys.argv[1])
    sign_changes = generate_wav_sign_change_bits(wf)
    byte_stream  = generate_bytes(sign_changes, wf.getframerate())

    # Output the byte stream
    outf = sys.stdout.buffer.raw
    while True:
        buffer = bytes(islice(byte_stream,80))
        if not buffer:
            break
        outf.write(buffer)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If we run this program on the &lt;tt&gt;&lt;a href="http://www.dabeaz.com/images/osi_sample.wav"&gt;osi_sample.wav&lt;/a&gt;&lt;/tt&gt; file, we get the following output (which is exactly what it should be):&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;bash-3.2$ &lt;b&gt;python3 kcs_decode.py osi_sample.wav&lt;/b&gt;


 10 FOR I = 1 TO 1000
 20 PRINT I;
 30 NEXT I
 40 END
OK
bash-3.2$ 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;That's pretty nice--two relatively simple generator functions and some basic data manipulation on deques has turned the audio stream into a stream of bytes.&lt;/p&gt;&lt;p&gt;One thing that's not shown above is the embedded NULLs related to newline handling.  You can see them if you do this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;bash-3.2$ &lt;b&gt;python3 kcs_decode.py osi_sample.wav | cat -e&lt;/b&gt;
^M^@^@^@^@^@^@^@^@^@^@$
^M^@^@^@^@^@^@^@^@^@^@$
 10 FOR I = 1 TO 1000^M^@^@^@^@^@^@^@^@^@^@$
 20 PRINT I;^M^@^@^@^@^@^@^@^@^@^@$
 30 NEXT I^M^@^@^@^@^@^@^@^@^@^@$
 40 END^M^@^@^@^@^@^@^@^@^@^@$
OK^M^@^@^@^@^@^@^@^@^@^@$
bash-3.2$ 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;b&gt;How well does it work?&lt;/b&gt; &lt;/p&gt;&lt;p&gt;To test this decoding process, I recorded various audio samples directly from my Superboard using Audacity on my Mac.  I used different sampling frequencies ranging from 8000 Hz to 48000 Hz.  For all of the samples, the decoding process worked exactly as expected, producing no observable decoding errors.  &lt;/p&gt;&lt;p&gt;Decoding 5788 bytes of transmitted test data from a 47 Mbyte WAV file of 48 kHz stereo samples takes about 5.7 seconds on my Macbook (2.4 GHz Intel Core Duo) for a baud rate of about 11000--more than 35 times faster than the Superboard can actually send it.   Decoding the same data recorded in a 7.3 Mbyte WAV file with 8 kHz stereo samples takes about 0.97 seconds for a baud rate of about 65000 (Note: these baud rates are based on 11 bits of encoding for every data byte).&lt;/p&gt;&lt;p&gt;Although I could work to make the script run faster, it is already plenty fast for my purposes. Moreover, the generator-based approach means that the script really isn't limited by the size of the input WAV files.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Final Words&lt;/b&gt; &lt;/p&gt;&lt;p&gt;If you are interested in the final script, you can find it in the file &lt;a href="http://www.dabeaz.com/kcs_decode.py"&gt;kcs_decode.py&lt;/a&gt;. Although I've now written scripts to encode and decode Superboard II cassette audio data, this is hardly the last word.  Stay tuned (evil wink ;-). &lt;/p&gt;&lt;p&gt;&lt;b&gt;Footnote&lt;/b&gt; &lt;/p&gt;&lt;p&gt;If you're going to try any of this code, make sure you're using Python 3.1.2 or newer.
Earlier versions of Python 3 seem to have buggy versions of the &lt;tt&gt;wave&lt;/tt&gt; module.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-2212420545471036146?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/2212420545471036146/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=2212420545471036146' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2212420545471036146'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2212420545471036146'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/08/decoding-superboard-ii-cassette-audio.html' title='Decoding Superboard II Cassette Audio Using Python 3, Two Generators, and a Deque'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-9149143544906598503</id><published>2010-08-22T13:47:00.000-07:00</published><updated>2010-09-08T20:32:16.382-07:00</updated><title type='text'>Using Python to Encode Cassette Recordings for my Superboard II</title><content type='html'>&lt;p&gt;See &lt;a href="http://dabeaz.blogspot.com/2010/08/decoding-superboard-ii-cassette-audio.html"&gt;Part 2&lt;/a&gt; for a discussion of decoding audio&lt;/p&gt;
&lt;p&gt;See &lt;a href="http://dabeaz.blogspot.com/2010/09/using-telnet-to-access-my-superboard-ii.html"&gt;Part 3&lt;/a&gt; to see real-time audio encoding/decoding used in conjunction with telnet.&lt;/p&gt;
&lt;p&gt;My family's first computer was an &lt;a href="http://oldcomputers.net/osi-600.html"&gt;Ohio Scientific Superboard II&lt;/a&gt;--something that my father purchased around 1979.   At the time, the Superboard II was about the most inexpensive computer you could get.  In fact, it didn't even include a power supply or a case.  If you wanted those features, you had to add them yourself.  Here's a picture of our system with the top of the (homemade) case removed so that you can see inside.&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_small.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;To say that the Superboard II is minimal is certainly an understatement by today's standards.   There was only 8192 total bytes of memory and no real operating system to speak of.  When you powered on the system you could either run the machine language monitor or Microsoft Basic Version 1.0.  Here's a sample of what appeared on the screen (yes, that's maximum resolution):&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_screen.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Much to my amazement, our old Superboard II system stayed in the family.   For about 20-25 years it sat in the basement of my mother's house surrounded by boxes.  After that, it sat for a few years in a closet at my brother's condo.  Occasionally, we had discussed the idea of powering it up to see if it still worked, but never got around to it--until now.  About a week ago, my brother threw the old computer along with an old Amiga monitor in the back of his car and headed east to Chicago.  After some discussion, we decided we'd just blow the dust out of it, power it on, and see what would happen.&lt;/p&gt;&lt;p&gt;Unbelievably, the machine immediately sprang to life.  The above screenshot was taken just today.  Since powering it up, I've written a few short programs to test the integrity of the memory and ROMs.  Aside from a 1-bit memory error (bit 2 at location 0x861) it appears to be fully functional.&lt;/P&gt;&lt;p&gt;One problem with these old machines is that they had very little support for any kind of real I/O.  Forget about USB, Firewire, or Ethernet.  Heck, this machine didn't even have a serial or parallel port on it.  In fact, the only external interface was a pair of audio ports for saving and loading programs on a cassette tape player--which was also the only way to save any of your work as there was no disk drive of any kind.  Here is a picture of the back&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_back.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Since the old machine seemed to be working, I got to thinking about ways to program it.  Working directly on the machine was certainly possible, but if you look at the keyboard, you'll notice that there aren't even any arrow keys (there is no cursor control anyways) and some of the characters are in unusual locations.  Plus, some of the keys are starting to show their age.  For example, pressing '+' tends to produce about 3 or 4 '+' characters due to some kind of key debouncing problem.   So, like most Python programmers, I started to wonder if there was some way I could write a script that would let me program the machine in a more straightforward manner from my Mac.&lt;/p&gt;&lt;p&gt;Since the only input port available on the machine was a cassette audio port, the proposition seemed simple enough: could I write a Python script to convert a normal text file into a WAV audio file that when played, would upload the contents of the text file into the Superboard II?  Obviously, the answer is yes, but let's look at the details.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Viewing Cassette Audio Output&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;On many old machines, cassette output is encoded using something called the &lt;a href="http://en.wikipedia.org/wiki/Kansas_City_standard"&gt;Kansas City Standard&lt;/a&gt;.  It's a pretty simple encoding.  A 0 is encoded as 4 cycles of a 1200 Hz sine wave and a 1 is encoded as 8 cycles of a 2400 Hz sine wave.   If no data is being transmitted, there is a constant 2400 Hz wave.  Each byte of data is transmitted by first sending a 0 start bit followed by 8 bits of data (LSB first) followed by two stop bits (1s).  Click &lt;a href="http://www.dabeaz.com/images/osi_sample.wav"&gt;here&lt;/a&gt; to hear a WAV file sample of actual data being saved by my Superboard II.  I recorded this sample using &lt;a href="http://audacity.sourceforge.net/"&gt;Audacity&lt;/a&gt; on my Mac.&lt;/p&gt;&lt;p&gt;Python has a built-in module for reading WAV files.   Combined with &lt;a href="http://matplotlib.sourceforge.net/"&gt;Matplotlib&lt;/a&gt; you can easily view the waveform.  For example:&lt;/P&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;import wave&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;f = wave.open("osi_sample.wav")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;f.getnchannels()&lt;/b&gt;
2
&gt;&gt;&gt; &lt;b&gt;f.getsampwidth()&lt;/b&gt;
2
&gt;&gt;&gt; &lt;b&gt;f.getnframes()&lt;/b&gt;
1213851
&gt;&gt;&gt; &lt;b&gt;rawdata = bytearray(f.readframes(1000000))&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;del rawdata[2::4]&lt;/b&gt;    # Delete the low byte of each right-channel sample
&gt;&gt;&gt; &lt;b&gt;del rawdata[2::3]&lt;/b&gt;    # Delete the remaining (high) byte of the right channel
&gt;&gt;&gt; &lt;b&gt;wavedata = [a + (b &lt;&lt; 8) for a,b in zip(rawdata[::2],rawdata[1::2])]&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;import pylab&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;pylab.plot(wavedata)&lt;/b&gt;
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;After some panning and zooming, you'll see a plot like this.   You can observe the different frequencies used for representing 0s and 1s.  Again, this plot was created from an actual sound recording of data saved by the system.&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_wave_small.png"&gt;&lt;/center&gt;&lt;br /&gt;
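&lt;p&gt;One caveat about the interactive session above: it combines each byte pair as an unsigned value, whereas 16-bit PCM WAV samples are stored as signed little-endian integers, so negative samples show up as large positive numbers.  Here is a minimal sketch of a signed decode, assuming 16-bit samples.  The helper name &lt;tt&gt;left_channel_samples&lt;/tt&gt; is hypothetical (it is not part of the original session), and &lt;tt&gt;int.from_bytes&lt;/tt&gt; requires Python 3.2 or newer.&lt;/p&gt;

```python
# Hypothetical helper (not from the original post): decode the left channel
# of raw WAV frames as signed little-endian integers. Assumes 16-bit PCM data.
def left_channel_samples(frames, nchannels=2, sampwidth=2):
    step = nchannels * sampwidth          # Bytes per audio frame
    return [int.from_bytes(frames[i:i+sampwidth], "little", signed=True)
            for i in range(0, len(frames) - step + 1, step)]

# Two stereo frames: left = 1, right = 32767, then left = -1, right = -32768
raw = bytes([0x01, 0x00, 0xff, 0x7f, 0xff, 0xff, 0x00, 0x80])
samples = left_channel_samples(raw)
```

&lt;p&gt;Passing the result of &lt;tt&gt;left_channel_samples(f.readframes(1000000))&lt;/tt&gt; to &lt;tt&gt;pylab.plot()&lt;/tt&gt; should then give a waveform centered on zero.&lt;/p&gt;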
&lt;p&gt;&lt;b&gt;Converting Text into a KCS WAV File&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Using Python's &lt;a href="http://docs.python.org/library/wave.html"&gt;wave&lt;/a&gt; module, it is relatively straightforward to go in the other direction--that is, take a text file and encode it into a WAV file suitable for playback.  Here is the general strategy for how to do it:&lt;br /&gt;
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Define a utility function for making a square-wave pulse.&lt;/li&gt;
&lt;li&gt;Define wave fragments for a 0-bit (4 cycles of a 1200 Hz square wave) and 1-bit (8 cycles of a 2400 Hz square wave).&lt;/li&gt;
&lt;li&gt;Write a function that encodes a byte of data as an 11-bit wave fragment consisting of a start bit, 8 bits of data, and 2 stop bits.&lt;/li&gt;
&lt;li&gt;Write a function that takes an input text file, encodes every single byte using this scheme, and writes a big WAV file with some extra padding on the front and back.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Here is a script &lt;a href="http://www.dabeaz.com/kcs_encode.py"&gt;kcs_encode.py&lt;/a&gt; that has one implementation.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;#!/usr/bin/env python3
# kcs_encode.py
#
# Author : David Beazley (http://www.dabeaz.com)
# Copyright (C) 2010
#
# Requires Python 3.1.2 or newer

"""
Takes the contents of a text file and encodes it into a Kansas
City Standard WAV file, that when played will upload data via the
cassette tape input on various vintage home computers. See
http://en.wikipedia.org/wiki/Kansas_City_standard
"""

import wave

# A few global parameters related to the encoding

FRAMERATE = 9600       # Hz
ONES_FREQ = 2400       # Hz (per KCS)
ZERO_FREQ = 1200       # Hz (per KCS)
AMPLITUDE = 128        # Peak-to-peak amplitude of generated square waves
CENTER    = 128        # Center point of generated waves

# Create a single square wave cycle of a given frequency 
def make_square_wave(freq,framerate):
    n = int(framerate/freq/2)
    return bytearray([CENTER-AMPLITUDE//2])*n + \
           bytearray([CENTER+AMPLITUDE//2])*n

# Create the wave patterns that encode 1s and 0s
one_pulse  = make_square_wave(ONES_FREQ,FRAMERATE)*8
zero_pulse = make_square_wave(ZERO_FREQ,FRAMERATE)*4

# Pause to insert after carriage returns (10 NULL bytes)
null_pulse = ((zero_pulse * 9) + (one_pulse * 2))*10

# Take a single byte value and turn it into a bytearray representing
# the associated waveform along with the required start and stop bits.
def kcs_encode_byte(byteval):
    bitmasks = [0x1,0x2,0x4,0x8,0x10,0x20,0x40,0x80]
    # The start bit (0)
    encoded = bytearray(zero_pulse)
    # 8 data bits
    for mask in bitmasks:
        encoded.extend(one_pulse if (byteval &amp;amp; mask) else zero_pulse)
    # Two stop bits (1)
    encoded.extend(one_pulse)
    encoded.extend(one_pulse)
    return encoded

# Write a WAV file with encoded data. leader and trailer specify the
# number of seconds of carrier signal to encode before and after the data
def kcs_write_wav(filename,data,leader,trailer):
    w = wave.open(filename,"wb")
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(FRAMERATE)

    # Write the leader
    w.writeframes(one_pulse*(int(FRAMERATE/len(one_pulse))*leader))

    # Encode the actual data
    for byteval in data:
        w.writeframes(kcs_encode_byte(byteval))
        if byteval == 0x0d:
            # If CR, emit a short pause (10 NULL bytes)
            w.writeframes(null_pulse)
    
    # Write the trailer
    w.writeframes(one_pulse*(int(FRAMERATE/len(one_pulse))*trailer))
    w.close()

if __name__ == '__main__':
    import sys
    if len(sys.argv) != 3:
        print("Usage : %s infile outfile" % sys.argv[0],file=sys.stderr)
        raise SystemExit(1)

    in_filename = sys.argv[1]
    out_filename = sys.argv[2]
    data = open(in_filename,"r").read()      # "U" mode is gone in Python 3.11+; text mode already normalizes newlines
    data = data.replace('\n','\r\n')         # Fix line endings
    rawdata = bytearray(data.encode('latin-1'))
    kcs_write_wav(out_filename,rawdata,5,5)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;You can study the implementation yourself for some of the finer details.  However, most of the heavy work is carried out using operations on Python's &lt;tt&gt;bytearray&lt;/tt&gt; object.  For padding the audio, a constant 1 bit is emitted (a constant 2400 Hz wave).   To handle old text encoding, each newline is replaced with a carriage return followed by a newline.  Moreover, to account for the slow speed of the Superboard II, a pause of 10 NULL bytes (80 data bits, or 110 bits counting start and stop bits) is inserted after each carriage return.&lt;/p&gt;&lt;p&gt;To use this script, you now just need an old BASIC program to upload.   Here's a really simple one (from the Superboard II manual):&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;10 PRINT "I WILL THINK OF A"
15 PRINT "NUMBER BETWEEN 1 AND 100"
20 PRINT "TRY TO GUESS WHAT IT IS"
25 N = 0
30 X = INT(RND(56)*99+1)
35 PRINT
40 PRINT "WHATS YOUR GUESS   ";
50 INPUT G
52 N = N + 1
55 PRINT
60 IF G = X THEN GOTO 110
70 IF G &gt; X THEN GOTO 90
80 PRINT "TOO SMALL, TRY AGAIN ";
85 GOTO 50
90 PRINT "TOO LARGE, TRY AGAIN ";
100 GOTO 50
110 PRINT "YOU GOT IT IN";N;" TRIES"
113 IF N &gt; 6 THEN GOTO 120
117 PRINT "VERY GOOD"
120 PRINT
130 PRINT
140 GOTO 10
150 END
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Let's say this program is in a file &lt;tt&gt;guess.bas&lt;/tt&gt;.  Here's how to encode it using our script.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;bash $ &lt;b&gt;python3 kcs_encode.py guess.bas guess.wav&lt;/b&gt;
bash $ &lt;b&gt;ls -l guess.wav&lt;/b&gt;
352652
bash $
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Now, we have an audio file that's ready to go (note: it's rather impressive that a 476 byte input file has now expanded to a 350Kbyte audio file). You can listen to it &lt;a href="http://www.dabeaz.com/images/guess.wav"&gt;here&lt;/a&gt;. Note that data doesn't start until about 5 seconds have passed.&lt;/p&gt;&lt;p&gt;Now, the ultimate test.  Does this audio file even work?  To test it, we first hook up the audio input of the Superboard II to my Macbook.&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_mac.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
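&lt;p&gt;As an aside, the size of &lt;tt&gt;guess.wav&lt;/tt&gt; can be worked out by hand from the encoding parameters.  The sketch below does the arithmetic.  It assumes that &lt;tt&gt;guess.bas&lt;/tt&gt; has 23 newline-terminated lines and that the &lt;tt&gt;wave&lt;/tt&gt; module writes a plain 44-byte header; the function name &lt;tt&gt;kcs_wav_size&lt;/tt&gt; is hypothetical and not part of &lt;tt&gt;kcs_encode.py&lt;/tt&gt;.&lt;/p&gt;

```python
# Sketch: predict the size of the WAV file written by kcs_encode.py.
# Constants mirror the script; WAV_HEADER assumes a canonical 44-byte header.
FRAMERATE      = 9600
FRAMES_PER_BIT = 32        # 8 cycles of 2400 Hz (or 4 of 1200 Hz) at 9600 Hz
BITS_PER_BYTE  = 11        # 1 start bit + 8 data bits + 2 stop bits
NULLS_PER_CR   = 10        # Pause inserted after each carriage return
WAV_HEADER     = 44

def kcs_wav_size(nbytes, nlines, leader=5, trailer=5):
    data_bytes = nbytes + nlines             # Each '\n' becomes '\r\n'
    data_bytes += nlines * NULLS_PER_CR      # NULL padding after each CR
    frames = data_bytes * BITS_PER_BYTE * FRAMES_PER_BIT
    frames += (leader + trailer) * FRAMERATE # Carrier before and after the data
    return WAV_HEADER + frames               # 1 byte per frame (8-bit mono)

size = kcs_wav_size(476, 23)                 # 476-byte input, 23 lines
```

&lt;p&gt;Under those assumptions the result is 352652 bytes, which agrees with the &lt;tt&gt;ls&lt;/tt&gt; output shown above.  Most of the expansion comes from spending 32 audio frames on every transmitted bit.&lt;/p&gt;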
&lt;p&gt;Next, we go over to the Superboard II and type 'LOAD'&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_load.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Next, we start playing the WAV file on the Mac.  After a few seconds, you see data streaming in (at about 300 baud). Excellent!&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_loading.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Finally, the ultimate test. Let's play the game:&lt;/p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/osi_running.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Awesome!  Note for anyone under the age of 40: yes, this is the kind of stuff people did on these old machines--and we thought it was every bit as awesome as your shiny iPad.  Maybe even more awesome.  I digress.&lt;/p&gt;&lt;p&gt;(It occurs to me that fooling around on this machine might be the reason why I got an F in 7th grade math and had to attend summer school)&lt;/p&gt;&lt;p&gt;Just so you can get the full effect, here is a video of the upload in action.  It's really hard to believe that systems were so slow back then.  For big programs, it might take 5 minutes or more to load (even with the 8K limit):&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;object width="425" height="344"&gt;&lt;param name="movie" value="http://www.youtube.com/v/VEcgIH096fQ&amp;hl=en&amp;fs=1"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/VEcgIH096fQ&amp;hl=en&amp;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;
&lt;/center&gt;&lt;br /&gt;
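&lt;p&gt;The "5 minutes or more" figure is easy to sanity-check.  At 300 baud with 11 bits per byte, the transfer time follows directly from the byte count.  This is only a rough sketch: treating the 8K limit as 8192 bytes of transmitted text is an assumption, and the function name &lt;tt&gt;kcs_seconds&lt;/tt&gt; is hypothetical.&lt;/p&gt;

```python
# Sketch: estimate KCS transfer time at 300 baud.
BAUD          = 300     # One bit = 8 cycles at 2400 Hz = 1/300 second
BITS_PER_BYTE = 11      # 1 start bit + 8 data bits + 2 stop bits

def kcs_seconds(nbytes, ncr=0, nulls_per_cr=10):
    total_bytes = nbytes + ncr * nulls_per_cr   # Include the pause after each CR
    return total_bytes * BITS_PER_BYTE / BAUD

full_memory = kcs_seconds(8192)     # Roughly 300 seconds, i.e. about 5 minutes
```

&lt;p&gt;The NULL padding after each carriage return only makes this longer, since every line adds another 10 bytes to send.&lt;/p&gt;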
&lt;p&gt;Well, that's about it for now.   The power of Python never ceases to amaze me--once again a problem that seems like it might be hard is solved with a short script using nothing more than a single built-in library module and some basic data manipulation. Next on the agenda: A Python script to decode WAV files back into text files.&lt;/p&gt;&lt;p&gt;By the way, if you take one of my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;classes&lt;/a&gt;, you can play with the Superboard II yourself (wink ;-).&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-9149143544906598503?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/9149143544906598503/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=9149143544906598503' title='22 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/9149143544906598503'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/9149143544906598503'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/08/using-python-to-encode-cassette.html' title='Using Python to Encode Cassette Recordings for my Superboard II'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>22</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-2172876071062476182</id><published>2010-08-20T07:39:00.001-07:00</published><updated>2010-08-20T07:39:48.778-07:00</updated><title type='text'>Python Networks, 
Concurrency, and Distributed Systems</title><content type='html'>&lt;p&gt;If you've been paying any attention, you've probably noticed programmers talking a lot about multicore, concurrency, distributed computing, parallel programming, and more.   If you've ever wanted to know more about how all of this stuff works, then you should come to Chicago and attend my upcoming &lt;a href="http://www.dabeaz.com/chicago/network.html"&gt;Python Networks, Concurrency, and Distributed Systems&lt;/a&gt; course September 13-17.&lt;/p&gt;&lt;p&gt;Simply stated, this is probably the most intense and in-depth course on this topic you are going to find anywhere--the kind of course where you'll start with the basics and dig deep to see how everything else works under the covers.&lt;/p&gt;&lt;p&gt;If you attend, here's what to expect.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Day 1 (Fundamentals):&lt;/b&gt;  The first day introduces some fundamental topics including socket programming, encoding/decoding, data handling (XML, JSON, etc.), and aspects of writing HTTP-based web services.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Day 2 (Threads and Multiprocessing):&lt;/b&gt; An in-depth look at Python thread programming, multiprocessing, and concurrent programming techniques.  This includes knowing how to properly use synchronization primitives (locks, semaphores, etc.) and queues. Naturally, I'll probably say a few things about the &lt;a href="http://www.dabeaz.com/GIL"&gt;GIL&lt;/a&gt;. &lt;/p&gt;&lt;p&gt;&lt;b&gt;Day 3 (Serialization, Messaging, and Distributed Computing):&lt;/b&gt;  Details on data serialization, message passing, message queues, and distributed computation.   Major topics include the actor model, remote procedure call (RPC), REST, and distributed objects.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Day 4 (Async, Events, and Tasklets):&lt;/b&gt;  An in-depth examination of asynchronous and event-driven I/O handling.   Part of the coverage will simply be about I/O handling in the operating system. 
We'll then build some event-driven applications and look at different event-driven programming models.  This includes the use of generators and coroutines to implement microthreads or tasklets.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Day 5 (Coding):&lt;/b&gt;  The last half-day will be spent coding.   Possible topics include implementing your own map-reduce library, using third-party libraries (Twisted, Py0MQ, gevent, etc.), or simply working on your own projects.&lt;/p&gt;&lt;p&gt;You should know that my classes are small--no more than 6 people.  As a result, there is a significant amount of coding, discussion and interaction.  This is not a course where you will simply sit in the back and listen the whole time.  For instance, here's a scene from my recent "Advanced Python Mastery" course:&lt;/p&gt;&lt;p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/chicago/class.jpg"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Finally, I should emphasize that you do not need to be an expert in the course topics to attend--in fact, that's the whole point.  All you need is a passion for Python and a desire to learn new things.   If you come, you'll have a great time.  &lt;/p&gt;&lt;p&gt;More information on the upcoming course can be found &lt;a href="http://www.dabeaz.com/chicago/network.html"&gt;here&lt;/a&gt;.  Hopefully I'll see you in a future class.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-2172876071062476182?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/2172876071062476182/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=2172876071062476182' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2172876071062476182'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2172876071062476182'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/08/python-networks-concurrency-and.html' title='Python Networks, Concurrency, and Distributed Systems'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-9047521973620765908</id><published>2010-08-13T08:04:00.000-07:00</published><updated>2010-08-16T15:00:15.344-07:00</updated><title type='text'>Five Dubious Reasons to Take My Advanced Python Course at the Last Minute</title><content type='html'>&lt;p&gt;For the 
past 10 weeks, my upcoming &lt;a href="http://www.eventbrite.com/event/619728625"&gt;Advanced Python Mastery&lt;/a&gt; course has been sold out.  However, a last-minute cancellation has opened up one slot.  Here are five, somewhat dubious, reasons why you should attend my August 17-19 course at the last minute:&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;You want to get fired (JetBlue version).&lt;/b&gt;  Scream "I'm not going to take it anymore!", grab your Python books, and make a mad dash to Chicago while leaving your coworkers to meet your project deadline.  Yes, I'm looking at you Java.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;You want to get fired (alternate version).&lt;/b&gt;  If you apply everything that you will learn in this class to someone else's code, they will almost certainly want to fire you.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;You want to keep your job.&lt;/b&gt;  If you apply everything that you will learn in this class to your own code, you will earn a "job security" merit badge.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;You want to gain 10 pounds.&lt;/b&gt;  Did I mention that the venue for this course is completely surrounded by a wall of bakeries and coffee shops?&lt;/li&gt;
&lt;li&gt;&lt;b&gt;You want to create a new web framework&lt;/b&gt;.   Well, maybe you won't do that, but if you take this course you will explore almost every diabolical Python feature framework builders use to perform magic (decorators, metaclasses, descriptors, context managers, generators, coroutines, and more).  It will completely change the way you look at Python code.  Plus, it's just cool.&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;If this sounds interesting, registration information is available at &lt;a href="http://www.eventbrite.com/event/619728625"&gt;http://www.eventbrite.com/event/619728625&lt;/a&gt;. Hopefully I'll see you in Chicago next week!&lt;/p&gt;&lt;p&gt;&lt;b&gt;8/16 (Last Chance!)&lt;/b&gt;  Space is still available and as if you don't need another dubious reason to go, you'll get a chance to do peeks and pokes on the vintage OSI home computer my brother and I resurrected over the weekend. &lt;/p&gt;--Dave&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-9047521973620765908?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/9047521973620765908/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=9047521973620765908' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/9047521973620765908'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/9047521973620765908'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/08/five-dubious-reasons-to-take-my_13.html' title='Five Dubious Reasons to Take My Advanced Python Course at the Last Minute'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-6379636000702059811</id><published>2010-07-31T11:59:00.000-07:00</published><updated>2010-08-01T04:31:48.017-07:00</updated><title type='text'>Yieldable Threads 
(Part 1)</title><content type='html'>&lt;p&gt;&lt;b&gt;Disclaimer:&lt;/b&gt; This whole post is one big thought experiment.  It might, in fact, be a really dumb idea.  Don't say that you weren't warned! - Dave.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Introduction&lt;/b&gt;&lt;/p&gt;&lt;p&gt;I'll admit that I'm an unabashed fan of Python generator functions--especially when applied to problems in data processing (e.g., setting up processing pipelines, cranking on big datasets, etc.).  Generators also have a rather curious use in the world of concurrency--especially in libraries that aim to provide an alternative to threading (i.e., tasklets, greenlets, coroutines, etc.).  Just in case you missed them, I've given past PyCon tutorials on both &lt;a href="http://www.dabeaz.com/generators"&gt;generators&lt;/a&gt; and &lt;a href="http://www.dabeaz.com/coroutines"&gt;coroutines&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;A common theme surrounding the use of generators and concurrency is that you can define functions that seem to operate as "tasks" without using any system threads (sometimes this approach is known as microthreading or green threading).  Typically this is done by playing clever tricks with I/O operations. For example, suppose you initially had the following function that served a client in a multithreaded server.&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;def serve_client(c):
    request = c.recv(MAXSIZE)        # Read from a socket
    response = processing(request)   # Process the request
    c.send(response)                 # Send on a socket
    c.close()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;With the assistance of a coroutine or tasklet library, you might be able to get rid of threads and rewrite the function slightly, using &lt;tt&gt;yield&lt;/tt&gt; statements like this (keep in mind this is a high-level overview--the actual specifics might vary):&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;def serve_client(c):
    request = yield c.recv(MAXSIZE)        # Read from a socket
    response = processing(request)         # Process the request
    yield c.send(response)                 # Send on a socket
    c.close()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If you've never seen anything like this before, it will probably make your head spin (see my &lt;a href="http://www.dabeaz.com/coroutines"&gt;coroutines tutorial&lt;/a&gt; from PyCon'09).  However, the gist of the idea is that the &lt;tt&gt;yield&lt;/tt&gt; statements cause the function to suspend execution at points where I/O operations might block.  Underneath the covers, the I/O request operations (recv, send, etc.) are handled by a scheduler that uses nonblocking I/O and multiplexing (e.g., &lt;tt&gt;select()&lt;/tt&gt;, &lt;tt&gt;poll()&lt;/tt&gt;, etc.) to drive the execution of multiple generator functions at once, giving the illusion of concurrent execution.   To be sure, it's a neat trick and it works great--well, so long as the &lt;tt&gt;processing()&lt;/tt&gt; operation in the middle is well behaved.&lt;br /&gt;
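&lt;p&gt;To make the scheduler idea a little more concrete, here is a minimal sketch of such a loop.  It is only an illustration under some loud assumptions: the request tuples &lt;tt&gt;("recv", sock, nbytes)&lt;/tt&gt; and &lt;tt&gt;("send", sock, data)&lt;/tt&gt; and the &lt;tt&gt;run()&lt;/tt&gt; function are made up for this example and are not the API of any particular coroutine library.  A real scheduler would also put the sockets in nonblocking mode and deal with partial sends and errors.&lt;/p&gt;

```python
import select
import socket
from collections import deque

def serve_client(c):
    # Same shape as the handler above, except that each I/O step is
    # yielded as a (hypothetical) request tuple for the scheduler.
    request = yield ("recv", c, 1024)
    response = request.upper()            # stand-in for processing()
    yield ("send", c, response)
    c.close()

def run(tasks):
    ready = deque((t, None) for t in tasks)   # (task, value to send in)
    waiting = {}                              # socket -> (task, request)
    while ready or waiting:
        # Run every ready task up to its next yield
        while ready:
            task, value = ready.popleft()
            try:
                op = task.send(value)
            except StopIteration:
                continue                      # task finished
            waiting[op[1]] = (task, op)
        if not waiting:
            break
        # Block until at least one pending I/O request can proceed
        rset = [s for s, (_, op) in waiting.items() if op[0] == "recv"]
        wset = [s for s, (_, op) in waiting.items() if op[0] == "send"]
        readable, writable, _ = select.select(rset, wset, [])
        for s in readable:
            task, op = waiting.pop(s)
            ready.append((task, s.recv(op[2])))
        for s in writable:
            task, op = waiting.pop(s)
            ready.append((task, s.send(op[2])))

# Two "clients" connected through local socket pairs
a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()
b1.send(b"hello")
b2.send(b"world")
run([serve_client(a1), serve_client(a2)])
replies = [b1.recv(1024), b2.recv(1024)]
b1.close()
b2.close()
```

&lt;p&gt;Both handlers run inside a single thread, and the only place the program ever waits is the &lt;tt&gt;select()&lt;/tt&gt; call.  That is also why a slow step between the two yields would hold up every other task.&lt;/p&gt;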
&lt;p&gt;Sadly, it's not generally safe to assume that the processing step will play nice.  In fact, a major limitation of coroutine (and event-driven) approaches concerns the handling of processing steps that block or consume a large number of CPU cycles. This is because any kind of blocking causes the entire program and all "tasks" to grind to a halt until that operation completes.   This becomes a major concern if you are going to use other programming libraries as most code has not been written to operate in such a non-blocking manner.  In fact, libraries based on polling and non-blocking I/O typically take great pains to work around this limitation (for instance, consider the difficulty of performing a blocking third-party library database query in this environment).&lt;/p&gt;&lt;p&gt;&lt;b&gt;Threads : I'm Not Dead Yet&lt;/b&gt;&lt;/p&gt;&lt;p&gt;A simple solution that almost always eliminates the problem of blocking is to program with threads.  Yes, threads--everyone's favorite public enemy number one.  In fact, threads work great for blocking operations.  Not only that, the resulting threaded code tends to have readable and comprehensible control-flow (e.g., organized as a logical sequence of steps as opposed to being fragmented across dozens of asynchronous callbacks and event handlers).  Frankly, I suspect that most Python programmers would prefer to use threads if it weren't for the simple fact that their performance on CPU-bound processing sucks (damn you &lt;a href="http://www.dabeaz.com/GIL"&gt;GIL!&lt;/a&gt;). However, I digress--I've already said more than enough about that.&lt;/p&gt;&lt;p&gt;&lt;b&gt;A Different Premise (Thought Experiment)&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Generator-based and event-based alternatives to threads usually start from the premise that thread programming should be avoided.   However, all of these thread alternatives also depend heavily on reimplementing the one part of thread programming that actually works reasonably well--handling of blocking I/O.&lt;/p&gt;&lt;p&gt;As a thought experiment, I got to wondering--why do thread alternatives fix what isn't broken about threads? If you &lt;em&gt;really&lt;/em&gt; wanted to fix threads, wouldn't you want to address the part of thread programming that actually &lt;em&gt;is&lt;/em&gt; broken? In particular, the poor execution of CPU-intensive work.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Yielding CPU-intensive work&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Generator-based alternatives to threads use the &lt;tt&gt;yield&lt;/tt&gt; statement to have I/O operations carried out elsewhere (by a scheduler sitting behind the scenes).   However, what if you fully embraced threads, but applied that same idea to CPU intensive processing instead of I/O?   For example, consider a threaded function that looked like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;def serve_client(c):
    request = c.recv(MAXSIZE)                # Read from a socket
    response = yield processing, (request,)  # Request processing (somehow)
    c.send(response)                         # Send on a socket
    c.close()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;In this code, &lt;tt&gt;yield&lt;/tt&gt; is not used to perform non-blocking I/O.  Instead, it's used to "punt" on CPU-intensive processing.  For example, instead of directly calling the &lt;tt&gt;processing()&lt;/tt&gt; function above, the &lt;tt&gt;yield&lt;/tt&gt; statement merely spits it out to someone else.  In a sense, the thread is using &lt;tt&gt;yield&lt;/tt&gt; to say that it does &lt;b&gt;NOT&lt;/b&gt; want to do that work and that it wants to suspend until someone else finishes it.&lt;/p&gt;&lt;p&gt;Some careful study is probably required, but just to emphasize, the above generator freely performs blocking I/O operations (something threads are good at), but kicks out problematic CPU-intensive processing using yield.  It's almost the exact opposite of what you normally see with generator-based microthreading.&lt;/p&gt;&lt;p&gt;&lt;b&gt;A Yieldable Thread Object&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;To run such a function as a thread, you need to have a little bit of extra runtime support.  The following code defines a new thread object that knows how to utilize our new use of &lt;tt&gt;yield&lt;/tt&gt;:  &lt;br /&gt;
&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;# ythr.py
# Author : David Beazley
# 
# A yieldable thread object that offloads CPU intensive work 
# to a user-defined apply function

from threading import Thread
from types import GeneratorType

# Compatibility function (for Python 3)
def apply(func,args=(),kwargs={}):
    return func(*args,**kwargs)

class YieldableThread(Thread):
    def __init__(self,target,args=(),kwargs={},cpu_apply=None):
        Thread.__init__(self)
        self.__target = target
        self.__args = args
        self.__kwargs = kwargs
        self.__cpu_apply = cpu_apply if cpu_apply else apply

    # Run the thread and check for generators (if any)
    def run(self):
        # Call the specified target function
        result = self.__target(*self.__args,**self.__kwargs)
        # Check if the result is a generator.  If so, run it
        if isinstance(result, GeneratorType):
            genfunc   = result    # generator object to run
            genresult = None      # last result to send into the generator
            while True:
                try:
                    # Run to the next yield and get work to do
                    work = genfunc.send(genresult)
                    # Execute the work using the user-defined apply function
                    genresult = self.__cpu_apply(*work)
                except StopIteration:
                    break

&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;The key to this implementation is the bottom part of the &lt;tt&gt;run()&lt;/tt&gt; method that checks to see if the target function produced a generator.  If so, the run method manually steps through the generator (using its &lt;tt&gt;send()&lt;/tt&gt; method).   The yielded results are assumed to represent CPU-intensive functions that need to execute.  For each of these, the work is passed to a user-supplied apply function (the &lt;tt&gt;__cpu_apply&lt;/tt&gt; attribute). By default, this function is set to &lt;tt&gt;apply()&lt;/tt&gt; which makes the thread run the work as if &lt;tt&gt;yield&lt;/tt&gt; wasn't used at all.  However, as we'll see shortly, there are many different things that can be done by supplying a different implementation.&lt;br /&gt;
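&lt;p&gt;The stepping logic can be tried on its own, without threads or sockets.  The sketch below is just that, a sketch: &lt;tt&gt;drive()&lt;/tt&gt;, &lt;tt&gt;plain_apply()&lt;/tt&gt;, &lt;tt&gt;logging_apply()&lt;/tt&gt;, &lt;tt&gt;square()&lt;/tt&gt;, and &lt;tt&gt;task()&lt;/tt&gt; are hypothetical names invented for this illustration.  &lt;tt&gt;drive()&lt;/tt&gt; follows the same send-and-apply loop as the &lt;tt&gt;run()&lt;/tt&gt; method, and the two apply functions show how swapping the apply step changes what happens to the yielded work.&lt;/p&gt;

```python
def drive(gen, cpu_apply):
    # Step a generator the way run() does: send in the last result,
    # receive the next piece of work, and hand it to cpu_apply.
    result = None
    while True:
        try:
            work = gen.send(result)   # the first send(None) starts the generator
        except StopIteration:
            break
        result = cpu_apply(*work)     # work is (func, args) or (func, args, kwargs)

def plain_apply(func, args=(), kwargs=None):
    # Default behavior: just call the function in the current thread
    return func(*args, **(kwargs or {}))

calls = []
def logging_apply(func, args=(), kwargs=None):
    # A custom apply: record the request, then run it as usual
    calls.append((func.__name__, args))
    return plain_apply(func, args, kwargs)

def square(x):
    return x * x

results = []
def task():
    a = yield square, (3,)            # a receives the value computed by cpu_apply
    b = yield square, (a,)
    results.append(b)

drive(task(), plain_apply)
drive(task(), logging_apply)
```

&lt;p&gt;The generator never calls &lt;tt&gt;square()&lt;/tt&gt; itself.  Each value it gets back from &lt;tt&gt;yield&lt;/tt&gt; is whatever the apply function returned, so the caller decides where and how the work actually runs.&lt;/p&gt;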
&lt;p&gt;&lt;b&gt;An Example&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;To explore this thread implementation, we first need a CPU-intensive function to work with.  Here's a trivial one just for the purpose of exploring the concept:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;# A trivial CPU-bound function
def sumn(n):
    total = 0
    while n &gt; 0:
        total += n
        n -= 1
    return total
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;This function just computes the sum of the first N integers in a really dumb way.  Here is an example:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;sumn(25000)&lt;/b&gt;
312512500
&gt;&gt;&gt; &lt;b&gt;timeit('sumn(25000)','from __main__ import sumn',number=1000)&lt;/b&gt;
4.500338077545166
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;As you can see, summing the first 25000 integers 1000 times takes about 4.5 seconds (4.5 msec to do it just once).  Remember that number--we'll return to it shortly.&lt;/p&gt;&lt;p&gt;Next, we need to mix in some I/O.  Let's write a really simple multithreaded TCP server that turns the above function into an internet service.   This code is just a standard threaded server that uses none of our magic (yet).&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;# sumserver.py
# 
# A server that computes sum of n integers

from socket import *
from threading import Thread

# CPU-bound function
def sumn(n):
    total = 0
    while n &gt; 0:
        total += n
        n -= 1
    return total

# Function that handles clients
def serve_client(c):
    n = int(c.recv(16))
    result = sumn(n)
    c.send(str(result).encode('ascii'))
    c.close()

# Plain-old threaded server
def run_server(addr):
    s = socket(AF_INET, SOCK_STREAM)
    s.setsockopt(SOL_SOCKET, SO_REUSEADDR,1)
    s.bind(addr)
    s.listen(5)
    while True:
        c,a = s.accept()
        thr = Thread(target=serve_client,args=(c,))
        thr.daemon = True
        thr.start()
        
if __name__ == '__main__':
    run_server(("",10000))
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Finally, let's write a test client program that can be used to make a bunch of requests and time the performance.&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;# sumclient.py
from socket import *
def run_client(addr,repetitions,n):
    while repetitions &gt; 0:
        s = socket(AF_INET, SOCK_STREAM)
        s.connect(addr)
        s.send(str(n).encode('ascii'))
        resp = s.recv(128)
        s.close()
        repetitions -= 1

if __name__ == '__main__':
    import sys
    import time
    import threading

    ADDR = ("",10000)
    REPETITIONS = 1000
    N = 25000

    nclients = int(sys.argv[1])
    requests_per_client = REPETITIONS//nclients

    # Launch a set of client threads to make requests and time them
    thrs = [threading.Thread(target=run_client,args=(ADDR,requests_per_client,N)) for n in range(nclients)]
    start = time.time()
    for t in thrs:
        t.start()
    for t in thrs:
        t.join()
    end = time.time()
    print("%d total requests" % (nclients*requests_per_client))
    print(end-start)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;This client simply initiates 1000 requests with the server, using different numbers of threads. Let's run the server and try the client with different numbers of threads.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;bash-3.2$ &lt;b&gt;python sumclient.py 1&lt;/b&gt;         # 1 client thread
1000 total requests
4.34612298012
bash-3.2$ &lt;b&gt;python sumclient.py 2&lt;/b&gt;         # 2 client threads
1000 total requests
7.81390690804
bash-3.2$ &lt;b&gt;python sumclient.py 4&lt;/b&gt;         # 4 client threads
1000 total requests
9.5317029953
bash-3.2$ &lt;b&gt;python sumclient.py 8&lt;/b&gt;         # 8 client threads
1000 total requests
10.2061738968
bash-3.2$ 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Observe that with only 1 client thread, the performance of the server is comparable to the performance of &lt;tt&gt;timeit()&lt;/tt&gt;.  Making 1000 requests takes about 4.3 seconds (in fact, it seems to be a little faster).  However, if we start increasing the concurrency, the performance degrades fast.   With 4 client threads, the same 1000 requests already take more than twice as long (about 9.5 seconds).  This is not a surprise--we already know that Python threads have problems with CPU-bound processing.&lt;/p&gt;&lt;p&gt;&lt;b&gt;A Modified Example (Using yield)&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Let's modify the server code to use our new &lt;tt&gt;YieldableThread&lt;/tt&gt; object.  Here is the code:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;# ysumserver.py
# 
# A server that computes sum of n integers (using yieldable threads)

from socket import *
from ythr import YieldableThread

# CPU-bound function (unmodified)
def sumn(n):
    total = 0
    while n &gt; 0:
        total += n
        n -= 1
    return total

# Function that handles clients
def serve_client(c):
    n = int(c.recv(16))
    result = yield sumn, (n,)               # Notice use of yield
    c.send(str(result).encode('ascii'))
    c.close()

# Threaded server that uses yieldable threads. Note extra cpu_apply
# argument that allows a user-defined apply() function to be passed
def run_server(addr,cpu_apply=None):
    s = socket(AF_INET, SOCK_STREAM)
    s.setsockopt(SOL_SOCKET, SO_REUSEADDR,1)
    s.bind(addr)
    s.listen(5)
    while True:
        c,a = s.accept()
        thr = YieldableThread(target=serve_client,args=(c,),cpu_apply=cpu_apply)
        thr.daemon = True
        thr.start()
        
if __name__ == '__main__':
    run_server(("",10000))
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Observe that this version of the code is only slightly modified. &lt;/p&gt;&lt;p&gt;By default, yieldable threads should have performance comparable to normal threads.  Try the client again with this new server:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;bash-3.2$ &lt;b&gt;python sumclient.py 1&lt;/b&gt;
1000 total requests
4.95635294914
bash-3.2$ &lt;b&gt;python sumclient.py 2&lt;/b&gt;
1000 total requests
7.82525205612
bash-3.2$ &lt;b&gt;python sumclient.py 4&lt;/b&gt;
1000 total requests
9.25957417488
bash-3.2$ &lt;b&gt;python sumclient.py 8&lt;/b&gt;
1000 total requests
9.95880198479
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Yep, the same lousy performance as before.  So, where is this going?&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;b&gt;Add Some Special Magic&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Recall that yieldable threads allow the user to pass in their own custom &lt;tt&gt;apply()&lt;/tt&gt; function for performing CPU-bound processing.  That's where the magic enters the picture.&lt;/p&gt;&lt;p&gt;Let's write a new apply function and try running the server again.  Try this one:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# ysumserver.py
...
# A locked version of apply that only allows one thread to run
from threading import Lock
_apply_lock = Lock()
def exclusive_apply(func,args=(),kwargs={}):
    with _apply_lock:
         return func(*args,**kwargs)

if __name__ == '__main__':
    run_server(("",10000),cpu_apply=exclusive_apply)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Let's try our client with this new server:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;bash-3.2$ &lt;b&gt;python sumclient.py 1&lt;/b&gt;
1000 total requests
4.55530810356
bash-3.2$ &lt;b&gt;python sumclient.py 2&lt;/b&gt;
1000 total requests
5.75427007675
bash-3.2$ &lt;b&gt;python sumclient.py 4&lt;/b&gt;
1000 total requests
5.75416207314
bash-3.2$ &lt;b&gt;python sumclient.py 8&lt;/b&gt;
1000 total requests
5.81962108612
bash-3.2$ 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Wow! Look at the change for the threaded clients.  When running with 8 threads, this new server serves requests about 1.7x faster.  No code modifications were made to the server--only a different specification of the &lt;tt&gt;apply()&lt;/tt&gt; function.&lt;/p&gt;&lt;p&gt;How is this possible you ask?   Well, if you recall from my &lt;a href="http://www.dabeaz.com/GIL"&gt;GIL talk&lt;/a&gt;, CPU-bound threads tend to fight with each other on certain multicore machines.  By putting that lock in the apply function, threads aren't allowed to fight anymore (only one gets to run CPU-intensive work at once).  Again, keep in mind that the work in this example only takes about 4.5 milliseconds--we're getting a nice speedup even though none of the threads are running in the apply function for very long.&lt;/p&gt;&lt;p&gt;Here is another more interesting example.  Let's farm the CPU-intensive work out to a multiprocessing pool. Change the server slightly:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;# ysumserver.py
...
if __name__ == '__main__':
    import multiprocessing
    pool = multiprocessing.Pool()
    run_server(("",10000),cpu_apply=pool.apply)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Now, let's try our client again.&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;bash-3.2$ &lt;b&gt;python sumclient.py 1&lt;/b&gt;
1000 total requests
4.50634002686
bash-3.2$ &lt;b&gt;python sumclient.py 2&lt;/b&gt;
1000 total requests
2.29651284218
bash-3.2$ &lt;b&gt;python sumclient.py 4&lt;/b&gt;
1000 total requests
1.45105290413
bash-3.2$ &lt;b&gt;python sumclient.py 8&lt;/b&gt;
1000 total requests
1.59892106056
bash-3.2$                      
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Hey, look at that--the performance is actually getting better!   For example, the performance with 4 client threads is more than 3 times faster than with just one thread.  This is because the CPU-intensive work is now being handled on multiple cores through the use of the multiprocessing module.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Wrapping up (for now)&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Since this post is already getting long, I'm going to wrap it up.  However, let's conclude by revisiting a previous bit of code.  In our server, we defined a client handler function like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;def serve_client(c):
    n = int(c.recv(16))
    result = yield sumn, (n,)          
    c.send(str(result).encode('ascii'))
    c.close()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;In this code, there are no dependencies on any libraries or special objects. In fact, all it does is spit out a bit of CPU-bound processing with the &lt;tt&gt;yield&lt;/tt&gt; statement.  Behind the scenes, the &lt;tt&gt;YieldableThread&lt;/tt&gt; object is free to take this work and do whatever it wants with it. For example, it could run it in a special environment, pass it to the multiprocessing module, send it out to the cloud, etc.  I think that's kind of cool.&lt;/p&gt;&lt;p&gt;Of course, at this point, you might be asking yourself, "what can possibly go wrong?"  To answer that, you'll have to wait for the next installment.   However, as a preview, I'll just say that the answer is "a lot!"&lt;/p&gt;&lt;p&gt;&lt;b&gt;Postscript&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;All of my tests were performed using Python 2.7 running on a 4-core Mac Pro (2 x 2.66 GHz, Dual-Core Intel Xeon) running OS X version 10.6.4.&lt;/p&gt;&lt;p&gt;Although I've never seen generators used quite like this before, I don't want to steal anyone's thunder--if you are aware of prior work, please send me a link so I can post it here.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Additional Postscript&lt;/b&gt;&lt;/p&gt;&lt;p&gt;If you like messing around with concurrency, distributed systems, and other neat things, then you would probably like the &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python Networks, Concurrency, and Distributed Systems&lt;/a&gt; course I'm running in Chicago.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-6379636000702059811?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/6379636000702059811/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=6379636000702059811' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/6379636000702059811'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/6379636000702059811'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/07/yieldable-threads-part-1.html' title='Yieldable Threads (Part 1)'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-3462548727484936961</id><published>2010-05-18T18:28:00.000-07:00</published><updated>2010-05-18T18:28:36.601-07:00</updated><title type='text'>Python Classes for Summer 2010</title><content type='html'>After teaching a variety of Python classes on the road in the US, Europe, and India, I'm back in Chicago to enjoy summer in the city. You should come to Chicago to enjoy the lakefront, summer festivals, and take one of my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python classes&lt;/a&gt;.   Here's a brief preview:&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Python Networks and Distributed Systems, June 21-23, 2010.&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;This 3-day course is actually a greatly expanded version of my infamous Python Concurrency Workshop from 2009.  It includes a wide mix of topics from network programming, concurrent programming, and distributed computing. Over the past year, I've been refining and improving the material.  Ultimately, I might turn it into a new Python book, but if you want an early cutting-edge preview, you should come to the class.&lt;br /&gt;
&lt;/p&gt;&lt;b&gt;Jamming with Django, June 24, 2010.&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;I'll freely admit that when it comes to Django, I don't know much beyond the fact that it's written in Python.   So, if you're like me and you'd like to learn more about it, you should come to this one-day course, taught by Chicago-area Django developers Chad Glendenin and Rodrigo Guzman.  I'm both hosting and taking the course.&lt;/p&gt;&lt;b&gt;Introduction to Python Programming, July 13-15, 2010.&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;This is the standard 3-day Python programming class that I've taught at numerous on-site locations around the world.  If you've programmed before, but never used Python, then this is the class that you want.   Even if you've been programming Python already, you'll learn a variety of new tricks.&lt;/p&gt;&lt;b&gt;Advanced Python Mastery, August 17-19, 2010.&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;If you've been programming Python for a while, but want to take your knowledge to a whole new level, this might be the course for you.   In short, this course goes far beyond what you find in a basic tutorial and focuses on some of Python's most advanced features.   This includes metaprogramming, generators, coroutines, C extensions, performance optimization, inner workings of the object system, and more.&lt;/p&gt;&lt;p&gt;If you want more information about the courses, go &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;here&lt;/a&gt;. Hopefully I'll see you in Chicago!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-3462548727484936961?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/3462548727484936961/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=3462548727484936961' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3462548727484936961'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3462548727484936961'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/05/python-classes-for-summer-2010.html' title='Python Classes for Summer 2010'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-7659926691328912682</id><published>2010-04-21T17:38:00.001-07:00</published><updated>2010-04-21T17:39:41.922-07:00</updated><title 
type='text'>This blog has moved</title><content type='html'>
       This blog is now located at http://dabeaz.blogspot.com/.
       You will be automatically redirected in 30 seconds, or you may click &lt;a href='http://dabeaz.blogspot.com/'&gt;here&lt;/a&gt;.

       For feed subscribers, please update your feed subscriptions to
       http://dabeaz.blogspot.com/feeds/posts/default.
  &lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-7659926691328912682?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://dabeaz.blogspot.com/' title='This blog has moved'/><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/7659926691328912682/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=7659926691328912682' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7659926691328912682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7659926691328912682'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/04/this-blog-has-moved.html' title='This blog has moved'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-7873195459855819841</id><published>2010-02-26T12:56:00.000-08:00</published><updated>2010-03-12T07:27:30.574-08:00</updated><title type='text'>Upcoming Python Training Classes</title><content type='html'>&lt;p&gt;Please forgive the brief commercial interruption.  I'd just like to plug a few of my upcoming Python training classes--yes, if you must know, this is how I pay the bills so that I can spend the rest of my time thinking about the &lt;a href="http://www.dabeaz.com/GIL"&gt;GIL&lt;/a&gt; and other diabolical Python-related topics.&lt;/p&gt;&lt;p&gt;&lt;b&gt;New! 
Python Mastery Bootcamp, April 12-16, 2010 (Atlanta)&lt;/b&gt;&lt;/p&gt;&lt;p&gt;First, I'm pleased to announce a brand-new Python course that I'm offering for the first time at &lt;a href="http://www.bignerdranch.com"&gt;Big Nerd Ranch&lt;/a&gt; in Atlanta.   The &lt;a href="http://bignerdranch.com/classes/python.shtml"&gt;Python Mastery Bootcamp&lt;/a&gt; might be the ultimate Python tutorial for programmers who already know the basics of Python, but who want to take their understanding of the language to a whole new level.  Over the past few years, I have given a number of well-reviewed PyCon tutorials on advanced topics such as &lt;a href="http://www.dabeaz.com/generators"&gt;Generator Tricks for Systems Programmers&lt;/a&gt;, &lt;a href="http://www.dabeaz.com/coroutines"&gt;A Curious Course on Coroutines and Concurrency&lt;/a&gt;, and most recently &lt;a href="http://www.dabeaz.com/python3io"&gt;Mastering Python 3 I/O&lt;/a&gt;. Well, the Mastery Bootcamp is sort of similar except that it lasts 5 days, it covers far more material (network programming, threads, multiprocessing, asynchronous I/O, functional programming, metaprogramming, distributed computing, C extensions, etc.), and it has more hands-on projects that allow the material to be explored in greater depth than at a conference. &lt;/p&gt;&lt;p&gt;The experience at Big Nerd Ranch is quite unique--for 5 days, you will be completely immersed in Python programming without the annoyance of outside distractions.  This makes it the perfect environment to interact with other class participants and to really focus on the course material.  There's really nothing quite like it in the training world--you won't be disappointed.&lt;/p&gt;&lt;p&gt;&lt;b&gt;March 12, 2010 Update!&lt;/b&gt; The Mastery Bootcamp is confirmed to run and there are still a few slots available.  It's going to be a great experience for anyone who wants to learn enough about Python to be dangerous.&lt;/p&gt;&lt;br /&gt;
&lt;p&gt;&lt;b&gt;Introduction to Python Programming, March 16-18, 2010 (Chicago)&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;If you're relatively new to Python and want to master the fundamentals, consider coming to my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Introduction to Python Programming&lt;/a&gt; class in Chicago.   This course is aimed at programmers, system administrators, scientists, and engineers who want to apply Python to everyday tasks such as analyzing data files, automating system tasks, scraping web pages, using databases, and more.  Through practical examples, you will learn all of the major features of Python including data handling, functions, modules, classes, generators, testing, and more.   This is a highly refined class that has been taught for numerous corporate and government clients over the past three years.  The class features a 300 page fully indexed course guide and more than 50 hands-on exercises.&lt;/p&gt;&lt;p&gt;My Chicago classes are also taught in a rather unique format.  Unlike a typical corporate training course, I conduct the course in a round-table format that is strictly limited to 6 attendees--a size that encourages interaction and allows course topics to be easily customized to your interests.   The course is located in Chicago's distinctive Andersonville neighborhood where just steps away, you will find dozens of unique restaurants, bakeries, coffee houses, pubs, and more.  You're definitely going to like it!&lt;/p&gt;&lt;p&gt;&lt;b&gt;March 12, 2010 update!&lt;/b&gt; The Chicago class is now sold out.  
However, be on the lookout for its return in a few months.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-7873195459855819841?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/7873195459855819841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=7873195459855819841' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7873195459855819841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7873195459855819841'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/02/upcoming-python-training-classes.html' title='Upcoming Python Training Classes'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-8935811723349955313</id><published>2010-02-22T13:47:00.000-08:00</published><updated>2010-02-22T13:47:27.180-08:00</updated><title type='text'>Revisiting thread priorities and the new GIL</title><content type='html'>&lt;p&gt;Well, PyCon is over and it's time to get back to work.  First, I'd just like to thank everyone who came to my &lt;a href="http://www.dabeaz.com/python/UnderstandingGIL.pdf"&gt;GIL Talk&lt;/a&gt; and participated in all of the discussion that followed.    
It was almost as if part of PyCon had turned into a mini operating systems conference!&lt;/p&gt;&lt;p&gt;This post is a followup to the GIL open space at PyCon where we looked at the new GIL and explored the possibility of introducing thread priorities.  For those of you not at PyCon, the open space was attended by about 30-40 people and included Guido, Antoine Pitrou, and a large number of systems hackers, some of whom had previously worked on thread library implementations and operating system kernels.&lt;/p&gt;&lt;p&gt;First, a little background.   As you might know, Antoine Pitrou implemented a new Python GIL that is currently only available in the Python 3.2 development branch (you can obtain it via subversion).   This new GIL is described in his &lt;a href="http://mail.python.org/pipermail/python-dev/2009-October/093321.html"&gt;original mailing list post&lt;/a&gt; as well as the &lt;a href="http://www.dabeaz.com/python/UnderstandingGIL.pdf"&gt;slides&lt;/a&gt; for my PyCon talk.  You should read those first if you haven't already.&lt;/p&gt;&lt;p&gt;Right before PyCON, I discovered an I/O performance problem with the new GIL that is related to CPU-bound threads stalling the progress of I/O bound threads, which in turn leads to a severe performance degradation of I/O bandwidth and response time.  This is described in &lt;a href="http://bugs.python.org/issue7946"&gt;Issue 7946 : Convoy effect with I/O bound threads and New GIL&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In the bug report, I submitted a very simple test case that illustrated the problem.  However, here is a more refined experiment that you can try.  The following program, &lt;tt&gt;iotest.py&lt;/tt&gt;, contains both CPU-bound threads and an I/O server thread that echoes UDP packets.  It is meant to study the case in which CPU-processing and I/O processing are overlapped. &lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# iotest.py

import time
import threading
from socket import *
import itertools

def task_pidigits():
    """Pi calculation (Python)"""
    _map = map
    _count = itertools.count
    _islice = itertools.islice

    def calc_ndigits(n):
        # From http://shootout.alioth.debian.org/
        def gen_x():
            return _map(lambda k: (k, 4*k + 2, 0, 2*k + 1), _count(1))

        def compose(a, b):
            aq, ar, as_, at = a
            bq, br, bs, bt = b
            return (aq * bq,
                    aq * br + ar * bt,
                    as_ * bq + at * bs,
                    as_ * br + at * bt)

        def extract(z, j):
            q, r, s, t = z
            return (q*j + r) // (s*j + t)

        def pi_digits():
            z = (1, 0, 0, 1)
            x = gen_x()
            while 1:
                y = extract(z, 3)
                while y != extract(z, 4):
                    z = compose(z, next(x))
                    y = extract(z, 3)
                z = compose((10, -10*y, 0, 1), z)
                yield y

        return list(_islice(pi_digits(), n))

    return calc_ndigits, (50, )

def spin():
    task, args = task_pidigits()
    while True:
        r = task(*args)

def echo_server():
    s = socket(AF_INET, SOCK_DGRAM)
    s.setsockopt(SOL_SOCKET, SO_REUSEADDR,1)
    s.bind(("",16000))
    while True:
        msg, addr = s.recvfrom(16384)
        s.sendto(msg,addr)  

# Launch threads (adjust the number to see different results)
NUMTHREADS = 1
for n in range(NUMTHREADS):
    t = threading.Thread(target=spin)
    t.daemon = True
    t.start()

# Run the echo server in the main thread
echo_server()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Next, here is a client program, &lt;tt&gt;ioclient.py&lt;/tt&gt;, that simply measures the time it takes to echo 10MB of data to the server in the &lt;tt&gt;iotest.py&lt;/tt&gt; program.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# ioclient.py
from socket import *
import time

CHUNKSIZE = 8192
NUMMESSAGES = 1280     # Total of 10MB

# Dummy message
msg = b"x"*CHUNKSIZE

# Connect and send messages
s = socket(AF_INET,SOCK_DGRAM)
start = time.time()
for n in range(NUMMESSAGES):
    s.sendto(msg,("",16000))
    msg, addr = s.recvfrom(65536)
end = time.time()
print("%0.3f seconds (%0.3f bytes/sec)" % (end-start, (CHUNKSIZE*NUMMESSAGES)/(end-start)))
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If you run &lt;tt&gt;iotest.py&lt;/tt&gt; on a dual-core Macbook with only 1 spin() thread and then run &lt;tt&gt;ioclient.py&lt;/tt&gt;, you get the following result:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Python 3.2 (New GIL) : 9.166 seconds (1143998.140 bytes/sec)&lt;br /&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;It works, but it's hardly impressive (just barely over 1MB/sec transfer rate between two processes?).  However, if you give the server two spin() threads, the performance gets much worse:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Python 3.2 (New GIL) : 28.064 seconds (373642.858 bytes/sec)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Now to further complicate matters, if you disable all but one of the CPU cores, you get this inexplicable result:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Python 3.2 (New GIL, 1 CPU) : 0.297 seconds (35326299.028 bytes/sec)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Needless to say, there are many bizarre things going on here.   The most significant effect is that on multiple cores, it is very easy for CPU-bound threads to reacquire the GIL whenever an I/O bound thread performs I/O.   This means that CPU-threads have a greater tendency to hog the GIL.&lt;/p&gt;&lt;p&gt;At PyCON, I did some experiments with thread priorities and a modified GIL that adjusted priorities in a manner similar to what you find with multilevel feedback queues in operating systems.  Namely:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;If a thread is forced to give up the GIL due to a timeout, it is penalized with lower priority.&lt;/li&gt;
&lt;li&gt;If a thread voluntarily gives up the GIL because it performed I/O, it is rewarded with higher priority.&lt;/li&gt;
&lt;li&gt;High-priority threads always preempt low-priority threads.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The results of this approach were impressive.   If you run the same tests with priorities on 2 CPU cores, you get this result:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Python 3.2 (New GIL with priorities), 0.298 seconds (35156921.564 bytes/sec)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The prioritized GIL also gives good performance for Antoine's own &lt;tt&gt;ccbench.py&lt;/tt&gt; benchmark.&lt;/p&gt;&lt;table border=1 cellspacing=15&gt;&lt;tr&gt;
&lt;th&gt;New GIL&lt;/th&gt;&lt;th&gt;New GIL with priorities&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;br /&gt;
&lt;pre&gt;== CPython 3.2a0.0 (py3k:78250) ==
== i386 Darwin on 'i386' ==

--- Throughput ---

Pi calculation (Python)

threads=1: 873 iterations/s.
threads=2: 845 ( 96 %)
threads=3: 837 ( 95 %)
threads=4: 820 ( 93 %)

regular expression (C)

threads=1: 348 iterations/s.
threads=2: 339 ( 97 %)
threads=3: 328 ( 94 %)
threads=4: 317 ( 91 %)

bz2 compression (C)

threads=1: 367 iterations/s.
threads=2: 655 ( 178 %)
threads=3: 642 ( 174 %)
threads=4: 646 ( 175 %)

--- Latency ---

Background CPU task: Pi calculation (Python)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 5 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 2 ms.)
CPU threads=3: 138 ms. (std dev: 100 ms.)
CPU threads=4: 132 ms. (std dev: 99 ms.)

Background CPU task: regular expression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 6 ms. (std dev: 1 ms.)
CPU threads=2: 6 ms. (std dev: 6 ms.)
CPU threads=3: 6 ms. (std dev: 4 ms.)
CPU threads=4: 10 ms. (std dev: 8 ms.)

Background CPU task: bz2 compression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 1 ms.)
CPU threads=2: 0 ms. (std dev: 0 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)

&lt;/pre&gt;&lt;/td&gt;
&lt;td&gt;&lt;br /&gt;
&lt;pre&gt;== CPython 3.2a0.0 (py3k:78215M) ==
== i386 Darwin on 'i386' ==

--- Throughput ---

Pi calculation (Python)

threads=1: 885 iterations/s.
threads=2: 860 ( 97 %)
threads=3: 869 ( 98 %)
threads=4: 859 ( 97 %)

regular expression (C)

threads=1: 362 iterations/s.
threads=2: 358 ( 98 %)
threads=3: 349 ( 96 %)
threads=4: 354 ( 97 %)

bz2 compression (C)

threads=1: 373 iterations/s.
threads=2: 654 ( 175 %)
threads=3: 649 ( 173 %)
threads=4: 638 ( 170 %)

--- Latency ---

Background CPU task: Pi calculation (Python)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 0 ms. (std dev: 2 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

Background CPU task: regular expression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 2 ms. (std dev: 1 ms.)
CPU threads=2: 3 ms. (std dev: 3 ms.)
CPU threads=3: 2 ms. (std dev: 1 ms.)
CPU threads=4: 2 ms. (std dev: 2 ms.)

Background CPU task: bz2 compression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 1 ms.)
CPU threads=2: 0 ms. (std dev: 1 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

&lt;/pre&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;&lt;p&gt;The overall outcome of the GIL open space was that having a priority mechanism was probably a good idea.  However, a lot of people wanted to study the problem in more detail and to think about different possible implementations.   I am posting the following tar file that has my own modifications to the GIL used for the above benchmarks:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.dabeaz.com/python/prioritygil.tar.gz"&gt;prioritygil.tar.gz&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Note: This tar file has all of the modified files in the Python 3.2 source (pystate.h, pystate.c, and ceval_gil.h) along with the io testing benchmark.    Be advised that this patch is only intended for further study by others---it's kind of hacked together and really only a proof of concept implementation of one possible priority scheme.   A real implementation would still need to address some issues not covered in my patch (e.g., starvation effects).&lt;/p&gt;&lt;p&gt;Due to other time commitments, I'm not going to be able to do much followup with this patch at the moment.  However, I do want to encourage others to at least consider the benefit of introducing thread priorities and to explore different possible implementations.  Initial results seem to indicate that this can fix the GIL for both CPU-bound threads and
I/O performance.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-8935811723349955313?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/8935811723349955313/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=8935811723349955313' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8935811723349955313'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8935811723349955313'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/02/revisiting-thread-priorities-and-new.html' title='Revisiting thread priorities and the new GIL'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4485812278500153019</id><published>2010-02-02T18:12:00.000-08:00</published><updated>2010-02-02T18:12:46.189-08:00</updated><title type='text'>A function that works as a context manager and a decorator</title><content type='html'>&lt;p&gt;As a followup to my last blog post on timings, I present the following function which works as both a decorator and a context manager.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# timethis.py
import time
from contextlib import contextmanager

def timethis(what):
    @contextmanager
    def benchmark():
        start = time.time()
        yield
        end = time.time()
        print("%s : %0.3f seconds" % (what, end-start))
    if hasattr(what,"__call__"):
        def timed(*args,**kwargs):
            with benchmark():
                return what(*args,**kwargs)
        return timed
    else:
        return benchmark()
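
# A possible refinement, sketched under two assumptions that go beyond the
# listing above: the timing should still be printed when the timed code
# raises, and a decorated function should keep its own name.  The name
# timethis_safe is hypothetical.
import functools
import time
from contextlib import contextmanager

def timethis_safe(what):
    @contextmanager
    def benchmark(label):
        start = time.time()
        try:
            yield
        finally:
            # Runs whether or not the timed block raised an exception
            end = time.time()
            print("%s : %0.3f seconds" % (label, end-start))
    if hasattr(what,"__call__"):
        @functools.wraps(what)
        def timed(*args,**kwargs):
            with benchmark(what.__name__):
                return what(*args,**kwargs)
        return timed
    else:
        return benchmark(what)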
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Here is a short demonstration of how it works:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# Usage as a context manager
with timethis("iterate by lines (UTF-8)"):
     for line in open("biglog.txt",encoding='utf-8'):
          pass

# Usage as a decorator
@timethis
def iterate_by_lines_latin_1():
    for line in open("biglog.txt",encoding='latin-1'):
        pass

iterate_by_lines_latin_1()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If you run it, you'll get output like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;bash % &lt;b&gt;python3 timethis.py&lt;/b&gt;
iterate by lines (UTF-8) : 3.762 seconds
&amp;lt;function iterate_by_lines_latin_1 at 0x100537958&gt; : 3.513 seconds
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Naturally, this bit of code would be a good thing to bring into your next code review just to make sure people are actually paying attention.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4485812278500153019?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4485812278500153019/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4485812278500153019' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4485812278500153019'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4485812278500153019'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/02/function-that-works-as-context-manager.html' title='A function that works as a context manager and a decorator'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-8673452591802794299</id><published>2010-02-02T05:01:00.000-08:00</published><updated>2010-02-02T05:01:46.052-08:00</updated><title type='text'>A Context Manager for Timing Benchmarks</title><content type='html'>&lt;p&gt;I spend a lot of time studying different aspects of Python, different implementation techniques, and so forth.  As part of that, I often carry out little performance benchmarks.  
For small things, I'll often use the &lt;a href="http://docs.python.org/library/timeit"&gt;timeit&lt;/a&gt; module.  For example:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;from timeit import timeit&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;timeit("math.sin(2)","import math")&lt;/b&gt;
0.29826998710632324
&gt;&gt;&gt; &lt;b&gt;timeit("sin(2)","from math import sin")&lt;/b&gt;
0.21983098983764648
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;However, for larger blocks of code, I tend to just use the &lt;tt&gt;time&lt;/tt&gt; module directly like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;import time
start = time.time()
...
... &lt;em&gt;some big calculation&lt;/em&gt;
...
end = time.time()
print("Whatever : %0.3f seconds" % (end-start))
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Having typed that particular code template more often than I care to admit, it occurred to me that I really ought to just make a context manager for doing it.  Something like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# benchmark.py
import time
class benchmark(object):
    def __init__(self,name):
        self.name = name
    def __enter__(self):
        self.start = time.time()
    def __exit__(self,ty,val,tb):
        end = time.time()
        print("%s : %0.3f seconds" % (self.name, end-self.start))
        return False
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Now, I can just use that context manager whenever I want to do that kind of timing benchmark.  For example:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# fileitertest.py
from benchmark import benchmark

with benchmark("iterate by lines (UTF-8)"):
     for line in open("biglog.txt",encoding='utf-8'):
          pass

with benchmark("iterate by lines (Latin-1)"):
     for line in open("biglog.txt",encoding='latin-1'):
         pass

with benchmark("iterate by lines (Binary)"):
     for line in open("biglog.txt","rb"):
         pass
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If you run it, you might get output like this:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;bash % &lt;b&gt;python3 fileitertest.py&lt;/b&gt;
iterate by lines (UTF-8) : 3.903 seconds
iterate by lines (Latin-1) : 3.615 seconds
iterate by lines (Binary) : 1.886 seconds
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Nice. I like it already!&lt;br /&gt;
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-8673452591802794299?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/8673452591802794299/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=8673452591802794299' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8673452591802794299'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8673452591802794299'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/02/context-manager-for-timing-benchmarks.html' title='A Context Manager for Timing Benchmarks'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-2082133620075059646</id><published>2010-01-28T19:02:00.000-08:00</published><updated>2010-01-29T03:19:04.135-08:00</updated><title type='text'>A few useful bytearray tricks</title><content type='html'>&lt;p&gt;When I first saw the new Python 3 &lt;tt&gt;bytearray&lt;/tt&gt; object (also back-ported to Python 2.6), I wasn't exactly sure what to make of it.  On the surface, it seemed like a kind of mutable 8-bit string (a feature sometimes requested by users of Python 2).  For example:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;s = bytearray(b"Hello World")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s[:5] = b"Cruel"&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s&lt;/b&gt;
bytearray(b'Cruel World')
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;On the other hand, there are aspects of &lt;tt&gt;bytearray&lt;/tt&gt; objects that are completely unlike a string.  For example, if you iterate over a bytearray, you get integer byte values:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;s = bytearray(b"Hello World")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;for c in s: print(c)&lt;/b&gt;
...
72
101
108
108
111
32
87
111
114
108
100
&gt;&gt;&gt; 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Similarly, indexing operations are tied to integers:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;s[1]&lt;/b&gt;
101
&gt;&gt;&gt; &lt;b&gt;s[1] = 97&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s[1] = b'a'&lt;/b&gt;
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
TypeError: an integer is required
&gt;&gt;&gt; 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Finally, there's the fact that &lt;tt&gt;bytearray&lt;/tt&gt; instances have most of the methods associated with strings as well as methods associated with lists.  For example:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;s.split()&lt;/b&gt;
[bytearray(b'Hello'), bytearray(b'World')]
&gt;&gt;&gt; &lt;b&gt;s.append(33)&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s&lt;/b&gt;
bytearray(b'Hello World!')
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Although documentation on bytearrays describes these features, it is a little light on meaningful use cases.  Needless to say, if you have too much spare time (sic) on your hands, this is the kind of thing that you start to think about.  So, I thought I'd share three practical uses of bytearrays.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;b&gt;Example 1: Assembling a message from fragments&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Suppose you're writing some network code that is receiving a large message on a socket connection.  If you know about sockets, you know that the &lt;tt&gt;recv()&lt;/tt&gt; operation doesn't wait for all of the data to arrive.  Instead, it merely returns what's currently available in the system buffers.  Therefore, to get all of the data, you might write code that looks like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# remaining = number of bytes being received (determined already)
msg = b""
while remaining &gt; 0:
    chunk = s.recv(remaining)    # Get available data
    msg += chunk                 # Add it to the message
    remaining -= len(chunk)  
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;The only problem with this code is that concatenation (+=) has horrible performance.  Therefore, a common performance optimization in Python 2 is to collect all of the chunks in a list and perform a join when you're done.  Like this:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# remaining = number of bytes being received (determined already)
msgparts = []
while remaining &gt; 0:
    chunk = s.recv(remaining)    # Get available data
    msgparts.append(chunk)       # Add it to list of chunks
    remaining -= len(chunk)  
msg = b"".join(msgparts)          # Make the final message
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Now, here's a third solution using a &lt;tt&gt;bytearray&lt;/tt&gt;:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# remaining = number of bytes being received (determined already)
msg = bytearray()
while remaining &gt; 0:
    chunk = s.recv(remaining)    # Get available data
    msg.extend(chunk)            # Add to message
    remaining -= len(chunk)  
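
# A related sketch, assuming the total size is known up front: preallocate
# the bytearray and let recv_into() fill it in place through a memoryview,
# so no intermediate chunk objects are created.  recv_exact is a
# hypothetical helper name, not something from the socket module.
def recv_exact(sock, nbytes):
    msg = bytearray(nbytes)
    view = memoryview(msg)
    pos = 0
    while pos != nbytes:
        n = sock.recv_into(view[pos:])   # Number of bytes actually received
        if n == 0:
            raise EOFError("connection closed before the full message arrived")
        pos += n
    return msg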
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Notice how the bytearray version is really clean.  You don't collect parts in a list and you don't perform that cryptic join at the end. Nice.&lt;/p&gt;&lt;p&gt;Of course, the big question is whether or not it performs.  To test this out, I first made a list of small byte fragments like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;chunks = [b"x"*16]*512
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;I then used the &lt;tt&gt;timeit&lt;/tt&gt; module to compare the following two code fragments:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# Version 1
msgparts = []
for chunk in chunks:
    msgparts.append(chunk)
msg = b"".join(msgparts)

# Version 2
msg = bytearray()
for chunk in chunks:
    msg.extend(chunk)
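
# One way to run such a comparison with timeit (a sketch: the function
# names and the number=1000 repeat count are illustrative assumptions,
# not the setup that produced the timings quoted in the text).
from timeit import timeit

chunks = [b"x"*16]*512

def join_version():
    msgparts = []
    for chunk in chunks:
        msgparts.append(chunk)
    return b"".join(msgparts)

def bytearray_version():
    msg = bytearray()
    for chunk in chunks:
        msg.extend(chunk)
    return msg

print("join      :", timeit(join_version, number=1000))
print("bytearray :", timeit(bytearray_version, number=1000))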
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;When tested, version 1 of the code ran in 99.8s whereas version 2 ran in 116.6s (a version using += concatenation takes 230.3s by comparison).   So while performing a join operation is still faster, it's only faster by about 16%.   Personally, I think the cleaner programming of the bytearray version might make up for it.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;b&gt;Example 2: Binary record packing&lt;/b&gt;&lt;/p&gt;&lt;p&gt;This example is a slight twist on the last example.  Suppose you had a large Python list of integer (x,y) coordinates.  Something like this:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;points = [(1,2),(3,4),(9,10),(23,14),(50,90),...]
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Now, suppose you need to write that data out as a binary encoded file consisting of a 32-bit integer length followed by each point packed into a pair of 32-bit integers.  One way to do it would be to use the &lt;tt&gt;struct&lt;/tt&gt; module like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;import struct
f = open("points.bin","wb")
f.write(struct.pack("I",len(points)))
for x,y in points:
    f.write(struct.pack("II",x,y))
f.close()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;The only problem with this code is that it performs a large number of small &lt;tt&gt;write()&lt;/tt&gt; operations.  An alternative approach is to pack everything into a bytearray and only perform one write at the end.  For example:&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;import struct
f = open("points.bin","wb")
msg = bytearray()
msg.extend(struct.pack("I",len(points)))
for x,y in points:
    msg.extend(struct.pack("II",x,y))
f.write(msg)
f.close()
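
# A sketch of reading the file back, assuming the layout described above
# (a native-order 32-bit count followed by pairs of 32-bit integers).
# read_points is a hypothetical helper name.
import struct

def read_points(filename):
    with open(filename, "rb") as f:
        data = f.read()                  # One read, mirroring the one write
    (npoints,) = struct.unpack_from("I", data, 0)
    # Each point occupies 8 bytes, starting right after the 4-byte count
    return [struct.unpack_from("II", data, 4 + 8*n) for n in range(npoints)]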
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Sure enough, the version that uses &lt;tt&gt;bytearray&lt;/tt&gt; runs much faster.  In a simple timing test involving a list of 100000 points, it runs in about half the time of the version that makes a lot of small writes.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Example 3: Mathematical processing of byte values&lt;/b&gt;&lt;/p&gt;&lt;p&gt;The fact that bytearrays present themselves as arrays of integers makes it easier to perform certain kinds of calculations.   In a recent embedded systems project, I was using Python to communicate with a device over a serial port.  As part of the communications protocol, all messages had to be signed with a Longitudinal Redundancy Check (LRC) byte.  An LRC is computed by taking an XOR across all of the byte values.  &lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Bytearrays make such calculations easy.  Here's one version:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;message = bytearray(...)     # Message already created
lrc = 0
for b in message:
    lrc ^= b
message.append(lrc)          # Add to the end of the message
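
# A sketch of the receiving side (check_lrc is a hypothetical helper name):
# because the LRC is the XOR of all the message bytes, XOR-ing every byte
# of a signed message, LRC included, gives 0 when nothing was corrupted.
def check_lrc(signed):
    lrc = 0
    for b in bytearray(signed):
        lrc ^= b
    return lrc == 0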
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Here's a version that increases your job security:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;import functools
message.append(functools.reduce(lambda x,y:x^y,message))
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;And here's the same calculation in Python 2 without bytearrays:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;message = "..."       # Message already created
lrc = 0
for b in message:
    lrc ^= ord(b)
message += chr(lrc)        # Add the LRC byte
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Personally, I like the bytearray version.  There's no need to use &lt;tt&gt;ord()&lt;/tt&gt; and you can just append the result at the end of the message instead of using concatenation.&lt;/p&gt;&lt;p&gt;Here's another cute example.  Suppose you wanted to run a bytearray through a simple XOR-cipher.  Here's a one-liner to do it:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;key = 37&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;message = bytearray(b"Hello World")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s = bytearray(x ^ key for x in message)&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;s&lt;/b&gt;
bytearray(b'm@IIJ\x05rJWIA')
&gt;&gt;&gt; &lt;b&gt;bytearray(x ^ key for x in s)&lt;/b&gt;
bytearray(b'Hello World')
&gt;&gt;&gt; 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;b&gt;Final Comments&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Although some programmers might focus on bytearrays as a kind of mutable string, I find their use as an efficient means for assembling messages from fragments to be much more interesting.  That's because this kind of problem comes up a lot in the context of interprocess communication, networking, distributed computing, and other related areas.  Thus, knowing about bytearrays might lead to code that has good performance and is easy to understand. &lt;/p&gt;&lt;p&gt;That's it for this installment.  In case you're wondering, this topic is also related to my upcoming PyCON'2010 tutorial "Mastering Python 3 I/O."&lt;br /&gt;
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-2082133620075059646?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/2082133620075059646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=2082133620075059646' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2082133620075059646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/2082133620075059646'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/01/few-useful-bytearray-tricks.html' title='A few useful bytearray tricks'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-5519571958197782221</id><published>2010-01-26T20:36:00.000-08:00</published><updated>2010-01-28T07:12:58.615-08:00</updated><title type='text'>Reexamining Python 3 Text I/O</title><content type='html'>&lt;p&gt;&lt;b&gt;Note:&lt;/b&gt; Since I first posted this, I added a performance test using the Python 2.6.4 codecs module.  This addition is highlighted in &lt;font color="#ff0000"&gt;red&lt;/font&gt;.&lt;/p&gt;&lt;p&gt;When Python 3.0 was first released, I tried it out on a few things and walked away unimpressed.   By far, the big negative was the horrible I/O performance.  
For instance, scripts to perform simple data analysis tasks like processing a web server log file were running more than 30 times slower than Python 2.  Even though there were many new features of Python 3 to be excited about, the I/O performance alone was enough to make me not want to use it---or recommend it to anyone else for that matter.&lt;/p&gt;&lt;p&gt;Some time has passed since then.  For example, Python-3.1.1 is out and many improvements have been made.  To force myself to better understand the new Python 3 I/O system, I've been working on a tutorial &lt;a href="http://us.pycon.org/2010/tutorials/beazley_python3/"&gt;Mastering Python 3 I/O&lt;/a&gt; for the upcoming PyCON'2010 conference in Atlanta. Overall, I have to say that I'm pretty impressed with what I've found--and not just in terms of improved performance.&lt;/p&gt;&lt;p&gt;Due to space constraints, I can't talk about everything in my tutorial here.  However, I thought I would share some thoughts about text-based I/O in Python 3.1 and discuss a few examples.  Just as a disclaimer, I show a few benchmarks, but my intent is not to do a full study of every possible aspect of text I/O handling.  I would strongly advise you to download Python 3.1.1 and perform your own tests to get a better feel for it.&lt;/p&gt;&lt;p&gt;Like many people, one of my main uses of Python is data processing and parsing.  For example, consider the contents of a typical Apache web server log:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;75.54.118.139 - - [24/Feb/2008:00:15:42 -0600] "GET /favicon.ico HTTP/1.1" 404 133
75.54.118.139 - - [24/Feb/2008:00:15:49 -0600] "GET /software.html HTTP/1.1" 200 3163
75.54.118.139 - - [24/Feb/2008:00:16:10 -0600] "GET /ply/index.html HTTP/1.1" 200 8018
213.145.165.82 - - [24/Feb/2008:00:16:19 -0600] "GET /ply/ HTTP/1.1" 200 8018
...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Let's look at a simple script that processes this file.  For example, suppose you wanted to produce a list of all URLs that have generated a 404 error.   Here's a really simple (albeit hacky) script that does that:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;error_404_urls = set()
for line in open("access-log"):
    fields = line.split()
    if fields[-2] == '404':
        error_404_urls.add(fields[-4])

for name in error_404_urls:
    print(name)
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;On my machine, I have a 325MB log file consisting of 3649000 lines--a perfect candidate for performing a few benchmarks.   Here are the numbers that you get running the above script with different Python versions.  UCS-2 refers to Python compiled with 16-bit Unicode characters.  UCS-4 refers to Python compiled with 32-bit Unicode characters (the &lt;tt&gt;--with-wide-unicode&lt;/tt&gt; configuration option).  Also, in the interest of full disclosure, these tests were performed with a warm disk cache on a 2 GHz Intel Core 2 Duo Apple MacBook with 4GB of memory under OS X 10.6.2 (Snow Leopard).&lt;/p&gt;&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;th align="center"&gt;Python Version&lt;/th&gt;&lt;th align="center"&gt;Time (seconds)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;2.6.4&lt;/td&gt;&lt;td align="center"&gt;7.91s&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.0&lt;/td&gt;&lt;td align="center"&gt;125.42s&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.1.1 (UCS-2)&lt;/td&gt;&lt;td align="center"&gt;14.11s&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.1.1 (UCs-4)&lt;/td&gt;&lt;td align="center"&gt;17.32s&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;&lt;/blockquote&gt;&lt;p&gt;As you can see, Python 3.0 performance was an anomaly--the performance of Python 3.1.1 is substantially better (14.11s versus 125.42s).    To better understand the I/O component of this script, I ran a modified test with the following code:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;for line in open("access-log"):
    pass
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Here are the performance results for iterating over the file by lines:&lt;/p&gt;&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;th align="center"&gt;Python Version&lt;/th&gt;&lt;th align="center"&gt;Time (seconds)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;2.6.4&lt;/td&gt;&lt;td align="center"&gt;1.50s&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;&lt;font color="#ff0000"&gt;2.6.4 (codecs, UTF-8)&lt;/font&gt;&lt;/td&gt;&lt;td align="center"&gt;&lt;font color="#ff0000"&gt;52.22s&lt;/font&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.0&lt;/td&gt;&lt;td align="center"&gt;105.87s&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.1.1 (UCS-2)&lt;/td&gt;&lt;td align="center"&gt;4.35s&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.1.1 (UCs-4)&lt;/td&gt;&lt;td align="center"&gt;6.11s&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;&lt;/blockquote&gt;&lt;p&gt;If you look at these numbers, you will see that the I/O performance of Python 3.1 has improved substantially. &lt;font color="#ff0000"&gt;It is also substantially faster than using the codecs module in Python 2.6.&lt;/font&gt;  However, you'll also observe that the performance is still quite a bit worse than the native Python 2.6 file object.  For example, in the table, iterating over lines is about 3x slower in Python 3.1.1 (UCS-2): 4.35s versus 1.50s.  How can that be good?  That's nearly three times the running time!&lt;/p&gt;&lt;p&gt;Let's talk about the numbers in more detail.  The decreased performance in Python 3 is almost solely due to the overhead of the underlying Unicode conversion applied to text input.  That conversion process involves two distinct steps:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Input data (bytes) has to be scanned and characters decoded according to some encoding (by default the platform's preferred encoding, UTF-8 in these tests).&lt;/li&gt;
&lt;li&gt;The decoded character data has to be stored as an array of multibyte integers that represent the associated string result.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The overhead of decoding is a direct function of how complicated the underlying codec is.  Although UTF-8 is relatively simple, it's still more complex than an encoding such as Latin-1.  Let's see what happens if we try reading the file with "latin-1" encoding instead. Here's the modified test code:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;for line in open("access-log",encoding='latin-1'):
    pass
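
# A rough sketch (not part of the original benchmark) for repeating this
# comparison with other encodings.  The helper name time_line_iteration
# is made up for illustration.
import time

def time_line_iteration(filename, encoding):
    # Time one pass over the file, decoding each line with the given encoding
    start = time.time()
    with open(filename, encoding=encoding) as f:
        for line in f:
            pass
    return time.time() - start

# Example use, assuming the same log file as above:
#   for enc in ('utf-8', 'latin-1'):
#       print(enc, time_line_iteration("access-log", enc))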
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Here are the modified performance results that show an improvement:&lt;/p&gt;&lt;blockquote&gt;&lt;table&gt;&lt;tr&gt;&lt;th align="center"&gt;Python Version&lt;/th&gt;&lt;th align="center"&gt;Time (seconds)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.1.1 (UCS-2)&lt;/td&gt;&lt;td align="center"&gt;3.64s (was 4.35s)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td align="center"&gt;3.1.1 (UCs-4)&lt;/td&gt;&lt;td align="center"&gt;5.33s (was 6.11s)&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;&lt;/blockquote&gt;&lt;p&gt;Lesson learned: the encoding matters.  So, if you're working purely with ASCII text, specifying an encoding such as 'latin-1' will speed up text input.  Just so you know, if you specify 'ascii' encoding, you get no improvement over UTF-8.   This is because 'ascii' requires more work to decode than 'latin-1' (due to an extra check for bytes outside the range 0-127 in the decoding process).&lt;/p&gt;&lt;p&gt;At this point, you're still saying that it's slower.  Yes, even with a faster encoding, Python 3.1.1 is still about 2.5x slower than Python 2.6.4 on this simple I/O test.   Is there anything that can be done about that?&lt;/p&gt;&lt;p&gt;The short answer is "not really."  Since Python 3 strings are Unicode, the process of reading a simple 8-bit text file is always going to involve an extra step of converting and copying the byte-oriented data into the multibyte Unicode representation.  Just to give you an idea, let's drop into C programming and consider the following program:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;#include &amp;lt;stdio.h&amp;gt;

int main() {
  FILE *f;
  char  bytes[256];

  f = fopen("access-log","r");
  while (fgets(bytes,256,f)) {  // Yes, hacky 
  }
}
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;This program does nothing more than iterate over lines of a file--think of it as the ultimate stripped down version of our Python-2.6.4 test.  If you run it, it takes 1.13s on the same log file used for our earlier Python tests.&lt;/p&gt;&lt;p&gt;When you go to Python 3, there is always extra conversion.   It's like modifying the C program as follows:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;#include &amp;lt;stdio.h&amp;gt;

int main() {
  FILE *f;
  char  bytes[256], *c;
  short  unicode[256], *u;

  f = fopen("access-log","r");
  while (fgets(bytes,256,f)) {
    c = bytes;
    u = unicode;
    while (*c) {    /* Convert to Unicode */
      *(u++) = (short) *(c++);
    }
  }
}
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Sure enough, if you run this modified C program, it takes about 1.7 seconds--a nearly 50% performance hit just from that extra copying and conversion step.  Minimally, Python 3 has to do the same conversion.  However, it's also performing dynamic memory allocation, reference counting, and other low-level operations.  So, if you factor all of that in, the performance numbers start to make a little more sense. You also start to understand why it might be really hard to do much better.&lt;/p&gt;&lt;p&gt;Now, should you care about all of this?   Truthfully, most programs are probably not going to be affected by degraded text I/O performance as much as you think.  That's because most interesting programs do far more than just I/O.   Go back and consider the original script that I presented.  On Python-2.6.4, it took 7.91s to execute.   If I go back and tune the script to use the more efficient 'latin-1' encoding, it takes 13.8s with Python-3.1.1.  Yes, that's about 1.75x slower than before. However, the key point is that it's not 2.5x slower as our earlier I/O tests would suggest.   The performance impact will become less and less as the script performs more non-IO related work.&lt;/p&gt;&lt;p&gt;Finally, let's say that you still can't live with the performance degradation.  If you're just working with simple ASCII data files, you might solve this problem by turning to binary I/O instead.  For example, the following script variant uses binary I/O and bytes for most of its processing--only converting text to Unicode when absolutely necessary for printing.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;error_404_urls = set()
for line in open("access-log","rb"):
    fields = line.split()
    if fields[-2] == b'404':
        error_404_urls.add(fields[-4])

for name in error_404_urls:
    print(name.decode('latin-1'))
&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;
&lt;p&gt;If you run this final script, you find that it takes 8.22s in Python 3.1.1--which is only about 4% slower than the original script under Python-2.6.4 (7.91s).  How about that!&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;The bottom line is that Python-3.1 is definitely worth a second look--especially if you tried the earlier Python 3.0 release and were disappointed with its performance.  Although text-based I/O is always going to be slower in Python 3 due to extra Unicode processing, it might not matter as much in practice.  Plus, binary I/O in Python 3 is still quite fast, which means that you can turn to it as a last resort.&lt;/p&gt;&lt;p&gt;If you want to know more, attend my &lt;a href="http://us.pycon.org/2010/tutorials/beazley_python3/"&gt;Mastering Python 3 I/O&lt;/a&gt; tutorial at PyCON'2010 or sign up for the &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Special Preview&lt;/a&gt; in Chicago.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Final Notes:&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;All versions of Python were compiled from source using the exact same configuration, compiler, and environment settings.&lt;/li&gt;
&lt;li&gt;Python timing tests were performed using the &lt;tt&gt;time&lt;/tt&gt; module and enclosing code with these statements:&lt;br /&gt;
&lt;pre&gt;import time
start = time.time()
... statements ...
end = time.time()
print(end-start)
&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-5519571958197782221?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/5519571958197782221/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=5519571958197782221' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5519571958197782221'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/5519571958197782221'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/01/reexamining-python-3-text-io.html' title='Reexamining Python 3 Text I/O'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-1344512631157130266</id><published>2010-01-21T10:00:00.000-08:00</published><updated>2010-01-21T10:00:47.803-08:00</updated><title type='text'>Slashdot, Pronouns, and the Python Essential Reference</title><content type='html'>Yesterday, I was ecstatic to see a &lt;a href="http://books.slashdot.org/story/10/01/20/1431242/Python-Essential-Reference-4th-Ed"&gt;positive review&lt;/a&gt; of my &lt;a href="http://www.amazon.com/python-Essential-Reference-David-Beazley/dp/0672329786/"&gt;Python Essential Reference&lt;/a&gt; book on Slashdot.  I've never had a book reviewed on Slashdot before.  However, I also know that with Slashdot, one never really knows what direction the subsequent discussion is going to take.  
For instance, will someone jump in and say something like "in Soviet Russia, Python indents you" or will the conversation devolve into something about how Python programmers will never have a girlfriend?  That's not true by the way. I once had a girlfriend who went to hear me talk for 90 minutes about LALR(1) parser generators at a Chipy meeting despite the fact that she didn't know the first thing about programming.   That's surely a sign of true love or insanity if there ever was one.  Needless to say, I married her. However, I digress.&lt;br /&gt;
 &lt;br /&gt;
No, this time around, the Slashdot discussion decided it was going to focus on the use of pronouns--namely in response to a comment that included the sentence "... there is a lot of what a developer needs and very little of what she doesn't need."   Now, I am by no means any fan of political correctness, but I had to chuckle at the irony.  Of all of the things to discuss about the Python Essential Reference, pronouns would have to rank at about the bottom of the list.  This is because the entire book is virtually devoid of personal pronouns.  With the exception of the word "you" (e.g., "you type this..."), you won't find "he", "she", "him", "her", "we", or anything like that used anywhere in the text.   This was an intentional choice, but it wasn't related to any kind of political influence (in fact, editors of the Essential Reference have often tried to add pronouns like "he" and "she" to the text only to have me take them out again).  &lt;br /&gt;
&lt;br /&gt;
First published in 1999, the Essential Reference was actually my second major writing project--the first being my Ph.D. dissertation which had been completed the year before.  As you know, writing a dissertation is a pretty major affair.  Not only do you have to do original research and defend it, you also have to write a major document describing the results.  For a typical graduate student, the dissertation is the most technically demanding document you will ever write.  It might even be the first document that you will ever submit to a real-world copy editor--an editor who will very likely tear your precious document to shreds in front of your eyes.&lt;br /&gt;
&lt;br /&gt;
In my case, the final stage of my dissertation involved a somewhat prolonged battle with the dissertation editor at the University of Utah.   Upon submitting the document, she would immediately put it under the microscope to see if it met the required "technical specifications."  This meant measuring margins, line spacing, tables, figures, and other details with a ruler.  Any deviation whatsoever meant instant rejection of the entire document--please play again.  &lt;br /&gt;
&lt;br /&gt;
Assuming one could pass the basic technical requirements, the next stage involved a review to see if you were strictly adhering to the required writing "style guide."  When submitting a dissertation, you actually had to indicate a specific writing style guide.  For example, I said that I was writing the document according to the "Chicago Manual of Style."  What this meant in practical terms is that upon submitting the dissertation to the editor, she would read it and return it to you a few days later dripping in a sea of red ink.  Every sentence of the document that did not precisely adhere to that style guide would be torn apart.  I have to say that in my entire academic and professional career (grade school, high school, college, etc.), I have never had any paper reviewed like that. &lt;br /&gt;
&lt;br /&gt;
Just to give you an example of the agony, if I wrote something like "the data is plotted" (something that sounded perfectly reasonable to me as a programmer) the editor would reject it because "data" is a plural (of datum) and you can't use "is" with a plural (e.g., you would never say "the points is plotted.").   The other major source of agony was in the use of pronouns.  The editor would instantly punish you for any use of a personal pronoun.  So, a sentence like "we took the points and processed them with a script" would be rejected. &lt;br /&gt;
&lt;br /&gt;
Essentially the editor wanted the entire document to be written in what I would roughly describe as "academic passive voice."    It's a style of writing where you never identify who is actually carrying out various actions.  So, instead of saying "we took the points and processed them with a script" you had to write "the points were processed with a script."  As you can see, a major feature of this writing style is that it is very direct and precise.  Not only is the second sentence more compact, it doesn't muddle the discussion with unimportant details about who is actually carrying out the action.  Obviously, you also avoid the whole issue of "he" versus "she" with such a writing style.&lt;br /&gt;
&lt;br /&gt;
Anyways, work on the Python Essential Reference started just 6 months after finishing my dissertation.   Having fought all of those editor battles, I wrote it in the exact same style.  So far as I can remember, I don't think any pronoun other than "it" or "its" appeared in the text.   It must have blown the copy editor's mind.  What kind of deranged lunatic would write a 300-page impersonal document like that?  Especially since writing in the passive voice is something so actively discouraged.&lt;br /&gt;
&lt;br /&gt;
Over the last ten years, various copy-editors have worked on the Essential Reference, but much of that original academic writing style remains.  At some point, use of the word "you" was introduced in the book.  I was somewhat lukewarm about it at the time, but as an author you also learn to pick and choose your battles--and that wasn't one that seemed worth fighting (unlike the battle to convince my publisher that putting out a Python 2.6 book hot on the heels of Python 3.0 was going to make any sense).   &lt;br /&gt;
&lt;br /&gt;
So there you have it.   A review of a book virtually devoid of personal pronouns spawns a big discussion on the use of he/she on Slashdot.   Who would have thought?&lt;br /&gt;
&lt;br /&gt;
Naturally, I disavow any grammatical mistakes in this blog post---after all, I don't have a editor.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-1344512631157130266?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/1344512631157130266/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=1344512631157130266' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/1344512631157130266'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/1344512631157130266'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/01/slashdot-pronouns-and-python-essential.html' title='Slashdot, Pronouns, and the Python Essential Reference'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4565997516889751136</id><published>2010-01-17T10:09:00.000-08:00</published><updated>2010-01-17T10:09:24.408-08:00</updated><title type='text'>Presentation on the new Python GIL</title><content type='html'>For anyone who missed it, I gave a presentation on the new Python GIL, implemented by Antoine Pitrou, at the January 14, 2010 meeting of Chipy.   The presentation slides can be found at &lt;a href="http://www.dabeaz.com/python/NewGIL.pdf"&gt;http://www.dabeaz.com/python/NewGIL.pdf&lt;/a&gt;.  I don't have any followup comments to put here at this time.  
However, I think this is an exciting new development for Python 3.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4565997516889751136?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4565997516889751136/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4565997516889751136' title='19 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4565997516889751136'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4565997516889751136'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/01/presentation-on-new-python-gil.html' title='Presentation on the new Python GIL'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>19</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4523567443307976608</id><published>2010-01-05T06:18:00.000-08:00</published><updated>2010-01-06T05:09:58.381-08:00</updated><title type='text'>The Python GIL Visualized</title><content type='html'>&lt;p&gt;In preparation for my upcoming PyCON'2010 talk on "Understanding the Python GIL", I've been working on a variety of new material--including some graphical visualization of the GIL behavior described in my earlier &lt;a href="http://www.dabeaz.com/python/GIL.pdf"&gt;talk&lt;/a&gt;.   I'm still experimenting, but check it out.&lt;/p&gt;&lt;p&gt;In these graphs, Python interpreter ticks are shown along the X-axis.   
The two bars indicate two different threads that are executing.  White regions indicate times at which a thread is completely idle.  Green regions indicate when a thread holds the GIL and is running.  Red regions indicate when a thread has been scheduled by the operating system only to awake and find that the GIL is not available (e.g., the infamous "GIL Battle").  For those who don't want to read, here is the legend again in pictures:&lt;/p&gt;&lt;p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/GILLegend.png"&gt;&lt;/center&gt;&lt;/p&gt;&lt;p&gt;Okay, now let's look at some threads.   First, here is the behavior of running two CPU-bound threads on a single CPU system.  As you will observe, the threads nicely alternate with each other after long periods of computation.&lt;/p&gt;&lt;p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/GIL_1cpu.png"&gt;&lt;/center&gt;&lt;/p&gt;&lt;p&gt;Now, let's go fire up the code on your fancy new dual-core laptop.  Yow! Look at all of that GIL contention.  Again, all of those red regions indicate times where the operating system has scheduled a Python thread on one of the cores, but it can't run because the thread on the other core is holding it.&lt;/p&gt;&lt;p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/GIL_2cpu.png"&gt;&lt;/center&gt;&lt;/p&gt;&lt;p&gt;Here's an interesting case that involves an I/O bound thread competing with a CPU-bound thread.  In this example, the I/O thread merely echoes UDP packets.  Here is the code for that thread.&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;def thread_1(port):
    s = socket(AF_INET,SOCK_DGRAM)
    s.bind(("",port))
    while True:
        msg, addr = s.recvfrom(1024)
        s.sendto(msg,addr)
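
# A rough sketch (not from the original experiment) of how the test might
# be driven.  thread_2, send_probe, and the port number are assumptions
# made up for illustration.
import threading
from socket import socket, AF_INET, SOCK_DGRAM

def thread_2():
    # CPU-bound thread that just spins
    while True:
        pass

def send_probe(port, payload=b"hello"):
    # Send one UDP message to the echo thread and wait for the reply
    c = socket(AF_INET, SOCK_DGRAM)
    c.settimeout(5)
    c.sendto(payload, ("127.0.0.1", port))
    reply, _ = c.recvfrom(1024)
    c.close()
    return reply

# Example use (daemon threads so that the program can exit):
#   threading.Thread(target=thread_1, args=(25000,), daemon=True).start()
#   threading.Thread(target=thread_2, daemon=True).start()
#   print(send_probe(25000))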
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;The other thread (thread 2) is just mindlessly spinning.  This graph shows what happens when you send a UDP message to thread 1.&lt;/p&gt;&lt;p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/GIL_io.png"&gt;&lt;/center&gt;&lt;/p&gt;&lt;p&gt;As you would expect, most of the time is spent running the CPU-bound thread.  However, when I/O is received, there is a flurry of activity that takes place in the I/O thread.  Let's zoom in on that region and see what's happening.&lt;/p&gt;&lt;p&gt;&lt;center&gt;&lt;img src="http://www.dabeaz.com/images/GIL_ioclose.png"&gt;&lt;/center&gt;&lt;/p&gt;&lt;p&gt;In this graph, you're seeing how difficult it is for the I/O bound thread to get the GIL in order to perform its small amount of processing.  For instance, approximately 17000 interpreter ticks pass between the arrival of the UDP message and successful return of the &lt;tt&gt;s.recvfrom()&lt;/tt&gt; call (and notice all of the GIL contention).  More than 34000 ticks pass between the execution of &lt;tt&gt;s.sendto()&lt;/tt&gt; and looping back to the next &lt;tt&gt;s.recvfrom()&lt;/tt&gt; call.  Needless to say, this is not the behavior you usually want for I/O bound processing. &lt;/p&gt;&lt;p&gt;Anyways, that is all for now.  Come to my PyCON talk to see much more.  Also check out Antoine Pitrou's work on a &lt;a href="http://mail.python.org/pipermail/python-dev/2009-October/093321.html"&gt;new GIL&lt;/a&gt;. &lt;/p&gt;&lt;p&gt;Note: It is not too late to sign up for my &lt;a href="http://www.dabeaz.com/chicago/concurrent.html"&gt;Concurrency Workshop&lt;/a&gt; next week (Jan 14-15). 
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4523567443307976608?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4523567443307976608/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4523567443307976608' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4523567443307976608'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4523567443307976608'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2010/01/python-gil-visualized.html' title='The Python GIL Visualized'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-8547467207931590304</id><published>2009-12-14T03:50:00.000-08:00</published><updated>2009-12-14T03:50:00.426-08:00</updated><title type='text'>Python Concurrency Workshop (Reprise)</title><content type='html'>&lt;p&gt;Well, the winter months are now upon us--making it a perfect time to come to Chicago in the middle of January and have your brain exploded by the second edition of my &lt;a href="http://www.dabeaz.com/chicago/index.html"&gt;Python Concurrency Workshop&lt;/a&gt; (January 14-15, 2010).  Over the last few months, I've been working on numerous refinements to the previous workshop and adding some new material related to distributed computing (Actors, REST, distributed objects, etc.).  
I think I'm even more excited by this version than the last.&lt;/p&gt;&lt;p&gt;So what is this concurrency workshop, you ask?  Well, first of all, you may have already encountered a small portion of it if you saw my presentation on the &lt;a href="http://www.dabeaz.com/python/GIL.pdf"&gt;Python GIL&lt;/a&gt;---that was only a small part of the workshop's thread programming section. The rest of the workshop aims to explore a variety of other topics at a similar technical depth.  Examples include thread synchronization, thread debugging, message passing, data serialization, interprocess communication, multiprocessing, distributed computing, and advanced I/O handling.  In a nutshell, it's an opportunity to learn more about what makes Python tick and to go beyond what you normally find in the user manual.  The workshop is also a kind of proving ground for some of my future book projects and PyCON tutorials--I have made every effort to keep it cutting edge.&lt;/p&gt;&lt;p&gt;So, you might ask, who is the target audience of the workshop?   Although a lot of advanced material is covered, I think the workshop is best suited for intermediate Python programmers who want to learn more. First, the workshop utilizes numerous Python features such as context managers, decorators, generators, and coroutines.  If you've heard of such topics before, but aren't quite sure what they're all about, the workshop will fill in details.   Second, the workshop has a very strong focus on networking and distributed systems.  If you've been doing work in web services, cloud computing, parallel computing, or any related topic, the workshop aims to fill in a variety of essential technical details that will help you write more efficient code.   
Finally, if you simply want to escape the office and hang out with other Python hackers, the workshop won't disappoint.&lt;/p&gt;&lt;p&gt;Finally, although there is a small chance the workshop will be held in the middle of a wind-whipped Chicago blizzard, other amenities will more than make up for it. Some of Chicago's finest bakeries and coffee shops surround the workshop venue--ensuring a proper balance of sugar and caffeine required for a workshop of this nature.  You won't be disappointed.&lt;/p&gt;&lt;p&gt;In any case, hopefully I'll see you at the workshop.  It's going to be great!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-8547467207931590304?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/8547467207931590304/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=8547467207931590304' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8547467207931590304'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8547467207931590304'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/12/python-concurrency-workshop-reprise.html' title='Python Concurrency Workshop (Reprise)'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4305260963941133323</id><published>2009-11-27T07:50:00.000-08:00</published><updated>2009-11-27T07:50:12.751-08:00</updated><title type='text'>Fun with block towers</title><content type='html'>&lt;p&gt;Lately, I've been having a lot of fun playing with wooden blocks--a great toy for toddlers and grown-ups alike.  &lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/blockbaby.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;There's a certain primal simplicity to blocks.  Sure, you can stack them up in simple towers or piles.  However, my inner geek makes me want to build trickier structures.  For example, this diamond structure:&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/diamond.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Or maybe a diamond with a huge spire:&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/diamond2.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Or flip the whole thing upside down if you're inclined:&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/inverted.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;A more interesting challenge is to build an arch.&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/arch.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;And, once you can keep that stable, see how much you can stack on top of it:&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/archspire.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Lately, I've been experimenting with expanding the number of dimensions.  For example, this interesting structure:&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/3dsimple.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Or this more complex extension of the idea&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/3dcomplex.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Somewhere in all of this, there's probably some kind of software development analogy.  Maybe it's the fact that even with simple components, you can make some pretty cool things.  Or maybe it's somehow related to the same inner urge that drives a programmer to build their entire application out of closures, generators, coroutines, actors, tasklets, or something similarly "simple."&lt;/p&gt;&lt;p&gt;Then again, maybe it's more of a warning.  After all, there are those pesky end-users who are going to put their dirty hands on everything when you're done (observe their look of terror).&lt;/p&gt;&lt;center&gt;&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/Blocks/enduser2.JPG"&gt;&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;... and well, we all know what happens next.&lt;/p&gt;&lt;p&gt;Anyways, that is all for now.  Hope everyone is enjoying the holiday!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4305260963941133323?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4305260963941133323/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4305260963941133323' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4305260963941133323'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4305260963941133323'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/11/fun-with-block-towers.html' title='Fun with block towers'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-84152637118440889</id><published>2009-11-20T14:50:00.000-08:00</published><updated>2009-11-20T15:04:52.179-08:00</updated><title type='text'>Python Thread Deadlock Avoidance</title><content type='html'>&lt;p&gt;One danger of writing programs based on threads is the potential for deadlock--a problem that almost invariably shows up if you happen to write thread code that tries to acquire more than one mutex lock at once.  For example:&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;import threading

a_lock = threading.Lock()
b_lock = threading.Lock()

def foo():
    with a_lock:
         ...
         with b_lock:
              # Do something
              ...

t1 = threading.Thread(target=foo)
t1.start()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Code like that looks innocent enough until you realize that some other thread in the system also has a similar idea about locking--but acquires the locks in a slightly different order:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;def bar():
    with b_lock:
         ...
         with a_lock:
              # Do something (maybe)
              ...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Sure, the code might be lucky enough to work "most" of the time.  However, you will suffer a thousand sorrows if both threads try to acquire those locks at about the same time and you have to figure out why your program is mysteriously nonresponsive.  &lt;br /&gt;
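&lt;/p&gt;&lt;p&gt;To watch the two lock orders collide without actually hanging an interpreter, here is a minimal sketch.  It is an illustration under assumptions rather than part of the example above: it targets Python 3, where &lt;tt&gt;acquire()&lt;/tt&gt; accepts a &lt;tt&gt;timeout&lt;/tt&gt;, and the &lt;tt&gt;worker()&lt;/tt&gt; function, the &lt;tt&gt;results&lt;/tt&gt; dictionary, and the &lt;tt&gt;threading.Barrier&lt;/tt&gt; used to force the unlucky timing are all made up for the demonstration.&lt;/p&gt;

```python
import threading

a_lock = threading.Lock()
b_lock = threading.Lock()

# Hypothetical helpers for the demonstration (not in the original example).
barrier = threading.Barrier(2)   # lines both threads up at the worst moment
results = {}                     # maps thread name to "did the second acquire succeed?"

def worker(name, first, second):
    with first:
        barrier.wait()                       # both threads now hold their first lock
        got = second.acquire(timeout=0.2)    # without a timeout this would block forever
        results[name] = got
        barrier.wait()                       # keep holding first until both attempts end
        if got:
            second.release()

# foo takes a_lock then b_lock; bar takes them in the opposite order.
foo = threading.Thread(target=worker, args=("foo", a_lock, b_lock))
bar = threading.Thread(target=worker, args=("bar", b_lock, a_lock))
foo.start()
bar.start()
foo.join()
bar.join()
print(results)
```

&lt;p&gt;Both inner acquisitions time out because each thread is holding the lock the other one wants.  Take away the timeout and that wait never ends.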
&lt;/p&gt;&lt;p&gt;Computer scientists love to spend time thinking about such problems--especially if it means they can make up some diabolical problem about philosophers that they can put on an operating systems exam.  However, I'll spare you the details of that.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;The problem of deadlock is not something that I would normally spend much time thinking about, but I recently saw some material talking about improved thread support in C++0x.  For example,  &lt;a href="http://www.devx.com/SpecialReports/Article/38883/1954"&gt;this article&lt;/a&gt; has some details.  In particular, it seems that C++0x offers a new locking operation &lt;tt&gt;std::lock()&lt;/tt&gt; that can acquire multiple mutex locks all at once while avoiding deadlock. For example:&lt;br /&gt;
&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;std::unique_lock&amp;lt;std::mutex&gt; lock_a(a.m,std::defer_lock);
std::unique_lock&amp;lt;std::mutex&gt; lock_b(b.m,std::defer_lock);
&lt;b&gt;std::lock(lock_a,lock_b);&lt;/b&gt;      // Lock both locks
...
... do something involving data protected by both locks
...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;I don't actually know how C++0x implements its &lt;tt&gt;lock()&lt;/tt&gt; operation, but I do know that one way to avoid deadlock is to put some kind of ordering on all of the locks in a program.   If you then strictly enforce a policy that all locks have to be acquired in increasing order, you can avoid deadlock.  Just as an example, if you had two locks A and B, you could assign a unique number to each lock such as A=1 and B=2.  Then, in any part of the program that wanted to acquire both locks A and B, you just make a rule that A always has to be acquired first (because its number is lower).   In such a scheme, the thread &lt;tt&gt;bar()&lt;/tt&gt; shown earlier would simply be illegal.  That &lt;tt&gt;lock()&lt;/tt&gt; operation in C++ is presumably doing something with a similar effect--that is, it knows enough about the locks so that they can be acquired without deadlock.&lt;br /&gt;
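&lt;/p&gt;&lt;p&gt;The numbering scheme just described can be sketched in a few lines.  This is only an illustration, assuming Python 3; the &lt;tt&gt;RankedLock&lt;/tt&gt; class and the &lt;tt&gt;acquire_in_order()&lt;/tt&gt; and &lt;tt&gt;release_all()&lt;/tt&gt; helpers are hypothetical names invented here, not part of any library.&lt;/p&gt;

```python
import threading

class RankedLock:
    """Hypothetical helper: a lock tagged with a fixed, unique rank."""
    def __init__(self, rank):
        self.rank = rank
        self.lock = threading.Lock()

def acquire_in_order(*ranked):
    # Always take the locks in increasing rank, whatever order the caller used.
    ordered = sorted(ranked, key=lambda r: r.rank)
    for r in ordered:
        r.lock.acquire()
    return ordered

def release_all(ordered):
    # Release in the reverse of the acquisition order.
    for r in reversed(ordered):
        r.lock.release()

A = RankedLock(1)
B = RankedLock(2)

held = acquire_in_order(B, A)        # caller asks for B first ...
order = [r.rank for r in held]       # ... but A (rank 1) is still taken first
release_all(held)
print(order)                         # prints [1, 2]
```

&lt;p&gt;Because every caller ends up taking A before B, no thread can hold B while waiting for A, which is the situation that produced the deadlock.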
&lt;/p&gt;&lt;p&gt;All of this got me thinking--I wonder how hard it would be to implement the &lt;tt&gt;lock()&lt;/tt&gt; operation in Python?   Not hard as it turns out.  First step is to change the name--given that &lt;tt&gt;acquire()&lt;/tt&gt; is the typical method used to acquire a lock, let's just call the operation &lt;tt&gt;acquire()&lt;/tt&gt; to make it more clear.  You can define &lt;tt&gt;acquire()&lt;/tt&gt; as a context-manager and simply order locks according to their &lt;tt&gt;id()&lt;/tt&gt; value like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;class acquire(object):
    def __init__(self,*locks):
        self.locks = sorted(locks, key=lambda x: id(x))
    def __enter__(self):
        for lock in self.locks:
            lock.acquire()
    def __exit__(self,ty,val,tb):
        for lock in reversed(self.locks):
            lock.release()
        return False
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Okay, that was easy enough to do, but does it work?   Let's try it on the classic dining philosophers problem (look it up if you need a refresher):&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;import threading

# The philosopher thread
def philosopher(left, right):
    while True:
        with acquire(left,right):
             print threading.currentThread(), "eating"

# The chopsticks
NSTICKS = 5
chopsticks = [threading.Lock() 
              for n in xrange(NSTICKS)]

# Create all of the philosophers
phils = [threading.Thread(target=philosopher,
                          args=(chopsticks[n],chopsticks[(n+1) % NSTICKS]))
         for n in xrange(NSTICKS)]

# Run all of the philosophers
for p in phils:
    p.start()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If you try this code, you'll find that the philosophers run all day with no deadlock. Just as an experiment, you can try changing the &lt;tt&gt;philosopher()&lt;/tt&gt; implementation to one that acquires the locks separately:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;def philosopher(left, right):
    while True:
        with left:
             with right:
                 print threading.currentThread(), "eating"
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Yep, almost instantaneous deadlock.  So, as you can see, our &lt;tt&gt;acquire()&lt;/tt&gt; operation seems to be working.&lt;/p&gt;&lt;p&gt;There's still one last aspect of this experiment that needs to be addressed.   One potential problem with our &lt;tt&gt;acquire()&lt;/tt&gt; operation is that it doesn't prevent a user from using it in a nested manner as before.  For example, someone might write code like this:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;with acquire(a_lock,b_lock):
     ...
     with acquire(c_lock, d_lock):
          ...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Catching such cases at the time of definition would be difficult (if not impossible).  However, we could make the &lt;tt&gt;acquire()&lt;/tt&gt; context manager keep a record of all previously acquired locks using a list placed in thread local storage.   Here's a new implementation--and just for kicks, I'm going to switch it over to a context manager defined by a generator (mainly because I can and generators are cool):&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;import threading
from contextlib import contextmanager

local = threading.local()
@contextmanager
def acquire(*locks):
    locks = sorted(locks, key=lambda x: id(x))   
    acquired = getattr(local,"acquired",[])
    # Check to make sure we're not violating the order of locks already acquired   
    if acquired:
        if max(id(lock) for lock in acquired) &gt;= id(locks[0]):
            raise RuntimeError("Lock Order Violation")
    acquired.extend(locks)
    local.acquired = acquired
    try:
        for lock in locks:
            lock.acquire()
        yield
    finally:
        for lock in reversed(locks):
            lock.release()
        del acquired[-len(locks):]
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;If you use this version, you'll find that the philosophers work just fine as before.  However, now consider this slightly modified version with the nested acquires:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;# The philosopher thread                                                                                             
def philosopher(left, right):
    while True:
        with acquire(left):
            with acquire(right):
                print threading.currentThread(), "eating"
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Unlike the previous version that had nested &lt;tt&gt;with&lt;/tt&gt; statements and deadlocked, this one runs.  However, one of the philosophers crashes with a nasty traceback:&lt;/p&gt;&lt;blockquote&gt;&lt;pre&gt;Exception in thread Thread-5:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "hier4.py", line 53, in philosopher
    with acquire(right):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/contextlib.py", line 16, in __enter__
    return self.gen.next()
  File "hier4.py", line 35, in acquire
    raise RuntimeError("Lock Order Violation")
RuntimeError: Lock Order Violation
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Very good.  That's exactly what we wanted.&lt;/p&gt;&lt;p&gt;So, what's the moral of this story?  First of all, I don't think you should use this as a license to go off and write a bunch of multithreaded code that relies on nested lock acquisitions.  Sure, the context manager might catch some potential problems, but it won't change the fact that you'll still want to blow your head off after debugging some other horrible problem that comes up with your overly clever and/or complicated design.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;I think the main take-away is an appreciation for Python's context-manager feature.  There's so much more you can do with a context manager than simply closing a file or releasing an individual lock.&lt;/p&gt;&lt;p&gt;Disclaimer: I didn't do a hugely exhaustive internet search to see if anyone else had implemented anything similar to this in Python.  If you know of some links to related work, tell me.  I'll add them here.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-84152637118440889?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/84152637118440889/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=84152637118440889' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/84152637118440889'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/84152637118440889'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/11/python-thread-deadlock-avoidance_20.html' title='Python Thread Deadlock Avoidance'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-3077934970165352710</id><published>2009-10-31T09:19:00.000-07:00</published><updated>2009-10-31T09:19:25.297-07:00</updated><title type='text'>Ultimate Python Quickstart Guide</title><content type='html'>&lt;p&gt;As the father of a toddler and a newborn, I've been getting my fair share of 
practice putting together various sorts of baby accessories (strollers, bassinets, cribs, etc.).  It has inspired me to write this ultimate quick start guide to getting started with the Python programming language.  I hope that you find it to be as incredibly useful as I have.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Congratulations!&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;Congratulations on your wise decision to use Python!  Follow this quick and easy guide to get started.&lt;/p&gt;&lt;br /&gt;
&lt;center&gt;&lt;br /&gt;
&lt;p&gt;(a) Get&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/DownloadPython.png"&gt;&lt;/p&gt;
&lt;p&gt;(b) Click&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/InstallPython.JPG"&gt;&lt;/p&gt;
&lt;p&gt;(c) Run&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/RunPython.JPG"&gt;&lt;/p&gt;
&lt;p&gt;(d) Code&lt;br /&gt;
&lt;img src="http://www.dabeaz.com/images/CodePython.png"&gt;&lt;/p&gt;
&lt;/center&gt;&lt;br /&gt;
&lt;p&gt;Enjoy your new Python interpreter!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-3077934970165352710?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/3077934970165352710/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=3077934970165352710' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3077934970165352710'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/3077934970165352710'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/10/ultimate-python-quickstart-guide.html' title='Ultimate Python Quickstart Guide'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-1146030503757846605</id><published>2009-09-13T17:43:00.000-07:00</published><updated>2009-09-13T17:43:39.500-07:00</updated><title type='text'>Python Thread Synchronization Primitives : Not Entirely What You Think</title><content type='html'>&lt;p&gt;If you have done any kind of programming with Python threads, you are probably familiar with the basic synchronization primitives provided by the &lt;a href="http://docs.python.org/library/threading"&gt;threading&lt;/a&gt; module.  Specifically, you get the following kinds of synchronization objects to work with:&lt;/p&gt;&lt;p&gt;&lt;ul&gt;&lt;li&gt;&lt;tt&gt;Lock&lt;/tt&gt;.   
Mutual exclusion lock that's commonly used to protect shared data structures. &lt;/li&gt;
&lt;li&gt;&lt;tt&gt;RLock&lt;/tt&gt;. Reentrant mutual exclusion lock that is useful for code-based locking on functions or methods or to implement monitors. &lt;/li&gt;
&lt;li&gt;&lt;tt&gt;Event&lt;/tt&gt;.  An object that allows one or more threads to wait for some "event" to occur.  Used to implement barriers or to signal the completion of some task. &lt;/li&gt;
&lt;li&gt;&lt;tt&gt;Condition&lt;/tt&gt;.  Condition variable.  Used to send signals between threads.  For example in producer-consumer problems, the producer will use a condition variable to send a signal to the consumer that data is available.&lt;/li&gt;
&lt;li&gt;&lt;tt&gt;Semaphore&lt;/tt&gt;. A high-level synchronization primitive based on an integer counter.  Acquiring the semaphore decreases the counter and releasing the semaphore increases the counter.  If the counter is 0 and a thread tries to acquire, it will block until a different thread releases the semaphore. &lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;Knowing how and when to use the various synchronization primitives is often a non-trivial exercise.   However, that is not the point of this post--so if you're here looking for a gentle tutorial, you're in the wrong place.&lt;/p&gt;&lt;p&gt;Instead, I'd like to look at the inner workings of Python's thread synchronization primitives. In part, this is motivated by a general interest in knowing how Python works on multicore machines. However, it's also related to something that I noticed when putting my GIL talk together.  So, we'll take a little tour under the covers, do a few experiments, and think about how this might fit into the "big picture."&lt;br /&gt;
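&lt;/p&gt;&lt;p&gt;For orientation only, the counter behavior described for &lt;tt&gt;Semaphore&lt;/tt&gt; and the signaling role of &lt;tt&gt;Event&lt;/tt&gt; in the list above can be checked directly before going under the covers.  This is a small sketch, assuming Python 3; the variable names are arbitrary.&lt;/p&gt;

```python
import threading

# Semaphore: an integer counter that acquire() decrements and release() increments.
sem = threading.Semaphore(2)
first = sem.acquire(blocking=False)    # counter goes from 2 to 1
second = sem.acquire(blocking=False)   # counter goes from 1 to 0
third = sem.acquire(blocking=False)    # counter is 0, so a blocking call would wait here
sem.release()                          # counter goes back to 1
fourth = sem.acquire(blocking=False)   # succeeds again

# Event: one thread waits until another signals that something has happened.
finished = threading.Event()
worker = threading.Thread(target=finished.set)
worker.start()
signaled = finished.wait(timeout=5)    # True once set() has been called
worker.join()

print(first, second, third, fourth, signaled)
```

&lt;p&gt;The third non-blocking acquire comes back &lt;tt&gt;False&lt;/tt&gt; because the counter has reached 0; everything else succeeds.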
&lt;/p&gt;&lt;p&gt;&lt;b&gt;A Curious Fact&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;If you write threaded programs, you should know that Python uses real system-level threads to carry out its work.  That is, threads are implemented using pthreads or some other native threading mechanism provided by the operating system.  However, the same cannot be said of Python's basic synchronization primitives such as &lt;tt&gt;Lock&lt;/tt&gt;, &lt;tt&gt;Condition&lt;/tt&gt;, &lt;tt&gt;Semaphore&lt;/tt&gt;, and so forth.  That is, even though low-level libraries such as pthreads provide various kinds of basic locks and synchronization objects, the &lt;tt&gt;threading&lt;/tt&gt; library doesn't make direct use of them (so, when you're using something like a &lt;tt&gt;Lock&lt;/tt&gt; object in your program, you're not manipulating a pthreads mutex). &lt;/p&gt;&lt;p&gt;This fact may surprise experienced programmers.  Many of Python's core library modules provide a direct interface to low-level functionality written in C (e.g., think about the &lt;tt&gt;os&lt;/tt&gt; or &lt;tt&gt;socket&lt;/tt&gt; modules).  However, thread synchronization objects are an exception to that rule.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Some History&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Python has included support for threads for most of its history.  In fact, if Guido ever gets around to updating his &lt;a href="http://python-history.blogspot.com/"&gt;History of Python&lt;/a&gt; blog, he will eventually tell you that threads were first added to Python in 1992 after a contribution by one of his coworkers Sjoerd Mullender (disclaimer: I don't have a time machine, but I have seen the entire "History of Python" article that Guido is using as the basis for his history blog).  This early work is where you find the introduction of the global interpreter lock (GIL) as well as the low-level &lt;a href="http://docs.python.org/library/thread"&gt;&lt;tt&gt;thread&lt;/tt&gt;&lt;/a&gt; library module. &lt;/p&gt;&lt;p&gt;Part of the problem faced by early versions of Python was the fact that thread programming interfaces weren't always available or standardized across systems.  Thus, threads were only supported on certain machines such as SGI Irix and Sun Solaris.  The pthreads interface wasn't standardized until a little later (~1995).  The modern &lt;a href="http://docs.python.org/library/threading"&gt;&lt;tt&gt;threading&lt;/tt&gt;&lt;/a&gt; library that virtually all Python programmers now use first appeared in Python-1.5.1 (1998).&lt;/p&gt;&lt;p&gt;A consequence of this chaos was that Python's support for threads was intentionally designed to have a minimal set of basic requirements.  The &lt;tt&gt;thread&lt;/tt&gt; library module simply provided a function for launching a Python callable in its own execution thread.  A single function, &lt;tt&gt;allocate_lock()&lt;/tt&gt; could be used to allocate a "lock" object.  This object provided the usual &lt;tt&gt;acquire()&lt;/tt&gt; and &lt;tt&gt;release()&lt;/tt&gt; operations, but not much else.&lt;br /&gt;
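&lt;/p&gt;&lt;p&gt;That minimal interface is still around.  Here is a rough sketch, assuming Python 3, where the old &lt;tt&gt;thread&lt;/tt&gt; module is named &lt;tt&gt;_thread&lt;/tt&gt;; the &lt;tt&gt;task()&lt;/tt&gt; function and the &lt;tt&gt;results&lt;/tt&gt; list are invented for the illustration.&lt;/p&gt;

```python
import _thread

# A bare lock object: acquire() and release(), but not much else.
lock = _thread.allocate_lock()
acquired = lock.acquire()
was_locked = lock.locked()
lock.release()

# Launch a callable in its own thread and wait for it using nothing but a lock.
done = _thread.allocate_lock()
results = []

def task(value):                 # hypothetical worker
    results.append(value * 2)
    done.release()               # signal completion

done.acquire()                   # take the lock before starting the worker
_thread.start_new_thread(task, (21,))
done.acquire()                   # blocks until task() releases the lock
print(results)                   # prints [42]
```

&lt;p&gt;With no &lt;tt&gt;join()&lt;/tt&gt; available at this level, waiting on a lock that the worker releases is about the only way to find out that it has finished.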
&lt;/p&gt;&lt;p&gt;If you dig into the C implementation of the interpreter, you'll find that all support for locking is reduced to just four C functions.&lt;/p&gt;&lt;p&gt;&lt;ul&gt;&lt;li&gt;&lt;tt&gt;PyThread_allocate_lock()&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;&lt;tt&gt;PyThread_free_lock()&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;&lt;tt&gt;PyThread_acquire_lock()&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;&lt;tt&gt;PyThread_release_lock()&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;You can find these functions in a series of files such as &lt;tt&gt;thread_pthread.h&lt;/tt&gt;, &lt;tt&gt;thread_nt.h&lt;/tt&gt;, &lt;tt&gt;thread_solaris.h&lt;/tt&gt;, and so forth in the &lt;tt&gt;Python/&lt;/tt&gt; directory of the Python interpreter source.  Each file simply contains a platform-specific implementation of a basic lock.  This lock then becomes the basis for all other synchronization primitives as we'll see in a minute.  Note that these functions are also used to implement the infamous global interpreter lock (GIL).&lt;/p&gt;&lt;p&gt;&lt;b&gt;What is a lock exactly?&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;If you have worked with thread locking in C, you might think that the above C functions are simply a wrapper around something like a pthreads mutex lock. However, this is not the case.  Instead, the lock is minimally implemented as a &lt;a href="http://en.wikipedia.org/wiki/Semaphore_(programming)"&gt;binary semaphore&lt;/a&gt;.  Here is a simplified example of the lock implementation that's used on many Unix systems:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;#include &amp;lt;stdlib.h&gt;
#include &amp;lt;pthread.h&gt;
#include &amp;lt;string.h&gt;

typedef struct {
  char           locked;
  pthread_cond_t lock_released;
  pthread_mutex_t mut;
} lock_t;

lock_t *
allocate_lock(void) {
  lock_t *lock;
  lock = (lock_t *) malloc(sizeof(lock_t));
  memset((void *)lock, '\0', sizeof(lock_t));
  pthread_mutex_init(&amp;lock-&gt;mut,NULL);
  pthread_cond_init(&amp;lock-&gt;lock_released, NULL);
  return lock;
}

void 
free_lock(lock_t *lock) {
  pthread_mutex_destroy( &amp;lock-&gt;mut );
  pthread_cond_destroy( &amp;lock-&gt;lock_released );
  free((void *)lock);
}

int 
acquire_lock(lock_t *lock, int waitflag) {
  int success;
  pthread_mutex_lock( &amp;lock-&gt;mut );
  success = lock-&gt;locked == 0;

  if ( !success &amp;&amp; waitflag ) {
    while ( lock-&gt;locked ) {
      pthread_cond_wait(&amp;lock-&gt;lock_released,&amp;lock-&gt;mut);
    }
    success = 1;
  }
  if (success) lock-&gt;locked = 1;
  pthread_mutex_unlock( &amp;lock-&gt;mut );
  return success;
}

void 
release_lock(lock_t *lock) {
  pthread_mutex_lock( &amp;lock-&gt;mut );
  lock-&gt;locked = 0;
  pthread_mutex_unlock( &amp;lock-&gt;mut );
  pthread_cond_signal( &amp;lock-&gt;lock_released );
}
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Understanding this code requires some careful study.  However, the key part of it is that Python lock objects manually keep track of their internal state (locked or unlocked).  This is the &lt;tt&gt;locked&lt;/tt&gt; attribute of the lock structure.  The pthreads mutex lock is simply being used to synchronize access to the &lt;tt&gt;locked&lt;/tt&gt; attribute in the &lt;tt&gt;acquire()&lt;/tt&gt; and &lt;tt&gt;release()&lt;/tt&gt; operations (note: this mutex lock is not actually &lt;em&gt;the&lt;/em&gt; lock).  Finally, the condition variable is being used to perform a kind of thread signaling that's used to wake up any sleeping threads when the lock gets released.&lt;/p&gt;&lt;p&gt;&lt;b&gt;What about Native Semaphores?&lt;/b&gt;&lt;/p&gt;&lt;p&gt;As just mentioned, the Python lock is minimally implemented as a binary semaphore.  If you've done thread programming in C, you probably know that many systems optionally include a native semaphore object. On such systems, Python may be built in a way so that it simply uses the native semaphore object for the lock.  For example, this is what Python uses for synchronization on Windows.&lt;/p&gt;&lt;p&gt;I don't intend to say any more about this here except to emphasize that using some kind of semaphore is actually a requirement for other parts of Python's threading to work correctly.  For instance, the high-level threading library won't work if the lock isn't implemented in this manner.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Semaphores vs. Mutex Locks&lt;/b&gt;&lt;/p&gt;&lt;p&gt;The differences between a semaphore and a mutex lock are subtle.  However, the most obvious one pertains to the issue of ownership.   When you use a mutex lock, there is almost always a strong sense of ownership.  Specifically, if a thread acquires a mutex, it is the only thread that is allowed to release it.  Semaphores don't have this restriction.  
In fact, once a semaphore has been acquired, any thread can later release it. This allows for more varied forms of thread signaling and synchronization.  Here is one such experiment you can try in Python:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;import threading, time&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;done = threading.Lock()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;def foo():&lt;/b&gt;
...      &lt;b&gt;print "I'm foo and I'm running"&lt;/b&gt;
...      &lt;b&gt;time.sleep(30)&lt;/b&gt;
...      &lt;b&gt;done.release()&lt;/b&gt;       # Signal completion by releasing the lock
...
&gt;&gt;&gt; &lt;b&gt;done.acquire()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;threading.Thread(target=foo).start()&lt;/b&gt;
I'm foo and I'm running
&gt;&gt;&gt; &lt;b&gt;done.acquire(); print "Foo done"&lt;/b&gt;
Foo done                        (note: prints after 30 seconds)
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;In this example, a lock is being used to signal the completion of some task.  The main thread acquires the lock to clear it and then launches a thread to carry out some work.  Immediately after launching this thread, the main thread attempts to acquire the lock again.  Since the lock was already in use, this operation blocks.  However, when the worker thread finishes, it releases the lock--notifying the main thread that it has finished.  It is critical to emphasize that the lock is being acquired and released by two different threads.  This is the essential property provided by using a semaphore.  If a traditional mutex lock were used, the program would deadlock or crash with an error.&lt;/p&gt;&lt;p&gt;Just as an aside, I would not recommend writing Python code that uses &lt;tt&gt;Lock&lt;/tt&gt; objects in this way.  Most programmers are going to associate &lt;tt&gt;Lock&lt;/tt&gt; with a mutex-lock.  You definitely don't use mutex-locks in the manner shown. &lt;/p&gt;&lt;p&gt;Other differences between mutex locks and semaphores tend to be more subtle. There are a number of well-known problems concerning mutex locks that typically get addressed by thread libraries and the operating system.  For example, the system may implement policies to prevent thread starvation or provide some sense of fairness when many threads are competing for the same lock.   If threads have different scheduling priorities, the system may also try to work around problems related to priority inversion (a problem where a low-priority thread holds a lock needed by a high-priority thread).  Semaphores aren't necessarily treated in the same manner, which means that a multithreaded program using semaphores may execute in a manner that is slightly different than one that uses mutex locks.  
For now, however, let's skip those details.&lt;/p&gt;&lt;p&gt;&lt;b&gt;The threading Library&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Now that we've talked about the low-level locking mechanism used by the interpreter, let's talk about the synchronization primitives defined in the &lt;tt&gt;threading&lt;/tt&gt; library.  With the exception of &lt;tt&gt;Lock&lt;/tt&gt; objects, which are identical to the lock described in the above section, all of the other synchronization primitives are written entirely in Python.  For example, consider the &lt;tt&gt;RLock&lt;/tt&gt; implementation.  Here is a cleaned up version of how it is implemented:&lt;br /&gt;
&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;class RLock:
    def __init__(self):
        self._block = _allocate_lock()
        self._owner = None
        self._count = 0

    def acquire(self, blocking=1):
        me = current_thread()
        if self._owner is me:
            self._count = self._count + 1
            return 1
        rc = self._block.acquire(blocking)
        if rc:
            self._owner = me
            self._count = 1
        return rc

    def release(self):
        if self._owner is not current_thread():
            raise RuntimeError("cannot release un-aquired lock")
        self._count = count = self._count - 1
        if not count:
            self._owner = None
            self._block.release()
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;The fact that an &lt;tt&gt;RLock&lt;/tt&gt; is implemented entirely as a Python layer over a regular lock object significantly impacts its performance.   For example:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;from timeit import timeit&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;timeit("lock.acquire();lock.release()","from threading import Lock; lock = Lock()")&lt;/b&gt;
0.50123405456542969
&gt;&gt;&gt; &lt;b&gt;timeit("lock.acquire();lock.release()","from threading import RLock; lock = RLock()")&lt;/b&gt;
5.2153160572052002
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;Here, you see that acquiring and releasing an &lt;tt&gt;RLock&lt;/tt&gt; object is about ten times slower than using a &lt;tt&gt;Lock&lt;/tt&gt;.   The performance impact is worse for more advanced synchronization primitives.  For example, here is the result of using a &lt;tt&gt;Semaphore&lt;/tt&gt; object (which is also implemented entirely in Python):&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;timeit("lock.acquire();lock.release()","from threading import Semaphore; lock = Semaphore(1)")&lt;/b&gt;
6.5345189571380615
&gt;&gt;&gt; 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;&lt;tt&gt;Condition&lt;/tt&gt; and &lt;tt&gt;Event&lt;/tt&gt; objects are also implemented entirely in Python. However, their implementation is also rather interesting.   Keep in mind that the primary purpose of a &lt;tt&gt;Condition&lt;/tt&gt; object is to perform signaling between threads.  Here is a very common scenario that you see with producer-consumer problems such as in the implementation of a queue.&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;from threading import Lock, Condition
from collections import deque

items      = deque()
items_cv   = Condition()

def producer():
    while True:
        # produce some item
        items_cv.acquire()
        items.append(item)
        items_cv.notify()
        items_cv.release()

def consumer():
    while True:
        items_cv.acquire()
        while not items:
            items_cv.wait()
        item = items.popleft()
        items_cv.release()
        # Do something with item
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Of particular interest here are the &lt;tt&gt;wait()&lt;/tt&gt; and &lt;tt&gt;notify()&lt;/tt&gt; operations that perform the thread signaling.  This signaling is actually carried out using a &lt;tt&gt;Lock&lt;/tt&gt; object.  When you wait on a condition variable, a new &lt;tt&gt;Lock&lt;/tt&gt; object is created and acquired.  The lock is then acquired again to force the thread to block.  If you look at the implementation of &lt;tt&gt;Condition&lt;/tt&gt; you find code like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;class Condition:
    ...
    def wait(self, timeout=None):
        ...
        waiter = _allocate_lock()
        waiter.acquire()
        self._waiters.append(waiter)
        ...
        waiter.acquire()       # Block
    ...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;The &lt;tt&gt;notify()&lt;/tt&gt; operation that wakes up a thread is carried out by simply releasing the waiter lock created above:&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;class Condition:
    ...
    def notify(self, n=1):
        waiters = self._waiters[:n]
        for waiter in waiters:
            waiter.release()
    ...
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Needless to say, a lot of processing is going on underneath the covers when you use something like a &lt;tt&gt;Condition&lt;/tt&gt; object in Python.   Every &lt;tt&gt;wait()&lt;/tt&gt; operation involves creating an entirely new lock object.  Signaling is carried out with &lt;tt&gt;acquire()&lt;/tt&gt; and &lt;tt&gt;release()&lt;/tt&gt; operations on that lock.  Moreover, there are additional locking operations carried out on the lock object associated with the condition variable itself.  Plus, consider that all of this high-level locking actually involves more locks and condition variables in C. &lt;/p&gt;&lt;p&gt;&lt;b&gt;Who Cares?&lt;/b&gt;&lt;/p&gt;&lt;p&gt;At this point, you might be asking yourself "who cares? This is all a bunch of low-level esoteric details."   However, I think that anyone who is serious about using threads in Python should take an interest in how the synchronization primitives are actually put together.&lt;/p&gt;&lt;p&gt;For one, a common rule of thumb with thread programming is to try and avoid the use of locks and synchronization primitives as much as possible.  This is certainly true in C, but even more so in Python.  The fact that almost all of the synchronization primitives are implemented in Python means that they are substantially slower than any comparable operations in a C/C++ threading library.  So, if you care about performance, using a lot of locks is something you'll definitely want to avoid.&lt;/p&gt;&lt;p&gt;The other reason to care about this concerns the &lt;tt&gt;Queue&lt;/tt&gt; module.  It is commonly advised that the Queue module be used as a means for exchanging data between threads because it already deals with all of the underlying synchronization. This is all well and good except for the fact that &lt;tt&gt;Queue&lt;/tt&gt; objects add even more layers to all of the synchronization primitives that we've talked about.   
In particular, the locking performed by a queue is done using a combination of locks and condition variables from the &lt;tt&gt;threading&lt;/tt&gt; module. &lt;/p&gt;&lt;p&gt;This means that if you're using queues, you're not really avoiding all of the overhead of locking.  Instead, you're just moving it to a different location where it's out of view.&lt;br /&gt;
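&lt;/p&gt;&lt;p&gt;To make that hidden locking concrete, here is a minimal sketch of an unbounded queue built directly from a &lt;tt&gt;Condition&lt;/tt&gt; and a &lt;tt&gt;deque&lt;/tt&gt;.  The class name &lt;tt&gt;TinyQueue&lt;/tt&gt; and the small driver at the bottom are made up purely for illustration, and this is not the actual &lt;tt&gt;Queue&lt;/tt&gt; source (the real module also deals with bounded sizes, timeouts, and task tracking).  It only shows the general pattern:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;
```python
from collections import deque
from threading import Condition, Thread

class TinyQueue(object):
    """Hypothetical unbounded FIFO queue; an illustration, not the real Queue."""
    def __init__(self):
        self._items = deque()
        self._cv = Condition()        # a condition variable with its own lock

    def put(self, item):
        with self._cv:                # acquire the condition's lock
            self._items.append(item)
            self._cv.notify()         # wake up one waiting thread, if any

    def get(self):
        with self._cv:
            while not self._items:    # re-check after every wakeup
                self._cv.wait()       # block until a producer calls notify()
            return self._items.popleft()

results = []

def consumer(q, n):
    # Pull n items off the queue and remember the order they arrived in.
    for _ in range(n):
        results.append(q.get())

q = TinyQueue()
t = Thread(target=consumer, args=(q, 3))
t.start()
for value in ("a", "b", "c"):
    q.put(value)
t.join()
print(results)
```
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Notice that every &lt;tt&gt;put()&lt;/tt&gt; and &lt;tt&gt;get()&lt;/tt&gt; goes through at least one acquire and release of the condition's lock, and a &lt;tt&gt;get()&lt;/tt&gt; on an empty queue additionally goes through the &lt;tt&gt;wait()&lt;/tt&gt; machinery described earlier.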
&lt;/p&gt;&lt;p&gt;One might wonder just how much overhead gets added by all of this.  For instance, a &lt;tt&gt;Queue&lt;/tt&gt; object is really just a wrapper around a &lt;tt&gt;collections.deque&lt;/tt&gt; with the added locking.   You can try a few performance experiments. For instance, inserting items:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;timeit("q.append(1)","from collections import deque; q = deque()")&lt;/b&gt;
0.17505884170532227
&gt;&gt;&gt; &lt;b&gt;timeit("q.put(1)","from Queue import Queue; q = Queue()")&lt;/b&gt;
4.4164938926696777
&gt;&gt;&gt; 
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/P&gt;&lt;p&gt;Here, you find that inserting into a &lt;tt&gt;Queue&lt;/tt&gt; is about 25 times slower than inserting into a &lt;tt&gt;deque&lt;/tt&gt;. You get similar figures for removing items.   Keep in mind that these simple benchmarks don't even cover the case of working with multiple threads where even more overhead would be added.&lt;br /&gt;
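&lt;/p&gt;&lt;p&gt;If you want to check the removal side yourself, a minimal sketch along the same lines might look like the following.  The item count &lt;tt&gt;N&lt;/tt&gt; is an arbitrary choice of mine, and the &lt;tt&gt;try/except&lt;/tt&gt; import is only there so that the snippet also runs on Python 3, where the module is named &lt;tt&gt;queue&lt;/tt&gt;.  The timings will naturally vary with the machine and interpreter version, so no output is shown:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;
```python
from timeit import timeit

N = 100000  # operations per measurement (an arbitrary choice)

# The setup code preloads N items so that every timed removal succeeds
# without blocking.
deque_setup = "from collections import deque; q = deque(range(%d))" % N
queue_setup = (
    "try:\n"
    "    from Queue import Queue      # Python 2\n"
    "except ImportError:\n"
    "    from queue import Queue      # Python 3\n"
    "q = Queue()\n"
    "for i in range(%d):\n"
    "    q.put(i)\n" % N
)

t_deque = timeit("q.popleft()", deque_setup, number=N)
t_queue = timeit("q.get()", queue_setup, number=N)

print("deque.popleft: %.4f s" % t_deque)
print("Queue.get:     %.4f s" % t_queue)
print("ratio:         %.1f" % (t_queue / t_deque))
```
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Note that &lt;tt&gt;timeit&lt;/tt&gt; is given an explicit &lt;tt&gt;number&lt;/tt&gt; here so that the timed statement never runs more often than there are items to remove.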
&lt;/p&gt;&lt;P&gt;&lt;b&gt;Some Final Thoughts&lt;/b&gt;&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;There surely seems to be an opportunity for some experimentation with better implementations of Python's thread synchronization primitives. For example, condition variables are a core component of Python's &lt;tt&gt;Semaphore&lt;/tt&gt;, &lt;tt&gt;Event&lt;/tt&gt;, and &lt;tt&gt;Queue&lt;/tt&gt; objects, yet Python makes no effort to use any kind of native implementation (e.g., pthreads condition variables).  Moreover, why is Python using custom implementations of synchronization objects already provided by the operating system and thread libraries (e.g., semaphores)?  Given that much of Python's thread implementation was worked out more than ten years ago, it would be interesting to perform some experiments and revisit the threading implementation on modern systems--especially in light of the increased interest in concurrency, multiple CPU cores, and other matters.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Anyways, that's it for now.   I'd love to hear your comments.  Also, if you are aware of prior work related to optimizing the threading library, benchmarks, or anything else that might be related, I'd be interested in links so that I can post them here. &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-1146030503757846605?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/1146030503757846605/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=1146030503757846605' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/1146030503757846605'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/1146030503757846605'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/09/python-thread-synchronization.html' title='Python Thread Synchronization Primitives : Not Entirely What You Think'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-8353460319128031213</id><published>2009-08-27T05:39:00.000-07:00</published><updated>2009-09-06T16:58:48.798-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='threads'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='gil'/><title type='text'>Inside the "Inside the Python GIL" Presentation</title><content 
type='html'>&lt;p&gt;On June 11, 2009 I gave a &lt;a href="http://www.dabeaz.com/python/GIL.pdf"&gt;presentation&lt;/a&gt; about the inner workings of the Python GIL at the &lt;a href="http://chipy.org/"&gt;Chicago Python user group&lt;/a&gt; meeting.   To be honest, I always expected the event to be a pretty low-key affair involving some local Python hackers and some beers.   However, the presentation went a little viral and I've received a number of requests to get the code modifications I made to investigate thread behavior--especially the traces that show thread switching and other details.&lt;/p&gt;&lt;p&gt;In this post, I'll briefly outline the code changes I made to generate the traces.  Before going any further, you should probably first view the original &lt;a href="http://blip.tv/file/2232410/"&gt;presentation&lt;/a&gt;. Also, as a disclaimer, none of these changes are easily packaged into a neat "patch" that one can simply download and install into any Python distribution.  So, to start, you should first go download a Python source distribution for the version of Python you want to experiment with.  For my talk, I was using Python 2.6.&lt;/p&gt;&lt;p&gt;First, let's talk about a major issue--any investigation of threads at a low level (especially thread scheduling) tends to be a rather tricky affair involving some kind of computer science variant of the uncertainty principle.   That is, once you start trying to observe thread behavior, you run the risk of changing the very thing you're trying to observe.  The problem gets worse if you add a lot of extra complexity--especially if there are extra system calls or I/O.   So, a major underlying concern was to try and devise a technique for recording thread behavior in a minimally invasive manner (as an aside, I considered the idea of trying to use dtrace for this, but decided that it would take longer for me to learn dtrace than it would to simply make a few minor modifications to the interpreter).  
&lt;/p&gt;&lt;p&gt;&lt;b&gt;Step 1: Defining time&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Everything that happens inside the Python interpreter is focused around the concept of "ticks."  Each tick loosely corresponds to a single instruction in the virtual machine.  Locate the file &lt;tt&gt;Python/ceval.c&lt;/tt&gt; in the Python source code.   In this file, you will find a global variable &lt;tt&gt;_Py_Ticker&lt;/tt&gt; holding the tick counter.  Here's what the code looks like:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre class="prettyprint"&gt;/* ceval.c */
...
int _Py_CheckInterval = 100;
volatile int _Py_Ticker = 0; /* so that we hit a "tick" first thing */
...&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Add a new variable declaration &lt;tt&gt;_Py_Ticker_Count&lt;/tt&gt; to this code so that it looks like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre class="prettyprint"&gt;/* ceval.c */
...
int _Py_CheckInterval = 100;
volatile int _Py_Ticker = 0; /* so that we hit a "tick" first thing */
&lt;span style="color:red;"&gt;volatile int _Py_Ticker_Count = 0;&lt;/span&gt;
...&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Later in the same file, you will find code that decrements the value of &lt;tt&gt;_Py_Ticker&lt;/tt&gt;.  Modify this code so that each time the countdown expires (that is, each time &lt;tt&gt;_Py_Ticker&lt;/tt&gt; drops below 0 and is reset), the value of &lt;tt&gt;_Py_Ticker_Count&lt;/tt&gt; is incremented.  Here's what it looks like:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;/* ceval.c */
...
  if (--_Py_Ticker &amp;lt; 0) {
   if (*next_instr == SETUP_FINALLY) {
    /* Make the last opcode before
       a try: finally: block uninterruptable. */
    goto fast_next_opcode;
   }
   _Py_Ticker = _Py_CheckInterval;
&lt;font color="red"&gt;   _Py_Ticker_Count++; &lt;/font&gt;
   tstate-&gt;tick_counter++;
...&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;The &lt;tt&gt;_Py_Ticker_Count&lt;/tt&gt; and &lt;tt&gt;_Py_Ticker&lt;/tt&gt; variables together define a kind of internal clock.   &lt;tt&gt;_Py_Ticker&lt;/tt&gt; is a countdown to the next time the interpreter might thread-switch.   &lt;tt&gt;_Py_Ticker_Count&lt;/tt&gt; keeps track of how many times the interpreter has actually signaled the operating system to schedule waiting threads (if any).  In the traces that follow, these two values are used together to record the sequence of events that occur in terms of interpreter ticks.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Step 2 : Recording Trace Data&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Python defines a general purpose lock object that is used for both the GIL and the locking primitives in the threading module.  On Unix systems using pthreads, the implementation of the lock can be found in the file &lt;tt&gt;Python/thread_pthread.h&lt;/tt&gt;.   In that file, there are two functions that we are going to modify:  &lt;tt&gt;PyThread_acquire_lock()&lt;/tt&gt; and &lt;tt&gt;PyThread_release_lock()&lt;/tt&gt;.&lt;/p&gt;&lt;p&gt;Here's the general idea : The lock/unlock functions are instrumented to record a large in-memory trace of lock-related events.  These include lock entry (when a thread first tries to acquire a lock), busy (when the lock is busy), retry (a repeated failed attempt to acquire a lock), acquire (lock successfully acquired), and release (lock released).  In addition to events, the trace records current values of the &lt;tt&gt;_Py_Ticker&lt;/tt&gt; and &lt;tt&gt;_Py_Ticker_Count&lt;/tt&gt; variables as well as the pointer to the currently executing thread.&lt;/p&gt;&lt;p&gt;All trace data is stored entirely in memory as programs execute.   The size of the history can be controlled with a macro in the code.  To dump the trace, a function &lt;tt&gt;print_history()&lt;/tt&gt; is registered to execute on interpreter exit using the &lt;tt&gt;atexit()&lt;/tt&gt; call.    
It is important to emphasize that no I/O occurs as programs are executing--traces are only dumped on program exit.&lt;/p&gt;&lt;p&gt;Here is a copy of the modified code.  Be aware that &lt;tt&gt;thread_pthread.h&lt;/tt&gt; is a bit of a mess and that there are a few different implementations of locks.  This code is meant to go in the non-semaphore implementation of locks.    Further discussion appears afterwards.&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;/* thread_pthread.h */
...
&lt;span style="color:red;"&gt;/* Thread lock monitoring modifications (beazley) */

#include &amp;lt;sys/resource.h&gt;
#include &amp;lt;sched.h&gt;

#define MAXHISTORY 5000000
static int           thread_history[MAXHISTORY];
static unsigned char tick_history[MAXHISTORY];
static int           tick_count_history[MAXHISTORY];
static unsigned char tick_acquire[MAXHISTORY];
static double        time_history[MAXHISTORY];
static unsigned int  history_count = 0;

#define EVENT_ENTRY   0
#define EVENT_BUSY    1
#define EVENT_RETRY   2
#define EVENT_ACQUIRE 3
#define EVENT_RELEASE 4

static char *_codes[] = {"ENTRY","BUSY","RETRY","ACQUIRE","RELEASE" };

static void print_history(void) {
 int i;
 FILE *f;

 f = fopen("tickhistory.txt","w");
 if (f == NULL) return;   /* nothing to dump if the file can't be opened */
 for (i = 0; i &amp;lt; history_count; i++) {
   fprintf(f,"%x %d %d %s %0.6f\n",thread_history[i],tick_history[i],tick_count_history[i],_codes[tick_acquire[i]],time_history[i]);
 }
 fclose(f);
}

/* External variables recorded in the history */
extern volatile int _Py_Ticker;
extern volatile int _Py_Ticker_Count;

&lt;/span&gt;
int
PyThread_acquire_lock(PyThread_type_lock lock, int waitflag)
{
 int success;
 pthread_lock *thelock = (pthread_lock *)lock;
 int status, error = 0;
 &lt;span style="color:red;"&gt;int start_thread = 0;

 if (history_count == 0) {
   atexit(print_history);
 }&lt;/span&gt;

 dprintf(("PyThread_acquire_lock(%p, %d) called\n", lock, waitflag));

 status = pthread_mutex_lock( &amp;amp;thelock-&gt;mut );

 &lt;span style="color:red;"&gt;/* Record information in the log */
 start_thread = (int) pthread_self(); 
 if (history_count &amp;lt; MAXHISTORY) {
   thread_history[history_count] = start_thread;
   tick_history[history_count] = _Py_Ticker;
   tick_count_history[history_count] = _Py_Ticker_Count;
   time_history[history_count] = 0.0;
   tick_acquire[history_count++] = EVENT_ENTRY;
 }&lt;/span&gt;

 CHECK_STATUS("pthread_mutex_lock[1]");
 success = thelock-&gt;locked == 0;

 if ( !success &amp;amp;&amp;amp; waitflag ) {

   &lt;span style="color:red;"&gt;int ntries = 0;&lt;/span&gt;
  /* continue trying until we get the lock */

  /* mut must be locked by me -- part of the condition
   * protocol */

  while ( thelock-&gt;locked ) {
    &lt;span style="color:red;"&gt;if (ntries == 0) {
      if (history_count &amp;lt; MAXHISTORY) {
        thread_history[history_count] = start_thread;
        tick_history[history_count] = _Py_Ticker;
        tick_count_history[history_count] = _Py_Ticker_Count;
        time_history[history_count] = 0.0;
        tick_acquire[history_count++] = EVENT_BUSY;
      }&lt;/span&gt;
    }

   status = pthread_cond_wait(&amp;amp;thelock-&gt;lock_released,
         &amp;amp;thelock-&gt;mut);
   CHECK_STATUS("pthread_cond_wait");
   if (thelock-&gt;locked) {
     &lt;span style="color:red;"&gt;if (history_count &amp;lt; MAXHISTORY) {
       thread_history[history_count] = start_thread;
       tick_history[history_count] = _Py_Ticker;
       tick_count_history[history_count] = _Py_Ticker_Count;
       time_history[history_count] = 0.0;
       tick_acquire[history_count++] = EVENT_RETRY;
       ntries += 1;
     }&lt;/span&gt;
   } else {
     &lt;span style="color:red;"&gt;if (history_count &amp;lt; MAXHISTORY) {
       thread_history[history_count] = start_thread;
       tick_history[history_count] = _Py_Ticker;
       tick_count_history[history_count] = _Py_Ticker_Count;
       {
         struct timeval t;
#ifdef GETTIMEOFDAY_NO_TZ
         if (gettimeofday(&amp;amp;t) == 0)
    time_history[history_count] = (double)t.tv_sec + t.tv_usec*0.000001;
#else /* !GETTIMEOFDAY_NO_TZ */
         if (gettimeofday(&amp;amp;t, (struct timezone *)NULL) == 0)
    time_history[history_count] = (double)t.tv_sec + t.tv_usec*0.000001;
#endif /* !GETTIMEOFDAY_NO_TZ */
       }
       tick_acquire[history_count++] = EVENT_ACQUIRE;
     }&lt;/span&gt;
   }

  }
  success = 1;
 } else {&lt;span style="color:red;"&gt;
   if (history_count &amp;lt; MAXHISTORY) {
     thread_history[history_count] = start_thread;
     tick_history[history_count] = _Py_Ticker;
     tick_count_history[history_count] = _Py_Ticker_Count;
     time_history[history_count] = 0.0;
     tick_acquire[history_count++] = EVENT_ACQUIRE;
   }&lt;/span&gt;
 }
 if (success) thelock-&gt;locked = 1;
 status = pthread_mutex_unlock( &amp;amp;thelock-&gt;mut );
 CHECK_STATUS("pthread_mutex_unlock[1]");

 if (error) success = 0;
 dprintf(("PyThread_acquire_lock(%p, %d) -&gt; %d\n", lock, waitflag, success));
 return success;
}

void
PyThread_release_lock(PyThread_type_lock lock)
{
 pthread_lock *thelock = (pthread_lock *)lock;
 int status, error = 0;

 dprintf(("PyThread_release_lock(%p) called\n", lock));

 status = pthread_mutex_lock( &amp;amp;thelock-&gt;mut );
 CHECK_STATUS("pthread_mutex_lock[3]");
 &lt;span style="color:red;"&gt;
 if (history_count &amp;lt; MAXHISTORY) {
   thread_history[history_count] = (int) pthread_self();
   tick_history[history_count] = _Py_Ticker;
   tick_count_history[history_count] = _Py_Ticker_Count;
   tick_acquire[history_count++] = EVENT_RELEASE;
 }&lt;/span&gt;

 thelock-&gt;locked = 0;

 status = pthread_mutex_unlock( &amp;amp;thelock-&gt;mut );
 CHECK_STATUS("pthread_mutex_unlock[3]");

 /* wake up someone (anyone, if any) waiting on the lock */
 status = pthread_cond_signal( &amp;amp;thelock-&gt;lock_released );
 CHECK_STATUS("pthread_cond_signal");
}&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Step 3 : Rebuilding and Running Python&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Once you have made the above changes, rebuild the Python interpreter and run it on some sample code.  The code should run the same as before, but on program exit, you will get a huge data file &lt;tt&gt;tickhistory.txt&lt;/tt&gt; dumped into the current working directory.    The contents of this file are going to look something like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;a0811720 8 1299 RELEASE 0.000000
a0811720 15 1302 ENTRY 0.000000
a0811720 15 1302 ACQUIRE 0.000000
a0811720 10 1302 ENTRY 0.000000
a0811720 10 1302 ACQUIRE 0.000000
a0811720 10 1302 RELEASE 0.000000
a0811720 7 1302 ENTRY 0.000000
a0811720 7 1302 ACQUIRE 0.000000
b0081000 7 1302 ENTRY 0.000000
b0081000 7 1302 ACQUIRE 0.000000
b0081000 7 1302 RELEASE 0.000000
b0081000 7 1302 ENTRY 0.000000
b0081000 7 1302 ACQUIRE 0.000000
b0081000 7 1302 RELEASE 0.000000
b0081000 7 1302 ENTRY 0.000000
b0081000 7 1302 BUSY 0.000000
a0811720 1 1302 RELEASE 0.000000
a0811720 1 1302 ENTRY 0.000000
a0811720 1 1302 ACQUIRE 0.000000
a0811720 1 1302 ENTRY 0.000000
a0811720 1 1302 ACQUIRE 0.000000
a0811720 100 1303 RELEASE 0.000000
a0811720 100 1303 ENTRY 0.000000
a0811720 100 1303 ACQUIRE 0.000000
a0811720 92 1303 RELEASE 0.000000
a0811720 92 1303 ENTRY 0.000000
a0811720 92 1303 ACQUIRE 0.000000
a0811720 92 1303 ENTRY 0.000000
a0811720 92 1303 ACQUIRE 0.000000
...&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Be forewarned--the size of this file can be substantial.  Running a threaded program for even 10-20 seconds might generate a trace file that contains 3-4 million events.  To do any kind of analysis on it, you'll probably want to do what everyone normally does and write a Python script. &lt;/p&gt;&lt;p&gt;&lt;b&gt;Discussion&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Interpreting the contents of the trace file is left as an exercise for the reader. However, here are a few tips.  First, the normal sequence of lock acquisition and release on the GIL with a CPU-bound thread looks something like this (notice that the &lt;tt&gt;_Py_Ticker&lt;/tt&gt; value in the 2nd column is always 100 and that the lock goes through a repeated ENTRY-&gt;ACQUIRE-&gt;RELEASE cycle): &lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;a000d000 100 3570 ENTRY 0.000000
a000d000 100 3570 ACQUIRE 0.000000
a000d000 100 3571 RELEASE 0.000000
a000d000 100 3571 ENTRY 0.000000
a000d000 100 3571 ACQUIRE 0.000000
a000d000 100 3572 RELEASE 0.000000
a000d000 100 3572 ENTRY 0.000000
a000d000 100 3572 ACQUIRE 0.000000
a000d000 100 3573 RELEASE 0.000000
...&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;If you're looking at thread contention, you're going to see a trace that has an event series of ENTRY-&gt;BUSY-&gt;RETRY-&gt;...-&gt;RETRY-&gt;ACQUIRE-&gt;RELEASE like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&lt;span style="color:red;"&gt;a000d000 48 4794 ENTRY 0.000000&lt;/span&gt;
&lt;span style="color:red;"&gt;a000d000 48 4794 BUSY 0.000000&lt;/span&gt;
7091800 32 4794 RELEASE 0.000000
7069a00 32 4794 ACQUIRE 1251397338.473370
7091800 32 4794 ENTRY 0.000000
7091800 32 4794 BUSY 0.000000
&lt;span style="color:red;"&gt;a000d000 32 4794 RETRY 0.000000&lt;/span&gt;
7069a00 100 4795 RELEASE 0.000000
7069a00 100 4795 ENTRY 0.000000
7069a00 100 4795 ACQUIRE 0.000000
&lt;span style="color:red;"&gt;a000d000 66 4795 RETRY 0.000000&lt;/span&gt;
7069a00 100 4796 RELEASE 0.000000
7069a00 100 4796 ENTRY 0.000000
7069a00 100 4796 ACQUIRE 0.000000
&lt;span style="color:red;"&gt;a000d000 95 4796 RETRY 0.000000&lt;/span&gt;
7069a00 100 4797 RELEASE 0.000000
7069a00 100 4797 ENTRY 0.000000
7069a00 100 4797 ACQUIRE 0.000000
...
&lt;span style="color:red;"&gt;a000d000 100 5083 ACQUIRE 1251397338.478188&lt;/span&gt;
...&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt; Here are some other notes concerning its analysis: &lt;ul&gt;&lt;li&gt;The first column is the hex identifier of the thread performing the operation (the value of &lt;tt&gt;pthread_self()&lt;/tt&gt; recorded by the code above), not the address of a lock.  If you run a threaded program that uses many different locks, you will be tracing not only the GIL, but every lock in the program, with all of their events interleaved.   You might be able to use this to investigate lock contention.&lt;/li&gt;
&lt;li&gt;The GIL is not specifically identified in the trace file.  However, it will be one of the first locks used.   &lt;/li&gt;
&lt;li&gt;The last column of the trace file is a system timer that is only recorded when a lock is acquired after the thread had to wait for it.   At some point, I was using this to investigate some issues related to response times, but to be honest, I didn't spend much time exploring that angle.   It might be useful if you want to get an idea of how long each thread runs before giving up control.   Of course, you may just want to comment that code out.&lt;/li&gt;
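&lt;li&gt;As a starting point for the kind of analysis script mentioned earlier, here is a minimal sketch that counts how many events of each type were recorded for each value in the first column.  The function name &lt;tt&gt;summarize()&lt;/tt&gt; is made up for illustration, and the sample rows are copied from the contention trace above.  Treat it as a rough outline rather than a finished tool: &lt;blockquote&gt;&lt;pre&gt;
```python
from collections import defaultdict

def summarize(lines):
    """Count events per first-column identifier.

    Each line is expected to look like: 'a000d000 100 3570 ENTRY 0.000000'
    """
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue                  # skip blank or malformed lines
        ident, ticker, tick_count, event, timestamp = fields
        counts[ident][event] += 1
    return counts

# A few rows copied from the contention trace shown above.
sample = """\
a000d000 48 4794 ENTRY 0.000000
a000d000 48 4794 BUSY 0.000000
7091800 32 4794 RELEASE 0.000000
7069a00 32 4794 ACQUIRE 1251397338.473370
a000d000 32 4794 RETRY 0.000000
a000d000 66 4795 RETRY 0.000000
""".splitlines()

summary = summarize(sample)
for ident in sorted(summary):
    print("%s %s" % (ident, dict(summary[ident])))

# For a real run, pass the open trace file instead of the sample:
#     with open("tickhistory.txt") as f:
#         summary = summarize(f)
```
&lt;/pre&gt;&lt;/blockquote&gt;Because the function only iterates over lines, it reads the trace file one line at a time and should cope with traces containing millions of events.&lt;/li&gt;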
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Other Comments&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Since giving the presentation, I've received a few comments through email offering suggestions for a GIL fix.   I stand by my earlier assertion that there is no easy fix for the problem described in the presentation.   Here are some specific suggestions followed by my response: &lt;ul&gt;&lt;li&gt;"Perhaps the GIL could be fixed by adding some kind of scheduling queue."   If you were to add a scheduling queue to the GIL, you would effectively turn it into a kind of poorly implemented mutex lock.    Mutex locks are already implemented (by pthreads and the OS) using queues in order to avoid thread starvation.   More details can be found in an operating system textbook.  You might also look at the &lt;a href="http://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm"&gt;Bakery Algorithm&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;"Perhaps the GIL could be fixed by simply using a mutex lock."   As just mentioned, mutex locks are generally implemented using a queuing mechanism.  If you do this, runnable threads will always context switch every 100 interpreter ticks (you'll see the threads cycling in a round-robin manner).  This will definitely eliminate the multicore contention problem, but now your programs will perform a tremendous amount of context switching.  Also, you might lose the high scheduling priority of I/O bound threads.    Needless to say, there are some downsides that need to be considered (just for the record, I think the use of a condition variable in the current implementation is probably the best overall solution for running on a single CPU). &lt;/li&gt;
&lt;li&gt;"Could you fix the problem by telling the operating system to schedule all threads on the same core?"  Short answer: No.  C extensions to Python (and even significant parts of Python itself) often release the GIL by design so that they can run concurrently while carrying out work that doesn't directly involve the Python interpreter.   If you force everything to one core, you will most likely make these programs run worse, not better.&lt;/li&gt;
&lt;/ul&gt;&lt;/p&gt;&lt;p&gt;&lt;b&gt;Final Words&lt;/b&gt;&lt;/p&gt;&lt;p&gt;As mentioned in the presentation, deep exploration of the Python GIL is not a project I'm actively working on.  In fact, all of this was really just an exploration to find out how the GIL works and to see if I could track down pathological performance for a certain test case on my Mac.    Feel free to take this code and hack it in any way that you wish.  If it proves to be useful, just give me an acknowledgment when you give your PyCon presentation.   Have fun!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-8353460319128031213?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/8353460319128031213/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=8353460319128031213' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8353460319128031213'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/8353460319128031213'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/08/inside-inside-python-gil-presentation.html' title='Inside the &quot;Inside the Python GIL&quot; Presentation'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-7604178517001523239</id><published>2009-08-09T15:32:00.001-07:00</published><updated>2009-09-06T17:03:35.031-07:00</updated><title type='text'>Python 
Binary I/O Handling</title><content type='html'>&lt;p&gt;As a followup to my last post about the Essential Reference, I thought I'd talk about the one topic that I wish I had addressed in more detail in my book--and that's the subject of binary data and I/O handling.   Let me elaborate. &lt;/p&gt;&lt;p&gt;One of the things that interests me a lot right now is the subject of concurrent programming.  In the early 1990's, I spent a lot of time writing big physics simulation codes for Connection Machines and Crays.   All of those programs had massive parallelism (e.g., 1000s of processors) and were based largely on message-passing.    In fact, my first use of Python was to control a large massively parallel C program that used MPI.  Now, we're starting to see message passing concepts incorporated into the Python standard library.   For example, I think the inclusion of the multiprocessing library is probably one of the most significant additions to the Python core that has occurred in the past 10 years. &lt;/p&gt;&lt;p&gt;A major aspect of message passing concerns the problem of quickly getting data from point A to point B.   Obviously, you want to do it as fast as possible.  A high speed connection helps.  However, it also helps to eliminate as much processing overhead as possible.  Such overhead can come from many places--decoding data, copying memory buffers, and so forth. &lt;/p&gt;&lt;p&gt;Python makes it pretty easy to pass data around between processes.  For example, you can use the pickle module, json, XML-RPC, or some other similar mechanism.  However, all of these approaches involve a significant amount of overhead to encode and decode data.   You probably wouldn't want to use them for any kind of bulk data transfer (e.g., if you wanted to send a large array of floats between processes).  
Nor would you really want to use them for some kind of high-performance networking on a big cluster.&lt;/p&gt;&lt;p&gt;Lurking within the Python standard library, however, is another way to deal with data in messaging and interprocess communication.   It's just spread out in a way that's not entirely obvious unless you're looking for it (and even then it's still pretty subtle).    Let's start with the &lt;a href="http://docs.python.org/library/ctypes.html"&gt;ctypes&lt;/a&gt; library.   I always assumed that ctypes was all about accessing C libraries from Python (an alternative approach to &lt;a href="http://www.swig.org/"&gt;Swig&lt;/a&gt;).  That's only part of the story.  For instance, using ctypes, you can define binary data structures:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;from ctypes import *
class Point(Structure):
     _fields_ = [ ('x',c_double), ('y',c_double), ('z',c_double) ]&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;This defines an object representing a C data structure.  You can even create and manipulate such objects just like an ordinary Python class:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;p = Point(2,3.5,6)&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;p.x&lt;/b&gt;
2.0
&gt;&gt;&gt; &lt;b&gt;p.y&lt;/b&gt;
3.5
&gt;&gt;&gt; &lt;b&gt;p.z = 7&lt;/b&gt;
&gt;&gt;&gt;
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;However, keep in mind that under the covers, this is manipulating a C structure represented in a contiguous block of memory. &lt;/p&gt;&lt;p&gt;Now this is where things start to get interesting.   I wonder how many Python programmers know that they can directly write a ctypes data structure onto a file opened in binary mode.   For example, you can take the point above and do this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;f = open("foo","wb")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;f.write(p)&lt;/b&gt;       
&gt;&gt;&gt; &lt;b&gt;f.close()&lt;/b&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Not only that, you can read the file directly back into a ctypes structure if you use the poorly documented readinto() method of files.&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;g = open("foo","rb")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;q = Point()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;g.readinto(q)&lt;/b&gt;
24
&gt;&gt;&gt; &lt;b&gt;q.x&lt;/b&gt;
2.0
&gt;&gt;&gt; &lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;The mechanism that makes all of this work is Python's so-called "buffer protocol."   Since ctypes structures are contiguous in memory, I/O operations can be performed directly with that memory without making copies or first converting such structures into strings as you might do with something like the struct module.   The buffer protocol simply exposes the underlying memory buffers for use in I/O.&lt;/p&gt;&lt;p&gt;Direct binary I/O like this is not limited to files.  If &lt;tt&gt;s&lt;/tt&gt; is a socket, you can perform similar operations like this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;p = Point(2,3,4)           #  Create a point
s.send(p)                  #  Send across a socket

q = Point()
s.recv_into(q)             #  Receive directly into q
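
# A minimal, self-contained sketch of the same exchange.
# Assumptions: Python 3, and socket.socketpair() standing in for a
# real network connection so that both ends live in one process.
import socket
from ctypes import Structure, c_double, sizeof

class Point(Structure):
    _fields_ = [('x', c_double), ('y', c_double), ('z', c_double)]

a, b = socket.socketpair()     # two connected sockets
a.sendall(Point(2, 3, 4))      # sends the structure's raw memory
q = Point()
nbytes = b.recv_into(q)        # fills q in place; robust code would loop on short reads
a.close()
b.close()
print(nbytes == sizeof(Point), q.x, q.y, q.z)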
&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;If that wasn't enough to make your brain explode, similar functionality is provided by the multiprocessing library as well.  For example, Connection objects (as created by the multiprocessing.Pipe() function) have send_bytes() and recv_bytes_into() methods that also work directly with ctypes objects.   Here's an experiment to try.  Start two different Python interpreters and define the &lt;tt&gt;Point&lt;/tt&gt; structure above.    Now, try sending a point through a multiprocessing connection object: &lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;p = Point(2,3,4)&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;from multiprocessing.connection import Listener&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;serv = Listener(("",25000),authkey="12345")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;c = serv.accept()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;c.send_bytes(p)&lt;/b&gt;
&gt;&gt;&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;In the other Python process, do this:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;&lt;pre&gt;&gt;&gt;&gt; &lt;b&gt;q = Point()&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;from multiprocessing.connection import Client&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;c = Client(("",25000),authkey="12345")&lt;/b&gt;
&gt;&gt;&gt; &lt;b&gt;c.recv_bytes_into(q)&lt;/b&gt;
24
&gt;&gt;&gt; &lt;b&gt;q.x&lt;/b&gt;
2.0
&gt;&gt;&gt; &lt;b&gt;q.y&lt;/b&gt;
3.0
&gt;&gt;&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;As you can see, the point defined in one process has been directly transferred to the other. &lt;/p&gt;&lt;p&gt;If you put all of the pieces of this together, you find that there is this whole binary handling layer lurking under the covers of Python.   If you combine it with something like ctypes, you'll find that you can directly pass binary data structures such as C structures and arrays around between different interpreters.  Moreover, if you combine this with C extensions, it seems to be possible to pass data around without a lot of extra overhead.   Finally, if that wasn't enough, it turns out that some popular extensions such as numpy also play in this arena.  For instance, in certain cases you can perform similar direct I/O operations with numpy arrays (e.g., directly passing arrays through multiprocessing connections).&lt;/p&gt;&lt;p&gt;I think that this functionality is pretty interesting--and highly relevant to anyone who is thinking about parallel processing and messaging.   However, all of this is also somewhat unsettling.  For one, much of this functionality is very poorly documented in the Python documentation (and in my book for that matter).     If you look at the documentation for methods such as the readinto() method of files, it simply says "undocumented, don't use it."  The buffer interface, which makes much of this work, has always been rather obscure and poorly understood--although it got a redesign in Python 3.0 (see Travis Oliphant's &lt;a href="http://www.youtube.com/watch?v=10smLBD0kXg"&gt;talk&lt;/a&gt; from PyCon).    And if it wasn't complicated enough already, much of this functionality gets tied into the bytes/Unicode handling part of Python--a hairy subject on its own.  &lt;/p&gt;&lt;p&gt;To wrap up, I think much of what I've described here represents a part of Python that probably deserves more investigation (and at the very least, more documentation).  
Unfortunately, I only started playing around with this recently--too late for inclusion in the Essential Reference (which was already typeset and out the door).    However, I'm thinking it might be a good topic for a  PyCon tutorial.   Stay tuned. &lt;/p&gt;&lt;p&gt;Note: If anyone has links to articles or presentations about this, let me know and I'll add them here.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-7604178517001523239?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/7604178517001523239/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=7604178517001523239' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7604178517001523239'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/7604178517001523239'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/08/python-binary-io-handling.html' title='Python Binary I/O Handling'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4272389616836173946</id><published>2009-08-09T08:26:00.000-07:00</published><updated>2009-09-06T17:02:07.789-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='essential reference'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Essential Misconceptions</title><content type='html'>&lt;p&gt;A few days ago, 
Mike Riley posted a great &lt;a href="http://dobbscodetalk.com/index.php?option=com_myblog&amp;amp;task=view&amp;amp;id=1696&amp;amp;Itemid=96"&gt;review&lt;/a&gt; of the new "Python Essential Reference, 4th Edition" on Dr. Dobb's CodeTalk.   In that review, he writes:&lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;"While the author could have taken the easy path of regurgitating the online documentation, he has instead reworked the explanation for each class and function call in the Python core library with commendable clarity, frequently accompanying these detailed examinations with extremely useful and meaningful code examples.  The book is also very well designed and organized, making it a snap to find information within a matter of seconds."&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;This is a reviewer who really gets what this book is about.  However, for every great review like this, I also encounter comments that simply dismiss the book out of hand, saying it "offers nothing" over Python's online documentation.  With all due respect to Python's fine documentation, I beg to differ.   &lt;/p&gt;&lt;p&gt;First and foremost, I've always viewed the Python Essential Reference as a serious programming reference for myself (yes, I always have a copy next to my desk and I use it regularly).  Although I will admit that Python certainly has a lot of online documentation, it's also missing a lot of essential details. For example, I can't count the number of times I've looked at the online documentation for something only to have to go out and do some kind of extended Google search to fill in a missing detail (or worse, having to load the source code for some module and look through it).&lt;/p&gt;&lt;p&gt;Let's look at an example.   Suppose you're writing some networking code with the socket module and you want to use the recv(bufsize [, flags]) method of a socket.  
If you head off to the &lt;a href="http://docs.python.org/library/socket.html"&gt;online documentation&lt;/a&gt; you will certainly find some information. &lt;/p&gt;&lt;p&gt;&lt;blockquote&gt;"Receive data from the socket. The return value is a string representing the data received. The maximum amount of data to be received at once is specified by bufsize. See the Unix manual page recv(2) for the meaning of the optional argument flags; it defaults to zero."&lt;/blockquote&gt;&lt;/p&gt;&lt;p&gt;Yes, this is all very useful.   Especially that part about having to refer to a Unix man page.  I'm sure the Windows programmers find that especially useful.    If you turn to the Essential Reference p. 483, you'll not only find a description, but you will also get a complete table showing you exactly what can be given for flags along with a brief description of each option.   This approach is found throughout the book--with few exceptions are readers simply referred to other documentation.  As another example, I would challenge anyone to effectively use something like the setsockopt() or getsockopt() methods of a socket using nothing but Python's online docs.&lt;/p&gt;&lt;p&gt;The other thing that I've tried to do in the book is answer all sorts of questions about tricky interactions between different parts of Python.  Take, for example, this question:  Can a separate execution thread safely close a generator/coroutine function by invoking the generator's close() method?   Sure, that's not the kind of question that comes up every day, but if you know a thing or two about generators and coroutines, you'll know that they are often used in the context of concurrent programming, just like threads.  Not only that, threads and generators might be used together (for example, using threads to carry out blocking operations).  
Thus, it is reasonable to assume that programmers working with both threads and generators in the same program might start to wonder about their possible interaction.  I know I did.&lt;/p&gt;&lt;p&gt;If you try to find an answer to this question using the online documentation, you will be searching for some time and probably come up with nothing.  Although there is plenty of discussion about generators, the yield statement, and other matters, you really don't find much about generators and threads mixed together.  Even PEP 342, the official specification that introduced the generator close() method says nothing on this matter.&lt;/p&gt;&lt;p&gt;Now, let's look at the Essential Reference.  First, if you turn to the index and look up "Threads", you will find about a half-page of subentries.  In fact, there is even an entry labeled "Threads: close() method of generators, p. 104."    If you turn to p. 104, you will find a sentence "if a program is currently iterating on a generator, you should not call close() asynchronously on that generator from a separate thread of execution or from a signal handler."   &lt;/p&gt;&lt;p&gt;This is certainly not the only example, but there are a wide variety of similar questions that I try to address.  For example, can you use a decorator with a recursive function? (p. 113).  Or what is the interaction between the __slots__ feature of a class and inheritance? (p. 133).  Or, does the name mangling of private attributes (e.g., __foo) in a class introduce a runtime performance penalty? (p. 128).  All of these questions fall into a general category of issues related to the "side-effects" of using various Python features.   Although you can find some of this in the online docs, it is often scattered and incomplete.  I've tried to fix that.&lt;/p&gt;&lt;p&gt;Finally, I've really tried to make the Essential Reference a kind of programming "cookbook" of sorts.   
Although its primary goal is to be a reference, I have also incorporated a wide variety of practical examples from the Python training courses that I run.   For instance,  if you know about the &lt;a href="http://www.dabeaz.com/generators"&gt;Generators&lt;/a&gt; or &lt;a href="http://www.dabeaz.com/coroutines"&gt;Coroutines&lt;/a&gt; tutorials I presented at PyCon, you'll find similar information.  I also include examples that explore tricky interactions and customization features of certain library modules.    For example, how do I customize an XML-RPC server to only accept connections from known IP addresses? (p. 494).   Or how do I use the ssl module to implement a secure server?  (p. 489).    Many of these examples are related to things that I've had to figure out once before, but can never quite remember on a day-to-day basis.  Putting them in the book helps me remember how to do a variety of tricky things.&lt;/p&gt;&lt;p&gt;So, that's about it.   I hope people find the book to be useful.  If so, tell your friends.  If not, feel free to use it for propping up some uneven furniture.    
Just don't say that it's the same as the online docs.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4272389616836173946?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4272389616836173946/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4272389616836173946' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4272389616836173946'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4272389616836173946'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/08/essential-misconceptions.html' title='Essential Misconceptions'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-36456651.post-4937612723006367015</id><published>2009-08-09T08:01:00.000-07:00</published><updated>2009-08-09T08:21:59.908-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='introduction'/><title type='text'>First post</title><content type='html'>Well, this is my blog.  Welcome!   I have to admit that I've never been much of a blogger in the past--preferring to focus my energy on writing books and giving conference presentations.  However, I'll probably use this space to post occasional technical articles about projects that I'm working on as well as followups to my presentations.  
Enjoy!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/36456651-4937612723006367015?l=dabeaz.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dabeaz.blogspot.com/feeds/4937612723006367015/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=36456651&amp;postID=4937612723006367015' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4937612723006367015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/36456651/posts/default/4937612723006367015'/><link rel='alternate' type='text/html' href='http://dabeaz.blogspot.com/2009/08/first-post.html' title='First post'/><author><name>Dave Beazley</name><uri>http://www.blogger.com/profile/02802905126181462140</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry></feed>
