September 25th, 2007 | Published in Google Code
The recent release of the Python client library (version 1.0.8) contains some changes that I'm particularly proud of. I recently refactored all of the data model classes to improve efficiency and make the code cleaner. You might notice an improvement in XML parsing speed, but the real benefit is to those writing new code for the library (like Takashi Matsuo, for example). With some of the recent launches of new Google data APIs, there are a lot of new classes to write - and there are more on the way.
Refactoring can often be painful and boring, but in this case it was actually fun. I felt like I was moving to a new and better design, and the library had enough tests to tell me whether I had broken something. Along the way I learned a few things, and I thought I'd share them:
- Take advantage of metadata: I used a dictionary to map XML tags to class members and types so I could use a set of generic conversion methods to convert any XML into any of the data model classes. Not only does this mean less code, but it runs faster than the way I was doing things before.
- Unit tests are vital: When refactoring and rewriting major portions of an application, it is extremely easy to introduce bugs and break things, so having unit tests that catch these problems is very helpful. Thankfully, I had to add very few new test cases since we had written tests for the initial code.
- Plan ahead carefully: For about two weeks before this change I'd scribble some designs on a whiteboard or notepad at random free moments. I wrote a couple of test programs to check my proof of concept and measure efficiency improvements. In the end this meant that my final conversion was pretty painless.
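To make the first tip above more concrete, here is a minimal sketch of the metadata idea. It is not the library's actual code: the `Entry` class, the `_children` map, the `from_xml` function, and the `http://example.com/ns` namespace with its `count` element are all hypothetical names chosen for illustration. The only assumption is that each data model class declares a dictionary from XML tag to member name and type, so one generic function can populate any of them.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
EXAMPLE_NS = "http://example.com/ns"  # hypothetical extension namespace


class Entry:
    """Hypothetical data model class, for illustration only."""

    # Metadata: qualified XML tag -> (member name, converter for the text).
    _children = {
        "{%s}title" % ATOM_NS: ("title", str),
        "{%s}count" % EXAMPLE_NS: ("count", int),
    }

    def __init__(self):
        self.title = None
        self.count = None


def from_xml(cls, xml_text):
    """Build an instance of any class that declares a _children map."""
    root = ET.fromstring(xml_text)
    obj = cls()
    for child in root:
        rule = cls._children.get(child.tag)
        if rule is None or child.text is None:
            continue  # unknown or empty elements are skipped in this sketch
        member, convert = rule
        setattr(obj, member, convert(child.text))
    return obj


sample = (
    '<entry xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:x="http://example.com/ns">'
    "<title>Hello</title><x:count>3</x:count><other>ignored</other>"
    "</entry>"
)
entry = from_xml(Entry, sample)
```

With this shape, adding a new data model class mostly means writing a new `_children` dictionary; the conversion function stays the same.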
None of these is rocket science, but I think these tips made this project fun. I don't know if everyone else out there uses similar techniques to stay sane while coding, so I'm interested to hear what kinds of best practices you, the reader, recommend. Why not post them in the Google data Python contributors group?
P.S. For an example of how the code has changed, see the wiki page entitled Data Model Refactoring.