Wednesday, December 22, 2010

Writing clean, testable, high quality code in Python

Introduction

Writing software is among the most complicated endeavors a human can undertake. Brian Kernigan, co-author of the AWK programming language and "K and R C", sumed up the true nature of software development in the book, Software Tools, when he stated, "Controlling complexity is the essence of software development." The harsh reality of real world software development is that software is often created with intentional, or unintentional, complexity and a disregard for maintainability, testability, and quality. The end result of this unfortunate reality is software that can become increasingly difficult and expensive to maintain and that fails sporadically and even spectacularly.

The first step in the process of writing high quality code is to re-examine the entire thought process of how an individual or team develops software. Often in failed, or troubled, software development projects, the software was developed in a reactionary stream of consciousness where the focus of the software development was on getting a problem solved in any manner possible. In a successful software project, the developer is thinking not only about how to solve the problem at hand, but additionally about the process involved in solving the problem.

A succesful software developer will devise a way to run the tests in an easily automated fashion, so they can continuously prove the software works. They are aware of the dangers of needless complexity. They are humble in their approach, seek critical review, and expect refactoring at every step of the way. They continuously think about how they can ensure their software is testable, readable, and maintainable. Although Python the language, and Python the community, are heavily influenced by desire to write clean, maintainable code that works, it is still quite easy to do the exact opposite. In this article, we will tackle this problem head on and explore how to write clean, testable, high quality code in Python.
A clean code hypothetical problem

The best way to demonstrate this style of development is to solve a hypothetical problem. Let's suppose you are a back-end web developer at a company that allows users to generate reviews, and you need to come up with a way to show and highlight small snippets of those reviews. One way to approach the problem would be to write a large function that takes a snippet of text, and query parameters, and returns back a character limited snippet with the query parameters highlighted. All of the logic needed to solve the problem would be included in the one "mega" function, and you would simply need to keep rerunning your script, until you got the result you wanted. The format would probably look like the code example below and would often be developed with a combination of print statements, or logging statements, and an interactive shell.

Messy code

def my_mega_function(snippet, query)
   """This takes a snippet of text, and a query parameter and returns """
   #Logic goes here, and often runs on for several hundred lines
   #There are often deeply nested conditional statements and loops
   #Function could reach several hundred, if not thousands of lines
 
   return result

With a dynamic language like Python, Perl, or Ruby, it is easy to develop software by simply banging away at the problem, often interactively, until you get what seems to be the correct result and calling it a day. Unfortunately, this approach, while tempting, often leads to a false sense of accomplishment that is fraught with danger. Much of the danger lies in not designing a solution to be testable, and part lies in not properly controlling the complexity of the software written.

Read more: IBM