Introduction

This article collects good Python programming practices accumulated through day-to-day work.

Style

The Zen of Python

python -c 'import this'
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Readability Counts

WhiteSpace

  • 4 spaces per indentation level.
  • No hard tabs.
  • Never mix tabs and spaces.
  • One blank line between functions and classes.
  • Add a space after “,” in dicts, lists, tuples, & argument lists, and after “:” in dicts, but not before.
  • Put spaces around assignments & comparisons (except in argument lists).
  • No spaces just inside parentheses or just before argument lists.
  • No spaces just inside docstrings.

Naming

  • joined_lower for functions, methods, attributes
  • joined_lower or ALL_CAPS for constants
  • PascalCase for classes
  • camelCase only to conform to pre-existing conventions
  • Attributes: interface, _internal, __private
    • But try to avoid the __private form.

      >>> class T(object):
      ...     def __init__(self):
      ...         self.__xprop = 1
      ...
      >>> t = T()
      >>> t
      <__main__.T object at 0x7ff7e00c1fd0>
      >>> t.__dict__
      {'_T__xprop': 1}
      
  • type_, class_ for name conflicts with keywords

Long Lines & Continuations

  • Keep lines below 80 chars in length.
    • Use implied line continuation inside parentheses/brackets/braces:

      def __init__(self, task_id, user_id, version,
                   quality, quantity):
          output = (user_id + ':'
                    + 'task_id')
      
    • Use backslashes as a last resort:

      self.files = \
          [os.path.join(vtk, fname) for fname in os.listdir('VTK')]
      

      Backslashes are fragile; they must end the line they’re on. If you add a space after the backslash, it won’t work any more.

Long Strings

  • Adjacent literal strings are concatenated by the parser:
  • The string prefixed with an “r” is a “raw” string. Backslashes are not evaluated as escapes in raw strings. They’re useful for regular expressions and Windows filesystem paths.
  • The parentheses allow implicit line continuation.

    text = ('Long strings can be made up '
        'of several shorter strings.')
    

Compound Statements

Good:

if r.method != 'GET':
    Logger.warn('unsupported request method')
parse(request)
process(request)
send(response)

Bad:

if r.method != 'GET': Logger.warn('unsupported request method')
parse(request); process(request); send(response)

Assert

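Use assert to state conditions that the program itself must guarantee (internal invariants), not to validate input.

assert x == 1, 'not equal'
# is equivalent to
if __debug__ and not x == 1:
    raise AssertionError('not equal')

__debug__ is True by default; it is False when Python runs with -O, which removes assert statements entirely.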
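
A short sketch of the distinction, using a hypothetical normalize helper that is not part of the original notes: caller input is checked with an explicit exception, while assert documents a condition the function itself is supposed to guarantee.

```python
def normalize(weights):
    """Scale weights so that they sum to 1 (hypothetical helper)."""
    # Input validation: must keep working under `python -O`, so raise explicitly.
    if not weights or any(w < 0 for w in weights):
        raise ValueError('weights must be non-empty and non-negative')
    total = sum(weights)
    if total == 0:
        raise ValueError('weights must not all be zero')
    result = [w / float(total) for w in weights]
    # Internal invariant: a failure here means a bug in this function.
    assert abs(sum(result) - 1.0) < 1e-9, 'normalization failed'
    return result

print(normalize([1, 1, 2]))
```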

Function

A method should do one thing and only one thing. However many lines of code it takes to do that one thing is how many lines it should have. If that “one thing” can be broken into smaller things, each of those should have its own method.

About 10 lines of code per function is a good rule of thumb.

Exception Handling

Exceptions allow error handling to be organized cleanly in a central or high-level place within the program structure.

Python Exceptions

  • Catch what you can handle
  • Raise a more abstract exception when re-raising
    • FatalError
    • LogicalError
    • Warnings
  • Exceptions should contain useful information
    • execution context
    • error cause
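
The points above could be sketched as follows. TaskError and load_task are hypothetical names chosen for illustration; the idea is to catch only the low-level error that can be handled here and re-raise it as a more abstract exception that carries the context and the cause.

```python
class TaskError(Exception):
    """Hypothetical application-level error carrying context."""

    def __init__(self, task_id, cause):
        super(TaskError, self).__init__(
            'task {} failed: {}'.format(task_id, cause))
        self.task_id = task_id   # execution context
        self.cause = cause       # underlying error


def load_task(task_id, store):
    try:
        return store[task_id]
    except KeyError as e:        # catch only what can be handled here
        # Re-raise as an abstraction the caller understands.
        raise TaskError(task_id, e)


try:
    load_task(2, {1: 'first'})
except TaskError as e:
    print(e.task_id, type(e.cause).__name__)
```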

As of Python 3, exceptions must subclass BaseException.

try:
    query(stmt)
except:  # catch *all* exceptions
    rollback()
    raise
else:
    commit()

try:
    do_sth()
except:
    Logger.error('stack trace:\n{}'.format(traceback.format_exc()))
finally:
    # regardless of success or error
    post_process()

Built-in exception hierarchy (Python 2):

BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StandardError
      |    +-- BufferError
      |    +-- ArithmeticError
      |    |    +-- FloatingPointError
      |    |    +-- OverflowError
      |    |    +-- ZeroDivisionError
      |    +-- AssertionError
      |    +-- AttributeError
      |    +-- EnvironmentError
      |    |    +-- IOError
      |    |    +-- OSError
      |    |         +-- WindowsError (Windows)
      |    |         +-- VMSError (VMS)
      |    +-- EOFError
      |    +-- ImportError
      |    +-- LookupError
      |    |    +-- IndexError
      |    |    +-- KeyError
      |    +-- MemoryError
      |    +-- NameError
      |    |    +-- UnboundLocalError
      |    +-- ReferenceError
      |    +-- RuntimeError
      |    |    +-- NotImplementedError
      |    +-- SyntaxError
      |    |    +-- IndentationError
      |    |         +-- TabError
      |    +-- SystemError
      |    +-- TypeError
      |    +-- ValueError
      |         +-- UnicodeError
      |              +-- UnicodeDecodeError
      |              +-- UnicodeEncodeError
      |              +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning

Docstrings & Comments

Docstrings = How to use code

Comments = Why (rationale) & how code works

Docstrings explain how to use code, and are for the users of your code. Uses of docstrings:

  • Explain the purpose of the function even if it seems obvious to you, because it might not be obvious to someone else later on.
  • Describe the parameters expected, the return values, and any exceptions raised.
  • If the method is tightly coupled with a single caller, make some mention of the caller (though be careful as the caller might change later).

Comments explain why, and are for the maintainers of your code. Examples include notes to yourself, like:

# !!! BUG: ...
# !!! FIX: This is a hack
# ??? Why is this here?

Docstrings are useful in interactive use (help()) and for auto-documentation systems.

False comments & docstrings are worse than none at all. So keep them up to date! When you make changes, make sure the comments & docstrings are consistent with the code, and don’t contradict it.

PEP 257: Docstring Conventions

def process_request(request):
    '''process the user request and send back response in json format
    raise InvalidRequestException on parsing error
    '''
    ...

Sphinx-apidoc

def create_task(task_id):
    '''create task with id = task_id

    :param task_id: task id
    :type task_id: int or long
    :rtype: Task object

    ::

        first line code
        second line code
    '''

Importing

Don’t use wildcard imports:

from module import *  # bad: pulls every public name into the namespace

Group imports in the following order:

  1. standard library imports
  2. related third party imports
  3. local application/library specific imports

Module

Module Structure

"""module docstring"""

__all__ = ['classname', 'funcname']

# imports
# constants
# exception classes
# interface functions
# classes
# internal functions & classes

def main(...):
    ...

if __name__ == '__main__':
    main(...)

Scripting

Script Structure

#!/usr/bin/env python
#-*- coding: utf-8 -*-

"""module docstring
description
"""

# the same as module afterwards

Shebang

#!/usr/bin/env python
#!/usr/bin/python
#!/usr/bin/python2
#!/usr/bin/python3
#!/usr/local/bin/python3
#!/home/oxnz/python-3.4/bin/python3.4

Encoding

BOM = Byte Order Mark

#-*- coding: utf-8 -*-

Packages

package/
    __init__.py
    module1.py
    subpackage/
        __init__.py
        module2.py

Unit Testing

python -m unittest discover

import unittest

class SimulatorTest(unittest.TestCase):
    def test_simulate(self):
        task_id = 124
        sim = Simulator(task_id)
        self.assertEqual(task_id, sim.task_id, 'inconsistent task_id')

class HelperTest(unittest.TestCase):
    def test_exec_cmd(self):
        with self.assertRaises(OSError):
            Helper.exec_cmd(['non_exists_cmd'])

if __name__ == '__main__':
    unittest.main()

Practicality Beats Purity

There are always exceptions. From PEP 8:

But most importantly: know when to be inconsistent – sometimes the style guide just doesn’t apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don’t hesitate to ask!

Two good reasons to break a particular rule:

  1. When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
  2. To be consistent with surrounding code that also breaks it (maybe for historic reasons) – although this is also an opportunity to clean up someone else’s mess (in true XP style).

… but practicality shouldn’t beat purity to a pulp!

Idioms

Swap Values

a, b = b, a

Ternary Operator

# b = a > 2 ? 2 : 1
b = 2 if a > 2 else 1

Building Strings from Substrings

colors = ['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']
rainbow = 'magic'.join(colors)

Use copy for deepcopy
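
A minimal sketch of the difference between a shallow and a deep copy with the standard copy module; the dictionary contents are made up for illustration.

```python
import copy

original = {'tags': ['a', 'b'], 'size': 1}

shallow = copy.copy(original)      # new dict, same inner list
deep = copy.deepcopy(original)     # new dict, new inner list

original['tags'].append('c')

print(shallow['tags'])  # shares the list with original
print(deep['tags'])     # unaffected by the append
```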

Use Counter for counting

from collections import Counter
Counter('success')

Lazy evaluation

yield
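
As a small illustration (the squares generator is a made-up example), a function containing yield produces its values one at a time and only on demand, so even an endless sequence is cheap to define:

```python
import itertools

def squares():
    """Yield 0, 1, 4, 9, ... one value at a time, without end."""
    n = 0
    while True:
        yield n * n
        n += 1

# Nothing is computed until values are requested.
first_five = list(itertools.islice(squares(), 5))
print(first_five)
```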

enum

class Seasons:
    Spring = 0
    Summer = 1
    Autumn = 2
    Winter = 3

class Seasons:
    Spring, Summer, Autumn, Winter = range(4)

def enum(*posarg, **keysarg):
    return type('Enum', (object,), dict(zip(posarg, range(len(posarg))), **keysarg))

Seasons = enum('Spring', 'Summer', 'Autumn', Winter=1)

from collections import namedtuple
Seasons = namedtuple('Seasons', 'Spring Summer Autumn Winter')._make(range(4))

Type checking

Prefer isinstance to type

if not isinstance(task_id, (int, long)):
    raise BadRequest('invalid task id, integer or long expected')

eval is evil

is and ==

  • is: object identity
  • ==: equality
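
A short sketch of the distinction; the variable names are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)   # same contents
print(a is b)   # two distinct list objects
print(a is c)   # two names for one object

# Singletons such as None are compared by identity.
value = None
if value is None:
    print('no value')
```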

Use in where possible

  • in is generally faster than building a list with d.keys() first (Python 2).
  • This pattern also works for items in arbitrary containers (such as lists, tuples, and sets).
  • in is also an operator.

for key in d:
    print key

if key in d:
    print 'in'

But a snapshot of the keys is necessary when mutating the dictionary (d.keys() returns a list in Python 2; use list(d) in Python 3):

for key in d.keys():
    d[str(key)] = d[key]

Dictionary get Method

dict.get(key, default) removes the need for the test
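
For example, a word count can be written without an explicit membership test. This is a sketch with made-up data.

```python
counts = {}
for word in 'the quick the lazy the'.split():
    # Without get(): if word not in counts: counts[word] = 0
    counts[word] = counts.get(word, 0) + 1

print(counts['the'], counts.get('missing', 0))
```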

Dictionary setdefault Method

defaultdict
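
Both setdefault and collections.defaultdict avoid the "is the key there yet?" test when building up container values. A minimal sketch with made-up data:

```python
from collections import defaultdict

pairs = [('fruit', 'apple'), ('veg', 'kale'), ('fruit', 'pear')]

# setdefault: insert the default if the key is missing, then return the value.
by_kind = {}
for kind, name in pairs:
    by_kind.setdefault(kind, []).append(name)

# defaultdict: the factory (list) builds the missing value on first access.
grouped = defaultdict(list)
for kind, name in pairs:
    grouped[kind].append(name)

print(by_kind == dict(grouped))
```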

Building & Splitting Dictionaries

>>> zip((1, 2), ('a', 'b'))
[(1, 'a'), (2, 'b')]
>>> dict(zip((1, 2), ('a', 'b')))
{1: 'a', 2: 'b'}

Testing for Truth Values

a = 2
b = 3
1 <= a < b <= 10 # True

Condition Test

if cluster is not None:

Index & Item

enumerate

arr = [1, 2, 3, 4, 5]
for idx, elem in enumerate(arr, 0):
    print idx, elem

Default Parameter Values
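
This heading most likely refers to the well-known pitfall of mutable default arguments; that reading is an assumption, and the function names below are made up. Default values are evaluated once, when the function is defined, so a mutable default is shared between calls.

```python
def bad_append(item, items=[]):
    # The default list is created once, at definition time, and is
    # shared by every call that omits `items`.
    items.append(item)
    return items

def good_append(item, items=None):
    # Use None as a sentinel and build a fresh list per call.
    if items is None:
        items = []
    items.append(item)
    return items

first = bad_append(1)
second = bad_append(2)     # same list object as `first`
fresh = good_append(2)
print(second, fresh)
```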

String Formatting

Prefer format to %

% String Formatting

format
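
A side-by-side sketch of the two styles, with made-up values:

```python
name, count, ratio = 'task', 3, 0.25

old_style = '%s ran %d times (%.1f%%)' % (name, count, ratio * 100)
new_style = '{} ran {} times ({:.1%})'.format(name, count, ratio)
by_name = '{name} ran {count} times'.format(name=name, count=count)

print(old_style)
print(new_style)
print(by_name)
```

With format, arguments can be referenced by position or by name, and a single argument can be reused without repeating it in the argument tuple.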

List Comprehensions

product

import itertools

suit = itertools.product(['A', 'B'], (1, 2, 3))

Generator Expressions

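# split an iterator into chunks of 3; the generator never ends, so stop at the first empty chunk
it = iter(range(100))
chunk_iter = (list(itertools.islice(it, 3)) for _ in itertools.repeat(0))
# group consecutive items by a computed key
[(_[0], [i for i in _[1]]) for _ in itertools.groupby(range(10), lambda x: x // 3)]
# group consecutive items into runs of 4, using a counter as the key
it = itertools.count()
[(_[0], [i for i in _[1]]) for _ in itertools.groupby(range(10), lambda x: next(it) // 4)]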

Sorting

Sorting with DSU

DSU = Decorate-Sort-Undecorate
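
The three steps can be sketched as follows, sorting made-up words by length. The key= argument of sorted() gives the same result in one step.

```python
words = ['banana', 'Apple', 'cherry', 'fig']

# Decorate: pair each item with its sort key (the index breaks ties).
decorated = [(len(w), i, w) for i, w in enumerate(words)]
# Sort: tuples compare element by element.
decorated.sort()
# Undecorate: drop the keys again.
by_length = [w for _, _, w in decorated]

print(by_length)
print(by_length == sorted(words, key=len))
```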

Sorting With Keys

Generators

Reading Lines From Text/Data Files

with open('input') as f:
    for line in f:
        do_sth_with(line)

Decorator

@property
def code(self):
    return self._code

@staticmethod
def create_task(args):
    return Task(args)

Though classmethod and staticmethod are quite similar, there’s a slight difference in usage for both entities: classmethod must have a reference to a class object as the first parameter, whereas staticmethod can have no parameters at all.
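
A sketch of that difference, using a hypothetical Task class: the classmethod receives the class it was called on, which makes it suitable for alternative constructors, while the staticmethod receives nothing implicitly.

```python
class Task(object):
    def __init__(self, code):
        self.code = code

    @classmethod
    def from_string(cls, text):
        # cls is the class the method was called on, so subclasses
        # get instances of their own type.
        return cls(int(text))

    @staticmethod
    def is_valid(code):
        # No implicit first argument: a plain function in the class namespace.
        return code >= 0


class UrgentTask(Task):
    pass

task = UrgentTask.from_string('42')
print(type(task).__name__, task.code, Task.is_valid(task.code))
```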

Define new decorators

import functools

def jsonify_api(func):
    '''decorator used to jsonify web api'''
    @functools.wraps(func)
    def decorator(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except NotFound as e:
            # str(e) instead of e.message, which no longer exists in Python 3
            return jsonify({'code': 1, 'error': str(e)}), 404
        except Exception as e:
            return jsonify({'code': 1, 'error': str(e)}), 500
    return decorator

import threading

def synchronized(func):
    '''decorator that serializes calls to func with a per-function lock'''
    func.__lock__ = threading.Lock()
    @functools.wraps(func)
    def decorator(*args, **kwargs):
        with func.__lock__:
            return func(*args, **kwargs)
    return decorator

# variant that takes the lock as an argument, so several functions can share it
def synchronized(lock):
    def decorator(func):
        @functools.wraps(func)
        def sync_func(*args, **kwargs):
            with lock:
                return func(*args, **kwargs)
        return sync_func
    return decorator

with statement

class TransactionalSession(object):
    def __init__(self):
        self._transactional_session = scoped_session()

    @property
    def transactional_session(self):
        return self._transactional_session

    def __enter__(self):
        return self.transactional_session

    def __exit__(self, excpt_type, excpt_value, excpt_traceback):
        if excpt_type is None:
            self.transactional_session.commit()
        else:
            self.transactional_session.rollback()
        return False  # never suppress the exception

# or use contextmanager
from contextlib import contextmanager
@contextmanager
def transactional_session():
    session = scoped_session()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
# usage
with TransactionalSession() as session:
    session.add(task)
with transactional_session() as session:
    session.add(task)

EAFP is preferable to LBYL

  • EAFP: “It’s Easier to Ask for Forgiveness than Permission.”
  • LBYL: “Look Before You Leap”

try:            vs     if ...:
except:
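
The two styles side by side, as a sketch with a made-up config dictionary:

```python
config = {'host': 'localhost'}

# LBYL: check first, then act.
if 'port' in config:
    port_lbyl = config['port']
else:
    port_lbyl = 8080

# EAFP: just try it and handle the failure.
try:
    port_eafp = config['port']
except KeyError:
    port_eafp = 8080

print(port_lbyl, port_eafp)
```

The EAFP form also avoids the gap between the check and the use, which matters when the state can change in between (files, shared data).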

Simple is Better Than Complex

Don’t reinvent the wheel

Tips

  • Style Guide for Python Code
  • don’t commit commented out code
  • don’t repeat yourself
  • don’t return negative number to shell
  • no magic numbers
  • no hard coded constants

PEP = Python Enhancement Proposal

Debug

Logging

logging

logging is thread-safe, but not process-safe: several processes should not write to the same log file.

import logging

logging.basicConfig(
    level=logging.DEBUG,
    filename='log.txt',
    filemode='w',
    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
)

# and log to console at the same time
console = logging.StreamHandler()
console.setLevel(logging.ERROR)
formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
console.setFormatter(formatter)
logging.getLogger(__name__).addHandler(console)

# or attach a console handler to a named logger
logger = logging.getLogger(__file__)
console = logging.StreamHandler()
console.setLevel(logging.DEBUG)
logger.addHandler(console)

syslog

import syslog

class Logger(object):
    '''log options:
    LOG_PID, LOG_CONS, LOG_NDELAY, LOG_NOWAIT and LOG_PERROR
    if defined in <syslog.h>.
    '''
    syslog.openlog(logoption = syslog.LOG_PID, facility = syslog.LOG_LOCAL0)

    @staticmethod
    def log(priority, message):
        '''priority levels:
        LOG_EMERG, LOG_ALERT, LOG_CRIT, LOG_ERR, LOG_WARNING,
        LOG_NOTICE, LOG_INFO, LOG_DEBUG.
        '''
        syslog.syslog(priority, message)

    @staticmethod
    def debug(msg):
        Logger.log(syslog.LOG_DEBUG, msg)

    @staticmethod
    def info(msg):
        Logger.log(syslog.LOG_INFO, msg)

Disassemble

import dis
dis.dis(func)

Concurrency

multiprocessing

import contextlib
import multiprocessing as mp

nproc = 48
with contextlib.closing(mp.Pool(nproc)) as pool:
    rows = sum(pool.map(match, tables), ())

thread

thread exception handling ?

threading

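from threading import Thread
from Queue import Queue  # Python 3: from queue import Queue

def worker():
    while True:
        item = Q.get()
        do_work(item)
        Q.task_done()

Q = Queue()
for i in range(nworker):
    t = Thread(target=worker)
    t.daemon = True  # do not keep the process alive for idle workers
    t.start()

for item in source():
    Q.put(item)

Q.join()  # block until every item has been processed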

Communication

  • Lock
  • RLock
  • Condition
  • Semaphore
  • BoundedSemaphore
  • Event
  • Queue

mutex = threading.Lock()

def serialized_method(self):
    with mutex:
        do_sth()

thread vs threading

signal

  • A handler for a particular signal, once set, remains installed until it is explicitly reset (except SIGCHLD, which follows the underlying impl)
  • There is no way to ‘block’ signals temporarily from critical sections
  • Python signal handlers are called asynchronously, but they can only occur between the ‘atomic’ instructions of the Python interpreter. This means that signals arriving during long calculations implemented purely in C may be delayed for an arbitrary amount of time
  • Because the C signal handler always returns, it makes little sense to catch synchronous errors like SIGPIPE or SIGSEGV
  • Python installs a small number of signal handlers by default:
    • SIGPIPE is ignored
    • SIGINT is translated into a KeyboardInterrupt exception
  • Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is:
    • always perform signal() operations in the main thread of execution.
    • Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer()
    • only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can't be used as a means of inter-thread communication. Use locks instead.

Performance

Benchmarking

from timeit import Timer
Timer('tmp = x; x = y; y = tmp', 'x = 2; y = 3').timeit()

Profile

time

time python script.py
real	0m0.027s
user	0m0.013s
sys	0m0.012s

cProfile

python -m cProfile -s cumtime script.py
python -m cProfile script.py
	 3 function calls in 0.000 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.000    0.000 api.py:27(<module>)
     1    0.000    0.000    0.000    0.000 api.py:27(test)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Output to file

python -m cProfile -o perf.stat script.py

analysis

import pstats
p = pstats.Stats('perf.stat')
p.sort_stats('cumulative').print_stats(20)

Design

Object-oriented

Request <-----> Controller
                    ^
                    | Entity
                    v
                 Service
                    ^
                    | Entity
                    v
                   DAO <----> Database
  • Entity
  • Service
  • Controller

Package vs Module

Modules are simple Python files with the .py extension, which implement a set of functions.

A module is a file containing Python definitions and statements. The filename is the module name.

Packages are a way of structuring Python’s module namespace by using “dotted module names”.

Packages are namespaces which can themselves contain multiple packages and modules. They are simply directories that contain a file called __init__.py; this file indicates that the directory is a Python package.

Design-Patterns

design patterns implemented in several programming languages

Singleton

import Sun
Sun.rise()
Sun.set()

  • all variables are bound to the module
  • the module is initialized only once
  • import is thread-safe

Mixin

Mix-in programming is a style of software development where units of functionality are created in a class and then mixed in with other classes. This might sound like simple inheritance at first, but a mix-in differs from a traditional class in one or more of the following ways. Often a mix-in is not the “primary” superclass of any given class, does not care what class it is used with, is used with many classes scattered throughout the class hierarchy and is introduced dynamically at runtime.

There are several reasons to use mix-ins:

  • they extend existing classes in new areas without having to edit, maintain or merge with their source code;
  • they keep project components (such as domain frameworks and interface frameworks) separate;
  • they ease the creation of new classes by providing a grab bag of functionalities that can be combined as needed;
  • and they overcome a limitation of subclassing, whereby a new subclass has no effect if objects of the original class are still being created in other parts of the software.

Python provides an ideal language for mix-in development because it supports multiple inheritance, supports full-dynamic binding and allows dynamic changes to classes.

One thing to keep in mind is the order of searching with regard to multiple inheritance. The search order goes from left to right through the base classes, and for any given base class, goes deep into its ancestor classes.

When you create mix-ins, keep in mind the potential for method names to clash. By creating distinct mix-ins with well-named methods you can generally avoid any surprises. Lastly, Python supports dynamic changes to the class hierarchy.

class Person(object):
    pass

class Writer:
    def write(self):
        print 'Hello world!'

Person.__bases__ += (Writer,)
Person().write()

def mixIn(origClass, mixInClass):
    if mixInClass not in origClass.__bases__:
        origClass.__bases__ += (mixInClass,)

def mixIn(origClass, mixInClass, append=True):
    if mixInClass not in origClass.__bases__:
        if append:
            origClass.__bases__ += (mixInClass,)
        else:
            origClass.__bases__ = (mixInClass,) + origClass.__bases__

import types
def mixIn(origClass, mixInClass, makeAncestor=False):
    '''An even more sophisticated version of this function could return
    (perhaps optionally) a list of methods that clash between the two,
    or raise an exception accompanied by such a list, if the overlap exists.
    '''
    if makeAncestor:
        if mixInClass not in origClass.__bases__:
            origClass.__bases__ = (mixInClass,) + origClass.__bases__
    else:
        # recursively traverse the mix-in ancestor classes in order to
        # support inheritance
        baseClasses = list(mixInClass.__bases__)
        baseClasses.reverse()
        for baseClass in baseClasses:
            mixIn(origClass, baseClass)
        # install the mix-in methods into the class
        for name in dir(mixInClass):
            if not name.startswith('__'): # skip private members
                member = getattr(mixInClass, name)
                if type(member) is types.MethodType:
                    member = member.im_func
                setattr(origClass, name, member)

One warning regarding dynamic mix-ins: they can change the behavior of existing objects (because they change the classes of those objects). This could lead to unpredictable results, as most classes are not designed with that type of change in mind. The safe way to use dynamic mix-ins is to install them when the application first starts, before any objects are created.

Mix-ins are great for improving modularity and enhancing existing classes without having to get intimate with their source code. This in turn supports other design paradigms, like separation of domain and interface, dynamic configuration and plug-ins. Python’s inherent support for multiple inheritance, dynamic binding and dynamic changes to classes enables a very powerful technique. As you continue to write Python code, consider ways in which mix-ins can enhance your software.
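
For comparison, a mix-in can also be applied statically through ordinary multiple inheritance, which avoids changing __bases__ at runtime. The following is only a sketch; JsonMixin, Person and Employee are made-up names.

```python
import json

class JsonMixin(object):
    """Mix-in: adds to_json() to any class that keeps its state in __dict__."""
    def to_json(self):
        return json.dumps(self.__dict__, sort_keys=True)

class Person(object):
    def __init__(self, name):
        self.name = name

class Employee(JsonMixin, Person):   # mix-in listed before the primary base
    def __init__(self, name, team):
        super(Employee, self).__init__(name)
        self.team = team

print(Employee('Ann', 'ops').to_json())
```

The mix-in does not care which class it is combined with; it only assumes that instance state lives in __dict__.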

pub/sub

python-message

state

python-state

Modules

builtin

id

CPython implementation detail: This is the address of the object in memory.

sorting

  • list.sort()
  • sorted()

from operator import attrgetter, itemgetter

sorted('The quick fox jumped over the lazy dog'.split(), key=str.lower)
sorted(tasks, key=lambda task: task.priority)
sorted(tasks, key=itemgetter(1, 2))
sorted(tasks, key=attrgetter('vm_type', 'vm_image'), reverse=True)
sorted(iterable, key=functools.cmp_to_key(locale.strcoll)) # locale-aware sort order

functools

singledispatch

@functools.singledispatch
def f(arg):
    print('f: {}'.format(arg))
@f.register(int)
def _(arg):
    print('int: {}'.format(arg))
@f.register(list)
def _(arg):
    print('list: {}'.format(arg))

partialmethod

from functools import partialmethod

class Task(object):
    def __init__(self):
        self._state = 'active'
    @property
    def state(self):
        return self._state
    def set_state(self, state):
        self._state = state
    set_active = partialmethod(set_state, 'active')
    set_inactive = partialmethod(set_state, 'inactive')

Frameworks

Numerical

  • scipy and numpy
  • pandas
  • SymPy
  • matplotlib
  • Traits
  • Chaco
  • TVTK
  • VPython
  • OpenCV

Http

HTTP Requests

Requests HTTP library for Python

MySQL

import MySQLdb
import contextlib
import pandas as pd

fields = ('email', 'create_time', 'update_time')
with contextlib.closing(MySQLdb.connect(host='10.20.30.40', port=1234, user='root', passwd='root', db='test')) as conn:
    with contextlib.closing(conn.cursor()) as cursor:
        cursor.execute('SELECT {} FROM test'.format(', '.join(fields)))
        rows = cursor.fetchall()
    df = pd.DataFrame(list(rows), columns=fields)
    pd.set_option('display.expand_frame_repr', False)
    # drop rows without an email; comparing with != None does not filter nulls
    print(df[df.email.notnull()])

import traceback
import contextlib
import mysql.connector

config = {
        'database': {
            'host': '127.0.0.1',
            'port': 3306,
            'user': 'test',
            'password': '',
            'database': 'test',
            'raise_on_warnings': True,
            }
        }

def insert():
    query = 'INSERT INTO test (name, email) VALUES (%s, %s)'
    with contextlib.closing(mysql.connector.connect(**config['database'])) as cnx:
        with contextlib.closing(cnx.cursor(dictionary=True)) as cursor:
            cursor.execute(query, ('y', 'y@z.com'))
        cnx.commit()

def select():
    try:
        with contextlib.closing(mysql.connector.connect(**config['database'])) as cnx:
            with contextlib.closing(cnx.cursor()) as cursor:
                query = 'SELECT name, email FROM test'
                cursor.execute(query)
                rows = cursor.fetchall()
                for (name, email) in rows:
                    print('{}: {}'.format(name, email))
    except mysql.connector.Error as e:
        if e.errno == mysql.connector.errorcode.ER_ACCESS_DENIED_ERROR:
            print(e)
        traceback.print_exc()
    except Exception as e:
        traceback.print_exc()

def main():
    insert()
    select()

if __name__ == '__main__':
    main()

Kafka

Consumer

import time

from kafka import KafkaConsumer

topic = 'requests'
brokers = '10.20.30.40:9092,11.22.33.44:9092'
consumer = KafkaConsumer(topic, group_id='cg', bootstrap_servers=brokers, auto_offset_reset='earliest')
for msg_raw in consumer:
    # msg_raw.timestamp is in milliseconds
    print('timestamp: {} partition: {} offset: {}'.format(
        time.strftime('%F %T', time.localtime(msg_raw.timestamp / 1000.0)),
        msg_raw.partition, msg_raw.offset))
    msg = Message()
    if msg.ParseFromString(msg_raw.value):
        proc(msg)

# batch mode
batch = consumer.poll(timeout_ms=10000, max_records=10000)
count = sum(map(len, batch.values()))
if count != 0:
    proc(batch)

Web

Flask

app = Flask(__name__)
app.add_url_rule(rule = '/api/<ver>/task/<task_id>', methods = ['POST'],
    endpoint = 'add_task', view_func = self.add_task)
# or
@app.route('/api/v<int:ver>/task/<int:task_id>', methods = ['POST'])
@jsonify_api
def add_task(self, ver, task_id):
    pass

@app.before_request
def pre_process():
    setattr(request, 'timestamp', time.time())

@app.after_request
def post_process(response):
    elapsed = time.time() - request.timestamp
    log.info('elapsed: {}'.format(elapsed))
    return response

if __name__ == '__main__':
    app.run(host='0.0.0.0')
    # alternatives: serve concurrently with threads or with processes (not both)
    # app.run(host='0.0.0.0', threaded=True)
    # app.run(host='0.0.0.0', processes=10)

flask.Flask.run accepts additional keyword arguments (**options) that it forwards to werkzeug.serving.run_simple. Two of those arguments are threaded (which can be set to True to enable threading) and processes (which can be set to a number greater than one to have werkzeug spawn more than one process to handle requests). The two options cannot be combined.

Django

Django Basics

bottle.py

Database

SQLAlchemy

Cautions
Pass Entity across sessions(threads)
  • pass committed object id
IOC
Inversion Of Control

session.flush() communicates a series of operations to the database (insert, update, delete). The database maintains them as pending operations in a transaction. The changes aren’t persisted permanently to disk, or visible to other transactions until the database receives a COMMIT for the current transaction (which is what session.commit() does).

SQLAlchemy: What’s the difference between flush() and commit()?

tl;dr;

  1. As a general rule, keep the lifecycle of the session separate and external from functions and objects that access and/or manipulate database data. This will greatly help with achieving a predictable and consistent transactional scope.
  2. Make sure you have a clear notion of where transactions begin and end, and keep transactions short: they should end once a sequence of operations is complete, instead of being held open indefinitely.

Configuration

ConfigParser

INI

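import ConfigParser  # Python 3: import configparser

config = ConfigParser.SafeConfigParser()
config.read("test.ini")
for section in config.sections():
    options = config.options(section)  # option names in this section
    items = config.items(section)      # (name, value) pairs in this section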

JSON

import json

with open('config.json') as f:
    config = json.load(f)

YAML

import yaml

# safe_load only builds plain Python objects, so it is safe for untrusted input
print(yaml.safe_load('''
name: Will
'''))

Internals

MRO

MRO = Method Resolution Order

  • Classic class (Python 2 only)
    • from left to right, depth-first
  • New-style class
    • more complicated (C3 MRO)
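
A diamond-shaped hierarchy shows where the two orders differ. This is a minimal sketch with made-up class names; the classes are new-style, so the C3 order applies.

```python
class A(object):
    def who(self):
        return 'A'

class B(A):
    pass

class C(A):
    def who(self):
        return 'C'

class D(B, C):
    pass

# C3 linearization: D, B, C, A, object.
print([cls.__name__ for cls in D.__mro__])
# Depth-first, left-to-right (classic classes) would reach A.who via B
# before looking at C; C3 finds C.who first.
print(D().who())
```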

Project Anatomy

moxile/                      # Project Hosting
    .svn/                    # Version Control
    moxile/                  # Quality Code
        moxile.py
    tests/                   # Unit Testing
        test_moxile.py
    doc/                     # Documentation
        index.rst
        html/
            index.html
    README.txt
    LICENSE.txt              # Licensing
    setup.py                 # Packaging
    MANIFEST.in

Documentation

sudo pip install -U sphinx
cd moxile
sphinx-quickstart
sphinx-apidoc .. --force -o .
# modify conf.py
sys.path.insert(0, os.path.abspath('/path/to/source/code'))
make html

Deployment

setup.py

File Hierarchy

setup.py
src/
    mypkg/
        __init__.py
        module.py
        data/
            tables.dat

Script

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from distutils.core import setup

setup(name = 'platformz',
    version = '0.1.0',
    description = 'an operational platform',
    author = 'oxnz',
    author_email = 'yunxinyi@gmail.com',
    url = 'https://oxnz.github.io',
    packages = ['simulation', 'agent', 'utilities'],
    scripts = ['scripts/simulate', 'scripts/vm-agent'],
    data_files = [('/etc/init.d', ['init-script']),
        ('docs', ['man.1']),
    ]
)

python setup.py --help-commands

Standard commands:

  • build: build everything needed to install
  • build_py: “build” pure Python modules (copy to build directory)
  • build_ext: build C/C++ extensions (compile/link to build directory)
  • build_clib: build C/C++ libraries used by Python extensions
  • build_scripts: “build” scripts (copy and fixup #! line)
  • clean: clean up temporary files from ‘build’ command
  • install: install everything from build directory
  • install_lib: install all Python modules (extensions and pure Python)
  • install_headers: install C/C++ header files
  • install_scripts: install scripts (Python or otherwise)
  • install_data: install data files
  • sdist: create a source distribution (tarball, zip file, etc.)
  • register: register the distribution with the Python package index
  • bdist: create a built (binary) distribution
  • bdist_dumb: create a “dumb” built distribution
  • bdist_rpm: create an RPM distribution
  • bdist_wininst: create an executable installer for MS Windows
  • upload: upload binary package to PyPI
  • check: perform some checks on the package

PIP

Wheel

python -m pip download --dest=/path/to/dest elasticsearch
pip install --user django

pythonrc

#!/usr/bin/env python
#-*- coding: utf-8 -*-
#
# Copyright (c) 2013-2015 Z
# All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#

try:
    import readline
    import rlcompleter
    import os
    import atexit
except ImportError as e:
    print(e)
else:
    class TabCompleter(rlcompleter.Completer):
        """Completer that support tab indenting"""
        def complete(self, text, state):
            if not text:
                return ('\t', None)[state]
            else:
                return rlcompleter.Completer.complete(self, text, state)
    readline.set_completer(TabCompleter().complete)
    if 'libedit' in readline.__doc__:
        """Complete filename (tab key)
        http://minix1.woodhull.com/manpages/man3/editline.3.html"""
        readline.parse_and_bind('bind ^I rl_complete')
    else:
        readline.parse_and_bind('tab: complete')
    histfile = os.path.expanduser('~/.pyhistory')
    def savehist(histfile=histfile):
        import readline
        readline.write_history_file(histfile)
    atexit.register(savehist)
    if os.path.exists(histfile):
        readline.read_history_file(histfile)
    del readline, os, atexit, histfile, savehist

import

sys.path

import sys
sys.path.insert(0, 'path/to/prepend')
sys.path.append('/path/to/append')

site

import site
site.addsitedir('/path/to/another/sitedir') # will append to sys.path

multi-version

import pkg_resources
pkg_resources.require('protobuf==3.1.0')
import google.protobuf

References