BME205, Fall 2011, Section 01: Intro to Python

Andrew Uzilov

(0) Disclaimers and Background

We will probably run out of time; can continue outside of class (TA office hours?).

Why Python?
- Easy for beginners, but many advanced features.
- Consistent syntax (more or less).
- Good for prototyping (fast to write).
- Good for debugging (e.g. stack traces).
- Powerful standard library:
  - unit tests (module "unittest")
  - profiler (module "profile")
  - regular expressions (module "re")
  - iterators (module "itertools")
  - tests of docstring-embedded examples (module "doctest")
  - and many more!
- Widely adopted, including by bioinformatics community.
- Thriving development community for language itself and associated tools.
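
To make the "doctest" item concrete, here is a minimal sketch. The function reverse_complement is a hypothetical example, not part of any assignment; it assumes uppercase DNA input with only A, C, G, T. The examples in the docstring double as tests.

```python
import doctest


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string.

    (Hypothetical helper, for illustration only.)

    >>> reverse_complement('ACGT')
    'ACGT'
    >>> reverse_complement('AAGC')
    'GCTT'
    """
    pairs = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    # Walk the sequence backwards and swap each base for its partner.
    return ''.join(pairs[base] for base in reversed(seq))


if __name__ == '__main__':
    # Runs every docstring example in this module; prints nothing
    # when all of them pass.
    doctest.testmod()
```

Running the file directly checks the examples; a failing example is reported with the expected and actual output.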

But remember: there is no "one true/best language"; you will have to
learn others.

Several Python implementations: CPython, Jython, IronPython, etc.

(1) Resources for Python programming

- "index" is most useful
- "Library Reference" is next in usefulness
- These are the most relevant pages for BME205 Assignment #1:
  - Tutorial docs
  - Understanding strings (and more generally, sequences)
  - Built-in functions

- Python in a Nutshell book (useful for advanced users only)

- Editors/IDEs
  - WingIDE has a free version
  - Eclipse with PyDev
  - I recommend emacs (Aquamacs on Mac OS X)
    - Use python-mode.el mode (NOT python.el)
    - Auto-completion package available (looks in current buffer to append to
      current dictionary on-the-fly)
    - Recommended by John St. John: YaSnippet; can be integrated with
      AutoComplete

- epydoc for documentation (it's like javadoc).

- Mac OS X only: when installing Python, I recommend using MacPorts instead of
downloading the DMG.

(2) Starting up Python

- Python versions: use 2.7.x, do NOT use 3.x yet!
- Python installation on SOE machines - see Evan the TA for this.
- Command-line interpreter
  - useful for playing around with one line of code at a time
  - using it as a calculator
  - help() function, takes strings or references:
      help('print')
      help(len)  # a reference works as long as the name is bound to an object;
                 # help(print) is a syntax error in 2.7 (print is a statement)
      help('modules')
      x = 205
      help(x)
      s = 'some arbitrary string'
      help(s.rstrip)  # help on object methods
  - Whitespace matters! But only in indentation.
  - History (readline support), reverse search.
  - Ctrl+D to exit.
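
A small sketch of "whitespace matters, but only in indentation". The function name describe is made up for illustration. Indentation decides which lines belong to which block; spacing inside an expression changes nothing.

```python
def describe(n):
    # The indented lines form the function body; the deeper
    # indentation marks the body of the if statement.
    if n % 2 == 0:
        return 'even'
    return 'odd'


# Spacing inside an expression does not change its meaning.
a = 1+2*3
b = 1 + 2   *   3
```

Here a and b are bound to equal values, while moving "return 'odd'" one level deeper would change what describe() does.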

(3) Python types and data model

- Everything is a reference; the object to which we refer has the type.
- Binding: associating a (possibly named) reference with an object.
  Sometimes called "variable assignment", but in Python that's a misnomer.
- Most important types:
  int, float, bool, str, list, tuple, dict, None
- Categories of types
  - mutable vs immutable
  - sequence vs non-sequence
- Type conversion.
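
The points above (binding, mutable vs immutable, type conversion) can be sketched as follows. All names and values are made up for illustration; the code should run under 2.7 and also under 3.x.

```python
# Binding: both names refer to the same list object.
reads = ['ACGT', 'GGCC']
alias = reads
alias.append('TTAA')            # list is mutable: changed in place
same_object = alias is reads    # True, so reads now has 3 items too

# str is immutable: methods return a new object instead.
name = 'bme'
upper = name.upper()            # name is still bound to 'bme'

# tuple is an immutable sequence: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 5
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Type conversion: call the target type on the value.
count = int('205')              # str -> int
ratio = float('0.5')            # str -> float
as_text = str(205)              # int -> str
bases = list('ACGT')            # str -> list of one-character strings
flag = bool('')                 # empty sequences convert to False
```

Note that "alias = reads" copies the reference, not the list; to get an independent list, build a new one, e.g. list(reads).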
- Useful reference
  - standard type hierarchy
  - how to use standard types

(4) Writing your first Python program

- Shebang/hash-bang.
- chmod +x to make program executable.
- Two ways to run a Python program in a shell
  - Use shebang to locate interpreter (must have done chmod +x):
    ./
  - Ignore shebang, specify explicitly which Python interpreter to use (also
    ignores chmod +x setting):
    python
    /usr/local/bin/python2.7

(5) Other Python demo programs (demo*.py).

See program code to learn.

(6) Modules that I find most useful in bioinformatics work

- Standard Python library modules
    argparse  - parsing command-line arguments
    array     - memory-efficient arrays
    itertools - iteration and combinatorics
    random    - generate random numbers, shuffle lists
    re        - regular expressions
    sys       - system-level stuff
  If it is not obvious to you why they are useful, ask me. The awesomeness of
  some modules (e.g. "itertools") may not be obvious.
- 3rd party modules
  - pysam for SAM/BAM files (sequencing reads)
  - ruffus for pipelines (replaces makefiles)
  - rpy2 for Python/R integration

(7) Etc.

- Functional programming with Python:
  Has fun tidbits such as "how to multiply together all items in a list".
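
One way to do the "multiply together all items in a list" tidbit is with reduce() and operator.mul. This is a sketch of the general idea, not necessarily how the linked resource does it; the function name product is made up for illustration.

```python
from functools import reduce  # reduce is also a builtin in 2.7; 3.x needs this import
import operator


def product(numbers):
    """Multiply together all items in a sequence (1 if it is empty)."""
    # reduce folds the sequence left to right: ((1 * a) * b) * c ...
    return reduce(operator.mul, numbers, 1)


result = product([1, 2, 3, 4])  # 24
```

The third argument to reduce() is the starting value, which also makes the empty-list case well defined.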