INTRO TO PYTHON LECTURE OUTLINE
BME205, Fall 2011
Andrew Uzilov
Contact: auzilov@ucsc.edu
(0) Disclaimers and Background
We will probably run out of time; can continue outside of class (TA office hours?).
Why Python?
- Easy for beginners, but many advanced features.
- Consistent syntax (more or less).
- Good for prototyping (fast to write).
- Good for debugging (e.g. stack traces).
- Powerful standard library:
- unit tests (module "unittest")
- profiler (module "profile")
- regular expressions (module "re")
- iterators (module "itertools")
- tests of docstring-embedded examples (module "doctest")
- and many more!
- Widely adopted, including by the bioinformatics community.
- Thriving development community for the language itself and associated tools.
But remember: there is no "one true/best language"; you will have to
learn others.
Several Python implementations: CPython, Jython, IronPython, etc.
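To make two of the standard library modules above concrete, here is a minimal sketch that combines "re" and "doctest". The function name gc_content and its example inputs are hypothetical, made up for illustration; the code is written to run under 2.7.

```python
import doctest
import re


def gc_content(seq):
    """Return the fraction of G and C bases in a DNA string.

    >>> gc_content('GGCC')
    1.0
    >>> gc_content('ATGC')
    0.5
    """
    if not seq:
        return 0.0
    # re.findall returns a list of every non-overlapping match.
    gc_bases = re.findall('[GCgc]', seq)
    # float() avoids integer division under Python 2.
    return float(len(gc_bases)) / len(seq)


if __name__ == '__main__':
    # testmod() runs every docstring example in this module
    # and prints a report only if one of them fails.
    doctest.testmod()
```

The examples in the docstring double as documentation and as tests, which is the main appeal of "doctest" for small scripts.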
(1) Resources for Python programming
- docs.python.org
- "index" is most useful
- "Library Reference" is next in usefulness
- These are the most relevant pages for BME205 Assignment #1:
Tutorial docs:
http://docs.python.org/tutorial/interpreter.html
http://docs.python.org/tutorial/introduction.html
http://docs.python.org/tutorial/controlflow.html
http://docs.python.org/tutorial/inputoutput.html
Understanding strings (and more generally, sequences):
http://docs.python.org/library/stdtypes.html#string-methods
http://docs.python.org/library/stdtypes.html#string-formatting-operations
http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-buffer-xrange
- Built-in functions: http://docs.python.org/library/functions.html
- Python in a Nutshell book (useful for advanced users only)
- Editors/IDEs
- WingIDE has a free version: http://wingware.com/downloads/wingide-101/4.0.4-1/binaries
- Eclipse with PyDev: http://pydev.org
- I recommend emacs (Aquamacs on Mac OS X)
- Use python-mode.el mode (NOT python.el): https://launchpad.net/python-mode
- Auto-completion package available (looks in current buffer to append
to current dictionary on-the-fly):
http://cx4a.org/software/auto-complete/
or
http://www.emacswiki.org/emacs/AutoComplete
- recommended by John St. John: YaSnippet (http://code.google.com/p/yasnippet/);
can be integrated with AutoComplete
- epydoc for documentation (it's like javadoc).
- Mac OS X only: when installing Python, I recommend using MacPorts instead of
downloading the DMG from python.org.
(2) Starting up Python
- Python versions: use 2.7.x, do NOT use 3.x yet!
- Python installation on SOE machines - see Evan the TA for this.
- Command-line interpreter
- useful for playing around with one line of code at a time
- using it as a calculator
- help() function, takes strings or references:
help ('print') # statements and keywords must be passed as strings in 2.x
help ('len')
help (len) # same, as long as name is bound to a reference
help ('modules')
x = 205
help (x)
s = 'some arbitrary string'
help (s.rstrip) # help on object methods
- Whitespace matters! But only in indentation.
- History (readline support), reverse search.
- Ctrl+D to exit.
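A small sketch of what "whitespace matters, but only in indentation" means in practice. The variable names are made up for illustration. The indentation level decides which block a line belongs to, while spacing inside a line is ignored.

```python
strong = []
checked = []
for base in 'ACGT':
    if base in 'GC':
        strong.append(base)      # runs only when the "if" test is true
    checked.append(base)         # runs on every pass through the loop
total = len( checked )           # runs once, after the loop; inner spacing is ignored
```

Moving the second append one level to the right would put it inside the "if" block and change the result, with no other change to the code.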
(3) Python types and data model
- Everything is a reference; the object to which we refer has the type.
- Binding: associating a (possibly named) reference with an object.
Sometimes called "variable assignment", but in Python that's a misnomer.
- Most important types:
int, float, bool, str (string), list, tuple, dict, None
- Categories of types
mutable vs immutable
sequence vs non-sequence
- Type conversion.
- Useful reference
- standard type hierarchy: http://docs.python.org/reference/datamodel.html#the-standard-type-hierarchy
- how to use standard types: http://docs.python.org/library/stdtypes.html
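A minimal sketch of binding, mutability, and type conversion as described in this section. All names are arbitrary and chosen only for illustration.

```python
a = [1, 2, 3]       # bind the name "a" to a new list object
b = a               # bind "b" to the SAME object; nothing is copied
b.append(4)         # lists are mutable, so the change is visible through "a" too

s = 'ACGT'          # bind "s" to a string object; strings are immutable
t = s               # "t" and "s" now refer to the same string
t = t + 'N'         # builds a new string and rebinds "t"; "s" is untouched

# Type conversion builds a new object of the requested type.
n = int('205')      # str -> int
f = float(n)        # int -> float
as_tuple = tuple(a) # list -> tuple (an immutable sequence)
as_text = str(n)    # int -> str
```

After this runs, "a is b" is true and both names show [1, 2, 3, 4], while "s" is still 'ACGT' and "t" is 'ACGTN'.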
(4) Writing your first Python program (demo0.py)
- Shebang/hash-bang.
- chmod +x to make program executable.
- Two ways to run a Python program in a shell
- Use shebang to locate interpreter (must have done chmod +x):
./progName.py
- Ignore shebang, specify explicitly which Python interpreter to
use (also ignores chmod +x setting):
python progName.py
/usr/local/bin/python2.7 progName.py
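For reference, a minimal sketch of what a first program with a shebang line could look like. This is not the course's demo0.py; the file name hello.py and the function main are hypothetical, and the "/usr/bin/env python" shebang assumes that "python" on your PATH is the interpreter you want.

```python
#!/usr/bin/env python
"""A minimal script; the file name hello.py is hypothetical."""
import sys


def main(argv):
    # argv[0] is the script name; anything after it is a command-line argument.
    name = argv[1] if len(argv) > 1 else 'world'
    message = 'Hello, ' + name + '!'
    sys.stdout.write(message + '\n')
    return message


if __name__ == '__main__':
    # True only when the file is run as a program, not when it is imported.
    main(sys.argv)
```

After "chmod +x hello.py", both "./hello.py BME205" and "python hello.py BME205" should print the same greeting; only the first form reads the shebang line.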
(5) Other Python demo programs (demo*.py).
See program code to learn.
(6) Modules that I find most useful in bioinformatics work
- Standard Python library modules
argparse - parsing command-line arguments
array - memory-efficient arrays
itertools - iteration and combinatorics
random - generate random numbers, shuffle lists
re - regular expressions
sys - interpreter and system access (command-line arguments, stdin/stdout/stderr, exit)
If it is not obvious to you why they are useful, ask me.
The awesomeness of some modules (e.g. "itertools") may not be obvious.
- 3rd party modules
- pysam for SAM/BAM files (sequencing reads): http://code.google.com/p/pysam/
- ruffus for pipelines (replaces makefiles): http://ruffus.org.uk/
- rpy2 for Python/R integration: http://rpy.sourceforge.net/rpy2.html
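A short sketch of why some of the standard library modules listed above are handy in bioinformatics work. The sequences, read names, and the --kmer-size option are invented for illustration only.

```python
import argparse
import itertools
import random
import re

# argparse: declare an option once and get parsing, type checking, and
# --help output for free. An explicit list stands in for the real
# command line here.
parser = argparse.ArgumentParser(description='k-mer demo')
parser.add_argument('--kmer-size', type=int, default=2)
args = parser.parse_args(['--kmer-size', '2'])

# itertools.product: every possible DNA k-mer, without nested loops.
kmers = [''.join(p) for p in itertools.product('ACGT', repeat=args.kmer_size)]

# random: seed for repeatable runs, then shuffle a list in place.
random.seed(205)
reads = ['read1', 'read2', 'read3', 'read4']
random.shuffle(reads)

# re: start positions of a motif. The lookahead (?=...) matches without
# consuming characters, so overlapping occurrences are found too.
positions = [m.start() for m in re.finditer('(?=ATA)', 'ATATACGATA')]
```

Here "kmers" holds the 16 dinucleotides from 'AA' to 'TT', and "positions" holds the start of every occurrence of ATA, including the two that overlap.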
(7) Etc.
- Functional programming with Python: http://docs.python.org/howto/functional.html
Has fun tidbits such as "how to multiply together all items in a list".
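One common way to multiply together all items in a list is a left fold with reduce. This is a sketch of that general technique, not a copy of the HOWTO's text; the function name product is my own choice.

```python
import functools
import operator


def product(numbers):
    # reduce folds the list from the left: ((1 * 2) * 3) * 4 ...
    # The third argument is the starting value, so an empty list gives 1.
    return functools.reduce(operator.mul, numbers, 1)
```

For example, product([1, 2, 3, 4]) returns 24. In 2.7 the built-in reduce() does the same job; functools.reduce is used here because it also exists in 3.x.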
Attachments
- sample_data.bed, application/octet-stream, 1,125 bytes
- demoProgs.tar, application/x-tar, 16,896 bytes
- sillyFile.txt, text/plain, 27 bytes
- forLoops.py, text/x-python-script, 1,661 bytes
- generatorsYield.py, text/x-python-script, 2,349 bytes