Code & documentation standards

Scott Olesen edited this page Aug 23, 2014 · 3 revisions

Reading the in-code documentation

To read the docstrings in a particular script script.py, run pydoc script (the module name, without the .py extension). Docstrings appear in red when the source is viewed on GitHub.
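Docstrings can also be read from inside Python. The following is a minimal sketch; `count_reads` is a hypothetical function used only for illustration, not part of the pipeline. `inspect.getdoc` returns the docstring with its indentation cleaned up, and the built-in `help()` prints the same kind of page that pydoc shows.

```python
import inspect


def count_reads(lines):
    '''number of reads in a list of fastq lines (4 lines per read)'''
    # hypothetical example function, not taken from the pipeline
    return len(lines) // 4


# inspect.getdoc returns the docstring as a string
doc = inspect.getdoc(count_reads)
print(doc)

# In an interactive session, help(count_reads) prints a pydoc-style page.
```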

Coding standards

  • Scripts and their functions have unit tests in the test folder. Ideally every function is tested; it's acceptable to test just the few highest-level functions. It's essential that there be some test, since this assures all users of the pipeline that new changes haven't broken it.
    • Unit testing uses Python's standard unittest.
  • Indentation is 4 spaces. Tabs are not cool.
  • Scripts that read input from the command line use argparse.
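As a rough illustration of these standards together, the sketch below shows a small script that uses argparse for its command line and unittest for its tests, with 4-space indentation throughout. Everything named here (`reverse_complement`, `build_parser`, the `--lower` flag) is hypothetical and chosen only for the example.

```python
import argparse
import unittest


def reverse_complement(seq):
    '''reverse complement of an upper-case DNA sequence'''
    pairs = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    return ''.join(pairs[base] for base in reversed(seq))


def build_parser():
    '''argparse parser for this (hypothetical) script'''
    parser = argparse.ArgumentParser(description='reverse complement a sequence')
    parser.add_argument('sequence', help='DNA sequence, e.g. ACGT')
    parser.add_argument('--lower', action='store_true', help='print in lower case')
    return parser


def main(argv=None):
    '''parse the command line and print the result'''
    args = build_parser().parse_args(argv)
    result = reverse_complement(args.sequence)
    print(result.lower() if args.lower else result)


class TestReverseComplement(unittest.TestCase):
    def test_simple(self):
        '''a short sequence is reversed and complemented'''
        self.assertEqual(reverse_complement('AACG'), 'CGTT')

    def test_parser(self):
        '''the parser reads the sequence and the optional flag'''
        args = build_parser().parse_args(['ACGT', '--lower'])
        self.assertEqual(args.sequence, 'ACGT')
        self.assertTrue(args.lower)

# A real script would end by calling main() under the usual
# "if __name__ == '__main__':" guard.
```

In a layout like the one described above, the test class would live in its own file in the test folder and could be run with python -m unittest.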

Documentation standards

  • Every function has a docstring.
    • Learn about docstrings.
    • Short, simple functions with one input and output can have one-line docstrings.
    • Functions that are longer than about 5 lines, that have more than one input, or that have an output whose type is not obvious should have a multiline docstring that specifies the purpose of the function, the expected types of inputs and outputs, the default values of any parameters, and the meaning of each parameter and output.

Here's an example of a short docstring:

def output_filenames(input_filename, k):
    '''destination filenames foo.fastq.0, etc.'''

And a long docstring:

def best_barcode_match(known_barcodes, barcode):
    '''
    Find the best match between a barcode and a list of known barcodes

    Parameters
    known_barcodes : sequence (or iterator) of sequences
        list of known barcodes
    barcode : string
        the barcode read to be matched against the known barcodes

    Returns
    min_mismatches : int
        number of mismatches in the best alignment
    best_known_barcode : string
        known barcode that aligned best
    '''
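
For illustration, here is one way a function body could satisfy a long docstring like the one above. This is a sketch under assumptions not stated in the original: barcodes are compared position by position (Hamming distance), all barcodes have the same length, and ties go to the first known barcode encountered. The pipeline's actual implementation may differ.

```python
def best_barcode_match(known_barcodes, barcode):
    '''
    Find the best match between a barcode and a list of known barcodes

    Parameters
    known_barcodes : sequence (or iterator) of sequences
        list of known barcodes
    barcode : string
        the barcode read to be matched against the known barcodes

    Returns
    min_mismatches : int
        number of mismatches in the best alignment
    best_known_barcode : string
        known barcode that aligned best
    '''
    best = None
    for known in known_barcodes:
        # assumption: equal-length barcodes, mismatches counted per position
        mismatches = sum(1 for a, b in zip(known, barcode) if a != b)
        if best is None or mismatches < best[0]:
            best = (mismatches, known)

    if best is None:
        raise ValueError('no known barcodes were given')

    return best


print(best_barcode_match(['ACGT', 'TTTT', 'ACGA'], 'ACGG'))  # (1, 'ACGT')
```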