pep-0248.txt

PEP: 248
Title: Python Database API Specification v1.0
Version: $Revision$
Last-Modified: $Date$
Author: mal@lemburg.com (Marc-André Lemburg)
Discussions-To: db-sig@python.org
Status: Final
Type: Informational
Content-Type: text/x-rst
Created:
Post-History:
Superseded-By: 249


Introduction
============

This API has been defined to encourage similarity between the
Python modules that are used to access databases.  By doing this,
we hope to achieve a consistency leading to more easily understood
modules, code that is generally more portable across databases,
and a broader reach of database connectivity from Python.

This interface specification consists of several items:

* Module Interface
* Connection Objects
* Cursor Objects
* DBI Helper Objects

Comments and questions about this specification may be directed to
the SIG on Tabular Databases in Python
(http://www.python.org/sigs/db-sig).

This specification document was last updated on: April 9, 1996.
It will be known as Version 1.0 of this specification.


Module Interface
================

The database interface modules should typically be named with
something terminated by ``db``.  Existing examples are: ``oracledb``,
``informixdb``, and ``pg95db``.  These modules should export several
names:

``modulename(connection_string)``
   Constructor for creating a connection to the database.
   Returns a Connection Object.

``error``
   Exception raised for errors from the database module.


Connection Objects
==================

Connection Objects should respond to the following methods:

``close()``
   Close the connection now (rather than whenever ``__del__`` is
   called).  The connection will be unusable from this point
   forward; an exception will be raised if any operation is
   attempted with the connection.

``commit()``
   Commit any pending transaction to the database.

``rollback()``
   Roll the database back to the start of any pending
   transaction.

``cursor()``
   Return a new Cursor Object.  An exception may be thrown if
   the database does not support a cursor concept.

``callproc([params])``
   (Note: this method is not well-defined yet.)  Call a
   stored database procedure with the given (optional)
   parameters.  Returns the result of the stored procedure.

(all Cursor Object attributes and methods)
   For databases that do not have cursors and for simple
   applications that do not require the complexity of a
   cursor, a Connection Object should respond to each of the
   attributes and methods of the Cursor Object.  Databases
   that have cursor can implement this by using an implicit,
   internal cursor.


Cursor Objects
==============

These objects represent a database cursor, which is used to manage
the context of a fetch operation.

Cursor Objects should respond to the following methods and
attributes:

``arraysize``
   This read/write attribute specifies the number of rows to
   fetch at a time with ``fetchmany()``.  This value is also used
   when inserting multiple rows at a time (passing a
   tuple/list of tuples/lists as the params value to
   ``execute()``).  This attribute will default to a single row.

   Note that the arraysize is optional and is merely provided
   for higher performance database interactions.
   Implementations should observe it with respect to the
   ``fetchmany()`` method, but are free to interact with the
   database a single row at a time.

``description``
   This read-only attribute is a tuple of 7-tuples.  Each
   7-tuple contains information describing each result
   column: (name, type_code, display_size, internal_size,
   precision, scale, null_ok). This attribute will be ``None``
   for operations that do not return rows or if the cursor
   has not had an operation invoked via the ``execute()`` method
   yet.

   The 'type_code' is one of the 'dbi' values specified in
   the section below.

   Note: this is a bit in flux. Generally, the first two
   items of the 7-tuple will always be present; the others
   may be database specific.

``close()``
   Close the cursor now (rather than whenever ``__del__`` is
   called).  The cursor will be unusable from this point
   forward; an exception will be raised if any operation is
   attempted with the cursor.

``execute(operation [,params])``
   Execute (prepare) a database operation (query or command).
   Parameters may be provided (as a sequence
   (e.g. tuple/list)) and will be bound to variables in the
   operation.  Variables are specified in a database-specific
   notation that is based on the index in the parameter tuple
   (position-based rather than name-based).

   The parameters may also be specified as a sequence of
   sequences (e.g. a list of tuples) to insert multiple rows
   in a single operation.

   A reference to the operation will be retained by the
   cursor.  If the same operation object is passed in again,
   then the cursor can optimize its behavior.  This is most
   effective for algorithms where the same operation is used,
   but different parameters are bound to it (many times).

   For maximum efficiency when reusing an operation, it is
   best to use the ``setinputsizes()`` method to specify the
   parameter types and sizes ahead of time.  It is legal for
   a parameter to not match the predefined information; the
   implementation should compensate, possibly with a loss of
   efficiency.

   Using SQL terminology, these are the possible result
   values from the ``execute()`` method:

   - If the statement is DDL (e.g. ``CREATE TABLE``), then 1 is
     returned.

   - If the statement is DML (e.g. ``UPDATE`` or ``INSERT``), then the
     number of rows affected is returned (0 or a positive
     integer).

   - If the statement is DQL (e.g. ``SELECT``), ``None`` is returned,
     indicating that the statement is not really complete until
     you use one of the  'fetch' methods.

``fetchone()``
   Fetch the next row of a query result, returning a single
   tuple.

``fetchmany([size])``
   Fetch the next set of rows of a query result, returning as
   a list of tuples. An empty list is returned when no more
   rows are available. The number of rows to fetch is
   specified by the parameter.  If it is ``None``, then the
   cursor's arraysize determines the number of rows to be
   fetched.

   Note there are performance considerations involved with
   the size parameter.  For optimal performance, it is
   usually best to use the arraysize attribute.  If the size
   parameter is used, then it is best for it to retain the
   same value from one ``fetchmany()`` call to the next.

``fetchall()``
   Fetch all rows of a query result, returning as a list of
   tuples.  Note that the cursor's arraysize attribute can
   affect the performance of this operation.

``setinputsizes(sizes)``
   (Note: this method is not well-defined yet.)  This can be
   used before a call to ``execute()`` to predefine memory
   areas for the operation's parameters.  sizes is specified
   as a tuple -- one item for each input parameter.  The item
   should be a Type object that corresponds to the input that
   will be used, or it should be an integer specifying the
   maximum length of a string parameter.  If the item is
   ``None``, then no predefined memory area will be reserved
   for that column (this is useful to avoid predefined areas
   for large inputs).

   This method would be used before the ``execute()`` method is
   invoked.

   Note that this method is optional and is merely provided
   for higher performance database interaction.
   Implementations are free to do nothing and users are free
   to not use it.

``setoutputsize(size [,col])``
   (Note: this method is not well-defined yet.)

   Set a column buffer size for fetches of large columns
   (e.g. LONG).  The column is specified as an index into the
   result tuple.  Using a column of ``None`` will set the default
   size for all large columns in the cursor.

   This method would be used before the ``execute()`` method is
   invoked.

   Note that this method is optional and is merely provided
   for higher performance database interaction.
   Implementations are free to do nothing and users are free
   to not use it.


DBI Helper Objects
==================

Many databases need to have the input in a particular format for
binding to an operation's input parameters.  For example, if an
input is destined for a ``DATE`` column, then it must be bound to the
database in a particular string format.  Similar problems exist
for "Row ID" columns or large binary items (e.g. blobs or ``RAW``
columns).  This presents problems for Python since the parameters
to the ``execute()`` method are untyped.  When the database module
sees a Python string object, it doesn't know if it should be bound
as a simple CHAR column, as a raw binary item, or as a ``DATE``.

To overcome this problem, the 'dbi' module was created.  This
module specifies some basic database interface types for working
with databases.  There are two classes: 'dbiDate' and 'dbiRaw'.
These are simple container classes that wrap up a value.  When
passed to the database modules, the module can then detect that
the input parameter is intended as a ``DATE`` or a ``RAW``.  For symmetry,
the database modules will return ``DATE`` and ``RAW`` columns as instances
of these classes.

A Cursor Object's 'description' attribute returns information
about each of the result columns of a query.  The 'type_code' is
defined to be one of five types exported by this module: ``STRING``,
``RAW``, ``NUMBER``, ``DATE``, or ``ROWID``.

The module exports the following names:

``dbiDate(value)``
   This function constructs a 'dbiDate' instance that holds a
   date value.  The value should be specified as an integer
   number of seconds since the "epoch" (e.g. ``time.time()``).

``dbiRaw(value)``
   This function constructs a 'dbiRaw' instance that holds a
   raw (binary) value.  The value should be specified as a
   Python string.

``STRING``
   This object is used to describe columns in a database that
   are string-based (e.g. CHAR).

``RAW``
   This object is used to describe (large) binary columns in
   a database (e.g. LONG RAW, blobs).

``NUMBER``
   This object is used to describe numeric columns in a
   database.

``DATE``
   This object is used to describe date columns in a
   database.

``ROWID``
   This object is used to describe the "Row ID" column in a
   database.


Acknowledgements
================

Many thanks go to Andrew Kuchling who converted the Python
Database API Specification 1.0 from the original HTML format into
the PEP format.


Copyright
=========

This document has been placed in the Public Domain.


..
  Local Variables:
  mode: indented-text
  indent-tabs-mode: nil
  End: