General style

You may also refer to common guidance like the Google Python Style Guide.

See also

The Services section contains Python-related content for PostgreSQL and RabbitMQ.

Organization-wide spell-checking

Naming

There are only two hard things in computer science: cache invalidation, naming things and off-by-one errors.

  • Use lower_snake_case for everything except constants (UPPER_SNAKE_CASE) and classes (UpperCamelCase).

  • Use the same terminology as other projects. At minimum, don’t use the same term for a different concept.

  • Use terminology from Enterprise Integration Patterns.

  • Don’t use “cute” names.

Comments

  • Use sentence case, correct punctuation, and correct capitalization. Do not omit articles.

  • Do not add TODO comments. Instead, create GitHub issues. TODOs are less visible to the team.

You can use comments to:

  • Describe why code is as it is.

  • Link to relevant documentation or original sources.

  • Add structure to long functions or consecutive lines of class attributes, for example.

When not to use comments:

  • Instead of describing what code does in a comment, describe it in a docstring (or not at all).

  • Instead of describing a magic value in a comment, assign it to a constant.

  • Where possible, use descriptive function names and variable names, to reduce the need for comments. However, in interpreted languages, don’t extract single-caller methods or assign single-use variables just to transform a comment into code.
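
As a minimal sketch of assigning a magic value to a constant instead of explaining it in a comment (the names MAX_RETRIES and should_retry are hypothetical, chosen only for illustration):

```python
# Hypothetical constant: the name documents the value, so the comparison below needs no comment.
MAX_RETRIES = 3


def should_retry(attempt):
    """Return whether another attempt is allowed."""
    return attempt < MAX_RETRIES
```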

Maintainers can find TODO comments with this command:

grep -R -i --exclude-dir .git --exclude-dir .sass-cache --exclude-dir .tox --exclude-dir __pycache__ --exclude-dir _build --exclude-dir _static --exclude-dir build --exclude-dir dist --exclude-dir htmlcov --exclude-dir node_modules --exclude-dir sass --exclude-dir LC_MESSAGES --exclude app.js --exclude conf.py '\btodo\b' .

Maintainers can find other comments with this regular expression:

(^#(?!!/)|  #)(?!:\s|\. | (isort|noqa|type): | https:)

across these files (ignoring boilerplate and test files):

-conf.py,-settings.py,-migrations/*,-spiders/*,-tests/*,*.py

Whitespace

Use empty lines to group code into logical units (paragraphs). For example, tests follow an Arrange, Act, Assert pattern. An empty line should separate each step.
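
For instance, a test laid out this way might look like the following sketch (the test itself is hypothetical):

```python
def test_sorted_returns_ascending_order():
    # Arrange
    values = [3, 1, 2]

    # Act
    result = sorted(values)

    # Assert
    assert result == [1, 2, 3]
```

The comments here only label the three steps for illustration. In practice, the empty lines alone are enough.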

Type annotations

Type hints are especially useful in packages for documentation using Sphinx and linting using Mypy. Use of type hints is optional.

Note

Since Mypy has many open issues for relatively common scenarios, using Mypy to validate your type hints is optional.

Reference: typing – Support for type hints
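
A minimal sketch of a type-hinted function, assuming Python 3.9 or later for the list[str] syntax (the function name find_index is hypothetical):

```python
from typing import Optional


def find_index(items: list[str], target: str) -> Optional[int]:
    """Return the position of ``target`` in ``items``, or ``None`` if it is absent."""
    try:
        return items.index(target)
    except ValueError:
        return None
```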

Exceptions

  • Do not raise built-in exceptions. Define specific exceptions in an exceptions.py module. For example:

    class ProjectNameError(Exception):
        """Base class for exceptions from within this package/application"""
    
    
    class SpecificNameError(ProjectNameError):
        """Raised if this clear condition is true"""
    
  • Do not use a bare except: or a generic except Exception:. Use specific error classes to avoid handling exceptions incorrectly.

  • Do not catch an exception and raise a new exception, unless the new exception has a special meaning (e.g. CommandError in Django).

  • If an unexpected error occurs within a long-running worker, allow the worker to die. For example, if a worker is failing due to a broken connection, it should not survive to uselessly attempt to reuse that broken connection.
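
As a rough sketch of how the points above might combine, the following raises a package-specific exception and catches only that class. MissingFieldError, get_identifier and get_identifier_or_none are hypothetical names, not part of any real package:

```python
class ProjectNameError(Exception):
    """Base class for exceptions from within this package/application"""


class MissingFieldError(ProjectNameError):
    """Raised if a required field is absent from a record"""


def get_identifier(record):
    # Raise a specific exception, not a built-in like KeyError or ValueError.
    if "id" not in record:
        raise MissingFieldError("record has no 'id' field")
    return record["id"]


def get_identifier_or_none(record):
    # Catch only the expected error class. Anything else propagates to the caller.
    try:
        return get_identifier(record)
    except MissingFieldError:
        return None
```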

Warnings

  • Do not add or override any methods in a Warning subclass. In particular, do not add required positional arguments to the __init__ method.

    Why?

    The warnings.catch_warnings(record=True) context manager catches instances of warnings.WarningMessage, not instances of the original warning classes. To reissue a warning, you need to call warnings.warn_explicit(), as Apache Airflow does:

    warnings.warn_explicit(w.message, w.category, w.filename, w.lineno, source=w.source)
    

    The warnings.warn_explicit() function calls category(message). If the __init__ method is overridden with additional required arguments, a TypeError is raised, such as MyWarning.__init__() missing 2 required positional arguments.

    Because the additional required arguments are unavailable, you can’t do:

    warnings.warn(category(w.message, var1, var2))  # var1 and var2 are indeterminable
    
  • Call warnings.warn(message, category=MyWarning), not warnings.warn(MyWarning(message)), to avoid the temptation to add required positional arguments to the __init__ method.

  • warnings.catch_warnings(record=True) catches all warnings. To reissue warnings you aren’t interested in:

    with warnings.catch_warnings(record=True) as wlist:
        warnings.simplefilter("always", category=MyWarning)
    
        ...
    
    for w in wlist:
        if issubclass(w.category, MyWarning):
            ...
        else:
            warnings.warn_explicit(w.message, w.category, w.filename, w.lineno, source=w.source)
    
  • Subclass from the UserWarning class, not the Warning class.
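
Taken together, the points above might look like the following sketch. The load function and its message are hypothetical. MyWarning subclasses UserWarning and adds no methods:

```python
import warnings


class MyWarning(UserWarning):
    """Hypothetical warning category with no added or overridden methods."""


def load(value):
    if value is None:
        # Pass the message and the category separately.
        warnings.warn("value is missing, using 0", category=MyWarning)
        return 0
    return value


with warnings.catch_warnings(record=True) as wlist:
    warnings.simplefilter("always", category=MyWarning)
    result = load(None)

# Each recorded item is a warnings.WarningMessage, which exposes the original category.
categories = [w.category for w in wlist]
```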

Formatted strings

Tip

Don’t use regular expressions or string methods to parse and construct filenames and URLs.

Use the pathlib (or os.path) module to parse or construct filenames, for cross-platform support.

Use the urllib.parse module to parse and construct URLs, notably: urlsplit (not urlparse), parse_qs, urljoin and urlencode. To replace part of a URL parsed with the urlsplit function, use its _replace method. See examples.
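
A minimal sketch of these functions, using a hypothetical URL and file path for illustration:

```python
from pathlib import Path
from urllib.parse import parse_qs, urlencode, urljoin, urlsplit

# Hypothetical URL, for illustration only.
url = "https://example.com/api/releases?page=2&format=json"

parts = urlsplit(url)
query = parse_qs(parts.query)  # values are lists of strings
query["page"] = ["3"]

# The result of urlsplit is a named tuple, so _replace returns a modified copy.
next_url = parts._replace(query=urlencode(query, doseq=True)).geturl()

# Resolve a relative reference against the original URL.
records_url = urljoin(url, "records")

# Construct and parse a filename without string methods.
path = Path("data") / "2024" / "releases.json"
suffix = path.suffix
stem = path.stem
```

Because parse_qs returns a list for each key, doseq=True is passed to urlencode so that each list item is encoded as its own parameter.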

See also

How to construct SQL statements

Formatted string literals (f-strings), introduced in Python 3.6 via PEP 498, are preferred for interpolation of variables:

message = f"hello {name}"

For interpolation of expressions, the str.format() method is preferred if it is easier to read and write. For example:

message = "Is '{name}' correct?".format(name=person["name"])

or:

message = "Is '{person[name]}' correct?".format(person=person)

is easier to write than:

message = f"""Is '{person["name"]}' correct?"""  # AVOID

There are two cases in which f-strings and str.format() are not preferred:

Logging

“Formatting of message arguments is deferred until it cannot be avoided.” If you write:

logger.debug("hello {}".format("world"))  # WRONG

then str.format() is called whether or not the message is logged. Instead, please write:

logger.debug("hello %s", "world")
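
The difference can be observed with a small sketch. The Expensive class and the logger name are hypothetical. The class only counts how often it is converted to a string:

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is converted to a string."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "world"


logger = logging.getLogger("example")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # DEBUG messages are discarded

value = Expensive()

logger.debug("hello %s", value)  # not logged, so str(value) is never called
calls_deferred = Expensive.calls

logger.debug("hello {}".format(value))  # WRONG: str(value) is called anyway
calls_eager = Expensive.calls
```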

Internationalization (i18n)

String extraction in most projects is done by the xgettext command, which doesn’t support f-strings. To have a single syntax for translated strings, use named placeholders and the % operator, as recommended by Django. For example:

_('Today is %(month)s %(day)s.') % {'month': m, 'day': d}

Remember to put the % operator outside, not inside, the _() call:

_('Today is %(month)s %(day)s.' % {'month': m, 'day': d})  # WRONG

Note

To learn how to use or migrate between % and format(), see pyformat.info.

Maintenance

Maintainers can find improper formatting with these regular expressions. Test directories and Sphinx conf.py files can be ignored, if needed.

  • Unnamed placeholders, except for log messages, strftime(), psycopg2.extras.execute_values() and common false positives (e.g. % in SECRET_KEY default value):

    (?<!info)(?<!debug|error)(?<!getenv)(?<!warning)(?<!critical|strftime)(?<!exception)(?<!execute_values)\((\n( *['"#].*)?)* *['"].*?%[^( ]
    
  • Named placeholders, except for translation strings and SQL statements:

    (?<!\b[t_])(?<!one|all)(?<!pluck)(?<!gettext|execute|sql\.SQL)\((\n( *['"#].*)?)* *['"].*?%\(
    
  • Named placeholders, with incorrect position of % operator (trailing space):

    %\(.+(?<!\) )%
    
  • Log messages using f-strings or str.format() (case-sensitive), ignoring the extra keyword argument, ArgumentParser.error and Directive.error:

    ^( *)(?:\S.*)?\b(?<!self\.)(?<!subparser\.)_?(?:debug|info|warning|error|critical|exception)\((?:\n(\1 .+)?)*.*?(?<!extra=){
    
  • Translation strings using f-strings or str.format():

    ^( *)(?:\S.*)?(?:\b__?|gettext|lazy)\((?:\n(\1 .+)?)*.*?(?<!% ){
    
  • Remaining occurrences of str.format():

    [^\w\]]\.format\(
    

To correct any remaining occurrences of str.format(), use these patterns and replacements:

Pattern: ("[^"]*?{)(}[^"]*")\.format\(([\w.]+)\)
Replacement: f$1$3$2

Pattern: ('[^']*?{)(}[^']*')\.format\(([\w.]+)\)
Replacement: f$1$3$2

Pattern: ("[^"]*?{)(}[^"]*?{)(}[^"]*")\.format\(([\w.]+), ([\w.]+)\)
Replacement: f$1$4$2$5$3

Pattern: ('[^']*?{)(}[^']*?{)(}[^']*')\.format\(([\w.]+), ([\w.]+)\)
Replacement: f$1$4$2$5$3

Pattern: ("[^"]*?{)(}[^"]*?{)(}[^"]*?{)(}[^"]*?")\.format\(([\w.]+), ([\w.]+), ([\w.]+)\)
Replacement: f$1$5$2$6$3$7$4

Pattern: ('[^']*?{)(}[^']*?{)(}[^']*?{)(}[^']*?')\.format\(([\w.]+), ([\w.]+), ([\w.]+)\)
Replacement: f$1$5$2$6$3$7$4
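
The replacements above use the $1 group syntax found in many text editors. As a rough sketch, the first pattern could instead be applied with Python’s re module, which writes group references as \1. The source line here is hypothetical:

```python
import re

# First pattern from above: one unnamed placeholder in a double-quoted string.
pattern = re.compile(r'("[^"]*?{)(}[^"]*")\.format\(([\w.]+)\)')

# Hypothetical line of source code to rewrite.
source = 'message = "hello {}".format(name)'

# Python's re module writes group references as \1, not $1.
converted = pattern.sub(r"f\1\3\2", source)
```

Review each replacement by hand, since these patterns only cover simple arguments like names and attribute lookups.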

Long strings

For cases in which whitespace has no effect, like SQL statements, use multi-line strings:

cursor.execute("""
    SELECT *
    FROM table
    WHERE id > 1000
""")

For cases in which whitespace changes the output, like log messages, use consecutive strings:

logger.info(
    "A line with up to 119 characters. Use consecutive strings, one on each line, without `+` operators or join "
    "methods. Do not start a string with a space. Instead, append it to the previous string. If the message has "
    "multiple sentences, do not break the line at punctuation."
)

However, in some cases, it might be easier to edit in the form:

from textwrap import dedent

content = dedent("""\
# Heading

A long paragraph.

- Item 1
- Item 2
- Item 3
""")

Maintainers can find improper use of multi-line strings with this regular expression:

(?<!all|raw)(?<!dedent)(?<!execute)\((\n( *)(#.*)?)*"""

Data structures

Reference

Data Structures

  • To test whether a value equals one of many literals, use a set (not a tuple or list), because a set is fastest. For example:

    if status in {"cancelled", "unsuccessful"}:
        pass
    
  • To iterate over manually composed values, use a tuple (not a list or dict), because a tuple is the simplest option and is immutable. For example:

    for subject, index, column in (
        ("Buyer", 2, "buyer_id"),
        ("ProcuringEntity", 3, "procuring_entity_id"),
        ("Tenderer", 4, "tenderer_id"),
    ):
        pass
    

Default values

Use dict.setdefault instead of a simple if-statement. A simple if-statement has no elif or else branches, and a single statement in the if branch.

data.setdefault('key', 1)
if 'key' not in data:  # AVOID
    data['key'] = 1
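
The same method is also handy when the default is a mutable value. A minimal sketch, assuming hypothetical rows to group by role:

```python
# Hypothetical records, for illustration only.
rows = [("buyer", 1), ("tenderer", 2), ("buyer", 3)]

groups = {}
for role, identifier in rows:
    # setdefault returns the existing list, or stores and returns a new one.
    groups.setdefault(role, []).append(identifier)
```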

Maintainers can find simple if-statements with this regular expression:

^( *)if (.+) not in (.+):(?: *#.*)?\n(?: *#.*\n)* +\3\[\2\] = .+\n(?!(?: *#.*\n)*\1(else\b|elif\b|    \S))

Input/Output

import sys

print('message', file=sys.stderr)
sys.stderr.write('message\n')  # WRONG

See also

File formats

Functional style

itertools, filter() and map() can be harder to read, less familiar, and longer. On PyPy, they can also be slower.

Instead of using filter() and map() with a lambda expression, you can use a list comprehension in most cases. For example:

output = list(filter(lambda x: x < 10, xs))  # AVOID
output = [x for x in xs if x < 10]
output = list(map(lambda x: f'a string with {x}', xs))  # AVOID
output = [f'a string with {x}' for x in xs]

That said, it is fine to do:

output = map(str, xs)

Object-oriented style

Don’t force polymorphism and inheritance, especially if it sacrifices performance, maintainability or readability.

Python provides encapsulation via modules. As such, functions are preferred to classes where appropriate.

The primary feature for easy maintenance is locality: Locality is that characteristic of source code that enables a programmer to understand that source by looking at only a small portion of it.

Richard Gabriel

Maintainers can find class hierarchies, excluding those imposed by dependencies (Click, Docutils, Django, Django REST Framework, and standard libraries), with this regular expression:

\bclass \S+\((?!(AdminConfig|AppConfig|Directive|Exception|SimpleTestCase|TestCase|TransactionTestCase|json\.JSONEncoder|yaml.SafeDumper)\b|(admin|ast|click|forms|migrations|models|nodes|serializers|template|views|viewsets)\.|\S+(Command|Error|Warning)\b)

Simple statements

Reference

Simple statements

  • Never use relative imports.

Standard library

  • Use @dataclass for simple classes only. Using @dataclass with inheritance, mixins, class variables, etc. tends to increase complexity.
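
A minimal sketch of the kind of simple class this refers to. Point is a hypothetical example with typed fields only, and no inheritance, mixins or class variables:

```python
from dataclasses import dataclass


@dataclass
class Point:
    """Hypothetical simple class: typed fields and one default value."""

    x: int
    y: int = 0


point = Point(1)
```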

Scripts

If a repository requires a command-line tool for management tasks, create an executable script named manage.py in the root of the repository. (This matches Django.)

If you are having trouble with the Python path, try running the script with python -m script_module, which will add the current directory to sys.path.

Examples: extension_registry, deploy
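
A minimal sketch of what such a manage.py might contain, assuming the standard library’s argparse in place of any particular framework. The greet subcommand is hypothetical and stands in for a real management task:

```python
#!/usr/bin/env python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical management tasks.")
    subparsers = parser.add_subparsers(dest="command")

    # Hypothetical subcommand, for illustration only.
    greet = subparsers.add_parser("greet", help="Print a greeting")
    greet.add_argument("name")

    return parser


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command == "greet":
        print(f"hello {args.name}")
    else:
        # No subcommand was given, so show the available commands.
        parser.print_help()
    return 0


if __name__ == "__main__":
    main()
```

Accepting argv as a parameter keeps main() callable from tests without touching sys.argv.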