Python Code Distribution (PyPI)

23/02/2024 --- Distributing Python Program --- Nathaniel SEYLER

This Tuto Techno is largely inspired by Loïc Gouarin's "Distribuer son Application Python".

Before starting, clone the repository:

git clone git@gitlab.inria.fr:tutos-technos/python-pip-code-distribution.git

or

git clone https://gitlab.inria.fr/tutos-technos/python-pip-code-distribution.git

This presentation is also available in the wiki:
https://gitlab.inria.fr/tutos-technos/python-pip-code-distribution/-/wikis/home
It contains more details and exercises.

[TOC]


Introduction

  • The Python language is now widely used in research. Many libraries and applications are developed in our laboratories and research teams using this language.

  • But are these developments given back to the community? How much time and energy does it cost to make work available? What are the benefits?

  • In this Tuto Techno, we'll explain what a Python package is and how to organize your code so that it is ready for distribution.


Virtual environment

A Python virtual environment is a small, easily reproducible, and disposable space in which you work in isolation from the system-wide Python installation.


Tools

  • venv: available since Python 3.3 as part of the standard library, so it is installed along with Python. It offers a subset of virtualenv's features.
  • virtualenv: more flexible than venv
  • pipenv: combines pip and virtualenv into one command
  • mamba: a conda-compatible package manager that also manages virtual environments
  • ...

Create and activate an environment

  • pipenv

    pipenv install
    pipenv shell
    
  • virtualenv

    virtualenv my_env
    source my_env/bin/activate
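
The venv tool listed earlier can also be driven from Python itself. The following is a minimal sketch, assuming the standard-library venv module is available on your system; the helper name create_env and the directory name my_env are only illustrative.

```python
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment in env_dir and return its pyvenv.cfg path."""
    # with_pip=False keeps the creation fast and offline;
    # use with_pip=True if you want pip bootstrapped inside the environment.
    builder = venv.EnvBuilder(with_pip=False)
    builder.create(env_dir)
    # Every environment created by venv contains a pyvenv.cfg marker file.
    return env_dir / "pyvenv.cfg"
```

Calling create_env(Path("my_env")) is roughly what python -m venv --without-pip my_env does; the environment is then activated from the shell as shown above.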
    

Declare your dependencies

These tools can install dependencies from a requirements.txt file. Sharing this file among developers lets everyone work in the same development environment.

requirements.txt example:

numpy>1.20
matplotlib==3.7.1
jupyterlab>4.0,<4.1; platform_system == "Linux"
requests [security] >= 2.8.1, == 2.8.* ; python_version < "2.7"
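
The part after the semicolon in the last two lines is an environment marker: the requirement is installed only if the marker is true on the target machine. As a rough sketch, the values that pip compares against can be inspected with the standard library; the function name marker_environment is hypothetical and only a few marker variables are shown.

```python
import os
import platform
import sys


def marker_environment() -> dict:
    """Return a few of the values used when evaluating environment markers."""
    return {
        # Compared as a version, e.g. python_version < "2.7"
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        # Compared as a string, e.g. platform_system == "Linux"
        "platform_system": platform.system(),
        "sys_platform": sys.platform,
        "os_name": os.name,
    }
```

Printing marker_environment() on your machine tells you, for instance, whether the jupyterlab line above would apply to it.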

Install dependencies

  • pip

    pip install -r requirements.txt
    
  • pipenv

    pipenv install -r requirements.txt
    

Deactivate the active environment

deactivate

Structure of a Python project

A Python library (i.e. code made to be imported) consists of:

  • Python files with the extension .py, called modules,

  • directories containing Python files, called packages.

A Python project (i.e. code made to be distributed and installed) can use different layouts:

  • single module
  • flat
  • src

Single module

examples/calculator_mod/
├── calculator_mod.py
├── pyproject.toml
├── README.md
└── LICENSE.txt

Flat layout

examples/calculator_flat_layout/
├── calculator
│   ├── __init__.py
│   └── operator
│       ├── __init__.py
│       ├── add.py
│       └── sub.py
├── pyproject.toml
├── README.md
└── LICENSE.txt
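
To see how packages and modules map to imports, the sketch below recreates the calculator package tree in a directory of your choice and imports a function from it. The contents of add.py and sub.py are assumptions made for illustration; the actual files in the examples folder may differ.

```python
import importlib
import sys
from pathlib import Path


def make_flat_layout(root: Path) -> None:
    """Create calculator/operator/{add,sub}.py under root (hypothetical contents)."""
    package = root / "calculator"
    subpackage = package / "operator"
    subpackage.mkdir(parents=True)
    # An __init__.py file marks each directory as a package.
    (package / "__init__.py").write_text("")
    (subpackage / "__init__.py").write_text("")
    (subpackage / "add.py").write_text("def add(a, b):\n    return a + b\n")
    (subpackage / "sub.py").write_text("def sub(a, b):\n    return a - b\n")


def import_add_from(root: Path):
    """Import calculator.operator.add with root temporarily on sys.path."""
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("calculator.operator.add")
    finally:
        sys.path.remove(str(root))
    return module.add
```

With the flat layout, the project root plays the role of root here: running Python from that directory is enough to import calculator, even without installing it.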

Src layout

examples/calculator_src_layout/
├── src
│   └── calculator
│       ├── __init__.py
│       └── operator
│           ├── __init__.py
│           ├── add.py
│           └── sub.py
├── pyproject.toml
├── README.md
└── LICENSE.txt

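=> With the src layout, the package cannot be imported from the project root until it is installed, either in editable mode (for development) or as a regular install:

pip install -e . # editable install
pip install . # regular install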

Packaging and distributing

Most important files:

  • pyproject.toml (configuration file)
  • README (.rst or .md)
  • LICENSE

Configuration

Minimum required content of pyproject.toml:

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "calculator"
dynamic = ["version"]
requires-python = ">=3.8"

[tool.setuptools.dynamic]
version = { attr = "calculator.version.__version__" }

Other backends: Hatchling, PDM, Flit, Whey, Scikit-build, etc.


You can add the dependencies of your project:

[project]
...
dependencies = [
  "numpy",
]

You can add a command-line interface to turn your library into an application:

[project.scripts]
calculator-script = "calculator.command_line:main"
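
The entry point above means that, once the package is installed, the calculator-script command calls the main function of the calculator/command_line.py module. That module is not shown here, so the following is only a sketch of what it could look like; the arguments and the inline add function are assumptions.

```python
import argparse


def add(a: float, b: float) -> float:
    """Stand-in for the library function the command would normally import."""
    return a + b


def main(argv=None) -> int:
    """Entry point referenced as "calculator.command_line:main"."""
    parser = argparse.ArgumentParser(
        prog="calculator-script", description="Add two numbers."
    )
    parser.add_argument("a", type=float, help="first operand")
    parser.add_argument("b", type=float, help="second operand")
    # When argv is None, argparse reads the arguments from sys.argv[1:].
    args = parser.parse_args(argv)
    print(add(args.a, args.b))
    return 0
```

The installed script calls main() without arguments and uses its return value as the exit code, so returning 0 signals success.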

To make your package more visible once it is published, fill in a little more metadata in the [project] table before distributing it. For example:

  • description
  • authors / maintainers
  • readme
  • license
  • keywords
  • classifiers
  • ...

Package your project

Use the build package to do the packaging. It builds the project in an isolated environment and generates a source distribution (sdist) and a wheel in the dist/ directory.

python -m build
python -m build -s # only source (sdist)
python -m build -w # only wheel (bdist)

  • sdist is a .tar.gz archive that contains your source files.
    pip uses it to build a wheel when it cannot find a wheel matching the target system.
  • bdist (built distribution) produces a wheel (.whl): an archive of files and metadata that only need to be copied to the correct location on the target system to be installed.
    • pure Python => not platform dependent (hello-0.1-py3-none-any.whl). It contains only your Python files.
    • not pure Python => platform dependent (calculator-0.1.0-cp310-cp310-linux_x86_64.whl). It contains Python files and compiled files (.so, .dll, ...)
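
The wheel file names above follow the pattern name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version. The sketch below splits such a name to show where the platform dependence is encoded; the helper names are hypothetical and the parsing is deliberately simplified.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its main components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts, or six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelName(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def is_pure_python(wheel: WheelName) -> bool:
    """A pure-Python wheel targets no specific ABI and no specific platform."""
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"
```

For example, parse_wheel_filename("hello-0.1-py3-none-any.whl") yields the tags py3, none and any, which is why that wheel can be installed on any platform.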

Publish your package

The twine tool uploads the distributions you've created in the dist directory to PyPI. Two sites are available:

  • https://test.pypi.org/ (TestPyPI, to try out an upload)
  • https://pypi.org/ (PyPI, the real index)

Documentation

Sphinx is now widely used to write documentation for Python applications.
It is configured through the conf.py file and can be extended with extensions such as autodoc, which automatically pulls documentation from docstrings.

To start your documentation, run

sphinx-quickstart docs

and answer the questions. This generates the basic files needed by Sphinx.


Build your documentation

To build the documentation, you can run

sphinx-build -M html sourcedir outputdir

or

make html # if you ran sphinx-quickstart before

You can view your documentation by opening the index.html file located in the html subfolder of your build folder.


Add contents

In Sphinx source files, you can use most features of standard reStructuredText (.rst).
The index.rst file will be your landing page.
You can then add a new page by creating a new .rst file.
For example api.rst:

.. function:: foo(x)
              foo(y, z)
   :module: some.module.name

   Return a line of text input from the user.

Now you can link your page in the toctree of index.rst:

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   api

Theme

The default theme is Alabaster. You can change it by setting the html_theme variable in the conf.py file:

html_theme = 'pydata_sphinx_theme'

Other themes:

  • sphinx_rtd_theme
  • pydata_sphinx_theme
  • furo

See more themes here: https://sphinx-themes.org/


Extensions

To enable an extension, add it to the extensions list
in the conf.py file.
For example:

extensions = [..., "sphinx.ext.autodoc", ...]
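
Since conf.py is an ordinary Python file, the settings discussed so far can be put together as follows. This is a minimal sketch rather than the file generated by sphinx-quickstart: the project and author values are placeholders, and only extensions that ship with Sphinx are listed.

```python
# docs/conf.py (sketch with placeholder values)

# Project information shown in the generated pages.
project = "calculator"
author = "Your Name"

# Extensions shipped with Sphinx; third-party ones must be installed first.
extensions = [
    "sphinx.ext.autodoc",   # pull documentation from docstrings
    "sphinx.ext.napoleon",  # understand NumPy and Google style docstrings
    "sphinx.ext.viewcode",  # add links to the highlighted source code
]

# Alabaster is the default theme; replace it with any installed theme.
html_theme = "alabaster"
```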

List of nice extensions

  • sphinx.ext.autodoc
  • sphinx.ext.viewcode
  • sphinx.ext.napoleon
  • sphinx-copybutton
  • nbsphinx
  • nbsphinx-link
  • myst-parser
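
With sphinx.ext.autodoc and sphinx.ext.napoleon enabled, the documentation can live in the docstrings of your code. The function below is a hypothetical example written in NumPy style; assuming it lives in calculator/operator/add.py, a directive such as ".. autofunction:: calculator.operator.add.add" in an .rst file would render it.

```python
def add(a, b):
    """Return the sum of two numbers.

    Parameters
    ----------
    a : float
        First operand.
    b : float
        Second operand.

    Returns
    -------
    float
        The sum ``a + b``.
    """
    return a + b
```

Keeping the description next to the code makes it more likely that the two stay in sync when the function changes.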

Continuous Deployment


Package distribution

GitHub

  • create an environment (e.g. pypi) to deploy to.
  • create a workflow that triggers on release
    • use GitHub Actions to build and publish (pypi-publish) the package
  • on PyPI, add your workflow as a trusted publisher
  • create a release from GitHub

GitLab

On GitLab there are no GitHub Actions and no trusted publishers (at the time of this presentation), so we need to use
twine with an API token to publish.

  • on PyPI, create an API token
  • create an environment (e.g. pypi) to deploy to.
  • save the token as a CI/CD variable, scoped either to that environment or to the whole project.
  • create a pipeline that triggers on tag
    • create jobs that build and publish with your token (read from the CI/CD variable)
  • create a release or push a git tag

Documentation distribution

  • Read the Docs

    • requirements.txt
    • .readthedocs.yml
  • Pages

    • GitHub with GitHub Actions
    • GitLab with the pages job

Practical session

Go to https://gitlab.inria.fr/tutos-technos/python-pip-code-distribution/-/wikis/home.
Don't hesitate to look at the project examples located in the examples folder.

Do the exercises of the wiki:

  • env, structure (~30 min)
  • packaging and distributing (~1 h)
  • documentation (~30 min)
  • CD (~30 min)

Feel free to ask questions at any time!
