Contributing to pygmtools: Developer Documentation

First, thank you for contributing to pygmtools!

How to contribute

The preferred workflow for contributing to pygmtools is to fork the main repository on GitHub, clone, and develop on a branch. Steps:

  1. Fork the project repository by clicking on the ‘Fork’ button near the top right of the page. This creates a copy of the code under your GitHub user account. For more details on how to fork a repository see this guide.

  2. Clone your fork of the repo from your GitHub account to your local disk:

    $ git clone git@github.com:YourUserName/pygmtools.git
    $ cd pygmtools
    
  3. Create a feature branch to hold your development changes:

    $ git checkout -b my-feature
    

    Always use a feature branch. It is good practice to never work on the master branch!

  4. Develop the feature on your feature branch. Stage the changed files with git add, then commit them with git commit:

    $ git add modified_files
    $ git commit
    

    to record your changes in Git, then push the changes to your GitHub account with:

    $ git push -u origin my-feature
    
  5. Follow these instructions to create a pull request from your fork. This will email the committers and an automatic check will run.

(If any of the above seems like magic to you, please look up the Git documentation on the web, or ask a friend or another contributor for help.)

Pull Request Checklist

We recommend that your contribution comply with the following rules before you submit a pull request:

  • Follow the PEP8 Guidelines.

  • If your pull request addresses an issue, please use the pull request title to describe the issue and mention the issue number in the pull request description. This will make sure a link back to the original issue is created.

  • All public methods should have informative docstrings with sample usage presented as doctests when appropriate.

  • When adding additional functionality, provide at least one example script under the function’s API. Have a look at other functions’ examples for reference. You are also encouraged to add new examples to the examples/ folder to demonstrate why the new functionality is useful in practice and, if possible, compare it to other methods available in pygmtools.

    • If you have modified any examples, please build the documentation before you commit, and please also commit any changes in docs/auto_examples/.

  • Documentation and high-coverage tests are necessary for enhancements to be accepted. Bug fixes and new features should come with non-regression tests. These tests verify the correct behavior of the fix or feature, so that later modifications to the code base are guaranteed to stay consistent with the desired behavior. For bug fixes, at the time of the PR these tests should fail on the code base in master and pass on the PR code.

  • Include at least one paragraph of narrative documentation with links to references in the literature and to the example.
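
As an illustration of the docstring and testing rules above, here is a minimal sketch of a public function documented with doctests, together with a non-regression test for a bug fix. The function normalize_scores and its test are hypothetical and are not part of pygmtools; they only show the shape such a contribution might take.

```python
import doctest


def normalize_scores(scores):
    """Scale a list of non-negative scores so that they sum to 1.

    ``normalize_scores`` is a hypothetical helper used only for
    illustration; it is not part of pygmtools.

    >>> normalize_scores([1, 1, 2])
    [0.25, 0.25, 0.5]
    >>> normalize_scores([0, 0])
    [0.0, 0.0]
    """
    total = sum(scores)
    if total == 0:
        # The (hypothetical) bug fix: avoid dividing by zero.
        return [0.0 for _ in scores]
    return [s / total for s in scores]


def test_normalize_scores_all_zero():
    """Non-regression test: all-zero input must not raise ZeroDivisionError."""
    assert normalize_scores([0, 0]) == [0.0, 0.0]


if __name__ == "__main__":
    # Run the doctests embedded in this module, then the regression test.
    doctest.testmod()
    test_normalize_scores_all_zero()
```

Without the zero guard, the test would fail with ZeroDivisionError, which is the "fails on master, passes on the PR" behavior expected of a bug-fix test.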

You can also check for common programming errors with the following tools:

  • No pyflakes warnings, check with:

    $ pip install pyflakes
    $ pyflakes path/to/module.py
    
  • No PEP8 warnings, check with pycodestyle (the tool formerly named pep8):

    $ pip install pycodestyle
    $ pycodestyle path/to/module.py
    
  • autopep8 can automatically fix many of the simpler style errors. By default it prints the fixed source; add --in-place to modify the file directly:

    $ pip install autopep8
    $ autopep8 path/to/module.py
    

Filing bugs

We use GitHub issues to track all bugs and feature requests; feel free to open an issue if you have found a bug or wish to see a feature implemented.

It is recommended to check that your issue complies with the following rules before submitting:

  • Verify that your issue is not being currently addressed by other issues or pull requests.

  • Please ensure all code snippets and error messages are formatted in appropriate code blocks. See Creating and highlighting code blocks.

  • Please include your operating system type and version number, as well as your Python, pygmtools, numpy, and scipy versions. Please also provide the name of your running backend, and the GPU/CUDA versions if you are using GPU. This information can be found by running the following environment report (pygmtools>=0.2.9):

    $ python3 -c 'import pygmtools; pygmtools.env_report()'
    

    If you are using GPU, make sure to install pynvml before running the above script: pip install pynvml.

  • Please be specific about what estimators and/or functions are involved and the shape of the data, as appropriate; please include a reproducible code snippet or link to a gist. If an exception is raised, please provide the traceback.
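
If pygmtools.env_report() is not available (for example, on a version older than 0.2.9), part of the requested information can be gathered with the standard library alone. The sketch below is an assumption-laden fallback, not the pygmtools implementation: collect_versions is a hypothetical helper, and it does not report the running backend or GPU/CUDA versions, which you would still need to add by hand.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("pygmtools", "numpy", "scipy")):
    """Return a dict of OS, Python, and package versions for a bug report.

    Hypothetical helper for illustration; not part of pygmtools.
    """
    report = {
        "OS": platform.platform(),
        "Python": sys.version.split()[0],
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of failing.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```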

Documentation

We are glad to accept any sort of documentation: function docstrings, reStructuredText documents, tutorials, examples, etc. reStructuredText documents live in the source code repository under the docs/ directory.

You can edit the documentation using any text editor and then generate the HTML output by typing make html from the docs/ directory. The resulting HTML files are in docs/_build/ and are viewable in any web browser. The example files in examples/ are also built. If you want to skip building the examples, please use the command make html-noplot.

To build the documentation, you will need the packages listed in docs/requirements.txt. Please use python==3.8 to stay consistent with the online Read the Docs builder. If you have modified any examples, please build the documentation before you commit, and please also commit any changes in docs/auto_examples/.

When you are writing documentation, it is important to strike a good balance between mathematical and algorithmic detail and giving the reader intuition about what the algorithm does. It is best to always start with a short paragraph that explains, informally, what the method does to the data.

Notes for Developers

Here we show some ideas to help developers better understand our designs and contribute to pygmtools.

Multiple Numerical Backends

pygmtools supports multiple backends, including numpy, pytorch, jittor, and paddle, aiming to cater to a broader audience of researchers and practitioners. Each backend has its own strengths and performance characteristics, so it is beneficial to offer users multiple options.

Unified API

Despite supporting multiple backends, pygmtools presents a unified API for its functionalities. This makes it easier for users to switch between backends and improves both ease of use and maintainability.
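
To make the idea concrete, the following toy sketch shows one common way to put a single public function in front of several interchangeable implementations. Everything in it is hypothetical (the names DEFAULT_BACKEND, _BACKENDS, and row_normalize, and the two toy "backends" built on floats and exact fractions); it is not the actual pygmtools code, only an illustration of the dispatch pattern.

```python
from fractions import Fraction

# Hypothetical module-level default used when no backend is given.
DEFAULT_BACKEND = "float"


def _row_normalize_float(matrix):
    """Toy backend using built-in floats (fast, approximate)."""
    return [[x / sum(row) for x in row] for row in matrix]


def _row_normalize_fraction(matrix):
    """Toy backend using exact rationals (slower, exact; integer input)."""
    return [[Fraction(x, sum(row)) for x in row] for row in matrix]


# Registry mapping a backend name to its implementation.
_BACKENDS = {
    "float": _row_normalize_float,
    "fraction": _row_normalize_fraction,
}


def row_normalize(matrix, backend=None):
    """Public entry point: same signature whatever the backend."""
    name = DEFAULT_BACKEND if backend is None else backend
    try:
        impl = _BACKENDS[name]
    except KeyError:
        raise ValueError(f"Unknown backend: {name!r}") from None
    return impl(matrix)
```

With this layout, adding a backend means registering one more implementation; callers keep using row_normalize(matrix) or row_normalize(matrix, backend="fraction") unchanged.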

Documentation-Driven Development

We put a strong emphasis on comprehensive and user-friendly documentation. pygmtools should include usage examples, tutorials, and clear explanations for each component, making it easier for users and contributors to understand and utilize the toolkit.

Performance Benchmarks

We compare the algorithms and numerical backends under various conditions, aiming to provide users with empirical data to guide their selection of the most suitable solvers/algorithms for their specific use cases.
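
A comparison of this kind can be sketched with the standard timeit module. The helper below is a hypothetical harness, not the project's benchmark suite: benchmark, candidates, and make_input are names introduced only for this example, and a real comparison of solvers or backends would also need to control for device, warm-up, and input distribution.

```python
import timeit


def benchmark(candidates, make_input, sizes, repeat=3, number=10):
    """Time each callable in `candidates` on inputs of each size.

    Returns a dict {(name, size): best_seconds_per_call}.
    Hypothetical helper for illustration only.
    """
    results = {}
    for size in sizes:
        data = make_input(size)
        for name, func in candidates.items():
            # Take the best of `repeat` runs to reduce timing noise.
            timings = timeit.repeat(
                lambda: func(data), repeat=repeat, number=number
            )
            results[(name, size)] = min(timings) / number
    return results


if __name__ == "__main__":
    # Toy candidates: two ways of sorting a reversed list.
    candidates = {
        "sorted": sorted,
        "copy+sort": lambda xs: list(xs).sort(),
    }
    results = benchmark(
        candidates, lambda n: list(range(n, 0, -1)), sizes=[100, 1000]
    )
    for (name, size), seconds in results.items():
        print(f"{name:>10} n={size:<5} {seconds:.2e} s/call")
```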

Open-Source Collaboration

Open-source principles and collaborative development are central to pygmtools. We encourage open discussions, peer reviews, and community engagement to improve the quality and scope of the toolkit.

This contribution guide is strongly inspired by that of the scikit-learn team.