Cookiecutter template for Python data projects that sets up the environment cleanly

rcammisola/python-data-project

Cookiecutter for Python Data Projects

This is a cookiecutter template for bootstrapping your work on Python data projects. It contains:

  • a directory structure for sorting your notebooks, data, models, figures, tasks and the source code you reuse in notebooks
  • a conda environment file with the basic Python libraries and some extras:
    • the classic data science stack: numpy, pandas, scikit-learn, seaborn, statsmodels, plotly and jupyterlab
    • lightgbm for prediction
    • missingno for missing data analysis
    • invoke as a replacement for a Makefile for managing project tasks
    • nbdime for diffing and merging notebooks
    • path.py for browsing files in Python
    • kaggle-api, a CLI for interacting with the Kaggle API (optional)
    • pytest and coverage (with badges)

The template post-hook will:

  • install itself as a package
  • add an ipykernel so that the environment is properly referenced by Jupyter
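
The two post-hook steps above can be sketched roughly as follows. This is an illustration, not the template's actual hook: the function names, the kernel name and the choice of an editable install (`pip install -e .`) are assumptions.

```python
import subprocess
import sys


def post_hook_commands(kernel_name, python=sys.executable):
    """Return the two post-hook commands as argument lists, without running them.

    `kernel_name` is a hypothetical parameter; the real hook may derive the
    name from the project or environment name.
    """
    return [
        # Install the generated project as a package (editable mode is an
        # assumption) so notebooks can import its source code.
        [python, "-m", "pip", "install", "-e", "."],
        # Register a Jupyter kernel that points at this environment.
        [python, "-m", "ipykernel", "install", "--user", "--name", kernel_name],
    ]


def run_post_hook(kernel_name):
    """Run each command in order, stopping at the first failure."""
    for command in post_hook_commands(kernel_name):
        subprocess.run(command, check=True)
```

Keeping the command list separate from the code that runs it makes it possible to inspect what the hook would do before executing anything.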

Prerequisites

  • Cookiecutter >= 1.4.0: This can be installed with pip or conda, depending on how you manage your Python packages:
$ pip install cookiecutter
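
If you want to confirm that the installed Cookiecutter meets the 1.4.0 minimum, a check along these lines should work. It is a minimal sketch: the helper names are hypothetical and the version parsing is deliberately simple (it keeps only the leading digits of the first three components).

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (1, 4, 0)  # minimum Cookiecutter version stated in this README


def parse_version(text):
    """Turn '1.7.3' into (1, 7, 3), ignoring any non-numeric suffix."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def cookiecutter_is_recent_enough(installed=None):
    """Compare the installed (or given) version string against MINIMUM."""
    if installed is None:
        try:
            installed = version("cookiecutter")
        except PackageNotFoundError:
            # Cookiecutter is not installed in this environment.
            return False
    return parse_version(installed) >= MINIMUM
```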

Generate a new project

In the folder where you want your project generated, run:
$ cookiecutter git@github.com:rcammisola/python-data-project.git

You can also clone the template into <path/to/template> and, from the folder where you want to generate your project, run cookiecutter <path/to/template>.

It will ask for the following values:

full_name
email
project_name
project_short_description
python_version
version
for_kaggle

Fill in the values for your project and voilà! Then follow the README inside your new project to finish the installation.
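
The same generation step can also be driven from Python without the interactive prompts, using Cookiecutter's `cookiecutter()` function with `no_input=True` and `extra_context`. The sketch below is hedged: `build_context` and `generate` are hypothetical helpers, and any prompt you leave out falls back to the template's default.

```python
# Prompt names listed above; their accepted values are defined by the template.
PROMPTS = (
    "full_name",
    "email",
    "project_name",
    "project_short_description",
    "python_version",
    "version",
    "for_kaggle",
)


def build_context(**answers):
    """Check that every answer matches a known prompt and return them as a dict."""
    unknown = sorted(set(answers) - set(PROMPTS))
    if unknown:
        raise ValueError(f"not a template prompt: {unknown}")
    return dict(answers)


def generate(template, answers, output_dir="."):
    """Generate a project from `template` without asking questions."""
    # Imported lazily so this module loads even if cookiecutter is missing.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        template,
        no_input=True,
        extra_context=build_context(**answers),
        output_dir=output_dir,
    )
```

For example, `generate("<path/to/template>", {"project_name": "my-analysis"})` would create the project in the current folder, with every other value left at its default.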

Contributing guide

All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.

Credits

This project is heavily influenced by drivendata's Data Science cookiecutter and cookiecutter Kaggle template project.

Other links that helped shape this cookiecutter:

TODO:

  • Git init
  • Add nb-clean to dev requirements
  • Add bumpversion
