station-10/ds-uberhacks

ds-uberhacks

Repo to complement my "10 Data Science Uberhacks to Turbocharge Your Workflow" presentation.

It contains my slides, this README with a bunch of links, and a basic Makefile.

The Uberhacks

1. GitHub Awesomeness

Awesome curated content. Great for researching and finding more uberhacks.

Some good awesome lists:

2. D-Tale

Eyeball data easily. Use it instead of MS Excel.

3. YData Profiling

EDA as a Service.

4. TheFuzz

Fuzzy String Matching.
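
To give a feel for what fuzzy string matching means, here is a minimal sketch. It does not call TheFuzz; it uses `difflib` from the Python standard library as a stand-in so it runs without extra installs, and its scores will not necessarily agree with TheFuzz's. The helper names `similarity` and `best_match` are made up for this illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score, ignoring case (hypothetical helper)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def best_match(query: str, choices: list[str]) -> tuple[str, int]:
    """Return the (choice, score) pair that looks most like the query."""
    scored = [(choice, similarity(query, choice)) for choice in choices]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    cities = ["Manchester", "Liverpool", "Leeds"]
    # A misspelt query should still land on the closest city name.
    print(best_match("Manchestr", cities))
```

This is the kind of thing that helps when joining datasets whose keys were typed by hand and don't quite line up.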

5. UK Open Data

So. Much. Data.

6. Yellowbrick

Easy AI Visualisation.

7. SHAP

AI Explainability.

8. Fairlearn

Non-discriminatory AI.

9. Metaflow

Easy Pipelines.

10. Make

CLI maker-easier.

Usage

  1. Check the Makefile in this repo. It contains some basic recipes for creating a conda environment and running a main.py file, but you can add anything that involves the CLI to it, such as Docker or cloud tasks.
  2. The Makefile is linked to the .env file. If you specify a variable in the .env, Make will read it and use it.
  3. Using Make is simple: make <command>. You can type make help or just make for a list of commands.
  4. For the create-environment command, Make will install everything in requirements-conda.txt and then everything in requirements-pip.txt.
  5. I've recently stopped trying to conda install anything. The general consensus is that it's broken because solving environments takes so long. As such, all requirements are in the requirements-pip.txt file. I still use conda to manage my environments because it's interoperable with cloud platforms and MLflow, but yeah, conda install sucks now. I hope they find a way to fix it 😫.
  6. Feel free to use this Makefile and setup as a base. I can't claim credit for it: I stole the Makefile from Yuxiang Gong's tutorial, which is probably also a good place to start if you want to learn more about Make.
  7. Remember... Make is pre-installed on Linux, available via Xcode on macOS and via choco (Chocolatey) on Windows!
  8. Last thing, I promise! nbstripout is awesome and will strip the output cells from your notebooks before they go into git, so you don't commit sensitive data. EVERYONE should be using it! In hindsight, it should have made the top 10. Maybe next time...
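
To make the last point concrete, here is a rough sketch of the idea behind stripping notebook outputs. It is not nbstripout's actual code, just an illustration assuming the usual notebook JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The function name `strip_outputs` and the sample notebook are made up for this example.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop printed data, tables, plots
            cell["execution_count"] = None  # drop the run counter as well
    return cleaned


# Hypothetical notebook with one code cell that printed something sensitive.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Customer analysis"]},
        {
            "cell_type": "code",
            "source": ["print(df.head())"],
            "execution_count": 3,
            "outputs": [{"output_type": "stream", "text": ["name, card_number"]}],
        },
    ],
    "nbformat": 4,
}

if __name__ == "__main__":
    print(json.dumps(strip_outputs(sample_notebook), indent=2))
```

The source code stays in the notebook; only what it printed is removed, which is exactly the part most likely to leak data into your git history.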

More Uberhacky Things...

These didn't make the list for one reason or another, but are still worth checking out!
