Doc examples #47

Closed
wants to merge 45 commits into from
45 commits
978e69a
✏️ Fix get_collinear_point docstring #18
EssamWisam Sep 2, 2023
e52acf0
✏️ Improve SMOTE-N docstring #23
EssamWisam Sep 2, 2023
bb01745
✏️ Fix output tagging in examples #24
EssamWisam Sep 2, 2023
51c561f
✏️ Fix table_wrapper docstring typo #39
EssamWisam Sep 2, 2023
8b8025a
✏️ Rename example variable for clarity #26
EssamWisam Sep 2, 2023
ef4b6e1
✅ Better testing of floats equality #28
EssamWisam Sep 2, 2023
00b2c38
✏️ Fix matrixify argument docstring #38
EssamWisam Sep 2, 2023
7de6bec
🚨 Do away with abstract types in structs #33 #22
EssamWisam Sep 2, 2023
a300e2b
🚨Omit function keyword from docstrings #31
EssamWisam Sep 2, 2023
9335cbb
📝 README retouch
EssamWisam Sep 2, 2023
bb15304
🔥 Delete CompatHelper.yml
EssamWisam Sep 3, 2023
ca5a90e
tweak Credits in readme
ablaom Sep 4, 2023
c0adcba
tweak Credits in readme
ablaom Sep 4, 2023
52251fb
Merge branch 'readme-credits' of https://github.com/JuliaAI/Imbalance…
ablaom Sep 4, 2023
fedf801
Merge pull request #44 from JuliaAI/readme-credits
EssamWisam Sep 4, 2023
6139872
💄 Modularize the interfaces
EssamWisam Sep 4, 2023
eb15a0b
Merge pull request #45 from JuliaAI/Modularization
EssamWisam Sep 4, 2023
13f3005
🚑 Less excessive typing #30, #40
EssamWisam Sep 4, 2023
468d210
Even less excessive typing #31
EssamWisam Sep 4, 2023
f5025ff
🚑 Fix URLs
EssamWisam Sep 4, 2023
59e70f9
⬆️ Update Documenter.yml
EssamWisam Sep 4, 2023
5ebcda7
👽️ Dummy commit
EssamWisam Sep 4, 2023
56e21b2
Merge branch 'dev' of https://github.com/JuliaAI/Imbalance.jl into dev
EssamWisam Sep 4, 2023
5eb8f72
➕ Simplify the table wrapper #41
EssamWisam Sep 4, 2023
4e483e0
📝 Update README.md
EssamWisam Sep 4, 2023
26ea218
Merge branch 'dev' into Doc-Examples
EssamWisam Sep 4, 2023
72c31ae
⚡️ Improve KNN computations
EssamWisam Sep 4, 2023
bad5c96
💫 major code refactoring
EssamWisam Sep 4, 2023
97eb6c0
✅ Fix SMOTENC test
EssamWisam Sep 4, 2023
9c0ff78
✅ Small ROSE fix
EssamWisam Sep 4, 2023
e308511
✨ Add tree hyperparameter to SMOTENC
EssamWisam Sep 4, 2023
270dd99
✨ Add missing "begin"
EssamWisam Sep 4, 2023
0cedb6e
🚑 Fix badge
EssamWisam Sep 5, 2023
ed78613
Update Documenter.yml
EssamWisam Sep 5, 2023
3e027c3
Dummy commit
EssamWisam Sep 5, 2023
7ea45dc
Update Documenter.yml
EssamWisam Sep 5, 2023
2318230
➕ Add some deps
EssamWisam Sep 5, 2023
7d90689
Merge branch 'Doc-Examples' of https://github.com/JuliaAI/Imbalance.j…
EssamWisam Sep 5, 2023
ad067bd
Create codecov.yml
EssamWisam Sep 5, 2023
64935ff
⬆️ Update docs deployment branch
EssamWisam Sep 5, 2023
9ae563a
Merge pull request #46 from JuliaAI/CodeCov-setup
EssamWisam Sep 5, 2023
c65fa51
🚑 Fix small issues with docs
EssamWisam Sep 5, 2023
de88c04
Merge branch 'dev' into Doc-Examples
EssamWisam Sep 5, 2023
5f6a109
🚑 Fix Colab img
EssamWisam Sep 5, 2023
860e549
Merge branch 'Doc-Examples' of https://github.com/JuliaAI/Imbalance.j…
EssamWisam Sep 5, 2023
45 changes: 0 additions & 45 deletions .github/workflows/CompatHelper.yml

This file was deleted.

4 changes: 2 additions & 2 deletions .github/workflows/Documenter.yml
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@ name: Documenter
on:
push:
branches:
- - main
+ - Doc-Examples
tags: '*'
pull_request:
jobs:
@@ -19,4 +19,4 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # For authentication with GitHub Actions token
DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }} # For authentication with SSH deploy key
- run: julia --project=docs/ docs/make.jl
+ run: julia --project=docs/ docs/make.jl
4 changes: 4 additions & 0 deletions .github/workflows/codecov.yml
@@ -0,0 +1,4 @@
+ - name: Upload coverage reports to Codecov
+ uses: codecov/codecov-action@v3
+ env:
+ CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
12 changes: 4 additions & 8 deletions Project.toml
@@ -6,20 +6,13 @@ version = "0.1.0"
[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
CategoricalDistributions = "af321ab8-2d2e-40a6-b165-3d674595d28e"
Conda = "8f4d0f93-b110-5947-807f-2305c1781a2d"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
DocumenterTools = "35a29f4d-8980-5a13-9543-d66fff28ecb8"
JuliaFormatter = "98e50ef6-434e-11e9-1051-2b60c6c9e899"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
MLJTestInterface = "72560011-54dd-4dc2-94f3-c5de45b75ecd"
Memoization = "6fafb56a-5788-4b4e-91ca-c0cea6611c73"
NearestNeighbors = "b8a86587-4115-5ab1-83bc-aa920d37bbce"
OneRule = "90484964-6d6a-4979-af09-8657dbed84ff"
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
Parameters = "d96e819e-fc66-5662-9728-84c9c7592b0a"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
ScientificTypes = "321657f4-b219-11e9-178b-2701a2544e81"
@@ -46,6 +39,9 @@ PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"
TableTransforms = "0d432bfd-3ee1-4ac1-886a-39f05cc69a3e"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
+ Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
+ Conda = "8f4d0f93-b110-5947-807f-2305c1781a2d"
+

[targets]
- test = ["Test", "DataFrames", "MLJBase", "TableTransforms", "StableRNGs", "PyCall"]
+ test = ["Test", "DataFrames", "MLJBase", "TableTransforms", "StableRNGs", "PyCall", "Pkg", "Conda"]
14 changes: 7 additions & 7 deletions README.md
@@ -4,8 +4,8 @@

A Julia package with resampling methods to correct for class imbalance in a wide variety of classification settings.

- [![](https://img.shields.io/badge/docs-dev-blue.svg)](https://essamwisam.github.io/Imbalance.jl/dev/)
- [![Tests](https://github.com/EssamWisam/Imbalance.jl/actions/workflows/Runtests.yml/badge.svg)](https://github.com/EssamWisam/Imbalance.jl/actions/workflows/Runtests.yml)
+ [![](https://img.shields.io/badge/docs-dev-blue.svg)](https://juliaai.github.io/Imbalance.jl/dev/)
+ [![Tests](https://github.com/JuliaAI/Imbalance.jl/actions/workflows/Runtests.yml/badge.svg)](https://github.com/JuliaAI/Imbalance.jl/actions/workflows/Runtests.yml)

## ⏬ Installation
```julia
@@ -24,8 +24,8 @@ using Imbalance

# Set dataset properties then generate imbalanced data
probs = [0.5, 0.2, 0.3] # probability of each class
- num_rows, num_cont_feats = 100, 5
- X, y = generate_imbalanced_data(num_rows, num_cont_feats; probs, rng=42)
+ num_rows, num_continuous_feats = 100, 5
+ X, y = generate_imbalanced_data(num_rows, num_continuous_feats; probs, rng=42)

# Apply SMOTE to oversample the classes
Xover, yover = smote(X, y; k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)
@@ -55,7 +55,7 @@ All implemented oversampling methods are considered static transforms and hence,
This interface operates on single tables; it assumes that `y` is one of the columns of the given table. Thus, it follows a similar pattern to the `MLJ` interface except that the index of `y` is a required argument while instantiating the model and the data to be transformed via `apply` is only one table `Xy`.
```julia
using Imbalance
- using TableTransforms
+ using Imbalance.TableTransforms

# Generate imbalanced data
num_rows = 200
@@ -65,7 +65,7 @@ Xy, _ = generate_imbalanced_data(num_rows, num_features;
probs=[0.5, 0.2, 0.3], insert_y=y_ind, rng=42)

# Initiate SMOTE model
- oversampler = SMOTE_t(y_ind; k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)
+ oversampler = SMOTE(y_ind; k=5, ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8), rng=42)
Xyover = Xy |> oversampler # can chain with other table transforms
Xyover, cache = TableTransforms.apply(oversampler, Xy) # equivalently
```
@@ -110,4 +110,4 @@ One obvious possible remedy is to weight the smaller sums so that a learning alg
To our knowledge, there are no existing maintained Julia packages that implement oversampling algorithms for multi-class classification problems or that handle both nominal and continuous features. This has served as a primary motivation for the creation of this package.

## 👥 Credits
- This package was created by [Essam Wisam](https://github.com/EssamWisam) under supervision and expert guidance from his mentor [Dr. Anthony Blaom](https://github.com/ablaom). The binary `SMOTE` implementation by [Dr. Rik Huijzer](https://github.com/rikhuijzer) in `Resample.jl` has also been ultimately helpful while starting this project.
+ This package was created by [Essam Wisam](https://github.com/JuliaAI) as a Google Summer of Code project, under the mentorship of [Anthony Blaom](https://ablaom.github.io). Additionally, [Rik Huijzer](https://github.com/rikhuijzer) and his binary `SMOTE` implementation in `Resample.jl` have also been helpful.
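
Reviewer note on the README examples in this diff: both pass `ratios=Dict(0=>1.0, 1=> 0.9, 2=>0.8)`, but the excerpt shown here does not say how the ratios are interpreted. The sketch below assumes each ratio is the desired class size relative to the current majority class size. That reading is an assumption for illustration, not something confirmed by this diff, and the label vector is made up. It only uses Base Julia and does not call the package.

```julia
# Sketch under a stated assumption: ratio = desired class size / majority class size.
# `labels` is hypothetical data, not output of generate_imbalanced_data.
labels = vcat(fill(0, 50), fill(1, 20), fill(2, 30))
ratios = Dict(0 => 1.0, 1 => 0.9, 2 => 0.8)

# Count the observations in each class
counts = Dict{Int,Int}()
for label in labels
    counts[label] = get(counts, label, 0) + 1
end
majority = maximum(values(counts))

# Synthetic points each class would need under this reading (never negative)
extra = Dict(c => max(0, round(Int, ratios[c] * majority) - n) for (c, n) in counts)

for c in sort(collect(keys(extra)))
    println("class $c: $(counts[c]) rows present, $(extra[c]) to generate")
end
```

Under this reading the majority class gains nothing and the two minority classes are topped up towards 90% and 80% of its size. The package documentation should be checked before relying on that.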
4 changes: 4 additions & 0 deletions docs/Project.toml
@@ -1,3 +1,7 @@
[deps]
+ CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
+ DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterTools = "35a29f4d-8980-5a13-9543-d66fff28ecb8"
+ MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
+ ScientificTypes = "321657f4-b219-11e9-178b-2701a2544e81"
107 changes: 20 additions & 87 deletions docs/examples.jl
@@ -1,43 +1,46 @@
+ """
+ This file automatically generates the grid in examples.md from a given Julia dictionary.
+ """

data = [
Dict(
"title" => "Effect of Ratios Hyperparameter",
"description" => "In this tutorial we use an SVM and SMOTE and the Iris data to study
how the decision regions change with the amount of oversampling",
- "image" => "/examples/assets/iris smote.jpeg",
- "link" => "/examples/effect_of_ratios",
- "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/main/examples/effect_of_ratios.ipynb"
+ "image" => "./assets/iris smote.jpeg",
+ "link" => "./effect_of_ratios",
+ "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/Doc-Examples/examples/effect_of_ratios.ipynb"
),
Dict(
"title" => "From Random Oversampling to ROSE",
"description" => "In this tutorial we study the `s` parameter in rose and the effect
of increasing it.",
- "image" => "/examples/assets/iris rose.jpeg",
- "link" => "/examples/effect_of_s",
- "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/main/examples/effect_of_s.ipynb"
+ "image" => "./assets/iris rose.jpeg",
+ "link" => "./effect_of_s",
+ "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/Doc-Examples/examples/effect_of_s.ipynb"
),
Dict(
"title" => "SMOTE on Customer Churn Data",
"description" => "In this tutorial we apply SMOTE and random forest to predict customer churn based
on continuous attributes.",
- "image" => "/examples/assets/churn smote.jpeg",
- "link" => "/examples/smote_churn_dataset",
- "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/main/examples/smote_churn_dataset.ipynb"
+ "image" => "./assets/churn smote.jpeg",
+ "link" => "./smote_churn_dataset",
+ "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/Doc-Examples/examples/smote_churn_dataset.ipynb"
),
Dict(
"title" => "SMOTEN on Mushroom Data",
"description" => "In this tutorial we use a purely categorical dataset to predict mushroom odour.",
- "image" => "/examples/assets/mushy.jpeg",
- "link" => "/examples/smoten_mushroom",
- "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/main/examples/smoten_mushroom.ipynb"
+ "image" => "./assets/mushy.jpeg",
+ "link" => "./smoten_mushroom",
+ "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/Doc-Examples/examples/smoten_mushroom.ipynb"
),
Dict(
"title" => "SMOTENC on Customer Churn Data",
"description" => "In this tutorial we extend the SMOTE tutorial to include both categorical and continuous
data for churn prediction",
- "image" => "/examples/assets/churn smoten.jpeg",
- "link" => "/examples/smotenc_churn_dataset",
- "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/main/examples/smotenc_churn_dataset.ipynb"
+ "image" => "./assets/churn smoten.jpeg",
+ "link" => "./smotenc_churn_dataset",
+ "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/Doc-Examples/examples/smotenc_churn_dataset.ipynb"
)
]

@@ -51,10 +54,10 @@ for item in data
colab_link = item["colab_link"]
grid_item = """
<div class="grid-item">
- <a href="$colab_link"><img id="colab" src="/examples/assets/colab.png"/></a>
+ <a href="$colab_link"><img id="colab" src="./assets/colab.png"/></a>
<a href="$link">
<img src="$img_src" alt="Image">
- <div class="title">$title
+ <div class="item-title">$title
<p>$description</p>
</div>
</a>
@@ -66,81 +69,11 @@ end
template = """
```@raw html

- <!DOCTYPE html>
- <html lang="en">
- <head>
- <meta charset="UTF-8">
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
- <style>
-
- .grid {
- display: grid;
- grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
- gap: 20px;
- max-width: 1200px;
- padding: 20px;
- }
-
- .grid-item {
- position: relative;
- overflow: hidden;
- border-radius: 15px;
- transition: transform 0.3s ease;
- cursor: pointer;
- }
-
- .grid-item img {
- max-width: 100%;
- height: auto;
- display: block;
- }
-
- .grid-item .title {
- position: absolute;
- bottom: 0;
- left: 0;
- width: 100%;
- background-color: rgba(0, 0, 0, 0.9);
- color: #fff;
- padding: 8px 15px;
- font-size: 16px;
- }
-
- .title p {
- font-weight: normal !important;
- display: none;
- }
-
-
-
- .grid-item:hover p {
- display: block;
- }
-
- .grid-item:hover #colab {
- display: block;
- }
-
- #colab {
- border-radius: 25%;
- position: absolute;
- top: 3px;
- right: 3px;
- width: 11%;
- display: none;
- }
- </style>
-
- </head>
- <body>

<div class="grid">
$grid_items
</div>

- <script>
- </body>
- </html>

```"""

Expand Down
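
Reviewer note: the generator pattern in `docs/examples.jl` (a vector of `Dict`s interpolated into an HTML grid) can be sketched in a self-contained form as below. The helper names `render_item` and `render_grid` and the shortened description are hypothetical and do not appear in the PR. The markup mirrors the post-change version of the diff (relative `./` paths and the `item-title` class), and the surrounding `@raw html` wrapper is left out for brevity.

```julia
# Minimal sketch with assumed helper names; not the file's actual code.
entries = [
    Dict(
        "title" => "Effect of Ratios Hyperparameter",
        "description" => "Study how decision regions change with oversampling.",
        "image" => "./assets/iris smote.jpeg",
        "link" => "./effect_of_ratios",
        "colab_link" => "https://colab.research.google.com/github/JuliaAI/Imbalance.jl/blob/Doc-Examples/examples/effect_of_ratios.ipynb",
    ),
]

# Hypothetical helper: render one grid item via string interpolation
function render_item(item)
    return """
    <div class="grid-item">
        <a href="$(item["colab_link"])"><img id="colab" src="./assets/colab.png"/></a>
        <a href="$(item["link"])">
            <img src="$(item["image"])" alt="Image">
            <div class="item-title">$(item["title"])
                <p>$(item["description"])</p>
            </div>
        </a>
    </div>
    """
end

# Hypothetical helper: join all items inside the grid container
function render_grid(items)
    body = join(render_item.(items), "\n")
    return "<div class=\"grid\">\n$(body)</div>"
end

page = render_grid(entries)
println(page)
```

Keeping the entries as data and the markup in one helper means a new tutorial only needs one more `Dict`, which is the design the file appears to follow.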
11 changes: 5 additions & 6 deletions docs/make.jl
@@ -21,9 +21,10 @@ include("examples.jl")


makedocs(
- sitename = "Imbalance.jl",
- authors = "Essam Wisam, mentored by Dr. Anthony Blaom",
- repo="https://github.com/JuliaAI/TableTransforms.jl/",
+
+ sitename = "Imbalance.jl",
+ authors = "Essam Wisam, mentored by Dr. Anthony Blaom",
+ repo="https://github.com/JuliaAI/Imbalance.jl/",
format = Documenter.HTML(;
assets=[
"assets/favicon.ico",
@@ -44,6 +45,4 @@ makedocs(
# Documenter can also automatically deploy documentation to gh-pages.
# See "Hosting Documentation" and deploydocs() in the Documenter manual
# for more information.
- deploydocs(repo = "github.com/EssamWisam/Imbalance.jl.git")
-
-
+ deploydocs(repo = "github.com/JuliaAI/Imbalance.jl.git", devbranch="Doc-Examples")
3 changes: 2 additions & 1 deletion docs/src/about.md
@@ -1,2 +1,3 @@
# Credits
- This package was created by [Essam Wisam](https://github.com/EssamWisam) under supervision and expert guidance from his mentor [Dr. Anthony Blaom](https://github.com/ablaom). The binary `SMOTE` implementation by [Dr. Rik Huijzer](https://github.com/rikhuijzer) in `Resample.jl` has also been ultimately helpful while starting this project.
+ This package was created by [Essam Wisam](https://github.com/JuliaAI) as a Google Summer of Code project, under the mentorship of [Anthony Blaom](https://ablaom.github.io). Additionally, [Rik Huijzer](https://github.com/rikhuijzer) and his binary `SMOTE` implementation in `Resample.jl` have also been helpful.
+