Merge pull request #40 from mathurinm/point2skglm
DOC update readme & doc landing page to point towards skglm
yngvem authored Jun 6, 2024
2 parents aa1c77a + 07fc392 commit b24d6a4
Showing 5 changed files with 301 additions and 242 deletions.
137 changes: 6 additions & 131 deletions README.rst
@@ -1,136 +1,11 @@
===========
Group Lasso
===========
⚠️⚠️ **Disclaimer** ⚠️⚠️: This package is no longer maintained.

.. image:: https://pepy.tech/badge/group-lasso
:target: https://pepy.tech/project/group-lasso
:alt: PyPI Downloads
If you are looking for efficient, scikit-learn-like models with group structure, such as **Group Lasso** and **Group Logistic Regression**, have a look at `skglm <https://github.com/scikit-learn-contrib/skglm>`_.

.. image:: https://github.com/yngvem/group-lasso/workflows/Unit%20tests/badge.svg
:target: https://github.com/yngvem/group-lasso

..
.. image:: https://coveralls.io/repos/github/yngvem/group-lasso/badge.svg
:target: https://coveralls.io/github/yngvem/group-lasso
``skglm`` provides efficient and scikit-learn-compatible models with group structure such as `Group Lasso <https://contrib.scikit-learn.org/skglm/generated/skglm.GroupLasso.html#skglm.GroupLasso>`_ and Group Logistic Regression.
It extends the features of ``scikit-learn`` for Generalized Linear Models by implementing a wealth of missing models.
Check out the `documentation <https://contrib.scikit-learn.org/skglm/api.html>`_ for the full list of supported models, and the `Gallery of examples <https://contrib.scikit-learn.org/skglm/auto_examples/index.html>`_ to see its speed and efficiency when tackling large scale problems.
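A minimal sketch of what migrating to ``skglm`` might look like, assuming its
``GroupLasso`` accepts a ``groups`` argument (an integer for contiguous,
equally sized feature blocks) and an ``alpha`` penalty strength; check the
``skglm`` documentation for the exact signature:

.. code-block:: python

    import numpy as np
    from skglm import GroupLasso

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 30))  # 30 features in 10 contiguous groups of 3
    # only the first group truly matters for the target
    y = X[:, :3] @ np.array([1.0, -2.0, 3.0]) + 0.1 * rng.standard_normal(100)

    model = GroupLasso(groups=3, alpha=0.1)  # groups=3: contiguous blocks of 3
    model.fit(X, y)
    print(np.round(model.coef_.reshape(10, 3), 2))  # few groups are non-zero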

.. image:: https://readthedocs.org/projects/group-lasso/badge/?version=latest
:target: https://group-lasso.readthedocs.io/en/latest/?badge=latest

.. image:: https://img.shields.io/pypi/l/group-lasso.svg
:target: https://github.com/yngvem/group-lasso/blob/master/LICENSE

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
:target: https://github.com/python/black

.. image:: https://www.codefactor.io/repository/github/yngvem/group-lasso/badge
:target: https://www.codefactor.io/repository/github/yngvem/group-lasso
:alt: CodeFactor

The group lasso [1]_ regulariser is a well-known method to achieve structured
sparsity in machine learning and statistics. The idea is to create
non-overlapping groups of covariates, and recover regression weights in which
only a sparse set of these covariate groups have non-zero components.

There are several reasons why this might be a good idea. Say, for example,
that we have a set of sensors and each of these sensors generates five
measurements. We don't want to maintain an unnecessary number of sensors.
If we try normal LASSO regression, then we will get sparse components.
However, these sparse components might not correspond to a sparse set of
sensors, since each sensor generates five measurements. If we instead use
group LASSO with measurements grouped by the sensor that produced them,
then we will get a sparse set of sensors.
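A hedged sketch of this sensor scenario with this library, assuming
``GroupLasso`` takes a per-column array of integer group labels and exposes a
per-column ``sparsity_mask_`` after fitting (see the API reference):

.. code-block:: python

    import numpy as np
    from group_lasso import GroupLasso

    n_sensors, n_per_sensor = 20, 5
    rng = np.random.RandomState(0)
    X = rng.randn(500, n_sensors * n_per_sensor)
    # column i belongs to sensor i // 5
    groups = np.repeat(np.arange(n_sensors), n_per_sensor)
    w = np.zeros((n_sensors * n_per_sensor, 1))
    w[:10, 0] = rng.randn(10)  # only the first two sensors carry signal
    y = X @ w + 0.1 * rng.randn(500, 1)

    gl = GroupLasso(groups=groups, group_reg=0.05, l1_reg=0.0)
    gl.fit(X, y)
    # sensors whose measurement columns survived the group penalty
    print(np.unique(groups[gl.sparsity_mask_]))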

An extension of the group lasso regulariser is the sparse group lasso
regulariser [2]_, which imposes both group-wise sparsity and coefficient-wise
sparsity. This is done by combining the group lasso penalty with the
traditional lasso penalty. In this library, I have implemented an efficient
sparse group lasso solver that is fully scikit-learn API compliant.

------------------
About this project
------------------
This project is developed by Yngve Mardal Moe and released under an MIT
licence.

------------------
Installation guide
------------------
Group-lasso requires Python 3.5+, numpy and scikit-learn.
To install group-lasso via ``pip``, simply run the command::

    pip install group-lasso

Alternatively, you can manually pull this repository and run the
``setup.py`` file::

    git clone https://github.com/yngvem/group-lasso.git
    cd group-lasso
    python setup.py install

-------------
Documentation
-------------

You can read the full documentation on
`readthedocs <https://group-lasso.readthedocs.io/en/latest/maths.html>`_.

--------
Examples
--------

There are several examples that show usage of the library
`here <https://group-lasso.readthedocs.io/en/latest/auto_examples/index.html>`_.

------------
Further work
------------

1. Fully test with sparse arrays and make examples
2. Make it easier to work with categorical data
3. Poisson regression

----------------------
Implementation details
----------------------
The problem is solved using the FISTA optimiser [3]_ with a gradient-based
adaptive restarting scheme [4]_. No line search is currently implemented, but
I hope to look at that later.

Although fast, the FISTA optimiser does not achieve as low loss values as the
significantly slower second order interior point methods. This might, at
first glance, seem like a problem. However, it does recover the sparsity
patterns of the data, which can be used to train a new model with the given
subset of the features.

Also, even though the FISTA optimiser is not meant for stochastic
optimisation, it has in my experience not suffered a large drop in
performance when the mini-batch was large enough. I have therefore
implemented mini-batch optimisation using FISTA, and have thus been able to
fit models on data with ~500 columns and 10 000 000 rows on my moderately
priced laptop.

Finally, we note that since FISTA uses Nesterov acceleration, it is not a
descent algorithm. We can therefore not expect the loss to decrease
monotonically.
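To make this concrete, here is an illustrative stand-alone sketch (not the
library's internal solver) of FISTA with the gradient-based adaptive restart
of [4]_, applied to a plain lasso problem, where the proximal step is soft
thresholding:

.. code-block:: python

    import numpy as np

    def fista_lasso(X, y, reg, lipschitz, n_iter=200):
        """FISTA for ||Xw - y||^2 / (2n) + reg * ||w||_1."""
        n, d = X.shape
        w = np.zeros(d)   # current iterate
        z = w.copy()      # momentum (lookahead) point
        t = 1.0
        for _ in range(n_iter):
            grad = X.T @ (X @ z - y) / n
            u = z - grad / lipschitz
            # proximal gradient step: soft thresholding
            w_next = np.sign(u) * np.maximum(np.abs(u) - reg / lipschitz, 0)
            t_next = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
            z_next = w_next + (t - 1) / t_next * (w_next - w)
            # gradient-based restart: if the momentum direction points
            # uphill, reset the acceleration
            if grad @ (w_next - w) > 0:
                z_next, t_next = w_next.copy(), 1.0
            w, z, t = w_next, z_next, t_next
        return w

Here ``lipschitz`` should upper-bound the largest eigenvalue of
:math:`\mathbf{X}^T\mathbf{X}/n`, e.g. ``np.linalg.norm(X, 2) ** 2 / n``.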

----------
References
----------

.. [1] Yuan, M. and Lin, Y. (2006), Model selection and estimation in
regression with grouped variables. Journal of the Royal Statistical
Society: Series B (Statistical Methodology), 68: 49-67.
doi:10.1111/j.1467-9868.2005.00532.x
.. [2] Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2013).
A sparse-group lasso. Journal of Computational and Graphical
Statistics, 22(2), 231-245.
.. [3] Beck, A. and Teboulle, M. (2009), A Fast Iterative
Shrinkage-Thresholding Algorithm for Linear Inverse Problems.
SIAM Journal on Imaging Sciences 2009 2:1, 183-202.
doi:10.1137/080716542
.. [4] O’Donoghue, B. & Candès, E. (2015), Adaptive Restart for
Accelerated Gradient Schemes. Found Comput Math 15: 715.
doi:10.1007/s10208-013-9150-3
If you are looking for the ``group-lasso`` documentation, view the `old version of the README <./old_README.rst>`_.
19 changes: 19 additions & 0 deletions docs/Makefile
@@ -7,10 +7,29 @@ SPHINXBUILD = sphinx-build
SOURCEDIR = .
BUILDDIR = _build

GITHUB_PAGES_BRANCH = gh-pages
OUTPUTDIR = _build/html
STABLE_DOC_DIR = stable

ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .


# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

html-noplot:
$(SPHINXBUILD) -D plot_gallery=0 -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."

.PHONY: html
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."


.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
127 changes: 16 additions & 111 deletions docs/index.rst
@@ -1,122 +1,27 @@
Efficient Group Lasso in Python
===============================

This library provides efficient computation of sparse group lasso
regularised linear and logistic regression.
.. Caution::

What is group lasso?
--------------------
This package is no longer maintained and it is advised to use `skglm <https://github.com/scikit-learn-contrib/skglm>`_ instead.

It is often the case that we have a dataset where the covariates form natural
groups. These groups can represent biological function in gene expression
data or maybe sensor location in climate data. We then wish to find a sparse
subset of these covariate groups that describes the relationship in the data.
Let us look at an example to crystallise the usefulness of this further.
``skglm`` provides efficient and scikit-learn-compatible models with group structure such as `Group Lasso <https://contrib.scikit-learn.org/skglm/generated/skglm.GroupLasso.html#skglm.GroupLasso>`_ and Group Logistic Regression.
It extends the features of ``scikit-learn`` for Generalized Linear Models by implementing a wealth of missing models.
Check out the `documentation <https://contrib.scikit-learn.org/skglm/api.html>`_ for the full list of supported models, and the `Gallery of examples <https://contrib.scikit-learn.org/skglm/auto_examples/index.html>`_ to see its speed and efficiency when tackling large scale problems.

Say that we work as data scientists for a large Norwegian food supplier and
wish to make a prediction model for the amount of fruit that will be sold
based on weather data. We have weather data from cities across Norway and
need to know how the fruit should be distributed across different warehouses.
From each city, we have information about temperature, precipitation, wind
strength, wind direction and how cloudy it is. Multiplying the number of
cities by the number of covariates per city, we get 1500 different covariates
in total. It is unlikely that we need all these covariates in our model, so
we seek a sparse set of these to do our predictions with.

Let us now assume that the weather data API we use charges money by the
number of cities we query, not by the amount of information we get per
city. We therefore wish to create a regression model that predicts fruit
demand based on a sparse set of city observations. One way to achieve such
sparsity is through the framework of group lasso regularisation [1]_.

What is sparse group lasso?
---------------------------
Follow :ref:`this link <old_doc>` to access the documentation of the unmaintained package.

The sparse group lasso regulariser [2]_ is an extension of the group lasso
regulariser that also promotes parameter-wise sparsity. It is the combination
of the group lasso penalty and the normal lasso penalty. If we consider the
example above, then the sparse group lasso penalty will yield a sparse set
of groups and also a sparse set of covariates in each selected group. This
is useful if, for example, each city query has a base price that increases
with the number of measurements we request from that city.

A quick mathematical interlude
------------------------------

Let us now briefly describe the mathematical problem solved in group lasso
regularised machine learning problems. Originally, the group lasso
algorithm [1]_ was defined as regularised linear regression with the
following loss function:

.. math::
    \text{arg}\min_{\mathbf{\beta}_g \in \mathbb{R}^{d_g}}
    \frac{1}{n} || \sum_{g \in \mathcal{G}} \left[\mathbf{X}_g\mathbf{\beta}_g\right] - \mathbf{y} ||_2^2
    + \lambda_1 ||\mathbf{\beta}||_1
    + \lambda_2 \sum_{g \in \mathcal{G}} \sqrt{d_g}||\mathbf{\beta}_g||_2,
where :math:`\mathbf{X}_g \in \mathbb{R}^{n \times d_g}` is the data matrix
corresponding to the covariates in group :math:`g`, :math:`\mathbf{\beta}_g`
are the regression coefficients corresponding to group :math:`g`,
:math:`\mathbf{y} \in \mathbb{R}^n` is the regression target, :math:`n` is the
number of measurements, :math:`d_g` is the dimensionality of group :math:`g`,
:math:`\lambda_1` is the parameter-wise regularisation penalty,
:math:`\lambda_2` is the group-wise regularisation penalty and
:math:`\mathcal{G}` is the set of all groups.

Notice, in the equation above, that the 2-norm is *not* squared. A consequence
of this is that the regulariser has a "kink" at zero, which encourages
uninformative covariate groups to have zero-valued regression coefficients.
It has later become popular to use this methodology to regularise other
machine learning algorithms, such as logistic regression. The "only" thing
necessary to do this is to exchange the squared norm term,
:math:`|| \sum_{g \in \mathcal{G}} \left[\mathbf{X}_g\mathbf{\beta}_g\right] - \mathbf{y} ||_2^2`,
with a general loss term, :math:`L(\mathbf{\beta}; \mathbf{X}, \mathbf{y})`,
where :math:`\mathbf{\beta}` and :math:`\mathbf{X}` are the concatenations
of all group coefficients and group data matrices, respectively.
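As a small worked illustration (not library code), the penalty term from the
loss above can be written out directly in NumPy:

.. code-block:: python

    import numpy as np

    def sparse_group_lasso_penalty(beta, groups, lambda1, lambda2):
        """lambda1 * ||beta||_1 + lambda2 * sum_g sqrt(d_g) * ||beta_g||_2."""
        penalty = lambda1 * np.sum(np.abs(beta))
        for g in np.unique(groups):
            beta_g = beta[groups == g]
            penalty += lambda2 * np.sqrt(beta_g.size) * np.linalg.norm(beta_g)
        return penalty

    beta = np.array([0.0, 0.0, 0.0, 1.0, -2.0, 0.5])
    groups = np.array([0, 0, 0, 1, 1, 1])  # two groups of three coefficients
    print(sparse_group_lasso_penalty(beta, groups, lambda1=0.1, lambda2=0.1))

The all-zero first group contributes nothing to the penalty; the unsquared
2-norm's kink at zero is what makes such exact zeros attainable during
optimisation.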


API design
----------

The ``group-lasso`` python library is modelled after the ``scikit-learn`` API
and should be fully compliant with the ``scikit-learn`` ecosystem.
Consequently, the ``group-lasso`` library depends on ``numpy``, ``scipy``
and ``scikit-learn``.

Currently, the main supported algorithm is group lasso regularised linear
and multiple regression, which is available in the ``group_lasso.GroupLasso``
class. There is also an experimental class with group lasso regularised
logistic regression, available as ``group_lasso.LogisticGroupLasso``.
Currently, this class only supports binary classification problems through a
sigmoidal transformation, but I am working on a multiclass classification
algorithm with the softmax transformation.
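A hedged sketch of the experimental classifier, assuming
``LogisticGroupLasso`` mirrors ``GroupLasso``'s constructor arguments:

.. code-block:: python

    import numpy as np
    from group_lasso import LogisticGroupLasso

    rng = np.random.RandomState(0)
    X = rng.randn(200, 12)
    groups = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]  # four groups of three
    y = (X[:, 0] - X[:, 1] > 0).astype(int)  # binary target

    clf = LogisticGroupLasso(groups=groups, group_reg=0.05, l1_reg=0.0)
    clf.fit(X, y)
    print(clf.score(X, y))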

All classes in this library are implemented as both ``scikit-learn``
transformers and regressors or classifiers (depending on their use case).
The reason for this is that, to use lasso-based models for variable
selection, the regularisation coefficient should be quite high, resulting
in sub-par performance on the actual task of interest. It is therefore
common to first use a lasso-like algorithm to select the relevant features
before using another algorithm (say, ridge regression) for the task at
hand. The ``transform`` method of ``group_lasso.GroupLasso`` therefore
removes the columns of the input dataset corresponding to zero-valued
coefficients.
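A sketch of this select-then-refit pattern (assuming ``GroupLasso`` inherits
``fit_transform`` from the ``scikit-learn`` transformer API):

.. code-block:: python

    import numpy as np
    from sklearn.linear_model import Ridge
    from group_lasso import GroupLasso

    rng = np.random.RandomState(0)
    X = rng.randn(300, 20)
    groups = np.repeat(np.arange(5), 4)  # five groups of four covariates
    y = X[:, :4] @ rng.randn(4, 1) + 0.1 * rng.randn(300, 1)

    # strongly regularised model used purely for variable selection
    selector = GroupLasso(groups=groups, group_reg=0.1, l1_reg=0.0)
    X_selected = selector.fit_transform(X, y)  # zero-coefficient columns dropped
    ridge = Ridge().fit(X_selected, y)  # refit on the surviving columns
    print(X.shape, "->", X_selected.shape)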

.. It is mandatory to keep the toctree here; although it doesn't show up in the page, without it the pages don't show up in the sidebar.
.. toctree::
:maxdepth: 2
:caption: Contents:

installation
auto_examples/index
maths
api_reference


References
----------
.. [1] Yuan M, Lin Y. Model selection and estimation in regression with
grouped variables. Journal of the Royal Statistical Society: Series B
(Statistical Methodology). 2006 Feb;68(1):49-67.
.. [2] Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2013).
A sparse-group lasso. Journal of Computational and Graphical
Statistics, 22(2), 231-245.
:maxdepth: 2
:hidden:
:includehidden:
:caption: Contents:

installation
auto_examples/index
maths
api_reference
