Add all reference maps (#17)
* Add logo

* Add blog link to README

* Add reference map for base

* Add all reference maps

* Use observed values for non-observed value match for group_data instead of NAs, which might change the dtype.

* 0.2.1
pwwang authored Jun 23, 2021
1 parent ba7cf58 commit 35156bb
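
The dtype concern named in the commit message can be illustrated with plain pandas. This is a minimal sketch, not code from the commit: it assumes `numpy.nan` as a stand-in for datar's `NA`, and the variable names are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Integer grouping values keep an integer dtype.
observed = pd.Series([1, 2, 2])
print(observed.dtype)  # int64

# Padding the same values with a missing value (assumed stand-in for NA)
# forces pandas to upcast the column to float.
padded = pd.concat([observed, pd.Series([np.nan])], ignore_index=True)
print(padded.dtype)  # float64
```

Filling the gap with values that were actually observed avoids this upcast, which appears to be the motivation for the change in `datar/core/grouped.py`.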
Showing 19 changed files with 763 additions and 198 deletions.
1 change: 1 addition & 0 deletions .github/workflows/docs.yml
@@ -31,6 +31,7 @@ jobs:
cp ../README.md index.md
cp ../example.png example.png
cp ../example2.png example2.png
cp ../logo.png logo.png
cd ..
mkdocs build
if: success()
12 changes: 8 additions & 4 deletions README.md
@@ -2,12 +2,15 @@

Port of [dplyr][2] and other related R packages in python, using [pipda][3].

Unlike other similar packages in python that just mimic the piping sign, `datar` follows the API designs from the original packages as much as possible. So that minimal effort is needed for those who are familar with those R packages to transition to python.

<!-- badges -->
[![Pypi][6]][7] [![Github][8]][9] ![Building][10] [![Docs and API][11]][5] [![Codacy][12]][13] [![Codacy coverage][14]][13]

[Documentation][5] | [Reference Maps][15] | [Notebook Examples][16] | [API][17]
[Documentation][5] | [Reference Maps][15] | [Notebook Examples][16] | [API][17] | [Blog][18]

<img width="30%" style="margin: 10px 10px 10px 30px" align="right" src="logo.png">

Unlike other similar packages in python that just mimic the piping sign, `datar` follows the API designs from the original packages as much as possible. So that minimal effort is needed for those who are familar with those R packages to transition to python.


## Installtion

@@ -69,7 +72,7 @@ df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter(f.z==1)

```python
# works with plotnine
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar.base import sin, pi
from plotnine import ggplot, aes, geom_line, theme_classic
@@ -115,3 +118,4 @@ iris >> pull(f.Sepal_Length) >> dist_plot()
[15]: https://pwwang.github.io/datar/reference-maps/ALL/
[16]: https://pwwang.github.io/datar/notebooks/across/
[17]: https://pwwang.github.io/datar/api/datar/
[18]: https://pwwang.github.io/datar-blog
10 changes: 6 additions & 4 deletions README.rst
@@ -7,8 +7,6 @@ datar

Port of `dplyr <https://dplyr.tidyverse.org/index.html>`_ and other related R packages in python, using `pipda <https://github.com/pwwang/pipda>`_.

Unlike other similar packages in python that just mimic the piping sign, ``datar`` follows the API designs from the original packages as much as possible. So that minimal effort is needed for those who are familar with those R packages to transition to python.

:raw-html-m2r:`<!-- badges -->`
`
.. image:: https://img.shields.io/pypi/v/datar?style=flat-square
@@ -36,7 +34,11 @@ Unlike other similar packages in python that just mimic the piping sign, ``datar
:alt: Codacy coverage
<https://app.codacy.com/gh/pwwang/datar>`_

`Documentation <https://pwwang.github.io/datar/>`_ | `Reference Maps <https://pwwang.github.io/datar/reference-maps/ALL/>`_ | `Notebook Examples <https://pwwang.github.io/datar/notebooks/across/>`_ | `API <https://pwwang.github.io/datar/api/datar/>`_
`Documentation <https://pwwang.github.io/datar/>`_ | `Reference Maps <https://pwwang.github.io/datar/reference-maps/ALL/>`_ | `Notebook Examples <https://pwwang.github.io/datar/notebooks/across/>`_ | `API <https://pwwang.github.io/datar/api/datar/>`_ | `Blog <https://pwwang.github.io/datar-blog>`_

:raw-html-m2r:`<img width="30%" style="margin: 10px 10px 10px 30px" align="right" src="logo.png">`

Unlike other similar packages in python that just mimic the piping sign, ``datar`` follows the API designs from the original packages as much as possible. So that minimal effort is needed for those who are familar with those R packages to transition to python.

Installtion
-----------
@@ -101,7 +103,7 @@ Example usage
.. code-block:: python
# works with plotnine
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar.base import sin, pi
from plotnine import ggplot, aes, geom_line, theme_classic
2 changes: 1 addition & 1 deletion datar/__init__.py
@@ -4,4 +4,4 @@
from .core import frame_format_patch as _
from .core.defaults import f

__version__ = '0.2.0'
__version__ = '0.2.1'
48 changes: 31 additions & 17 deletions datar/core/grouped.py
@@ -93,7 +93,7 @@ def _compute_single_var_groups(self):

def _compute_multiple_var_groups(self):
"""Compute groups for multiple vars"""
from ..base import NA
from ..base import unique

dtypes = {}
groupings = []
@@ -120,21 +120,35 @@ def _compute_multiple_var_groups(self):
# pandas not including unused categories for multiple variables
# even with observed=False
# #
# This is a simplied version to include those unobserved values
# Find a better way to implement the dplyr way?
if not self.attrs['_group_drop']:
unobserved = [
self[gvar].values.categories.difference(self[gvar])
if is_categorical(dtype)
else []
for gvar, dtype in dtypes.items()
]
maxlen = max((len(unobs) for unobs in unobserved))
if maxlen > 0:
unobserved = [
[NA] if len(unobs) == 0 else unobs
for unobs in unobserved
]
# unobserved = [
# self[gvar].values.categories.difference(self[gvar])
# if is_categorical(dtype)
# else []
# for gvar, dtype in dtypes.items()
# ]
# maxlen = max((len(unobs) for unobs in unobserved))
# if maxlen > 0:
# unobserved = [
# ## Simply adding NAs would change dtype
# [NA] if len(unobs) == 0 else unobs
# for unobs in unobserved
# ]
# for row in product(*unobserved):
# groups[row] = []
unobserved = []
insert_unobs = False
for gvar, dtype in dtypes.items():
if is_categorical(dtype):
unobs = self[gvar].values.categories.difference(self[gvar])
if len(unobs) > 0:
unobserved.append(unobs)
insert_unobs = True
else:
unobserved.append(unique(self[gvar]))
else:
unobserved.append(unique(self[gvar]))
if insert_unobs:
for row in product(*unobserved):
groups[row] = []
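
The replacement loop in the hunk above can be read as follows: for each grouping variable, take its unobserved categories if it is categorical and has any, otherwise take its observed unique values, then add the Cartesian product as empty groups. Below is a standalone sketch of that idea using only pandas and `itertools`. The helper name `candidate_group_keys` and the sample frame are hypothetical, and it uses `Series.unique()` where the commit uses datar's own `unique`.

```python
from itertools import product

import pandas as pd


def candidate_group_keys(df, gvars):
    """Hypothetical helper: keys of the empty groups to add for unobserved
    categories, filling the other variables with observed values, not NA."""
    levels = []
    insert_unobs = False
    for gvar in gvars:
        col = df[gvar]
        unobs = []
        if isinstance(col.dtype, pd.CategoricalDtype):
            seen = set(col.dropna())
            unobs = [cat for cat in col.cat.categories if cat not in seen]
        if unobs:
            # Categorical variable with at least one unused category
            levels.append(unobs)
            insert_unobs = True
        else:
            # Fall back to observed values so the column dtype is preserved
            levels.append(col.unique().tolist())
    return list(product(*levels)) if insert_unobs else []


df = pd.DataFrame({
    "g": pd.Categorical(["a", "a", "b"], categories=["a", "b", "c"]),
    "x": [1, 2, 2],
})
keys = candidate_group_keys(df, ["g", "x"])
print(keys)  # [('c', 1), ('c', 2)]
```

Because `x` is filled with its observed integers, the resulting group keys stay integer-typed, whereas pairing `'c'` with a missing value would not.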

@@ -285,8 +299,8 @@ def _groups_to_group_data(
out[gvar] = na_if_safe(out[gvar], dtype=dtype)
else:
try:
same_dtype = out[gvar].dtype != dtype
except TypeError:
same_dtype = out[gvar].dtype == dtype
except TypeError: # pragma: no cover
# Cannot interpret 'CategoricalDtype(categories=[1, 2],
# ordered=False)' as a data type
same_dtype = False
4 changes: 4 additions & 0 deletions docs/CHANGELOG.md
@@ -1,3 +1,7 @@
## 0.2.1
- Use observed values for non-observsed value match for group_data instead of NAs, which might change the dtype.
- Fix tibble recycling values too early

## 0.2.0
Added:
- Add `base.which`, `base.bessel`, `base.special`, `base.trig_hb` and `base.string` modules