Refactor tbl data #9

machow · 2023-10-11T18:59:12Z

This PR works toward #7 by:

Copying in https://github.com/machow/databackend, to allow instance checks without importing pandas or polars
Removing the TblData class, in favor of generic functions (e.g. _tbl_data.n_rows)
Adding tests for the generic functions against pandas and polars
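
For anyone skimming, here is a rough sketch of how the first two points can fit together. It is not the actual databackend or `_tbl_data` code: `AbstractBackend`, `PdDataFrame`, `PlDataFrame`, and this `n_rows` are illustrative names, and only the standard library is assumed. The stand-in classes look up `sys.modules` instead of importing pandas or polars, and `functools.singledispatch` picks the implementation.

```python
import sys
from abc import ABC
from functools import singledispatch


class AbstractBackend(ABC):
    """Stand-in for classes that live in optional, possibly unimported modules."""

    # (module name, class name) pairs this stand-in represents
    _backends = []

    @classmethod
    def __subclasshook__(cls, subclass):
        for mod_name, cls_name in cls._backends:
            # Look the module up instead of importing it: if it was never
            # imported, no object can be an instance of one of its classes.
            mod = sys.modules.get(mod_name)
            if mod is None:
                continue
            target = getattr(mod, cls_name, None)
            if isinstance(target, type) and issubclass(subclass, target):
                return True
        return NotImplemented


class PdDataFrame(AbstractBackend):
    _backends = [("pandas", "DataFrame")]


class PlDataFrame(AbstractBackend):
    _backends = [("polars", "DataFrame")]


@singledispatch
def n_rows(data):
    """Return the number of rows of a table-like object."""
    raise NotImplementedError(f"Unsupported type: {type(data)}")


@n_rows.register(PdDataFrame)
def _(data):
    return data.shape[0]  # pandas: (rows, columns)


@n_rows.register(PlDataFrame)
def _(data):
    return data.height  # polars: row count


# A list is never a DataFrame, and the check itself imports nothing.
print(isinstance([1, 2], PdDataFrame))  # False
```

With pandas installed and imported by the caller, `n_rows(pd.DataFrame(...))` would go to the pandas branch. Anything unregistered falls through to the `NotImplementedError` default.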

I noticed pandas DataFrames are created in some places (e.g. Spanners). It might be useful for us to keep a pandas DataFrame as the frame used within some of these classes (at least for now?), just to make life simpler while getting all the different features out.

Happy to make any changes, discuss anything!

machow · 2023-10-12T16:46:16Z

I hope it's okay I lumped in a doc example fix, and added a test of fmt_number!

machow · 2023-10-12T16:48:58Z

@rich-iannone I think this is ready--happy to walk through while pairing tomorrow! I think it might be easier to merge this in (happy to add any tweaks, clarify anything), then implement / add tests for some of the things in examples-qmd. I added tests of the dataframe generic functions, and we should know whether we need to tweak them once we test some of the functions that call them.

machow · 2023-10-12T16:49:59Z

.github/workflows/ci-tests.yaml

@@ -29,7 +29,7 @@ jobs:
          if [ -f requirements-dev.txt ]; then pip install -r requirements-dev.txt; fi
      - name: Install
        run: |
-          make install


I wasn't sure what to add to requirements-dev.txt, so I switched to installing the package with dev dependencies. Happy to tweak!

machow · 2023-10-12T16:51:09Z

tests/test_formats.py

+    df = pd.DataFrame({"x": [1.234, 2.345], "y": [3.456, 4.567]})
+    gt = GT(df).fmt_number(columns="x", decimals=2)
+
+    # TODO: is 2.35 below the intended result?


I wasn't sure if gt deliberately rounds, so that 2.345 rounded is 2.35

I think if it doesn't round through the formatter, we should deliberately pre-round before formatting (then we could expose the rounding method too).
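
One way to read the pre-rounding idea, as a sketch only: `pre_round` is a hypothetical helper, not something in gt, and it assumes the rounding method would be exposed as a `decimal` rounding mode. Plain float formatting rounds whatever binary value is stored, so going through `Decimal` makes the tie-breaking rule explicit.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def pre_round(value, decimals=2, rounding=ROUND_HALF_UP):
    """Round `value` to `decimals` places with an explicit rounding mode.

    Hypothetical helper for illustration; returns the rounded value as text.
    """
    # str() gives the shortest decimal repr of the float (e.g. "2.345"),
    # so rounding acts on the number as written, not its binary expansion.
    exact = Decimal(str(value))
    step = Decimal(1).scaleb(-decimals)  # e.g. 0.01 for decimals=2
    return format(exact.quantize(step, rounding=rounding), "f")


print(pre_round(2.345))                            # 2.35
print(pre_round(2.345, rounding=ROUND_HALF_EVEN))  # 2.34
```

With half-up, the test value 2.345 comes out as 2.35. With half-even (banker's rounding) it comes out as 2.34, which is why exposing the method matters.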

rich-iannone · 2023-10-12T16:51:43Z

This looks great! Just going to review, approve, and you can merge in whenever!

rich-iannone

LGTM!

machow · 2023-10-12T16:52:47Z

gt/_body.py

+                # TODO: I think that this is very inefficient with polars, so
+                # we could either accumulate results and set them per column, or
+                # could always use a pandas DataFrame inside Body?
+                _set_cell(self.body, row, col, result)


Setting individual cells is inefficient in polars (I think each set copies the whole column), and it looks like they're not planning on changing that (since they're more analytic / column oriented).

Shouldn't be an issue for prototyping, but wanted to flag
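
A sketch of the "accumulate results and set them per column" option from the TODO, using a plain dict of lists as a stand-in for a column-oriented frame. `apply_cell_updates` is a made-up name, and nothing here touches the real `Body` or `_set_cell`. The point is one replacement per touched column rather than one per cell; with polars, that single swap would presumably go through something like `DataFrame.with_columns`.

```python
from collections import defaultdict


def apply_cell_updates(columns, updates):
    """Apply (row, col, value) updates with one replacement per column.

    `columns` maps column name -> list of values. The input is left
    untouched and a new mapping is returned.
    """
    # Group the pending cell updates by column.
    by_col = defaultdict(dict)
    for row, col, value in updates:
        by_col[col][row] = value

    result = dict(columns)
    for col, changes in by_col.items():
        # Build the new column once, then swap it in with a single assignment.
        result[col] = [changes.get(i, old) for i, old in enumerate(columns[col])]
    return result


cols = {"x": [1.234, 2.345], "y": [3.456, 4.567]}
formatted = apply_cell_updates(cols, [(0, "x", "1.23"), (1, "x", "2.35")])
print(formatted)  # {'x': ['1.23', '2.35'], 'y': [3.456, 4.567]}
```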

machow · 2023-10-12T16:54:02Z

Ahhh yeeeeaaahhhhhh!!!

machow added 9 commits October 11, 2023 13:43

refactor: replace TblData methods with generic functions

fbb8f6b

refactor: remove TblData class

2b023e4

refactor: add missing databackend file

7956800

refactor: support instance checks on DataFrameLike

f8f9280

dev: add polars to setup.cfg

3b05835

ci: install dev dependencies, no packaging beforehand

7f99732

dev: add jupyter to dev dependencies

6380524

Merge branch 'main' into refactor-tbl-data

bf0fb2e

tests: add basic fmt_number test

2da4a55

github-actions bot deployed to pr-9 October 12, 2023 16:38

machow marked this pull request as ready for review October 12, 2023 16:46

rich-iannone self-requested a review October 12, 2023 16:52

rich-iannone approved these changes Oct 12, 2023

machow merged commit 4c8677d into main Oct 12, 2023
4 checks passed

rich-iannone deleted the refactor-tbl-data branch October 12, 2023 16:54
