Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-434] [Feature] PostGres partial index #4970

Closed
1 task done
kmedara-wr opened this issue Mar 28, 2022 · 10 comments
Closed
1 task done

[CT-434] [Feature] PostGres partial index #4970

kmedara-wr opened this issue Mar 28, 2022 · 10 comments
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors! postgres stale Issues that have gone stale Team:Adapters Issues designated for the adapter area of the code

Comments

@kmedara-wr
Copy link

kmedara-wr commented Mar 28, 2022

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

I understand since 0.20.0 you there is native support for indexes. but I could not find anything for partial indexes..something like

{{config(
    indexes=[
        {'columns': ['status'], 'type': 'partial', 'condition': 'status = 'Active'},
      ]
)
}}

or

{{config(
    indexes=[
        {'columns': ['status'], 'type': 'partial', 'operator': '=', 'value': 'Active'},
      ]
)
}}

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

https://docs.getdbt.com/reference/resource-configs/postgres-configs

@kmedara-wr kmedara-wr added enhancement New feature or request triage labels Mar 28, 2022
@github-actions github-actions bot changed the title [Feature] PostGres partial index [CT-434] [Feature] PostGres partial index Mar 28, 2022
@jtcohen6 jtcohen6 added postgres Team:Adapters Issues designated for the adapter area of the code labels Mar 28, 2022
@jtcohen6 jtcohen6 added good_first_issue Straightforward + self-contained changes, good for new contributors! and removed triage labels Apr 27, 2022
@jtcohen6
Copy link
Contributor

@kmedara-wr Sorry for the delay getting back to you!

I'm not a Postgres expert, so I was initially a bit skeptical—why not just a new table (filtered projection) that can have a simpler "full" index?—but the docs set me straight:

Partial indexes are a specialized feature, but there are several situations in which they are useful.

One major reason for using a partial index is to avoid indexing common values. Since a query searching for a common value (one that accounts for more than a few percent of all the table rows) will not use the index anyway, there is no point in keeping those rows in the index at all. This reduces the size of the index, which will speed up those queries that do use the index.

In terms of the syntax you're proposing: The condition example feels fine to me. For consistency with other dbt features, we might call this where or filter. (I lean where personally.) The only good reason I can imagine for the more-structured representation is that dbt could verify the existence and data types of the columns being filtered on. That's a worthwhile undertaking for metrics, but it feels like overkill here.

Implementation details

Add a new attribute (whichever it's called — condition, where, or filter) to the PostgresIndexConfig dataclass, with type Optional[str] = None

@dataclass
class PostgresIndexConfig(dbtClassMixin):
columns: List[str]
unique: bool = False
type: Optional[str] = None

In the docs, it looks like there's no partial "type," in the sense of passing through the keyword using (index docs). It would be possible to add validation ensuring that type = 'partial' if the new config is set, or the two attributes (type + where) could be unrelated to each other.

Then, update the postgres__get_create_index_sql macro such that, if the new attribute is set, dbt templates out a where statement accordingly:

{% macro postgres__get_create_index_sql(relation, index_dict) -%}
{%- set index_config = adapter.parse_index(index_dict) -%}
{%- set comma_separated_columns = ", ".join(index_config.columns) -%}
{%- set index_name = index_config.render(relation) -%}
create {% if index_config.unique -%}
unique
{%- endif %} index if not exists
"{{ index_name }}"
on {{ relation }} {% if index_config.type -%}
using {{ index_config.type }}
{%- endif %}
({{ comma_separated_columns }});
{%- endmacro %}

Finally, add a new test case to 065_postgres_index_tests.

I think all of those steps are fairly straightforward and self-contained, so I'm going to mark this a good first issue. Is it something you might be interested in contributing? :)

@kmedara-wr
Copy link
Author

@jtcohen6 yea! sounds like a fun knowledge gainer. thank you for the details

@alswang18
Copy link
Contributor

@kmedara-wr are you tackling this? I'd love to give it a shot

@kmedara-wr
Copy link
Author

kmedara-wr commented Apr 28, 2022

@kmedara-wr are you tackling this? I'd love to give it a shot

@alswang18 Yes this is something I'm working on, if I hit any blockers I'll certainly reach out though

@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Oct 26, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Nov 2, 2022

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

@github-actions github-actions bot closed this as completed Nov 2, 2022
@kinghuang
Copy link

I'm quite interested in being able to declare PostgreSQL partial indexes in dbt. @kmedara-wr, did you get a chance to implement a change?

@williamkaper
Copy link

I'm quite interested in being able to declare PostgreSQL partial indexes in dbt. @kmedara-wr, did you get a chance to implement a change?

I'm interested in this change as well @kmedara-wr @alswang18

@razze76
Copy link

razze76 commented Oct 28, 2024

Any new on this? This would be very good to have

@christhekeele
Copy link

Also interested in this, it's essential for picking out common cases that represent a small fraction of records a large table (that you do not want to turn into its own intermediate table).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors! postgres stale Issues that have gone stale Team:Adapters Issues designated for the adapter area of the code
Projects
None yet
Development

No branches or pull requests

7 participants