Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] Handle tags case insensitive #334

Open
NoRulesJustFeels opened this issue Oct 16, 2023 · 12 comments
Open

[feat] Handle tags case insensitive #334

NoRulesJustFeels opened this issue Oct 16, 2023 · 12 comments

Comments

@NoRulesJustFeels
Copy link
Contributor

Describe the problem

Artists can specify tags in any case, for example: Fear4Tez or fear4tez
However, clicking on the tag will result in different results which might not be the intention of the artist. For example, these give different results, but probably shouldn't: https://teia.art/tags/Fear4Tez vs https://teia.art/tags/fear4tez

Describe the solution you'd like

Handle tags case-insensitive

Alternatives considered

No response

Additional context

No response

@melMass
Copy link
Member

melMass commented Oct 16, 2023

I'll look into it when reviewing jag's PR tomorrow, I'm just not sure if its related to NGINX too Cc @Zir0h potentially, afaik React Router itself is but that's probably tweakable with a prop

@Zir0h
Copy link
Contributor

Zir0h commented Oct 16, 2023

Hi! I had a look yesterday, it's related to hasura/postgres. The use of _eq in the graphql query makes it case sensitive. With _ilike it would become case insensitive.

@Zir0h
Copy link
Contributor

Zir0h commented Oct 21, 2023

Making the graphql query case insensitive with _ilike does make it A LOT slower. 2seconds vs 200ms

@NoRulesJustFeels
Copy link
Contributor Author

I wonder if anything can be done on the underlying database to improve the performance.

@Zir0h
Copy link
Contributor

Zir0h commented Oct 21, 2023

I wonder if anything can be done on the underlying database to improve the performance.

                          Table "public.tags"
   Column    |         Type          | Collation | Nullable | Default 
-------------+-----------------------+-----------+----------+---------
 fa2_address | character varying(36) |           | not null | 
 token_id    | text                  |           | not null | 
 tag         | text                  |           | not null | 

tag being of type "text" is probably not helping at all 😅

@Zir0h
Copy link
Contributor

Zir0h commented Oct 21, 2023

People who are using #Fear4Tez or #fear4tez are completely missing from all tag feeds too 😅

teztok=# SELECT count(*) FROM tags WHERE to_tsvector(tag) @@ to_tsquery('Fear4Tez'); SELECT count(*) FROM tags WHERE tag ILIKE 'Fear4Tez';
 count 
-------
   917
(1 row)

 count 
-------
   754
(1 row)

@Zir0h
Copy link
Contributor

Zir0h commented Oct 21, 2023

CREATE INDEX idx_gin_tags ON tags USING GIN ((tag) gin_trgm_ops);
makes the _ilike fast again at around 750ms, but that still excludes the #-prefixed tags

"tag": "tezquakeaid"
Response Time 808 ms
Response Size 474236 bytes

"tag": "TezQuakeAid"
Response Time 396 ms
Response Size 474236 bytes

"tag": "#tezquakeaid"
Response Time 384 ms
Response Size 84308 bytes

"tag": "#TezQuakeAid"
Response Time 368 ms
Response Size 84308 bytes

It might be easiest to just drop the # character before we launch the query 😆

@Zir0h
Copy link
Contributor

Zir0h commented Dec 2, 2023

https://github.com/toluaina/pgsync looks pretty cool if we'd put an elasticsearch instance for the searching / tags etc

@Zir0h
Copy link
Contributor

Zir0h commented Dec 8, 2023

https://github.com/teia-community/teia-ui/blob/main/src/components/tags/index.module.scss#L26 is probably also confusing people, tags get lowercased on the frontend but the link can be mixed case.

@melMass
Copy link
Member

melMass commented Dec 8, 2023

Are there ways to optimize ilike?
IMO it would be the saner path -> case incentive everywhere (I'd vote for keeping it lower on the front end if we end up doing that, which now that I think about it might have been the reason we did in the first place)

@Zir0h
Copy link
Contributor

Zir0h commented Dec 8, 2023

Are there ways to optimize ilike? IMO it would be the saner path -> case incentive everywhere (I'd vote for keeping it lower on the front end if we end up doing that, which now that I think about it might have been the reason we did in the first place)

there's the CREATE INDEX idx_gin_tags ON tags USING GIN ((tag) gin_trgm_ops); index which +- restores the previous performance at the cost of a little bit bigger database

and then there's the elasticsearch option, hasura seems to integrate with that pretty well, but I haven't tested the pgsync thingy

that option would add a lot of overhead though, elasticsearch + sync containers (but it could be useful for site search too)

@Zir0h
Copy link
Contributor

Zir0h commented Dec 10, 2023

we can also use citext as type, but I get an error on the UI:

"message": "variable 'tag' is declared as 'String!', but used where 'citext' is expected"

https://www.postgresql.org/docs/current/citext.html#CITEXT-HOW-TO-USE-IT

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants