one2all/new2all output counts for common k-mers between multiple db-samples #12

Open
mihkelvaher opened this issue Jun 28, 2020 · 3 comments

mihkelvaher commented Jun 28, 2020

Hi!

The one2all/new2all modes report the intersection sizes with a new sample like this:
s1: 100/150
s2: 200/300
s3: 50/1000
...

Is there any way to get more detailed output showing the k-mers common to several database samples? For example, given these counts, I cannot tell whether the 50 k-mers found in s3 are also present in s1 or s2.
The preferred output would be something like this:
s1: 50/50
s2: 200/300
s3: 0/900
s1 AND s3: 50/100
...

This can be achieved by creating all of the intersections beforehand, but looking at the kmer-db database structure, I was hoping to skip that step.
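
To make the requested output concrete, here is a minimal sketch in plain Python. It does not use kmer-db or its database format; the `db` dictionary of k-mer sets, the `exclusive_counts` function and the toy data are all assumptions made for illustration. Each database k-mer is labelled with the exact combination of samples that contain it, and query hits are then counted per combination.

```python
from collections import Counter


def exclusive_counts(db, query):
    """Count query hits per exact combination of database samples.

    db    -- dict mapping sample name to a set of k-mers (hypothetical layout)
    query -- set of k-mers from the new sample
    Returns {frozenset(sample names): (hits, total)}.
    """
    # For every k-mer in the database, record which samples contain it.
    membership = {}
    for name, kmers in db.items():
        for kmer in kmers:
            membership.setdefault(kmer, set()).add(name)

    # total: database k-mers found in exactly this combination of samples.
    totals = Counter(frozenset(names) for names in membership.values())
    # hits: query k-mers that fall into that combination.
    hits = Counter(frozenset(membership[k]) for k in query if k in membership)
    return {combo: (hits[combo], total) for combo, total in totals.items()}


# Toy data: integers stand in for k-mers.
db = {
    "s1": set(range(0, 150)),
    "s2": set(range(2000, 2300)),
    "s3": set(range(50, 1050)),
}
query = set(range(0, 100)) | set(range(2000, 2200)) | set(range(5000, 5010))

result = exclusive_counts(db, query)
for combo in sorted(result, key=lambda c: (len(c), sorted(c))):
    hits, total = result[combo]
    print(f"{' AND '.join(sorted(combo))}: {hits}/{total}")
# s1: 50/50
# s2: 200/300
# s3: 0/900
# s1 AND s3: 50/100
```

Only combinations that actually occur in the database get a row, so the output cannot have more rows than there are distinct k-mers.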

Regards,
Mihkel

agudys self-assigned this Jun 30, 2020

agudys commented Jun 30, 2020

Dear Mihkel,

We can think about adding the functionality you mentioned to kmer-db. However, the number of all possible intersections grows exponentially with the number of queries. Wouldn't it be better to let the user state explicitly which intersections they are interested in?
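
A rough sketch of both points, in plain Python rather than kmer-db code: with n samples there are 2^n - 1 non-empty combinations, while an explicit list keeps the work proportional to what was asked for. The names `intersection_counts`, `db`, `requested` and the toy data are hypothetical. Here an intersection means the k-mers present in all listed samples, whether or not they also occur elsewhere.

```python
def possible_combinations(n):
    """Number of non-empty combinations of n samples."""
    return 2 ** n - 1


def intersection_counts(db, query, combos):
    """For each requested combination, return (hits, total).

    db     -- dict mapping sample name to a set of k-mers (hypothetical layout)
    query  -- set of k-mers from the new sample
    combos -- iterable of non-empty tuples of sample names
    """
    out = {}
    for combo in combos:
        # k-mers present in every sample of the combination
        shared = set.intersection(*(db[name] for name in combo))
        out[tuple(combo)] = (len(shared & query), len(shared))
    return out


# Toy data: integers stand in for k-mers.
db = {
    "s1": set(range(0, 150)),
    "s2": set(range(2000, 2300)),
    "s3": set(range(50, 1050)),
}
query = set(range(0, 100)) | set(range(2000, 2200))

requested = [("s1", "s3"), ("s1", "s2")]
explicit = intersection_counts(db, query, requested)
for combo, (hits, total) in explicit.items():
    print(f"{' AND '.join(combo)}: {hits}/{total}")
# s1 AND s3: 50/100
# s1 AND s2: 0/0

print(possible_combinations(3), possible_combinations(20))
# 7 1048575
```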

Regards,
Adam

@mihkelvaher

Hi!

The number of intersections does indeed grow fast.
Could the reported intersections be limited by the number of k-mers shared by the references? For example, if s1, s2 and s3 share fewer than 1000 k-mers, that intersection would not be shown.
Also, showing only the intersections where something was actually found during the search would reduce the output size significantly.
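
As a sketch of these two filters (plain Python, illustrative numbers, nothing kmer-db specific): given per-combination counts as `(hits, total)` pairs, drop multi-sample combinations whose shared k-mer count is below a threshold, and drop rows with no hits. The function name `filter_combos`, its parameters and the choice to exempt single samples from the threshold are assumptions.

```python
def filter_combos(counts, min_shared=1000, require_hits=True):
    """Keep only the rows worth reporting.

    counts       -- dict: frozenset(sample names) -> (hits, total)
    min_shared   -- minimum number of shared k-mers for a multi-sample row
    require_hits -- if True, drop rows where the query matched nothing
    """
    kept = {}
    for combo, (hits, total) in counts.items():
        # The threshold only applies to intersections of two or more samples.
        if len(combo) > 1 and total < min_shared:
            continue
        if require_hits and hits == 0:
            continue
        kept[combo] = (hits, total)
    return kept


# Made-up counts for illustration.
counts = {
    frozenset({"s1"}): (50, 50),
    frozenset({"s2"}): (200, 300),
    frozenset({"s3"}): (0, 900),
    frozenset({"s1", "s3"}): (50, 100),
    frozenset({"s2", "s3"}): (400, 5000),
}

kept = filter_combos(counts, min_shared=1000)
for combo in sorted(kept, key=lambda c: (len(c), sorted(c))):
    hits, total = kept[combo]
    print(f"{' AND '.join(sorted(combo))}: {hits}/{total}")
# s1: 50/50
# s2: 200/300
# s2 AND s3: 400/5000
```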

blahah commented Nov 21, 2020

Just run all2all as well :)
