Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change GetUnique() return type from char to init8_t #1269

Closed
wants to merge 2 commits into from

Conversation

martin-g
Copy link
Contributor

@martin-g martin-g commented Apr 4, 2024

char is unsigned on aarch64

Related to: #1062 (comment)

With this change I was able to fully build SPAdes on Linux ARM64:

mgrigorov in 🌐 euler-arm-22 in spades on ξ‚  next 
❯ file build/bin/*
build/bin/spades-bwa:                  ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=562164721589d0b609cd56ac36c3186706248059, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-convert-bin-to-fasta: ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=74eda85d3b74b60020a4cd6657de655f3396fcde, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-core:                 ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=b5e24e4001f549108cab10b9745debf2da913be2, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-corrector-core:       ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=ead8c8c5f9a451a02bbb76b6841dc55f38b51fb5, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-gbuilder:             ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=b76e897b78e923d5cdda9c95463810bb42c40dfc, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-gfa-split:            ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=cae67bbe81c35d566da04826af6a04a85546d9fb, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-gmapper:              ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=5b8a3c6077b43c38e41d06c6d8aadc2ecbde9fcc, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-gsimplifier:          ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=7f0b67d6449d6fa4253d429e520358995c4e4f60, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-hammer:               ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=4368c59963ee313387a09cd00033531cbcc5e8e1, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-ionhammer:            ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=432aaf9b887ff8521ee073510fe47ffc379fff1b, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-kmer-estimating:      ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=ef045fa9bee7c8f061d2e133e584bebfaca731f3, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-kmercount:            ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=2f94a2b137de314fe0a12ccf4fbe47ee52afb5a0, for GNU/Linux 3.7.0, with debug_info, not stripped
build/bin/spades-read-filter:          ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=a7ebfa2a63cbebff0d29f490d570b5581e72b515, for GNU/Linux 3.7.0, with debug_info, not stripped

mgrigorov in 🌐 euler-arm-22 in spades on ξ‚  next 
❯ ./build/bin/spades-bwa 

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.16a-r1185-dirty
Contact: Heng Li <lh3@sanger.ac.uk>

Usage:   bwa <command> [options]

Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
         aln           gapped/ungapped alignment
         samse         generate alignment (single ended)
         sampe         generate alignment (paired ended)
         bwasw         BWA-SW for long queries

         shm           manage indices in shared memory
         fa2pac        convert FASTA to PAC format
         pac2bwt       generate BWT from PAC
         bwtupdate     update .bwt to the new format
         bwt2sa        generate SA from BWT and Occ

Note: To use BWA, you need to first index the genome with `bwa index'.
      There are three alignment algorithms in BWA: `mem', `bwasw', and
      `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
      first. Please `man ./bwa.1' for the manual.

If it does not cause regression for the supported platforms I'd be thankful if it is merged!

`char` is unsigned on aarch64

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@@ -63,8 +63,8 @@ class InOutMask {
return unique[mask];
}

static constexpr char GetUnique(uint8_t mask) {
constexpr char next[] =
static constexpr int8_t GetUnique(uint8_t mask) {
Copy link
Member

@asl asl Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change! However, I think we need to go the other way. Here is the full story:

  • The return value is assumed to be 0 - 3 (representing A, C, G, T correspondingly) for input of 0, 1, 2, 4 (converting single-bit mask to a nucleotide)
  • -1's represent the impossible values, but still they should be as initializers.

So, the return type should really be char as it is used later on in GetUniqueOutgoing / GetUniqueIncoming.

I would probably use 0xFF instead of -1. Does this work for you on aarch64-linux? To make things explicit we can even do something like:

                {  UINT8_C(0xFF),  0,  1, UINT8_C(0xFF),  2, UINT8_C(0xFF), UINT8_C(0xFF), UINT8_C(0xFF),
                  3, UINT8_C(0xFF), UINT8_C(0xFF), UINT8_C(0xFF), UINT8_C(0xFF), UINT8_C(0xFF), UINT8_C(0xFF), UINT8_C(0xFF) };

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! That works too!
Let me update the PR!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated it but I have the feeling that more changes are needed at the call sites of GetUniqueOutgoing / GetUniqueIncoming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Callsites should be fine. The contract is that one should never call GetUniqueOutgoing / GetUniqueIncoming if they could return 0xFF.

`char` type is unsigned on aarch64, so negative values cannot be used
@@ -63,10 +63,10 @@ class InOutMask {
return unique[mask];
}

static constexpr char GetUnique(uint8_t mask) {
static constexpr char GetUnique(uint8_t mask) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static constexpr char GetUnique(uint8_t mask) {
static constexpr char GetUnique(uint8_t mask) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(extra whitespace here)

@asl
Copy link
Member

asl commented Apr 5, 2024

Well, it seems it fails build non systems with unsigned char type. Let me think a bit about better solution.

@martin-g
Copy link
Contributor Author

martin-g commented Apr 5, 2024 via email

@asl
Copy link
Member

asl commented Apr 5, 2024

How about using 127 instead of 255 ? AFAIU any other value than the meaningful ones (1, 2, 3 and 4) should be OK

Yes, but let me think a bit more. Maybe we'd solve the problem the other way :)

@asl
Copy link
Member

asl commented Apr 8, 2024

I pushed a fix in d28746b

Will you please check on aarch64/linux?

@martin-g
Copy link
Contributor Author

martin-g commented Apr 8, 2024

I pushed a fix in d28746b

Will you please check on aarch64/linux?

Confirmed that it works fine on Linux ARM64! Thank you!
Closing this PR!

Do you have any ETA when next will be released ?

@martin-g martin-g closed this Apr 8, 2024
@asl
Copy link
Member

asl commented Apr 8, 2024

Do you have any ETA when next will be released ?

Well... as soon as we will finish the documentation revamp. And finalize the release CI

@martin-g martin-g deleted the aarch64-compatibility branch April 8, 2024 11:27
@asl
Copy link
Member

asl commented Jun 5, 2024

Just in case, SPAdes 4.0 was released: https://github.com/ablab/spades/releases/tag/v4.0.0

@martin-g
Copy link
Contributor Author

martin-g commented Jun 5, 2024

Thanks!
I've updated the Bioconda recipe - bioconda/bioconda-recipes#46731

@asl
Copy link
Member

asl commented Jun 5, 2024

Thanks! You may want to enable NCBI SRA input there. Also, it would make sense to update URL links as cab.spbu.ru is not working anymore

@martin-g
Copy link
Contributor Author

martin-g commented Jun 5, 2024

You may want to enable NCBI SRA input there.

Sure! What exactly do you suggest to do ?

@asl
Copy link
Member

asl commented Jun 5, 2024

You may want to enable NCBI SRA input there.

Sure! What exactly do you suggest to do ?

https://ablab.github.io/spades/installation.html#enabling-ncbi-sra-input-file-support

@martin-g
Copy link
Contributor Author

martin-g commented Jun 5, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants