DataPrep error and empty output files #222

Open
cent0134 opened this issue Jul 19, 2024 · 2 comments


@cent0134

Hello,
I keep getting a warning message and empty output files (data.index, data.json, data.log, data.readcount, eventalign.index).
After I ran xpore dataprep --eventalign reads-ref.eventalign.txt --gtf_or_gff NL4-3.gtf --transcript_fasta NL4-3.fa --out_dir dataprep --genome, a warning appeared. This is the complete error report:

xpore dataprep --eventalign reads-ref.eventalign.txt --gtf_or_gff NL4-3.gtf --transcript_fasta NL4-3.fa --out_dir dataprep --genome
/home/dell/miniconda3/envs/xpore/lib/python3.12/site-packages/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
chunk_split['line_length'] = np.array(lines)

These are my reference FASTA, GTF, and eventalign files.

GTF:

AF003887 GenBank transcript 790 2292 . + . transcript_id "gag_t01"; gene_id "gag"
AF003887 GenBank exon 790 2292 . + . transcript_id "gag_t01"; gene_id "gag";
AF003887 GenBank CDS 790 2292 . + 0 transcript_id "gag_t01"; gene_id "gag";
AF003887 GenBank transcript 2358 5096 . + . transcript_id "pol_t01"; gene_id "pol"
AF003887 GenBank exon 2358 5096 . + . transcript_id "pol_t01"; gene_id "pol";
AF003887 GenBank CDS 2358 5096 . + 0 transcript_id "pol_t01"; gene_id "pol";
AF003887 GenBank transcript 5041 5619 . + . transcript_id "vif_t01"; gene_id "vif"
AF003887 GenBank exon 5041 5619 . + . transcript_id "vif_t01"; gene_id "vif";
AF003887 GenBank CDS 5041 5619 . + 0 transcript_id "vif_t01"; gene_id "vif";
AF003887 GenBank transcript 5559 5849 . + . transcript_id "vpr_t01"; gene_id "vpr"
AF003887 GenBank exon 5559 5849 . + . transcript_id "vpr_t01"; gene_id "vpr";
AF003887 GenBank CDS 5559 5849 . + 0 transcript_id "vpr_t01"; gene_id "vpr";
AF003887 GenBank transcript 5830 8489 . + . transcript_id "tat_t01"; gene_id "tat"
AF003887 GenBank exon 5830 6044 . + . transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank exon 8399 8489 . + . transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank CDS 5830 6044 . + 0 transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank CDS 8399 8489 . + 1 transcript_id "tat_t01"; gene_id "tat";
AF003887 GenBank transcript 5969 8673 . + . transcript_id "rev_t01"; gene_id "rev"
AF003887 GenBank exon 5969 6044 . + . transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank exon 8399 8673 . + . transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank CDS 5969 6044 . + 0 transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank CDS 8399 8673 . + 2 transcript_id "rev_t01"; gene_id "rev";
AF003887 GenBank transcript 6224 8815 . + . transcript_id "env_t01"; gene_id "env"
AF003887 GenBank exon 6224 8815 . + . transcript_id "env_t01"; gene_id "env";
AF003887 GenBank CDS 6224 8815 . + 0 transcript_id "env_t01"; gene_id "env";
AF003887 GenBank transcript 8817 9437 . + . transcript_id "nef_t01"; gene_id "nef"
AF003887 GenBank exon 8817 9437 . + . transcript_id "nef_t01"; gene_id "nef";
AF003887 GenBank CDS 8817 9437 . + 0 transcript_id "nef_t01"; gene_id "nef";

Reference FASTA:

AF003887
TGGAAGGGCTAATTTGGTCCCAAAAAAGACAAGAGATCCTTGATCTGTGG
ATCTACCACACACAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGG
GCCAGGGATCAGATATCCACTGACCTTTGGATGGTGCTTCAAGTTAGTAC
CAGTTGAACCAGAGCAAGTAGAAGAGGCCAAATAAGGAGAGAAGAACAGC
TTGTTACACCCTATGAGCCAGCATGGGATGGAGGACCCGGAGGGAGAAGT
ATTAGTGTGGAAGTTTGACAGCCTCCTAGCATTTCGTCACATGGCCCGAG
AGCTGCATCCGGAGTACTACAAAGACTGCTGACATCGAGCTTTCTACAAG
GGACTTTCCGCTGGGGACTTTCCAGGGAGGTGTGGCCTGGGCGGGACTGG
GGAGTGGCGAGCCCTCAGATGCTACATATAAGCAGCTGCTTTTTGCCTGT
ACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTA
ACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTCA
AAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTC
AGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGCCCGAACAGG
GACTTGAAAGCGAAAGTAAAGCCAGAGGAGATCTCTCGACGCAGGACTCG
GCTTGCTGAAGCGCGCACGGCAAGAGGCGAGGGGCGGCGACTGGTGAGTA
CGCCAAAAATTTTGACTAGCGGAGGCTAGAAGGAGAGAGATGGGTGCGAG
AGCGTCAGTATTAAGCGGGGGAGAATTAGATAGATGGGAAAAAATTCGGC
TAAGGCCAGGGGGAAAGAAAAAATATAAATTAAAACATATAGTATGGGCA
AGCAGGGAGCTAGAACGATTCGCAGTTAATCCTGGCCTGTTAGAAACATC
AGAAGGCTGTAGACAAATACTGGGACAGCTACAACCAGCCCTCCAGACAG
GATCAGAAGAACTTAGATCATTACATAATACAGTAGCAGTCCTCTATTGT
GTGCATCAAAGGATAGAGGTAAAAGACACCAAGGAAGCTTTAGAGAAAAT
AGAGGAAGAGCAAAACAAAAGTAAGAAAACAGCACAGCAAGCAGCAGCTG
ACACAGGAAACAGCAACAAGGTCAGCCAAAATTACCCTATAGTGCAGAAC
ATCCAGGGGCAAATGGTACATCAGGCCATATCACCTAGAACTTTAAATGC
ATGGGTAAAAGTAGTAGAAGAGAAGGCTTTTAGCCCAGAAGTAATACCCA
TGTTTTCAGCATTATCAGAAGGAGCCACCCCACAAGATTTAAACACCATG
CTAAACACAGTGGGGGGACATCAAGCAGCCATGCAAATGTTAAAAGAGAC
CATCAATGAGGAAGCTGCAGAATGGGATAGATTGCATCCAGTGCAGGCAG
GGCCTGTCGCACCAGGCCAGATGAGAGAACCAAGGGGAAGTGACATAGCA
GGAACTACTAGTACCCTTCAGGAACAAATAGGATGGATGACAAATAATCC
ACCTATCCCAGTAGGAGAAATCTATAAAAGATGGATAATCCTGGGATTAA
ATAAAATAGTAAGGATGTATAGCCCTGCCAGCATTCTGGACATAAGACAA
GGACCAAAGGAACCCTTTAGAGATTATGTAGACCGGTTCTATAAAACTCT
AAGAGCCGAGCAAGCTTCACAGGAGGTAAAAAATTGGATGACAGAAACCT
TGTTGGTCCAAAATTCGAACCCAGATTGTAAGACTATTTTAAAAGCATTG
GGACCAGCAGCTACACTAGAAGAAATGATGACAGCATGTCAGGGAGTGGG
AGGACCCAGCCATAAAGCAAGAGTTTTGGCTGAAGCAATGAGCCAAGCAA
CAAATGCAGCTACCATAATGATGCAGAGAGGCAATTTTAGGAATCAAAGA
AAGATTGTTAAATGTTTTAATTGTGGCAAAGAAGGGCACATAGCCAGGAA
TTGCAGGGCTCCTAGGAAAAAGGGCTGTTGGAAATGTGGAAAGGAAGGAC
ACCAAATGAAAGATTGTACTGAGAGACAGGCTAATTTTTTAGGGAAGATC
TGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGA
GCCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGACAACAA
CTCCCTCTCAGAAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAGCT
TCCCTCAGATCACTCTTTGGCAACGACCCCTCGTCAAAGTAAAGATAGGG
GGGCAACTAAAGGAAGCTCTATTAGATACAGGAGCAGATGATACAGTATT
AGAAGAAATGAATTTGCCAGGAAGATGGAAACCAAAAATGATAGGGGGAA
TTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGGTACTTGTAGAAATC
TGTGGACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCCGTCAA
CATAATTGGAAGAAATCTGTTGACTCAGATTGGTTGCACTTTAAATTTTC
CCATTAGTCCTATTGAAACTGTACCAGTAAAATTAAAGCCAGGAATGGAT
GGCCCAAAAGTTAAACAATGGCCATTGACAGAAGAAAAAATAAAGGCATT
AGTAGAAATTTGTACAGAAATGGAAAAAGAAGGGAAAATTTCAAAAATTG
GGCCTGAAAATCCATACAATACTCCAGTATTTGCCATAAAGAAAAAAGAC
AGTACTAAATGGAGAAAATTAGTAGATTTCAGAGAACTTAATAAGAGAAC
TCAAGACTTCTGGGAAGTTCAATTAGGAATACCACATCCCGCAGGGTTAA
AAAAGAAAAAATCAGTAACAGTATTGGATGTGGGTGATGCATATTTTTCA
GTTCCCTTAGATAAAGACTTCAGAAAGTATACTGCATTTACCATACCTAG
TACAAACAATGAGACACCAGGGATTAGGTATCAGTACAATGTGCTTCCAC
AGGGATGGAAAGGATCACCAGCAATATTCCAAAGTAGCATGACAAGAATC
TTAGAGCCTTTTAGAAAACAAAATCCAGACATAGTTATCTATCAATACAT
GGATGATTTGTATGTAGGATCTGACTTAGAGATAGGGCAGCATAGAACAA
AAATAGAGGAACTGAGGCAACATCTGTTGAGATGGGGATTTACCACACCA
GACAAAAAACATCAGAAAGAACCTCCATTCCTTTGGATGGGTTATGAACT
CCATCCTGATAAATGGACAGTACAGCCTATAGTGCTGCCAGAAAAAGACA
GCTGGACTGTCAATGACATACAGAAGTTAGTGGGAAAATTAAATTGGGCA
AGTCAGATCTACGCAGGGATTAAAGTAAAGCAATTATGTAAACTCCTCAG
GGGAGCCAAAGCACTAACAGAAGTAATACCACTAACAGAAGAAGCAGAGC
TAGAACTGGCAGAAAACAGGGAGATTTTAAAAGAACCAGTACATGGAGTG
TATTATGACCCATCAAAAGACTTAATAGCAGAAATACAGAAGCAGGGGCA
GGGCCAATGGACATATCAAATTTATCAAGAGCCATATAAAAATCTGAAAA
CAGGAAAATATGCAAGAATGAGGGGTGCCCACACTAATGATGTTAAACAA
TTAACAGAGGCAGTGCAAAAAATAGCCACAGAAAGCATAGTAATATGGAG
AAAGATTCCTAAATTTAAACTACCCATACAAAGAGAAACATGGGAAACAT
GGTGGATAGAGTATTGGCAAGCAACCTGGATTCCTGAGTGGGAGTATGTC
AATACCCCTCCCTTAGTAAAATTATGGTACCAGTTAGAGAAAGAACCCAT
AGTAGGAGCAGAAACTTTCTATGTAGATGGGGCAGCTAACAGAGAAACTA
AATTAGGAAAAGCAGGATATGTTACTGACAGAGGAAGACAAAAAGTTGTC
TCCCTAAGTGACACAACAAATCAGAAAACTGAGTTACAAGCAATTCATCT
AGCTTTGCAGGATTCGGGCTTAGAAGTAAACATAGTAACAGACTCACAAT
ATGCATTAGGAATCATTCAAGCACAACCAGATAAAAGTGAATCAGAGTTA
GTCAGTCAAATAATAGAGCAGTTAATAAAAAAGGAAAAGGTCTACCTGGC
ATGGGTACCAGCACACAAAGGAATTGGAGGAAATGAACAAGTAGATAAAT
TAGTCAGTGCTGGAATCAGGAAAGTACTATTTTTAGATGGAATAGATAAG
GCCCAAGAAGAACATGAGAAATATCACAGTAATTGGAGAGCAATGGCTGG
TGATTTTAACCTGCCACCTGTAGTAGCAAAAGAAATAGTAGCCTGCTGTG
ATAAATGTCAGCTAAAAGGAGAAGCCATGCATGGACAAGTAGACTGTAGT
CCAGGAATATGGCAACTAGATTGTACACATTTAGAAGGAAAAGTTATCCT
GGTAGCAGTTCATGTAGCCAGTGGATATATAGAAGCAGAAGTCATTCCAG
CAGAGACAGGGCAGGAAACAGCATACTTTCTCTTAAAATTAGCAGGAAGA
TGGCCAGTAAAAACAATACATACAGATAATGGCAGCAATTTCACCAGTAC
TACAGTTAAGGCTGCTTGTTGGTGGGCAGGGATCAAGCAGGAATTTGGCA
TTCCCTACAATCCCCAAAGTCAAGGAGTAGTAGAATCTATGAATAAAGAA
TTAAAGAAAATTATAGGACAGGTAAGAGATCAGGCTGAACATCTTAAGAC
AGCAGTCCAAATGGCAGTATTCATCCACAATTTTAAGAGAAAAGGGGGGA
TTGGGGGGTACAGTGCAGGGGAAAGAATAGTAGACATAATAGCAACAGAC
ATACAAACTAAAGAATTACAAAAACACATTACAAAGATTCAAAATTTTCG
GGTTTATTACAGGGACAGCAGAGATCCACTTTGGAAAGGACCAGCAAAGC
TTCTCTGGAAAGGTGAAGGGGCAGTAGTAATACAAGATAATAGTGACATA
AAAGTAGTGCCAAGAAGAAAAGCAAAGATCATTAGGGATTATGGAAAACA
GATGGCAGGTGAAGATTGTGTGGCAAGTAGACAGGATGAGGATTAGAACA
TGGAAAAGTTTAGTAAAGCACCATATGTATGTTTCAGGGAAAGCCAAGAA
ATGGTTTTATAGACATCACTATGAAAGCACTCATCCTAGAATAAGTTCAG
AAGTGCACATCCCACTAGGAGATGCTAATTTGGTAATAACAACATATTGG
GGTCTGCATTCAGGAGAAAGAGACTGGCATTTGGGCCAGGGAGTCTCCAT
AGAATGGAGGAAAAAGAGATATAGCACACAAGTAGACCCTGGCCTAGCAG
ACCAACTAATTCATCTGTATTATTTTGATTGTTTTTCAGAATCTGCTATA
AGAAATGCCATATTAGGACATAGAGTTAGTCCTAGTTGTGAATATCAAGC
AGGACATAACAAGGTAGGATCTCTACAATACTTGGCACTAGTTGCATTAG
TAGCACCAAAAAAGATAAAGCCACCTTTGCCTAGTGTTACGAAACTGACA
GAGGATAGATGGAACAAGCCCCAGAAGACCAAGGGCCACAGAGGGAGCCA
TACAATGAATGGACACTAGAGCTTTTAGAGGAGCTTAAGAGTGAAGCTGT
TAGACATTTTCCTAGGTTATGGCTCCATAGCTTAGGACAACATATCTATG
AAACTTATGGGGATACTTGGGCAGGAGTGGAAGCCATAATAAGAATTCTG
CAACAACTGCTGTTTATTCATTTCAGAATTGGGTGTCGACATAGCAGAAT
AGGCATAATTCGACAGAGGAGAGCAAGAAATGGAGCCAGTAGATCCTAGC
CTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAAACTGCTTGTACCAA
TTGCTATTGTAAATATTGTTGCCTTCATTGCCAAGTTTGTTTCACAAGAA
AAGGCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGA
GCTCCTCAAGACAGTCAGACTCATCAAGTTTCTCTATCAAAGCAGTAAGT
AGTACATGTAATGCAACCTTTAGAAATATTAGCAATAGTAGCATTAGTAG
TAGCAGCAATAATAGCAATAGTTGTGTGGACCATAGTATTTATAGAATAT
AGGAAAATATTAAGACAAAGAAAGATAGACAGGTTAATTGATAGAATAAG
AGAAAGAGCAGAAGACAGTGGCAATGAAAGCGAAGGGGACCAGGAAGAAT
TATCAGCACTTGTGGAGATGGGGCATCACGCTCCTTGGAATGTTGATGAT
CTGTAGTGTTGCAGAACAATGGTGGGTCACAGTCTATTATGGGGTACCTG
TGTGGAAAGAAGCAACCACCACTCTATTTTGTGCATCAGATGCTAAGGCA
TATGATACAGAGGTGCATAATGTTTGGGCCACACATGCCTGTGTACCCAC
AGACCCCAACCCACAAGAAGTAGTATTGAGAAATGTGACAGAAAATTTTA
ACATGTGGAAAAATAACATGGTAGAACAGATGCATGAGGATATAATCAGT
TTATGGGATCAAAGTCTAAAGCCATGTGTAAAATTAACCCCACTCTGTGT
TACTCTAAATTGCACTGATAACTTAAAAAATGCTACTGTAAATAATGCTA
ATAATACCAATAATAGTAGCTGGGAAAAGATGGAGAAAGGAGAAATAAAA
AACTGCTCTTTCAATATCACCACTAGCATAAGAGATAAGGTGCAGAAAGA
ATATGCACTTTTTTATAAACTTGATGTAGTACCAATAGATAATGCTAATA
ATAGTAATGCTACTAACTATACCAGCTATAGATTGATAAGTTGTAACACC
TCAGTCATTACACAGGCCTGTCCAAAGGTATCCTTTGAGCCAATTCCTAT
ACATTATTGTGCCCCGGCTGGTTTTGCGATTCTAAAGTGTAATGATAAGA
AGTTCAATGGAACAGGACCATGTACAAATGTCAGTACGGTACAATGTACA
CATGGAATTAGGCCAGTAGTATCAACTCAACTGCTGTTAAATGGCAGTCT
AGCAGAAGAGGAGGTAGTAATTAGATCTGAAAATTTCACGAACAATGCTA
AAACCATAATAGTACAGCTGAATGAAACTGTAGTAATTAATTGTACAAGA
CCCAACAACAATACAAGGAAAAGTATACCTATAGGACCAGGGAGAGCATT
TTATACAACAGGAGACATAATAGGAGATATAAGACAAGCTCATTGTAACG
TTAGTAGAGCAAAATGGAATAACACTTTAGTAAAGATAGTTGAAAAATTA
AAAGAACAATTTGGGCATAATAAAACAATAGTCTTCAATCACTCCTCAGG
AGGGGACCTAGAAATTGTAACACACAGTTTTATTTGTGGAGGGGAATTCT
TCTACTGTAATACATCACAATTGTTTACTTGGAATAGTACTTGGAATAAT
ACTAGAGAGTCAGATAACAATACAGAAGAGATCATACTCCCATGCAGAAT
AAAACAAATTATAAACATGTGGCAGAAAGTAGGAAAGGCAATGTATGCCC
CTCCCATCAGAGGACAAATTAGATGTTTATCAAATATTACAGGGCTGCTA
TTAACAAGAGATGGTGGTGATACCCCGAACGGGACCGAGGTCTTCAGACC
TGGAGGAGGAGATATGAGGGACAATTGGAGAAGTGAATTATACAAATATA
AAGTAGTAAAAATTGAACCATTAGGAGTAGCACCCACCAAGGCAAAGAGA
AGAGTGGTGCAGAGAGAAAAAAGAGCAGTGGGAACAATAGGAGCTATGTT
CCTTGGGTTCTTGGGAGCAGCGGGAAGCACTATGGGCGCAGCGTCAATGA
CGCTGACGGTACAGGCCAGACTATTATTGTCTGGTATAGTGCAACAGCAG
AACAATTTGCTGAGGGCTATTGAGGCGCAACAGCATCTGTTGCAACTCAC
AGTCTGGGGCATCAAGCAGCTCCAGGCAAGAGTCCTGGCTGTGGAAAGAT
ACCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGCTCTGGAAAACTC
ATTTGCACCACTGCTGTGCCTTGGAATGCTAGTTGGAGTAATAAATCTCT
GAATGAAATTTGGAATAACATGACCTGGATGGAATGGGAAAGAGAAATTA
ACAATTACACAGACCTAATATACACCCTAATTGAAGAATCGCAGAACCAG
CAAGAAAAGAATGAACAAGAATTATTGGAATTAGATAAATGGGCAAGTTT
GTGGAATTGGTTTGACATAACAAATTGGCTGTGGTATATAAAAATATTCA
TAATGATAGTAGGAGGCTTGGTAGGCTTAAGAATAGTCTTTACTGTACTT
TCTATAGTAAATAGAGTTAGGAAGGGATACTCACCATTATCGTTTCAGAC
CCGCCTCCCAACCCCGAGGGGACCCGACAGGCCCGAAGGAATCGCAGAAG
AAGGTGGAGAGCGAGACAGAGACAGATCCGAGCGCTTAGTGGATGGATTC
TTAGCAATTATCTGGGTCGACCTGCGGAGCCTGTGCCTCTTCAGCTACCA
CCGATTGAGAGACTTACTCTTGATTGTAGCGAGGATTGTGGAACTTCTGG
GACGCAGGGGGTGGGAAGTCCTCAAATATTGGTGGAATCTCCTGCAGTAT
TGGAGTCAGGAACTAAAGAATAGTGCTGTTAGCTTGCTCAATGCCACAGC
TATAGCAGTAGCTGAGGGGACAGATAGGGTTATAGAAGTATTACAAAGAG
CTTGTAGAGCTATCCTCCACATACCTACAAGAATAAGACAGGGCTTAGAA
AGGGCTTTGCTATAAGATGGGTGGCAAGTGGTCCAAAAGTAGTATAGTTG
GATGGCCTACTGTAAGGGAAAGAATGAGACGAGCTGAGCCAGCAGCAGAT
GGGGTGGGAGCAGTATCTCGAGACCTGGAAAAACATGGAGCAATCACAAG
TAGCAATACAGCAGCTACTAATGCTGATTGTGCCTGGCTAGAAGCACAAG
AGGAGGAAGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTAAGACCA
ATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGG
GGGACTGGAAGGGCTAATTCACTCCCAAAGAAGACAAGATATCCTTGATC
TGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGAACTACACA
CCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCT
AGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACA
CCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGA
GAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGC
CCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCT
ACAAGGGACTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGG
ACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTG
CCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCT

eventalign file:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level
AF003887 1 GGAAG 0 t 1389 100.39 2.906 0.00564 GGAAG 115.76 5.56 -2.37
AF003887 1 GGAAG 0 t 1390 120.60 7.913 0.02191 GGAAG 115.76 5.56 0.75
AF003887 2 GAAGG 0 t 1391 110.99 6.816 0.00266 GAAGG 105.26 4.06 1.21
AF003887 3 AAGGG 0 t 1392 116.69 7.518 0.02324 AAGGG 113.12 7.84 0.39
AF003887 4 AGGGC 0 t 1393 111.36 5.661 0.01195 AGGGC 116.40 4.05 -1.07
AF003887 5 GGGCT 0 t 1394 105.70 3.247 0.00432 GGGCT 113.28 5.31 -1.23
AF003887 6 GGCTA 0 t 1395 113.84 4.058 0.00830 GGCTA 110.69 3.55 0.76
AF003887 7 GCTAA 0 t 1396 88.25 2.053 0.00498 GCTAA 84.40 2.63 1.26
AF003887 7 GCTAA 0 t 1397 84.09 1.143 0.00299 GCTAA 84.40 2.63 -0.10
AF003887 8 CTAAT 0 t 1398 95.87 4.933 0.00730 CTAAT 96.70 3.04 -0.23
AF003887 8 CTAAT 0 t 1399 104.79 1.832 0.00365 CTAAT 96.70 3.04 2.28

Can you please help me solve this problem? Thank you very much.
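
One quick sanity check on inputs like those above is to compare the contig names in the eventalign file with the names in the GTF. The sketch below is only a diagnostic aid, not part of xPore: the function names are hypothetical, it assumes both files are tab-separated on disk (the pasted samples have their whitespace collapsed), and it makes no claim about which naming xPore actually requires. It simply reports whether each eventalign contig matches a GTF sequence name (column 1), a `transcript_id`, or neither.

```python
import csv
import re


def eventalign_contigs(lines):
    """Count event rows per contig in tab-separated eventalign text."""
    reader = csv.DictReader(lines, delimiter="\t")
    counts = {}
    for row in reader:
        contig = row["contig"]
        counts[contig] = counts.get(contig, 0) + 1
    return counts


def gtf_names(lines):
    """Collect sequence names (column 1) and transcript_id values from GTF text."""
    seqnames, transcript_ids = set(), set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        seqnames.add(fields[0])
        match = re.search(r'transcript_id "([^"]+)"', fields[-1])
        if match:
            transcript_ids.add(match.group(1))
    return seqnames, transcript_ids


def classify_contigs(contig_counts, seqnames, transcript_ids):
    """Label each eventalign contig by what it matches in the GTF."""
    labels = {}
    for contig in contig_counts:
        if contig in transcript_ids:
            labels[contig] = "transcript_id"
        elif contig in seqnames:
            labels[contig] = "seqname"
        else:
            labels[contig] = "unknown"
    return labels


if __name__ == "__main__":
    # Tab-separated excerpts modelled on the samples in this report.
    eventalign_text = [
        "contig\tposition\treference_kmer\tread_index\n",
        "AF003887\t1\tGGAAG\t0\n",
        "AF003887\t2\tGAAGG\t0\n",
    ]
    gtf_text = [
        'AF003887\tGenBank\ttranscript\t790\t2292\t.\t+\t.\t'
        'transcript_id "gag_t01"; gene_id "gag"\n',
    ]
    counts = eventalign_contigs(eventalign_text)
    seqnames, transcript_ids = gtf_names(gtf_text)
    print(classify_contigs(counts, seqnames, transcript_ids))
```

With real files, pass open file handles instead of the in-memory lists. A contig labelled "seqname" rather than "transcript_id" would suggest the reads were aligned to the genome sequence rather than to per-transcript sequences, which may be worth mentioning when reporting the problem.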

@yuukiiwa
Collaborator

Hi @cent0134,

The warning is normal. You can try running xpore without the --genome, --gtf_or_gff, and --transcript_fasta flags:

xpore dataprep --eventalign reads-ref.eventalign.txt --out_dir dataprep

Thanks!

Best wishes,
Yuk Kei

@cent0134
Author

Hi @yuukiiwa, thank you for taking the time to answer my question, but the key issue is that the output files are empty. Or do you mean that removing these flags would also solve the empty-output problem? I look forward to your reply.
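
When comparing a run with and without the extra flags, it can help to record exactly which output files are empty or missing. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the file list is taken from the names given in the original report (`data.index` is our reading of the first name), so adjust it if your output directory differs.

```python
from pathlib import Path

# Output file names as listed in the report; treat this list as an assumption.
EXPECTED_OUTPUTS = [
    "data.index",
    "data.json",
    "data.log",
    "data.readcount",
    "eventalign.index",
]


def output_sizes(out_dir, names=EXPECTED_OUTPUTS):
    """Map each expected file name to its size in bytes, or None if absent."""
    sizes = {}
    for name in names:
        path = Path(out_dir) / name
        sizes[name] = path.stat().st_size if path.is_file() else None
    return sizes


def empty_or_missing(out_dir, names=EXPECTED_OUTPUTS):
    """Return the sorted names of files that are absent or zero bytes long."""
    sizes = output_sizes(out_dir, names)
    return sorted(name for name, size in sizes.items() if not size)


if __name__ == "__main__":
    # "dataprep" is the --out_dir used in the commands in this thread.
    for name, size in output_sizes("dataprep").items():
        print(f"{name}: {'missing' if size is None else f'{size} bytes'}")
```

Running this after each attempt and pasting the result into the issue makes it clear whether only some files (for example `data.json`) are empty or whether nothing was written at all.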
