igemsoftware2021/dna-annotate

dna-annotate Friendzymes Cookbook

A GitHub Action that annotates problematic sequences in GenBank files.

dna-annotate is a GitHub Action that takes three inputs: the path of an input directory, a regex pattern used to select GenBank files (or any other file name pattern of interest), and a directory where the output will be written. The action uses these inputs to annotate the problematic parts of each matching sequence.

Currently, dna-annotate attempts to find and annotate:

  • Repetitions greater than 10 base pairs in length
  • Hairpins
  • Homopolymers
  • Most common restriction binding sites

If you have an idea for a feature that would make this action better, please feel free to create an issue.

All options

List of Options

Every argument is required.

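| Option        | Description                                                 | Default         |
|---------------|-------------------------------------------------------------|-----------------|
| input-dir     | Directory where all the input GenBank files will be read    | input           |
| input-pattern | Regex to filter files in the input directory                | .*\.\(gb\|gbk\) |
| output-dir    | Directory where all the output GenBank files will be written | output          |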

Detailed Options

input-dir

This parameter is the path of the directory from which your GenBank files are read and annotated. You can use it to set up different pipelines for different folders, so that a project can be divided into folders that are each processed differently. By default, the action uses input as the input directory.

Default: input
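
As a rough sketch of the "different pipelines for different folders" idea, the steps below run the action twice with different directories. The folder names (parts/promoters, parts/cds, annotated/...) are hypothetical, and the sketch assumes that paths are resolved relative to the checked-out repository; only the option names, the default pattern, and the action reference come from this README.

```yaml
steps:
  # Make the repository files available to the job
  - uses: actions/checkout@v4

  # First pipeline: hypothetical folder holding promoter parts
  - name: Annotate promoters
    uses: Open-Science-Global/dna-annotate@v0.6.1
    with:
      input-dir: parts/promoters
      input-pattern: '.*\.\(gb\|gbk\)'   # documented default
      output-dir: annotated/promoters

  # Second pipeline: hypothetical folder holding coding sequences
  - name: Annotate coding sequences
    uses: Open-Science-Global/dna-annotate@v0.6.1
    with:
      input-dir: parts/cds
      input-pattern: '.*\.\(gb\|gbk\)'   # documented default
      output-dir: annotated/cds
```

Keeping each folder in its own step makes it easy to give every pipeline its own pattern and output location.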

input-pattern

This parameter is a regex pattern, written in re2 syntax, that filters the files inside input-dir. Even within a given input directory, you can therefore select a specific file or group of files for the current job. By default, the action matches files with GenBank extensions (.gb or .gbk).

Example: match only BBF10k-prefixed files, that is, parts from the FreeGenes 10k gene project.

Default: .*\.\(gb\|gbk\)
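
One possible way to express the BBF10k example is sketched below. The pattern itself is an assumption and is not taken from the action's documentation: it avoids grouping syntax, and its leading `.*` is there in case the pattern is matched against the whole file path rather than the bare file name. Check action.yml to see how the pattern is actually applied before relying on it.

```yaml
- name: Annotate BBF10k parts only
  uses: Open-Science-Global/dna-annotate@v0.6.1
  with:
    input-dir: input
    # Hypothetical pattern: any path containing "BBF10k" followed by ".gb".
    # Single quotes keep the backslash literal in YAML.
    input-pattern: '.*BBF10k.*\.gb'
    output-dir: output
```

If the pattern is not anchored to the end of the name, this sketch would also pick up .gbk files, which is usually what you want for GenBank input.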

output-dir

This parameter is the path of the directory where the annotated sequences are written as GenBank files. By default, the action uses output as the output directory.

Default: output
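
Since the annotated files end up in output-dir, a later step can pick them up. The sketch below publishes them as a workflow artifact with actions/upload-artifact. It assumes that output-dir is created relative to the workspace, and the artifact name is a hypothetical choice.

```yaml
steps:
  - uses: actions/checkout@v4

  - name: dna-annotate
    uses: Open-Science-Global/dna-annotate@v0.6.1
    with:
      input-dir: input
      input-pattern: '.*\.\(gb\|gbk\)'   # documented default
      output-dir: output

  # Assumption: the annotated files are written to ./output in the workspace
  - name: Upload annotated GenBank files
    uses: actions/upload-artifact@v4
    with:
      name: annotated-genbank   # hypothetical artifact name
      path: output
```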

Usage

Basic:

- name: dna-annotate
  uses: Open-Science-Global/dna-annotate@v0.6.1
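
For context, the basic step above could sit inside a complete workflow file along the following lines. This is a minimal sketch: the file name, the trigger, the job name, and the runner are all assumptions chosen for illustration, while the option names and default values are the ones documented in this README.

```yaml
# .github/workflows/annotate.yml (hypothetical file name)
name: Annotate sequences

# Assumed trigger: run whenever files under input/ change
on:
  push:
    paths:
      - 'input/**'

jobs:
  annotate:
    runs-on: ubuntu-latest   # assumed runner
    steps:
      # The action reads files from the repository, so check it out first
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: dna-annotate
        uses: Open-Science-Global/dna-annotate@v0.6.1
        with:
          input-dir: input
          input-pattern: '.*\.\(gb\|gbk\)'
          output-dir: output
```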

See action.yml for a comprehensive list of all the options.

See Friendzymes Cookbook for further examples and sample data.