De Novo Pipeline

De novo pipeline building requires bioinformatics expertise. Please contact the technical support team for assistance.

Creating a brand-new pipeline is more involved than copying an existing one. Fill in the following details in the pipeline creation window (Fig. 1) to create a new pipeline. Mandatory fields are marked with an asterisk (*).

At least one step is required for a functional pipeline.

Click the icon to add a new step (or tool) to the pipeline (Fig. 1). There are eight fields in each step:

HINT: Only positive integers are allowed

HINT: The name should be unique.

HINT: Only positive integers are allowed

HINT: The first step can’t be a merge step

HINT: Input sources from multiple steps are allowed. (Example: BAM and BAI files created in different steps required for Variant calling).

HINT: Currently, Data Store is allowed for the first step only
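The hints above amount to a small set of validation rules. They can be sketched as follows. This is only an illustration, not the platform's implementation: the Step class and all of its attribute names are hypothetical, and it assumes the two "positive integers" hints apply to the step number and the predecessor fields.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    # Hypothetical stand-ins for the step fields; not the platform's field names.
    number: int
    name: str
    predecessors: list = field(default_factory=list)
    is_merge: bool = False
    uses_data_store: bool = False


def validate_steps(steps):
    """Return a list of problems; an empty list means no rule was violated."""
    if not steps:
        return ["at least one step is required"]
    problems = []
    seen_names = set()
    for position, step in enumerate(steps):
        # Assumed to be one of the fields restricted to positive integers.
        if not isinstance(step.number, int) or step.number < 1:
            problems.append(f"{step.name}: step number must be a positive integer")
        # Assumed to be the other field restricted to positive integers.
        if any(not isinstance(p, int) or p < 1 for p in step.predecessors):
            problems.append(f"{step.name}: predecessors must be positive integers")
        if step.name in seen_names:
            problems.append(f"{step.name}: step name is not unique")
        seen_names.add(step.name)
        if position == 0 and step.is_merge:
            problems.append(f"{step.name}: the first step can't be a merge step")
        if position > 0 and step.uses_data_store:
            problems.append(f"{step.name}: Data Store is allowed for the first step only")
    return problems


pipeline = [
    Step(1, "trim", uses_data_store=True),
    Step(2, "align", predecessors=[1]),
]
print(validate_steps(pipeline))
```

A step may list several predecessors, which mirrors the hint that input sources from multiple steps are allowed.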

  1.   Command Builder

Commands are preconfigured by the platform admin. Users can only edit the commands.

This is a generic command building process. You are NOT making the actual file selections required for the analysis. The platform does it automatically based on your definitions.

Fig. 1. The Command Builder dialog box

 

The first tab of the Command Builder shows the generic details (summary) of a command.

Default pattern: #command #options #arguments #input #output

The pattern should ALWAYS start with #command and can’t be edited.

Allowed characters: parameter words, #, space, and >

“>” is allowed ONLY immediately before #output
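The pattern rules above can be checked mechanically. The sketch below is an illustration under assumptions, not the platform's validator: check_pattern is a hypothetical helper, and it assumes that "parameter words" means the five placeholders in the default pattern.

```python
# Assumption: the only valid parameter words are the five default placeholders.
PLACEHOLDERS = {"#command", "#options", "#arguments", "#input", "#output"}


def check_pattern(pattern):
    """Return True if the pattern follows the rules listed above."""
    # Separate ">" from its neighbours so ">#output" and "> #output" are treated alike.
    tokens = pattern.replace(">", " > ").split()
    if not tokens or tokens[0] != "#command":
        return False  # the pattern must always start with #command
    for index, token in enumerate(tokens):
        if token == ">":
            # ">" is only allowed immediately before #output
            if index + 1 >= len(tokens) or tokens[index + 1] != "#output":
                return False
        elif token not in PLACEHOLDERS:
            return False  # anything else is not an allowed parameter word
    return True


print(check_pattern("#command #options #arguments #input #output"))
print(check_pattern("#command #options #input > #output"))
print(check_pattern("#options #command #input #output"))
```

The redirect form (#input > #output) is the kind of pattern a tool that writes its result to standard output would need.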

The second tab of the Command Builder (Fig. 2) describes the Options parameter. Details of the Options tab are described below: 

Fig. 2. The options tab

 

Single-word parameters should be defined as options (Examples: --ignore, --1, PE, SE). All the options are listed in a table format. New row(s) can be added using the ‘+’ sign at the bottom of the table. Six fields are available under each option.

CAUTION - Verify usage of each option before using

Field Type: Annotation
Value:
  • Variant annotation files (Mills1000G_INDELS, DBSNP, 1000G_HC, 1000G_OMNI, and HAPMAP), GATK
  • Pathway or GO
  • VEP Cache and VEP Cache Version
  • GTF
  • ABR

Field Type: Constant
Value:
  • Any constant value (alphanumerics) (Examples: -o, --i, and --single)

Field Type: Metadata
Value:
  • Experimental Design
  • Targets
  • Genelist
  • Amplicon ranges

Field Type: Reference
Value: Define references to select
  • References: Genome/Transcriptome
  • Indexed references: BWA, Bowtie2, etc.

Field Type: Threshold
Value: Define threshold values to use
  • qvalue
  • pvalue

Field Type: Variable
Value: Native variables of the platform
  • JobID
  • Organism
  • Ploidy
  • Sample Name
  • Reference Version
  • Sequencing Platform

Table. Available Field Types and their corresponding Values.

CAUTION - Please refer to the Arguments section for defining the parameters with key-value pairing

Input and output files are defined under INPUTS and OUTPUTS tabs, respectively. Eight fields are available under each of these parameters (Fig. 3).

CAUTION - Allowed delimiters are =, -, :, and ;

CAUTION - File extensions must be exact; even FASTQ and FQ are treated as distinct extensions.

 

Example      Input file name             Regular expression
Example 1    castor1_R1.fastq            R1
Example 2    castor1_R1_trimmed.fastq    R1_trimmed
Example 3    abcd_1.fastq                _1
Example 4    abcd_1_R.fastq              _1_R
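To see how the regular expressions in the examples above relate to the file names, the sketch below tries each pair with Python's re module. It assumes the expression is searched for anywhere in the file name; the platform's actual matching behaviour is not described here and may differ. The matches helper is hypothetical.

```python
import re

# (input file name, regular expression) pairs taken from the examples above.
examples = [
    ("castor1_R1.fastq", "R1"),
    ("castor1_R1_trimmed.fastq", "R1_trimmed"),
    ("abcd_1.fastq", "_1"),
    ("abcd_1_R.fastq", "_1_R"),
]


def matches(pattern, file_name):
    """Assumption: the expression may match anywhere in the file name."""
    return re.search(pattern, file_name) is not None


for file_name, pattern in examples:
    print(f"{pattern!r} found in {file_name!r}: {matches(pattern, file_name)}")
```

Under this assumption a short expression such as R1 would also be found in castor1_R1_trimmed.fastq, so the more specific expression (R1_trimmed) is the one that tells the trimmed file apart.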

(Example: ${sampleName}_trim.fastq for the Trimmomatic step). This helps track the files across the entire pipeline execution.
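The ${sampleName} placeholder in the example above is filled in per sample when the pipeline runs. How the platform expands it is not documented here; the sketch below only illustrates the idea using Python's string.Template, which happens to use the same ${...} syntax. The output_name helper and the sample names are made up for illustration.

```python
from string import Template

# Pattern taken from the example above.
output_pattern = Template("${sampleName}_trim.fastq")


def output_name(sample_name):
    """Fill the ${sampleName} placeholder for one sample (illustrative helper)."""
    return output_pattern.substitute(sampleName=sample_name)


for sample in ["castor1", "abcd"]:  # hypothetical sample names
    print(output_name(sample))
```

Because every output name is derived from the sample name, a later step can locate the file that belongs to each sample without a manual file selection.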

Fig. 3A. The Command Builder Inputs view.

 

Fig. 3B. The Command Builder Outputs view

 

Parameters that take the form of a key-value pair should be defined as arguments (Fig. 4). Arguments can be used for any parameter supported by the tools and for other required files (reference files, GTF or annotation files, target or hotspot files). They are defined by the following eight features:

CAUTION - Please refer to the Options section for defining the singleton parameters

Arguments are grouped into categories to support diverse tools and commands. In arguments, two fields (Type and Value) work together to define an argument.

CAUTION - Allowed delimiters are =, -, :, %, and
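The difference between options (single words) and arguments (key-value pairs joined by a delimiter) can be sketched as follows. This is a simplified illustration, not how the platform assembles commands: both helper functions are hypothetical, --threads is a made-up flag, and the delimiter set contains only the delimiters legible in the caution above, so it may be incomplete.

```python
# Only the delimiters legible in the caution above; the list may be incomplete.
LISTED_DELIMITERS = {"=", "-", ":", "%"}


def render_argument(key, delimiter, value):
    """Join a key-value pair with one of the listed delimiters."""
    if delimiter not in LISTED_DELIMITERS:
        raise ValueError(f"delimiter {delimiter!r} is not in the listed set")
    return f"{key}{delimiter}{value}"


def render_parameters(options, arguments):
    """Options are single words; arguments are (key, delimiter, value) triples."""
    for option in options:
        if len(option.split()) != 1:
            raise ValueError(f"option {option!r} is not a single word")
    parts = list(options)
    parts += [render_argument(key, delimiter, value) for key, delimiter, value in arguments]
    return " ".join(parts)


# "--threads" is a hypothetical flag used only for illustration.
print(render_parameters(["PE", "--ignore"], [("--threads", "=", "4")]))
```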

Fig. 4. The Command Builder arguments view.

 

Click on the bottom right corner to save the changes to the command. This completes the first step in the pipeline. Continue adding steps until the pipeline is complete. Steps can be dragged and dropped to any position with the icon. The step number, predecessor, and input source are automatically readjusted for all steps. Click to save the pipeline.
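The automatic readjustment after a drag and drop can be pictured with the sketch below. It is one possible interpretation, not the platform's logic: the dictionary keys are hypothetical, and it assumes that each step keeps the same predecessor steps and only their numbers change.

```python
def move_step(steps, old_index, new_index):
    """Move one step and renumber every step and predecessor reference.

    Each step is a dict with hypothetical keys: "number", "name",
    and "predecessors" (a list of step numbers).
    """
    reordered = list(steps)
    reordered.insert(new_index, reordered.pop(old_index))
    # Map each old step number to its new position (numbering starts at 1).
    renumber = {step["number"]: position for position, step in enumerate(reordered, start=1)}
    return [
        {
            **step,
            "number": renumber[step["number"]],
            "predecessors": sorted(renumber[p] for p in step["predecessors"]),
        }
        for step in reordered
    ]


steps = [
    {"number": 1, "name": "trim", "predecessors": []},
    {"number": 2, "name": "qc", "predecessors": [1]},
    {"number": 3, "name": "align", "predecessors": [1]},
    {"number": 4, "name": "call", "predecessors": [3]},
]
for step in move_step(steps, 2, 1):  # drag "align" up by one position
    print(step)
```

The sketch does not check whether a step ends up ahead of its own predecessor; a real implementation would need to reject or repair such a move.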

 


Revision #5
Created 27 January 2022 04:55:24 by Kshama
Updated 3 March 2022 03:51:29 by Kshama