From the TaKaRa manual page:
Cogent NGS Analysis Pipeline (CogentAP) is bioinformatic software for analyzing RNA-seq NGS data generated using the following systems or kits:
- ICELL8 cx Single-Cell System or the ICELL8 Single-Cell System on the single-cell full-length transcriptome (SMART-Seq ICELL8 workflow)
- ICELL8 cx Single-Cell System or the ICELL8 Single-Cell System on the single-cell differential expression (3′ DE or 5′ DE) workflows (ICELL8 3′ DE or ICELL8 TCR)
- SMARTer Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian
The program takes input data from sequencing and outputs an HTML report with results typical of single-cell analysis, plus other files, such as a gene matrix, for further analysis. An R data object with pre-computed results based on recommended parameters is also output. Either the standard output files or the R data object can serve as input for Cogent NGS Discovery Software (CogentDS), another bioinformatic software package provided by Takara Bio. CogentAP software is written in Python and can be run via either a GUI or a command-line interface.
$COGENTAP_TEST_DATA/fdb/cogentap and linked into the install directory at the expected path.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --cpus-per-task=8 --mem=45g --gres=lscratch:20
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load cogentap
[user@cn3144]$ export CONDA_PKGS_DIRS=/data/$USER/.conda/pkgs
[user@cn3144]$ export CONDA_ENVS_PATH=/data/$USER/.conda/envs
[user@cn3144]$ export CONDA_ROOT=/data/$USER/.conda
[user@cn3144]$ cogent dna --help
usage: cogent dna [-h] {demux,analyze,postprocess,add_genome} ...
This submenu contains tools for processing DNA-Seq data.
positional arguments:
{demux,analyze,postprocess,add_genome}
demux De-multiplex barcoded reads from sequence data stored in FASTQ files.
analyze Perform Sequencing QC and CNV Analysis by fastq input data.
postprocess Additional analysis options that may be run run after "dna analyze" is completed.
add_genome Download and index a genome with preferred STAR (for RNA) or Bowtie2 (for DNA) parameters.
options:
-h, --help show this help message and exit
[user@cn3144]$ cp -r ${COGENTAP_TEST_DATA} .
[user@cn3144]$ cogent dna demux \
-i test/test_FL_R1.fastq.gz \
-p test/test_FL_R2.fastq.gz \
--barcodes_file test/99999_CogentAP_test_selected_WellList.TXT \
-t shasta_wga \
-o out \
-n $SLURM_CPUS_PER_TASK
###
### cogent 1.0
###
###
### cogent ≥1.5.0 - see the official manual for more differences
###
[user@cn3144]$ cogent dna analyze \
-i out/demultiplexed_fastqs/ -R 10000 -B 500kb -r 76bp \
-g hg38 -b test/99999_CogentAP_test_selected_WellList.TXT \
-o out/analysis -G $COGENTAP_GENOME_DATA \
-t shasta_wga
[user@cn3144]$ tree out
out
├ [user 4.0K] analysis
│ ├ [user 5.2K] analysis_analyzer.log
│ ├ [user 2.1M] analysis_genematrix.csv
│ ├ [user 1.0K] analysis_stats.csv
│ ├ [user 4.0K] cogent_ds
│ │ ├ [user 1.8M] CogentDS.analysis.rda
│ │ ├ [user 214K] CogentDS.boxplot.png
│ │ ├ [user 3.4K] CogentDS.cogent_ds.log
│ │ ├ [user 70] CogentDS.cor_stats.csv
│ │ ├ [user 164K] CogentDS.heatmap.png
│ │ ├ [user 1.7M] CogentDS.report.html
│ │ └ [user 157K] CogentDS.UMAP.png
│ ├ [user 4.0K] extras
│ │ ├ [user 2.1M] analysis_incl_introns_genematrix.csv
│ │ ├ [user 1.0K] analysis_incl_introns_stats.csv
│ │ └ [user 3.9M] gene_info_incl_introns.csv
│ ├ [user 3.8M] gene_info.csv
│ └ [user 4.0K] work
│ ├ [user 14M] analysis.Aligned.out.bam
│ ├ [user 2.0K] analysis.Log.final.out
│ ├ [user 488K] analysis.SJ.out.tab
│ └ [user 39] mito.gtf
├ [user 257] out_counts_all.csv
├ [user 20M] out_demuxed_R1.fastq
├ [user 20M] out_demuxed_R2.fastq
└ [user 1.3K] out_demuxer.log
[user@cn3144]$ mv out /data/$USER/
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
Note that for analyze, the values of --threads and --cores are multiplicative, so threads * cores should equal the number of allocated CPUs. Our examples use the default --cores setting of 1, which means threads = $SLURM_CPUS_PER_TASK. If you increase --cores to 2, reduce --threads to half the allocated CPUs.
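The arithmetic in the note above can be sketched in a few lines of shell. This is an illustration only, not part of the CogentAP documentation: the variable names cpus, cores, and threads are hypothetical, and the fallback of 8 CPUs is an assumption so that the snippet also runs outside a Slurm job.

```shell
# Split the CPUs allocated by Slurm between --cores and --threads.
# Variable names are illustrative; 8 is an assumed fallback for use outside a job.
cpus=${SLURM_CPUS_PER_TASK:-8}
cores=2                        # value intended for --cores (the default is 1)
threads=$(( cpus / cores ))    # integer division, so threads * cores never exceeds cpus
echo "cores=${cores} threads=${threads}"
```

The resulting values would then be passed to analyze as --cores "$cores" --threads "$threads". If the allocated CPU count is not divisible by the chosen number of cores, the integer division rounds down and leaves some CPUs unused.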
Create a batch input file (e.g. cogentap.sh) similar to the following:
#!/bin/bash
cd /lscratch/$SLURM_JOB_ID || exit 1
module load cogentap
cp -r ${COGENTAP_TEST_DATA:-none} .
export CONDA_PKGS_DIRS=/data/$USER/.conda/pkgs
export CONDA_ENVS_PATH=/data/$USER/.conda/envs
export CONDA_ROOT=/data/$USER/.conda
cogent dna demux \
-i test/test_FL_R1.fastq.gz \
-p test/test_FL_R2.fastq.gz \
--barcodes_file test/99999_CogentAP_test_selected_WellList.TXT \
-t shasta_wga \
-o out \
-n $SLURM_CPUS_PER_TASK
cogent dna analyze \
-i out/demultiplexed_fastqs/ \
-R 10000 \
-g hg38 \
-B 500kb \
-r 76bp \
-b test/99999_CogentAP_test_selected_WellList.TXT \
-G $COGENTAP_GENOME_DATA \
-o out/analysis \
-t shasta_wga
# move the results out of local scratch, as in the interactive session
mv out /data/$USER/
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=8 --mem=30g --gres=lscratch:20 cogentap.sh