Fig. 3 | BMC Bioinformatics

Fig. 3

From: HPC-T-Assembly: a pipeline for de novo transcriptome assembly of large multi-specie datasets

The HPC-T-Assembly pipeline. The pipeline comprises three sequential stages and four parallel ones. The sequential part begins with quality control and pre-assembly filtering (trimming) of the raw reads (FASTQ files). It is followed by the de novo assembly stage, in which the trimmed reads are assembled to reconstruct full transcripts from nucleotide sequences and to generate a FASTA file of the transcriptome. The final two sequential steps, redundancy reduction and assembly thinning, minimize redundant or low-information sequences, optimizing the data for precise downstream analysis (clustering). The parallel stages are then applied to the final FASTA file (unigenes): assembly quality assessment, alignment of the raw data to the unigenes, transcript quantification, and open reading frame (ORF) prediction.
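
The control flow described in the caption (an ordered chain of steps, then independent analyses fanned out over the final FASTA file) can be sketched as follows. This is a minimal illustration only, not the HPC-T-Assembly implementation: the step names are paraphrased from the caption, `run_step` is a hypothetical placeholder that invokes no real tool, and the use of a thread pool is an assumption made for the sake of the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Step names paraphrased from the caption; they are labels, not real commands.
SEQUENTIAL_STEPS = [
    "quality_control_and_trimming",
    "de_novo_assembly",
    "redundancy_reduction",
    "assembly_thinning",
]
PARALLEL_STAGES = [
    "assembly_quality_assessment",
    "read_alignment_to_unigenes",
    "transcript_quantification",
    "orf_prediction",
]


def run_step(name, artifact):
    """Hypothetical placeholder: tag the artifact with the step that handled it."""
    return f"{artifact}>{name}"


def run_pipeline(raw_reads):
    # Sequential part: each step consumes the output of the previous one.
    artifact = raw_reads
    for step in SEQUENTIAL_STEPS:
        artifact = run_step(step, artifact)
    unigenes = artifact

    # Parallel part: every stage reads the same final artifact independently.
    with ThreadPoolExecutor(max_workers=len(PARALLEL_STAGES)) as pool:
        futures = {
            stage: pool.submit(run_step, stage, unigenes)
            for stage in PARALLEL_STAGES
        }
        reports = {stage: future.result() for stage, future in futures.items()}
    return unigenes, reports
```

Calling `run_pipeline("reads.fastq")` returns the placeholder for the final unigenes artifact together with one result per parallel stage. Because the parallel stages share only their input, they can be scheduled in any order once the sequential chain has finished.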