SeqAn3
Quality

Contains the various quality score types.

Classes

class  seqan3::phred42
 Quality type for traditional Sanger and modern Illumina Phred scores (typical range).

class  seqan3::phred63
 Quality type for traditional Sanger and modern Illumina Phred scores (full range).

class  seqan3::phred68legacy
 Quality type for Solexa and deprecated Illumina formats.

class  seqan3::qualified< sequence_alphabet_t, quality_alphabet_t >
 Joins an arbitrary alphabet with a quality alphabet.

class  seqan3::quality_base< derived_type, size >
 A CRTP-base that refines seqan3::alphabet_base and is used by the quality alphabets.

interface  seqan3::quality_concept
 A concept that indicates whether an alphabet represents quality scores. In addition to the requirements for seqan3::alphabet_concept, the quality_concept introduces a requirement for conversion functions from and to a Phred score.

struct  seqan3::underlying_phred< alphabet_with_member_type >
 The internal phred type.
 

Helpers for seqan3::quality_concept

These functions and metafunctions expose member variables and types so that the type can model the seqan3::quality_concept.

template<typename alphabet_type >
using seqan3::underlying_phred_t = typename underlying_phred< alphabet_type >::type
 The internal phred type.

template<typename alphabet_type >
constexpr alphabet_type & seqan3::assign_phred (alphabet_type &chr, char const in)
 The public setter function of a phred score.

template<typename alphabet_type >
constexpr underlying_phred_t< alphabet_type > seqan3::to_phred (alphabet_type const &chr)
 The public getter function for the phred representation of a score.
 
template<typename alphabet_type >
constexpr alphabet_type seqan3::assign_phred (alphabet_type &&chr, char const in)
 

Detailed Description

Contains the various quality score types.

Introduction

Quality score sequences are usually output together with the DNA (or RNA) sequence by sequencing machines such as the Illumina Genome Analyzer. The quality score of a nucleotide is also known as its Phred score: an integer that is logarithmically related to the probability $p$ that the base call is incorrect. Roughly speaking, the higher the Phred score, the higher the probability that the corresponding nucleotide is correct at that position. There are two common variants of its computation: the Sanger variant, $Q = -10 \log_{10} p$, and the Solexa variant, $Q = -10 \log_{10} \frac{p}{1-p}$.
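The relation between a Phred score and the error probability can be sketched in a few lines. The following is a minimal illustration of the two well-known definitions (Sanger and Solexa), not part of SeqAn3; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Sanger variant: Q = -10 * log10(p)  =>  p = 10^(-Q / 10).
inline double sanger_error_probability(int phred)
{
    assert(phred >= 0); // a negative score would give p > 1 in this variant
    return std::pow(10.0, -phred / 10.0);
}

// Solexa variant: Q = -10 * log10(p / (1 - p))  =>  p = 1 / (1 + 10^(Q / 10)).
inline double solexa_error_probability(int phred)
{
    return 1.0 / (1.0 + std::pow(10.0, phred / 10.0));
}
```

Under the Sanger definition a score of 30 corresponds to an error probability of 0.001. The two definitions differ most for low scores, which is why the Solexa variant can take negative values.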

Encoding Schemes
| Format | Quality Type | Phred Score Range | Rank Range | ASCII Range | Assert |
|---|---|---|---|---|---|
| Sanger, Illumina 1.8+ short | seqan3::phred42 | [0 .. 41] | [0 .. 41] | ['!' .. 'J'] | Phred score in [0 .. 61] |
| Sanger, Illumina 1.8+ long | seqan3::phred63 | [0 .. 62] | [0 .. 62] | ['!' .. '_'] | Phred score in [0 .. 62] |
| Solexa, Illumina [1.0; 1.8[ | seqan3::phred68legacy | [-5 .. 62] | [0 .. 67] | [';' .. '~'] | Phred score in [-5 .. 62] |

The most widely used format is the Sanger or Illumina 1.8+ format. Although typical Phred scores for Illumina machines range from 0 to at most 41, processed reads can reach higher scores. If you don't intend to handle Phred scores larger than 41, we recommend using seqan3::phred42 because of its more space-efficient implementation. For other formats, such as Solexa and Illumina 1.0 to 1.7, the type seqan3::phred68legacy is provided. To also cover the Solexa format, its Phred score is stored as a signed integer starting at -5. An overview of all the score formats and their encodings can be found here: https://en.wikipedia.org/wiki/FASTQ_format#Encoding.
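To make the ASCII ranges in the table concrete, here is a small sketch of the offset arithmetic they imply. The helper names are hypothetical and the code is independent of SeqAn3; it only assumes the first characters and score ranges listed in the table.

```cpp
#include <cassert>

// Sanger / Illumina 1.8+: '!' (ASCII 33) encodes Phred 0.
inline char sanger_phred_to_char(int phred)
{
    assert(phred >= 0 && phred <= 62);
    return static_cast<char>('!' + phred);
}

inline int sanger_char_to_phred(char c)
{
    assert(c >= '!' && c <= '_');
    return c - '!';
}

// Solexa / Illumina 1.0 to 1.7: ';' (ASCII 59) encodes Phred -5.
inline char legacy_phred_to_char(int phred)
{
    assert(phred >= -5 && phred <= 62);
    return static_cast<char>(';' + (phred + 5));
}

inline int legacy_char_to_phred(char c)
{
    assert(c >= ';' && c <= '~');
    return (c - ';') - 5;
}
```

The same character decodes to different scores under the two schemes: 'J' is 41 with the Sanger offset but 10 with the legacy offset, so the format has to be known before a quality string is interpreted.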

Concept

The quality submodule defines the seqan3::quality_concept, which encompasses all the alphabets defined in the submodule and refines the seqan3::alphabet_concept by adding Phred score assignment and conversion operations.

Assignment and Conversion

Quality alphabets can be converted to their char and rank representation via to_char and to_rank respectively (like all other alphabets). Additionally they can be converted to their Phred representation via to_phred.

Likewise, assignment happens via assign_char, assign_rank and assign_phred. Phred values outside the representable range, but inside the legal range, are converted to the closest Phred score, e.g. assigning 60 to a seqan3::phred42 will result in a Phred score of 41. Assigning Phred values outside the legal range results in undefined behaviour.
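The clamping rule described above can be illustrated with a toy type. This is a sketch under stated assumptions, not the SeqAn3 implementation: `toy_phred42` and its members are hypothetical, and the legal upper bound of 61 is taken from the "Assert" column of the encoding table.

```cpp
#include <cassert>

// Hypothetical stand-in for a phred42-like letter; not the SeqAn3 class.
struct toy_phred42
{
    static constexpr int max_phred = 41; // largest representable score
    static constexpr int max_legal = 61; // assumed legal bound ("Assert" column)

    int phred = 0;

    toy_phred42 & assign_phred(int value)
    {
        // Outside the legal range the documented behaviour is undefined;
        // this sketch simply asserts instead.
        assert(value >= 0 && value <= max_legal);

        // Legal but not representable: map to the closest representable score.
        if (value > max_phred)
            value = max_phred;

        phred = value;
        return *this;
    }

    int to_phred() const { return phred; }
};
```

Assigning 60 to such a letter yields a stored score of 41, while values up to 41 are stored unchanged.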

All quality alphabets are explicitly convertible to each other via their Phred representation. Values not present in one alphabet are mapped to the closest value in the target alphabet (e.g. a seqan3::phred63 letter with value 60 will convert to a seqan3::phred42 letter of score 41).
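A conversion that goes through the Phred representation and maps to the closest value can be sketched as follows. The `toy_quality` template and the three aliases are hypothetical; they only mirror the Phred score ranges from the encoding table and say nothing about how SeqAn3 implements the conversion.

```cpp
// Hypothetical toy letter parameterised by its representable Phred range.
template <int min_v, int max_v>
struct toy_quality
{
    int phred;

    constexpr explicit toy_quality(int value = 0) : phred(nearest(value)) {}

    // Explicit conversion from any other toy letter via its Phred score.
    template <int other_min, int other_max>
    constexpr explicit toy_quality(toy_quality<other_min, other_max> const & other)
        : phred(nearest(other.phred)) {}

    // Closest representable score to `value`.
    static constexpr int nearest(int value)
    {
        return value < min_v ? min_v : (value > max_v ? max_v : value);
    }
};

using toy42       = toy_quality<0, 41>;
using toy63       = toy_quality<0, 62>;
using toy68legacy = toy_quality<-5, 62>;
```

With these definitions, `toy42{toy63{60}}` holds a score of 41, and a legacy letter with score -5 converts to a `toy42` letter with score 0.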

Typedef Documentation

◆ underlying_phred_t

template<typename alphabet_type >
using seqan3::underlying_phred_t = typename underlying_phred<alphabet_type>::type

The internal phred type.

Template Parameters
alphabet_type  The type of alphabet. Must model the seqan3::quality_concept.

The underlying_phred type requires the quality_concept.
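The metafunction-plus-alias pattern behind underlying_phred_t can be sketched as follows. This is an illustration only: the namespace `sketch`, the type `toy_quality`, and the member name `phred_type` are assumptions made for the example, and the concept requirement is omitted.

```cpp
#include <type_traits>

namespace sketch
{
// Hypothetical toy letter that exposes its Phred type as a member type.
struct toy_quality
{
    using phred_type = signed char;
};

// Metafunction: forwards the member type of the alphabet.
template <typename alphabet_with_member_type>
struct underlying_phred
{
    using type = typename alphabet_with_member_type::phred_type;
};

// Shortcut alias, mirroring the form of the typedef documented above.
template <typename alphabet_type>
using underlying_phred_t = typename underlying_phred<alphabet_type>::type;
} // namespace sketch

static_assert(std::is_same<sketch::underlying_phred_t<sketch::toy_quality>, signed char>::value,
              "the alias resolves to the member type");
```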

Function Documentation

◆ assign_phred()

template<typename alphabet_type >
constexpr alphabet_type & seqan3::assign_phred (alphabet_type & chr, char const in)

The public setter function of a phred score.

Template Parameters
alphabet_type  The type of alphabet. Must model the seqan3::quality_concept.
Parameters
[in] chr  The quality value to assign a score to.
[in] in  The character representing the Phred score.

The underlying_phred type requires the quality_concept.
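The module overview lists two overloads of assign_phred, one taking an lvalue reference and returning a reference, and one taking an rvalue reference and returning by value. A free-function wrapper pair of that shape could look like the sketch below. The namespace `sketch`, the type `toy_quality`, and the member function it forwards to are hypothetical, and the concept constraints are left out.

```cpp
#include <cassert>

namespace sketch
{
// Hypothetical toy letter with a member setter that the wrappers forward to.
struct toy_quality
{
    char phred = 0;

    toy_quality & assign_phred(char const in)
    {
        assert(in <= 62); // upper bound assumed from the encoding table
        phred = in;
        return *this;
    }
};

// lvalue overload: modifies the object in place and returns a reference to it.
template <typename alphabet_type>
alphabet_type & assign_phred(alphabet_type & chr, char const in)
{
    return chr.assign_phred(in);
}

// rvalue overload: accepts a temporary and returns the result by value.
template <typename alphabet_type>
alphabet_type assign_phred(alphabet_type && chr, char const in)
{
    return chr.assign_phred(in);
}
} // namespace sketch
```

For an lvalue argument the first overload is selected because it is the more specialised template, so the call returns a reference to the original object. A temporary can only bind to the second overload, which is what makes expressions such as `assign_phred(toy_quality{}, 7)` usable.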

◆ to_phred()

template<typename alphabet_type >
constexpr underlying_phred_t<alphabet_type> seqan3::to_phred (alphabet_type const & chr)

The public getter function for the phred representation of a score.

Template Parameters
alphabet_type  The type of alphabet. Must model the seqan3::quality_concept.
Parameters
[in] chr  The quality value to convert into the Phred score.

The underlying_phred type requires the quality_concept.