Class
MarkovModel
Gives a suitable representation of a Markov chain.
Defined in | <seqan/statistics.h> |
---|---|
Signature | template <typename TAlphabet[, typename TFloat[, typename TSpec]]> class MarkovModel; |
Template Parameters
TAlphabet | The type of the underlying alphabet. |
---|---|
TFloat | The type for storing counts, default is double. |
TSpec | Tag for specialization. |
Member Function Overview
- void MarkovModel::build(stringSet);
  Compute the transition matrix from a training set.
- TFloat MarkovModel::emittedProbability(s); TFloat MarkovModel::emittedProbability(ss);
  Compute the probability that a string or a set of strings is emitted by the MarkovModel.
- MarkovModel::MarkovModel(order);
  Constructor.
- void MarkovModel::read(file);
  Load an instance of MarkovModel from a file.
- void MarkovModel::set(transition[, stationaryDistribution]);
  Set the transition matrix.
- void MarkovModel::write(file);
  Store an instance of MarkovModel in a file.
Member Variable Overview
- unsigned MarkovModel::order
  The order of the MarkovModel.
- TVector MarkovModel::stationaryDistribution
  The vector of character distributions (String of TFloat).
- TMatrix MarkovModel::transition
  The transition matrix.
Detailed Description
Examples
Build a MarkovModel from Background
#include <iostream>
#include <fstream>
#include <seqan/index.h>
#include <seqan/statistics.h>
#include <seqan/seq_io.h>
using namespace seqan;
int main()
{
    // Build path to background FASTA file.
    CharString bgPath = SEQAN_PATH_TO_ROOT();
    append(bgPath, "/demos/statistics/background.fa");
    // Read the background from a file into X.
    StringSet<DnaString> X;
    SeqFileIn seqFile;
    if (!open(seqFile, toCString(bgPath)))
    {
        std::cerr << "ERROR: Could not open " << bgPath << "\n";
        return 1;
    }
    StringSet<CharString> ids;  // will be ignored
    readRecords(ids, X, seqFile);
    // Create MarkovModel of order 3 from the background.
    MarkovModel<Dna> mm(3);
    buildMarkovModel(mm, X);
    // Build set of words that we want to compute the zscore of.
    StringSet<DnaString> W;
    appendValue(W, "CCCAAAGC");
    appendValue(W, "CCCAAAGTAAATT");
    // Compute and print zscore.
    std::cout << "zscore: " << zscore(W, X, mm, AhoCorasick()) << "\n";
    // TODO: this path has to be set explicitly when calling the demo
    // FILE *fd = fopen("projects/library/demos/zscore_human_mm.3","r");
    // read(fd, mm);
    // fclose(fd);
    // std::cout << zscore(W, X, mm, WuManber()) << std::endl;
    return 0;
}
This example shows how to build a MarkovModel over the Dna alphabet from a set of background sequences. After building the model, we compute the zscore.
zscore: 11.8323
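For orientation, a z-score in the usual statistical sense standardizes an observed word count by its expectation and variance under the background model. The symbols below are illustrative and are not taken from the SeqAn sources; the exact estimators used by zscore may differ:

```latex
z = \frac{N_{\mathrm{obs}} - \mathbb{E}[N]}{\sqrt{\operatorname{Var}[N]}}
```

Here N_obs stands for the number of occurrences of the word set in the text, and the expectation and variance are taken under the Markov model.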
Load a MarkovModel from File
We can also load the MarkovModel from a file (previously saved using write). Since we do not have the background word set here but only the model, we compute the variance of a word using the function calculateVariance from the alignment_free module.
#include <cstdio>
#include <iostream>
#include <fstream>
#include <seqan/index.h>
#include <seqan/alignment_free.h>
#include <seqan/statistics.h>
#include <seqan/seq_io.h>
using namespace seqan;
int main()
{
    // Build path to serialized MarkovModel.
    CharString mmPath = SEQAN_PATH_TO_ROOT();
    append(mmPath, "/demos/statistics/zscore_example_mm.3");
    // Open the file.
    FILE * mmFile = fopen(toCString(mmPath), "rb");
    if (!mmFile)
    {
        std::cerr << "ERROR: Could not open " << mmPath << "\n";
        return 1;
    }
    // Create MarkovModel of order 3 and load it from the file.
    MarkovModel<Dna> mm(3);
    read(mmFile, mm);
    fclose(mmFile);  // close file again
    // The word that we want to compute the variance of.
    DnaString word = "CCCAAAGC";
    // Compute variance.
    double variance = 0;
    int n = 10000;  // assumed text length
    calculateVariance(variance, word, mm, n);
    std::cout << "variance: " << variance << "\n";
    return 0;
}
variance: 0.267919
Member Functions Detail
void MarkovModel::build(stringSet);
Compute the transition matrix from a training set.
Parameters
stringSet | The StringSet to build the model for. |
---|---|
The stationary character distribution and the auxiliary information that give rise to an instance of MarkovModel are also computed.
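To make the idea of computing a transition matrix from a training set concrete, here is a minimal standard-library sketch that does not use SeqAn. The names dnaRank and estimateTransitions are hypothetical, and the counting scheme (relative frequencies of the character following each context of length order) is an assumption about what such a build step typically does, not a description of SeqAn's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rank of a DNA character (A, C, G, T -> 0..3), -1 otherwise.
inline int dnaRank(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

// Hypothetical function: estimate an order-k transition matrix by counting.
// Row r is the base-4 rank of a k-mer context, column c the next character.
std::vector<std::vector<double>>
estimateTransitions(std::vector<std::string> const & sequences, unsigned order)
{
    assert(order >= 1 && order <= 10);
    std::size_t const numContexts = std::size_t(1) << (2 * order);  // 4^order rows
    std::vector<std::vector<double>> matrix(numContexts, std::vector<double>(4, 0.0));

    for (std::string const & seq : sequences)
    {
        // Each window of order + 1 characters contributes one observation.
        for (std::size_t i = 0; i + order < seq.size(); ++i)
        {
            std::size_t context = 0;
            bool valid = true;
            for (std::size_t j = 0; j < order; ++j)
            {
                int r = dnaRank(seq[i + j]);
                if (r < 0) { valid = false; break; }
                context = context * 4 + static_cast<std::size_t>(r);
            }
            int next = dnaRank(seq[i + order]);
            if (valid && next >= 0)
                matrix[context][next] += 1.0;
        }
    }

    // Normalize each row to relative frequencies; unseen contexts stay all-zero.
    for (auto & row : matrix)
    {
        double sum = row[0] + row[1] + row[2] + row[3];
        if (sum > 0.0)
            for (double & v : row)
                v /= sum;
    }
    return matrix;
}
```

For an order-3 model over Dna this yields a 64 x 4 matrix. Leaving unseen contexts as all-zero rows is a simplification of this sketch; a real implementation may smooth or otherwise handle them.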
TFloat MarkovModel::emittedProbability(s);
TFloat MarkovModel::emittedProbability(ss);
Computes the probability that a string or a set of strings is emitted by the MarkovModel.
Parameters
s | The String to compute the emission probability for. |
---|---|
ss | The StringSet to compute the emission probability for. |
Returns
TFloat | The emission probability; TFloat is the floating-point type of the MarkovModel. |
---|---|
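As an illustration of what an emission probability means for a Markov chain, the following standard-library sketch handles the order-1 case: the probability of the first character is taken from an initial distribution, and each further character contributes one transition probability. The names Matrix4, dnaRank and emissionProbability are hypothetical, and how SeqAn treats the first characters for higher orders or for a set of strings is not specified here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

using Matrix4 = std::array<std::array<double, 4>, 4>;

// Hypothetical helper: rank of a DNA character (A, C, G, T -> 0..3), -1 otherwise.
inline int dnaRank(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

// Hypothetical function: probability that an order-1 chain emits s,
// i.e. initial[s[0]] * product over i of transition[s[i-1]][s[i]].
double emissionProbability(std::string const & s,
                           Matrix4 const & transition,
                           std::array<double, 4> const & initial)
{
    if (s.empty())
        return 1.0;  // the empty product

    int prev = dnaRank(s[0]);
    assert(prev >= 0);
    double p = initial[prev];

    for (std::size_t i = 1; i < s.size(); ++i)
    {
        int cur = dnaRank(s[i]);
        assert(cur >= 0);
        p *= transition[prev][cur];
        prev = cur;
    }
    return p;
}
```

For long strings the product underflows quickly, so practical code often accumulates log-probabilities instead; the plain product is kept here for readability.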
MarkovModel::MarkovModel(order);
Constructor
Parameters
order | The order of the model (unsigned). |
---|---|
void MarkovModel::read(file);
Load an instance of MarkovModel from a file.
Parameters
file | The file to read the model from (type FILE *). |
---|---|
void MarkovModel::set(transition[, stationaryDistribution]);
Set transition matrix.
Parameters
transition | The transition matrix. |
---|---|
stationaryDistribution | The vector of character distributions. |
Given a transition matrix, sets it as the transition matrix of the MarkovModel and computes the vector of character distributions (if it is not provided) and the auxiliary information.
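When only a transition matrix is supplied, a stationary distribution has to be derived from it. One common way to do this is power iteration, sketched below with the standard library only. The function name stationaryByPowerIteration, the tolerance and the iteration cap are assumptions of this sketch; the method SeqAn uses internally may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical function: approximate the stationary distribution pi of a
// row-stochastic matrix P (pi = pi * P) by repeated multiplication.
std::vector<double>
stationaryByPowerIteration(std::vector<std::vector<double>> const & P,
                           unsigned maxIterations = 10000,
                           double tolerance = 1e-12)
{
    std::size_t const n = P.size();
    assert(n > 0);

    // Start from the uniform distribution.
    std::vector<double> pi(n, 1.0 / static_cast<double>(n));

    for (unsigned it = 0; it < maxIterations; ++it)
    {
        // next = pi * P
        std::vector<double> next(n, 0.0);
        for (std::size_t i = 0; i < n; ++i)
        {
            assert(P[i].size() == n);
            for (std::size_t j = 0; j < n; ++j)
                next[j] += pi[i] * P[i][j];
        }

        // L1 distance between successive iterates.
        double diff = 0.0;
        for (std::size_t j = 0; j < n; ++j)
            diff += std::fabs(next[j] - pi[j]);

        pi.swap(next);
        if (diff < tolerance)
            break;
    }
    return pi;
}
```

Power iteration converges for irreducible, aperiodic chains; for other chains the result depends on the starting vector, which is why this is only a sketch.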
void MarkovModel::write(file);
Store an instance of MarkovModel in a file.
Parameters
file | The file to write the model to (type FILE *). |
---|---|
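Both read and write take a C FILE * handle. The following standard-library sketch shows the general pattern of round-tripping a square matrix through such a handle. The functions writeMatrix and readMatrix and the text layout they use are hypothetical and do not reflect SeqAn's actual file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical function: write an n x n matrix as text, the dimension first,
// then one row per line. "%.17g" preserves a double exactly on round trip.
bool writeMatrix(std::FILE * file, std::vector<std::vector<double>> const & m)
{
    if (std::fprintf(file, "%zu\n", m.size()) < 0)
        return false;
    for (auto const & row : m)
    {
        assert(row.size() == m.size());
        for (double v : row)
            if (std::fprintf(file, "%.17g ", v) < 0)
                return false;
        if (std::fprintf(file, "\n") < 0)
            return false;
    }
    return true;
}

// Hypothetical function: read a matrix written by writeMatrix.
bool readMatrix(std::FILE * file, std::vector<std::vector<double>> & m)
{
    std::size_t n = 0;
    if (std::fscanf(file, "%zu", &n) != 1)
        return false;
    m.assign(n, std::vector<double>(n, 0.0));
    for (auto & row : m)
        for (double & v : row)
            if (std::fscanf(file, "%lf", &v) != 1)
                return false;
    return true;
}
```

As in the SeqAn example above, the caller owns the handle: open it with fopen, check it for null, and close it with fclose after reading or writing.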
Member Variables Detail
unsigned MarkovModel::order
The order of the MarkovModel.
TVector MarkovModel::stationaryDistribution
The vector of character distributions (String of TFloat).
TMatrix MarkovModel::transition
The transition matrix.