OpenMS
Helper class for calculations on decoy proteins.
#include <OpenMS/DATASTRUCTURES/FASTAContainer.h>
Classes
struct DecoyStatistics
    Struct for intermediate results needed for calculations on decoy proteins.
struct Result

Static Public Member Functions
template<typename T> static Result findDecoyString(FASTAContainer<T>& proteins)
    Heuristic to determine the decoy string given a set of protein names.
template<typename T> static DecoyStatistics countDecoys(FASTAContainer<T>& proteins)
    Function to count the occurrences of decoy strings in a given set of protein names.

Static Public Attributes
static const std::vector<std::string> affixes = { "decoy", "dec", "reverse", "rev", "reversed", "__id_decoy", "xxx", "shuffled", "shuffle", "pseudo", "random" }
static const std::string regexstr_prefix = std::string("^(") + ListUtils::concatenate<std::string>(affixes, "_*|") + "_*)"
static const std::string regexstr_suffix = std::string("(_") + ListUtils::concatenate<std::string>(affixes, "*|_") + ")$"

Private Types
using DecoyStringToAffixCount = std::unordered_map<std::string, std::pair<Size, Size>>
using CaseInsensitiveToCaseSensitiveDecoy = std::unordered_map<std::string, std::string>
Helper class for calculations on decoy proteins.
countDecoys() [inline, static]
Function to count the occurrences of decoy strings in a given set of protein names.
For tested decoy strings see DecoyHelper::affixes. Returns all data needed for interpretation (see DecoyHelper::DecoyStatistics).
References DecoyHelper::DecoyStatistics::all_prefix_occur, DecoyHelper::DecoyStatistics::all_proteins_count, DecoyHelper::DecoyStatistics::all_suffix_occur, DecoyHelper::DecoyStatistics::decoy_case_sensitive, DecoyHelper::DecoyStatistics::decoy_count, OpenMS::StringUtils::prefix(), DecoyHelper::regexstr_prefix, DecoyHelper::regexstr_suffix, OpenMS::StringUtils::suffix(), and String::toLower().
Referenced by DecoyHelper::findDecoyString().
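The counting step described above can be illustrated with a minimal standalone sketch. It does not use OpenMS and is not the library's implementation: `DecoyStatisticsSketch`, `toLowerCopy` and `countDecoysSketch` are hypothetical names, the field types are assumptions modelled on the private typedefs, and the patterns cover only three of the affixes, written in the form that `regexstr_prefix` and `regexstr_suffix` expand to.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for DecoyHelper::DecoyStatistics. Only the field
// names are taken from the documentation; the field types are assumptions.
struct DecoyStatisticsSketch
{
  // lower-cased decoy string -> (prefix occurrences, suffix occurrences)
  std::unordered_map<std::string, std::pair<std::size_t, std::size_t>> decoy_count;
  // lower-cased decoy string -> spelling as first seen in an identifier
  std::unordered_map<std::string, std::string> decoy_case_sensitive;
  std::size_t all_prefix_occur = 0;
  std::size_t all_suffix_occur = 0;
  std::size_t all_proteins_count = 0;
};

// Returns a lower-cased copy of s (ASCII only).
inline std::string toLowerCopy(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Counts how often a decoy string appears as prefix or suffix of the identifiers.
inline DecoyStatisticsSketch countDecoysSketch(const std::vector<std::string>& identifiers)
{
  // Shortened patterns (assumption: three affixes instead of the full list).
  static const std::regex prefix_re("^(decoy_*|rev_*|xxx_*)");
  static const std::regex suffix_re("(_decoy*|_rev*|_xxx)$");

  DecoyStatisticsSketch stats;
  for (const std::string& id : identifiers)
  {
    ++stats.all_proteins_count;
    const std::string lower = toLowerCopy(id); // match case-insensitively
    std::smatch m;
    if (std::regex_search(lower, m, prefix_re))
    {
      const std::string key = m.str(0);
      ++stats.all_prefix_occur;
      ++stats.decoy_count[key].first;
      // keep the original spelling of the matched prefix
      stats.decoy_case_sensitive.emplace(key, id.substr(0, key.size()));
    }
    if (std::regex_search(lower, m, suffix_re))
    {
      const std::string key = m.str(0);
      ++stats.all_suffix_occur;
      ++stats.decoy_count[key].second;
      // keep the original spelling of the matched suffix
      stats.decoy_case_sensitive.emplace(key, id.substr(id.size() - key.size()));
    }
  }
  return stats;
}
```

Calling `countDecoysSketch` on a list of protein identifiers yields per-string prefix and suffix counts plus the totals, which is the kind of data a caller would need to decide which decoy string, if any, dominates.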
findDecoyString() [inline, static]
Heuristic to determine the decoy string given a set of protein names.
For tested decoy strings see DecoyHelper::affixes. Both prefix and suffix positions are tested; if one of the candidate strings is found in at least 40% of all proteins, it is returned as the winner (see DecoyHelper::Result).
References DecoyHelper::DecoyStatistics::all_prefix_occur, DecoyHelper::DecoyStatistics::all_proteins_count, DecoyHelper::DecoyStatistics::all_suffix_occur, DecoyHelper::countDecoys(), DecoyHelper::DecoyStatistics::decoy_case_sensitive, DecoyHelper::DecoyStatistics::decoy_count, OPENMS_LOG_DEBUG, OPENMS_LOG_ERROR, and OPENMS_LOG_WARN.
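One possible reading of the 40% rule is sketched below as standalone C++. This is an illustration under assumptions, not the OpenMS code: `DecoyGuessSketch`, `AffixCountSketch` and `pickDecoyStringSketch` are hypothetical names, the fields of the real DecoyHelper::Result are not listed on this page, and the tie-breaking between candidates is a choice made for this sketch only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical result type (assumption; not DecoyHelper::Result).
struct DecoyGuessSketch
{
  bool found = false;       // true if some candidate reached the threshold
  std::string decoy_string; // the winning candidate, empty if none
  bool is_prefix = true;    // whether the winner was seen mostly as a prefix
};

// decoy string -> (prefix occurrences, suffix occurrences)
using AffixCountSketch = std::unordered_map<std::string, std::pair<std::size_t, std::size_t>>;

// Returns the candidate with the most occurrences, provided it is found in at
// least 40% of all proteins. With exact ties, the winner depends on the map's
// iteration order, which is unspecified.
inline DecoyGuessSketch pickDecoyStringSketch(const AffixCountSketch& counts,
                                              std::size_t all_proteins_count)
{
  DecoyGuessSketch best;
  std::size_t best_hits = 0;
  if (all_proteins_count == 0)
  {
    return best; // nothing to decide on
  }
  for (const auto& entry : counts)
  {
    const std::size_t prefix_hits = entry.second.first;
    const std::size_t suffix_hits = entry.second.second;
    const std::size_t hits = std::max(prefix_hits, suffix_hits);
    // Integer form of hits / all_proteins_count >= 0.40
    const bool reaches_threshold = hits * 100 >= all_proteins_count * 40;
    if (reaches_threshold && hits > best_hits)
    {
      best.found = true;
      best.decoy_string = entry.first;
      best.is_prefix = prefix_hits >= suffix_hits;
      best_hits = hits;
    }
  }
  return best;
}
```

With counts of 2 prefix hits for `decoy_` and 1 suffix hit for `_rev` among 4 proteins, `decoy_` reaches 50% and wins; among 10 proteins neither candidate reaches 40%, so no decoy string is reported.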
affixes [inline, static]
regexstr_prefix [inline, static]
Referenced by DecoyHelper::countDecoys().
regexstr_suffix [inline, static]
Referenced by DecoyHelper::countDecoys().
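To see what the two regular expressions listed under Static Public Attributes expand to, the construction can be reproduced in a small standalone sketch. `joinWithGlueSketch` is a hypothetical helper assumed to behave like ListUtils::concatenate (elements joined with the glue string between them); `buildPrefixPatternSketch` and `buildSuffixPatternSketch` are likewise illustrative names that mirror the documented initializer expressions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for ListUtils::concatenate (assumption): joins the
// items, placing glue between consecutive elements only.
inline std::string joinWithGlueSketch(const std::vector<std::string>& items,
                                      const std::string& glue)
{
  std::string out;
  for (std::size_t i = 0; i < items.size(); ++i)
  {
    if (i > 0)
    {
      out += glue;
    }
    out += items[i];
  }
  return out;
}

// Mirrors the documented initializer of regexstr_prefix.
inline std::string buildPrefixPatternSketch(const std::vector<std::string>& affixes)
{
  return std::string("^(") + joinWithGlueSketch(affixes, "_*|") + "_*)";
}

// Mirrors the documented initializer of regexstr_suffix.
inline std::string buildSuffixPatternSketch(const std::vector<std::string>& affixes)
{
  return std::string("(_") + joinWithGlueSketch(affixes, "*|_") + ")$";
}
```

For the two-element list `{"decoy", "rev"}` this gives `^(decoy_*|rev_*)` and `(_decoy*|_rev)$`. In the suffix pattern the `*` binds to the last character of each affix except the final one, because the glue is only inserted between elements.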