OpenMS
Algorithm class that implements simple protein inference by aggregating peptide scores. Its parameters control the aggregation method, when peptidoforms are distinguished, and whether shared peptides are used ("use_shared_peptides"). First, the best PSM per spectrum is selected; then only the best PSM per peptidoform is aggregated. Peptidoforms can optionally be distinguished via the treat_X_separate parameters.
#include <OpenMS/ANALYSIS/ID/BasicProteinInferenceAlgorithm.h>
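The effect of "use_shared_peptides" can be illustrated with a minimal, self-contained sketch. It does not use the OpenMS types: `PeptideEvidence` and `inferProteinScores` are hypothetical names, and the choice of "best" aggregation with higher-is-better scores is an assumption made only for illustration.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one best-scoring peptidoform and the proteins it maps to.
struct PeptideEvidence
{
  std::string peptidoform;             // sequence, optionally with modifications
  double score;                        // assumed: higher is better
  std::vector<std::string> accessions; // proteins that contain this peptide
};

// Aggregates peptide scores per protein by keeping the best (maximum) score.
// If use_shared_peptides is false, peptides mapping to more than one protein are skipped.
std::map<std::string, double> inferProteinScores(const std::vector<PeptideEvidence>& peptides,
                                                 bool use_shared_peptides)
{
  std::map<std::string, double> protein_scores;
  for (const PeptideEvidence& pep : peptides)
  {
    if (!use_shared_peptides && pep.accessions.size() > 1) continue;
    for (const std::string& acc : pep.accessions)
    {
      auto it = protein_scores.find(acc);
      if (it == protein_scores.end()) protein_scores.emplace(acc, pep.score);
      else it->second = std::max(it->second, pep.score);
    }
  }
  return protein_scores;
}
```

With shared peptides enabled, a high-scoring peptide that maps to two proteins raises the score of both; with the option disabled, each protein is scored from its unique peptides only.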
Public Types
enum class | AggregationMethod { PROD, SUM, BEST }
The aggregation method.
typedef std::unordered_map< std::string, std::map< Int, PeptideHit * > > | SequenceToChargeToPSM
Public Types inherited from ProgressLogger
enum | LogType { CMD, GUI, NONE }
Possible log types.
Public Member Functions
BasicProteinInferenceAlgorithm ()
Default constructor.
void | run (std::vector< PeptideIdentification > &pep_ids, std::vector< ProteinIdentification > &prot_ids) const
void | run (std::vector< PeptideIdentification > &pep_ids, ProteinIdentification &prot_id) const
void | run (ConsensusMap &cmap, ProteinIdentification &prot_id, bool include_unassigned) const
Public Member Functions inherited from DefaultParamHandler
DefaultParamHandler (const String &name)
Constructor with name that is displayed in error messages.
DefaultParamHandler (const DefaultParamHandler &rhs)
Copy constructor.
virtual | ~DefaultParamHandler ()
Destructor.
DefaultParamHandler & | operator= (const DefaultParamHandler &rhs)
Assignment operator.
virtual bool | operator== (const DefaultParamHandler &rhs) const
Equality operator.
void | setParameters (const Param &param)
Sets the parameters.
const Param & | getParameters () const
Non-mutable access to the parameters.
const Param & | getDefaults () const
Non-mutable access to the default parameters.
const String & | getName () const
Non-mutable access to the name.
void | setName (const String &name)
Mutable access to the name.
const std::vector< String > & | getSubsections () const
Non-mutable access to the registered subsections.
Public Member Functions inherited from ProgressLogger
ProgressLogger ()
Constructor.
virtual | ~ProgressLogger ()
Destructor.
ProgressLogger (const ProgressLogger &other)
Copy constructor.
ProgressLogger & | operator= (const ProgressLogger &other)
Assignment operator.
void | setLogType (LogType type) const
Sets the progress log that should be used. The default type is NONE!
LogType | getLogType () const
Returns the type of progress log being used.
void | setLogger (ProgressLoggerImpl *logger)
Sets the logger to be used for progress logging.
void | startProgress (SignedSize begin, SignedSize end, const String &label) const
Initializes the progress display.
void | setProgress (SignedSize value) const
Sets the current progress.
void | endProgress (UInt64 bytes_processed=0) const
void | nextProgress () const
Increments the progress by 1 (according to the range begin-end).
Private Types
typedef double(* | fptr) (double, double)
get lambda function to aggregate scores
Private Member Functions
void | processRun_ (std::unordered_map< std::string, std::pair< ProteinHit *, Size >> &acc_to_protein_hitP_and_count, SequenceToChargeToPSM &best_pep, ProteinIdentification &prot_run, std::vector< PeptideIdentification > &pep_ids) const
Performs simple aggregation-based inference on one protein run.
void | aggregatePeptideScores_ (SequenceToChargeToPSM &best_pep, std::vector< PeptideIdentification > &pep_ids, const String &overall_score_type, bool higher_better, const std::string &run_id) const
fills and updates the map of best peptide scores best_pep (by sequence or modified sequence, depending on algorithm settings)
void | updateProteinScores_ (std::unordered_map< std::string, std::pair< ProteinHit *, Size >> &acc_to_protein_hitP_and_count, const SequenceToChargeToPSM &best_pep, bool pep_scores, bool higher_better) const
aggregates and updates protein scores based on aggregation settings and aggregated peptide level results in prefilled best_pep
AggregationMethod | aggFromString_ (const std::string &method_string) const
get the AggregationMethod enum from a method_string
void | checkCompat_ (const String &score_type, const AggregationMethod &aggregation_method) const
double | getInitScoreForAggMethod_ (const AggregationMethod &aggregation_method, bool higher_better) const
get the initial score value based on the chosen aggregation_method; higher_better is needed for the "best" score
fptr | aggFunFromEnum_ (const BasicProteinInferenceAlgorithm::AggregationMethod &agg_method, bool higher_better) const
Additional Inherited Members
Static Public Member Functions inherited from DefaultParamHandler
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="")
Writes all parameters to meta values.
Protected Member Functions inherited from DefaultParamHandler
virtual void | updateMembers_ ()
This method is used to update extra member variables at the end of the setParameters() method.
void | defaultsToParam_ ()
Updates the parameters after the defaults have been set in the constructor.
Protected Attributes inherited from DefaultParamHandler
Param | param_
Container for current parameters.
Param | defaults_
Container for default parameters. This member should be filled in the constructor of derived classes!
std::vector< String > | subsections_
Container for registered subsections. This member should be filled in the constructor of derived classes!
String | error_name_
Name that is displayed in error messages during the parameter checking.
bool | check_defaults_
If this member is set to false, no parameter checking is done.
bool | warn_empty_defaults_
If this member is set to false, no warning is emitted when the defaults are empty.
Protected Attributes inherited from ProgressLogger
LogType | type_
time_t | last_invoke_
ProgressLoggerImpl * | current_logger_
Static Protected Attributes inherited from ProgressLogger
static int | recursion_depth_
typedef double(* fptr) (double, double) | private
get lambda function to aggregate scores
typedef std::unordered_map<std::string, std::map<Int, PeptideHit*> > SequenceToChargeToPSM
enum class AggregationMethod | strong
The aggregation method.
BasicProteinInferenceAlgorithm ()
Default constructor.
AggregationMethod aggFromString_ (const std::string &method_string) const | private
Gets the AggregationMethod enum from a method_string.
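A string-to-enum lookup of this kind can be sketched as follows. This is a standalone illustration, not the OpenMS implementation: the free function `aggFromString` is hypothetical, and the accepted strings ("product", "sum", "best") as well as the exception type are assumptions.

```cpp
#include <stdexcept>
#include <string>

enum class AggregationMethod { PROD, SUM, BEST };

// Hypothetical free-function analogue of aggFromString_.
// The accepted strings are assumptions made for illustration.
AggregationMethod aggFromString(const std::string& method_string)
{
  if (method_string == "product") return AggregationMethod::PROD;
  if (method_string == "sum") return AggregationMethod::SUM;
  if (method_string == "best") return AggregationMethod::BEST;
  throw std::invalid_argument("Unknown aggregation method: " + method_string);
}
```

Rejecting unknown strings early keeps a misspelled parameter value from silently selecting a default aggregation.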
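fptr aggFunFromEnum_ (const BasicProteinInferenceAlgorithm::AggregationMethod &agg_method, bool higher_better) const | private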
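void aggregatePeptideScores_ (SequenceToChargeToPSM &best_pep, std::vector< PeptideIdentification > &pep_ids, const String &overall_score_type, bool higher_better, const std::string &run_id) const | private
Fills and updates the map of best peptide scores best_pep (by sequence or modified sequence, depending on algorithm settings).
Parameters
best_pep | (modified) sequence to charge to pointer of the best PSM (PeptideHit*)
pep_ids | the spectra with PSMs
overall_score_type | the predetermined score type name; an error is raised if mixed types occur
higher_better | whether a higher value is better for this score type
run_id | only process peptides associated with this run_id (e.g., the getIdentifier() of the ProteinIdentification run)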
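The nested map from sequence to charge to best PSM can be sketched without OpenMS as shown below. `Hit`, `SequenceToChargeToHit`, `updateBestHits`, and the flag `treat_charge_separately` are hypothetical names for illustration; filing all hits under charge 0 when charge is not considered follows the description of best_pep on this page.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a PSM; not the OpenMS PeptideHit class.
struct Hit
{
  std::string sequence; // (modified) peptide sequence
  int charge;
  double score;
};

using SequenceToChargeToHit = std::unordered_map<std::string, std::map<int, Hit*>>;

// Keeps, per sequence and charge, a pointer to the best-scoring hit.
// If charge states are not treated separately, all hits are filed under charge 0.
void updateBestHits(SequenceToChargeToHit& best_pep, std::vector<Hit>& hits,
                    bool higher_better, bool treat_charge_separately)
{
  for (Hit& hit : hits)
  {
    const int charge_key = treat_charge_separately ? hit.charge : 0;
    Hit*& best = best_pep[hit.sequence][charge_key]; // nullptr on first encounter
    const bool improves = (best == nullptr) ||
                          (higher_better ? hit.score > best->score : hit.score < best->score);
    if (improves) best = &hit;
  }
}
```

Because the map stores raw pointers into `hits`, the vector must not be resized or destroyed while the map is still in use.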
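void checkCompat_ (const String &score_type, const AggregationMethod &aggregation_method) const | private
Checks whether a score_type is compatible with the chosen aggregation_method, i.e., only probabilities can be used for multiplication.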
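A compatibility check of this kind might look like the sketch below. It is a standalone illustration under stated assumptions: the free function `checkCompat`, the score type names it recognizes as probabilities, and the exception type are all hypothetical and not taken from the OpenMS source.

```cpp
#include <stdexcept>
#include <string>

enum class AggregationMethod { PROD, SUM, BEST };

// Hypothetical analogue of checkCompat_: multiplication is only meaningful for probabilities.
// The score type names tested here are assumptions made for illustration.
void checkCompat(const std::string& score_type, AggregationMethod aggregation_method)
{
  const bool is_probability = (score_type == "Posterior Probability" ||
                               score_type == "Posterior Error Probability");
  if (aggregation_method == AggregationMethod::PROD && !is_probability)
  {
    throw std::invalid_argument("Score type '" + score_type + "' cannot be multiplied.");
  }
}
```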
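double getInitScoreForAggMethod_ (const AggregationMethod &aggregation_method, bool higher_better) const | private
Gets the initial score value based on the chosen aggregation_method; higher_better is needed for the "best" score.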
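The interplay between an initial score and a two-argument aggregation function (the role of the fptr typedef) can be sketched as follows. The initial values used here are the neutral elements of each operation, which is an assumption rather than something confirmed by this page; `initScore` and `aggFun` are hypothetical names.

```cpp
#include <algorithm>
#include <limits>

enum class AggregationMethod { PROD, SUM, BEST };

using fptr = double (*)(double, double);

// Neutral starting value for each aggregation (assumed, not taken from the OpenMS source).
double initScore(AggregationMethod method, bool higher_better)
{
  switch (method)
  {
    case AggregationMethod::PROD: return 1.0;
    case AggregationMethod::SUM: return 0.0;
    case AggregationMethod::BEST:
      return higher_better ? -std::numeric_limits<double>::infinity()
                           : std::numeric_limits<double>::infinity();
  }
  return 0.0; // not reached for valid enum values
}

// Captureless lambdas convert implicitly to plain function pointers.
fptr aggFun(AggregationMethod method, bool higher_better)
{
  switch (method)
  {
    case AggregationMethod::PROD: return [](double a, double b) { return a * b; };
    case AggregationMethod::SUM: return [](double a, double b) { return a + b; };
    case AggregationMethod::BEST:
      if (higher_better) return [](double a, double b) { return std::max(a, b); };
      return [](double a, double b) { return std::min(a, b); };
  }
  return nullptr; // not reached for valid enum values
}
```

Starting from `initScore(...)` and folding each peptide score in with the function returned by `aggFun(...)` yields the aggregated value; for "best", the direction of the comparison depends on higher_better, which is why that flag is needed.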
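void processRun_ (std::unordered_map< std::string, std::pair< ProteinHit *, Size >> &acc_to_protein_hitP_and_count, SequenceToChargeToPSM &best_pep, ProteinIdentification &prot_run, std::vector< PeptideIdentification > &pep_ids) const | private
Performs simple aggregation-based inference on one protein run.
Parameters
acc_to_protein_hitP_and_count | maps accessions to a pair of a ProteinHit pointer and the number of peptidoforms encountered
best_pep | maps the (un)modified peptide sequence to a map from charge (0 when not considered) to the best PeptideHit pointer
prot_run | the current run to process
pep_ids | peptides for the current run to process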
void run (ConsensusMap &cmap, ProteinIdentification &prot_id, bool include_unassigned) const
Performs the actual inference based on the best PSM per peptide in cmap for proteins from prot_id. Ideally, prot_id is the union of proteins in all runs of cmap. Sorts and filters the PSMs of the peptide identifications. Annotates results in prot_id. Associations (via getIdentifier) of peptides to protein runs ARE IGNORED and all peptide identifications are used.
void run (std::vector< PeptideIdentification > &pep_ids, ProteinIdentification &prot_id) const
Performs the actual inference based on the best PSM per peptide in pep_ids per run in prot_id. Sorts and filters PSMs in pep_ids. Annotates results in prot_id. Associations (via getIdentifier) of peptides to protein runs need to be correct.
void run (std::vector< PeptideIdentification > &pep_ids, std::vector< ProteinIdentification > &prot_ids) const
Performs the actual inference based on the best PSM per peptide in pep_ids per run in prot_ids. Sorts and filters PSMs in pep_ids. Annotates results in prot_ids. Associations (via getIdentifier) of peptides to protein runs need to be correct.
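void updateProteinScores_ (std::unordered_map< std::string, std::pair< ProteinHit *, Size >> &acc_to_protein_hitP_and_count, const SequenceToChargeToPSM &best_pep, bool pep_scores, bool higher_better) const | private
Aggregates and updates protein scores based on the aggregation settings and the aggregated peptide-level results in the prefilled best_pep.
Parameters
acc_to_protein_hitP_and_count | the results to fill
best_pep | best PSM per peptide, from which the score is read
pep_scores | whether the score is a posterior error probability; if so, it is automatically converted to a posterior probability
higher_better | whether a higher value is better for the score; refers to the unconverted score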
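The relation between pep_scores and higher_better can be illustrated with a small standalone sketch. `ConvertedScore` and `toAggregationScore` are hypothetical names; the conversion 1 - PEP is the standard relation between a posterior error probability and a posterior probability, and treating the converted score as higher-is-better is an assumption that follows from it.

```cpp
// Hypothetical helper type: a score together with its orientation.
struct ConvertedScore
{
  double value;
  bool higher_better;
};

// Converts a posterior error probability (PEP) to a posterior probability (1 - PEP).
// After conversion higher is better, regardless of the orientation of the input score.
// Other score types are passed through unchanged.
ConvertedScore toAggregationScore(double score, bool pep_scores, bool higher_better)
{
  if (pep_scores) return {1.0 - score, true};
  return {score, higher_better};
}
```

This is why higher_better is documented as describing the unconverted score: for a PEP it would be false on input, while the converted probability is aggregated as a higher-is-better value.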