OpenMS
A map alignment algorithm based on peptide identifications from MS2 spectra. More...
#include <OpenMS/ANALYSIS/MAPMATCHING/MapAlignmentAlgorithmTreeGuided.h>
Public Member Functions | |
MapAlignmentAlgorithmTreeGuided () | |
Default constructor. More... | |
~MapAlignmentAlgorithmTreeGuided () override | |
Destructor. More... | |
void | treeGuidedAlignment (const std::vector< BinaryTreeNode > &tree, std::vector< FeatureMap > &feature_maps_transformed, std::vector< std::vector< double >> &maps_ranges, FeatureMap &map_transformed, std::vector< Size > &trafo_order) |
Align feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; of each pair of tree nodes, the one with the larger 10/90 percentile RT range is used as reference. More... | |
void | align (std::vector< FeatureMap > &data, std::vector< TransformationDescription > &transformations) |
Align feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; of each pair of tree nodes, the one with the larger 10/90 percentile RT range is used as reference. More... | |
void | computeTrafosByOriginalRT (std::vector< FeatureMap > &feature_maps, FeatureMap &map_transformed, std::vector< TransformationDescription > &transformations, const std::vector< Size > &trafo_order) |
Extract original RT ("original_RT" MetaInfo) and transformed RT for each feature to compute RT transformations. More... | |
Public Member Functions inherited from DefaultParamHandler | |
DefaultParamHandler (const String &name) | |
Constructor with name that is displayed in error messages. More... | |
DefaultParamHandler (const DefaultParamHandler &rhs) | |
Copy constructor. More... | |
virtual | ~DefaultParamHandler () |
Destructor. More... | |
DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
Assignment operator. More... | |
virtual bool | operator== (const DefaultParamHandler &rhs) const |
Equality operator. More... | |
void | setParameters (const Param &param) |
Sets the parameters. More... | |
const Param & | getParameters () const |
Non-mutable access to the parameters. More... | |
const Param & | getDefaults () const |
Non-mutable access to the default parameters. More... | |
const String & | getName () const |
Non-mutable access to the name. More... | |
void | setName (const String &name) |
Mutable access to the name. More... | |
const std::vector< String > & | getSubsections () const |
Non-mutable access to the registered subsections. More... | |
Public Member Functions inherited from ProgressLogger | |
ProgressLogger () | |
Constructor. More... | |
virtual | ~ProgressLogger () |
Destructor. More... | |
ProgressLogger (const ProgressLogger &other) | |
Copy constructor. More... | |
ProgressLogger & | operator= (const ProgressLogger &other) |
Assignment Operator. More... | |
void | setLogType (LogType type) const |
Sets the progress log that should be used. The default type is NONE! More... | |
LogType | getLogType () const |
Returns the type of progress log being used. More... | |
void | setLogger (ProgressLoggerImpl *logger) |
Sets the logger to be used for progress logging. More... | |
void | startProgress (SignedSize begin, SignedSize end, const String &label) const |
Initializes the progress display. More... | |
void | setProgress (SignedSize value) const |
Sets the current progress. More... | |
void | endProgress (UInt64 bytes_processed=0) const |
void | nextProgress () const |
increment progress by 1 (according to range begin-end) More... | |
Static Public Member Functions | |
static void | buildTree (std::vector< FeatureMap > &feature_maps, std::vector< BinaryTreeNode > &tree, std::vector< std::vector< double >> &maps_ranges) |
Extract the RTs given for individual features of each map, calculate the distance for each pair of maps, and cluster the maps hierarchically using average linkage. More... | |
static void | computeTransformedFeatureMaps (std::vector< FeatureMap > &feature_maps, const std::vector< TransformationDescription > &transformations) |
Apply transformations on input maps. More... | |
Static Public Member Functions inherited from DefaultParamHandler | |
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") |
Writes all parameters to meta values. More... | |
Protected Types | |
typedef std::map< String, DoubleList > | SeqAndRTList |
Type to store feature retention times given for individual peptide sequence. More... | |
Protected Member Functions | |
void | updateMembers_ () override |
This method is used to update extra member variables at the end of the setParameters() method. More... | |
Protected Member Functions inherited from DefaultParamHandler | |
void | defaultsToParam_ () |
Updates the parameters after the defaults have been set in the constructor. More... | |
Static Protected Member Functions | |
static void | addPeptideSequences_ (const std::vector< PeptideIdentification > &peptides, SeqAndRTList &peptide_rts, std::vector< double > &map_range, double feature_rt) |
For given peptide identifications extract sequences and store with associated feature RT. More... | |
static void | extractSeqAndRt_ (const std::vector< FeatureMap > &feature_maps, std::vector< SeqAndRTList > &maps_seq_and_rt, std::vector< std::vector< double >> &maps_ranges) |
For each input map, extract peptide identifications (sequences) of existing features with associated feature RT. More... | |
Protected Attributes | |
String | model_type_ |
Type of transformation model. More... | |
Param | model_param_ |
Default params of transformation models linear, b_spline, lowess and interpolated. More... | |
MapAlignmentAlgorithmIdentification | align_algorithm_ |
Instantiation of alignment algorithm. More... | |
Protected Attributes inherited from DefaultParamHandler | |
Param | param_ |
Container for current parameters. More... | |
Param | defaults_ |
Container for default parameters. This member should be filled in the constructor of derived classes! More... | |
std::vector< String > | subsections_ |
Container for registered subsections. This member should be filled in the constructor of derived classes! More... | |
String | error_name_ |
Name that is displayed in error messages during the parameter checking. More... | |
bool | check_defaults_ |
If this member is set to false, no checking of parameters is done. More... | |
bool | warn_empty_defaults_ |
If this member is set to false, no warning is emitted when defaults are empty. More... | |
Protected Attributes inherited from ProgressLogger | |
LogType | type_ |
time_t | last_invoke_ |
ProgressLoggerImpl * | current_logger_ |
Private Member Functions | |
MapAlignmentAlgorithmTreeGuided (const MapAlignmentAlgorithmTreeGuided &) | |
Copy constructor intentionally not implemented -> private. More... | |
MapAlignmentAlgorithmTreeGuided & | operator= (const MapAlignmentAlgorithmTreeGuided &) |
Assignment operator intentionally not implemented -> private. More... | |
Additional Inherited Members | |
Public Types inherited from ProgressLogger | |
enum | LogType { CMD , GUI , NONE } |
Possible log types. More... | |
Static Protected Attributes inherited from ProgressLogger | |
static int | recursion_depth_ |
A map alignment algorithm based on peptide identifications from MS2 spectra.
ID groups with the same sequence in different maps represent points of correspondence in RT between the maps. They are used to evaluate the distances between the maps for hierarchical clustering and form the basis for the alignment. Only the best PSM per spectrum is considered as the correct identification.
For each pair of maps, the similarity is determined from the intersection of the contained identifications using Pearson correlation. For small intersections, the Pearson value is reduced by multiplying it by the ratio of the intersection size to the union size: \(\texttt{PearsonValue}(\texttt{map1} \cap \texttt{map2}) \cdot \frac{N(\texttt{map1} \cap \texttt{map2})}{N(\texttt{map1} \cup \texttt{map2})}\)
Hierarchical clustering with average linkage then produces a binary tree. Following the tree, the maps are aligned, resulting in a transformed feature map that contains both the original and the transformed retention times.
As long as there are at least two clusters, the alignment is done as follows: Of every pair of clusters, the one with the larger 10/90 percentile retention time range is selected as reference for the align() method of OpenMS::MapAlignmentAlgorithmIdentification. align() aligns the median retention time of each ID group in the second cluster to the reference retention time of this group. Cubic spline smoothing is used to convert this mapping to a smooth function. Retention times in the second cluster are transformed to the reference scale by applying this function. Additionally, the original retention times are stored in the meta information of each feature. The reference is then combined with the transformed cluster.
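The pairwise similarity described above (Pearson correlation over the shared identifications, scaled by intersection size over union size) can be sketched in plain C++. This is a minimal illustration, not the OpenMS implementation: the type `SeqToRT` and the functions `pearson` and `mapSimilarity` are hypothetical names, and the sketch assumes that each map has already been reduced to one representative RT per peptide sequence.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one map: peptide sequence -> one representative RT.
using SeqToRT = std::map<std::string, double>;

// Pearson correlation of paired values.
// Returns 0 when it is undefined (fewer than two pairs or zero variance).
double pearson(const std::vector<double>& x, const std::vector<double>& y)
{
  const std::size_t n = x.size();
  if (n < 2 || n != y.size()) return 0.0;
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < n; ++i) { mean_x += x[i]; mean_y += y[i]; }
  mean_x /= static_cast<double>(n);
  mean_y /= static_cast<double>(n);
  double sxy = 0.0, sxx = 0.0, syy = 0.0;
  for (std::size_t i = 0; i < n; ++i)
  {
    const double dx = x[i] - mean_x;
    const double dy = y[i] - mean_y;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  if (sxx == 0.0 || syy == 0.0) return 0.0;
  return sxy / std::sqrt(sxx * syy);
}

// PearsonValue(map1 n map2) * N(map1 n map2) / N(map1 u map2)
double mapSimilarity(const SeqToRT& map1, const SeqToRT& map2)
{
  std::vector<double> rt1, rt2;
  for (const auto& entry : map1)
  {
    const auto it = map2.find(entry.first);
    if (it != map2.end())
    {
      rt1.push_back(entry.second); // RT of the shared sequence in map1
      rt2.push_back(it->second);   // RT of the shared sequence in map2
    }
  }
  const std::size_t n_intersection = rt1.size();
  const std::size_t n_union = map1.size() + map2.size() - n_intersection;
  if (n_union == 0) return 0.0;
  return pearson(rt1, rt2) * static_cast<double>(n_intersection) / static_cast<double>(n_union);
}
```

For two maps that share three sequences with perfectly correlated RTs and contain five distinct sequences in total, this sketch yields 1.0 * 3/5 = 0.6, which shows how a small overlap lowers an otherwise perfect correlation.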
The resulting map is used to extract transformation descriptions for each input map. For each map cubic spline smoothing is used to convert the mapping to a smooth function. Retention times of each map are transformed by applying the smoothed function.
Parameters of this class are:
Name | Type | Default | Restrictions | Description |
---|---|---|---|---|
model_type | string | b_spline | linear, b_spline, lowess, interpolated | Options to control the modeling of retention time transformations from data |
model:type | string | b_spline | linear, b_spline, lowess, interpolated | Type of model |
model:linear:symmetric_regression | string | false | true, false | Perform linear regression on 'y - x' vs. 'y + x', instead of on 'y' vs. 'x'. |
model:linear:x_weight | string | x | 1/x, 1/x2, ln(x), x | Weight x values |
model:linear:y_weight | string | y | 1/y, 1/y2, ln(y), y | Weight y values |
model:linear:x_datum_min | float | 1.0e-15 | | Minimum x value |
model:linear:x_datum_max | float | 1.0e15 | | Maximum x value |
model:linear:y_datum_min | float | 1.0e-15 | | Minimum y value |
model:linear:y_datum_max | float | 1.0e15 | | Maximum y value |
model:b_spline:wavelength | float | 0.0 | min: 0.0 | Determines the amount of smoothing by setting the number of nodes for the B-spline. The number is chosen so that the spline approximates a low-pass filter with this cutoff wavelength. The wavelength is given in the same units as the data; a higher value means more smoothing. '0' sets the number of nodes to twice the number of input points. |
model:b_spline:num_nodes | int | 5 | min: 0 | Number of nodes for B-spline fitting. Overrides 'wavelength' if set (to two or greater). A lower value means more smoothing. |
model:b_spline:extrapolate | string | linear | linear, b_spline, constant, global_linear | Method to use for extrapolation beyond the original data range. 'linear': Linear extrapolation using the slope of the B-spline at the corresponding endpoint. 'b_spline': Use the B-spline (as for interpolation). 'constant': Use the constant value of the B-spline at the corresponding endpoint. 'global_linear': Use a linear fit through the data (which will most probably introduce discontinuities at the ends of the data range). |
model:b_spline:boundary_condition | int | 2 | min: 0 max: 2 | Boundary condition at B-spline endpoints: 0 (value zero), 1 (first derivative zero) or 2 (second derivative zero) |
model:lowess:span | float | 0.666666666666667 | min: 0.0 max: 1.0 | Fraction of datapoints (f) to use for each local regression (determines the amount of smoothing). Choosing this parameter in the range .2 to .8 usually results in a good fit. |
model:lowess:num_iterations | int | 3 | min: 0 | Number of robustifying iterations for lowess fitting. |
model:lowess:delta | float | -1.0 | | Nonnegative parameter which may be used to save computations (recommended value is 0.01 of the range of the input, e.g. for data ranging from 1000 seconds to 2000 seconds, it could be set to 10). Setting a negative value will automatically do this. |
model:lowess:interpolation_type | string | cspline | linear, cspline, akima | Method to use for interpolation between datapoints computed by lowess. 'linear': Linear interpolation. 'cspline': Use the cubic spline for interpolation. 'akima': Use an akima spline for interpolation |
model:lowess:extrapolation_type | string | four-point-linear | two-point-linear, four-point-linear, global-linear | Method to use for extrapolation outside the data range. 'two-point-linear': Uses a line through the first and last point to extrapolate. 'four-point-linear': Uses a line through the first and second point to extrapolate in front and a line through the last and second-to-last point in the end. 'global-linear': Uses a linear regression to fit a line through all data points and uses it for extrapolation. |
model:interpolated:interpolation_type | string | cspline | linear, cspline, akima | Type of interpolation to apply. |
model:interpolated:extrapolation_type | string | two-point-linear | two-point-linear, four-point-linear, global-linear | Type of extrapolation to apply: two-point-linear: use the first and last data point to build a single linear model, four-point-linear: build two linear models on both ends using the first two / last two points, global-linear: use all points to build a single linear model. Note that global-linear may not be continuous at the border. |
align_algorithm:score_type | string | | | Name of the score type to use for ranking and filtering (.oms input only). If left empty, a score type is picked automatically. |
align_algorithm:score_cutoff | string | false | true, false | Use only IDs above a score cut-off (parameter 'min_score') for alignment? |
align_algorithm:min_score | float | 0.05 | | If 'score_cutoff' is 'true': Minimum score for an ID to be considered. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. |
align_algorithm:min_run_occur | int | 2 | min: 2 | Minimum number of runs (incl. reference, if any) in which a peptide must occur to be used for the alignment. Unless you have very few runs or identifications, increase this value to focus on more informative peptides. |
align_algorithm:max_rt_shift | float | 0.5 | min: 0.0 | Maximum realistic RT difference for a peptide (median per run vs. reference). Peptides with higher shifts (outliers) are not used to compute the alignment. If 0, no limit (disable filter); if > 1, the final value in seconds; if <= 1, taken as a fraction of the range of the reference RT scale. |
align_algorithm:use_unassigned_peptides | string | true | true, false | Should unassigned peptide identifications be used when computing an alignment of feature or consensus maps? If 'false', only peptide IDs assigned to features will be used. |
align_algorithm:use_feature_rt | string | true | true, false | When aligning feature or consensus maps, don't use the retention time of a peptide identification directly; instead, use the retention time of the centroid of the feature (apex of the elution profile) that the peptide was matched to. If different identifications are matched to one feature, only the peptide closest to the centroid in RT is used. Precludes 'use_unassigned_peptides'. |
align_algorithm:use_adducts | string | true | true, false | If IDs contain adducts, treat differently adducted variants of the same molecule as different. |
typedef std::map< String, DoubleList > SeqAndRTList |
protected |
Type to store feature retention times given for individual peptide sequence.
MapAlignmentAlgorithmTreeGuided ()
Default constructor.
~MapAlignmentAlgorithmTreeGuided () |
override |
Destructor.
MapAlignmentAlgorithmTreeGuided (const MapAlignmentAlgorithmTreeGuided &) |
private |
Copy constructor intentionally not implemented -> private.
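static void addPeptideSequences_ (const std::vector< PeptideIdentification > &peptides, SeqAndRTList &peptide_rts, std::vector< double > &map_range, double feature_rt) |
static protected |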
For given peptide identifications extract sequences and store with associated feature RT.
peptides | Vector of peptide identifications to extract sequences. |
peptide_rts | Map to store a list of feature RTs for each peptide sequence as key. |
map_range | Vector in which all feature RTs are stored for given peptide identifications. |
feature_rt | RT value of the feature to which the peptide identifications to be analysed belong. |
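The bookkeeping described by these parameters can be sketched with standard containers. This is a simplified illustration and not the OpenMS code: `addSequences` is a hypothetical name, plain strings stand in for the sequences that would be extracted from PeptideIdentification objects, and `std::map<std::string, std::vector<double>>` stands in for `std::map<String, DoubleList>`.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for SeqAndRTList (std::map<String, DoubleList> in the class).
using SeqAndRTList = std::map<std::string, std::vector<double>>;

// Hypothetical simplification of addPeptideSequences_: the sequences are
// assumed to be already extracted from the peptide identifications of one feature.
void addSequences(const std::vector<std::string>& sequences,
                  SeqAndRTList& peptide_rts,
                  std::vector<double>& map_range,
                  double feature_rt)
{
  for (const std::string& seq : sequences)
  {
    // Store the feature RT under the peptide sequence ...
    peptide_rts[seq].push_back(feature_rt);
    // ... and also in the flat list used later for the 10/90 percentiles.
    map_range.push_back(feature_rt);
  }
}
```

Calling the helper once per feature accumulates, for every sequence, the RTs of all features it was identified in, which is the information needed to compare maps by shared identifications.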
void align (std::vector< FeatureMap > &data, std::vector< TransformationDescription > &transformations)
Align feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; of each pair of tree nodes, the one with the larger 10/90 percentile RT range is used as reference.
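static void buildTree (std::vector< FeatureMap > &feature_maps, std::vector< BinaryTreeNode > &tree, std::vector< std::vector< double >> &maps_ranges) |
static |
Extract the RTs given for individual features of each map, calculate the distance for each pair of maps, and cluster the maps hierarchically using average linkage.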
feature_maps | Vector of input maps (FeatureMap) whose distance is to be calculated. |
tree | Vector of BinaryTreeNodes that will be computed |
maps_ranges | Vector to store all sorted RTs of extracted identifications for each map in feature_maps; needed to determine the 10/90 percentiles |
void computeTrafosByOriginalRT (std::vector< FeatureMap > &feature_maps, FeatureMap &map_transformed, std::vector< TransformationDescription > &transformations, const std::vector< Size > &trafo_order)
Extract original RT ("original_RT" MetaInfo) and transformed RT for each feature to compute RT transformations.
feature_maps | Vector of input maps for size information. |
map_transformed | FeatureMap that contains all features of combined maps with original and transformed RTs in order of alignment. |
transformations | Vector to store transformation descriptions for each map. (output) |
trafo_order | Vector that contains the indices of aligned maps in order of alignment. |
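One way to read these parameters is that the combined map holds the features of each input map as a contiguous block, in the order given by trafo_order, so that the (original RT, transformed RT) pairs of each map can be recovered from the block sizes. The following sketch illustrates that reading only; the layout is an assumption derived from the parameter descriptions, and `AlignedFeature`, `RtPairs` and `collectRtPairs` are hypothetical names rather than OpenMS types.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a feature of the combined map: the original RT
// (stored as "original_RT" meta value in the class) and the transformed RT.
struct AlignedFeature
{
  double original_rt;
  double transformed_rt;
};

using RtPairs = std::vector<std::pair<double, double>>;

// Collect (original RT, transformed RT) pairs per input map.
// Assumption: 'combined' stores the features of each map as one contiguous
// block, and the blocks appear in the order given by 'trafo_order'.
std::vector<RtPairs> collectRtPairs(const std::vector<std::size_t>& map_sizes,
                                    const std::vector<AlignedFeature>& combined,
                                    const std::vector<std::size_t>& trafo_order)
{
  std::vector<RtPairs> pairs(map_sizes.size());
  std::size_t offset = 0;
  for (const std::size_t map_index : trafo_order)
  {
    for (std::size_t i = 0; i < map_sizes.at(map_index); ++i)
    {
      const AlignedFeature& f = combined.at(offset + i); // bounds-checked access
      pairs.at(map_index).emplace_back(f.original_rt, f.transformed_rt);
    }
    offset += map_sizes.at(map_index);
  }
  return pairs;
}
```

Each resulting list of pairs is the kind of data from which a smoothed RT transformation per map could then be fitted.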
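static void computeTransformedFeatureMaps (std::vector< FeatureMap > &feature_maps, const std::vector< TransformationDescription > &transformations) |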
static |
Apply transformations on input maps.
feature_maps | Vector of maps to be transformed (output) |
transformations | Vector that contains TransformationDescriptions that are applied to input maps |
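static void extractSeqAndRt_ (const std::vector< FeatureMap > &feature_maps, std::vector< SeqAndRTList > &maps_seq_and_rt, std::vector< std::vector< double >> &maps_ranges) |
static protected |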
For each input map, extract peptide identifications (sequences) of existing features with associated feature RT.
feature_maps | Vector of original maps containing peptide identifications. |
maps_seq_and_rt | Vector of maps to store feature RTs given for individual peptide sequences for each feature map. |
maps_ranges | Vector to store all feature RTs of extracted identifications for each map; needed to determine the 10/90 percentiles. |
MapAlignmentAlgorithmTreeGuided & operator= (const MapAlignmentAlgorithmTreeGuided &) |
private |
Assignment operator intentionally not implemented -> private.
void treeGuidedAlignment (const std::vector< BinaryTreeNode > &tree, std::vector< FeatureMap > &feature_maps_transformed, std::vector< std::vector< double >> &maps_ranges, FeatureMap &map_transformed, std::vector< Size > &trafo_order)
Align feature maps along the guide tree using align() of OpenMS::MapAlignmentAlgorithmIdentification; of each pair of tree nodes, the one with the larger 10/90 percentile RT range is used as reference.
tree | Vector of BinaryTreeNodes that contains order for alignment. |
feature_maps_transformed | Vector with input maps for transformation process. Because the transformed maps are stored within this vector it's not const. |
maps_ranges | Vector that contains all sorted RTs of extracted identifications for each map; needed to determine the 10/90 percentiles. |
map_transformed | FeatureMap to store all features of combined maps with original and transformed RTs in order of alignment. |
trafo_order | Vector to store indices of maps in order of alignment. |
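The choice of reference by the larger 10/90 percentile RT range can be sketched as follows. This is an illustration only: `percentileRange` and `chooseReference` are hypothetical names, and the index rule used for the percentiles (integer positions at 10% and 90% of the last index) is an assumption that may differ from the rule used in the class.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Difference between the 90th and 10th percentile RT of one cluster.
// Assumption: percentiles are taken at integer positions 10% and 90%
// of the last index of the sorted RT list.
double percentileRange(std::vector<double> rts)
{
  if (rts.empty()) return 0.0;
  std::sort(rts.begin(), rts.end()); // harmless if the input is already sorted
  const std::size_t last = rts.size() - 1;
  const std::size_t lo = last / 10;
  const std::size_t hi = (last * 9) / 10;
  return rts[hi] - rts[lo];
}

// Returns 0 if the first cluster should serve as reference, 1 otherwise.
std::size_t chooseReference(const std::vector<double>& first,
                            const std::vector<double>& second)
{
  return percentileRange(first) >= percentileRange(second) ? 0 : 1;
}
```

Using the 10/90 range rather than the full minimum-to-maximum span makes the choice less sensitive to a few identifications at extreme retention times.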
void updateMembers_ () |
override protected virtual |
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
MapAlignmentAlgorithmIdentification align_algorithm_ |
protected |
Instantiation of alignment algorithm.
Param model_param_ |
protected |
Default params of transformation models linear, b_spline, lowess and interpolated.
String model_type_ |
protected |
Type of transformation model.