FLASHDeconv algorithm: ultrafast mass deconvolution algorithm for top down mass spectrometry dataset From MSSpectrum, this class outputs DeconvolvedSpectrum. Deconvolution takes three steps: i) decharging and select candidate masses - speed up via binning ii) collecting isotopes from the candidate masses and deisotope - peak groups are defined here iii) scoring and filter out low scoring masses (i.e., peak groups)
More...
#include <OpenMS/ANALYSIS/TOPDOWN/FLASHDeconvAlgorithm.h>
|
static int | getNominalMass (double mass) |
| convert double to nominal mass More...
|
|
static float | getCosine (const std::vector< float > &a, int a_start, int a_end, const IsotopeDistribution &b, int b_size, int offset, int min_iso_size) |
|
static float | getIsotopeCosineAndDetermineIsotopeIndex (double mono_mass, const std::vector< float > &per_isotope_intensities, int &offset, const PrecalculatedAveragine &avg, int iso_int_shift=0, int window_width=-1, int allowed_iso_error_for_second_best_cos=0, PeakGroup::TargetDummyType target_dummy_type=PeakGroup::TargetDummyType::target) |
| Examine intensity distribution over isotope indices. Also determines the most plausible isotope index or, monoisotopic mono_mass. More...
|
|
static void | addMZsToExcludsionList (const DeconvolvedSpectrum &dspec, std::unordered_set< double > &excluded_mzs) |
|
Static Public Member Functions inherited from DefaultParamHandler |
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") |
| Writes all parameters to meta values. More...
|
|
|
void | updateLogMzPeaks_ () |
| generate log mz peaks from the input spectrum More...
|
|
void | updateMzBins_ (Size bin_number, std::vector< float > &mz_bin_intensities) |
| generate mz bins and intensity per mz bin from log mz peaks More...
|
|
double | getMassFromMassBin_ (Size mass_bin, double bin_mul_factor) const |
| get mass value for input mass bin More...
|
|
double | getMzFromMzBin_ (Size mass_bin, double bin_mul_factor) const |
| get mz value for input mz bin More...
|
|
void | generatePeakGroupsFromSpectrum_ () |
| Generate peak groups from the input spectrum. More...
|
|
Matrix< int > | updateMassBins_ (const std::vector< float > &mz_intensities) |
| Update mass_bins_. It select candidate mass bins using the universal pattern, eliminate possible harmonic masses. This function does not perform deisotoping. More...
|
|
Matrix< int > | filterMassBins_ (const std::vector< float > &mass_intensities) |
| Subfunction of updateMassBins_. More...
|
|
void | updateCandidateMassBins_ (std::vector< float > &mass_intensities, const std::vector< float > &mz_intensities) |
| Subfunction of updateMassBins_. It select candidate masses and update mass_bins_ using the universal pattern, eliminate possible harmonic masses. More...
|
|
void | getCandidatePeakGroups_ (const Matrix< int > &per_mass_abs_charge_ranges) |
| For selected masses in mass_bins_, select the peaks from the original spectrum. Also isotopic peaks are clustered in this function. More...
|
|
void | setFilters_ () |
| Make the universal pattern. More...
|
|
void | scoreAndFilterPeakGroups_ () |
| function for peak group scoring and filtering More...
|
|
void | removeChargeErrorPeakGroups_ (DeconvolvedSpectrum &dspec) const |
| filter out charge error masses More...
|
|
void | removeOverlappingPeakGroups_ (DeconvolvedSpectrum &dspec, double tol) |
| filter out overlapping masses More...
|
|
bool | registerPrecursor_ (const std::vector< DeconvolvedSpectrum > &survey_scans, const std::map< int, std::vector< std::vector< float >>> &precursor_map_for_real_time_acquisition) |
| register the precursor peak as well as the precursor peak group (or mass) if possible for MSn (n>1) spectrum. Given a precursor peak (found in the original MS n-1 Spectrum) the masses containing the precursor peak are searched. If multiple masses are detected, the one with the best Qscore is selected. For the selected mass, its corresponding peak group (along with precursor peak) is registered. If no such mass exists, only the precursor peak is registered. More...
|
|
|
static double | getBinValue_ (Size bin, double min_value, double bin_mul_factor) |
| static function that converts bin to value More...
|
|
static Size | getBinNumber_ (double value, double min_value, double bin_mul_factor) |
| static function that converts value to bin More...
|
|
FLASHDeconv algorithm: ultrafast mass deconvolution algorithm for top down mass spectrometry dataset From MSSpectrum, this class outputs DeconvolvedSpectrum. Deconvolution takes three steps: i) decharging and select candidate masses - speed up via binning ii) collecting isotopes from the candidate masses and deisotope - peak groups are defined here iii) scoring and filter out low scoring masses (i.e., peak groups)
◆ LogMzPeak
◆ PrecalculatedAveragine
◆ FLASHDeconvAlgorithm() [1/3]
◆ FLASHDeconvAlgorithm() [2/3]
◆ FLASHDeconvAlgorithm() [3/3]
◆ ~FLASHDeconvAlgorithm()
◆ addMZsToExcludsionList()
static void addMZsToExcludsionList |
( |
const DeconvolvedSpectrum & |
dspec, |
|
|
std::unordered_set< double > & |
excluded_mzs |
|
) |
| |
|
static |
add m/zs in input DeconvolvedSpectrum into exclusion list. The exclusion list is used to generate noise dummy masses.
- Parameters
-
◆ calculateAveragine()
void calculateAveragine |
( |
bool |
use_RNA_averagine | ) |
|
precalculate averagine (for predefined mass bins) to speed up averagine generation
- Parameters
-
use_RNA_averagine | if set, averagine for RNA (nucleotides) is calculated |
◆ filterMassBins_()
Matrix<int> filterMassBins_ |
( |
const std::vector< float > & |
mass_intensities | ) |
|
|
private |
Subfunction of updateMassBins_.
- Parameters
-
mass_intensities | per mass bin intensity |
- Returns
- a matrix containing charge ranges for all found masses
◆ generatePeakGroupsFromSpectrum_()
void generatePeakGroupsFromSpectrum_ |
( |
| ) |
|
|
private |
Generate peak groups from the input spectrum.
◆ getAveragine()
get calculated averagine. Call after calculateAveragine is called.
◆ getBinNumber_()
static Size getBinNumber_ |
( |
double |
value, |
|
|
double |
min_value, |
|
|
double |
bin_mul_factor |
|
) |
| |
|
staticprivate |
static function that converts value to bin
- Parameters
-
value | value |
min_value | minimum value (corresponding to bin number = 0) |
bin_mul_factor | bin multiplication factor: bin_number = (bin_value * bin_mul_factors_ - min_value) |
- Returns
- bin corresponding to value
◆ getBinValue_()
static double getBinValue_ |
( |
Size |
bin, |
|
|
double |
min_value, |
|
|
double |
bin_mul_factor |
|
) |
| |
|
staticprivate |
static function that converts bin to value
- Parameters
-
bin | bin number |
min_value | minimum value (corresponding to bin number = 0) |
bin_mul_factor | bin multiplication factor: bin_value = (min_value + bin_number/ bin_mul_factors_) |
- Returns
- value corresponding to bin
◆ getCandidatePeakGroups_()
void getCandidatePeakGroups_ |
( |
const Matrix< int > & |
per_mass_abs_charge_ranges | ) |
|
|
private |
For selected masses in mass_bins_, select the peaks from the original spectrum. Also isotopic peaks are clustered in this function.
- Parameters
-
per_mass_abs_charge_ranges | charge range per mass |
◆ getCosine()
static float getCosine |
( |
const std::vector< float > & |
a, |
|
|
int |
a_start, |
|
|
int |
a_end, |
|
|
const IsotopeDistribution & |
b, |
|
|
int |
b_size, |
|
|
int |
offset, |
|
|
int |
min_iso_size |
|
) |
| |
|
static |
calculate cosine between two vectors a and b with additional parameters for fast calculation
- Parameters
-
a | vector a |
a_start | non zero start index of a |
a_end | non zero end index of a (exclusive) |
b | vector b |
b_size | size of b |
offset | element index offset between a and b |
min_iso_size | minimum isotope size. If isotope size is less than this, return 0 |
◆ getDeconvolvedSpectrum()
◆ getIsotopeCosineAndDetermineIsotopeIndex()
static float getIsotopeCosineAndDetermineIsotopeIndex |
( |
double |
mono_mass, |
|
|
const std::vector< float > & |
per_isotope_intensities, |
|
|
int & |
offset, |
|
|
const PrecalculatedAveragine & |
avg, |
|
|
int |
iso_int_shift = 0 , |
|
|
int |
window_width = -1 , |
|
|
int |
allowed_iso_error_for_second_best_cos = 0 , |
|
|
PeakGroup::TargetDummyType |
target_dummy_type = PeakGroup::TargetDummyType::target |
|
) |
| |
|
static |
Examine intensity distribution over isotope indices. Also determines the most plausible isotope index or, monoisotopic mono_mass.
- Parameters
-
mono_mass | monoisotopic mass |
per_isotope_intensities | vector of intensities associated with each isotope - aggregated through charges |
offset | output offset between input monoisotopic mono_mass and determined monoisotopic mono_mass |
avg | precalculated averagine |
iso_int_shift | isotope shift in per_isotope_intensities. |
window_width | isotope offset value range. If -1, set automatically. |
allowed_iso_error_for_second_best_cos | allowed isotope error to calculate the second best cos. If target_dummy_type is not PeakGroup::TargetDummyType::target, the second best cosine and its corresponding offset will be output |
target_dummy_type | This target_dummy_type specifies if a PeakGroup is a target (0), charge dummy (1), noise dummy (2), or isotope dummy (3) |
- Returns
- calculated cosine similar score
◆ getMassFromMassBin_()
double getMassFromMassBin_ |
( |
Size |
mass_bin, |
|
|
double |
bin_mul_factor |
|
) |
| const |
|
private |
get mass value for input mass bin
◆ getMzFromMzBin_()
double getMzFromMzBin_ |
( |
Size |
mass_bin, |
|
|
double |
bin_mul_factor |
|
) |
| const |
|
private |
get mz value for input mz bin
◆ getNominalMass()
static int getNominalMass |
( |
double |
mass | ) |
|
|
static |
convert double to nominal mass
◆ operator=() [1/2]
◆ operator=() [2/2]
◆ performSpectrumDeconvolution()
void performSpectrumDeconvolution |
( |
const MSSpectrum & |
spec, |
|
|
const std::vector< DeconvolvedSpectrum > & |
survey_scans, |
|
|
int |
scan_number, |
|
|
const std::map< int, std::vector< std::vector< float >>> & |
precursor_map_for_FLASHIda |
|
) |
| |
main deconvolution function that generates the deconvolved target and dummy spectrum based on the original spectrum.
- Parameters
-
spec | the original spectrum |
survey_scans | the survey scans to assign precursor mass to the deconvolved spectrum. |
scan_number | scan number from input spectrum. |
precursor_map_for_FLASHIda | deconvolved precursor information from FLASHIda |
Referenced by TOPPFLASHDeconv::main_().
◆ registerPrecursor_()
bool registerPrecursor_ |
( |
const std::vector< DeconvolvedSpectrum > & |
survey_scans, |
|
|
const std::map< int, std::vector< std::vector< float >>> & |
precursor_map_for_real_time_acquisition |
|
) |
| |
|
private |
register the precursor peak as well as the precursor peak group (or mass) if possible for MSn (n>1) spectrum. Given a precursor peak (found in the original MS n-1 Spectrum) the masses containing the precursor peak are searched. If multiple masses are detected, the one with the best Qscore is selected. For the selected mass, its corresponding peak group (along with precursor peak) is registered. If no such mass exists, only the precursor peak is registered.
- Parameters
-
survey_scans | the candidate precursor spectra - the user may allow search of previous N survey scans. |
precursor_map_for_real_time_acquisition | this contains the deconvolved mass information from FLASHIda runs. |
◆ removeChargeErrorPeakGroups_()
filter out charge error masses
◆ removeOverlappingPeakGroups_()
filter out overlapping masses
◆ scoreAndFilterPeakGroups_()
void scoreAndFilterPeakGroups_ |
( |
| ) |
|
|
private |
function for peak group scoring and filtering
◆ setAveragine()
◆ setFilters_()
Make the universal pattern.
◆ setTargetDummyType()
set target dummy type for the FLASHDeconvAlgorithm run. All masses from the target FLASHDeconvAlgorithm run will have the target_dummy_type_.
- Parameters
-
target_dummy_type | This target_dummy_type_ specifies if a PeakGroup is a target (0), charge dummy (1), noise dummy (2), or isotope dummy (3) |
target_dspec_for_dummy_calcualtion | target masses from normal deconvolution |
Referenced by TOPPFLASHDeconv::main_().
◆ setTargetMasses()
void setTargetMasses |
( |
const std::vector< double > & |
masses, |
|
|
bool |
exclude = false |
|
) |
| |
set targeted or excluded masses for targeted deconvolution. Masses are targeted or excluded in all ms levels.
- Parameters
-
masses | Masses to be targeted or excluded (exclude=true ) |
exclude | if set, masses are excluded. |
◆ updateCandidateMassBins_()
void updateCandidateMassBins_ |
( |
std::vector< float > & |
mass_intensities, |
|
|
const std::vector< float > & |
mz_intensities |
|
) |
| |
|
private |
Subfunction of updateMassBins_. It select candidate masses and update mass_bins_ using the universal pattern, eliminate possible harmonic masses.
- Parameters
-
mass_intensities | mass bin intensities which are updated in this function |
mz_intensities | mz bin intensities |
◆ updateLogMzPeaks_()
void updateLogMzPeaks_ |
( |
| ) |
|
|
private |
generate log mz peaks from the input spectrum
◆ updateMassBins_()
Matrix<int> updateMassBins_ |
( |
const std::vector< float > & |
mz_intensities | ) |
|
|
private |
Update mass_bins_. It select candidate mass bins using the universal pattern, eliminate possible harmonic masses. This function does not perform deisotoping.
- Parameters
-
mz_intensities | per mz bin intensity |
- Returns
- a matrix containing charge ranges for all found masses
◆ updateMembers_()
This method is used to update extra member variables at the end of the setParameters() method.
Also call it at the end of the derived classes' copy constructor and assignment operator.
The default implementation is empty.
Reimplemented from DefaultParamHandler.
◆ updateMzBins_()
void updateMzBins_ |
( |
Size |
bin_number, |
|
|
std::vector< float > & |
mz_bin_intensities |
|
) |
| |
|
private |
generate mz bins and intensity per mz bin from log mz peaks
- Parameters
-
bin_number | number of mz bins |
mz_bin_intensities | intensity per mz bin |
◆ allowed_iso_error_
int allowed_iso_error_ = 1 |
|
private |
allowed isotope error in deconvolved mass to calculate qvalue
◆ avg_
precalculated averagine distributions for fast averagine generation
◆ bin_mul_factors_
bin multiplication factor (log mz * bin_mul_factors_ = bin number) - for fast convolution, binning is used
◆ bin_offsets_
std::vector<int> bin_offsets_ |
|
private |
This stores the "universal pattern" in binned dimension.
◆ current_max_charge_
current_max_charge_: controlled by precursor charge for MSn n>1; otherwise just max_abs_charge_
◆ current_max_mass_
max mass is controlled by precursor mass for MSn n>1; otherwise just max_mass
◆ current_min_mass_
max mass is max_mass for MS1 and 50 for MS2
◆ deconvolved_spectrum_
selected_peak_groups_ stores the deconvolved mass peak groups
◆ excluded_masses_
std::vector<double> excluded_masses_ |
|
private |
mass bins that are excluded for FLASHIda global targeting mode
◆ excluded_peak_mzs_
std::unordered_set<double> excluded_peak_mzs_ |
|
private |
◆ filter_
std::vector<double> filter_ |
|
private |
This stores the "universal pattern".
◆ harmonic_bin_offset_matrix_
Matrix<int> harmonic_bin_offset_matrix_ |
|
private |
This stores the patterns for harmonic reduction in binned dimension.
◆ harmonic_filter_matrix_
Matrix<double> harmonic_filter_matrix_ |
|
private |
This stores the patterns for harmonic reduction.
◆ intensity_threshold_
double intensity_threshold_ |
|
private |
peak intensity threshold subject to analysis
◆ is_positive_
◆ iso_da_distance_
◆ isolation_window_size_
double isolation_window_size_ |
|
private |
default precursor isolation window size.
◆ log_mz_peaks_
◆ mass_bin_min_value_
double mass_bin_min_value_ |
|
private |
minimum mass and mz values representing the first bin of massBin and mzBin, respectively: to save memory space
◆ mass_bins_
boost::dynamic_bitset mass_bins_ |
|
private |
mass_bins_ stores the selected bins for this spectrum + overlapped spectrum (previous a few spectra).
◆ max_abs_charge_
◆ max_mass_
◆ max_mz_
◆ max_rt_
◆ min_abs_charge_
min charge and max charge subject to analysis, set by users
◆ min_iso_size_
const int min_iso_size_ = 2 |
|
staticprivate |
FLASHDeconv parameters.
minimum isotopologue count in a peak group
◆ min_isotope_cosine_
cosine threshold between observed and theoretical isotope patterns for each MS level
◆ min_mass_
mass ranges of deconvolution, set by users
◆ min_mz_
range of mz subject to analysis
◆ min_rt_
range of RT subject to analysis (in seconds)
◆ min_support_peak_count_
const int min_support_peak_count_ = 2 |
|
staticprivate |
minimum number of peaks supporting a mass minus one
◆ ms_level_
◆ mz_bin_min_value_
◆ mz_bins_
boost::dynamic_bitset mz_bins_ |
|
private |
mz_bins_ stores the binned log mz peaks
◆ previously_deconved_mass_bins_for_dummy_
boost::dynamic_bitset previously_deconved_mass_bins_for_dummy_ |
|
private |
mass bins that are previsouly deconvolved and excluded for dummy mass generation
◆ previously_deconved_mono_masses_for_dummy_
std::vector<double> previously_deconved_mono_masses_for_dummy_ |
|
private |
◆ target_dspec_for_dummy_calcualtion_
the deconvolved spectrum from normal run. This is used when dummy masses are generated.
◆ target_dummy_type_
◆ target_mass_bins_
boost::dynamic_bitset target_mass_bins_ |
|
private |
mass bins that are targeted for FLASHIda global targeting mode
◆ target_mono_masses_
std::vector<double> target_mono_masses_ |
|
private |
◆ tolerance_
tolerance in ppm for each MS level