OpenMS
Helper class for looking up spectrum meta data. More...
#include <OpenMS/METADATA/SpectrumMetaDataLookup.h>
Classes | |
struct | SpectrumMetaData |
Meta data of a spectrum. More... | |
Public Types | |
typedef unsigned char | MetaDataFlags |
Bit mask for which meta data to extract from a spectrum. More... | |
Public Member Functions | |
SpectrumMetaDataLookup () | |
Constructor. More... | |
~SpectrumMetaDataLookup () override | |
Destructor. More... | |
template<typename SpectrumContainer > | |
void | readSpectra (const SpectrumContainer &spectra, const String &scan_regexp=default_scan_regexp, bool get_precursor_rt=false) |
Read spectra and store their meta data. More... | |
void | setSpectraDataRef (const String &spectra_data) |
Set spectra_data from the origin of the read SpectrumContainer (i.e. filename). More... | |
void | getSpectrumMetaData (Size index, SpectrumMetaData &meta) const |
Look up meta data of a spectrum. More... | |
void | getSpectrumMetaData (const String &spectrum_ref, SpectrumMetaData &meta, MetaDataFlags flags=MDF_ALL) const |
Extract meta data via a spectrum reference. More... | |
Public Member Functions inherited from SpectrumLookup | |
SpectrumLookup () | |
Constructor. More... | |
virtual | ~SpectrumLookup () |
Destructor. More... | |
bool | empty () const |
Check if any spectra were set. More... | |
template<typename SpectrumContainer > | |
void | readSpectra (const SpectrumContainer &spectra, const String &scan_regexp=default_scan_regexp) |
Read and index spectra for later look-up. More... | |
Size | findByRT (double rt) const |
Look up spectrum by retention time (RT). More... | |
Size | findByNativeID (const String &native_id) const |
Look up spectrum by native ID. More... | |
Size | findByIndex (Size index, bool count_from_one=false) const |
Look up spectrum by index (position in the vector of spectra). More... | |
Size | findByScanNumber (Size scan_number) const |
Look up spectrum by scan number (extracted from the native ID). More... | |
Size | findByReference (const String &spectrum_ref) const |
Look up spectrum by reference. More... | |
void | addReferenceFormat (const String &regexp) |
Register a possible format for a spectrum reference. More... | |
Static Public Member Functions | |
static void | getSpectrumMetaData (const MSSpectrum &spectrum, SpectrumMetaData &meta, const boost::regex &scan_regexp=boost::regex(), const std::map< Size, double > &precursor_rts=(std::map< Size, double >())) |
Extract meta data from a spectrum. More... | |
static bool | addMissingRTsToPeptideIDs (std::vector< PeptideIdentification > &peptides, const String &filename, bool stop_on_error=false) |
Add missing retention time values to peptide identifications based on raw data. More... | |
static bool | addMissingSpectrumReferences (std::vector< PeptideIdentification > &peptides, const String &filename, bool stop_on_error=false, bool override_spectra_data=false, bool override_spectra_references=false, std::vector< ProteinIdentification > proteins=std::vector< ProteinIdentification >()) |
Add missing "spectrum_reference"s to peptide identifications based on raw data. More... | |
Static Public Member Functions inherited from SpectrumLookup | |
static Int | extractScanNumber (const String &native_id, const boost::regex &scan_regexp, bool no_error=false) |
Extract the scan number from the native ID of a spectrum. More... | |
static Int | extractScanNumber (const String &native_id, const String &native_id_type_accession) |
static std::string | getRegExFromNativeID (const String &native_id) |
Determine the RegEx string to extract scan/index number from native IDs. Can be used for extractScanNumber. More... | |
static bool | isNativeID (const String &id) |
Simple prefix check if a spectrum identifier id is a nativeID from a vendor file. More... | |
Static Public Attributes | |
static const MetaDataFlags | MDF_RT = 1 |
static const MetaDataFlags | MDF_PRECURSORRT = 2 |
static const MetaDataFlags | MDF_PRECURSORMZ = 4 |
static const MetaDataFlags | MDF_PRECURSORCHARGE = 8 |
static const MetaDataFlags | MDF_MSLEVEL = 16 |
static const MetaDataFlags | MDF_SCANNUMBER = 32 |
static const MetaDataFlags | MDF_NATIVEID = 64 |
static const MetaDataFlags | MDF_ALL = 127 |
Static Public Attributes inherited from SpectrumLookup | |
static const String & | default_scan_regexp |
Default regular expression for extracting scan numbers from spectrum native IDs. More... | |
Protected Attributes | |
std::vector< SpectrumMetaData > | metadata_ |
Meta data for spectra. More... | |
String | spectra_data_ref |
Protected Attributes inherited from SpectrumLookup | |
Size | n_spectra_ |
Number of spectra. More... | |
boost::regex | scan_regexp_ |
Regular expression to extract scan numbers. More... | |
std::vector< String > | regexp_name_list_ |
Named groups in vector format. More... | |
std::map< double, Size > | rts_ |
Mapping: RT -> spectrum index. More... | |
std::map< String, Size > | ids_ |
Mapping: native ID -> spectrum index. More... | |
std::map< Size, Size > | scans_ |
Mapping: scan number -> spectrum index. More... | |
Private Member Functions | |
SpectrumMetaDataLookup (const SpectrumMetaDataLookup &) | |
Copy constructor (not implemented) More... | |
SpectrumMetaDataLookup & | operator= (const SpectrumMetaDataLookup &) |
Assignment operator (not implemented) More... | |
Additional Inherited Members | |
Public Attributes inherited from SpectrumLookup | |
std::vector< boost::regex > | reference_formats |
Possible formats of spectrum references, defined as regular expressions. More... | |
double | rt_tolerance |
Tolerance for look-up by retention time. More... | |
Protected Member Functions inherited from SpectrumLookup | |
void | addEntry_ (Size index, double rt, Int scan_number, const String &native_id) |
Add a look-up entry for a spectrum. More... | |
Size | findByRegExpMatch_ (const String &spectrum_ref, const String &regexp, const boost::smatch &match) const |
Look up spectrum by regular expression match. More... | |
void | setScanRegExp_ (const String &scan_regexp) |
Set the regular expression for extracting scan numbers from spectrum native IDs. More... | |
Static Protected Attributes inherited from SpectrumLookup | |
static const String & | regexp_names_ |
Named groups recognized in regular expression. More... | |
Helper class for looking up spectrum meta data.
The class deals with meta data of spectra and provides functions for the extraction and look-up of this data.
A common use case for this functionality is importing peptide/protein identification results from search engine-specific file formats, where some meta information may have to be looked up in the raw data (primarily retention times). One example of this is in the function addMissingRTsToPeptideIDs().
Meta data of a spectrum is stored in SpectrumMetaDataLookup::SpectrumMetaData structures. Flags (SpectrumMetaDataLookup::MetaDataFlags) control which meta data to extract or look up. Meta data can be extracted from spectra or from spectrum reference strings. The format of a spectrum reference is defined via a regular expression containing named groups (format "(?<GROUP>...)") for the different data items. The table below lists the meta data types and how they are represented.
SpectrumMetaData member | MetaDataFlags flag | Reg. exp. group | Comment (*: undefined for MS1 spectra) |
rt | MDF_RT | RT | Retention time of the spectrum |
precursor_rt | MDF_PRECURSORRT | PRECRT | Retention time of the precursor spectrum* |
precursor_mz | MDF_PRECURSORMZ | MZ | Mass-to-charge ratio of the precursor ion* |
precursor_charge | MDF_PRECURSORCHARGE | CHARGE | Charge of the precursor ion* |
ms_level | MDF_MSLEVEL | LEVEL | MS level (1 for survey scan, 2 for fragment scan, etc.) |
scan_number | MDF_SCANNUMBER | SCAN | Scan number (extracted from the native ID) |
native_id | MDF_NATIVEID | ID | Native ID of the spectrum |
- | MDF_ALL | - | Shortcut for "all flags set" |
- | - | INDEX0 | Only for look-up: index (vector pos.) counting from 0 |
- | - | INDEX1 | Only for look-up: index (vector pos.) counting from 1 |
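The flags in the table are bit values, so they can be combined and tested with bitwise operators. The following is a minimal, library-independent sketch: the constants are local stand-ins that mirror the values listed on this page, and the helpers hasAll and groupNames are hypothetical names introduced only for illustration. Real code would use the SpectrumMetaDataLookup::MDF_* constants directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Local stand-ins mirroring the flag values documented on this page.
using MetaDataFlags = unsigned char;
constexpr MetaDataFlags MDF_RT = 1;
constexpr MetaDataFlags MDF_PRECURSORRT = 2;
constexpr MetaDataFlags MDF_PRECURSORMZ = 4;
constexpr MetaDataFlags MDF_PRECURSORCHARGE = 8;
constexpr MetaDataFlags MDF_MSLEVEL = 16;
constexpr MetaDataFlags MDF_SCANNUMBER = 32;
constexpr MetaDataFlags MDF_NATIVEID = 64;
constexpr MetaDataFlags MDF_ALL = 127;

// Bitwise OR selects several meta data items at once.
constexpr MetaDataFlags kPrecursorInfo = MDF_PRECURSORMZ | MDF_PRECURSORCHARGE;

// Hypothetical helper: true if every bit of `wanted` is set in `flags`.
constexpr bool hasAll(MetaDataFlags flags, MetaDataFlags wanted)
{
  return (flags & wanted) == wanted;
}

// Hypothetical helper: regular expression group names (taken from the table)
// for all flags that are set.
std::vector<std::string> groupNames(MetaDataFlags flags)
{
  assert((flags & ~MDF_ALL) == 0); // only the seven documented bits are defined
  static const struct { MetaDataFlags flag; const char* group; } table[] = {
    {MDF_RT, "RT"}, {MDF_PRECURSORRT, "PRECRT"}, {MDF_PRECURSORMZ, "MZ"},
    {MDF_PRECURSORCHARGE, "CHARGE"}, {MDF_MSLEVEL, "LEVEL"},
    {MDF_SCANNUMBER, "SCAN"}, {MDF_NATIVEID, "ID"}};
  std::vector<std::string> names;
  for (const auto& entry : table)
  {
    if (flags & entry.flag) names.push_back(entry.group);
  }
  return names;
}
```

For instance, groupNames(MDF_RT | MDF_SCANNUMBER) yields the groups "RT" and "SCAN", which are the named groups a reference format would need to supply both items without a spectrum look-up.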
typedef unsigned char MetaDataFlags |
Bit mask for which meta data to extract from a spectrum.
|
inline |
Constructor.
|
inlineoverride |
Destructor.
|
private |
Copy constructor (not implemented)
|
static |
Add missing retention time values to peptide identifications based on raw data.
peptides | Peptide IDs with or without RT values |
filename | Name of a raw data file (e.g. mzML) for looking up RTs |
stop_on_error | Stop when an ID could not be matched to a spectrum (or keep going)? |
Look-up works by matching the "spectrum_reference" (meta value) of a peptide ID to the native ID of a spectrum. Only peptide IDs without RT (where PeptideIdentification::getRT() returns "NaN") are looked up; the RT is set to that of the corresponding spectrum.
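The look-up described above can be sketched without the library as follows. This is an illustration under stated assumptions, not the OpenMS implementation: PeptideId is a hypothetical, simplified stand-in for PeptideIdentification, the native-ID-to-RT table is assumed to have been built from the raw data file beforehand, and the meaning of the return value (true if every ID lacking an RT could be matched) is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a peptide identification.
struct PeptideId
{
  std::string spectrum_reference; // native ID of the source spectrum
  double rt = std::numeric_limits<double>::quiet_NaN(); // NaN means "no RT"
};

// Fills in missing RTs from a native ID -> RT table.
// Assumed return value: true if every ID without RT could be matched.
bool addMissingRTs(std::vector<PeptideId>& peptides,
                   const std::map<std::string, double>& rt_by_native_id,
                   bool stop_on_error = false)
{
  bool success = true;
  for (PeptideId& pep : peptides)
  {
    if (!std::isnan(pep.rt)) continue; // RT already present: leave untouched
    const auto it = rt_by_native_id.find(pep.spectrum_reference);
    if (it == rt_by_native_id.end())
    {
      success = false;
      if (stop_on_error) break; // give up at the first unmatched ID
      continue;                 // otherwise keep going
    }
    pep.rt = it->second; // take over the RT of the matching spectrum
  }
  return success;
}
```

IDs that already carry an RT are never overwritten in this sketch, which matches the statement that only peptide IDs without RT are looked up.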
|
static |
Add missing "spectrum_reference"s to peptide identifications based on raw data.
peptides | Peptide IDs with or without spectrum_reference |
filename | Name of the raw data file (e.g. mzML) from which to draw spectrum references |
stop_on_error | Stop when an ID could not be matched to a spectrum (or keep going)? |
override_spectra_data | Whether the given ProteinIdentifications should be updated with new "spectra_data" values from SpectrumMetaDataLookup |
override_spectra_references | Whether PeptideIdentifications with an existing spectrum_reference should be updated from SpectrumMetaDataLookup |
proteins | Protein IDs corresponding to the Peptide IDs |
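Look-up works by matching the RT of a peptide identification against the spectra in the given file. The native ID of the matched spectrum is annotated to the identification as its "spectrum_reference". Spectrum references are added where missing and, if override_spectra_references is set, updated where already present.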
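Matching by RT needs a tolerance, because RT values rarely agree exactly between files. The sketch below shows one way to pick the closest spectrum within a tolerance from an RT-to-index map (the same shape as the inherited rts_ member) and to annotate the matched native ID. The names IdWithRT, closestByRT and annotateReferences are hypothetical, and the tie-breaking and error handling are assumptions, not the behaviour of SpectrumLookup::findByRT().

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a peptide identification.
struct IdWithRT
{
  double rt;
  std::string spectrum_reference;
};

// Finds the spectrum whose RT is closest to `rt` and at most `tolerance` away.
// Returns true and sets `index` on success.
bool closestByRT(const std::map<double, std::size_t>& rts, double rt,
                 double tolerance, std::size_t& index)
{
  assert(tolerance >= 0.0);
  bool found = false;
  double best = 0.0;
  const auto upper = rts.lower_bound(rt); // first entry with RT >= rt
  if (upper != rts.end() && upper->first - rt <= tolerance)
  {
    best = upper->first - rt;
    index = upper->second;
    found = true;
  }
  if (upper != rts.begin())
  {
    const auto lower = std::prev(upper); // last entry with RT < rt
    const double diff = rt - lower->first;
    if (diff <= tolerance && (!found || diff < best))
    {
      index = lower->second;
      found = true;
    }
  }
  return found;
}

// Annotates the native ID of the RT-matched spectrum; returns the match count.
std::size_t annotateReferences(std::vector<IdWithRT>& peptides,
                               const std::map<double, std::size_t>& rts,
                               const std::vector<std::string>& native_ids,
                               double tolerance)
{
  std::size_t matched = 0;
  for (IdWithRT& pep : peptides)
  {
    std::size_t index = 0;
    if (closestByRT(rts, pep.rt, tolerance, index))
    {
      pep.spectrum_reference = native_ids.at(index);
      ++matched;
    }
  }
  return matched;
}
```

Checking only the two neighbours of the lower bound is sufficient because the map is ordered by RT, so the look-up stays logarithmic in the number of spectra.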
|
static |
Extract meta data from a spectrum.
spectrum | Spectrum input |
meta | Meta data output |
scan_regexp | Regular expression for extracting scan number from spectrum native ID |
precursor_rts | RTs of potential precursor spectra of different MS levels |
Scan number and precursor RT, respectively, are only extracted if scan_regexp/precursor_rts are not empty.
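To illustrate how a scan number can be pulled out of a native ID, here is a small sketch using std::regex. Two assumptions apply: std::regex does not support the named group "(?<SCAN>...)" that the Boost-based class expects, so the first unnamed capture group stands in for it, and the function name extractScan and its -1 return convention are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the scan number captured by group 1 of `scan_regexp`,
// or -1 if the expression is empty or does not match.
// Group 1 is expected to capture digits only (stand-in for "?<SCAN>").
int extractScan(const std::string& native_id, const std::string& scan_regexp)
{
  if (scan_regexp.empty()) return -1; // nothing is extracted without a pattern
  const std::regex pattern(scan_regexp);
  std::smatch match;
  if (!std::regex_search(native_id, match, pattern)) return -1;
  if (match.size() < 2 || !match[1].matched) return -1;
  return std::stoi(match[1].str());
}
```

With a Thermo-style native ID such as "controllerType=0 controllerNumber=1 scan=4711" and the pattern R"(=(\d+)$)", the trailing number after the last "=" is captured as the scan number.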
void getSpectrumMetaData | ( | const String & | spectrum_ref, |
SpectrumMetaData & | meta, | ||
MetaDataFlags | flags = MDF_ALL |
||
) | const |
Extract meta data via a spectrum reference.
spectrum_ref | Spectrum reference to parse |
meta | Meta data output |
flags | What meta data to extract |
Exception::ElementNotFound | if a spectrum look-up was necessary, but no matching spectrum was found |
This function is a combination of getSpectrumMetaData() and SpectrumLookup::findByReference(). However, the spectrum is only looked up if necessary, i.e. if the required meta data (as defined by flags) cannot be extracted from the spectrum reference itself.
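The "only if necessary" decision can be expressed as a bit operation: clear every requested flag that the reference string already supplied, and look up the spectrum only if something remains. The sketch below assumes a flag set describing which items were parsed from the reference; missingFlags and lookupNeeded are hypothetical names, not members of the class.

```cpp
#include <cassert>

using MetaDataFlags = unsigned char; // same underlying type as on this page

// Flags that were requested but could not be taken from the reference string.
constexpr MetaDataFlags missingFlags(MetaDataFlags requested,
                                     MetaDataFlags from_reference)
{
  return static_cast<MetaDataFlags>(requested & ~from_reference);
}

// A spectrum look-up is only needed if at least one requested item is missing.
constexpr bool lookupNeeded(MetaDataFlags requested,
                            MetaDataFlags from_reference)
{
  return missingFlags(requested, from_reference) != 0;
}
```

For example, if only RT and scan number are requested (1 | 32) and the reference format contains the RT and SCAN groups, nothing is missing and no look-up takes place.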
void getSpectrumMetaData | ( | Size | index, |
SpectrumMetaData & | meta | ||
) | const |
Look up meta data of a spectrum.
index | Index of the spectrum |
meta | Meta data output |
|
private |
Assignment operator (not implemented)
|
inline |
Read spectra and store their meta data.
SpectrumContainer | Spectrum container class, must support size() and operator[] |
spectra | Container of spectra |
scan_regexp | Regular expression for matching scan numbers in spectrum native IDs (must contain the named group "?<SCAN>") |
get_precursor_rt | Assign precursor retention times? (This relies on all precursor spectra being present and in the right order.) |
Exception::IllegalArgument | if scan_regexp does not contain "?<SCAN>" (and is not empty) |
References SpectrumMetaDataLookup::SpectrumMetaData::ms_level, SpectrumMetaDataLookup::SpectrumMetaData::native_id, SpectrumMetaDataLookup::SpectrumMetaData::rt, and SpectrumMetaDataLookup::SpectrumMetaData::scan_number.
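The caveat on get_precursor_rt (all precursor spectra present and in the right order) follows from how precursor RTs can be assigned in a single pass: remember the RT of the most recent spectrum at each MS level, and give each spectrum of level n > 1 the remembered RT of level n - 1. The sketch below shows this idea with a hypothetical SpectrumInfo struct; it is an illustration of the principle, not the library code.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <map>
#include <vector>

// Hypothetical, reduced view of a spectrum's meta data.
struct SpectrumInfo
{
  unsigned ms_level;
  double rt;
  double precursor_rt; // NaN if undefined (e.g. for MS1 spectra)
};

// Assigns precursor RTs in one pass over spectra sorted by acquisition order.
void assignPrecursorRTs(std::vector<SpectrumInfo>& spectra)
{
  std::map<unsigned, double> last_rt_by_level; // MS level -> most recent RT
  for (SpectrumInfo& s : spectra)
  {
    assert(s.ms_level >= 1);
    s.precursor_rt = std::numeric_limits<double>::quiet_NaN();
    if (s.ms_level > 1)
    {
      const auto it = last_rt_by_level.find(s.ms_level - 1);
      if (it != last_rt_by_level.end()) s.precursor_rt = it->second;
    }
    last_rt_by_level[s.ms_level] = s.rt; // potential precursor for level + 1
  }
}
```

If an MS1 spectrum is missing or the spectra are out of order, a fragment spectrum silently inherits the RT of an earlier, unrelated survey scan, which is why the option depends on complete, ordered input.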
|
inline |
Set spectra_data from the origin of the read SpectrumContainer (i.e. filename).
spectra_data | the name (and path) of the origin of the read SpectrumContainer |
|
static |
|
static |
|
static |
|
static |
|
static |
|
static |
|
static |
Possible meta data to extract from a spectrum. Note that the static variables need to be put on separate lines due to a compiler bug in VS.
|
static |
|
protected |
Meta data for spectra.
|
protected |