OpenMS
ExperimentalDesign Class Reference

Representation of an experimental design in OpenMS. Instances can be loaded with the ExperimentalDesignFile class.

#include <OpenMS/METADATA/ExperimentalDesign.h>

Classes

class  MSFileSectionEntry
 
class  SampleSection
 

Public Types

using MSFileSection = std::vector< MSFileSectionEntry >
 

Public Member Functions

 ExperimentalDesign ()=default
 
 ExperimentalDesign (const MSFileSection &msfile_section, const SampleSection &sample_section)
 
const MSFileSection & getMSFileSection () const
 
void setMSFileSection (const MSFileSection &msfile_section)
 
const ExperimentalDesign::SampleSection & getSampleSection () const
 
void setSampleSection (const SampleSection &sample_section)
 
std::map< std::vector< String >, std::set< String > > getUniqueSampleRowToSampleMapping () const
 
std::map< String, unsigned > getSampleToPrefractionationMapping () const
 
std::map< unsigned int, std::vector< String > > getFractionToMSFilesMapping () const
 return fraction index to file paths (ordered by fraction_group)
 
std::vector< std::vector< std::pair< String, unsigned > > > getConditionToPathLabelVector () const
 
std::map< std::vector< String >, std::set< unsigned > > getConditionToSampleMapping () const
 return a condition (unique combination of sample section values except replicate) to Sample index mapping
 
std::map< std::pair< String, unsigned >, unsigned > getPathLabelToPrefractionationMapping (bool use_basename_only) const
 
std::map< std::pair< String, unsigned >, unsigned > getPathLabelToConditionMapping (bool use_basename_only) const
 
std::map< String, unsigned > getSampleToConditionMapping () const
 
std::map< std::pair< String, unsigned >, unsigned > getPathLabelToSampleMapping (bool use_basename_only) const
 return <file_path, label> to sample index mapping
 
std::map< std::pair< String, unsigned >, unsigned > getPathLabelToFractionMapping (bool use_basename_only) const
 return <file_path, label> to fraction mapping
 
std::map< std::pair< String, unsigned >, unsigned > getPathLabelToFractionGroupMapping (bool use_basename_only) const
 return <file_path, label> to fraction_group mapping
 
unsigned getNumberOfSamples () const
 
unsigned getNumberOfFractions () const
 
unsigned getNumberOfLabels () const
 
unsigned getNumberOfMSFiles () const
 
unsigned getNumberOfFractionGroups () const
 
unsigned getSample (unsigned fraction_group, unsigned label=1)
 
bool isFractionated () const
 
Size filterByBasenames (const std::set< String > &bns)
 
bool sameNrOfMSFilesPerFraction () const
 

Static Public Member Functions

static ExperimentalDesign fromConsensusMap (const ConsensusMap &c)
 Extract experimental design from consensus map.
 
static ExperimentalDesign fromFeatureMap (const FeatureMap &f)
 Extract experimental design from feature map.
 
static ExperimentalDesign fromIdentifications (const std::vector< ProteinIdentification > &proteins)
 Extract experimental design from identifications.
 

Private Member Functions

std::vector< String > getFileNames_ (bool basename) const
 
std::vector< unsigned > getLabels_ () const
 
std::vector< unsigned > getFractions_ () const
 
std::map< std::pair< String, unsigned >, unsigned > pathLabelMapper_ (bool, unsigned(*f)(const ExperimentalDesign::MSFileSectionEntry &)) const
 Generic Mapper (Path, Label) -> f(row)
 
void sort_ ()
 
void isValid_ ()
 

Static Private Member Functions

template<typename T >
static void errorIfAlreadyExists (std::set< T > &container, T &item, const String &message)
 

Private Attributes

MSFileSection msfile_section_
 
SampleSection sample_section_
 

Detailed Description

Representation of an experimental design in OpenMS. Instances can be loaded with the ExperimentalDesignFile class.

Experimental designs can be provided in two formats: the one-table format and the two-table format.

The one-table format is simpler but slightly more redundant.

The one-table format consists of mandatory file columns and optional sample metadata columns (sample columns).

The mandatory file columns are Fraction_Group, Fraction, Spectra_Filepath and Label. These columns capture the mapping of quantitative values to files for label-free and multiplexed experiments and enable fraction-aware data processing.

  • Fraction_Group: a numeric identifier that indicates which fractions are grouped together. Please do NOT reuse the same identifiers across samples! Assign identifiers continuously.
  • Fraction: a numeric identifier that indicates which fraction was measured in this file. In the case of unfractionated data, the fraction identifier is 1 for all samples. Make sure the same identifiers are used across different Fraction_Groups, as this determines which fractions correspond to each other.
  • Label: a numeric identifier for the label, e.g., 1 for label-free, 1 and 2 for SILAC light/heavy, or 1-10 for TMT10Plex.
  • Spectra_Filepath: a filename or path as string representation (e.g., SILAC_file.mzML)

For processing with MSstats, the optional sample columns are typically MSstats_Condition and MSstats_BioReplicate with an additional MSstats_Mixture column in the case of TMT labeling. They capture the experimental factors and conditions associated with a sample.

  • MSstats_Condition: a string that indicates the condition (e.g., control or 1000 mMol). Will be forwarded to MSstats and can then be used to specify test contrasts.
  • MSstats_BioReplicate: a string identifier to indicate biological replication of a sample. Entries with the same Sample/Condition/BioReplicate but different Filepath (and therefore FractionGroup number) will be treated as technical replicates.
  • MSstats_Mixture: (for TMT labeling only): a string identifier to indicate the mixture of samples labeled with different TMT reagents, which can be analyzed in a single mass spectrometry experiment. E.g., same samples labeled with different TMT reagents have a different mixture identifier. Technical replicates need to have the same mixture identifier.

For details on the MSstats columns, please refer to the MSstats manual (https://www.bioconductor.org/packages/release/bioc/vignettes/MSstats/inst/doc/MSstats.html).

Fraction_Group  Fraction  Spectra_Filepath        Label  MSstats_Condition  MSstats_BioReplicate
1               1         UPS1_12500amol_R1.mzML  1      12500 amol         1
2               1         UPS1_12500amol_R2.mzML  1      12500 amol         2
3               1         UPS1_12500amol_R3.mzML  1      12500 amol         3
...             ...       ...                     ...    ...                ...
22              1         UPS1_500amol_R1.mzML    1      500 amol           1
23              1         UPS1_500amol_R2.mzML    1      500 amol           2
24              1         UPS1_500amol_R3.mzML    1      500 amol           3
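
To make the column layout concrete, the following is a minimal sketch of how one data row of the example table could be read into a plain struct. It uses only the C++ standard library. OneTableRow, splitOnTabs and parseOneTableRow are hypothetical names introduced for illustration and are not part of OpenMS, and the sketch assumes tab-separated columns. In OpenMS itself, designs are loaded with the ExperimentalDesignFile class.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of the one-table format (not an OpenMS type).
struct OneTableRow
{
  unsigned fraction_group = 0;
  unsigned fraction = 0;
  std::string spectra_filepath;
  unsigned label = 0;
  std::string msstats_condition;
  std::string msstats_bioreplicate;
};

// Split a line on tab characters (assumption: the columns are tab-separated).
inline std::vector<std::string> splitOnTabs(const std::string& line)
{
  std::vector<std::string> fields;
  std::istringstream stream(line);
  std::string field;
  while (std::getline(stream, field, '\t'))
  {
    fields.push_back(field);
  }
  return fields;
}

// Parse one data row with the six columns of the example table.
inline OneTableRow parseOneTableRow(const std::string& line)
{
  const std::vector<std::string> f = splitOnTabs(line);
  if (f.size() != 6)
  {
    throw std::invalid_argument("expected 6 tab-separated columns");
  }
  OneTableRow row;
  row.fraction_group = static_cast<unsigned>(std::stoul(f[0]));
  row.fraction = static_cast<unsigned>(std::stoul(f[1]));
  row.spectra_filepath = f[2];
  row.label = static_cast<unsigned>(std::stoul(f[3]));
  row.msstats_condition = f[4];     // kept as a string, e.g. "12500 amol"
  row.msstats_bioreplicate = f[5];  // string identifier, not a number
  return row;
}
```

Note that the condition value contains a space, which is why the sketch splits on tabs rather than on arbitrary whitespace.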

Alternatively, the experimental design can be specified with a file consisting of two tables whose headers are separated by a blank line. The two tables are the file section table and the sample section table:

  • The file section consists of the columns Fraction_Group, Fraction, Spectra_Filepath, Label and Sample.
  • The sample section consists of the columns Sample, MSstats_Condition and MSstats_BioReplicate.

The content is the same as described for the one table format, except that the additional numeric sample column allows referencing between file and sample section.

Fraction_Group  Fraction  Spectra_Filepath        Label  Sample
1               1         UPS1_12500amol_R1.mzML  1      1
2               1         UPS1_12500amol_R2.mzML  1      2
...             ...       ...                     ...    ...
22              1         UPS1_500amol_R1.mzML    1      22

Sample  MSstats_Condition  MSstats_BioReplicate
1       12500 amol         1
2       12500 amol         2
...     ...                ...
22      500 amol           3
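
The cross-reference between the two tables can be pictured as a lookup keyed by the Sample column. The sketch below is only an illustration of that idea under stated assumptions: FileSectionRow, SampleSectionRow, SampleTable and conditionOf are hypothetical names, not the OpenMS MSFileSectionEntry or SampleSection classes.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for rows of the two tables (not OpenMS types).
struct FileSectionRow
{
  unsigned fraction_group;
  unsigned fraction;
  std::string spectra_filepath;
  unsigned label;
  unsigned sample; // reference into the sample section
};

struct SampleSectionRow
{
  std::string msstats_condition;
  std::string msstats_bioreplicate;
};

// Sample section keyed by the numeric Sample column.
using SampleTable = std::map<unsigned, SampleSectionRow>;

// Resolve the condition of a file row by following its Sample reference.
inline const std::string& conditionOf(const FileSectionRow& file, const SampleTable& samples)
{
  const auto it = samples.find(file.sample);
  if (it == samples.end())
  {
    throw std::out_of_range("file row references an unknown sample");
  }
  return it->second.msstats_condition;
}
```

Storing the sample metadata once per sample, instead of once per file row, is what removes the redundancy of the one-table format.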

Member Typedef Documentation

◆ MSFileSection

using MSFileSection = std::vector<MSFileSectionEntry>

Constructor & Destructor Documentation

◆ ExperimentalDesign() [1/2]

ExperimentalDesign ( )
default

◆ ExperimentalDesign() [2/2]

ExperimentalDesign ( const MSFileSection &  msfile_section,
const SampleSection &  sample_section 
)

Member Function Documentation

◆ errorIfAlreadyExists()

static void errorIfAlreadyExists ( std::set< T > &  container,
T &  item,
const String &  message 
)
static private

◆ filterByBasenames()

Size filterByBasenames ( const std::set< String > &  bns)

filters the MSFileSection to only include a given subset of files whose basenames are given with bns

Returns
number of files that have been filtered
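
The filtering step can be sketched with the erase-remove idiom. This is an illustration only, not the OpenMS implementation: fileBasename and keepByBasenames are hypothetical helpers, the file section is reduced to a plain list of paths, and the sketch assumes that the returned count is the number of removed entries, which the description above leaves open.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Strip any directory part from a path (handles '/' and '\\' separators).
inline std::string fileBasename(const std::string& path)
{
  const std::string::size_type pos = path.find_last_of("/\\");
  return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Keep only the paths whose basename is contained in `keep`.
// Returns the number of removed entries (assumption for this sketch).
inline std::size_t keepByBasenames(std::vector<std::string>& paths,
                                   const std::set<std::string>& keep)
{
  const std::size_t before = paths.size();
  paths.erase(std::remove_if(paths.begin(), paths.end(),
                             [&keep](const std::string& p)
                             { return keep.count(fileBasename(p)) == 0; }),
              paths.end());
  return before - paths.size();
}
```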

◆ fromConsensusMap()

static ExperimentalDesign fromConsensusMap ( const ConsensusMap &  c)
static

Extract experimental design from consensus map.

◆ fromFeatureMap()

static ExperimentalDesign fromFeatureMap ( const FeatureMap &  f)
static

Extract experimental design from feature map.

◆ fromIdentifications()

static ExperimentalDesign fromIdentifications ( const std::vector< ProteinIdentification > &  proteins)
static

Extract experimental design from identifications.

◆ getConditionToPathLabelVector()

std::vector<std::vector<std::pair<String, unsigned> > > getConditionToPathLabelVector ( ) const

return vector of filepath/label combinations that share the same conditions after removing replicate columns in the sample section (e.g. for merging across replicates)

◆ getConditionToSampleMapping()

std::map<std::vector<String>, std::set<unsigned> > getConditionToSampleMapping ( ) const

return a condition (unique combination of sample section values except replicate) to Sample index mapping
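
As a rough illustration of the return type, the sketch below groups sample indices by their factor values after dropping one replicate column. It is not the OpenMS implementation: conditionToSamples is a hypothetical function, the sample section is modelled as plain rows of strings, samples are numbered from 0 by row position, and the replicate column is passed explicitly as an index because how OpenMS identifies replicate columns is not stated here.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Build a condition -> sample-index mapping from sample section rows.
// rows[i] holds the factor values of sample i; replicate_column is the index
// of the replicate column to ignore (hypothetical parameter for this sketch).
inline std::map<std::vector<std::string>, std::set<unsigned>>
conditionToSamples(const std::vector<std::vector<std::string>>& rows,
                   std::size_t replicate_column)
{
  std::map<std::vector<std::string>, std::set<unsigned>> result;
  for (std::size_t sample = 0; sample < rows.size(); ++sample)
  {
    // A condition is the row without its replicate value.
    std::vector<std::string> condition;
    for (std::size_t col = 0; col < rows[sample].size(); ++col)
    {
      if (col != replicate_column)
      {
        condition.push_back(rows[sample][col]);
      }
    }
    result[condition].insert(static_cast<unsigned>(sample));
  }
  return result;
}
```

With the rows {"12500 amol", "1"}, {"12500 amol", "2"} and {"500 amol", "1"} and replicate column 1, the first two samples end up under the same condition key.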

◆ getFileNames_()

std::vector< String > getFileNames_ ( bool  basename) const
private

◆ getFractions_()

std::vector<unsigned> getFractions_ ( ) const
private

◆ getFractionToMSFilesMapping()

std::map<unsigned int, std::vector<String> > getFractionToMSFilesMapping ( ) const

return fraction index to file paths (ordered by fraction_group)
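
One way such a mapping could be built is to sort the rows by fraction group and then append each path to the list of its fraction. The sketch below shows this with standard containers. FractionFileRow and fractionToFiles are hypothetical names, and the sketch ignores labels, so a multiplexed file listed once per label would appear more than once.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a file section row (not the OpenMS MSFileSectionEntry).
struct FractionFileRow
{
  unsigned fraction_group;
  unsigned fraction;
  std::string path;
};

// Group file paths by fraction; within a fraction, paths are ordered by fraction group.
inline std::map<unsigned, std::vector<std::string>>
fractionToFiles(std::vector<FractionFileRow> rows)
{
  // stable_sort keeps the input order of rows that share a fraction group.
  std::stable_sort(rows.begin(), rows.end(),
                   [](const FractionFileRow& a, const FractionFileRow& b)
                   { return a.fraction_group < b.fraction_group; });

  std::map<unsigned, std::vector<std::string>> result;
  for (const FractionFileRow& row : rows)
  {
    result[row.fraction].push_back(row.path);
  }
  return result;
}
```

Because the same Fraction identifiers are reused across fraction groups, position i in each list corresponds to the same fraction group when every group contains every fraction.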

◆ getLabels_()

std::vector<unsigned> getLabels_ ( ) const
private

◆ getMSFileSection()

const MSFileSection& getMSFileSection ( ) const

◆ getNumberOfFractionGroups()

unsigned getNumberOfFractionGroups ( ) const

◆ getNumberOfFractions()

unsigned getNumberOfFractions ( ) const

◆ getNumberOfLabels()

unsigned getNumberOfLabels ( ) const

◆ getNumberOfMSFiles()

unsigned getNumberOfMSFiles ( ) const

◆ getNumberOfSamples()

unsigned getNumberOfSamples ( ) const

◆ getPathLabelToConditionMapping()

std::map< std::pair< String, unsigned >, unsigned> getPathLabelToConditionMapping ( bool  use_basename_only) const

return <file_path, label> to condition mapping (a condition is a unique combination of all columns in the sample section, except for replicates).

◆ getPathLabelToFractionGroupMapping()

std::map< std::pair< String, unsigned >, unsigned> getPathLabelToFractionGroupMapping ( bool  use_basename_only) const

return <file_path, label> to fraction_group mapping

◆ getPathLabelToFractionMapping()

std::map< std::pair< String, unsigned >, unsigned> getPathLabelToFractionMapping ( bool  use_basename_only) const

return <file_path, label> to fraction mapping

◆ getPathLabelToPrefractionationMapping()

std::map< std::pair< String, unsigned >, unsigned> getPathLabelToPrefractionationMapping ( bool  use_basename_only) const

return <file_path, label> to prefractionation mapping (a prefractionation group is a unique combination of all columns in the sample section, except for replicates).

◆ getPathLabelToSampleMapping()

std::map< std::pair< String, unsigned >, unsigned> getPathLabelToSampleMapping ( bool  use_basename_only) const

return <file_path, label> to sample index mapping
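
The getPathLabelTo...Mapping functions share the key type std::pair<String, unsigned>, and use_basename_only controls whether the path part of the key keeps its directories. The sketch below illustrates that key construction with std::string in place of the OpenMS String class. basenameOf, PathLabel and makePathLabel are hypothetical helpers, not OpenMS functions.

```cpp
#include <map>
#include <string>
#include <utility>

// Strip any directory part from a path (handles '/' and '\\' separators).
inline std::string basenameOf(const std::string& path)
{
  const std::string::size_type pos = path.find_last_of("/\\");
  return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Key type of the mappings: <file_path, label>.
using PathLabel = std::pair<std::string, unsigned>;

// Build the <file_path, label> key, optionally reduced to the basename.
inline PathLabel makePathLabel(const std::string& path, unsigned label,
                               bool use_basename_only)
{
  return PathLabel(use_basename_only ? basenameOf(path) : path, label);
}
```

Keys built with use_basename_only set to true can be looked up without knowing the directory in which the spectra files were stored when the design was written.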

◆ getSample()

unsigned getSample ( unsigned  fraction_group,
unsigned  label = 1 
)

◆ getSampleSection()

const ExperimentalDesign::SampleSection& getSampleSection ( ) const

◆ getSampleToConditionMapping()

std::map<String, unsigned> getSampleToConditionMapping ( ) const

return Sample name to condition mapping (a condition is a unique combination of all columns in the sample section, except for replicates). Conditions are numbered in alphabetical order because of the key ordering of the map.

◆ getSampleToPrefractionationMapping()

std::map<String, unsigned> getSampleToPrefractionationMapping ( ) const

uses getUniqueSampleRowToSampleMapping to get the reversed map mapping sample ID to a real unique sample

◆ getUniqueSampleRowToSampleMapping()

std::map<std::vector<String>, std::set<String> > getUniqueSampleRowToSampleMapping ( ) const

returns a map from a sample section row to sample id for clustering duplicate sample rows (e.g. to find all fractions of the same "sample")

◆ isFractionated()

bool isFractionated ( ) const
Returns
whether we have a fractionated design
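
Since unfractionated data uses the fraction identifier 1 for all files, a plausible check is whether more than one distinct fraction identifier occurs. The sketch below shows that check. hasMultipleFractions is a hypothetical helper, and treating "more than one distinct fraction" as the criterion is an assumption, not a statement about the OpenMS implementation.

```cpp
#include <set>
#include <vector>

// A design counts as fractionated here if more than one distinct fraction
// identifier occurs in the Fraction column (assumption for this sketch).
inline bool hasMultipleFractions(const std::vector<unsigned>& fractions)
{
  const std::set<unsigned> distinct(fractions.begin(), fractions.end());
  return distinct.size() > 1;
}
```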

◆ isValid_()

void isValid_ ( )
private

◆ pathLabelMapper_()

std::map< std::pair< String, unsigned >, unsigned> pathLabelMapper_ ( bool  ,
unsigned(*)(const ExperimentalDesign::MSFileSectionEntry &)  f 
) const
private

Generic Mapper (Path, Label) -> f(row)

◆ sameNrOfMSFilesPerFraction()

bool sameNrOfMSFilesPerFraction ( ) const
Returns
whether all fraction groups have the same number of fractions

◆ setMSFileSection()

void setMSFileSection ( const MSFileSection &  msfile_section)

◆ setSampleSection()

void setSampleSection ( const SampleSection &  sample_section)

◆ sort_()

void sort_ ( )
private

Member Data Documentation

◆ msfile_section_

MSFileSection msfile_section_
private

◆ sample_section_

SampleSection sample_section_
private