OpenMS
ProtXMLFile Class Reference

Used to load (storing not supported, yet) ProtXML files. More...

#include <OpenMS/FORMAT/ProtXMLFile.h>


Public Types

typedef ProteinIdentification::ProteinGroup ProteinGroup
 A protein group (set of indices into ProteinIdentification) More...
 

Public Member Functions

 ProtXMLFile ()
 Constructor. More...
 
void load (const String &filename, ProteinIdentification &protein_ids, PeptideIdentification &peptide_ids)
 Loads the identifications of a ProtXML file without identifier. More...
 
void store (const String &filename, const ProteinIdentification &protein_ids, const PeptideIdentification &peptide_ids, const String &document_id="")
 [not implemented yet!] Stores the data in a ProtXML file. More...
 
- Public Member Functions inherited from XMLFile
 XMLFile ()
 Default constructor. More...
 
 XMLFile (const String &schema_location, const String &version)
 Constructor that sets the schema location. More...
 
virtual ~XMLFile ()
 Destructor. More...
 
bool isValid (const String &filename, std::ostream &os)
 Checks if a file validates against the XML schema. More...
 
const String & getVersion () const
 return the version of the schema More...
 

Protected Member Functions

void resetMembers_ ()
 reset members after reading/writing More...
 
void endElement (const XMLCh *const, const XMLCh *const, const XMLCh *const qname) override
 Docu in base class. More...
 
void startElement (const XMLCh *const, const XMLCh *const, const XMLCh *const qname, const xercesc::Attributes &attributes) override
 Docu in base class. More...
 
void registerProtein_ (const String &protein_name)
 Creates a new protein entry (if not already present) and appends it to the current group. More...
 
void matchModification_ (const double mass, const String &origin, String &modification_description)
 find modification name given a modified AA mass More...
 
- Protected Member Functions inherited from XMLHandler
void writeUserParam_ (const String &tag_name, std::ostream &os, const MetaInfoInterface &meta, UInt indent) const
 Writes the content of MetaInfoInterface to the file. More...
 
Int asInt_ (const String &in) const
 Conversion of a String to an integer value. More...
 
Int asInt_ (const XMLCh *in) const
 Conversion of a Xerces string to an integer value. More...
 
UInt asUInt_ (const String &in) const
 Conversion of a String to an unsigned integer value. More...
 
double asDouble_ (const String &in) const
 Conversion of a String to a double value. More...
 
float asFloat_ (const String &in) const
 Conversion of a String to a float value. More...
 
bool asBool_ (const String &in) const
 Conversion of a string to a boolean value. More...
 
DateTime asDateTime_ (String date_string) const
 Conversion of a xs:datetime string to a DateTime value. More...
 
bool equal_ (const XMLCh *a, const XMLCh *b) const
 Returns if two Xerces strings are equal. More...
 
SignedSize cvStringToEnum_ (const Size section, const String &term, const char *message, const SignedSize result_on_error=0)
 
String attributeAsString_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a String. More...
 
Int attributeAsInt_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to an Int. More...
 
double attributeAsDouble_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a double. More...
 
DoubleList attributeAsDoubleList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a DoubleList. More...
 
IntList attributeAsIntList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to an IntList. More...
 
StringList attributeAsStringList_ (const xercesc::Attributes &a, const char *name) const
 Converts an attribute to a StringList. More...
 
bool optionalAttributeAsString_ (String &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the String value if the attribute is present. More...
 
bool optionalAttributeAsInt_ (Int &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the Int value if the attribute is present. More...
 
bool optionalAttributeAsUInt_ (UInt &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the UInt value if the attribute is present. More...
 
bool optionalAttributeAsDouble_ (double &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the double value if the attribute is present. More...
 
bool optionalAttributeAsDoubleList_ (DoubleList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the DoubleList value if the attribute is present. More...
 
bool optionalAttributeAsStringList_ (StringList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the StringList value if the attribute is present. More...
 
bool optionalAttributeAsIntList_ (IntList &value, const xercesc::Attributes &a, const char *name) const
 Assigns the attribute content to the IntList value if the attribute is present. More...
 
String attributeAsString_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a String. More...
 
Int attributeAsInt_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to an Int. More...
 
double attributeAsDouble_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a double. More...
 
DoubleList attributeAsDoubleList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a DoubleList. More...
 
IntList attributeAsIntList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to an IntList. More...
 
StringList attributeAsStringList_ (const xercesc::Attributes &a, const XMLCh *name) const
 Converts an attribute to a StringList. More...
 
bool optionalAttributeAsString_ (String &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the String value if the attribute is present. More...
 
bool optionalAttributeAsInt_ (Int &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the Int value if the attribute is present. More...
 
bool optionalAttributeAsUInt_ (UInt &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the UInt value if the attribute is present. More...
 
bool optionalAttributeAsDouble_ (double &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the double value if the attribute is present. More...
 
bool optionalAttributeAsDoubleList_ (DoubleList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the DoubleList value if the attribute is present. More...
 
bool optionalAttributeAsIntList_ (IntList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the IntList value if the attribute is present. More...
 
bool optionalAttributeAsStringList_ (StringList &value, const xercesc::Attributes &a, const XMLCh *name) const
 Assigns the attribute content to the StringList value if the attribute is present. More...
 
 XMLHandler (const String &filename, const String &version)
 Default constructor. More...
 
 ~XMLHandler () override
 Destructor. More...
 
void reset ()
 Release internal memory used for parsing (call. More...
 
void fatalError (const xercesc::SAXParseException &exception) override
 
void error (const xercesc::SAXParseException &exception) override
 
void warning (const xercesc::SAXParseException &exception) override
 
void fatalError (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Fatal error handler. Throws a ParseError exception. More...
 
void error (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Error handler for recoverable errors. More...
 
void warning (ActionMode mode, const String &msg, UInt line=0, UInt column=0) const
 Warning handler. More...
 
void characters (const XMLCh *const chars, const XMLSize_t length) override
 Parsing method for character data. More...
 
void startElement (const XMLCh *const uri, const XMLCh *const localname, const XMLCh *const qname, const xercesc::Attributes &attrs) override
 Parsing method for opening tags. More...
 
void endElement (const XMLCh *const uri, const XMLCh *const localname, const XMLCh *const qname) override
 Parsing method for closing tags. More...
 
virtual void writeTo (std::ostream &)
 Writes the contents to a stream. More...
 
virtual LOADDETAIL getLoadDetail () const
 Handlers which support partial loading implement this method. More...
 
virtual void setLoadDetail (const LOADDETAIL d)
 Handlers which support partial loading implement this method. More...
 
DataValue cvParamToValue (const ControlledVocabulary &cv, const String &parent_tag, const String &accession, const String &name, const String &value, const String &unit_accession) const
 Convert the value of a <cvParam value=.> (as commonly found in PSI schemata) to the DataValue with the correct type (e.g. int) according to the type stored in the CV (usually PSI-MS CV), as well as set its unit. More...
 
DataValue cvParamToValue (const ControlledVocabulary &cv, const CVTerm &raw_term) const
 Convert the value of a <cvParam value=.> (as commonly found in PSI schemata) to the DataValue with the correct type (e.g. int) according to the type stored in the CV (usually PSI-MS CV), as well as set its unit. More...
 
void checkUniqueIdentifiers_ (const std::vector< ProteinIdentification > &prot_ids) const
 
- Protected Member Functions inherited from XMLFile
void parse_ (const String &filename, XMLHandler *handler)
 Parses the XML file given by filename using the handler given by handler. More...
 
void parseBuffer_ (const std::string &buffer, XMLHandler *handler)
 Parses the in-memory buffer given by buffer using the handler given by handler. More...
 
void save_ (const String &filename, XMLHandler *handler) const
 Stores the contents of the XML handler given by handler in the file given by filename. More...
 
void enforceEncoding_ (const String &encoding)
 

Protected Attributes

members for loading data
ProteinIdentification * prot_id_
 Pointer to protein identification. More...
 
PeptideIdentification * pep_id_
 Pointer to peptide identification. More...
 
PeptideHit * pep_hit_
 Temporary peptide hit. More...
 
ProteinGroup protein_group_
 protein group More...
 
- Protected Attributes inherited from XMLHandler
String file_
 File name. More...
 
String version_
 Schema version. More...
 
StringManager sm_
 Helper class for string conversion. More...
 
std::vector< String > open_tags_
 Stack of open XML tags. More...
 
LOADDETAIL load_detail_
 parse only until total number of scans and chroms have been determined from attributes More...
 
std::vector< std::vector< String > > cv_terms_
 Array of CV term lists (one sublist denotes one term and its children). More...
 
- Protected Attributes inherited from XMLFile
String schema_location_
 XML schema file location. More...
 
String schema_version_
 Version string. More...
 
String enforced_encoding_
 Encoding string that replaces the encoding (system dependent or specified in the XML). Disabled if empty. Used as a workaround for XTandem output xml. More...
 

Additional Inherited Members

- Protected Types inherited from XMLHandler
enum  ActionMode { LOAD , STORE }
 Action to set the current mode (for error messages) More...
 
enum  LOADDETAIL { LD_ALLDATA , LD_RAWCOUNTS , LD_COUNTS_WITHOPTIONS }
 
- Static Protected Member Functions inherited from XMLHandler
static String writeXMLEscape (const String &to_escape)
 Escapes a string and returns the escaped string. More...
 
static DataValue fromXSDString (const String &type, const String &value)
 Convert an XSD type (e.g. 'xsd:double') to a DataValue. More...
 

Detailed Description

Used to load (storing not supported, yet) ProtXML files.

This class loads documents that conform to the ProtXML schema. Storing is not supported yet.

A documented schema for this format comes with the TPP and can also be found at https://github.com/OpenMS/OpenMS/tree/develop/share/OpenMS/SCHEMAS

OpenMS reads only those parts of the protein_summary subtree that are needed to extract protein-peptide associations. All other parts are silently ignored.

For protein groups, only the "group leader" (which is annotated with a probability and coverage) receives these attributes. All other, indistinguishable proteins of the same group have only an accession and a score of -1.
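The score convention above can be used on the consumer side to tell group leaders apart from the other group members. The following is a minimal, self-contained sketch of that idea. It deliberately does not use the OpenMS classes: HitSketch, isGroupLeader and leaderAccessions are hypothetical names introduced only for this illustration, and the sketch assumes that a score of exactly -1 never occurs for a real group leader.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a loaded protein hit (NOT the OpenMS ProteinHit class).
struct HitSketch
{
  std::string accession;
  double score;     // probability for a group leader, -1 for the other group members
  double coverage;  // only annotated for the group leader
};

// Assumption: the -1 placeholder score identifies a non-leader group member.
bool isGroupLeader(const HitSketch& hit)
{
  return hit.score != -1.0;
}

// Collects the accessions of all group leaders, preserving input order.
std::vector<std::string> leaderAccessions(const std::vector<HitSketch>& hits)
{
  std::vector<std::string> leaders;
  for (const HitSketch& hit : hits)
  {
    if (isGroupLeader(hit))
    {
      leaders.push_back(hit.accession);
    }
  }
  return leaders;
}
```

With real data, the same check would be applied to the protein hits filled by load(); which accessor functions to call for that is left out here on purpose.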

Todo:

Document which metavalues of Protein/PeptideHit are filled when reading ProtXML (Chris)

Writing of protXML is currently not supported

Member Typedef Documentation

◆ ProteinGroup

A protein group (set of indices into ProteinIdentification)

Constructor & Destructor Documentation

◆ ProtXMLFile()

Constructor.

Member Function Documentation

◆ endElement()

void endElement ( const XMLCh * const  ,
const XMLCh * const  ,
const XMLCh *const  qname 
)
override protected

Docu in base class.

◆ load()

void load ( const String & filename,
ProteinIdentification & protein_ids,
PeptideIdentification & peptide_ids 
)

Loads the identifications of a ProtXML file without identifier.

The file is parsed and the extracted identifications are stored in protein_ids and peptide_ids.

Exceptions
Exception::FileNotFound is thrown if the file could not be opened
Exception::ParseError is thrown if an error occurs during parsing

◆ matchModification_()

void matchModification_ ( const double  mass,
const String & origin,
String & modification_description 
)
protected

Finds the modification name given a modified AA mass.

Matches the mass of a modified AA to a modification in our modification database. For ambiguous modifications, the first (arbitrary) match is returned. If no modification is found, an error is issued and the returned string is empty.

Note
A duplicate of this function is also used in PepXMLFile
Parameters
mass: Modified AA's mass
origin: AA one letter code
modification_description: [out] Name of the modification, e.g. 'Carboxymethyl (C)'
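
The lookup described above (match by residue and mass, first hit wins, empty string if nothing matches) can be sketched as follows. This is a simplified illustration only, not the OpenMS implementation: the real function queries the modification database and reports an error on failure, whereas this sketch uses a small hand-written table. ModEntrySketch, matchModificationSketch, exampleTable and the 0.001 Da tolerance are all assumptions made for the example; the masses are approximate monoisotopic masses of the modified residues.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical table entry standing in for a record of the modification database.
struct ModEntrySketch
{
  char origin;           // one-letter amino acid code
  double modified_mass;  // approximate monoisotopic mass of the modified residue
  std::string name;
};

// Returns the name of the FIRST entry whose origin matches and whose mass lies
// within the tolerance; returns an empty string if no entry matches.
// The tolerance default is an assumption, not a value taken from OpenMS.
std::string matchModificationSketch(double mass, char origin,
                                    const std::vector<ModEntrySketch>& table,
                                    double tolerance = 0.001)
{
  for (const ModEntrySketch& entry : table)
  {
    if (entry.origin == origin && std::fabs(entry.modified_mass - mass) <= tolerance)
    {
      return entry.name;
    }
  }
  return "";
}

// Tiny illustrative table (residue mass plus modification delta mass).
std::vector<ModEntrySketch> exampleTable()
{
  return {
    {'C', 160.030649, "Carbamidomethyl (C)"},
    {'C', 161.014664, "Carboxymethyl (C)"},
    {'M', 147.035400, "Oxidation (M)"},
  };
}
```

Because the first match is returned, the order of the table decides the result whenever two modifications on the same residue fall within the tolerance.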

◆ registerProtein_()

void registerProtein_ ( const String & protein_name )
protected

Creates a new protein entry (if not already present) and appends it to the current group.
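
A minimal sketch of this "create if missing, then append to the current group" step is given below. It follows the wording of this page (a group as a set of indices into the protein list) and does not reproduce the OpenMS code: GroupSketch, RegistrySketch and registerProtein are hypothetical names, and protein entries are reduced to plain names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical group type: indices into the protein list, as described on this page.
struct GroupSketch
{
  std::vector<std::size_t> indices;
};

// Hypothetical stand-in for the parser state that holds proteins and the current group.
struct RegistrySketch
{
  std::vector<std::string> proteins;  // simplified protein entries (names only)
  GroupSketch current_group;

  // Reuses the existing entry for 'name' or creates a new one, then appends
  // the entry's index to the current group.
  void registerProtein(const std::string& name)
  {
    std::size_t index = proteins.size();
    for (std::size_t i = 0; i < proteins.size(); ++i)
    {
      if (proteins[i] == name)
      {
        index = i;
        break;
      }
    }
    if (index == proteins.size())
    {
      proteins.push_back(name);  // not present yet: create the entry
    }
    current_group.indices.push_back(index);
  }
};
```

Whether the real function guards against the same protein being appended twice to one group is not stated here, so the sketch makes no attempt to prevent it.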

◆ resetMembers_()

void resetMembers_ ( )
protected

Resets members after reading/writing.

◆ startElement()

void startElement ( const XMLCh * const  ,
const XMLCh * const  ,
const XMLCh *const  qname,
const xercesc::Attributes &  attributes 
)
override protected

Docu in base class.

◆ store()

void store ( const String & filename,
const ProteinIdentification & protein_ids,
const PeptideIdentification & peptide_ids,
const String & document_id = "" 
)

[not implemented yet!] Stores the data in a ProtXML file.

[not implemented yet!] The data is stored in the file 'filename'.

Exceptions
Exception::UnableToCreateFile is thrown if the file could not be created

Member Data Documentation

◆ pep_hit_

PeptideHit* pep_hit_
protected

Temporary peptide hit.

◆ pep_id_

PeptideIdentification* pep_id_
protected

Pointer to peptide identification.

◆ prot_id_

ProteinIdentification* prot_id_
protected

Pointer to protein identification.

◆ protein_group_

ProteinGroup protein_group_
protected

Protein group currently being assembled.