OpenMS
MaRaClusterAdapter

MaRaClusterAdapter prepares the input for MaRaCluster, calls it, and integrates its output. MaRaCluster (https://github.com/statisticalbiotechnology/maracluster) is a tool for unsupervised clustering of MS2 spectra from shotgun proteomics datasets.

Experimental classes:
This tool is a work in progress; its usage and input requirements might change.
pot. predecessor tools            →  MaRaClusterAdapter  →  pot. successor tools
any signal-/preprocessing tool                              MSGFPlusAdapter
(in mzML format)

MaRaCluster depends on the input parameter pcut, the logarithm of the p-value cutoff. The default value is -10; lower values result in smaller but purer clusters. Peptide search results can optionally be provided as idXML files; MaRaClusterAdapter then annotates each peptide identification with its cluster id as an attribute and writes the result as a merged idXML. A merged idXML containing only scan numbers, cluster ids, and file origin can also be written without a prior peptide identification search. The cluster ids assigned in the respective idXML equal the scan index in the produced clustered mzML.
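The annotation workflow described above can be sketched as a single call. This is a minimal sketch, not a tested recipe: all file names (run1.mzML, run1.idXML, clustered_ids.idXML, and so on) are hypothetical placeholders, and it assumes that MaRaClusterAdapter and maracluster are both on the PATH. Only options listed in the help output of this tool are used.

```shell
#!/bin/sh
# Sketch: cluster two runs and annotate existing identifications with cluster ids.
# All file names are hypothetical placeholders; replace them with your own data.

# The idXML files must be listed in the same order as the spectrum files.
set -- MaRaClusterAdapter \
    -in run1.mzML run2.mzML \
    -id_in run1.idXML run2.idXML \
    -out clustered_ids.idXML \
    -pcut -10.0 \
    -maracluster_executable maracluster

# Keep a printable copy of the command and show it (dry run by default).
annotate_cmd="$*"
printf '%s\n' "$annotate_cmd"

# Execute only when explicitly requested, e.g.: RUN=1 sh annotate.sh
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

Omitting -id_in from this call corresponds to the case without prior peptide identification searches, where the merged idXML carries only scan numbers, cluster ids, and file origin.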

The command line parameters of this tool are:

MaRaClusterAdapter -- Facilitate input to MaRaCluster and reintegrate.
Full documentation: http://www.openms.de/doxygen/release/3.2.0/html/TOPP_MaRaClusterAdapter.html
Version: 3.2.0 Sep 18 2024, 16:00:56, Revision: e231942
To cite OpenMS:
 + Pfeuffer, J., Bielow, C., Wein, S. et al. OpenMS 3 enables reproducible analysis of large-scale mass
   spectrometry data. Nat Methods (2024). doi:10.1038/s41592-024-02197-7.
To cite MaRaClusterAdapter:
 + The M and Käll L. MaRaCluster: A Fragment Rarity Metric for Clustering Fragment Spectra in Shotgun
   Proteomics. J Proteome Res 2016; 15: 3. doi:10.1021/acs.jproteome.5b00749.

Usage:
  MaRaClusterAdapter <options>

Options (mandatory options marked with '*'):
  -in <files>*                           Input file(s) (valid formats: 'mzML', 'mgf')
  -id_in <files>                         Optional idXML Input file(s) in the same order as mzML files - for 
                                         Maracluster Cluster annotation (valid formats: 'idXML')
  -out <file>                            Output file in idXML format (valid formats: 'idXML')
  -consensus_out <file>                  Consensus spectra in mzML format (valid formats: 'mzML')
  -output_directory <directory>          Output directory for MaRaCluster original consensus output
  -pcut <value>                          Log(p-value) cutoff, has to be < 0.0. Default: -10.0. (default:
                                         '-10.0') (max: '0.0')
  -min_cluster_size <value>              Minimum number of spectra in a cluster for consensus spectra
                                         (default: '1') (min: '1')
  -maracluster_executable <executable>*  The maracluster executable. Provide a full or relative path, or
                                         make sure it can be found in your PATH environment.
                                         
Common TOPP options:
  -ini <file>                            Use the given TOPP INI file
  -threads <n>                           Sets the number of threads allowed to be used by the TOPP tool
                                         (default: '1')
  -write_ini <file>                      Writes the default configuration file
  --help                                 Shows options
  --helphelp                             Shows all options (including advanced)
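
As an illustration of the options listed above, the following sketch requests consensus spectra instead of an annotated idXML. It is an assumption-laden example: the file and directory names (run1.mzML, consensus.mzML, maracluster_raw) are hypothetical, the values for -min_cluster_size and -threads are arbitrary choices, and the command is only printed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: write consensus spectra for clusters with at least two spectra.
# File and directory names are hypothetical placeholders.

set -- MaRaClusterAdapter \
    -in run1.mzML run2.mzML \
    -consensus_out consensus.mzML \
    -min_cluster_size 2 \
    -output_directory maracluster_raw \
    -threads 4 \
    -maracluster_executable maracluster

# Keep a printable copy of the command and show it (dry run by default).
consensus_cmd="$*"
printf '%s\n' "$consensus_cmd"

# Execute only when explicitly requested, e.g.: RUN=1 sh consensus.sh
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

With the default -min_cluster_size of 1, single-spectrum clusters would also be kept; raising it to 2 here is only meant to show how the option is passed.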

INI file documentation of this tool:

Parameters marked with '*' are required.

Name                            Default      Description
+ MaRaClusterAdapter                         Facilitate input to MaRaCluster and reintegrate.
  version                       3.2.0        Version of the tool that generated this parameters file.
  ++ 1                                       Instance '1' section for 'MaRaClusterAdapter'
     in*                        []           Input file(s) (input file; valid formats: '*.mzML', '*.mgf')
     id_in                      []           Optional idXML Input file(s) in the same order as mzML files - for Maracluster Cluster annotation (input file; valid formats: '*.idXML')
     out                                     Output file in idXML format (output file; valid formats: '*.idXML')
     consensus_out                           Consensus spectra in mzML format (output file; valid formats: '*.mzML')
     output_directory                        Output directory for MaRaCluster original consensus output
     pcut                       -10.0        log(p-value) cutoff, has to be < 0.0. Default: -10.0. (range: -∞ to 0.0)
     min_cluster_size           1            minimum number of spectra in a cluster for consensus spectra (range: 1 to ∞)
     maracluster_executable*    maracluster  The maracluster executable. Provide a full or relative path, or make sure it can be found in your PATH environment. (input file, is_executable)
     verbose                    2            Set verbosity of output: 0=no processing info, 5=all.
     precursor_tolerance        20.0         Precursor monoisotopic mass tolerance
     precursor_tolerance_units  ppm          tolerance_mass_units 0=ppm, 1=Da (valid: 'ppm', 'Da')
     log                                     Name of log file (created only when specified)
     debug                      0            Sets the debug level
     threads                    1            Sets the number of threads allowed to be used by the TOPP tool
     no_progress                false        Disables progress logging to command line (valid: 'true', 'false')
     force                      false        Overrides tool-specific checks (valid: 'true', 'false')
     test                       false        Enables the test mode (needed for internal use only) (valid: 'true', 'false')
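
The INI parameters documented here can be used through the -write_ini and -ini options from the common TOPP options. The sketch below shows one possible two-step workflow; the INI file name and the input and output file names are hypothetical, and the step of editing the INI file between the two calls is left to the reader.

```shell
#!/bin/sh
# Sketch: write the default configuration, then run the tool with it.
# maracluster_adapter.ini, run1.mzML and clusters.idXML are hypothetical names.

ini_file="maracluster_adapter.ini"

# Step 1: write the default configuration file.
step1="MaRaClusterAdapter -write_ini $ini_file"

# Step 2: after editing values such as pcut in the INI file, run with it.
step2="MaRaClusterAdapter -ini $ini_file -in run1.mzML -out clusters.idXML"

# Show both commands (dry run by default).
printf '%s\n' "$step1" "$step2"

# Execute only when explicitly requested, e.g.: RUN=1 sh ini_workflow.sh
# The variables are left unquoted on purpose so the shell splits them into
# words; this is safe here because none of the names contain spaces.
if [ "${RUN:-0}" = "1" ]; then
    $step1 && $step2
fi
```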

MaRaCluster is written by Matthew The (https://github.com/statisticalbiotechnology/maracluster). Copyright: Matthew The (matthew.the@scilifelab.se). Cite publication: The M, Käll L. MaRaCluster: A Fragment Rarity Metric for Clustering Fragment Spectra in Shotgun Proteomics. Journal of Proteome Research, 2016, 15(3), pp 713-720. DOI: 10.1021/acs.jproteome.5b00749