OpenMS
Use the following code conventions when contributing to OpenMS.
The conventions are checked automatically by cpplint (/src/tests/coding/cpplint.py) when the ENABLE_STYLE_TESTING flag is set to 'ON' during the CMake configuration.
When developing in an IDE that supports ClangFormat, you can use our style preset from the source tree (OpenMS/.clang-format). For CLion, you can import it by selecting Preferences > Code Style > Manage. VS2017 and later also support ClangFormat natively (press Ctrl-K, Ctrl-D).
The following section focuses on formatting and style.
Use two spaces to indent. Tab characters are not allowed.
Use spaces after built-in keywords (e.g. for, if, else, etc.), and before and after binary mathematical operators, e.g. 1 + 3 not 1+3.
Unix line endings are used on each platform (see <OpenMS>/.gitattributes) to enable using a single source tree on a network drive or WSL with multi-OS clients.
Matching pairs of opening and closing curly braces should be set to the same column. See the following example:
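The brace layout described above can be sketched as follows. The function name countBelow and its parameters are hypothetical and serve only to show that each opening brace sits in the same column as its closing brace, with two-space indentation.

```cpp
#include <vector>

// Counts the values that lie below a threshold.
// Hypothetical helper: only the brace layout matters here.
int countBelow(const std::vector<int>& values, int threshold)
{
  int count = 0;
  for (int value : values)
  {
    if (value < threshold)
    {
      ++count;
    }
  }
  return count;
}
```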
The main reason for this rule is to avoid unbraced single-statement constructions (for example, an if followed by a lone indented statement) that might later be extended with a second statement and thereby introduce a bug.
Thus, use braces around a block even for a single line.
Single-line constructs are allowed for trivial cases.
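As an illustration of the brace rule and of the single-line exception, here is a minimal sketch. The function sumClamped is hypothetical; the comment inside it describes the kind of unbraced construct the rule is meant to prevent.

```cpp
#include <vector>

// Sums the positive entries of a vector, limiting each entry to `limit`.
int sumClamped(const std::vector<int>& values, int limit)
{
  int sum = 0;
  for (int value : values)
  {
    // Allowed: a trivial construct on a single line.
    if (value <= 0) continue;

    // Risky (shown as a comment only): an unbraced body on its own line,
    //   if (value > limit)
    //     value = limit;
    // A second indented statement added below it later would run
    // unconditionally. Braces keep the block boundaries explicit:
    if (value > limit)
    {
      value = limit;
    }
    sum += value;
  }
  return sum;
}
```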
The following section describes the naming conventions followed by OpenMS developers.
Header files and source files should have the same name as the classes they contain. Source files end in .cpp, while header files end in .h. File names should be capitalised exactly as the class they contain (see below). Each header/source file should contain one class only, although exceptions are possible for light-weight classes.
The usage of underscores in names has two different meanings: A trailing "_" at the end indicates that something is protected or private to a class (a data member or a member function). Apart from that, different parts of a name are sometimes separated by an underscore, and sometimes separated by capital letters.
Class names and type names always start with a capital letter. Different parts of the name are separated by capital letters at the beginning of the word. No underscores are allowed in type names and class names, except for the names of protected types and classes in classes, which are suffixed by an underscore. The same conventions apply for namespaces.
Here is an example of some classes written using the conventions described above:
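A minimal sketch of these naming rules follows. All names (SignalTools, BaselineFilter, WindowSize, FilterSettings_) are hypothetical and not taken from the OpenMS code base.

```cpp
namespace SignalTools // namespace: same rules as class names
{
  class BaselineFilter // class: parts separated by capital letters
  {
  public:
    // Type name: starts with a capital letter, no underscores.
    typedef unsigned int WindowSize;

    WindowSize getWindowSize() const
    {
      return settings_.window_size;
    }

  protected:
    // Protected nested type: suffixed by an underscore.
    struct FilterSettings_
    {
      WindowSize window_size = 5;
    };

    FilterSettings_ settings_;
  };
}
```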
Function names (including class method names) always start with a lower case letter. Parts of the name are separated using capital letters (as are types and class names). They should be comprehensible, but as short as possible. The same variable names must be used in the declaration and in the definition. Arguments that are part of the interface (e.g. by inheritance), but actually not used in the implementation of a function have to be commented out - this avoids compiler warnings about unused variables. The argument of void functions (empty argument list) must be omitted in both the declaration and the definition. If function arguments are pointers or references, the pointer or reference qualifier is appended to the variable type. The pointer or reference qualifier should not prefix the variable name.
Here is an example of some method names written using the conventions described above:
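The following sketch applies the function naming rules to a small, hypothetical interface (LogSink and CountingSink are invented for illustration). It shows a commented-out unused argument, an empty argument list without void, and the reference qualifier attached to the type.

```cpp
#include <string>

class LogSink
{
public:
  virtual ~LogSink() = default;

  // The reference qualifier belongs to the type: `const std::string& message`.
  virtual int writeMessage(const std::string& message, int priority) = 0;
};

class CountingSink : public LogSink
{
public:
  // `priority` is part of the interface but unused here, so its name is
  // commented out to avoid an unused-variable warning.
  int writeMessage(const std::string& message, int /* priority */) override
  {
    ++message_count_;
    return static_cast<int>(message.size());
  }

  // Empty argument list: no `void` between the parentheses.
  int getMessageCount() const
  {
    return message_count_;
  }

private:
  int message_count_ = 0;
};
```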
Variable names are written in lower case letters. Distinguished parts of the name are separated using underscores. If parts of the name are derived from common acronyms (e.g. MS) they should be in upper case. Private or protected member variables of classes are suffixed by an underscore.
Here is an example of some variable names written using the conventions described above:
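A short sketch of the variable naming rules, using a hypothetical ScanCounter class: lower case with underscores, the acronym MS kept in upper case, and a trailing underscore on private members.

```cpp
class ScanCounter
{
public:
  void addScan(int MS_level) // acronym "MS" stays upper case
  {
    if (MS_level == 1)
    {
      ++MS1_scan_count_;
    }
    ++total_scan_count_;
  }

  int getMS1ScanCount() const
  {
    return MS1_scan_count_;
  }

  int getTotalScanCount() const
  {
    return total_scan_count_;
  }

private:
  // Private members: lower case, underscores, trailing underscore.
  int MS1_scan_count_ = 0;
  int total_scan_count_ = 0;
};
```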
Enumerated values and preprocessor constants are all upper case letters. Parts of the name are separated by underscores.
Here is an example of some enumerated values and preprocessor constants written using the conventions described above:
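The sketch below illustrates the upper-case rule. EXAMPLE_MAX_CHARGE, ScanPolarity and isChargeAllowed are hypothetical names chosen for this example only.

```cpp
// Preprocessor constant: upper case, parts separated by underscores.
#define EXAMPLE_MAX_CHARGE 8

// Enumerated values follow the same rule.
enum class ScanPolarity
{
  POSITIVE,
  NEGATIVE,
  POLARITY_UNKNOWN
};

bool isChargeAllowed(int charge)
{
  return charge >= 1 && charge <= EXAMPLE_MAX_CHARGE;
}
```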
Avoid using the preprocessor. Normally, const and enum class will suffice for most cases. Avoid enum and prefer enum class.
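As a small sketch of this advice, a typed constant can replace a #define, and a scoped enum keeps its values out of the enclosing scope and does not convert implicitly to int. The names default_bin_count, Smoothing and binCountFor are hypothetical.

```cpp
// Instead of: #define DEFAULT_BIN_COUNT 64
const int default_bin_count = 64;

// Scoped enumeration: values must be written as Smoothing::NONE etc.
enum class Smoothing
{
  NONE,
  GAUSSIAN
};

int binCountFor(Smoothing mode)
{
  if (mode == Smoothing::GAUSSIAN)
  {
    return 2 * default_bin_count;
  }
  return default_bin_count;
}
```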
Parameters should consist of lower-case letters and underscores only. For numerical parameters, the range of reasonable values is given. Where applicable, units are given in the description. This rule applies to all kinds of parameter strings, both keys and string-values.
The correct capitalization of all data file extensions supported by OpenMS is documented in FileHandler::NamesOfTypes[]. The convention is to use only lowercase letters for file extensions. There are three exceptions: "ML" and "XML" are written in uppercase letters and "mzData" keeps its capital "D". Remember to keep this consistent when adding new data files or writing new TOPP tools (use the correct capitalization for file type restrictions here).
The following section outlines the class requirements with examples.
In OpenMS, every .h file must be accompanied by a .cpp file, even if it is just a "dummy". This way a global make will stumble across errors.
Here is an example of a correctly structured .h file:
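A minimal sketch of a header layout follows. It is not the official OpenMS template: the class PeakCounter, the include guard name and the maintainer name are hypothetical, and the license preamble used in the real source tree is omitted.

```cpp
// $Maintainer: Jane Doe $  (hypothetical name)

#ifndef EXAMPLE_PEAKCOUNTER_H
#define EXAMPLE_PEAKCOUNTER_H

#include <cstddef>

namespace OpenMS
{
  /// Counts peaks. Declarations only; definitions live in PeakCounter.cpp.
  class PeakCounter
  {
  public:
    PeakCounter();

    /// Registers one more peak.
    void addPeak();

    /// Returns the number of registered peaks.
    std::size_t getCount() const;

  private:
    std::size_t count_;
  };
}

#endif // EXAMPLE_PEAKCOUNTER_H
```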
Here is an example of a correctly structured .cpp file:
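A matching source file could look like the sketch below. The class PeakCounter and the maintainer name are hypothetical. In a real source tree the first section would simply be an #include of the matching header; the declaration is written out here only so that the listing compiles on its own.

```cpp
// $Maintainer: Jane Doe $  (hypothetical name)

// Stand-in for including the matching header PeakCounter.h.
#ifndef EXAMPLE_PEAKCOUNTER_H
#define EXAMPLE_PEAKCOUNTER_H
#include <cstddef>
namespace OpenMS
{
  class PeakCounter
  {
  public:
    PeakCounter();
    void addPeak();
    std::size_t getCount() const;

  private:
    std::size_t count_;
  };
}
#endif // EXAMPLE_PEAKCOUNTER_H

namespace OpenMS
{
  PeakCounter::PeakCounter() :
    count_(0)
  {
  }

  void PeakCounter::addPeak()
  {
    ++count_;
  }

  std::size_t PeakCounter::getCount() const
  {
    return count_;
  }
}
```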
Remember that the definition of a class or function template has to be known at its point of instantiation. Therefore, the implementation of a template is normally contained in the .h file. For template classes, declaration and definition are given in the same file. Things get more complicated when certain design patterns (e.g., the factory pattern) are used which lead to "circular dependencies". This is only a dependency of names, but it has to be resolved by separating declarations from definitions, at least for some of the member functions. In this case, a .h file can be written that contains most of the definitions as well as the declarations of the peculiar functions. Their definition is deferred to the _impl.h file ("impl" for "implementation"). The _impl.h file is included only if the peculiar member functions have to be instantiated. Otherwise the .h file should be sufficient. No .h file should include an _impl.h file.
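The split between .h and _impl.h files can be sketched in a single listing, with comments marking where each file would begin. Widget and Registry are hypothetical templates that refer to each other by name; only the "peculiar" member function registerSelf needs the full definition of the other class, so its definition is deferred.

```cpp
// ---- Widget.h (hypothetical) ----
template <typename T>
class Registry; // forward declaration resolves the name dependency

template <typename T>
class Widget
{
public:
  // Peculiar function: declared here, defined in Widget_impl.h,
  // because its body needs the complete Registry<T>.
  void registerSelf(Registry<T>& registry);
};

// ---- Registry.h (hypothetical) ----
template <typename T>
class Registry
{
public:
  void add(const Widget<T>& /* widget */)
  {
    ++count_;
  }

  int size() const
  {
    return count_;
  }

private:
  int count_ = 0;
};

// ---- Widget_impl.h (hypothetical), included only where
// ---- registerSelf is actually instantiated ----
template <typename T>
void Widget<T>::registerSelf(Registry<T>& registry)
{
  registry.add(*this);
}
```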
The following section discusses rules around the use of primitives, namespaces, accessors to members and the STL.
OpenMS uses its own type names for primitive types. Use only the types defined in OpenMS/include/OpenMS/CONCEPT/Types.h.
The main OpenMS classes are implemented in the namespace OpenMS. Auxiliary classes are implemented in OpenMS::Internal. There are some other namespaces, e.g. for constants and exceptions.
Importing a whole namespace in a header file (e.g. using namespace std;) is forbidden.
Using-declarations for C++ standard library datatypes (e.g. using std::vector;) are forbidden in header files as well.
Both could lead to name clashes when OpenMS is used together with other libraries. In source files (.cpp), however, they are allowed.
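A header-style sketch of this rule follows: the forbidden lines appear only as comments, and standard library names are written fully qualified instead. The namespace HeaderSketch and the function splitOnComma are hypothetical.

```cpp
#include <string>
#include <vector>

// Forbidden in a header file:
//   using namespace std;
//   using std::vector;

namespace HeaderSketch
{
  // Fully qualified standard library names keep the header free of
  // names that could clash with other libraries.
  inline std::vector<std::string> splitOnComma(const std::string& text)
  {
    std::vector<std::string> parts;
    std::string current;
    for (char c : text)
    {
      if (c == ',')
      {
        parts.push_back(current);
        current.clear();
      }
      else
      {
        current += c;
      }
    }
    parts.push_back(current);
    return parts;
  }
}
```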
In general, follow the Rule-of-0 or Rule-of-6, when implementing any of the default operations, sometimes called special functions, i.e. constructor, destructor, copy assignment operator etc.
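The two rules can be sketched with two hypothetical classes. SampleAnnotation declares none of the special functions because its members manage themselves (Rule-of-0). FilterBase needs a virtual destructor and therefore declares all six explicitly (Rule-of-6), here simply as defaulted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Rule-of-0: no special function is declared.
class SampleAnnotation
{
public:
  void addTag(const std::string& tag)
  {
    tags_.push_back(tag);
  }

  std::size_t getTagCount() const
  {
    return tags_.size();
  }

private:
  std::vector<std::string> tags_;
};

// Rule-of-6: one special function is needed, so all six are declared.
class FilterBase
{
public:
  FilterBase() = default;
  FilterBase(const FilterBase&) = default;
  FilterBase(FilterBase&&) = default;
  FilterBase& operator=(const FilterBase&) = default;
  FilterBase& operator=(FilterBase&&) = default;
  virtual ~FilterBase() = default;

  void setName(const std::string& name)
  {
    name_ = name;
  }

  const std::string& getName() const
  {
    return name_;
  }

private:
  std::string name_;
};
```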
Accessors to protected or private members of a class are implemented as a pair of get-method and set-method. This is necessary as accessors that return mutable references to a member cannot be wrapped with Python.
For members that are too large to be read with the get-method or modified and written back with the set-method, an additional non-const get-method returning a reference can be implemented.
For primitive types, using a get-method which returns a reference is strictly forbidden. For more complex types it should be present only when necessary.
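The accessor rules can be sketched with a hypothetical class ExampleSpectrum: a get/set pair by value for a primitive member, and a get/set pair plus an additional non-const reference getter for a potentially large member.

```cpp
#include <vector>

class ExampleSpectrum
{
public:
  // Primitive member: get by value, never by mutable reference.
  double getRT() const
  {
    return rt_;
  }

  void setRT(double rt)
  {
    rt_ = rt;
  }

  // Complex member: regular get/set pair ...
  const std::vector<double>& getIntensities() const
  {
    return intensities_;
  }

  void setIntensities(const std::vector<double>& intensities)
  {
    intensities_ = intensities;
  }

  // ... plus a non-const getter, added only because copying the
  // member in and out could be too expensive.
  std::vector<double>& getIntensities()
  {
    return intensities_;
  }

private:
  double rt_ = 0.0;
  std::vector<double> intensities_;
};
```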
The following section describes how to handle exceptions and create exception classes.
No OpenMS program should dump a core if an error occurs. Instead, it should attempt to die as gracefully as possible. Furthermore, as OpenMS is a framework rather than an application, it should give the programmer ways to catch and correct errors. The recommended procedure to handle errors, even fatal ones, is to throw an exception. An uncaught exception will result in a call to abort, thereby terminating the program.
To simplify debugging, use the following throw directive for exceptions:
__FILE__ and __LINE__ are standard-defined preprocessor macros. The macro OPENMS_PRETTY_FUNCTION wraps Boost's version of a platform-independent __PRETTY_FUNCTION__ macro, which works similarly to a char* and contains the type signature of the function as well as its bare name if the GNU compiler is being used. It might differ on other platforms. Exception::Base provides methods (getFile, getLine, getFunction) that allow the localisation of the exception's cause.
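The idea of recording the throw site can be sketched without the OpenMS headers. Everything in the namespace ThrowSketch is a hypothetical stand-in: its Base class only mimics the getFile, getLine and getFunction methods named above, and the standard identifier __func__ takes the place of OPENMS_PRETTY_FUNCTION.

```cpp
#include <exception>
#include <string>

namespace ThrowSketch
{
  // Stand-in for an exception base class that stores the throw site.
  class Base : public std::exception
  {
  public:
    Base(const char* file, int line, const char* function) :
      file_(file), line_(line), function_(function)
    {
    }

    const char* getFile() const noexcept { return file_; }
    int getLine() const noexcept { return line_; }
    const char* getFunction() const noexcept { return function_; }
    const char* what() const noexcept override { return "ThrowSketch::Base"; }

  private:
    const char* file_;
    int line_;
    const char* function_;
  };

  int dereference(const int* value)
  {
    if (value == nullptr)
    {
      // Assumption: __func__ stands in for OPENMS_PRETTY_FUNCTION.
      throw Base(__FILE__, __LINE__, __func__);
    }
    return *value;
  }

  // Returns the function name recorded at the throw site,
  // or an empty string if nothing was thrown.
  std::string locateFailure(const int* value)
  {
    try
    {
      dereference(value);
    }
    catch (const Base& e)
    {
      return e.getFunction();
    }
    return "";
  }
}
```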
The standard way to catch an exception should be by reference (and not by value), as shown below:
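A minimal sketch of catching by reference, using standard library exceptions rather than OpenMS classes. The function describeAccess is hypothetical. Catching by const reference preserves the dynamic type of the exception, so what() returns the message of the derived class instead of a sliced copy of the base.

```cpp
#include <stdexcept>
#include <string>

// Returns "ok" for a valid index, otherwise the exception message.
std::string describeAccess(int index, int size)
{
  try
  {
    if (index < 0 || index >= size)
    {
      throw std::out_of_range("index out of range");
    }
  }
  catch (const std::exception& e) // by reference, not by value
  {
    return e.what();
  }
  return "ok";
}
```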
Potential exceptions must be documented to tell the user which exceptions can be caught.
All exceptions used in OpenMS are derived from Exception::Base defined in CONCEPT/Exception.h. A default constructor should not be implemented for these exceptions. Instead, the constructor of all derived exceptions should have the following signature:
Additional arguments are possible but should provide default values (see IndexOverflow for an example).
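The sketch below shows one way such a constructor could look. It is an assumption, not the actual declaration in CONCEPT/Exception.h: the namespace CtorSketch, its Base class and the parameter types are hypothetical, and only the idea of file, line and function arguments followed by defaulted extras is taken from the text above.

```cpp
#include <cstddef>
#include <exception>
#include <string>

namespace CtorSketch
{
  // Hypothetical stand-in for the exception base class.
  class Base : public std::exception
  {
  public:
    Base(const char* file, int line, const char* function, const std::string& message) :
      file_(file), line_(line), function_(function), message_(message)
    {
    }

    const char* what() const noexcept override
    {
      return message_.c_str();
    }

  private:
    const char* file_;
    int line_;
    const char* function_;
    std::string message_;
  };

  // No default constructor; the additional arguments have default values.
  class IndexOverflow : public Base
  {
  public:
    IndexOverflow(const char* file, int line, const char* function,
                  long index = 0, std::size_t size = 0) :
      Base(file, line, function,
           "index " + std::to_string(index) + " exceeds size " + std::to_string(size))
    {
    }
  };
}
```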
C++ classes and their methods can be exposed to Python via pyOpenMS. If you are interested in exposing your algorithms to Python, view the pyOpenMS documentation for the coding conventions and examples.
To generate UML diagrams, use yEd and export the diagrams in PNG format. Do not forget to also save the corresponding .yed file.
Each OpenMS class has to be documented using Doxygen. The documentation is inserted in Doxygen format in the header file where the class is defined. Documentation includes the description of the class, each method, type declaration, enum declaration, each constant, and member variable.
Longer pieces of documentation start with a @brief description, followed by an empty line and a detailed description. The empty line is needed to separate the brief from the detailed description.
Descriptions of classes always have a brief section.
Use the doxygen style of the following example for OpenMS:
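A sketch of a documented class in this spirit is given below. It is not the official OpenMS template: the class IntensityClamp, the group name ExampleGroup and the name in the @todo entry are hypothetical, while the commands themselves (@brief, @param, @return, @ingroup, @todo) are standard Doxygen.

```cpp
/**
  @brief Limits intensity values to a fixed range.

  Values below the lower bound are raised to it, and values above
  the upper bound are lowered to it.

  @todo Support open-ended ranges (Jane Doe)

  @ingroup ExampleGroup
*/
class IntensityClamp
{
public:
  /**
    @brief Constructs a clamp for the range [min, max].

    @param min Lower bound of the allowed range
    @param max Upper bound of the allowed range
  */
  IntensityClamp(double min, double max) :
    min_(min), max_(max)
  {
  }

  /**
    @brief Clamps a single value.

    @param value The intensity to clamp
    @return The value limited to the configured range
  */
  double clamp(double value) const
  {
    if (value < min_)
    {
      return min_;
    }
    if (value > max_)
    {
      return max_;
    }
    return value;
  }

private:
  /// Lower bound of the allowed range
  double min_;
  /// Upper bound of the allowed range
  double max_;
};
```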
The defgroup command indicates that a comment block contains documentation for a group of classes, files or namespaces. This can be used to categorize classes, files or namespaces, and document those categories. You can also use groups as members of other groups, thus building a hierarchy of groups. By using the ingroup command, a comment block of a class, file or namespace will be added to the group or groups.
The groups (or modules as doxygen calls them) defined by the ingroup command should contain only the classes of special interest to the OpenMS user. Helper classes and such must be omitted.
Documentation that does not belong to a specific .cpp or .h file can be written into a separate Doxygen file (with the ending ".doxygen"). This file will also be parsed by Doxygen.
Open tasks are noted in the documentation of a header or a group using the @todo command. The ToDo list is then shown in the doxygen menu under 'Related pages'. Each ToDo should be followed by a name in parentheses to indicate who is going to handle it.
You can also use these commands:
The code for each .cpp file has to be commented. Each piece of code in OpenMS has to contain at least 5% of comments. The use of // line comments instead of /* */ block comments is recommended to avoid problems arising from nested comments. Comments should be written in plain English and describe the functionality of the next few lines.
Instructive programming examples are provided in the doc/code_examples directory. See OpenMS Developer Guide.
View the How To Write Tests guidelines to learn how to write tests.
OpenMS uses git to manage different versions of the source files. For easier identification of the responsible person, each OpenMS file contains the $Maintainer:$ string in the preamble.
Examples of .h and .cpp files have been given above. In non-C++ files (CMake files, (La)TeX files, etc.) the C++ comments are replaced by the respective comment characters (e.g. # for CMake files, % for (La)TeX).