OpenMS
Coding Conventions

Use the following code conventions when contributing to OpenMS.

OpenMS uses coding conventions that are automatically checked using cpplint (/src/tests/coding/cpplint.py). You can find a configuration file for the CLion IDE here. You can import it by selecting Preferences > Code Style > Manage.

Formatting and style

The following section focuses on formatting and style.

Indentation

Use two spaces to indent. Tabulators in string literals should use the escape character \t.

Spaces

Use spaces after built-in key words (e.g. for, if, else, etc.), and before and after binary mathematical operators.

Line endings

Native line endings should be allowed on each platform. This is desirable as Visual Studio for example will always insert CRLF even if the file is LF only, leading to a mixed line ending style for this file. Native eol style avoids this problem.

Bracket placements

Matching pairs of opening and closing curly braces should be set to the same column. See the following example:

while (continue == true)
{
for (int i = 0; i < 10; i++)
{
...
}
if (x < 7)
{
....
}
}

The main reason for this rule is to avoid constructions like:

if (isValid(a))
error = 0;
return 0;

that might later be changed to something like:

if (isValid(a))
error = 0;
return 0;

Use braces around a block even if you use only single line. Single line constructs for trivial tests like:

if (test) continue;
bool test
Status of the current subsection.

are allowed.

One exception is several if/else statements, which can be written as:

if (cond1)
{
...
}
else if (cond2)
{
...
}
else if (cond3)
{
...
}
else
{
...
}

This is safe because the first statement in each else branch is used, which is itself braced by the if branch.

Naming conventions

The following section describes the naming conventions followed by OpenMS developers.

Reserved words

Reserved words of the C++ language and symbols defined e.g. in the STL or in the standard C library must not be used as names for classes or class members. Examples include but are not limited to: set, map, exp and log.

File names

Header files and source files should have the same name as the classes they contain. Source files end in .cpp, while header files end in .h. File names should be capitalised exactly as the class they contain (see below). Each header/source file should contain one class only, although exceptions are possible for light-weight classes.

Underscores

The usage of underscores in names has two different meanings: A trailing "_" at the end indicates that something is protected or private to a class. Apart from that, different parts of a name are sometimes separated by an underscore, and sometimes separated by capital letters.

Note
According to the C++ standard, names that start with an underscore are reserved for internal purposes of the language and its standard library, so you should never use them.

Classes/Types/Namespaces

Class names and type names always start with a capital letter. Different parts of the name are separated by capital letters at the beginning of the word. No underscores are allowed in type names and class names, except for the names of protected types and classes in classes, which are suffixed by an underscore. The same conventions apply for namespaces.

Here is an example of some classes written using the conventions described above:

class Simple; //ordinary class
class SimpleThing; //ordinary class
class PDBFile; //using an abbreviation
class Buffer_; //protected or private nested class
class ForwardIteratorTraits_; //protected or private nested class

Method names

Function names (including class method names) always start with a lower case letter. Parts of the name are separated using capital letters (as are types and class names). They should be comprehensible, but as short as possible. The same variable names must be used in the declaration and in the definition. Arguments that are actually not used in the implementation of a function have to be commented out - this avoids compiler warnings. The argument of void functions (empty argument list) must be omitted in both the declaration and the definition. If function arguments are pointers or references, the pointer or reference qualifier is appended to the variable type. The pointer or reference qualifier should not prefix the variable name.

Here is an example of some method names written using the conventions described above:

void hello(); //ordinary function, no arguments
int countPeaks(PeakArray const& p); //ordinary function
bool ignore(string& /* name */); //ordinary function with an unused argument
bool isAdjacentTo(Peak const* const* const& p) const; //an ordinary function
bool doSomething_(int i, string& name); //protected or private member function

Variable names

Variable names are written in lower case letters. Distinguished parts of the name are separated using underscores. If parts of the name are derived from common acronyms (e.g. MS) they should be in upper case. Private or protected member variables of classes are suffixed by an underscore.

Here is an example of some variable names written using the conventions described above:

int simple; //ordinary variable
bool is_found; //ordinary variable
string MS_instrument; //using an abbreviation
int counter_; //protected or private member
int persistent_id_; //protected or private member

Enum and preprocessor constants

Enumerated values and preprocessor constants are all upper case letters. Parts of the name are separated by underscores.

Here is an example of some enumerated values and preprocessor constants written using the conventions described above:

#define MYCLASS_SUPPORTS_MIN_MAX 0 //preprocessor constant
enum DimensionId { DIM_MZ = 0, DIM_RT = 1 }; //enumerated values
enum DimensionId_ { MZ = 0, RT = 1 }; //enumerated values
@ RT
RT in seconds.

Avoid using the preprocessor. Normally, const and enum will suffice for most cases.

Parameters

Parameters should consist of lower-case letters and underscores only. For numerical parameters, the range of reasonable values is given. Where applicable units are given in the description. This rule applies to all kinds of parameter strings, both keys and string-values.

File extensions

The correct capitalization of all data file extensions supported by OpenMS is documented in FileHandler::NamesOfTypes[]. The convention is to use only lowercase letters for file extensions. There are three exceptions: "ML" and "XML" are written in uppercase letters and "mzData" keeps its capital "D". Remember to keep this consistent when adding new data files or writing new TOPP tools or UTILS (use correct capitalization for file type restrictions, here).

Classes

The following section outlines the class requirements with examples.

Class requirements

Each OpenMS class should provide the following interface:

class OPENMS_DLLAPI Test
{
public:
// default constructor
Test();
// copy constructor
Test(const Test& test);
// destructor
virtual ~Test();
// assignment operator
Test& operator=(const Test& test)
{
//ALWAYS CHECK FOR SELF ASSIGNMENT!
if (this == &test) return *this;
//...
return *this;
}
};

However, the following exceptions exist:

  • Default constructor: Can be omitted when the class is entirely dependent on some information or another class. In this case you should declare the default constructor private.
  • Destructor: Should be a virtual function, unless you have a very good reason why it shouldn't (e.g., saving a vtable pointer.) Otherwise a delete invocation on a pointer to a base class will call the wrong destructor. View more details here
  • Copy constructor: Can be omitted when it will not be used anyway, e.g. for classes that represent an algorithm. In this case you should declare the copy constructor private.
  • Assignment operator: Can be omitted when it will not be used anyway, e.g. for classes that represent an algorithm. In this case, you should declare the assignment operator private. The OPENMS_DLLAPI macro on the first line is required for correctly building the DLL. The correct usage of this macro is explained in the OpenMS C++ guide.

Example class files

In OpenMS, every .h file must be accompanied by a .cpp file, even if is just a ''dummy''. This way a global make will stumble across errors.

Here is an example of a correctly structured .h file:

... copyright header, not shown ...
//
// --------------------------------------------------------------------------
// $Maintainer: Heinz Erhardt $
// $Authors: Heinz Erhardt $
// --------------------------------------------------------------------------
#pragma once
#include <functional>
#include <sstream>
namespace OpenMS
{
... the actual code goes here ...
} // namespace OpenMS
#endif // OPENMS_KERNEL_DPEAK_H
Main OpenMS namespace.
Definition: FeatureDeconvolution.h:48

Here is an example of a correctly structured .cpp file:

... copyright header, not shown ...
//
// --------------------------------------------------------------------------
// $Maintainer: Heinz Erhardt $
// $Authors: Heinz Erhardt $
// --------------------------------------------------------------------------
#include <OpenMS/KERNEL/DPeak.h>
namespace OpenMS
{
... the actual code goes here ...
} // namespace OpenMS

For template classes, default instances with common template arguments should be put into the .cpp file. The variable names of these instances start with default_. Here is an example for the DPeak class:

namespace OpenMS
{
DPeak<1> default_dpeak_1;
DPeak<2> default_dpeak_2;
}

The compiler does instantiate the template and detects errors at compile time, which saves time. Otherwise the error is detected much later, when the test is compiled.

For template classes, you should include an _impl.h file. An _impl.h file for a template is like a .cpp file for an ordinary class. Remember that the definition of a class or function template has to be known at its point of instantiation. Therefore, the implementation of a template is normally contained in the .h file. For template classes, declaration and definition are given in the same file. Things get more complicated when certain design patterns (e.g., the factory pattern) are used which lead to "circular dependencies". This is only a dependency of names, but it has to be resolved by separating declarations from definitions, at least for some of the member functions. In this case, a .h file can be written that contains most of the definitions as well as the declarations of the peculiar functions. Their definition is deferred to the _impl.h file ("impl" for "implementation"). The _impl.h file is included only if the peculiar member functions have to be instantiated. Otherwise the .h file should be sufficient. No .h file should include an _impl.h file.

General rules

The following section discusses rules around the use of primitives, namespaces, accessors to members and the STL.

Primitive types

OpenMS uses its own type names for primitive types. Use only the types defined in OpenMS/include/OpenMS/CONCEPT/Types.h.

Namespaces

The main OpenMS classes are implemented in the namespace OpenMS. Auxiliary classes are implemented in OpenMS::Internal. There are some other namespaces e.g. for constants and exceptions.

Importing a whole namespace in a header files is forbidden. For example:

// sample.h
using namespace std; //< Don't do this at home!

Using the directive on C++ standard library datatypes in header files is forbidden. For example:

// sample.h
using std::vector; //< Don't do this shorthand!
void sampleFunction1(vector &v1); //< bad: Shorthand leads to a confusing datatype. Don't do this.
void sampleFunction2(std::vector &v1); //< good: Full namespacing of datatype prevents confusion. Be explicit.

This could lead to name clashes when OpenMS is used together with other libraries. In source files (.cpp) it is however allowed.

Accessors to members

Accessors to protected or private members of a class are implemented as a pair of get-method and set-method. This is necessary as accessors that return mutable references to a member cannot be wrapped with Python.

class Test
{
public:
// always implement a non-mutable get-method
UInt getMember() const
{
return member_;
}
// always implement a set-method
void setMember(UInt name)
{
member_ = name;
}
protected:
UInt member_;
};
unsigned int UInt
Unsigned integer type.
Definition: Types.h:94

For members that are too large to be read with the get-method or modified and written back with the set-method, an additional non-const get-method can be implemented.

For primitive types, a non-const get-method is strictly forbidden. For more complex types it should be present only when necessary.

class Test
{
public:
const vector<String>& getMember() const
{
return member_;
}
void setMember(const vector<String>& name)
{
member_ = name;
}
// if absolutely necessary implement a mutable get-method
vector<String>& getMember()
{
return member_;
}
protected:
vector<String> member_;
};

Use of the STL

Many OpenMS classes base on STL classes. However, only the C++ Standard Library part of the STL must be used. This means that SGI extensions like hash_set, hash_multiset, hash_map and hash_multimap are not allowed; use std::unordered_... instead.

Exceptions

The following section describes how to handle exceptions and create exception classes.

Exception handling

No OpenMS program should dump a core if an error occurs. Instead, it should attempt to die as gracefully as possible. Furthermore, as OpenMS is a framework rather than an application, it should give the programmer ways to catch and correct errors. The recommended procedure to handle - even fatal - errors is to throw an exception. Uncaught exception will result in a call to abort thereby terminating the program.

Throw exceptions

To simplify debugging, use the following throw directive for exceptions:

throw AnyException(__FILE__, __LINE__, OPENMS_PRETTY_FUNCTION);

__FILE__ and __LINE__ are standard-defined preprocessor macros. The macro OPENMS_PRETTY_FUNCTION wraps Boost's version of a platform independent __PRETTY_FUNCTION__ macro, that works similar to a char* and contains the type signature of the function as well as its bare name, if the GNU compiler is being used. It might differ on other platforms. Exception::Base provides methods (getFile, getLine, getFunction) that allow the localisation of the exception's cause.

Catch exceptions

The standard way to catch an exception should be by reference (and not by value), as shown below:

try
{
// some code which might throw
}
catch (Exception& e)
{
// Handle the exception, then possibly re-throw it:
// throw; // the modified e
}

Specify exceptions

Don't use the throw statement to specify exceptions, as this forces the compiler to check that only the specified exceptions are thrown. This check not only increases the runtime, but may prevent efficient optimization of the code by the compiler.

However, thrown exceptions must be documented to tell the user which exceptions can be caught.

void myFunction()
{
throw Foo(__FILE__, __LINE__, OPENMS_PRETTY_FUNCTION);
}

Exception classes

All exceptions used in OpenMS are derived from Exception::Base defined in CONCEPT/Exception.h. A default constructor should not be implemented for these exceptions. Instead, the constructor of all derived exceptions should have the following signature:

AnyException(const char* file, int line, const char* function[, ...]);

Additional arguments are possible but should provide default values (see IndexOverflow for an example).

How to expose classes and methods to python

C++ classes and their methods can be exposed to python via pyOpenMS. If you are interested in exposing your algorithms to python, view the pyopenms documentation for the coding conventions and examples.

Documentation

UML diagrams

To generate UML diagrams, use yEd and export the diagrams in PNG format. Do not forget to save also the corresponding .yed file.

Doxygen

Each OpenMS class has to be documented using Doxygen. The documentation is inserted in Doxygen format in the header file where the class is defined. Documentation includes the description of the class, each method, type declaration, enum declaration, each constant, and member variable.

Longer pieces of documentation start with a brief description, followed by an empty line and a detailed description. The empty line is needed to separate the brief from the detailed description.

Descriptions of classes always have a brief section.

Use the doxygen style of the following example for OpenMS:

class Test
{
public:
enum EnumType
{
int EVal1,
int EVal2
};
Test();
int dummy(int dummy_a, const char* dummy_s);
int isDummy();
void dummy2();
void dummy3();
protected:
int value;
};

The defgroup command indicates that a comment block contains documentation for a group of classes, files or namespaces. This can be used to categorize classes, files or namespaces, and document those categories. You can also use groups as members of other groups, thus building a hierarchy of groups. By using the ingroup command, a comment block of a class, file or namespace will be added to the group or groups.

The groups (or modules as doxygen calls them) defined by the ingroup command should contain only the classes of special interest to the OpenMS user. Helper classes and such must be omitted.

Documentation that does not belong to a specific .cpp or .h file can be written into a separate Doxygen file (with the ending ".doxygen"). This file will also be parsed by Doxygen.

Open tasks are noted in the documentation of a header or a group using the todo command. The ToDo list is then shown in the doxygen menu under 'Related pages'. Each ToDo should be followed by a name in parentheses to indicated who is going to handle it.

You can also use these commands:

@todo Necessary todo for the the next release. Should be done as soon as possible.
Please add the name of the responsible person in parentheses!
@improvement Possible improvement, but not really necessary.
Please add the name of the responsible person in parentheses!
@deprecated Deprecated class, that must be removed in the next release.
@experimental Experimental class, that will perhaps not make it to the library.
@bug Description of a bug in the class/method.
Please add the name of the finder in parentheses!
Doxygen is not hard to learn, have a look at the manual :-)

Add comments to code

The code for each .cpp file has to be commented. Each piece of code in OpenMS has to contain at least 5% of comments. The use of:

// Comment text

instead of:

/* Comment text */

is recommended to avoid problems arising from nested comments. Comments should be written in plain english and describe the functionality of the next few lines.

Examples

Instructive programming examples are provided in the doc/code_examples directory

Testing

View the How To Write Tests guidelines to learn how to write tests.

Revision control

OpenMS uses git to manage different versions of the source files. For easier identification of the responsible person each OpenMS file contains the $Maintainer:$ string in the preamble.

Examples of .h and .cpp files have been given above. In non-C++ files (CMake files, (La)TeX-Files, etc.) the C++ comments are replaced by the respective comment characters (e.g. ``#'' for CMake files, % for (La)TeX).