OpenMS
XMLFile Class Reference

Base class for loading and storing XML files via Xerces, with optional schema validation and transparent gzip / bzip2 / zip decompression on load.

#include <OpenMS/FORMAT/XMLFile.h>


Public Member Functions

 XMLFile ()
 Construct an XMLFile without schema info. schema_location_ stays empty, so isValid cannot be used unless a derived class sets schema_location_ first.
 
 XMLFile (const std::string &schema_location, const std::string &version)
 Construct with a schema location for later isValid calls.
 
virtual ~XMLFile ()
 Virtual destructor (defaulted); allows safe deletion through a base-class pointer.
 
bool isValid (const std::string &filename, std::ostream &os)
 Check if filename validates against the bound XML schema.
 
const std::string & getVersion () const
 Return the schema version string passed to the parameterised constructor; empty for default-constructed instances.
 

Protected Member Functions

void parse_ (const std::string &filename, XMLHandler *handler)
 Parse the XML file at filename through handler.
 
void parseBuffer_ (const std::string &buffer, XMLHandler *handler)
 Parse an in-memory XML buffer through handler.
 
void save_ (const std::string &filename, XMLHandler *handler) const
 Stores the contents of the XML handler given by handler in the file given by filename.
 
void enforceEncoding_ (const std::string &encoding)
 Set or clear the XML-encoding override applied to subsequent parse_ / parseBuffer_ calls.
 

Protected Attributes

std::string schema_location_
 Path of the XML schema for validation; empty when the default constructor was used (isValid then throws NotImplemented).
 
std::string schema_version_
 Schema version string returned by getVersion.
 
std::string enforced_encoding_
 Optional XML encoding override applied to the InputSource in parse_ and parseBuffer_; empty disables the override. Used as a workaround for X!Tandem output XML, which carries an encoding the parser otherwise stumbles on.
 

Detailed Description

Base class for loading and storing XML files via Xerces, with optional schema validation and transparent gzip / bzip2 / zip decompression on load.

Concrete file-format classes derive from XMLFile and provide an XMLHandler subclass that drives the SAX parser. The base class hides the Xerces setup, the compression-magic detection on the input side, the suffix-based compression selection on the output side, and the optional encoding override used for non-conforming XML producers (e.g. X!Tandem).

Compression is detected on parse_ by reading the first two bytes of the file and matching one of three magic numbers: "BZ" (bzip2), 0x1F8B (gzip), "PK" (zip). On save_, compression is selected by filename suffix (.gz, .bz2) instead.
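The magic-number check described above can be sketched in isolation. This is a minimal illustration, not the OpenMS implementation: the names Compression, classifyMagic and detectCompression are hypothetical, and treating an unreadable or too-short file as uncompressed is an assumption of this sketch.

```cpp
#include <fstream>
#include <string>

// Hypothetical names for illustration; not part of the OpenMS API.
enum class Compression { None, Bzip2, Gzip, Zip };

// Classify a two-byte prefix using the three magic numbers listed above.
Compression classifyMagic(unsigned char b0, unsigned char b1)
{
  if (b0 == 'B' && b1 == 'Z') return Compression::Bzip2;   // "BZ"
  if (b0 == 0x1F && b1 == 0x8B) return Compression::Gzip;  // 0x1F8B
  if (b0 == 'P' && b1 == 'K') return Compression::Zip;     // "PK"
  return Compression::None;                                // anything else: plain file
}

// Read the first two bytes of a file and classify them.
// Assumption of this sketch: files that cannot be read, or that are
// shorter than two bytes, are reported as uncompressed.
Compression detectCompression(const std::string& filename)
{
  std::ifstream in(filename, std::ios::binary);
  char bytes[2] = {0, 0};
  if (!in.read(bytes, 2)) return Compression::None;
  return classifyMagic(static_cast<unsigned char>(bytes[0]),
                       static_cast<unsigned char>(bytes[1]));
}
```

Because the check looks at file content rather than the name, a gzip file with a misleading extension would still be classified as gzip on the load side.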

Constructor & Destructor Documentation

◆ XMLFile() [1/2]

XMLFile ( )

Construct an XMLFile without schema info. schema_location_ stays empty, so isValid cannot be used unless a derived class sets schema_location_ first.

◆ XMLFile() [2/2]

XMLFile ( const std::string &  schema_location,
const std::string &  version 
)

Construct with a schema location for later isValid calls.

Parameters
[in]  schema_location  Path of the XML schema (resolved at isValid time via File::find).
[in]  version  Schema version string returned by getVersion.

◆ ~XMLFile()

virtual ~XMLFile ( )
virtual

Virtual destructor (defaulted); allows safe deletion through a base-class pointer.

Member Function Documentation

◆ enforceEncoding_()

void enforceEncoding_ ( const std::string &  encoding)
protected

Set or clear the XML-encoding override applied to subsequent parse_ / parseBuffer_ calls.

Stores encoding in enforced_encoding_. Pass an empty string to disable the override and let Xerces use the encoding declared inside the XML.

Parameters
[in]  encoding  Encoding name (e.g. "ISO-8859-1") to apply to the InputSource before parsing; empty string disables the override so Xerces consults the XML declaration instead.

◆ getVersion()

const std::string & getVersion ( ) const

Return the schema version string passed to the parameterised constructor; empty for default-constructed instances.

◆ isValid()

bool isValid ( const std::string &  filename,
std::ostream &  os 
)

Check if filename validates against the bound XML schema.

Resolves the configured schema_location_ via File::find and delegates to XMLValidator::isValid. Error messages are written to os.

Parameters
[in]  filename  The file to validate.
[in,out]  os  Error-message sink.
Returns
true if the file validates against the schema, false otherwise.
Exceptions
OpenMS::Exception::NotImplemented  if no schema is bound (default-constructed instance, so schema_location_ is empty).
OpenMS::Exception::FileNotFound  if filename cannot be found.

◆ parse_()

void parse_ ( const std::string &  filename,
XMLHandler *  handler
)
protected

Parse the XML file at filename through handler.

Reads the first two bytes of filename to detect bzip2 ("BZ"), gzip (0x1F8B), or zip ("PK") and wraps the file in a CompressedInputSource accordingly; any other magic falls through to a plain LocalFileInputSource. If enforceEncoding_ has set enforced_encoding_, the override is applied to the source before parsing.

On exit (success or exception), handler->reset() is called via an RAII guard so the handler can be reused on subsequent calls.

Parameters
[in]  filename  Path to the XML file.
[in,out]  handler  SAX handler driven by the parser.
Exceptions
OpenMS::Exception::FileNotFound  if filename does not exist.
OpenMS::Exception::ParseError  if Xerces initialisation fails or a parse error occurs.
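
The reset-on-exit behaviour of parse_ follows the usual RAII pattern: a small guard object calls reset() from its destructor, so the call happens on normal return and during stack unwinding alike. The sketch below illustrates the pattern only; DemoHandler, HandlerResetGuard and demoParse are hypothetical stand-ins and do not reproduce the actual OpenMS or Xerces types.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a SAX handler; only reset() matters here.
struct DemoHandler
{
  int resets = 0;
  void reset() { ++resets; }
};

// Guard that calls reset() on the handler when it goes out of scope.
class HandlerResetGuard
{
public:
  explicit HandlerResetGuard(DemoHandler* handler) : handler_(handler) {}
  ~HandlerResetGuard()
  {
    if (handler_ != nullptr) handler_->reset();
  }
  // Non-copyable, so reset() runs exactly once per guard.
  HandlerResetGuard(const HandlerResetGuard&) = delete;
  HandlerResetGuard& operator=(const HandlerResetGuard&) = delete;

private:
  DemoHandler* handler_;
};

// Simulated parse: the guard resets the handler whether the body
// returns normally or throws.
void demoParse(DemoHandler* handler, bool fail)
{
  HandlerResetGuard guard(handler);
  if (fail) throw std::runtime_error("simulated parse error");
}
```

With this pattern the caller can reuse the same handler for the next file even after a failed parse, which is the property the description above relies on.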

◆ parseBuffer_()

void parseBuffer_ ( const std::string &  buffer,
XMLHandler *  handler
)
protected

Parse an in-memory XML buffer through handler.

Wraps buffer in a xercesc::MemBufInputSource (with the literal id "inMemory"). If enforceEncoding_ has set enforced_encoding_, the override is applied to the source before parsing. The RAII handler reset described for parse_ also applies here.

Note
The buffer must be plain text; gzip / bzip2 / zip buffers are not supported on this path (only on parse_).
Parameters
[in]  buffer  In-memory XML text.
[in,out]  handler  SAX handler driven by the parser.
Exceptions
OpenMS::Exception::ParseError  if Xerces initialisation fails or a parse error occurs.

◆ save_()

void save_ ( const std::string &  filename,
XMLHandler *  handler
) const
protected

Stores the contents of the XML handler given by handler in the file given by filename.

If filename ends with .gz, the output is written with gzip compression. If filename ends with .bz2, the output is written with bzip2 compression. Otherwise, uncompressed output is written.

Parameters
[in]  filename  The output filename (extension determines compression: .gz, .bz2, or none).
[in]  handler  The XML handler containing the content to write.
Exceptions
Exception::UnableToCreateFile  if the file cannot be created.
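
The suffix rule used by save_ can be sketched as a small standalone function. The names OutputCompression, endsWith and compressionForFilename are hypothetical, and the case-sensitive comparison is an assumption of this sketch; the documentation above does not say how differently cased suffixes such as .GZ are treated.

```cpp
#include <string>

// Hypothetical names for illustration; not part of the OpenMS API.
enum class OutputCompression { None, Gzip, Bzip2 };

// True if s ends with suffix (case-sensitive; an assumption of this sketch).
bool endsWith(const std::string& s, const std::string& suffix)
{
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Select the output compression from the filename suffix alone.
OutputCompression compressionForFilename(const std::string& filename)
{
  if (endsWith(filename, ".gz")) return OutputCompression::Gzip;
  if (endsWith(filename, ".bz2")) return OutputCompression::Bzip2;
  return OutputCompression::None;
}
```

Note the asymmetry with the load side: parse_ inspects file content, while save_ has no content to inspect yet and so relies on the name. No suffix selects zip output, which matches the two suffixes listed above.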

Member Data Documentation

◆ enforced_encoding_

std::string enforced_encoding_
protected

Optional XML encoding override applied to the InputSource in parse_ and parseBuffer_; empty disables the override. Used as a workaround for X!Tandem output XML, which carries an encoding the parser otherwise stumbles on.

◆ schema_location_

std::string schema_location_
protected

Path of the XML schema for validation; empty when the default constructor was used (isValid then throws NotImplemented).

◆ schema_version_

std::string schema_version_
protected

Schema version string returned by getVersion.