OpenMS
InspectOutfile Class Reference

Representation of an Inspect outfile.

#include <OpenMS/FORMAT/InspectOutfile.h>

Public Member Functions

 InspectOutfile ()
 default constructor
 
 InspectOutfile (const InspectOutfile &inspect_outfile)
 copy constructor
 
virtual ~InspectOutfile ()
 destructor
 
InspectOutfile & operator= (const InspectOutfile &inspect_outfile)
 assignment operator
 
bool operator== (const InspectOutfile &inspect_outfile) const
 equality operator
 
std::vector< Size > load (const String &result_filename, std::vector< PeptideIdentification > &peptide_identifications, ProteinIdentification &protein_identification, const double p_value_threshold, const String &database_filename="")
 
std::vector< Size > getWantedRecords (const String &result_filename, double p_value_threshold)
 
void compressTrieDB (const String &database_filename, const String &index_filename, std::vector< Size > &wanted_records, const String &snd_database_filename, const String &snd_index_filename, bool append=false)
 
void generateTrieDB (const String &source_database_filename, const String &database_filename, const String &index_filename, bool append=false, const String &species="")
 
void getACAndACType (String line, String &accession, String &accession_type)
 
void getPrecursorRTandMZ (const std::vector< std::pair< String, std::vector< std::pair< Size, Size > > > > &files_and_peptide_identification_with_scan_number, std::vector< PeptideIdentification > &ids)
 
void getLabels (const String &source_database_filename, String &ac_label, String &sequence_start_label, String &sequence_end_label, String &comment_label, String &species_label)
 
std::vector< Size > getSequences (const String &database_filename, const std::map< Size, Size > &wanted_records, std::vector< String > &sequences)
 
void getExperiment (PeakMap &exp, String &type, const String &in_filename)
 
bool getSearchEngineAndVersion (const String &cmd_output, ProteinIdentification &protein_identification)
 get the search engine and its version from the output of the InsPecT executable without parameters
 
void readOutHeader (const String &filename, const String &header_line, Int &spectrum_file_column, Int &scan_column, Int &peptide_column, Int &protein_column, Int &charge_column, Int &MQ_score_column, Int &p_value_column, Int &record_number_column, Int &DB_file_pos_column, Int &spec_file_pos_column, Size &number_of_columns)
 read the header of an Inspect output file and retrieve the column positions and the number of columns
 

Static Protected Attributes

static const Size db_pos_length_
 length of part 1) of an index record (the protein's position in the original database)
 
static const Size trie_db_pos_length_
 length of part 2) of an index record (the protein's position in the trie database)
 
static const Size protein_name_length_
 length of part 3) of an index record (the name of the protein)
 
static const Size record_length_
 length of the whole record
 
static const char trie_delimiter_
 the sequences in the trie database are delimited by this character
 
static const String score_type_
 type of score
 

Detailed Description

Representation of an Inspect outfile.

This class reads an Inspect outfile and writes an idXML file.

Todo:
Handle Modifications (Andreas)

Constructor & Destructor Documentation

◆ InspectOutfile() [1/2]

default constructor

◆ InspectOutfile() [2/2]

InspectOutfile ( const InspectOutfile &  inspect_outfile)

copy constructor

◆ ~InspectOutfile()

virtual ~InspectOutfile ( )
virtual

destructor

Member Function Documentation

◆ compressTrieDB()

void compressTrieDB ( const String &  database_filename,
const String &  index_filename,
std::vector< Size > &  wanted_records,
const String &  snd_database_filename,
const String &  snd_index_filename,
bool  append = false 
)

generates a trie database from another one, using the wanted records only

Exceptions
Exception::FileNotFound
Exception::ParseError
Exception::UnableToCreateFile

◆ generateTrieDB()

void generateTrieDB ( const String &  source_database_filename,
const String &  database_filename,
const String &  index_filename,
bool  append = false,
const String &  species = "" 
)

generates a trie database from a given one (the type of database is determined by getLabels)

Exceptions
Exception::FileNotFound
Exception::UnableToCreateFile

◆ getACAndACType()

void getACAndACType ( String  line,
String &  accession,
String &  accession_type 
)

retrieve the accession type and accession number from a protein description line (e.g. from the FASTA line ">gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]", get accession: AAD44166.1 and accession type: GenBank)
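
The following is a minimal, standalone sketch of how an accession and its type could be pulled out of such a FASTA description line. It is not the OpenMS implementation: the `Accession` struct, the `parseFastaHeader` function, and the tag table are hypothetical names introduced only for illustration, and the table is deliberately incomplete.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not part of OpenMS.
struct Accession
{
  std::string value; // e.g. "AAD44166.1"
  std::string type;  // e.g. "GenBank"
};

// Splits the identifier part of a FASTA description line on '|' and returns
// the field that follows the first known database tag.
inline Accession parseFastaHeader(const std::string& line)
{
  // Illustrative tag table (an assumption, deliberately incomplete).
  static const std::map<std::string, std::string> tag_to_type = {
    {"gb", "GenBank"}, {"emb", "EMBL"}, {"sp", "SwissProt"}, {"ref", "RefSeq"}};

  // Drop a leading '>' and keep only the text before the first space.
  std::string ids = line.substr((line.empty() || line[0] != '>') ? 0 : 1);
  ids = ids.substr(0, ids.find(' '));

  // Split the identifier part on '|'.
  std::vector<std::string> fields;
  std::stringstream stream(ids);
  for (std::string field; std::getline(stream, field, '|');)
  {
    fields.push_back(field);
  }

  // The first known tag that is followed by another field wins.
  for (std::size_t i = 0; i + 1 < fields.size(); ++i)
  {
    const auto hit = tag_to_type.find(fields[i]);
    if (hit != tag_to_type.end()) return {fields[i + 1], hit->second};
  }
  return {"", ""}; // nothing recognized
}
```

Called with the FASTA line from the description above, this sketch yields the accession `AAD44166.1` with type `GenBank`; a line without a known tag yields two empty strings.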

◆ getExperiment()

void getExperiment ( PeakMap &  exp,
String &  type,
const String &  in_filename 
)
inline

get the experiment from a file

Exceptions
Exception::ParseError is thrown if the file could not be parsed or the file type could not be determined

References FileHandler::getTypeByContent(), FileHandler::loadExperiment(), ProgressLogger::NONE, MSExperiment::reset(), FileTypes::typeToName(), and FileTypes::UNKNOWN.

◆ getLabels()

void getLabels ( const String &  source_database_filename,
String &  ac_label,
String &  sequence_start_label,
String &  sequence_end_label,
String &  comment_label,
String &  species_label 
)

retrieve the labels of a given database (currently FASTA and Swiss-Prot are supported)

Exceptions
Exception::FileNotFound
Exception::ParseError

◆ getPrecursorRTandMZ()

void getPrecursorRTandMZ ( const std::vector< std::pair< String, std::vector< std::pair< Size, Size > > > > &  files_and_peptide_identification_with_scan_number,
std::vector< PeptideIdentification > &  ids 
)

retrieve the precursor retention time and m/z value

Exceptions
Exception::ParseError

◆ getSearchEngineAndVersion()

bool getSearchEngineAndVersion ( const String &  cmd_output,
ProteinIdentification &  protein_identification 
)

get the search engine and its version from the output of the InsPecT executable without parameters

returns true on success, false otherwise

◆ getSequences()

std::vector<Size> getSequences ( const String &  database_filename,
const std::map< Size, Size > &  wanted_records,
std::vector< String > &  sequences 
)

retrieve sequences from a trie database

Exceptions
Exception::FileNotFound

◆ getWantedRecords()

std::vector<Size> getWantedRecords ( const String &  result_filename,
double  p_value_threshold 
)

loads only results which exceed a given p-value threshold

Parameters
result_filename The filename of the results file
p_value_threshold Only identifications exceeding this threshold are read
Exceptions
FileNotFound is thrown if the file is not found
FileEmpty is thrown if the given file is empty

◆ load()

std::vector<Size> load ( const String &  result_filename,
std::vector< PeptideIdentification > &  peptide_identifications,
ProteinIdentification &  protein_identification,
const double  p_value_threshold,
const String &  database_filename = "" 
)

load the results of an Inspect search

Parameters
result_filename Input parameter which is the file name of the input file
peptide_identifications Output parameter which holds the peptide identifications from the given file
protein_identification Output parameter which holds the protein identifications from the given file
p_value_threshold
database_filename
Exceptions
FileNotFound is thrown if the given file could not be found
ParseError is thrown if the given file could not be parsed
FileEmpty is thrown if the given file is empty

◆ operator=()

InspectOutfile& operator= ( const InspectOutfile &  inspect_outfile)

assignment operator

◆ operator==()

bool operator== ( const InspectOutfile &  inspect_outfile) const

equality operator

◆ readOutHeader()

void readOutHeader ( const String &  filename,
const String &  header_line,
Int &  spectrum_file_column,
Int &  scan_column,
Int &  peptide_column,
Int &  protein_column,
Int &  charge_column,
Int &  MQ_score_column,
Int &  p_value_column,
Int &  record_number_column,
Int &  DB_file_pos_column,
Int &  spec_file_pos_column,
Size &  number_of_columns 
)

read the header of an Inspect output file and retrieve the column positions and the number of columns

Exceptions
Exception::ParseError
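
As a rough illustration of the idea behind header parsing, the sketch below maps the column titles of a tab-separated header line to their positions and reports a missing column by throwing. It does not use OpenMS: `indexHeader` and `requireColumn` are hypothetical helpers, `std::runtime_error` stands in for Exception::ParseError, and the tab separator as well as any column titles passed in are assumptions rather than a statement about the actual Inspect output format.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Maps each column title of a tab-separated header line to its 0-based index.
// (The tab separator is an assumption of this sketch.)
inline std::map<std::string, int> indexHeader(const std::string& header_line)
{
  std::map<std::string, int> column_of;
  std::stringstream stream(header_line);
  int index = 0;
  for (std::string title; std::getline(stream, title, '\t'); ++index)
  {
    column_of[title] = index;
  }
  return column_of;
}

// Returns the index of a required column, or throws if it is absent.
// std::runtime_error is a stand-in for a parse error here.
inline int requireColumn(const std::map<std::string, int>& column_of,
                         const std::string& title)
{
  const auto hit = column_of.find(title);
  if (hit == column_of.end())
  {
    throw std::runtime_error("missing column: " + title);
  }
  return hit->second;
}
```

A caller would build the map once from the header line and then look up each column it needs, so a malformed header is detected before any result line is read.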

Member Data Documentation

◆ db_pos_length_

const Size db_pos_length_
static protected

length of part 1) of an index record

A record in the index file that belongs to a trie database consists of three parts:
1) the protein's position in the original database
2) the protein's position in the trie database
3) the name of the protein (the line with the accession identifier)
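
To make the three-part record layout concrete, here is a small standalone sketch of a fixed-length record. The byte lengths used (8, 4 and 80) are illustrative assumptions for this example only; the authoritative values are those of the static members documented here. The names `kDbPosLength`, `recordOffset` and `proteinName`, and the padding of the name field with NUL bytes or spaces, are likewise hypothetical.

```cpp
#include <cstddef>
#include <string>

// Illustrative lengths in bytes (assumptions for this sketch only).
constexpr std::size_t kDbPosLength = 8;        // 1) position in the original database
constexpr std::size_t kTrieDbPosLength = 4;    // 2) position in the trie database
constexpr std::size_t kProteinNameLength = 80; // 3) name of the protein
constexpr std::size_t kRecordLength =
  kDbPosLength + kTrieDbPosLength + kProteinNameLength;

// Byte offset of record number `record` in an index file of fixed-length records.
constexpr std::size_t recordOffset(std::size_t record)
{
  return record * kRecordLength;
}

// Extracts part 3) from one raw record and trims trailing padding.
// Padding with NUL bytes or spaces is an assumption of this sketch.
inline std::string proteinName(const std::string& raw_record)
{
  if (raw_record.size() < kRecordLength) return "";
  std::string name =
    raw_record.substr(kDbPosLength + kTrieDbPosLength, kProteinNameLength);
  name.erase(name.find_last_not_of(std::string("\0 ", 2)) + 1);
  return name;
}
```

Because every record has the same length, the n-th record can be located by a single multiplication instead of scanning the file, which is the usual motivation for a layout like this.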

◆ protein_name_length_

const Size protein_name_length_
static protected

length of part 3) of an index record (the name of the protein; see db_pos_length_)

◆ record_length_

const Size record_length_
static protected

length of the whole record

◆ score_type_

const String score_type_
static protected

type of score

◆ trie_db_pos_length_

const Size trie_db_pos_length_
static protected

length of part 2) of an index record (the protein's position in the trie database; see db_pos_length_)

◆ trie_delimiter_

const char trie_delimiter_
static protected

the sequences in the trie database are delimited by this character
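
The sketch below illustrates what a delimiter-separated sequence store implies for lookups: given a start position, a sequence runs up to the next delimiter or to the end of the data. It is a standalone example, not OpenMS code; `sequenceAt` is a hypothetical helper, and the default delimiter `'*'` is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Returns the sequence that starts at `pos` and runs up to the next
// delimiter, or to the end of the data if no delimiter follows.
// The default delimiter '*' is an assumption of this sketch.
inline std::string sequenceAt(const std::string& trie, std::size_t pos,
                              char delimiter = '*')
{
  if (pos >= trie.size()) return "";
  const std::size_t end = trie.find(delimiter, pos);
  return trie.substr(pos, end == std::string::npos ? std::string::npos
                                                   : end - pos);
}
```

With the string `"PEPTIDE*ELVISK*LIVES"`, position 8 yields `ELVISK` and position 15 yields `LIVES`, which is why an index only needs to store a start position per protein.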