OpenMS
2.4.0
template parameter for vector-based FASTA access
#include <OpenMS/DATASTRUCTURES/FASTAContainer.h>
This class allows a chunk-wise, single linear read over a (large) FASTA file, with sporadic (since potentially slow) access to earlier entries that are no longer in the active chunk.
Internally, it uses the FASTAFile class to read single sequences.
FASTAContainer supports two template specializations: FASTAContainer<TFI_File> and FASTAContainer<TFI_Vector>.
FASTAContainer<TFI_File> makes FASTA entries available chunk-wise, from start to end, by loading them from a FASTA file. This avoids loading the full file into memory. While loading, the container records the file offset of each entry, so that an arbitrary i-th entry can be read again from disk later. Where possible, query only entries from the currently cached chunk; any other access requires a disk read and will be slow.
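The offset bookkeeping described above can be illustrated with a minimal, standard-library-only sketch. It is not the OpenMS implementation: `Entry`, `OffsetIndexedReader`, `readChunk`, `readAt` and `indexedCount` are hypothetical names chosen for this example, and the sketch assumes a seekable stream opened in binary mode with '\n' line endings.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>  // only needed for in-memory test input such as std::istringstream
#include <string>
#include <vector>

// Hypothetical entry type for this sketch (not OpenMS's FASTAFile::FASTAEntry).
struct Entry
{
  std::string header;
  std::string sequence;
};

// Hypothetical reader: hands out entries chunk by chunk and remembers the
// byte offset at which every entry it has seen so far starts.
class OffsetIndexedReader
{
public:
  explicit OffsetIndexedReader(std::istream& in) : in_(in) {}

  // Read up to max_entries further entries in file order (the fast, linear path).
  std::vector<Entry> readChunk(std::size_t max_entries)
  {
    std::vector<Entry> chunk;
    Entry e;
    std::streamoff after = 0;
    while (chunk.size() < max_entries && parseAt_(next_, e, after))
    {
      offsets_.push_back(next_);  // memorize where this entry started
      chunk.push_back(e);
      next_ = after;
    }
    return chunk;
  }

  // Re-read the i-th entry seen so far by seeking to its stored offset (the slow path).
  Entry readAt(std::size_t i)
  {
    assert(i < offsets_.size());
    Entry e;
    std::streamoff ignored = 0;
    parseAt_(offsets_[i], e, ignored);
    return e;
  }

  // Number of entries whose offsets are known.
  std::size_t indexedCount() const { return offsets_.size(); }

private:
  // Parse one entry starting at byte offset `start`. Returns false if no entry
  // begins there; otherwise `next` is set to the offset of the following entry.
  bool parseAt_(std::streamoff start, Entry& e, std::streamoff& next)
  {
    in_.clear();  // drop a stale eof/fail state before seeking
    in_.seekg(start, std::ios::beg);
    std::string line;
    if (!std::getline(in_, line) || line.empty() || line[0] != '>') return false;
    e.header = line.substr(1);
    e.sequence.clear();
    next = start + static_cast<std::streamoff>(line.size()) + 1;  // +1 for '\n'
    while (std::getline(in_, line))
    {
      if (!line.empty() && line[0] == '>') break;  // next header: this entry ends
      e.sequence += line;
      next += static_cast<std::streamoff>(line.size()) + 1;
    }
    return true;
  }

  std::istream& in_;
  std::vector<std::streamoff> offsets_;
  std::streamoff next_ = 0;
};
```

Because every read seeks explicitly, a call to `readAt` does not disturb the position from which the next `readChunk` continues. The price is one seek and one re-parse per random access, which is why such access should stay the exception.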
FASTAContainer<TFI_Vector> simply wraps an existing vector of FASTAEntries and provides the same interface. Because it needs no disk access, it can be much faster than FASTAContainer<TFI_File>, at the cost of holding all entries in memory.
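To show how a vector-backed specialization can offer the same chunk-oriented interface, here is a minimal sketch. All names (`VectorBackend`, `SketchContainer`, `loadNextChunk`, `chunkSize`, `chunkAt`, `totalResidues`) are hypothetical and chosen for this example; they are not the OpenMS API. The sketch assumes that the whole vector is treated as one single chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical entry type for this sketch.
struct Entry
{
  std::string header;
  std::string sequence;
};

struct VectorBackend {};  // hypothetical tag type, standing in for TFI_Vector

// Primary template is only declared; each backend provides a specialization.
template <typename Backend>
class SketchContainer;

template <>
class SketchContainer<VectorBackend>
{
public:
  // Wraps an existing vector without copying it; the vector must outlive the container.
  explicit SketchContainer(const std::vector<Entry>& data) : data_(data) {}

  // "Loads" the next chunk. The whole vector is the one and only chunk, so this
  // succeeds exactly once (and only if there is data); the size hint is ignored.
  bool loadNextChunk(std::size_t /*suggested_size*/)
  {
    if (consumed_ || data_.empty()) return false;
    consumed_ = true;
    return true;
  }

  std::size_t chunkSize() const { return data_.size(); }
  const Entry& chunkAt(std::size_t i) const { return data_[i]; }

private:
  const std::vector<Entry>& data_;
  bool consumed_ = false;
};

// An algorithm written only against the chunk interface works with any backend
// that provides loadNextChunk(), chunkSize() and chunkAt().
template <typename Container>
std::size_t totalResidues(Container& c, std::size_t chunk_size)
{
  std::size_t total = 0;
  while (c.loadNextChunk(chunk_size))
  {
    for (std::size_t i = 0; i < c.chunkSize(); ++i)
    {
      total += c.chunkAt(i).sequence.size();
    }
  }
  return total;
}
```

The design choice illustrated here is that the algorithm is a template over the container type, so switching between a file-backed and a memory-backed source requires no change to the algorithm itself.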
If an algorithm searches through a FASTA file linearly, you can use FASTAContainer<TFI_File> to pre-load a small chunk and start working on it, while the next chunk is loaded in a background thread and swapped in once the active chunk has been processed.
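The load-in-background, swap-when-done pattern can be sketched with `std::async` from the standard library. This is an illustration under stated assumptions, not OpenMS code: `Chunk` and `processWithPrefetch` are hypothetical names, the loader is assumed to return an empty chunk once the input is exhausted, and the loader and the processing callback are assumed not to share mutable state, since they run concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chunk type for this sketch: a batch of sequences.
using Chunk = std::vector<std::string>;

// Processes every chunk produced by load_next, loading chunk k+1 on a
// background thread while chunk k is being processed.
// Returns the total number of items processed.
inline std::size_t processWithPrefetch(const std::function<Chunk()>& load_next,
                                       const std::function<void(const Chunk&)>& process)
{
  std::size_t processed = 0;
  Chunk active = load_next();  // the first chunk is loaded synchronously
  while (!active.empty())
  {
    // Start loading the next chunk on another thread.
    std::future<Chunk> pending = std::async(std::launch::async, load_next);

    process(active);  // meanwhile, work on the active chunk
    processed += active.size();

    // Wait for the loader to finish, then swap the new chunk in.
    active = pending.get();
  }
  return processed;
}
```

Only one load is in flight at any time, so at most two chunks are held in memory: the active one and the one being prefetched. On some toolchains, code using `std::async` must be linked against the platform's thread library (for example with `-pthread`).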