OpenMS  3.0.0
FASTAContainer.h File Reference
#include <OpenMS/CONCEPT/Exception.h>
#include <OpenMS/CONCEPT/LogStream.h>
#include <OpenMS/DATASTRUCTURES/String.h>
#include <OpenMS/DATASTRUCTURES/ListUtils.h>
#include <OpenMS/DATASTRUCTURES/StringUtilsSimple.h>
#include <OpenMS/FORMAT/FASTAFile.h>
#include <functional>
#include <fstream>
#include <unordered_map>
#include <memory>
#include <utility>
#include <vector>
#include <boost/regex.hpp>

Classes

class  FASTAContainer< TBackend >
 template parameter for vector-based FASTA access
 
class  FASTAContainer< TFI_File >
 FASTAContainer<TFI_File> makes FASTA entries available chunk-wise from start to end by loading them from a FASTA file. This avoids having to load the full file into memory. While loading, the container memorizes the file offset of each entry, so that an arbitrary i-th entry can be read again from disk. If possible, query only entries from the currently cached chunk; access to any other entry will be slow.
 
class  FASTAContainer< TFI_Vector >
 FASTAContainer<TFI_Vector> simply takes an existing vector of FASTAEntries and provides the same interface, with a potentially huge speed benefit over FASTAContainer<TFI_File> since it does not need disk access, but at the cost of memory.
 
class  DecoyHelper
 Helper class for calculations on decoy proteins.
 
struct  DecoyHelper::Result
 
struct  DecoyHelper::DecoyStatistics
 Struct for intermediate results needed for calculations on decoy proteins.
 

Namespaces

 OpenMS
 Main OpenMS namespace.
 

Class Documentation

◆ OpenMS::FASTAContainer

class OpenMS::FASTAContainer

template<typename TBackend>
class OpenMS::FASTAContainer< TBackend >

template parameter for vector-based FASTA access

This class allows for a single chunk-wise linear read over a (large) FASTA file, with sporadic (since potentially slow) access to earlier entries that are not in the currently active chunk.

Internally uses FASTAFile class to read single sequences.

FASTAContainer supports two template specializations: FASTAContainer<TFI_File> and FASTAContainer<TFI_Vector>.

FASTAContainer<TFI_File> makes FASTA entries available chunk-wise from start to end by loading them from a FASTA file. This avoids having to load the full file into memory. While loading, the container memorizes the file offset of each entry, so that an arbitrary i-th entry can be read again from disk. If possible, query only entries from the currently cached chunk; access to any other entry will be slow.
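The offset bookkeeping described above can be illustrated with a minimal, standard-library-only sketch. It is not the OpenMS implementation: `Entry`, `OffsetIndexedReader`, `readNext`, and `readAt` here are hypothetical names chosen for illustration, and the parser assumes a simple FASTA layout (a `>` header line followed by sequence lines).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a FASTA entry (not FASTAFile::FASTAEntry).
struct Entry
{
  std::string header;   // text after '>'
  std::string sequence; // concatenated sequence lines
};

// Hypothetical reader that remembers the stream offset of every entry it has
// read linearly, so that earlier entries can be re-read by seeking.
class OffsetIndexedReader
{
public:
  explicit OffsetIndexedReader(std::istream& in) : in_(in), next_offset_(0) {}

  // Read the next entry in linear order and record where it started.
  bool readNext(Entry& e)
  {
    const std::streampos start = next_offset_;
    if (!parseAt(start, e, &next_offset_)) return false;
    offsets_.push_back(start);
    return true;
  }

  // Re-read the i-th entry seen so far; costs a seek plus a re-parse.
  bool readAt(std::size_t i, Entry& e)
  {
    if (i >= offsets_.size()) return false;
    return parseAt(offsets_[i], e, nullptr);
  }

private:
  // Parse one entry starting at 'pos'; optionally report where it ended.
  bool parseAt(std::streampos pos, Entry& e, std::streampos* end_pos)
  {
    in_.clear(); // drop eof/fail state left over from an earlier read
    in_.seekg(pos);
    std::string line;
    if (!std::getline(in_, line) || line.empty() || line[0] != '>') return false;
    e.header = line.substr(1);
    e.sequence.clear();
    // Consume sequence lines until the next header or the end of the stream.
    while (in_.peek() != '>' && std::getline(in_, line)) e.sequence += line;
    if (end_pos != nullptr)
    {
      in_.clear(); // tellg() would fail if eof/fail were still set
      *end_pos = in_.tellg();
    }
    return true;
  }

  std::istream& in_;
  std::streampos next_offset_;
  std::vector<std::streampos> offsets_;
};
```

In this sketch every call to `readAt` pays for a seek and a fresh parse, which is the reason random access outside the cached chunk is expected to be slow for a file-backed container.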

FASTAContainer<TFI_Vector> simply takes an existing vector of FASTAEntries and provides the same interface (with a potentially huge speed benefit over FASTAContainer<TFI_File> since it does not need disk access, but at the cost of memory).
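The idea of two specializations sharing one interface can be sketched with tag types, as below. This is only an illustration of the pattern under simplifying assumptions: `EntrySource`, `StreamTag`, `VectorTag`, and `longestSequence` are hypothetical names, not OpenMS API, and the stream-backed variant indexes all entry offsets eagerly in its constructor, which is a simplification made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Entry { std::string header; std::string sequence; }; // hypothetical entry type

struct StreamTag {}; // hypothetical tag: entries are parsed from a stream on demand
struct VectorTag {}; // hypothetical tag: entries already live in memory

template <typename TBackend> class EntrySource; // primary template, never defined

// In-memory backend: random access is a plain vector lookup.
template <> class EntrySource<VectorTag>
{
public:
  explicit EntrySource(std::vector<Entry> data) : data_(std::move(data)) {}
  std::size_t size() const { return data_.size(); }
  bool readAt(std::size_t i, Entry& e)
  {
    if (i >= data_.size()) return false;
    e = data_[i];
    return true;
  }
private:
  std::vector<Entry> data_;
};

// Stream backend: stores only offsets; every access seeks and re-parses.
template <> class EntrySource<StreamTag>
{
public:
  explicit EntrySource(std::istream& in) : in_(in)
  {
    std::string line;
    // Record the position in front of every header line.
    for (std::streampos pos = in_.tellg(); std::getline(in_, line); pos = in_.tellg())
    {
      if (!line.empty() && line[0] == '>') offsets_.push_back(pos);
    }
  }
  std::size_t size() const { return offsets_.size(); }
  bool readAt(std::size_t i, Entry& e)
  {
    if (i >= offsets_.size()) return false;
    in_.clear(); // reset eof/fail state before seeking
    in_.seekg(offsets_[i]);
    std::string line;
    if (!std::getline(in_, line)) return false;
    e.header = line.substr(1);
    e.sequence.clear();
    while (in_.peek() != '>' && std::getline(in_, line)) e.sequence += line;
    return true;
  }
private:
  std::istream& in_;
  std::vector<std::streampos> offsets_;
};

// Generic algorithm: compiles against either backend because both expose
// the same size()/readAt() interface.
template <typename TSource>
std::size_t longestSequence(TSource& src)
{
  std::size_t best = 0;
  Entry e;
  for (std::size_t i = 0; i < src.size(); ++i)
  {
    if (src.readAt(i, e) && e.sequence.size() > best) best = e.sequence.size();
  }
  return best;
}
```

Code written against the shared interface, such as `longestSequence` here, does not need to know which backend it runs on; the choice only affects memory use and access cost.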

If an algorithm searches through a FASTA file linearly, you can use FASTAContainer<TFI_File> to pre-load a small chunk and start working while the next chunk is loaded in a background thread, then swap it in once the active chunk has been processed.
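The prefetch pattern described above can be sketched with `std::async`, as below. This is a generic double-buffering illustration, not the OpenMS API: `Entry`, `readChunk`, and `processChunked` are hypothetical helpers, and the sketch assumes that only the background task touches the input stream while a chunk is being processed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Entry { std::string header; std::string sequence; }; // hypothetical entry type

// Hypothetical helper: read up to max_entries FASTA entries from the stream.
std::vector<Entry> readChunk(std::istream& in, std::size_t max_entries)
{
  std::vector<Entry> chunk;
  std::string line;
  while (chunk.size() < max_entries && std::getline(in, line))
  {
    if (line.empty() || line[0] != '>') continue; // skip anything before a header
    Entry e;
    e.header = line.substr(1);
    // Consume sequence lines until the next header or the end of the stream.
    while (in.peek() != '>' && std::getline(in, line)) e.sequence += line;
    chunk.push_back(std::move(e));
  }
  return chunk;
}

// Process all entries chunk by chunk; while 'process' works on the active
// chunk, the next chunk is loaded by a background task.
template <typename Fn>
std::size_t processChunked(std::istream& in, std::size_t chunk_size, Fn process)
{
  assert(chunk_size > 0);
  std::size_t total = 0;
  std::vector<Entry> active = readChunk(in, chunk_size); // first chunk, loaded synchronously
  while (!active.empty())
  {
    // Start loading the next chunk; the stream is not touched by this thread
    // until the future has been retrieved.
    std::future<std::vector<Entry>> next =
        std::async(std::launch::async, readChunk, std::ref(in), chunk_size);
    for (const Entry& e : active) process(e);
    total += active.size();
    active = next.get(); // swap in the prefetched chunk (waits if it is not ready yet)
  }
  return total;
}
```

The benefit depends on the processing time per chunk being comparable to or larger than the load time; if processing is much faster than reading, the loop mostly waits in `next.get()`.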