API Reference
shijiashuai edited this page Mar 9, 2026 · 1 revision
FastQTools provides a clean public C++ API and can be integrated into other projects as a library.

```cpp
#include <fqtools/fq.h>
```

`fq.h` aggregates all public interfaces:
| Module | Header | Namespace | Description |
|---|---|---|---|
| I/O | `<fqtools/io/...>` | `fq::io` | `FastqReader`, `FastqWriter`, `FastqRecord`, `FastqBatch` |
| Processing | `<fqtools/processing/...>` | `fq::processing` | Pipeline, Predicate, Mutator |
| Statistics | `<fqtools/statistics/...>` | `fq::statistic` | Statistics calculation interface |
| Core | `<fqtools/core/core.h>` | `fq::core` | Sequence utility functions |
| Configuration | `<fqtools/config/config.h>` | `fq::config` | Configuration management |
| Errors | `<fqtools/error/error.h>` | `fq::error` | Exception-handling framework |
| Logging | `<fqtools/logging.h>` | `fq::logging` | Logging initialization and level control |
| Common | `<fqtools/common/common.h>` | `fq::common` | `Timer`, `IDGenerator`, etc. |
```
fq.h (aggregate entry point)
├── io/          → FastqReader / FastqWriter / FastqRecord / FastqBatch
├── processing/  → ProcessingPipelineInterface / Predicate / Mutator
├── statistics/  → StatisticCalculatorInterface
├── core/        → SequenceUtils
├── config/      → Configuration
├── error/       → FastQException exception hierarchy
├── logging/     → init / setLevel
└── common/      → Timer / IDGenerator
```
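`FastqRecord` is a zero-copy view of a FASTQ record: its `std::string_view` fields point into the contiguous memory of a `FastqBatch`.

Fields:

| Field | Type | Description |
|---|---|---|
| `id` | `std::string_view` | Record identifier |
| `sequence` | `std::string_view` | DNA sequence |
| `quality` | `std::string_view` | Quality score string |
| `separator` | `std::string_view` | Separator line (usually `+`) |

Methods: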
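```cpp
auto averageQuality(int qualityEncoding = 33) const -> double;
auto length() const -> size_t;
auto gcContent() const -> double;
auto nRatio() const -> double;
```

`FastqBatch` is a container that stores multiple FASTQ records as one batch and maintains a contiguous memory buffer.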
```cpp
auto records() const -> const std::vector<FastqRecord>&;
auto size() const -> size_t;
auto empty() const -> bool;
void clear();
void reserve(size_t count);
```

Memory model:
```
FastqBatch
├── buffer_    contiguous memory block (stores the raw text)
└── records_   array of FastqRecord (string_views pointing into buffer_)
```
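The memory model above can be illustrated with a small standalone sketch. It is not the FastQTools implementation: `RecordView` and `BatchSketch` are hypothetical stand-ins for `FastqRecord` and `FastqBatch`, the parser assumes well-formed four-line records, and the metric definitions (G/C fraction, mean Phred score with an ASCII offset) are the conventional ones rather than anything confirmed by the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for FastqRecord: four views, no owned text.
struct RecordView {
    std::string_view id, sequence, separator, quality;

    // Fraction of G/C bases; 0.0 for an empty sequence.
    double gcContent() const {
        if (sequence.empty()) return 0.0;
        std::size_t gc = 0;
        for (char c : sequence) {
            if (c == 'G' || c == 'C' || c == 'g' || c == 'c') ++gc;
        }
        return static_cast<double>(gc) / static_cast<double>(sequence.size());
    }

    // Mean Phred score for the given ASCII offset (33 for Phred+33).
    double averageQuality(int qualityEncoding = 33) const {
        if (quality.empty()) return 0.0;
        long sum = 0;
        for (char c : quality) sum += static_cast<unsigned char>(c) - qualityEncoding;
        return static_cast<double>(sum) / static_cast<double>(quality.size());
    }
};

// Hypothetical stand-in for FastqBatch: one owning buffer plus views into it.
// Copying or moving this struct would leave the views pointing at the old
// buffer, so the sketch always fills it in place.
struct BatchSketch {
    std::string buffer;
    std::vector<RecordView> records;

    // Takes ownership of `text` and indexes every complete four-line record.
    void assign(std::string text) {
        buffer = std::move(text);
        records.clear();
        std::string_view rest(buffer);
        std::string_view lines[4];
        std::size_t n = 0;
        while (!rest.empty()) {
            const std::size_t eol = rest.find('\n');
            lines[n++] = rest.substr(0, eol);
            rest = (eol == std::string_view::npos) ? std::string_view{} : rest.substr(eol + 1);
            if (n == 4) {
                // Drop the leading '@' of the identifier line.
                if (!lines[0].empty() && lines[0].front() == '@') lines[0].remove_prefix(1);
                records.push_back({lines[0], lines[1], lines[2], lines[3]});
                n = 0;
            }
        }
    }
};
```

Because every field is a view, a record costs four pointer/length pairs regardless of read length, and the views stay valid only as long as the owning buffer is neither modified nor destroyed.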
`FastqReader` reads records in batches:

```cpp
fq::io::FastqReader reader("input.fastq.gz");
fq::io::FastqBatch batch;
while (reader.nextBatch(batch, 10000)) {
    for (const auto& record : batch.records()) {
        // process each record
    }
}
```

Performance parameters: `readChunkBytes`, `zlibBufferBytes`, `maxBufferBytes`
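The `while (reader.nextBatch(batch, N))` loop above relies on a simple contract: each call refills the batch with at most `N` records and reports whether anything was read. A minimal sketch of that contract over a plain `std::istream` follows. `OwnedRecord` and this free function `nextBatch` are hypothetical, the records own their strings (unlike the zero-copy `FastqRecord`), and gzip decoding is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical owning record; the '@' prefix is kept in `id` as read.
struct OwnedRecord {
    std::string id, sequence, separator, quality;
};

// Refills `batch` with up to `maxRecords` four-line records.
// Returns false once no further record could be read. A truncated
// trailing record is silently dropped in this sketch.
bool nextBatch(std::istream& in, std::vector<OwnedRecord>& batch, std::size_t maxRecords) {
    batch.clear();
    OwnedRecord rec;
    while (batch.size() < maxRecords &&
           std::getline(in, rec.id) && std::getline(in, rec.sequence) &&
           std::getline(in, rec.separator) && std::getline(in, rec.quality)) {
        batch.push_back(rec);
    }
    return !batch.empty();
}
```

Checking the batch size before reading means a full batch never consumes the first line of the next record.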
`FastqWriter` writes records one at a time:

```cpp
fq::io::FastqWriter writer("output.fastq.gz");
for (const auto& record : batch.records()) {
    writer.write(record);
}
```

Performance parameters: `zlibBufferBytes`, `outputBufferBytes`
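The batch pool is based on the object-pool pattern and reduces frequent allocations in the TBB pipeline:

```cpp
auto pool = fq::io::createFastqBatchPool(initialSize, maxSize);
auto batch = pool->acquire();  // acquire a batch from the pool
```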
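One common way to build such a pool is a `std::shared_ptr` with a custom deleter that hands the object back instead of freeing it. The sketch below shows that technique under stated assumptions: `BatchPoolSketch` and `PooledBatch` are hypothetical names, `maxSize` is taken to cap the number of idle batches kept for reuse, and nothing here is confirmed to match what `createFastqBatchPool` does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct PooledBatch {
    std::string buffer;  // hypothetical payload
};

class BatchPoolSketch : public std::enable_shared_from_this<BatchPoolSketch> {
public:
    static std::shared_ptr<BatchPoolSketch> create(std::size_t initialSize, std::size_t maxSize) {
        std::shared_ptr<BatchPoolSketch> pool(new BatchPoolSketch(maxSize));
        for (std::size_t i = 0; i < initialSize && i < maxSize; ++i) {
            pool->idle_.push_back(std::make_unique<PooledBatch>());
        }
        return pool;
    }

    // Hands out an idle batch, or allocates a new one when none is idle.
    std::shared_ptr<PooledBatch> acquire() {
        std::unique_ptr<PooledBatch> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!idle_.empty()) {
                batch = std::move(idle_.back());
                idle_.pop_back();
            }
        }
        if (!batch) batch = std::make_unique<PooledBatch>();
        // The deleter returns the batch to the pool if the pool still exists.
        std::weak_ptr<BatchPoolSketch> weak = weak_from_this();
        return std::shared_ptr<PooledBatch>(batch.release(), [weak](PooledBatch* raw) {
            std::unique_ptr<PooledBatch> owned(raw);
            if (auto pool = weak.lock()) pool->release(std::move(owned));
        });
    }

    std::size_t idleCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_.size();
    }

private:
    explicit BatchPoolSketch(std::size_t maxSize) : maxSize_(maxSize) {}

    void release(std::unique_ptr<PooledBatch> batch) {
        batch->buffer.clear();  // drop contents, keep capacity for reuse
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_.size() < maxSize_) idle_.push_back(std::move(batch));
        // Otherwise the batch is destroyed when this function returns.
    }

    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<PooledBatch>> idle_;
    std::size_t maxSize_;
};
```

Holding only a `weak_ptr` to the pool inside the deleter means a batch that outlives its pool is simply freed instead of touching a destroyed object.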
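A batch acquired from the pool is returned automatically when its `shared_ptr` is destroyed.

A processing pipeline is created through a factory function: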
```cpp
auto pipeline = fq::processing::createProcessingPipeline();
```

Interface methods:
```cpp
class ProcessingPipelineInterface {
public:
    virtual void setInputPath(const std::string& path) = 0;
    virtual void setOutputPath(const std::string& path) = 0;
    virtual void setProcessingConfig(const ProcessingConfig& config) = 0;
    virtual void addReadPredicate(std::unique_ptr<ReadPredicateInterface> predicate) = 0;
    virtual void addReadMutator(std::unique_ptr<ReadMutatorInterface> mutator) = 0;
    virtual auto run() -> ProcessingStats = 0;
};
```

`ProcessingConfig` parameters:

| Parameter | Type | Description |
|---|---|---|
| `batchSize` | `size_t` | Number of reads per batch |
| `threadCount` | `size_t` | Number of parallel threads |
| `readChunkBytes` | `size_t` | Read chunk size |
| `zlibBufferBytes` | `size_t` | zlib buffer size |
| `writerBufferBytes` | `size_t` | Write buffer size |
| `batchCapacityBytes` | `size_t` | Per-batch memory limit |
| `memoryLimitBytes` | `size_t` | Total memory limit |
| `maxInFlightBatches` | `size_t` | Number of batches in flight concurrently |
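`ProcessingStats` fields:

| Field | Type | Description |
|---|---|---|
| `totalReads` | `uint64_t` | Total number of input reads |
| `passedReads` | `uint64_t` | Number of reads that passed filtering |
| `filteredReads` | `uint64_t` | Number of reads filtered out |
| `errorReads` | `uint64_t` | Number of reads with errors |
| `inputBytes` | `uint64_t` | Input size in bytes |
| `outputBytes` | `uint64_t` | Output size in bytes |
| `elapsedMs` | `uint64_t` | Total elapsed time (milliseconds) |
| `throughputMbps` | `double` | Throughput (MB/s) |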
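As a rough illustration of how counters like these are typically consumed, the sketch below derives a pass rate and a throughput figure. `StatsSketch` is a hypothetical struct mirroring a few of the fields above; the formulas are assumptions (throughput computed from input bytes, with 1 MB taken as 10^6 bytes), since the page does not state how `throughputMbps` is defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of a subset of the fields listed above.
struct StatsSketch {
    std::uint64_t totalReads = 0;
    std::uint64_t passedReads = 0;
    std::uint64_t inputBytes = 0;
    std::uint64_t elapsedMs = 0;
};

// Share of input reads that passed filtering; 0.0 when nothing was read.
double passRate(const StatsSketch& s) {
    if (s.totalReads == 0) return 0.0;
    return static_cast<double>(s.passedReads) / static_cast<double>(s.totalReads);
}

// Input throughput in MB/s, assuming 1 MB = 10^6 bytes.
double inputThroughputMBps(const StatsSketch& s) {
    if (s.elapsedMs == 0) return 0.0;
    const double seconds = static_cast<double>(s.elapsedMs) / 1000.0;
    return static_cast<double>(s.inputBytes) / 1e6 / seconds;
}
```

Both helpers guard against a zero denominator, which matters for empty inputs and for runs that finish in under a millisecond.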
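Read filters implement `ReadPredicateInterface`:

```cpp
class ReadPredicateInterface {
public:
    virtual auto evaluate(const fq::io::FastqRecord& read) const -> bool = 0;
};
```

Built-in implementations:

| Class | Description |
|---|---|
| `MinQualityPredicate` | Minimum average quality filter |
| `MinLengthPredicate` | Minimum read length filter |
| `MaxLengthPredicate` | Maximum read length filter |
| `MaxNRatioPredicate` | Maximum N-base ratio filter |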
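A custom filter follows the same shape as the built-in ones. The standalone sketch below mirrors the interface with hypothetical types (`ReadSketch`, `PredicateSketch`, and the two `...Sketch` classes are not library names) and assumes that a read is kept only if every registered predicate accepts it, which the page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string_view>
#include <vector>

struct ReadSketch {
    std::string_view sequence;
};

class PredicateSketch {
public:
    virtual ~PredicateSketch() = default;
    virtual bool evaluate(const ReadSketch& read) const = 0;
};

// Accepts reads with at least `minLength` bases.
class MinLengthSketch : public PredicateSketch {
public:
    explicit MinLengthSketch(std::size_t minLength) : minLength_(minLength) {}
    bool evaluate(const ReadSketch& read) const override {
        return read.sequence.size() >= minLength_;
    }

private:
    std::size_t minLength_;
};

// Accepts reads whose share of N bases does not exceed `maxRatio`.
class MaxNRatioSketch : public PredicateSketch {
public:
    explicit MaxNRatioSketch(double maxRatio) : maxRatio_(maxRatio) {}
    bool evaluate(const ReadSketch& read) const override {
        if (read.sequence.empty()) return true;
        std::size_t n = 0;
        for (char c : read.sequence) {
            if (c == 'N' || c == 'n') ++n;
        }
        return static_cast<double>(n) / static_cast<double>(read.sequence.size()) <= maxRatio_;
    }

private:
    double maxRatio_;
};

// Assumed combination rule: logical AND, stopping at the first rejection.
bool passesAll(const std::vector<std::unique_ptr<PredicateSketch>>& predicates,
               const ReadSketch& read) {
    for (const auto& predicate : predicates) {
        if (!predicate->evaluate(read)) return false;
    }
    return true;
}
```

Ordering cheap predicates first (length before per-base scans) lets the early exit skip the more expensive checks for reads that are rejected anyway.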
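Read modifiers implement `ReadMutatorInterface`:

```cpp
class ReadMutatorInterface {
public:
    virtual void process(fq::io::FastqRecord& read) = 0;
};
```

Built-in implementations:

| Class | Description |
|---|---|
| `QualityTrimmer` | Quality trimming (`Both` / `FivePrime` / `ThreePrime`) |
| `LengthTrimmer` | Length trimming (`FixedLength` / `MaxLength` / `FromStart` / `FromEnd`) |
| `AdapterTrimmer` | Adapter trimming |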
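Because records are views, a mutator can trim a read by shrinking the views without copying any bases. The sketch below shows one simple 3' quality-trimming rule under that idea: drop trailing bases whose Phred score is below a threshold. `ReadSketch`, `MutatorSketch`, and `ThreePrimeTrimmerSketch` are hypothetical, and the rule is only an illustration; the algorithm used by `QualityTrimmer` is not described on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

struct ReadSketch {
    std::string_view sequence;
    std::string_view quality;
};

class MutatorSketch {
public:
    virtual ~MutatorSketch() = default;
    virtual void process(ReadSketch& read) = 0;
};

// Removes trailing bases whose Phred score is below `minQuality`.
class ThreePrimeTrimmerSketch : public MutatorSketch {
public:
    explicit ThreePrimeTrimmerSketch(int minQuality, int qualityEncoding = 33)
        : minQuality_(minQuality), qualityEncoding_(qualityEncoding) {}

    void process(ReadSketch& read) override {
        // Sequence and quality should have equal length; use the shorter to be safe.
        std::size_t keep = std::min(read.sequence.size(), read.quality.size());
        while (keep > 0 &&
               static_cast<unsigned char>(read.quality[keep - 1]) - qualityEncoding_ < minQuality_) {
            --keep;
        }
        // Shrinking the views leaves the underlying buffer untouched.
        read.sequence = read.sequence.substr(0, keep);
        read.quality = read.quality.substr(0, keep);
    }

private:
    int minQuality_;
    int qualityEncoding_;
};
```

A read whose bases are all below the threshold ends up with empty views, so a length predicate applied afterwards can discard it.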
Statistics are computed by a calculator created from `StatisticOptions`:

```cpp
fq::statistic::StatisticOptions options;
options.inputFastqPath = "input.fastq.gz";
options.outputStatPath = "output.stat.txt";
options.threadCount = 4;
auto calculator = fq::statistic::createStatisticCalculator(options);
calculator->run();
```

`StatisticOptions` fields:

| Field | Type | Description |
|---|---|---|
| `inputFastqPath` | `std::string` | Input FASTQ file path |
| `outputStatPath` | `std::string` | Output statistics file path |
| `threadCount` | `size_t` | Number of threads |
| `batchSize` | `size_t` | Batch size |
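Statistics result fields:

| Field | Type | Description |
|---|---|---|
| `readCount` | `uint64_t` | Total number of reads |
| `totalBases` | `uint64_t` | Total number of bases |
| `maxReadLength` | `uint32_t` | Maximum read length |
| `posQualityDist` | `vector<vector<uint64_t>>` | Per-position quality distribution |
| `posBaseDist` | `vector<vector<uint64_t>>` | Per-position base distribution |

Results from multiple batches can be merged with `operator+=`.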
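Merging per-batch results usually means summing counters, taking the maximum of the read lengths, and adding the distribution tables element by element. The sketch below shows one way to write such an `operator+=`. `StatSketch` is a hypothetical struct, and the `[position][value]` table layout is an assumption, since the page gives only the field types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using CountTable = std::vector<std::vector<std::uint64_t>>;

// Element-wise dst += src, growing dst when src covers more positions or values.
inline void addCounts(CountTable& dst, const CountTable& src) {
    if (dst.size() < src.size()) dst.resize(src.size());
    for (std::size_t pos = 0; pos < src.size(); ++pos) {
        if (dst[pos].size() < src[pos].size()) dst[pos].resize(src[pos].size());
        for (std::size_t v = 0; v < src[pos].size(); ++v) {
            dst[pos][v] += src[pos][v];
        }
    }
}

struct StatSketch {
    std::uint64_t readCount = 0;
    std::uint64_t totalBases = 0;
    std::uint32_t maxReadLength = 0;
    CountTable posQualityDist;  // assumed layout: [position][quality score]
    CountTable posBaseDist;     // assumed layout: [position][base index]

    StatSketch& operator+=(const StatSketch& other) {
        readCount += other.readCount;
        totalBases += other.totalBases;
        maxReadLength = std::max(maxReadLength, other.maxReadLength);
        addCounts(posQualityDist, other.posQualityDist);
        addCounts(posBaseDist, other.posBaseDist);
        return *this;
    }
};
```

Since every step is associative and commutative, worker threads can accumulate into private results and merge them in any order at the end.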
`SequenceUtils` is a utility class for DNA/RNA sequence processing. It uses C++ Concepts (`std::ranges::range`, standardized in C++20) to constrain its template parameters:

```cpp
namespace fq::core {
class SequenceUtils {
public:
    template <std::ranges::range R>
    static auto gcContent(const R& sequence) -> double;
    template <std::ranges::range R>
    static auto nRatio(const R& sequence) -> double;
    static auto reverseComplement(std::string_view sequence) -> std::string;
    static auto isValidBase(char base) -> bool;
};
}  // namespace fq::core
```

`Configuration` loads settings from a file, command-line arguments, and environment variables:

```cpp
namespace fq::config {
class Configuration {
public:
    void loadFromFile(const std::string& configFile);
    void loadFromArgs(int argc, const char* argv[]);
    void loadFromEnv();
    template <typename T> auto get(const std::string& key) const -> T;
    template <typename T> auto getOr(const std::string& key, const T& def) const -> T;
    template <typename T> void set(const std::string& key, const T& value);
    auto hasKey(const std::string& key) const -> bool;
    void validate() const;
};
}  // namespace fq::config
```

Configuration precedence, from lowest to highest: defaults → configuration file → environment variables → command-line arguments.
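The precedence chain can be pictured as a stack of key/value layers that is searched from the highest-priority source down. The sketch below assumes that later sources in the chain override earlier ones; `LayeredConfig` and `Layer` are hypothetical, and all values are kept as strings for brevity, so this is not how `Configuration` is implemented.

```cpp
#include <cassert>
#include <initializer_list>
#include <map>
#include <optional>
#include <string>

using Layer = std::map<std::string, std::string>;

// Hypothetical layered lookup: one map per configuration source.
struct LayeredConfig {
    Layer defaults;
    Layer file;
    Layer env;
    Layer args;

    // Returns the value from the highest-priority layer that defines `key`.
    std::optional<std::string> get(const std::string& key) const {
        for (const Layer* layer : {&args, &env, &file, &defaults}) {
            const auto it = layer->find(key);
            if (it != layer->end()) return it->second;
        }
        return std::nullopt;
    }

    // Mirrors the getOr idea: fall back to `def` when no layer has the key.
    std::string getOr(const std::string& key, const std::string& def) const {
        return get(key).value_or(def);
    }
};
```

Keeping the layers separate, rather than merging them into one map at load time, makes it easy to report which source a value came from.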
```
FastQException
├── IOError             file I/O errors
├── FormatError         FASTQ format errors
├── ConfigurationError  configuration errors
└── ValidationError     validation errors
```
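A hierarchy like the one above lets callers handle a specific failure first and fall back to the base type for everything else. The sketch below reproduces that pattern in a separate `sketch` namespace. The class names are borrowed from the tree for readability only; deriving the base from `std::runtime_error` is an assumption, and `describe` is a hypothetical helper.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

namespace sketch {

// Assumed base: the real FastQException may carry more context.
class FastQException : public std::runtime_error {
public:
    explicit FastQException(const std::string& message) : std::runtime_error(message) {}
};

class IOError : public FastQException {
public:
    explicit IOError(const std::string& message) : FastQException(message) {}
};

class FormatError : public FastQException {
public:
    explicit FormatError(const std::string& message) : FastQException(message) {}
};

// Runs `action` and reports which handler caught the failure, if any.
inline std::string describe(const std::function<void()>& action) {
    try {
        action();
        return "ok";
    } catch (const IOError& e) {
        // Most specific handler first.
        return std::string("io: ") + e.what();
    } catch (const FastQException& e) {
        // Any other error from the hierarchy.
        return std::string("fastq: ") + e.what();
    }
}

}  // namespace sketch
```

Handlers are tried in order, so the base-class handler must come last; otherwise it would also catch `IOError`.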
```cpp
enum class ErrorCategory { IO, Format, Validation, Processing, Resource, Configuration };
enum class ErrorSeverity { Info, Warning, Error, Critical };
```

Errors are raised through helper macros:

```cpp
FQ_THROW_CONFIG_ERROR("Required key 'input' is missing");
FQ_THROW_IO_ERROR("Failed to open file: " + path);
```

Logging:

```cpp
fq::logging::LogOptions options;
options.level = "info";  // trace/debug/info/warn/error
options.colored = true;
fq::logging::init(options);
fq::logging::info("Processing {} reads", readCount);
fq::logging::warn("Quality below threshold: {}", quality);
fq::logging::setLevel("debug");
```

CMake integration:

```cmake
find_package(FastQTools REQUIRED)
target_link_libraries(my_app PRIVATE FastQTools::FastQTools)
```

Complete example:

```cpp
#include <fqtools/fq.h>
#include <iostream>
#include <memory>

int main() {
    // Create the processing pipeline
    auto pipeline = fq::processing::createProcessingPipeline();
    pipeline->setInputPath("input.fastq");
    pipeline->setOutputPath("output.fastq");

    // Configure
    fq::processing::ProcessingConfig config;
    config.batchSize = 10000;
    config.threadCount = 4;
    pipeline->setProcessingConfig(config);

    // Add filter conditions
    pipeline->addReadPredicate(
        std::make_unique<fq::processing::MinQualityPredicate>(20.0, 33));
    pipeline->addReadPredicate(
        std::make_unique<fq::processing::MinLengthPredicate>(50));

    // Add a trimmer
    pipeline->addReadMutator(
        std::make_unique<fq::processing::QualityTrimmer>(
            20.0, 50, fq::processing::QualityTrimmer::TrimMode::Both, 33));

    // Run
    auto stats = pipeline->run();
    std::cout << stats.toString() << std::endl;
    return 0;
}
```

Related pages:

- Architecture: architecture design and concurrency model
- CLI Reference: command-line usage
- Testing Strategy: testing strategy
FastQTools © 2026 LessUp · MIT License · Online documentation · Issues
FastQTools v3.1.0