What Serious Bugs Exist in the Current System's Order Tracking?

SHI XIAOLONG

21 Feb 2026 — 6 min read

Order Tracking System: Critical Bug Analysis Report

Analysis date: 2026-02-21
Scope: src/trading/executor.py, src/trading/websocket_order_manager.py, src/trading/position_manager.py, src/trading/trade_repository.py

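Severity Summary

| # | Location | Severity | Impact |
| --- | --- | --- | --- |
| 1 | `executor.py:677` | 🔴 Critical | Fill-price backfill never succeeds; PnL is computed incorrectly |
| 2 | `executor.py:1143` | 🔴 Critical | Error statuses are conflated; monitoring and DB records are distorted |
| 3 | `websocket_order_manager.py:83` | 🟠 Medium | A race condition loses the fill price; PnL deviates |
| 4 | `position_manager.py:1134` | 🟠 Medium | Ghost-position PnL is not recorded; statistics are inaccurate |
| 5 | `position_manager.py:550` | 🟠 Medium | A Lark exception silently disables the risk and statistics updates |
| 6 | `websocket_order_manager.py:210` | 🟡 Minor | Semantic error; other guards exist, so behavior is still acceptable |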

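Bug 1 (most severe): `_backfill_order_price` does not unwrap the API response, so the fill-price backfill always fails

Location: executor.py:677-688

Root cause

In its raw form, query_order_by_oid returns a nested structure: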

{
  "status": "order",
  "order": {
    "status": "filled",
    "avgPx": "1234.5",
    "totalSz": "1.0"
  }
}

query_order_status (:1571) already implements the unwrapping correctly:

# executor.py:1575-1579
if raw.get("status") == "order" and isinstance(raw.get("order"), dict):
    return raw["order"]   # ✅ returns the inner dict, which contains avgPx/totalSz
return raw

But _backfill_order_price (:677) calls get("avgPx") directly on the outer dict, which always returns None:

# executor.py:677-688 (buggy code)
status_resp = retry_call(
    lambda: self._info.query_order_by_oid(self._wallet.address, order_result.order_id),
    description="order status query",
)
if isinstance(status_resp, dict):
    avg_px = status_resp.get("avgPx") or status_resp.get("avg_px")   # ❌ the outer dict has no such field, always None
    total_sz = status_resp.get("totalSz") or status_resp.get("total_sz")  # ❌ same as above

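Trigger scenarios (3 call sites, all affected)

| Call site | Trigger condition |
| --- | --- |
| `limit_open` `:822` | A partial fill is discovered after the timeout cancel |
| `limit_close` Leg A `:985` | The position is zero after the timeout cancel, so the order is treated as filled |
| `limit_close` Leg B `:1021` | The position is zero after the timeout cancel, so the order is treated as filled |

Impact

order_result.price stays at the limit price instead of the actual average fill price
The PnL is wrong by an amount that depends on the gap between the market price and the limit price
The trade_orders.price field in the DB is recorded inaccurately
On open, the Leg B quantity is computed from the wrong Leg A price, which skews the hedge ratio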

Proposed fix

_backfill_order_price should reuse the existing query_order_status method (which already unwraps correctly) rather than calling the low-level API directly:

# Fix sketch
def _backfill_order_price(self, order_result: OrderResult, coin: str):
    if not order_result.order_id:
        return
    try:
        # Reuse query_order_status (it already unwraps the nested response)
        status_resp = self.query_order_status(order_result.order_id)
        if isinstance(status_resp, dict):
            avg_px = status_resp.get("avgPx") or status_resp.get("avg_px")
            total_sz = status_resp.get("totalSz") or status_resp.get("total_sz")
            ...
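
The unwrapping step the fix depends on can be sketched as standalone functions, so it runs without the trading code. The helper names `unwrap_order_response` and `extract_fill` are hypothetical; only the nested response shape and the `avgPx`/`totalSz` (or `avg_px`/`total_sz`) field names come from the report.

```python
from typing import Any, Optional, Tuple


def unwrap_order_response(raw: Any) -> Any:
    """Return the inner order dict when the response uses the nested shape."""
    if (
        isinstance(raw, dict)
        and raw.get("status") == "order"
        and isinstance(raw.get("order"), dict)
    ):
        return raw["order"]
    return raw


def extract_fill(raw: Any) -> Tuple[Optional[float], Optional[float]]:
    """Read (average price, total size) from a raw or already-unwrapped response."""
    resp = unwrap_order_response(raw)
    if not isinstance(resp, dict):
        return None, None
    avg_px = resp.get("avgPx") or resp.get("avg_px")
    total_sz = resp.get("totalSz") or resp.get("total_sz")
    return (
        float(avg_px) if avg_px else None,
        float(total_sz) if total_sz else None,
    )


nested = {
    "status": "order",
    "order": {"status": "filled", "avgPx": "1234.5", "totalSz": "1.0"},
}

# Reading the outer dict directly reproduces the bug: the field is not there.
print(nested.get("avgPx"))   # None
# Unwrapping first recovers the fill price and size.
print(extract_fill(nested))  # (1234.5, 1.0)
```

Routing every caller through one unwrapping helper means the nested shape is handled in a single place, which is the same reason the report recommends reusing query_order_status.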

Bug 2: `_parse_order_response` conflates order error statuses (`ERROR` is labeled `REJECTED`)

Location: executor.py:1143-1146

Root cause

# executor.py:1143-1146 (buggy code)
elif statuses and ORDER_STATUS_ERROR in statuses[0]:
    result.success = False
    result.error_message = statuses[0][ORDER_STATUS_ERROR]
    result.status = ORDER_STATUS_REJECTED   # ❌ should be ORDER_STATUS_ERROR

Constant definitions:

ORDER_STATUS_REJECTED = "rejected" (the exchange actively rejected the order, e.g. insufficient margin)
ORDER_STATUS_ERROR = "error" (system/API error, not a normal rejection)

When the exchange returns an error status, the system incorrectly labels it as rejected.

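Impact

The notification condition in _notify_limit_order (:516) is result.status in (ORDER_STATUS_REJECTED, ORDER_STATUS_ERROR), so both statuses still trigger an alert, but the alert title logic becomes confusing
The DB field trade_orders.status cannot distinguish "rejected by the exchange" from "system error", which hampers post-incident investigation
The status shown in Lark alerts is inaccurate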

Proposed fix

elif statuses and ORDER_STATUS_ERROR in statuses[0]:
    result.success = False
    result.error_message = statuses[0][ORDER_STATUS_ERROR]
    result.status = ORDER_STATUS_ERROR   # ✅ correct status
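
A regression-test-style sketch can pin the expected mapping so the two statuses do not drift together again. The `OrderResult` dataclass and the `parse_first_status` function below are simplified stand-ins, not the real `_parse_order_response`; only the two constants and the error branch mirror the report.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ORDER_STATUS_REJECTED = "rejected"
ORDER_STATUS_ERROR = "error"


@dataclass
class OrderResult:
    # Simplified stand-in for the real result object (assumed fields).
    success: bool = True
    status: Optional[str] = None
    error_message: Optional[str] = None


def parse_first_status(statuses: List[Dict[str, str]]) -> OrderResult:
    """Handle only the error branch; other branches are omitted in this sketch."""
    result = OrderResult()
    if statuses and ORDER_STATUS_ERROR in statuses[0]:
        result.success = False
        result.error_message = statuses[0][ORDER_STATUS_ERROR]
        result.status = ORDER_STATUS_ERROR  # not ORDER_STATUS_REJECTED
    return result


result = parse_first_status([{"error": "example error text"}])
print(result.status)  # error
```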

Bug 3: `wait_for_order` has a race window in which the `userFills` fill price is lost

Location: websocket_order_manager.py:83-98

Root cause

# websocket_order_manager.py:83-98
def wait_for_order(self, tracking: OrderTracking) -> bool:
    tracking.result_event.wait(timeout=tracking.timeout_seconds + 30)  # set() is triggered by orderUpdates

    # ❌ pops immediately after event.set(), but userFills may not have arrived yet
    if tracking.status == OrderStatus.FILLED and not tracking.has_fill_price:
        with self._lock:
            fill = self._fill_prices.pop(tracking.oid, None)   # may still be None at this point
        if fill:
            ...

Race timeline

[Thread A: orderUpdates]         [Thread B: wait_for_order]       [Thread C: userFills]
        |                                    |                              |
  status=FILLED                             |                              |
  _tracking.pop(oid)                        |                              |
  event.set() ──────────────────────► wait() returns                      |
                                      checks has_fill_price=False          |
                                      _fill_prices.pop(oid) → None        |
                                      fill price lost ✗                   |
                                                                   _cache_fill(oid, px, sz)
                                                                   (no consumer left)

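Impact

tracking.avg_price stays at 0 or the limit price (depending on the fallback logic in _on_order_update)
_track_limit_order uses the wrong price for order_result.price
When prices are volatile, the PnL deviation is significant

Fix direction

Either add a short wait before the pop in wait_for_order, or have _on_order_update hold back event.set() until userFills has arrived and has_fill_price is set, so the event is not set prematurely.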
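
The first option, a bounded wait before the pop, might look like the following sketch. `FillCache`, `cache_fill`, `pop_fill` and `grace_seconds` are hypothetical names rather than the real manager's API, and integer order ids are an assumption. A `threading.Condition` replaces a plain sleep, so the consumer wakes as soon as the fill arrives.

```python
import threading
from typing import Dict, Optional, Tuple

Fill = Tuple[float, float]  # (price, size)


class FillCache:
    """Holds fills by order id and lets a consumer wait briefly for a late one."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._fills: Dict[int, Fill] = {}

    def cache_fill(self, oid: int, px: float, sz: float) -> None:
        # Would be called from the userFills handler thread.
        with self._cond:
            self._fills[oid] = (px, sz)
            self._cond.notify_all()

    def pop_fill(self, oid: int, grace_seconds: float = 1.0) -> Optional[Fill]:
        # Would be called from wait_for_order after the order event fires.
        with self._cond:
            # Returns early once the fill is present, else after the grace period.
            self._cond.wait_for(lambda: oid in self._fills, timeout=grace_seconds)
            return self._fills.pop(oid, None)


cache = FillCache()
# Simulate userFills arriving 50 ms after orderUpdates reported FILLED.
threading.Timer(0.05, cache.cache_fill, args=(42, 1234.5, 1.0)).start()
late_fill = cache.pop_fill(42, grace_seconds=1.0)
print(late_fill)  # (1234.5, 1.0)

# An order whose fill never arrives still returns after the grace period.
missing = cache.pop_fill(99, grace_seconds=0.05)
print(missing)    # None
```

The grace period only narrows the window. A fill that arrives after it expires is still lost, so a fallback (for example, a later backfill query) would still be needed.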

Bug 4: `sync_with_exchange` does not write `realized_pnl` to the DB after closing a ghost position

Location: position_manager.py:1134-1138

Root cause

# position_manager.py:1134-1138 (buggy code)
for pos_id, close_time, *_ in _close_ops:
    self._repo.update_position_status(
        position_id=pos_id,
        status=PositionStatus.CLOSED,
        close_time=close_time,
        # ❌ realized_pnl argument is missing
    )

Compare with the normal close flow (_execute_close :538-546):

# position_manager.py:538-546 (the correct close flow)
self._repo.update_position_status(
    position_id=position.position_id,
    status=PositionStatus.CLOSED,
    close_time=now,
    realized_pnl=realized_pnl,   # ✅ PnL recorded correctly
    alt_exit_price=alt_exit_price,
    ...
)

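Impact

When a ghost position (one that has disappeared from the exchange without the system noticing) is closed, realized_pnl in the DB stays at 0
The daily_trading_stats.total_realized_pnl statistic is inaccurate
The profit or loss for that position is permanently missing from historical PnL reports

Fix direction

Add a realized_pnl field to the _close_ops tuples and pass it in the update_position_status call (it can be estimated from mid_price, consistent with how the alt_gone_from_exchange branch in _execute_close handles it).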
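
As a rough sketch of that direction, the close operations can carry the estimated PnL alongside the id and close time. Everything here is illustrative: `CloseOp`, `estimate_pnl` and `FakeRepo` are hypothetical, and the linear PnL formula is an assumption, since the report does not show how `_execute_close` computes its estimate.

```python
from typing import Dict, List, NamedTuple


class CloseOp(NamedTuple):
    position_id: str
    close_time: float
    realized_pnl: float  # new field carried alongside the existing ones


def estimate_pnl(entry_px: float, mid_px: float, size: float, is_long: bool) -> float:
    """Linear PnL estimate from the current mid price (an assumed formula)."""
    direction = 1.0 if is_long else -1.0
    return direction * (mid_px - entry_px) * size


class FakeRepo:
    """Records calls so the sketch runs without a database."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, object]] = []

    def update_position_status(self, **kwargs: object) -> None:
        self.calls.append(kwargs)


repo = FakeRepo()
close_ops = [
    CloseOp("pos-1", 1_700_000_000.0, estimate_pnl(100.0, 110.0, 2.0, is_long=True)),
]

for op in close_ops:
    repo.update_position_status(
        position_id=op.position_id,
        status="closed",  # stands in for PositionStatus.CLOSED
        close_time=op.close_time,
        realized_pnl=op.realized_pnl,  # no longer dropped
    )

print(repo.calls[0]["realized_pnl"])  # 20.0
```

Named fields also avoid the `*_` unpacking in the original loop, which hides whether a PnL value is present in the tuple at all.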

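Bug 5: after a close-position DB update fails, the subsequent statistics updates are unexpectedly skipped

Location: position_manager.py:550-558

Root cause

# position_manager.py:550-558 (buggy code)
except Exception as e:
    logger.error(f"Close-position DB update failed: {position.position_id} | {e}")
    sender_colourful(        # ❌ not protected by try-except
        title="Close-position DB update failed",
        content=(
            f"Position `{position.position_id}` was closed on the exchange but the DB update failed, please check manually\n"
            f"```\n{e}\n```"
        ),
    )
    # If sender_colourful raises, the two lines below never run:

self._risk_manager.update_daily_pnl(realized_pnl)    # ❌ risk statistics lost
self._repo.update_daily_stats(...)                    # ❌ daily trading statistics lost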

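Trigger condition

When the Lark Bot hits a network failure (OSError, TimeoutError, etc.), sender_colourful raises, and the subsequent update_daily_pnl and update_daily_stats calls are skipped silently, with no log entry.

Impact

The accumulated RiskManager._daily_pnl omits the closed position's PnL, which can affect when the stop-loss circuit breaker triggers
The trades_closed and total_realized_pnl values in daily_trading_stats are inaccurate
The problem is silent and can only be detected through external reconciliation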

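Proposed fix

except Exception as e:
    logger.error(f"Close-position DB update failed: {position.position_id} | {e}")
    try:
        sender_colourful(title="Close-position DB update failed", content=...)
    except Exception as notify_err:
        logger.error(f"Failed to send the close-position DB update failure alert: {notify_err}")
    # The statistics updates must run whether or not the notification succeeds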
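
The same guard can be factored into a small wrapper so that every alert call site is protected in the same way. This is a sketch under assumptions: `notify_safely`, `failing_sender` and `finish_close` are hypothetical stand-ins, and the failures are simulated rather than taken from the real code.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("position_manager_sketch")


def notify_safely(send: Callable[..., None], **kwargs: str) -> bool:
    """Call a notifier and log, rather than propagate, any failure."""
    try:
        send(**kwargs)
        return True
    except Exception as notify_err:
        logger.error("alert delivery failed: %s", notify_err)
        return False


def failing_sender(title: str, content: str) -> None:
    # Stand-in for sender_colourful during a network outage.
    raise TimeoutError("simulated Lark outage")


steps_run: List[str] = []


def finish_close() -> None:
    try:
        raise RuntimeError("simulated DB failure")
    except Exception as e:
        logger.error("close-position DB update failed: %s", e)
        notify_safely(failing_sender, title="DB update failed", content=str(e))
    # Reached even though both the DB update and the alert failed.
    steps_run.append("update_daily_pnl")
    steps_run.append("update_daily_stats")


finish_close()
print(steps_run)  # ['update_daily_pnl', 'update_daily_stats']
```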

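Bug 6 (minor): the condition in `_on_user_fill` is semantically wrong

Location: websocket_order_manager.py:210

Root cause

# websocket_order_manager.py:210 (semantic error)
if not px and not sz:   # ❌ AND logic is semantically wrong (should be OR)
    continue

Current behavior: the record is skipped only when px and sz are both falsy. With px=None, sz="10" it is not skipped, but fill_px becomes 0.0 and is then caught by the if fill_px <= 0 or fill_sz <= 0: return check inside _accumulate_fill.

Correct semantics: if either px or sz is missing, the fill record is incomplete and should be skipped:

if not px or not sz:   # ✅ OR logic, semantically correct
    continue

Impact

The guard inside _accumulate_fill currently covers this case, so there is no functional problem, but the condition is a gap in defensive programming and makes the code's intent unclear.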
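
The difference between the two conditions is easiest to see as a truth table. The two predicate functions below are hypothetical wrappers around the conditions quoted above, and the sample values are illustrative.

```python
from typing import Optional


def skip_with_and(px: Optional[str], sz: Optional[str]) -> bool:
    return not px and not sz  # current condition


def skip_with_or(px: Optional[str], sz: Optional[str]) -> bool:
    return not px or not sz  # proposed condition


cases = [(None, None), (None, "10"), ("1234.5", None), ("1234.5", "10")]
for px, sz in cases:
    # Columns: px, sz, skipped by AND form, skipped by OR form
    print(px, sz, skip_with_and(px, sz), skip_with_or(px, sz))
```

Only the two mixed rows differ: when exactly one field is missing, the AND form lets the record through and the OR form skips it.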

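Recommended fix priority

Fix immediately (Bugs 1, 2): Bug 1 corrupts the recorded fill price and the PnL calculation whenever the backfill path runs after a limit-order timeout; Bug 2 distorts the status written to monitoring and the DB
Fix soon (Bugs 3, 4, 5): triggered under specific conditions; they affect statistics accuracy and risk control
Fix when scheduled (Bug 6): no functional impact; handle it during code review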
