Rename test_g6_protocol_surface_gaps.py to test_protocol_surface_gaps.py.
Strip G6 from module docstring, and finding IDs (N01, N02, N03, N05,
N09, R04) from docstrings and section comments.
Rename test_g5_asyncio_lifecycle.py to test_asyncio_lifecycle.py.
Strip G5 from module docstring, finding IDs (F05, F07, F08, F19)
from class names and docstrings.
Rename test_g4_transport_symmetry.py to test_transport_symmetry.py.
Strip G4 from module docstring, _g4_ from function names, and finding
IDs (F04, NEW-A, F18, M06, N04) from docstrings and section comments.
Rename test_g2_error_handling.py to test_error_handling.py. Strip G2
prefix from module docstring, _g2_ from function names, and finding
IDs (F22, F21/M01, M02, M04, N06, F14) from docstrings and section
comments. Proposal cross-references removed.
Strip internal forensics finding references (F06, N07, NEW-C, R02)
from docstrings, comments, and assertion messages. Descriptive text
is preserved — only the ID prefixes are removed.
Strip internal forensics finding references (F01, F02, F03, N11)
from docstrings and section comments. The descriptive text is
preserved — only the ID prefixes are removed.
Strip G1/ prefix from docstrings, _g1_ from function names,
and G1 from section comments. Finding IDs (F06, N07, etc.)
are preserved as they are meaningful standalone references.
send_binary_req reads result.payload['expected_ack'] from MSG_SENT
events. The resolving subscribe must provide that key for MSG_SENT
event types to avoid KeyError.
The test_r03_placeholder_registered_before_send test used a bare
MagicMock dispatcher whose subscribe never resolved event futures,
causing send() to block for DEFAULT_TIMEOUT (15s). Add a resolving
subscribe mock matching the pattern from the fixture fix on
fix/test-timeout-waste.
The mock_dispatcher fixture's fake_subscribe recorded event handlers
but never called them, causing asyncio.wait() to block for the full
DEFAULT_TIMEOUT (15s) on every test that passes expected_events to
send(). With 28 affected tests, the suite wasted ~8 minutes on dead
waits and required an undocumented pytest-timeout plugin to complete.
Add call_soon to the default fake_subscribe so futures resolve on the
next event loop iteration, matching the pattern already used by
setup_event_response(). Override with a non-resolving mock in
test_send_timeout to preserve timeout path coverage.
Suite now completes in <1 second with no --timeout flag.
Why: MeshCore.subscribe typed callbacks as Callable[[Event], Coroutine[...]]
(async only), while EventDispatcher.subscribe typed them as
Callable[[Event], Union[None, asyncio.Future]] (sync or async). Type-checkers
flag any sync handler passed through MeshCore.subscribe. Fix: align the
annotation to match EventDispatcher's union type; remove unused Coroutine import.
Refs: Forensics report finding R05
Why: send_binary_req() registered the pending request with the reader AFTER
send() returned. If a BINARY_RESPONSE arrives between send() returning and
registration (reachable for TCP-companion proxies), the reader logs "No tracked
request found" and the caller's wait_for_event times out. Fix: pre-register a
placeholder keyed by object id before send(), then swap it for the real tag
from MSG_SENT. On send() failure, the placeholder is cleaned up.
Refs: Forensics report finding R03
Why: Three minor cleanup fixes. M03 adds an else branch in set_flood_scope so
unsupported scope types raise TypeError instead of UnboundLocalError. M05 removes
the dead `out_path_len >> 6` shift in update_contact (high bits always zero due
to reader masking) and initializes path_hash_mode=0 explicitly. M07 normalizes
three `return None` paths in get_contacts to return Event(EventType.ERROR, ...)
so callers can rely on the return type always being Event.
Refs: Forensics report findings M03, M05, M07
Why: 5 seconds is too short for slow-path mesh operations (path-resolving
messaging, long binary responses, remote auth). Also the root cause of tests
that appeared to "hang" — they were falling through to the 5s timeout because
their mock dispatchers don't wire matching responses. Landed as a separate
commit so reviewers can drop it independently if they push back.
Refs: Forensics report finding F09
Why: req_mma() references undefined variables `start` and `end`, causing a
NameError on every call. The logger.error migration warning confirms the method
is intentionally deprecated in favor of req_mma_sync. Since it is broken as
shipped, removing it cannot break any working caller.
Refs: Forensics report finding F13
Why: Five CommandType enum entries had no user-facing SDK method:
SEND_RAW_DATA (25), HAS_CONNECTION (28), GET_CONTACT_BY_KEY (30),
GET_TUNING_PARAMS (43), FACTORY_RESET (51). Added send_raw_data() in
MessagingCommands, has_connection()/get_tuning()/request_factory_reset()/
confirm_factory_reset() in DeviceCommands, get_contact_by_key() in
ContactCommands. FACTORY_RESET uses a two-step token pattern as a
Python-side safety measure against accidental invocation.
Refs: Forensics report finding N09 (also N03 for get_tuning)
Why: Firmware requires strict len > 10 (MyMesh.cpp:1620). When path is
empty, send_trace() builds exactly 10 bytes (cmd+tag+auth+flags), which
is silently rejected. Appending one zero byte when the packet is <= 10
bytes satisfies the firmware gate without changing the semantic content.
Refs: Forensics report finding N05
Why: Three firmware push/response codes had no SDK handler — frames fell
through to the "Unhandled" debug log. CONTACT_DELETED (0x8F) carries a
32-byte pubkey from onContactOverwrite(); CONTACTS_FULL (0x90) is a
1-byte push from onContactsFull(); TUNING_PARAMS (23) is the 9-byte
response to CMD_GET_TUNING_PARAMS carrying rx_delay and airtime_factor.
All three now dispatch typed events. Short frames are guarded.
Refs: Forensics report findings N01, N02, N03
Why: PacketType was missing CONTACTS_FULL (0x90), emitted by
MyMesh::onContactsFull(). CommandType was missing GET_STATS (56),
used by get_stats_core/radio/packets but referenced only as literal
b"\\x38". Adding both enum entries prepares for handler and wrapper
implementations in subsequent commits.
Refs: Forensics report findings N02, R04
10 new tests in tests/unit/test_g5_asyncio_lifecycle.py:
- TestF05: _spawn_background retains tasks in TCP, Serial, and
EventDispatcher; tracked tasks survive gc.collect(); TCP handle_rx
and connection_lost use tracked dispatch.
- TestF07: stop() waits for in-flight async callbacks to complete.
- TestF08: EventDispatcher.queue is None before start(), created on
start(), dispatch() before start() raises RuntimeError;
CommandHandlerBase lock is None before access, created lazily.
- TestF19: send() calls get_running_loop (not get_event_loop).
Refs: Forensics report findings F05, F07, F08, F19
Why: asyncio.get_event_loop() inside an async function emits
DeprecationWarning since Python 3.10 and raises in some contexts on
Python 3.12+. The call in CommandHandlerBase.send() is always inside
a running async context where get_running_loop() is the correct API.
Refs: Forensics report finding F19
Why: On Python 3.9/3.10, asyncio.Queue() and asyncio.Lock() bind to
the running event loop at construction time. If the SDK is instantiated
from a synchronous factory before an event loop exists, both primitives
raise "RuntimeError: ... is bound to a different event loop" on first
use. Fix: EventDispatcher defers Queue creation to start(), with a
guard in dispatch() that raises RuntimeError if called before start().
CommandHandlerBase defers Lock creation via a lazy @property accessor.
Both document the contract change in class docstrings.
Refs: Forensics report finding F08
Why: EventDispatcher._process_events() calls task_done() on the queue
immediately after spawning async callback tasks. await queue.join() in
stop() therefore returns as soon as all items are marked done, even if
their async callbacks are still executing. Any caller that does
"await dispatcher.stop(); cleanup()" could race with still-running
callbacks. Fix: after queue.join(), gather all tracked background tasks
before cancelling the dispatch loop.
Refs: Forensics report finding F07
Why: Python's asyncio holds only weak references to tasks created via
create_task(). Under GC pressure (especially Python < 3.11), unretained
tasks can be silently cancelled mid-execution, and any exceptions are
swallowed as "Task exception was never retrieved." Seven call sites
across TCPConnection, BLEConnection, SerialConnection, and
EventDispatcher used fire-and-forget create_task with no stored
reference. Fix: introduce _background_tasks set and _spawn_background()
helper on each class, following the standard pattern from the asyncio
docs (task.add_done_callback(set.discard)).
Refs: Forensics report finding F05
10 new tests in tests/unit/test_g4_transport_symmetry.py covering all
G4 findings:
- test_g4_tcp_send_write_error_fires_disconnect (F04): TCP write
OSError fires _disconnect_callback.
- test_g4_serial_send_no_transport_fires_disconnect (NEW-A): serial
send on None transport fires _disconnect_callback.
- test_g4_serial_send_write_error_fires_disconnect (F04): serial write
OSError fires _disconnect_callback.
- test_g4_ble_send_no_client_fires_disconnect (F04): BLE send with no
client fires _disconnect_callback.
- test_g4_serial_connect_timeout (F18): connect raises TimeoutError
when connection_made never fires.
- test_g4_tcp_oversize_frame_empty_data_returns (M06): oversize header
with empty trailing data returns without dispatch.
- test_g4_serial_oversize_frame_empty_data_returns (M06): same for
serial transport.
- test_g4_tcp_receive_count_per_frame_not_per_segment (N04): 3 TCP
segments carrying 1 frame yield _receive_count == 1.
- test_g4_tcp_multiple_frames_count_correctly (N04): 2 complete frames
yield _receive_count == 2.
F16 and F17 are covered by the updated pre-existing test in
tests/test_ble_pin_pairing.py (committed with F17).
Refs: Forensics report findings F04, NEW-A, F18, M06, N04
Why: _receive_count incremented inside data_received(), which fires
once per TCP read — not once per completed MeshCore frame. Under TCP
fragmentation a single frame can arrive in multiple segments, inflating
_receive_count relative to _send_count. The disconnect-detection
heuristic (send_count - receive_count >= 5) then never fires because
the receive side is over-counted. Moving the increment into handle_rx
after a complete frame is assembled makes the counter semantically
correct: one increment per MeshCore frame dispatched to the reader.
Refs: Forensics report finding N04
Why: When BLE pairing failed during connect(), the exception was caught
as a warning and connect() continued normally. This left the transport
in a half-usable state — the BLE link was up but the pairing handshake
never completed, so encrypted characteristics could silently fail.
Now the pairing exception disconnects the client and re-raises, giving
the caller a clean failure. Updated the pre-existing
test_ble_connection_with_pin_failed_pairing test to assert the re-raise
behavior instead of the old swallow-and-continue behavior.
Refs: Forensics report finding F17
Why: handle_disconnect resets self.client to self._user_provided_client.
That client's disconnected_callback was not re-registered, so subsequent
BLE disconnects after a successful reconnect cycle were missed —
ConnectionManager never learned the link dropped again. Now re-registers
via set_disconnected_callback with a hasattr guard and try/except for
bleak version compatibility. The next connect() call would also re-create
the client with the callback, but this closes the gap between disconnect
and reconnect.
Refs: Forensics report finding F16
Why: When handle_rx detects an oversize frame header (>300 bytes), it
resets state (header, inframe, frame_expected_size) and re-calls itself
on any remaining data. But when data is empty after the header, the
method fell through to the frame-assembly code with inframe=b"",
eventually dispatching an empty frame to reader.handle_rx. Currently
absorbed by F06's umbrella try/except, but a guaranteed crash if that
guard moves. Adding a bare return after the reset block prevents the
fallthrough in both TCP and serial transports.
Refs: Forensics report finding M06
Why: After create_serial_connection, connect() awaited
_connected_event.wait() with no timeout. If the serial device opened
but connection_made was never called (driver bug, USB adapter glitch),
connect() hung indefinitely. Now wrapped in asyncio.wait_for with a
configurable timeout (default 10s). asyncio.TimeoutError propagates
to the caller for clean failure handling.
Refs: Forensics report finding F18
Why: TCP was the only transport that fired _disconnect_callback on a dead
transport or send failure. Serial and BLE returned silently, leaving
ConnectionManager unaware the link was down. Additionally, transport.write()
(TCP/serial) and write_gatt_char (BLE) had no try/except — any write-time
exception escaped the send() coroutine unhandled. This commit makes all three
transports symmetric: (a) fire _disconnect_callback when transport is None/dead,
(b) wrap the write call in try/except and fire _disconnect_callback on failure.
Refs: Forensics report findings F04, NEW-A