Add per-`ctx_key` isolation tests for `maybe_open_context()`
Add `test_per_ctx_key_resource_lifecycle` to verify that per-key user tracking correctly tears down resources independently - exercises the fix from 02b2ef18 where a global `_Cache.users` counter caused stale cache hits when the same `acm_func` was called with different kwargs. Also, add a paired `acm_with_resource()` helper `@acm` that yields its `resource_id` for per-key testing in the above suite. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code Prompt-IO: ai/prompt-io/claude/20260406T172848Z_02b2ef1_prompt_io.mdmaybe_open_ctx_locking
parent
ebe9d5e4b5
commit
85f9c5df6f
|
|
@ -0,0 +1,54 @@
|
|||
---
|
||||
model: claude-opus-4-6
|
||||
service: claude
|
||||
session: (ad-hoc, not tracked via conf.toml)
|
||||
timestamp: 2026-04-06T17:28:48Z
|
||||
git_ref: 02b2ef1
|
||||
scope: tests
|
||||
substantive: true
|
||||
raw_file: 20260406T172848Z_02b2ef1_prompt_io.raw.md
|
||||
---
|
||||
|
||||
## Prompt
|
||||
|
||||
User asked to extend `tests/test_resource_cache.py` with a test
|
||||
that reproduces the edge case fixed in commit `02b2ef18` (per-key
|
||||
locking+user tracking in `maybe_open_context()`). The bug was
|
||||
originally triggered in piker's `brokerd.kraken` backend where the
|
||||
same `acm_func` was called with different kwargs, and the old
|
||||
global `_Cache.users` counter caused:
|
||||
|
||||
- teardown skipped for one `ctx_key` bc another key's users kept
|
||||
the global count > 0
|
||||
- re-entry hitting `assert not resources.get(ctx_key)` during the
|
||||
teardown window
|
||||
|
||||
User requested a test that would fail under the old code and pass
|
||||
with the fix.
|
||||
|
||||
## Response summary
|
||||
|
||||
Designed and implemented `test_per_ctx_key_resource_lifecycle`
|
||||
which verifies per-`ctx_key` resource isolation by:
|
||||
|
||||
1. Holding resource `'a'` open in a bg task
|
||||
2. Opening+closing resource `'b'` (same `acm_func`, different
|
||||
kwargs) while `'a'` is still alive
|
||||
3. Re-opening `'b'` and asserting cache MISS — proving `'b'` was
|
||||
torn down independently despite `'a'` keeping its own user
|
||||
count > 0
|
||||
|
||||
With the old global counter, phase 3 would produce a stale cache
|
||||
HIT (leaked resource) or crash on the assert.
|
||||
|
||||
Also added a trivial `acm_with_resource(resource_id)` ACM helper
|
||||
at module level.
|
||||
|
||||
## Files changed
|
||||
|
||||
- `tests/test_resource_cache.py` — add `acm_with_resource` ACM +
|
||||
`test_per_ctx_key_resource_lifecycle` test fn
|
||||
|
||||
## Human edits
|
||||
|
||||
None — committed as generated (pending user review).
|
||||
|
|
@ -0,0 +1,27 @@
|
|||
# AI Prompt I/O Log — claude
|
||||
|
||||
This directory tracks prompt inputs and model
|
||||
outputs for AI-assisted development using
|
||||
`claude` (Claude Code).
|
||||
|
||||
## Policy
|
||||
|
||||
Prompt logging follows the
|
||||
[NLNet generative AI policy][nlnet-ai].
|
||||
All substantive AI contributions are logged
|
||||
with:
|
||||
- Model name and version
|
||||
- Timestamps
|
||||
- The prompts that produced the output
|
||||
- Unedited model output (`.raw.md` files)
|
||||
|
||||
[nlnet-ai]: https://nlnet.nl/foundation/policies/generativeAI/
|
||||
|
||||
## Usage
|
||||
|
||||
Entries are created by the `/prompt-io` skill
|
||||
or automatically via `/commit-msg` integration.
|
||||
|
||||
Human contributors remain accountable for all
|
||||
code decisions. AI-generated content is never
|
||||
presented as human-authored work.
|
||||
|
|
@ -425,3 +425,115 @@ def test_lock_not_corrupted_on_fast_cancel(
|
|||
)
|
||||
|
||||
trio.run(main)
|
||||
|
||||
|
||||
@acm
|
||||
async def acm_with_resource(resource_id: str):
|
||||
'''
|
||||
Yield `resource_id` as the cached value.
|
||||
|
||||
Used to verify per-`ctx_key` isolation when the same
|
||||
`acm_func` is called with different kwargs.
|
||||
|
||||
'''
|
||||
yield resource_id
|
||||
|
||||
|
||||
def test_per_ctx_key_resource_lifecycle(
|
||||
debug_mode: bool,
|
||||
loglevel: str,
|
||||
):
|
||||
'''
|
||||
Verify that `maybe_open_context()` correctly isolates resource
|
||||
lifecycle **per `ctx_key`** when the same `acm_func` is called
|
||||
with different kwargs.
|
||||
|
||||
Previously `_Cache.users` was a single global `int` and
|
||||
`_Cache.locks` was keyed on `fid` (function ID), so calling
|
||||
the same `acm_func` with different kwargs (producing different
|
||||
`ctx_key`s) meant:
|
||||
|
||||
- teardown for one key was skipped bc the *other* key's users
|
||||
kept the global count > 0,
|
||||
- and re-entry could hit the old
|
||||
`assert not resources.get(ctx_key)` crash during the
|
||||
teardown window.
|
||||
|
||||
This was the root cause of a long-standing bug in piker's
|
||||
`brokerd.kraken` backend.
|
||||
|
||||
'''
|
||||
timeout: float = 6
|
||||
if debug_mode:
|
||||
timeout = 999
|
||||
|
||||
async def main():
|
||||
a_ready = trio.Event()
|
||||
a_exit = trio.Event()
|
||||
|
||||
async def hold_resource_a():
|
||||
'''
|
||||
Open resource 'a' and keep it alive until signalled.
|
||||
|
||||
'''
|
||||
async with maybe_open_context(
|
||||
acm_with_resource,
|
||||
kwargs={'resource_id': 'a'},
|
||||
) as (cache_hit, value):
|
||||
assert not cache_hit
|
||||
assert value == 'a'
|
||||
log.info("resource 'a' entered (holding)")
|
||||
a_ready.set()
|
||||
await a_exit.wait()
|
||||
log.info("resource 'a' exiting")
|
||||
|
||||
with trio.fail_after(timeout):
|
||||
async with (
|
||||
tractor.open_root_actor(
|
||||
debug_mode=debug_mode,
|
||||
loglevel=loglevel,
|
||||
),
|
||||
trio.open_nursery() as tn,
|
||||
):
|
||||
# Phase 1: bg task holds resource 'a' open.
|
||||
tn.start_soon(hold_resource_a)
|
||||
await a_ready.wait()
|
||||
|
||||
# Phase 2: open resource 'b' (different kwargs,
|
||||
# same acm_func) then exit it while 'a' is still
|
||||
# alive.
|
||||
async with maybe_open_context(
|
||||
acm_with_resource,
|
||||
kwargs={'resource_id': 'b'},
|
||||
) as (cache_hit, value):
|
||||
assert not cache_hit
|
||||
assert value == 'b'
|
||||
log.info("resource 'b' entered")
|
||||
|
||||
log.info("resource 'b' exited, waiting for teardown")
|
||||
await trio.lowlevel.checkpoint()
|
||||
|
||||
# Phase 3: re-open 'b'; must be a fresh cache MISS
|
||||
# proving 'b' was torn down independently of 'a'.
|
||||
#
|
||||
# With the old global `_Cache.users` counter this
|
||||
# would be a stale cache HIT (leaked resource) or
|
||||
# trigger `assert not resources.get(ctx_key)`.
|
||||
async with maybe_open_context(
|
||||
acm_with_resource,
|
||||
kwargs={'resource_id': 'b'},
|
||||
) as (cache_hit, value):
|
||||
assert not cache_hit, (
|
||||
"resource 'b' was NOT torn down despite "
|
||||
"having zero users! (global user count bug)"
|
||||
)
|
||||
assert value == 'b'
|
||||
log.info(
|
||||
"resource 'b' re-entered "
|
||||
"(cache miss, correct)"
|
||||
)
|
||||
|
||||
# Phase 4: let 'a' exit, clean shutdown.
|
||||
a_exit.set()
|
||||
|
||||
trio.run(main)
|
||||
|
|
|
|||
Loading…
Reference in New Issue