
Modules and imports test panel #94

Merged
tamnd merged 114 commits into main from feat/v0.13.5-spec-modules-imports on Jun 19, 2026

@tamnd tamnd commented Jun 14, 2026

Owner

Next slice of the spec 1700 vendored-test work: the Modules / imports panel. That's the 12 flat files plus the test_import/, test_importlib/ and test_module/ directory suites, driven to CPython 3.14.5 parity under the 1726 zero-skip bridge (we run what CPython runs and skip what it skips).

Baseline audit against CPython 3.14.5 (all of these are green on CPython):

| Test | gopy baseline |
| --- | --- |
| test_modulefinder | ModuleNotFoundError: modulefinder |
| test_pkg | dir() missing __cached__/__doc__/__loader__/__spec__ |
| test_pkgutil | os has no attribute altsep |
| test_pyclbr | ModuleNotFoundError: pyclbr |
| test_runpy | 1 ERROR (test_run_package_init_exceptions) |
| test_frozen | ModuleNotFoundError: hello |
| test_zipimport | os has no attribute altsep |
| test_zipimport_support | os has no attribute altsep |
| test_zipapp | ModuleNotFoundError: zipapp |
| test__interpchannels / test__interpreters | PEP 554, deferred |

Plan, smallest blast radius first:

  1. os.altsep + the module-object dir() surface (unblocks pkgutil, zipimport, zipimport_support, pkg)
  2. vendor the pure-Python stdlib modules: modulefinder, pyclbr, zipapp
  3. frozen modules (hello + the frozen table) for test_frozen
  4. the runpy package-init exception residual
  5. re-audit the three directory suites
  6. PEP 554 interpreters, matched to CPython's skip/run behaviour

Spec: website/docs/specs/1700/1731. Opening as a draft; will fill in as each phase lands and keep CI green.

tamnd added 21 commits June 14, 2026 22:13
Audit the 12 flat files plus test_import/, test_importlib/, test_module/
against CPython 3.14.5 under the 1726 zero-skip bridge, and lay out a
phased plan: os.altsep + module dir() surface first, then the pure-Python
stdlib modules (modulefinder, pyclbr, zipapp), frozen modules, the runpy
residual, the directory suites, and finally the PEP 554 interpreters.

The os module published sep/extsep/pathsep but not altsep, so any code
doing os.altsep raised AttributeError. test_pkgutil, test_zipimport and
test_zipimport_support all reach for it through ntpath/posixpath. Add it
to the module constants, matching CPython (None on POSIX, '/' on nt).
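A quick way to see the values being matched is to read altsep off the two path flavours directly, since both modules import on any platform. This is only a sketch checked against CPython's documented behaviour, not the gopy test itself.

```python
import ntpath
import os
import posixpath

# The platform-specific path modules carry the constants that os re-exports.
posix_altsep = posixpath.altsep   # None: POSIX has a single separator
nt_altsep = ntpath.altsep         # '/': Windows accepts it alongside '\\'

# os.altsep mirrors whichever flavour os.path is bound to.
expected = nt_altsep if os.name == "nt" else posix_altsep
print(repr(os.altsep), os.altsep == expected)
```

Code that splits on both separators typically guards with `if os.altsep:`, which is why a missing attribute (rather than a None value) breaks it.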
All three are pure-Python Lib modules the import panel reaches for, and
all three import cleanly under gopy. test_modulefinder, test_pyclbr and
test_zipapp now get past the ModuleNotFoundError and surface the real
gaps (importlib.machinery.PathFinder, test_importlib package, io
__class__) tracked as follow-ups.

A few gaps that surfaced once test_zipapp could import:

- Python subclasses of the io base types are *Instance objects, so their
  own methods and instance dict have to win over the synthesized native
  methods. The custom getattr/setattr now route those instances through
  the generic path, matching PyObject_GenericGetAttr's MRO walk. This is
  what let _ZipWriteFile override close() and carry _zinfo.
- BytesIO and StringIO keep object's identity hash (they define no
  __eq__), so they're hashable again.
- SystemExit grows its code member, derived from the constructor args
  like SystemExit_init and overridable by assignment.
- os.chmod accepts str, bytes, or any os.PathLike via __fspath__.

test_zipapp now matches CPython 3.14 (35 passing).
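For reference, the CPython behaviour behind three of these gaps can be checked in a few lines. This is a minimal sketch; the file name is made up for the illustration and nothing here is taken from the gopy test suite.

```python
import io
import os
import pathlib
import tempfile

# BytesIO/StringIO define no __eq__, so they keep object's identity hash.
buf = io.BytesIO()
buffers = {buf: "hashable"}

# SystemExit.code is derived from the constructor args and can be reassigned.
codes = (SystemExit().code, SystemExit(3).code, SystemExit(1, 2).code)
exc = SystemExit(3)
exc.code = 7

# os.chmod accepts any os.PathLike, here a pathlib.Path.
with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "script.pyz"   # hypothetical file name
    target.write_bytes(b"")
    os.chmod(target, 0o644)
    chmod_ok = target.exists()

print(codes, exc.code, chmod_ok)
```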
test.test_importlib.util guards itself with import_module("_testmultiphase")
at import time, so every test that pulls in that helper (test_pkgutil,
test_pyclbr, the test_importlib extension suites) was raising SkipTest
under gopy where CPython runs them. Reproduce the PEP 489 extension's main
module Go-side: foo, call_state_registration_func, the Example/error/Str
types, and the int_const/str_const constants the C execfunc installs.

Vendor test_importlib into stdlib/test so test.test_importlib.util resolves
as a support module, and re-export importlib.__import__ to match CPython's
public surface (util.py reaches for source_importlib.__import__).

test_pkgutil and test_pyclbr now run their suites instead of skipping; the
remaining failures need the importlib PathFinder/importer surface, which is
the next batch.

pyclbr.readmodule_ex calls importlib.util._find_spec_from_path to locate
a module's source without importing it. Port it on top of the existing
directory-scan helper (factored out of find_spec): check sys.modules
first, returning the cached __spec__ or raising when it is missing/None,
otherwise scan the supplied path. test_pyclbr gets past the import and
now only trips on modules that lack __spec__, which is the next gap.
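The sys.modules-first rule is easiest to see through the public importlib.util.find_spec, which applies the same check in CPython. The sketch below uses that public function as a stand-in for the private helper, and the module name is made up for the demo.

```python
import importlib.util
import sys
import types

name = "gopy_demo_cached"          # hypothetical module name for this sketch
mod = types.ModuleType(name)       # a fresh module has __spec__ = None
sys.modules[name] = mod
try:
    try:
        importlib.util.find_spec(name)
        outcome = "no error"
    except ValueError as err:
        # A cached module with a missing/None __spec__ is an error, not a miss.
        outcome = f"ValueError: {err}"

    # Once a spec is attached, the cached one is returned without any scan.
    mod.__spec__ = importlib.util.spec_from_loader(name, loader=None)
    cached = importlib.util.find_spec(name)
finally:
    del sys.modules[name]

print(outcome)
print(cached is mod.__spec__)
```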
gopy's import runs Go-side, so modules loaded through PathFinder, the
inittab, and the script-as-main entry never picked up the ModuleSpec
surface CPython's _init_module_attrs fills in. Tools that introspect a
module by name (pyclbr._readmodule, runpy, inspect) read __spec__ and
broke on the missing/None attribute.

Build the spec by calling importlib.util.spec_from_file_location (file
modules) or spec_from_loader (built-ins) once the body has run, mirroring
what FileFinder/BuiltinImporter produce. Modules imported before
importlib.util itself is importable are queued and flushed the moment it
becomes available. The vendored test-as-main module gets a file-location
spec so test.<name> matches what regrtest's import produces; plain
__main__ keeps __spec__ = None like python script.py.

Also force submodule imports for __all__ entries in 'from pkg import *',
and vendor the sre_parse/sre_constants/sre_compile deprecation shims plus
the pyclbr_input fixture.

The machinery.ModuleSpec and util._ModuleSpec stubs diverged from
CPython: no parent property, no has_location/cached descriptors, no
__repr__/__eq__. runpy._run_code reads mod_spec.parent and tools compare
specs, so the stubs broke. Replace both with a faithful port of
importlib._bootstrap.ModuleSpec (gopy keeps it in machinery since the
bootstrap is Go-side) and have importlib.util import it.

spec_from_file_location now follows _bootstrap_external: abspath the
location, set _set_fileattr, derive submodule_search_locations from the
loader. Add _get_cached to _bootstrap_external for the cached property.
The abspath call tolerates posixpath still being mid-import during the
bootstrap spec flush.
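The pieces the stubs were missing are all observable on a bare ModuleSpec. A short sketch against CPython's importlib.machinery, with placeholder module names:

```python
from importlib.machinery import ModuleSpec

# A plain submodule: parent is everything before the last dot.
sub = ModuleSpec("pkg.mod", loader=None)
# A package: parent is the package's own name.
pkg = ModuleSpec("pkg", loader=None, is_package=True)

print(sub.parent, pkg.parent)                      # pkg pkg
print(sub.has_location, sub.cached)                # False None
print(sub == ModuleSpec("pkg.mod", loader=None))   # True
print(repr(sub))
```

runpy reads `parent` to set `__package__` for the code it runs, so a stub without the property fails before the target module's body executes.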
Drive test_runpy to parity with CPython 3.14:

- sys.dont_write_bytecode/path_hooks/path_importer_cache top-level attrs
- ModuleNotFoundError carries name= through the import miss path so
  runpy._get_module_details can keep searching dotted names
- subprocess cwd accepts path-like (PyUnicode_FSConverter parity)
- propagate GOPY_STDLIB to child interpreters so subprocess.run with a
  changed cwd still bootstraps encodings
- exit via SIG_DFL SIGINT on unhandled KeyboardInterrupt (bpo-1054041)
- PEP 420 namespace packages on the Go and Python find paths
- _testinternalcapi.get_recursion_depth
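The name= item is the one runpy depends on most directly. A minimal sketch of what the import miss path reports in CPython, using a made-up dotted name:

```python
import importlib

missing = "gopy_demo_pkg.does_not_exist"   # hypothetical dotted name
try:
    importlib.import_module(missing)
except ModuleNotFoundError as err:
    # The miss records which module could not be found; here that is the
    # first missing component, not the full dotted name.
    missing_name = err.name

print(missing_name)   # gopy_demo_pkg
```

runpy compares that name against the package it was asked for to decide whether the miss is its own lookup failing or an unrelated import error raised from inside the package.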
Split exitSigint into unix/windows build-tagged files so the windows
runner stops failing on syscall.Kill. Drop the nilerr lint hit in the
ImportError member getter by treating a dict miss as None rather than an
error. Accept os.PathLike argv members in _posixsubprocess.fork_exec the
way fsconvert_strdup does, so subprocess calls passing pathlib.Path args
no longer raise TypeError.
…tartup

gopy reported sys.flags.no_site as 0, claiming the site module had run,
while exit/quit/help/copyright/credits/license were missing. Vendor
site.py and _sitebuiltins.py unchanged from CPython 3.14 and import site
during bootstrap (after encodings) the way init_import_site does, so
site.main() runs setquit/setcopyright/sethelper and the builtins land.

Also fall back to GenericGetAttr in _io.File getattr so dunders like
__class__ resolve through the MRO; abc.__instancecheck__ probes sys.stdout
and was raising AttributeError on the missing __class__.
…d a stale slot

A value-replacement on an at-capacity dict triggers dictResize before the
replace-vs-insert branch in dictInsert. The resize rebuilds the table and
renumbers every slot, but the replace path returns without touching the keys
version, so LOAD_ATTR_INSTANCE_VALUE kept reading the cached (now wrong) slot
index. CPython hands a resized dict a fresh keys object whose dk_version is 0,
which drops every stamped inline cache; mirror that by resetting the version
inside dictResize.

Also route sys.excepthook through the live sys.stderr and format the full
traceback via errors.FormatException, the way _PyErr_Display does, so a test
that mocks sys.stderr captures the output.

ModuleType.__repr__ now forwards to importlib._bootstrap._module_repr the
same way CPython's module_repr goes through _PyImport_ImportlibModuleRepr,
so the __spec__/__loader__/__file__ variants (namespace packages, the '?'
name fallback, bare/full loader reprs) all render identically. Wired the
_bootstrap_external module global that _install_external_importers would
normally set, vendored NamespaceLoader/_NamespacePath, and re-exported
NamespaceLoader from importlib.machinery.

Modules are now GC-tracked with a tp_traverse over md_dict. A module whose
__dict__ holds functions closing over that same dict is a reference cycle;
without the traverse edge the collector treated md_dict as rooted and never
ran __del__ on cyclic objects defined in the module body.

test_module: 39 tests, all green.

zipimport plugs into sys.path_hooks and leans on _bootstrap_external
for the loader machinery. CPython freezes _bootstrap/_bootstrap_external
and runs _setup()/_install_external_importers() at startup to inject sys,
_imp and cross-link the two modules; gopy imports them like ordinary
modules and never runs that startup, so the bindings have to happen at
import time. Bind sys/_imp into _bootstrap, point _bootstrap_external
back at _bootstrap, and have _bootstrap_external register itself as
_bootstrap._bootstrap_external at the end of its module body.

Also fill in the _bootstrap_external pieces zipimport reaches for:
_path_stat, _LoaderBasics, _compile_bytecode, SourcelessFileLoader,
spec_from_file_location, _get_supported_file_loaders and _fix_up_module.
… path hooks

Two fixes that unblock importing modules out of a zip archive on sys.path.

zlib.Compress.flush() defaulted to Z_SYNC_FLUSH, but CPython defaults the
mode to Z_FINISH. The common compressobj().compress(x) + flush() idiom has
to emit a complete deflate stream (final block) or a one-shot decompressor
reads back a truncated stream and raises 'unexpected EOF'. zipfile stores
compressed members through that idiom and zipimport inflates them with raw
deflate, so every compressed-zip import was failing.
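The difference between the two flush modes shows up with nothing but the zlib module. This sketch uses raw deflate (wbits=-15), the same framing zip members use; the payload is arbitrary.

```python
import zlib

payload = b"import panel " * 64

# compress() + flush() with the default mode (Z_FINISH) ends the stream.
comp = zlib.compressobj(wbits=-15)
finished = comp.compress(payload) + comp.flush()
round_trip = zlib.decompress(finished, wbits=-15)

# The same idiom with Z_SYNC_FLUSH leaves the stream without a final block.
comp = zlib.compressobj(wbits=-15)
unfinished = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)
try:
    zlib.decompress(unfinished, wbits=-15)
    sync_result = "ok"
except zlib.error as err:
    sync_result = f"zlib.error: {err}"

print(round_trip == payload)
print(sync_result)
```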

Add sys.meta_path so import_helper's save/restore around each test stops
raising AttributeError, register zipimport.zipimporter on sys.path_hooks
ahead of the FileFinder hook, and have the Go path finder consult
sys.path_hooks for non-directory sys.path entries: it builds the importer,
asks it for the spec, and loads the module via module_from_spec +
exec_module, mirroring _bootstrap._load_unlocked.

PEP 420 namespace packages were dropping portions found in zip path-hook
importers, so a package split across two archives ended up with a
__path__ of length 1 instead of the merged two entries. PathFinder now
accumulates namespace portions from path-hook specs the same way it does
for plain directories.

A real-filesystem namespace portion that only holds .pyc files (no .py)
was also invisible to the directory scan, so submodules under it raised
ModuleNotFoundError. Added __init__.pyc and <tail>.pyc handling that
loads the marshalled code directly.

importlib.util.module_from_spec was a divergent stub that only set
__file__ when origin was not None; namespace specs left it unset and
mod.__file__ raised AttributeError. Re-export _bootstrap.module_from_spec
so namespace modules get __file__ = None like CPython.

zlib.crc32/adler32 now accept bytearray.

Drops two test_zipimport scratch artifacts that were committed by mistake.
…Error.msg

gopy compiles every extension module into the binary, so they behave
exactly like statically-linked builtins: they are found before the path
finder and cannot be shadowed by a module of the same name on sys.path.
sys.builtin_module_names only listed builtins and sys, which left the
importlib builtin/extension finder tests with no usable module name and
made test_zipimport.testAFakeZlib run (and fail) instead of skipping the
way it does on a statically-linked CPython build. Build the tuple from
the inittab snapshot, minus the few pure-Python modules gopy keeps there
as an import shortcut so 'os' in sys.builtin_module_names stays False.

ImportError now exposes the msg member CPython sets from the single
positional argument, so exc.msg (read by zipimport's bad-magic test and
others) works.
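Both behaviours are easy to confirm on CPython. A small sketch; the message, name and path values are placeholders.

```python
import sys

# msg comes from a single positional argument; name/path are keyword-only.
one_arg = ImportError("bad magic number", name="demo", path="demo.zip")
two_args = ImportError("first", "second")
print(one_arg.msg, one_arg.name, one_arg.path)
print(two_args.msg)     # None: msg is only set for exactly one positional arg

# Compiled-in modules are listed; pure-Python ones such as os are not.
in_builtin_names = ("sys" in sys.builtin_module_names,
                    "os" in sys.builtin_module_names)
print(in_builtin_names)   # (True, False)
```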
func_getattro pulled the attribute straight out of the function __dict__
and returned it without an incref, so the caller's arg-drop could decref a
value the dict still held. A list stored on a function (mock wraps its
patchings list on the decorated function this way) got emptied by
list_dealloc after the first read, so a second read saw an empty list and
the shared decorator silently stopped patching across test classes.

Matches Py_XINCREF in Objects/funcobject.c func_getattro.

tamnd commented Jun 14, 2026

Owner Author

test_zipimport is fully green now (91 tests, 4 skipped to match CPython).

Two things were behind the last failures:

  • The two testTraceback errors were just a missing _testcapi.config_get. Ported it (plus config_getint/config_names) over a PyConfig_Get spec table.
  • test_checked_hash_based_change_pyc was the interesting one. It only failed in the cross-class run, and it turned out not to be a zipimport bug at all. func_getattro was reading attributes straight out of a function's __dict__ and returning them without an incref. mock.patch as a decorator stashes its patchings list on the wrapped function, so after the first test consumed it the list got emptied by list_dealloc, and the second class that shared the inherited decorated method silently stopped patching. Added the Py_XINCREF that CPython does. Minimal repro was just f.lst = [1]; print(f.lst); print(f.lst) printing [1] then [].
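Spelled out as a standalone script, the repro looks like this. On a correct implementation both reads return the same populated list.

```python
def f():
    pass

f.lst = [1]
first = f.lst      # the first read must not consume the stored list
second = f.lst     # a second read has to see the same, still-populated list
print(first, second, first is second)   # [1] [1] True
```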

That incref fix is broad, not zipimport-specific, so worth a look.

Where the rest of the panel stands: test_module/ is green (39 tests). test_import/, test_importlib/, test_modulefinder, and test_runpy all bottom out on the same thing: our import dispatch runs Go-side and sys.meta_path is empty, so the Python PathFinder/FrozenImporter/BuiltinImporter finders aren't live and importlib.machinery doesn't re-export them. Making those the real dispatch path (and porting the _imp C functions the full bootstrap needs) is its own subsystem-sized piece. Wrote it up as P7 in the spec.

tamnd added 8 commits June 15, 2026 02:38
gopy shipped a trimmed importlib (stub machinery.py, a util.py that
imported source_hash from _bootstrap_external instead of defining it,
a _bootstrap.py that injected sys/_imp at module top). sys.meta_path
was empty and the Python finders were dead code, so anything that
introspected the import system or walked meta_path failed.

Vendor the unmodified CPython 3.14 files (__init__, _bootstrap,
_bootstrap_external, util, machinery, _abc, abc) plus the metadata,
resources, readers and simple submodules, then run the two-phase
install at startup the way pylifecycle does: __init__ self-bootstraps
through its except-ImportError branch (we have no frozen
_frozen_importlib), then we call _bootstrap._install /
_bootstrap_external._install directly so meta_path ends up as
[BuiltinImporter, FrozenImporter, PathFinder] and the FileFinder /
zipimport path hooks are registered.

Port the _imp C-function surface the full bootstrap drives
(find_frozen, get_frozen_object, is_frozen_package, create_builtin,
exec_builtin, extension_suffixes, _fix_co_filename) and add
sys.pycache_prefix so cache_from_source works.

test_runpy goes green (40), test_pkg green, test_pkgutil down to a
couple of residuals.
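A rough way to check the installed state from Python is to look for the three default finders and their relative order. This sketch only assumes the defaults are present; test runners and packaging tools may add finders of their own around them.

```python
import sys
from importlib.machinery import BuiltinImporter, FrozenImporter, PathFinder

# The three default finders, in the order the bootstrap installs them.
defaults = [BuiltinImporter, FrozenImporter, PathFinder]
positions = [sys.meta_path.index(finder) for finder in defaults]
print([getattr(f, "__name__", type(f).__name__) for f in sys.meta_path])

# path_hooks holds the callables that build a finder for each sys.path entry.
has_path_hooks = len(sys.path_hooks) > 0
print(positions, has_path_hooks)
```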
…mpare equal

ImportModuleLevel now walks sys.meta_path for finders a program installs,
skipping the BuiltinImporter/FrozenImporter/PathFinder entries gopy realizes
in Go and driving any spec a custom finder returns through loadFromSpec. This
lets test-installed importers (pkgutil's MyTestImporter) satisfy 'import foo'.
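A test-installed importer of that kind can be as small as the sketch below. DemoFinder and the module name are made up for the illustration; it is not pkgutil's MyTestImporter.

```python
import importlib.abc
import importlib.util
import sys

class DemoFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder/loader that serves a single in-memory module."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname != "gopy_demo_foo":
            return None                      # let the next finder try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None                          # use the default module object

    def exec_module(self, module):
        module.answer = 42

finder = DemoFinder()
sys.meta_path.insert(0, finder)
try:
    import gopy_demo_foo                     # satisfied by DemoFinder
    answer = gopy_demo_foo.answer
finally:
    sys.meta_path.remove(finder)
    sys.modules.pop("gopy_demo_foo", None)

print(answer)   # 42
```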

classmethod_get now stamps methOrigin so two bindings of int.from_bytes compare
equal and hash alike, matching meth_richcompare's m_ml pointer test.

Port _PyModule_IsPossiblyShadowing to read the startup-captured leading
sys.path entry (config->sys_path_0) instead of live sys.path[0], so a
script that mutates sys.path after startup keeps consistent shadowing
detection. The leading entry is now prepended to sys.path after site.main
runs, matching CPython, so a -c run keeps sys.path[0] == '' rather than
letting site.removeduppaths absolutize it.

Set spec._initializing around module exec so a self-importing module hits
the circular-import and 'consider renaming' hints. Pass the live __name__
object through to PySet_Contains so an unhashable str subclass raises, and
guard stdlib_module_names with PyAnySet_Check. Module getattro now formats
with %U-style literal quotes; os.__getattr__ miss uses single quotes.

A backward jump computed off an instrumented bytecode position read the
live byte (INSTRUMENTED_LINE or an INSTRUMENTED_<X> variant) and looked
its cache width up in the per-opcode table, which is keyed by base
opcode only. The marker returned a zero cache count, so the jump landed
one codeunit short of its target. Under sys.settrace this dropped the
loop header by one instruction in inlined comprehensions, leaving the
freshly built list on top of the stack instead of the iterator and
raising 'list object is not an iterator' on the next FOR_ITER.

advance() now resolves the marker (via the line original-opcode table)
and de-instruments before reading the cache stride.

CPython binds the builtins module object (not its dict) to __builtins__
in the __main__ namespace; every other module gets the dict. The frame
builder already unwraps a module back to its dict for LOAD_GLOBAL, so
the only behavioural change is that 'del __builtins__.__import__' now
reaches a module attribute and the import machinery raises ImportError
afterwards, matching test_import.test_delete_builtins_import.
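The module-versus-dict split is a CPython implementation detail, but it is easy to observe. The sketch below asks a fresh interpreter for the __main__ case so the result does not depend on how the snippet itself is run.

```python
import os
import subprocess
import sys

# In an imported module, __builtins__ is the builtins module's dict.
imported_kind = type(vars(os)["__builtins__"]).__name__

# In __main__, it is the module object itself; ask a fresh interpreter.
main_kind = subprocess.run(
    [sys.executable, "-c", "print(type(__builtins__).__name__)"],
    capture_output=True, text=True, check=True,
).stdout.strip()

print(imported_kind, main_kind)   # dict module
```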
The integer-fd and path-open FileIO constructors wrap an os.File whose
descriptor gopy already owns through FileIO.Close and the closefd flag.
Go's runtime also arms its own finalizer on those os.File values, and a
GC mid-run could fire it after the descriptor number had been freed and
reused by an unrelated open file, closing that file's fd out from under
it. Long write loops then failed with a spurious EBADF.

Clear the Go finalizer at every borrowed-fd wrap site, and also on
os.isatty's throwaway wrapper, so release stays deterministic.
…ystems

On macOS the filesystem is case-insensitive but case-preserving, so a
plain os.Stat probe lets `import RAnDoM` resolve random.py. CPython's
FileFinder guards against this by testing the candidate name against the
exact-case set(os.listdir(dir)) unless _relax_case() allows folding.

Port that check: confirm each resolved candidate's final component matches
a real directory entry with exact case, relaxed only on case-insensitive
platforms when PYTHONCASEOK is set.
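The check itself comes down to comparing against real directory entries instead of trusting a stat call. A minimal sketch of the idea, where exact_case_exists is a hypothetical helper and relax_case stands in for _relax_case() returning True:

```python
import os
import tempfile

def exact_case_exists(directory, filename, relax_case=False):
    """Hypothetical helper: True only if filename matches a real entry."""
    entries = set(os.listdir(directory))
    if relax_case:
        # Case folding allowed: compare lowercased names on both sides.
        return filename.lower() in {entry.lower() for entry in entries}
    return filename in entries

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "random_demo.py"), "w").close()
    exact = exact_case_exists(tmp, "random_demo.py")
    wrong_case = exact_case_exists(tmp, "RAnDoM_demo.py")
    folded = exact_case_exists(tmp, "RAnDoM_demo.py", relax_case=True)

print(exact, wrong_case, folded)   # True False True
```

Because os.listdir returns names as stored on disk, the comparison gives the same answer on case-sensitive and case-insensitive filesystems.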
tamnd added 6 commits June 18, 2026 03:36
The CI lint gate flagged three issues on this branch: a misspelled
'behaviour' in errors/builtins.go, the always-true bool result on
runSinglephase/reloadSinglephase, and fromDefAndSpec creeping over the
cognitive-complexity ceiling.

The two singlephase helpers always returned found=true once we had a
def, so the bool carried no information. Drop it and let CreateExtModule
supply the constant true at the call sites. Extract the slot-table scan
in fromDefAndSpec into scanExtSlots, which both reads cleaner and pulls
the function back under the gocognit limit.
…un-mode note

test_all_locks now passes: the module-lock drain landed when the
collector grew a tp_clear slot driven from delete_garbage. Drop it from
the residuals and note the namespace_pkgs standalone-vs-package run-mode
artifact, which CPython 3.14 reproduces identically.

PyImport_ImportModuleLevelObject checks sys.modules before ever calling
_find_and_load. When the cached module is still initializing (another
thread is mid-import on it), the C body waits on the per-module lock via
_bootstrap._lock_unlock_module, which catches the _DeadlockError a
concurrent circular import raises. _find_and_load's own _ModuleLockManager
lets that error propagate and kills the importing thread, so going
straight there for an in-flight circular import is the difference between
both threads finishing and one dying.

Route IMPORT_NAME through importModuleLevelObject, which now takes that
fast path only for the cached-and-initializing case and delegates every
other import to the live __import__ (its _find_and_load already returns a
fully loaded module without locking, so the common path is unchanged).
The cache hit borrows from sys.modules where import_get_module returns a
new reference, so incref before handing the module back or IMPORT_NAME's
DECREF_INPUTS under-counts it.

Fixes test_threaded_import.test_circular_imports.

The circular-import cache fast path and the gh-134100 dotted-head KeyError
pulled the shared importModuleLevelObject in opposite directions: the VM
IMPORT_NAME opcode needs the refcount-proven delegateImport route (it applies
DECREF_INPUTS to the module it pushes), while the builtin __import__ needs the
C-faithful _gcd_import + headSelection that raises KeyError when a non-module
sits in sys.modules.

Give IMPORT_NAME its own importViaDelegate: it runs the same
import_ensure_initialized still-initializing fast path (so concurrent circular
imports resolve instead of dying on an uncaught _DeadlockError) but otherwise
delegates the load to _frozen_importlib.__import__. The accepted fast-path
branch still runs the fromlist / dotted-head selection so a bare
'from . import sub' during a package's own init force-imports the submodule.
importModuleLevelObject keeps the full C body for the builtin __import__.
Both the threaded circular-import failure and the incomplete multi-phase init
error are resolved; the panel runs 1346 tests with 0 failures and 0 errors.

A cached module without a usable __spec__ cannot be mid-import, so
acceptInitializingModule treats the lookup failure as not-initializing and
falls back to the normal import path. Annotate the deliberate error swallow so
the nilerr linter passes.

tamnd commented Jun 17, 2026

Owner Author

The two remaining test_importlib residuals are closed, so the panel now runs 1346 tests with no failures and no errors.

The threaded circular-import case was the interesting one. The fix is splitting the two import entry points that had been sharing a single function:

  • The VM IMPORT_NAME opcode keeps the long-standing _frozen_importlib.__import__ delegate route. It applies DECREF_INPUTS to the module it pushes, and that route returns a reference whose count the decref is balanced against. What it gains is the import_ensure_initialized still-initializing fast path out front: when another thread is mid-import on the target, it waits via _bootstrap._lock_unlock_module (which catches _DeadlockError) instead of re-entering _find_and_load's _ModuleLockManager, which lets the error propagate and kills the importing thread. That's the difference between both threads finishing a circular import and one dying.

  • The builtin __import__ keeps the full PyImport_ImportModuleLevelObject body: _gcd_import plus the C dotted-head selection that re-reads sys.modules and raises KeyError when a non-module has been stuffed in there (the gh-134100 guard). Routing that through the Python mirror would surface an AttributeError instead.

I tried to keep these as one function and it kept pulling in opposite directions: the opcode needs the delegate route for refcount reasons, the builtin needs the C-faithful head selection for the KeyError. Splitting them is cleaner than trying to make one body satisfy both.

Verified with a per-module diff of the whole test_importlib tree against the pre-change binary: the only delta is test_threaded_import going from one failure to zero. Everything else is byte-identical, including the known GC-off and PathFinder run-mode artifacts. CI is green.

tamnd added 3 commits June 18, 2026 14:47
The collector could never reclaim a class kept alive only by its own
instances, or a module-body cycle (a function whose __globals__ is the
module dict). Two things were missing: the outgoing edges those objects
own were not reported through tp_traverse, and the transient class
namespace copy type_new leaves behind was never released.

- Give module/type/instance and the map/filter/zip iterators real
  traverse + clear + dealloc slots so subtract_refs can zero their
  gc_refs and move_unreachable can re-float the reachable ones. The
  iterators now also incref the function/iterators they hold (they were
  storing them raw), which is what their dealloc decrefs against.
- Pin sys.modules as a static root. CPython keeps it alive through
  interp->modules; gopy holds it via a Go pointer the refcount collector
  can't see, so without re-floating it the whole module graph collapsed
  to gc_refs == 0 and a live module global got reclaimed.
- Release the PyDict_Copy namespace in type_new. gopy installs each entry
  into the type via SetTypeDescr (which re-increfs), so the copy is
  transient; leaving it at refcnt 1 rooted the {namespace, class,
  instance} cycle and test_module test_clear_dict_in_ref_cycle never saw
  its instances finalized.
- Bracket synchronous __del__ with save/restore of the raised exception
  so a finalizer firing mid-unwind can't clobber the in-flight exception.
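The first case (a class kept alive only through its own instance) can be written as a small check. This is a sketch of the expected CPython behaviour, not one of the module/gc unit tests; the class and function names are made up.

```python
import gc
import weakref

class Finalized:
    count = 0

def build_cycle():
    class Node:
        def __del__(self):
            Finalized.count += 1
    node = Node()
    node.me = node              # instance cycle that also pins its class
    return weakref.ref(Node)

class_ref = build_cycle()
gc.collect()                    # the cycle collector has to find both objects
print(Finalized.count, class_ref() is None)   # 1 True
```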

Fixes the zip self-iterator, which returned itself without increfing;
once zip grew a dealloc, `for x in zip(...)` over a temporary dropped the
zip to refcount 0 during GET_ITER and yielded nothing. That had been
silently breaking dataclass __init__ generation (it zips names with
generated functions), and so every dataclass-using import.
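The failing shape was nothing more exotic than a loop over a temporary zip. A sketch of the two properties involved, with arbitrary input lists:

```python
names = ["x", "y"]
makers = [int, str]

pairs = zip(names, makers)
same_object = iter(pairs) is pairs      # zip is its own iterator

# Iterating a temporary zip: the loop holds the only reference to it.
collected = []
for name, maker in zip(names, makers):
    collected.append((name, maker.__name__))

print(same_object, collected)   # True [('x', 'int'), ('y', 'str')]
```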
…ules

add_main_module gives __main__ the BuiltinImporter as its initial
__loader__ when nothing else set one, so test_importlib's
test_everyone_has___loader__ finds the attribute. Plus the sys.modules
wiring the import-path split needs.

The new-style class machinery copies the class namespace before
draining it into the type's descriptor table, then drops the copy.
CPython frees that copy the instant its refcount hits zero, which
also unlinks it from the GC list, so no collection ever sees it.

gopy's Decref leaves a refcount-zero container tracked on purpose:
the next cycle pass still needs to walk it to fire any weakref
callbacks. For this copy that is the wrong call. It owns the only
extra references to the class methods, whose __globals__ pin the
defining module dict, so leaving them live keeps the whole
namespace/class/instance graph rooted and a user __del__ never
runs. And once the collector does walk it, it counts as a reclaimed
cycle member that CPython would never have reported.

Clear and untrack the copy on disposal, matching dict_dealloc, so
the captured method references drop immediately and the transient
stays out of later collections. Fixes the module/gc __del__ and
resurrection counts and keeps test_module's clear_dict_in_ref_cycle
green.

tamnd commented Jun 18, 2026

Owner Author

The module/gc unit tests were going red on the last couple of runs (TestUserDelFiresDuringCycleCollect, TestUserDelResurrectionSurvivesCycleCollect expecting 4 reclaimed, getting 5). Tracked it down to the namespace copy that type_new builds.

processClassNamespace copies the class body into a fresh dict, drains the entries onto the type, then drops the copy. Earlier in this branch I'd switched that drop to a plain Decref. The catch is gopy deliberately leaves refcount-zero non-finalizable containers tracked so a later cycle pass can still walk them for weakref callbacks. So the dropped copy sat at refcount 0 but stayed on the GC list, and the next collection counted it as an extra cycle member.

CPython doesn't have that window. dict_dealloc untracks and decrefs every key/value synchronously the moment the dict hits zero. So I gave the copy the same treatment with dropTransientDict: decref, and if that takes it to zero, clear its contents (releasing the captured class methods, whose __globals__ would otherwise pin the defining module dict) and untrack it right there. That keeps test_clear_dict_in_ref_cycle green too, which the first untrack-only attempt broke because the method refs stayed live.

CI is green again across macOS, Ubuntu and Windows. The import panel suites (test_import, test_importlib, test_module) all pass when run in their natural stdlib location.

tamnd added 8 commits June 18, 2026 15:27
… tables

addTools/removeTools used to bail when a slot's live byte was already
INSTRUMENTED_LINE or INSTRUMENTED_INSTRUCTION, so adding or removing a
static event (e.g. the global PY_RETURN that the legacy trace bridge
installs) never reached the real opcode parked in the line / per-instruction
side table. When opcode tracing was later toggled off the slot restored to
the plain opcode and the return event was lost, which is what broke pdb
single-stepping under doctest.

Port instrument() and de_instrument() faithfully: walk the live byte through
the INSTRUMENTED_LINE (lines table) and INSTRUMENTED_INSTRUCTION
(per-instruction table) layers to the location CPython tracks as opcode_ptr,
then write the (de)instrumented opcode back to that resolved slot instead of
the visible marker. addTools/removeTools now just maintain the tools mask and
delegate.

CPython: Python/instrumentation.c:757 instrument, :676 de_instrument

Wire the PEP 669 INSTRUCTION event into the sys.settrace path so bdb/pdb can
single-step at the bytecode level. The local INSTRUCTION event toggles with
frame.f_trace_opcodes; when a debugger sets it mid-frame we re-instrument the
live call chain immediately rather than waiting for the next monitoring
version bump.

dispatch resolves a slot hidden behind INSTRUMENTED_INSTRUCTION/INSTRUMENTED_LINE
to its base opcode before the adaptive ladder runs, so specialization never
clobbers a marker, and the trace trampoline calls the local f_trace for
line/return/exception/opcode events and the global handler for call.

CPython: Python/instrumentation.c:1401 _Py_call_instrumentation_instruction,
Python/ceval.c _PyEval_SetOpcodeTrace
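From the Python side, the toggle a debugger uses is just an attribute on the frame. The sketch below requests opcode events for a single function; which events arrive after the first 'call', and how many, depends on the interpreter version, so it only prints the set it observed.

```python
import sys

events = []

def tracer(frame, event, arg):
    if frame.f_code.co_name != "target":
        return None                 # do not trace unrelated frames
    frame.f_trace_opcodes = True    # request per-instruction 'opcode' events
    events.append(event)
    return tracer                   # keep local tracing on for this frame

def target():
    value = 1
    return value + 1

previous = sys.gettrace()           # restore whatever tracer was active
sys.settrace(tracer)
try:
    result = target()
finally:
    sys.settrace(previous)

print(result, sorted(set(events)))
```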
…ed descriptor wrappers

frame gains the SetOpcodeTraceHook indirection so bdb/pdb setting
frame.f_trace_opcodes re-instruments the live chain without objects/ taking a
dependency on vm. list_item / list_ass_item now raise the CPython-exact
'list index out of range' and 'list assignment index out of range' text.
AddDescriptorSlotWrappers is exported so types defined in other packages can
surface __get__/__set__/__delete__ the same way add_operators does.

CPython: Python/ceval.c _PyEval_SetOpcodeTrace, Objects/listobject.c:469
list_item, Objects/typeobject.c add_operators
…nstruction

partial and lru_cache_wrapper now install the tp_descr_get wrapper so
inspect.ismethoddescriptor recognises them, matching add_operators for the C
types. _elementtree.ParseError is built through NewExcType and adopts
SyntaxError's tp_new/tp_str (a bare NewType left TpNew nil, so type_call
refused construction), and carries __module__ = xml.etree.ElementTree so it
renders with the dotted name.

CPython: Modules/_elementtree.c:4505 PyErr_NewException,
Objects/typeobject.c add_operators

When the positional argument is an import-path entry (a directory or a ZIP
archive) rather than a .py file, prepend it to sys.path and run its __main__
submodule via runpy, leaving argv[0] untouched. Mirrors pymain_get_importer
running the path hooks and pymain_run_module(__main__, set_argv0=0). This is
what test_zipimport_support drives when it executes a built zip.

CPython: Modules/main.c:127 pymain_get_importer, :691 pymain_run_python
… test_zipimport_support

Note the test_importlib test_util cwd-shadowing artifact too: running it from the
repo root puts module/ on sys.path so import module resolves as a namespace package
and test_find_submodule_in_module stops raising. CPython fails identically from such
a cwd; the canonical regrtest run uses a clean directory.
…ec stat taint

run() crept over the gocyclo ceiling once the importer-path branch landed;
pull that branch into runPositional so the option parser stays under the limit.
isFile feeds the same argv-derived path isDir already annotates, so it gets the
matching nolint.
@tamnd

tamnd commented Jun 18, 2026

Owner Author

Latest batch closes out the doctest/pdb side of the panel.

The interesting one was pdb single-stepping under doctest. test_pdb_set_trace
and test_pdb_set_trace_nested were stopping at the right place on the initial
set_trace() but then step/next would walk into doctest's own __run frame
instead of returning out of the user function. Tracked it down to the monitoring
shadow walk: add_tools/remove_tools were bailing the moment a slot's live byte
was already INSTRUMENTED_LINE or INSTRUMENTED_INSTRUCTION, so toggling opcode
events back off never restored the real opcode and the return events got eaten.

Ported instrument() and de_instrument() properly (Python/instrumentation.c) so
they resolve the opcode through the side tables the way CPython does: the
opcode_ptr walk goes from the live byte to the line table to the per-instruction
table before reading or writing. With that in place, opcode tracing toggles cleanly
and the debugger steps out the way it should. test_doctest is 71/71 and
test_zipimport_support is 4/4.
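
For anyone who wants to poke at the toggle from the Python side, here is a minimal sketch of a tracer that flips frame.f_trace_opcodes mid-frame, the way a debugger does when it starts stepping by instruction. The helper name trace_events and the sample function are made up for illustration. Whether 'opcode' events show up after a mid-frame flip depends on the interpreter re-instrumenting the live frame, so the sketch only records what it sees.

```python
import sys


def trace_events(func):
    """Run func under sys.settrace and return the local events seen in its frame."""
    events = []

    def local_trace(frame, event, arg):
        events.append(event)
        if event == "line":
            # Flip the flag mid-frame, as a debugger would when it starts
            # stepping by instruction.
            frame.f_trace_opcodes = True
        return local_trace

    def global_trace(frame, event, arg):
        # Only "call" reaches the global handler; trace func's frame alone.
        if frame.f_code is func.__code__:
            return local_trace
        return None

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        func()
    finally:
        sys.settrace(previous)
    return events


def sample():
    total = 1 + 2
    return total


events = trace_events(sample)
```

The list starts with a 'line' event and ends with 'return'. The bug described above is the case where that trailing 'return' never arrives once opcode events have been toggled.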

Also in here: running a directory or a zip as python path/ now does what
pymain_get_importer does: it prepends the path to sys.path and runs its __main__ submodule
through runpy instead of trying to read it as a source file.
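
A rough way to see the same behaviour from inside Python is runpy.run_path, which accepts a zip or directory containing __main__.py. This is only a sketch: run_zip_main and app.zip are illustrative names, and unlike the command line, run_path restores sys.path once it returns.

```python
import os
import runpy
import tempfile
import zipfile


def run_zip_main(source):
    """Build a zip whose __main__.py holds `source` and execute it as a path entry."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "app.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("__main__.py", source)
        # run_path treats the archive as a sys.path entry: it inserts it at
        # sys.path[0] for the duration of the run and executes __main__.
        return runpy.run_path(archive, run_name="__main__")


result = run_zip_main(
    "import sys\n"
    "answer = 6 * 7\n"
    "first_path_entry = sys.path[0]\n"
)
```

The returned globals hold answer and first_path_entry, and the latter is the archive path because it was prepended before __main__ ran.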

One honest caveat on test_importlib: test_find_submodule_in_module "fails" when
you run the suite straight from the repo root, because the root has a module/
directory (our Go module ports) and with cwd on sys.path[0] that resolves as a
PEP 420 namespace package, so find_spec('module.something') doesn't raise the
ModuleNotFoundError the test wants. Stock CPython fails the same way from a cwd
that shadows a stdlib-ish name. It's clean from any other directory and the gate
doesn't run from the repo root, so I've noted it in the panel doc rather than
papering over it.
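
The shadowing is easy to reproduce on stock CPython. In this sketch, shadowdemo_pkg is a made-up name standing in for module/, and a temp directory stands in for the repo root:

```python
import importlib
import importlib.util
import os
import sys
import tempfile

NAME = "shadowdemo_pkg"  # hypothetical stand-in for the repo-root module/ dir


def probe(extra_path=None):
    """Return find_spec(NAME + '.something'), or the error name if it raises."""
    if extra_path is not None:
        sys.path.insert(0, extra_path)
    importlib.invalidate_caches()
    try:
        return importlib.util.find_spec(NAME + ".something")
    except ModuleNotFoundError as exc:
        return type(exc).__name__
    finally:
        sys.modules.pop(NAME, None)  # forget the namespace package again
        if extra_path is not None:
            sys.path.remove(extra_path)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, NAME))  # bare directory, no __init__.py
    without_dir = probe()
    with_dir = probe(tmp)
```

Without the directory on sys.path the parent import fails and without_dir is 'ModuleNotFoundError'. With it, the parent resolves as a namespace package and find_spec returns None instead of raising.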

tamnd added 2 commits June 18, 2026 20:09
The cyclic collector reclaimed importlib._bootstrap's _blocking_on
_WeakValueDictionary out from under a live import during a full
collection, surfacing as '_WeakValueDictionary' object has no
attribute data while test_zipimport's testZip64 churned the heap.

sys.modules is held through a Go pointer the refcount pass cannot
see, so pin_roots floated only the direct module entries and trusted
move_unreachable to resurrect the rest. That works when every edge
carries a counted reference, but gopy containers do not incref what
they store (instance __dict__ among them), so subtract_refs
over-decrements an interior node on the module -> module __dict__ ->
_WeakValueDictionary -> instance __dict__ chain and a partition order
that fails to resurrect every hop drops a still-live object.

Walk the whole strongly-reachable closure from the static roots and
float each candidate to refs >= 1, recursing only through candidates
so a young-generation collection stays as cheap as before.
@tamnd

tamnd commented Jun 18, 2026

Owner Author

test_zipimport is green now: 91 tests, 0 failures, 0 errors. Four skips, all matching CPython behavior on a static-zlib build (testAFakeZlib in both class variants skips with "zlib is a builtin module", same as CPython's own Bug #765456 guard).

The interesting fix here was in the cyclic collector. _frozen_importlib._blocking_on is a module-global _WeakValueDictionary, and its backing data lived in an instance dict reachable only from sys.modules. sys.modules is held by a Go pointer, so it's invisible to the refcount collector and only floated in via GCStaticRootsHook, which previously pinned a single level. That left the dict one hop too deep, so a collection could tp_clear it and later imports blew up. pinRoots now walks the full strongly-reachable closure from the sys.modules roots (markReachableClosure in module/gc/refs.go), recursing only through collection candidates so young-gen collections stay cheap.
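
As a toy model of the closure walk (not the Go code in module/gc/refs.go), here is a sketch over a plain dict graph. The function signature, the string node names and the refs dict are all assumptions made for illustration:

```python
def mark_reachable_closure(roots, edges, candidates, refs):
    """Float every candidate strongly reachable from roots to refs >= 1.

    roots      -- objects held directly by the static root
    edges      -- dict mapping an object to the objects it references
    candidates -- set of objects in the generation being collected
    refs       -- dict of working refcounts left after the subtract pass
    """
    seen = set()
    stack = [obj for obj in roots if obj in candidates]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        refs[obj] = max(refs[obj], 1)
        # Recurse only through candidates so a young-generation pass
        # does not walk the whole heap.
        stack.extend(c for c in edges.get(obj, ()) if c in candidates)
    return seen


# The chain from the bug: module -> module __dict__ -> _blocking_on -> instance __dict__.
edges = {
    "module": ["module_dict"],
    "module_dict": ["blocking_on"],
    "blocking_on": ["instance_dict"],
}
candidates = {"module", "module_dict", "blocking_on", "instance_dict"}
refs = {name: 0 for name in candidates}
pinned = mark_reachable_closure(["module"], edges, candidates, refs)
```

Pinning only the first level would leave the later hops at zero. The closure walk floats all four nodes.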

Regression-checked test_gc (identical 29/4 before and after) and test_weakref (likewise unchanged), and the whole imports panel is green: test_import, test_frozen, test_modulefinder, test_pkgutil, test_runpy, test_zipimport, test_zipimport_support, plus the test_importlib/test_module suites via regrtest.

tamnd added 3 commits June 19, 2026 09:57
test_zipimport_support pulls test_doctest, sample_doctest and friends
out of test.test_doctest to exercise zip-imported doctest discovery.
That package was an empty namespace dir in the vendored stdlib, so the
import blew up before any test ran. Vendor the package from the CPython
3.14 source so the helpers resolve.
Package rows in the manifest (test_import/, test_importlib/,
test_module/) used to surface as OutcomeError because the runner looked
for a <pkg>/<pkg>.py entry point that these packages do not ship.
Drive them the way CPython's regrtest does instead: load the package
through unittest discovery with -m unittest test.<name>.

The command runs from the corpus directory so the repo-root module/ Go
source tree does not shadow stdlib imports on sys.path[0] (otherwise
importlib.util.find_spec('module.name') resolves module/ as a namespace
package and two find_spec tests stop raising ModuleNotFoundError).

All three suites pass: test_import 118/118, test_module 39/39,
test_importlib 1346/1346.
@tamnd

tamnd commented Jun 19, 2026

Owner Author

Closed out the last two gaps in the Modules/imports panel so the whole thing is green now.

test_zipimport_support was failing at import time: it pulls test_doctest and the sample_doctest_* helpers out of test.test_doctest, but that package was an empty namespace dir in the vendored stdlib, so the import blew up before any test ran. Vendored the test_doctest package from the 3.14 source and it's back to 4/4.

The three directory suites (test_import/, test_importlib/, test_module/) were surfacing as errors in the regrtest runner because it looked for a <pkg>/<pkg>.py entry point that none of them ship. Wired the runner to drive packages the way CPython's regrtest does, with -m unittest test.<name>. One wrinkle worth noting: the command has to run from the corpus dir, otherwise the repo-root module/ Go-port tree lands on sys.path[0] and find_spec('module.name') resolves module/ as a PEP 420 namespace package, which makes the two test_find_submodule_in_module rows stop raising ModuleNotFoundError. CPython fails the same way from such a cwd.
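
The package invocation can be sketched as below. This is not the runner's actual code: package_command, run_package and the corpus_dir argument are illustrative names, and only the -m unittest form and the working-directory requirement come from the change itself.

```python
import subprocess
import sys


def package_command(name, python=sys.executable):
    """Command line for a package suite such as test_import."""
    return [python, "-m", "unittest", f"test.{name}"]


def run_package(name, corpus_dir, python=sys.executable):
    """Run the suite with cwd set to the corpus directory, not the repo root."""
    return subprocess.run(
        package_command(name, python),
        cwd=corpus_dir,  # keeps a repo-root module/ directory off sys.path[0]
        capture_output=True,
        text=True,
        check=False,
    )
```

Passing cwd explicitly is the point: with -m, the interpreter puts the working directory at the front of sys.path, so where the command runs decides what can shadow the stdlib.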

Full panel now: test_import 118/118, test_module 39/39, test_importlib 1346/1346, plus the 12 flat files (test_frozen, test_modulefinder, test_pkg, test_pkgutil, test_pyclbr, test_runpy, test_zipapp, test_zipimport 91/91, test_zipimport_support). TestModulesImportsPanelPackages pins the three suites. Skip counts track CPython.

CI green across the matrix.

@tamnd tamnd marked this pull request as ready for review June 19, 2026 03:17
@tamnd tamnd merged commit 6225fd3 into main Jun 19, 2026
6 checks passed
@tamnd tamnd deleted the feat/v0.13.5-spec-modules-imports branch June 19, 2026 03:17
