Skip to content

Add shared-memory arena framework for iterate_in_subprocess (#1520)#1520

Merged
meta-codesync[bot] merged 1 commit into
mainfrom
export-D107709116
Jun 10, 2026
Merged

Add shared-memory arena framework for iterate_in_subprocess (#1520)#1520
meta-codesync[bot] merged 1 commit into
mainfrom
export-D107709116

Conversation

@moto-meta

@moto-meta moto-meta commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary:

Adds the backend-agnostic machinery for moving large binary payloads (large bytes, NumPy arrays, Torch tensors) between the worker and parent of iterate_in_subprocess through shared memory instead of pickling them through the result queue. This commit contains only the framework; concrete buffer backends are added in following commits.

New private subpackage spdl/pipeline/_arena/:

  • _protocol.py: structural ArenaProtocol / ArenaWriterProtocol / ArenaReaderProtocol interfaces. iterate_in_subprocess accepts any object satisfying ArenaProtocol through one buffer argument and dispatches on its type, so additional backends need no signature change. A unit (one pipeline item, possibly several binaries) is bracketed by begin_unit / commit_unit on the writer and returned with end_unit on the reader, so reservation and return happen in bulk per unit.
  • _marker.py: _ShmMarker, the small descriptor that replaces an offloaded binary in the pickled envelope.
  • _registry.py: decides which objects are offloaded and how. A handler exposes an object's bytes as a zero-copy source buffer (get_buffer) so the arena copies into shared memory exactly once, and rebuilds it from a destination buffer (from_buffer), returning the restored object plus a lifetime anchor for zero-copy views. The bytes handler is pure Python; NumPy and Torch are reached through lazy_import so import spdl.pipeline works without them and they never become hard dependencies.
  • _offload.py: offload / restore, which divert offloadable leaves of an arbitrarily nested object into the arena via a pickler persistent_id / persistent_load pair while the rest is pickled normally. The envelope carries an 8-byte unit span so the reader can return the whole region in one call.

iterate_in_subprocess gains a buffer: ArenaProtocol | None argument. When provided, the worker offloads each item before putting it on the queue and the parent restores it as the first step; the arena is created and owned by the parent, passed to the worker, and closed and unlinked at teardown after the worker is confirmed dead. With buffer=None the behavior is unchanged.

Reviewed By: huangruizhe

Differential Revision: D107709116

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jun 10, 2026
@meta-codesync

meta-codesync Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@moto-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D107709116.

@meta-codesync meta-codesync Bot changed the title Add shared-memory arena framework for iterate_in_subprocess Add shared-memory arena framework for iterate_in_subprocess (#1520) Jun 10, 2026
@meta-codesync meta-codesync Bot force-pushed the export-D107709116 branch from f30336f to 30ebad3 Compare June 10, 2026 03:54
Summary:

Adds the backend-agnostic machinery for moving large binary payloads (large `bytes`, NumPy arrays, Torch tensors) between the worker and parent of `iterate_in_subprocess` through shared memory instead of pickling them through the result queue. This commit contains only the framework; concrete buffer backends are added in following commits.

New private subpackage `spdl/pipeline/_arena/`:

- `_protocol.py`: structural `ArenaProtocol` / `ArenaWriterProtocol` / `ArenaReaderProtocol` interfaces. `iterate_in_subprocess` accepts any object satisfying `ArenaProtocol` through one `buffer` argument and dispatches on its type, so additional backends need no signature change. A unit (one pipeline item, possibly several binaries) is bracketed by `begin_unit` / `commit_unit` on the writer and returned with `end_unit` on the reader, so reservation and return happen in bulk per unit.
- `_marker.py`: `_ShmMarker`, the small descriptor that replaces an offloaded binary in the pickled envelope.
- `_registry.py`: decides which objects are offloaded and how. A handler exposes an object's bytes as a zero-copy source buffer (`get_buffer`) so the arena copies into shared memory exactly once, and rebuilds it from a destination buffer (`from_buffer`), returning the restored object plus a lifetime anchor for zero-copy views. The `bytes` handler is pure Python; NumPy and Torch are reached through `lazy_import` so `import spdl.pipeline` works without them and they never become hard dependencies.
- `_offload.py`: `offload` / `restore`, which divert offloadable leaves of an arbitrarily nested object into the arena via a pickler `persistent_id` / `persistent_load` pair while the rest is pickled normally. The envelope carries an 8-byte unit span so the reader can return the whole region in one call.

`iterate_in_subprocess` gains a `buffer: ArenaProtocol | None` argument. When provided, the worker offloads each item before putting it on the queue and the parent restores it as the first step; the arena is created and owned by the parent, passed to the worker, and closed and unlinked at teardown after the worker is confirmed dead. With `buffer=None` the behavior is unchanged.

Reviewed By: huangruizhe

Differential Revision: D107709116
@meta-codesync meta-codesync Bot force-pushed the export-D107709116 branch from 30ebad3 to 06a515a Compare June 10, 2026 08:49
@meta-codesync meta-codesync Bot merged commit 95ae378 into main Jun 10, 2026
110 of 112 checks passed
@mthrok mthrok deleted the export-D107709116 branch June 10, 2026 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot. meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants