dreamstack/engine/ds-stream
enzotar 3c14beea50 feat(engine): v0.14-v0.16 releases
ds-physics 0.16.0 (81 tests):
- v0.14: proximity queries, physics regions
- v0.15: event hooks, transform hierarchy, timeline
- v0.16: deterministic seed/checksum, collision manifolds, distance+hinge constraints

ds-stream 0.16.0 (143 tests):
- v0.14: FrameCompressor (RLE), MultiClientSync, BandwidthThrottle, FrameType::Compressed
- v0.15: AdaptiveBitrate, MetricsSnapshot, FramePipeline
- v0.16: FrameEncryptor (XOR), StreamMigration, FrameDedup, PriorityQueue

ds-stream-wasm 0.16.0 (54 tests):
- v0.14: RLE compress/decompress, sync drift, bandwidth limiting
- v0.15: adaptive quality, metrics snapshot, frame transforms
- v0.16: encrypt/decrypt frames, migration handoff, frame dedup

ds-screencast 0.16.0:
- v0.14: --roi, /clients, --compress, --migrate-on-crash
- v0.15: --adaptive-bitrate, /metrics, --viewport-transform, --cdn-push
- v0.16: /tabs, --encrypt-key, --watermark, --graceful-shutdown
2026-03-10 22:47:44 -07:00

ds-stream — Universal Bitstream Streaming

Any input bitstream → any output bitstream. Neural nets will generate the pixels. The receiver just renders bytes.

What It Does

┌───────────┐    WebSocket    ┌─────────┐    WebSocket    ┌───────────┐
│  Source   │  ───frames────► │  Relay  │  ───frames────► │ Receiver  │
│ (renders) │  ◄───inputs──── │ (:9100) │  ◄───inputs──── │ (~300 LOC)│
└───────────┘                 └─────────┘                 └───────────┘

The source runs a DreamStack app (signal graph + springs + renderer), captures output as bytes, and streams it. The receiver is a thin client that renders whatever bytes arrive — no framework, no runtime.

Binary Protocol

Every message = 16-byte header + payload.

┌──────┬───────┬──────┬───────────┬───────┬────────┬────────────┐
│ type │ flags │ seq  │ timestamp │ width │ height │ payload_len│
│  u8  │  u8   │ u16  │    u32    │  u16  │  u16   │    u32     │
└──────┴───────┴──────┴───────────┴───────┴────────┴────────────┘
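
A hedged sketch of this header as a plain Rust struct (field names follow the diagram; big-endian byte order is an assumption, and the real protocol.rs layout may differ):

```rust
// Sketch of the 16-byte message header from the diagram above.
// Big-endian field encoding is an assumption, not confirmed by the docs.
#[derive(Debug, PartialEq)]
pub struct Header {
    pub frame_type: u8,
    pub flags: u8,
    pub seq: u16,
    pub timestamp: u32,
    pub width: u16,
    pub height: u16,
    pub payload_len: u32,
}

impl Header {
    pub fn to_bytes(&self) -> [u8; 16] {
        let mut b = [0u8; 16];
        b[0] = self.frame_type;
        b[1] = self.flags;
        b[2..4].copy_from_slice(&self.seq.to_be_bytes());
        b[4..8].copy_from_slice(&self.timestamp.to_be_bytes());
        b[8..10].copy_from_slice(&self.width.to_be_bytes());
        b[10..12].copy_from_slice(&self.height.to_be_bytes());
        b[12..16].copy_from_slice(&self.payload_len.to_be_bytes());
        b
    }

    pub fn from_bytes(b: &[u8; 16]) -> Self {
        Header {
            frame_type: b[0],
            flags: b[1],
            seq: u16::from_be_bytes([b[2], b[3]]),
            timestamp: u32::from_be_bytes([b[4], b[5], b[6], b[7]]),
            width: u16::from_be_bytes([b[8], b[9]]),
            height: u16::from_be_bytes([b[10], b[11]]),
            payload_len: u32::from_be_bytes([b[12], b[13], b[14], b[15]]),
        }
    }
}

fn main() {
    // A raw-pixels frame header for a 640x480 RGBA framebuffer.
    let h = Header {
        frame_type: 0x01,
        flags: 0,
        seq: 7,
        timestamp: 1000,
        width: 640,
        height: 480,
        payload_len: 640 * 480 * 4,
    };
    assert_eq!(Header::from_bytes(&h.to_bytes()), h);
    println!("header roundtrip ok: {:?}", h);
}
```

This is the shape the crate's own "header roundtrip" test presumably exercises.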

Frame Types (source → receiver)

Code  Type              Description
0x01  Pixels            Raw RGBA framebuffer
0x02  CompressedPixels  PNG/WebP (future)
0x03  DeltaPixels       XOR delta + RLE
0x10  AudioPcm          Float32 PCM samples
0x11  AudioCompressed   Opus (future)
0x20  Haptic            Vibration command
0x30  SignalSync        Full signal state JSON
0x31  SignalDiff        Changed signals only
0x40  NeuralFrame       Neural-generated pixels
0x41  NeuralAudio       Neural speech/music
0x42  NeuralActuator    Learned motor control
0x43  NeuralLatent      Latent space scene
0xFE  Ping              Keep-alive
0xFF  End               Stream termination
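
The DeltaPixels (0x03) idea can be sketched in a few lines: XOR the new frame against the previous one, then run-length encode the mostly-zero result. This is a hypothetical illustration with a simple (count, value) pair encoding; the real codec.rs may pack runs differently:

```rust
// Hypothetical sketch of the DeltaPixels idea: XOR against the previous
// frame, then RLE the delta. Unchanged regions XOR to zero and compress well.
fn xor_delta(prev: &[u8], next: &[u8]) -> Vec<u8> {
    prev.iter().zip(next).map(|(a, b)| a ^ b).collect()
}

// RLE as (count, value) byte pairs; runs are capped at 255.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let value = data[i];
        let mut run = 1u8;
        while i + (run as usize) < data.len()
            && data[i + run as usize] == value
            && run < u8::MAX
        {
            run += 1;
        }
        out.push(run);
        out.push(value);
        i += run as usize;
    }
    out
}

fn rle_decode(data: &[u8]) -> Vec<u8> {
    data.chunks_exact(2)
        .flat_map(|pair| std::iter::repeat(pair[1]).take(pair[0] as usize))
        .collect()
}

fn main() {
    let prev = vec![10u8; 64];
    let mut next = prev.clone();
    next[5] = 99; // one changed byte in an otherwise static frame
    let delta = xor_delta(&prev, &next);
    let packed = rle_encode(&delta);
    assert!(packed.len() < delta.len());
    // Receiver side: decode the RLE, then XOR back onto the previous frame.
    let restored: Vec<u8> = rle_decode(&packed)
        .iter()
        .zip(&prev)
        .map(|(d, p)| d ^ p)
        .collect();
    assert_eq!(restored, next);
    println!("delta+RLE roundtrip ok ({} -> {} bytes)", delta.len(), packed.len());
}
```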

Input Types (receiver → source)

Code  Type          Payload
0x01  Pointer       x(u16) y(u16) buttons(u8)
0x02  PointerDown   same
0x03  PointerUp     same
0x10  KeyDown       keycode(u16) modifiers(u8)
0x11  KeyUp         same
0x20  Touch         id(u8) x(u16) y(u16)
0x30  GamepadAxis   axis(u8) value(f32)
0x40  Midi          status(u8) d1(u8) d2(u8)
0x50  Scroll        dx(i16) dy(i16)
0x60  Resize        width(u16) height(u16)
0x90  BciInput      (future)
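
A hedged sketch of packing one of these payloads, using Pointer (0x01) as the example: x(u16), y(u16), buttons(u8), five bytes total. Byte order is an assumption here:

```rust
// Hypothetical encoder/decoder for a Pointer input payload:
// x(u16) y(u16) buttons(u8) = 5 bytes. Big-endian is an assumption.
fn encode_pointer(x: u16, y: u16, buttons: u8) -> [u8; 5] {
    let mut p = [0u8; 5];
    p[0..2].copy_from_slice(&x.to_be_bytes());
    p[2..4].copy_from_slice(&y.to_be_bytes());
    p[4] = buttons;
    p
}

fn decode_pointer(p: &[u8; 5]) -> (u16, u16, u8) {
    (
        u16::from_be_bytes([p[0], p[1]]),
        u16::from_be_bytes([p[2], p[3]]),
        p[4],
    )
}

fn main() {
    let payload = encode_pointer(320, 240, 0b0000_0001); // left button held
    assert_eq!(decode_pointer(&payload), (320, 240, 1));
    println!("pointer payload: {:?}", payload);
}
```

On the wire this payload would sit behind a 16-byte header with FLAG_INPUT set, since inputs travel receiver→source.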

Flags

Bit  Meaning
0    FLAG_INPUT — message is input (receiver→source)
1    FLAG_KEYFRAME — full state, no delta
2    FLAG_COMPRESSED — payload is compressed
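
These are ordinary bit masks; checking them is a one-liner (constant names match the table, the helper is illustrative):

```rust
// The three flag bits from the table above, as plain bit masks.
const FLAG_INPUT: u8 = 1 << 0;      // bit 0: receiver -> source
const FLAG_KEYFRAME: u8 = 1 << 1;   // bit 1: full state, no delta
const FLAG_COMPRESSED: u8 = 1 << 2; // bit 2: payload is compressed

fn is_keyframe(flags: u8) -> bool {
    flags & FLAG_KEYFRAME != 0
}

fn main() {
    let flags = FLAG_KEYFRAME | FLAG_COMPRESSED;
    assert!(is_keyframe(flags));
    assert!(flags & FLAG_INPUT == 0); // a source -> receiver message
    println!("flags = {:#05b}", flags);
}
```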

Streaming Modes

Mode    Frame Size   Bandwidth @30fps   Use Case
Pixel   938 KB       ~28 MB/s           Full fidelity, any renderer
Delta   50-300 KB    ~1-9 MB/s          Low-motion scenes
Signal  ~80 B        ~2 KB/s            DreamStack-native, receiver renders
Neural  938 KB       ~28 MB/s           Model-generated output
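
The bandwidth column is just frame size times frame rate; a quick back-of-envelope check (the helper is illustrative, not part of the crate):

```rust
// Sanity-check the bandwidth column: bytes per frame * frames per second.
fn bandwidth_mb_per_s(frame_bytes: f64, fps: f64) -> f64 {
    frame_bytes * fps / (1024.0 * 1024.0)
}

fn main() {
    // Pixel mode: 938 KB/frame at 30 fps.
    let pixel = bandwidth_mb_per_s(938.0 * 1024.0, 30.0);
    assert!((pixel - 27.5).abs() < 1.0); // matches the ~28 MB/s in the table
    // Signal mode: ~80 B diffs at 30 fps.
    let signal_kb = 80.0 * 30.0 / 1024.0; // ~2.3 KB/s, matches ~2 KB/s
    println!("Pixel: {:.1} MB/s, Signal: {:.1} KB/s", pixel, signal_kb);
}
```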

Quick Start

# Start the relay server
cargo run -p ds-stream

# Serve the examples
python3 -m http.server 8080 --directory examples

# Open in browser
# Tab 1: http://localhost:8080/stream-source.html
# Tab 2: http://localhost:8080/stream-receiver.html

Click mode buttons on the source to switch between Pixel / Delta / Signal / Neural. Toggle audio. Open multiple receiver tabs.

Crate Structure

engine/ds-stream/
├── Cargo.toml
└── src/
    ├── lib.rs          # crate root
    ├── protocol.rs     # types, header, events (412 lines)
    ├── codec.rs        # encode/decode, delta, builders (247 lines)
    ├── relay.rs        # WebSocket relay server (255 lines)
    └── main.rs         # CLI entry point

Tests

cargo test -p ds-stream    # 17 tests

Covers: header roundtrip, event roundtrip, delta compression, frame type enums, flag checks, partial buffer handling, message size calculation.


Next Steps

Near-term

  1. WebRTC transport — Replace WebSocket with WebRTC DataChannel for sub-10ms latency and NAT traversal. The protocol is transport-agnostic; only the relay changes.

  2. Opus audio compression — Replace raw PCM (AudioPcm) with Opus encoding (AudioCompressed). 28 KB/s → ~6 KB/s for voice, near-transparent quality.

  3. Adaptive quality — Source monitors receiver lag (via ACK frames) and auto-downgrades: full pixels → delta → signal diff. Graceful degradation on slow networks.

  4. WASM codec — Compile protocol.rs + codec.rs to WASM so source and receiver share the exact same binary codec. No JS reimplementation drift.
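
The adaptive-quality idea in step 3 amounts to a downgrade ladder driven by measured receiver lag. A hypothetical sketch (the type, function, and thresholds are all invented for illustration):

```rust
// Hypothetical downgrade ladder for adaptive quality: as receiver lag
// (reported via ACK frames) grows, step down from pixels to deltas to
// signal diffs. Thresholds here are made up for illustration.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Quality {
    Pixels,     // full RGBA frames, ~28 MB/s
    Delta,      // XOR delta + RLE, ~1-9 MB/s
    SignalDiff, // changed signals only, ~2 KB/s
}

fn choose_quality(lag_ms: u32) -> Quality {
    match lag_ms {
        0..=50 => Quality::Pixels,   // receiver keeping up
        51..=200 => Quality::Delta,  // mild lag
        _ => Quality::SignalDiff,    // heavy lag: graceful degradation
    }
}

fn main() {
    assert_eq!(choose_quality(10), Quality::Pixels);
    assert_eq!(choose_quality(120), Quality::Delta);
    assert_eq!(choose_quality(900), Quality::SignalDiff);
    println!("downgrade ladder ok");
}
```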

Medium-term

  1. Persistent signal state — Source sends SignalSync keyframes periodically. New receivers joining mid-stream get full state immediately, then switch to diffs.

  2. Touch/gamepad input — Wire up Touch, GamepadAxis, GamepadButton input types on receiver. Enable mobile and controller interaction.

  3. Frame compression — PNG/WebP encoding for pixel frames (CompressedPixels). Canvas toBlob('image/webp', 0.8) gives ~50-100 KB/frame vs 938 KB raw.

  4. Haptic output — Receiver calls navigator.vibrate() on Haptic frames. Spring impact → buzz.

Long-term (neural path)

  1. Train a pixel model — Collect (signal_state, screenshot) pairs from the springs demo. Train a small CNN/NeRF to predict framebuffer from signal state. Replace neuralRender() with real inference.

  2. Latent space streaming — Instead of pixels, stream a compressed latent representation (NeuralLatent). Receiver runs a lightweight decoder model. ~1 KB/frame for HD content.

  3. Voice input — Receiver captures microphone audio, streams as VoiceInput. Source interprets via speech-to-intent model to drive signals.

  4. Multi-source composition — Multiple sources stream to the same relay. Receiver composites layers. Each source owns a region or z-layer of the output.