ds-physics 0.16.0 (81 tests):
- v0.14: proximity queries, physics regions
- v0.15: event hooks, transform hierarchy, timeline
- v0.16: deterministic seed/checksum, collision manifolds, distance+hinge constraints

ds-stream 0.16.0 (143 tests):
- v0.14: FrameCompressor (RLE), MultiClientSync, BandwidthThrottle, FrameType::Compressed
- v0.15: AdaptiveBitrate, MetricsSnapshot, FramePipeline
- v0.16: FrameEncryptor (XOR), StreamMigration, FrameDedup, PriorityQueue

ds-stream-wasm 0.16.0 (54 tests):
- v0.14: RLE compress/decompress, sync drift, bandwidth limiting
- v0.15: adaptive quality, metrics snapshot, frame transforms
- v0.16: encrypt/decrypt frames, migration handoff, frame dedup

ds-screencast 0.16.0:
- v0.14: --roi, /clients, --compress, --migrate-on-crash
- v0.15: --adaptive-bitrate, /metrics, --viewport-transform, --cdn-push
- v0.16: /tabs, --encrypt-key, --watermark, --graceful-shutdown
# ds-stream — Universal Bitstream Streaming
Any input bitstream → any output bitstream. Neural nets will generate the pixels. The receiver just renders bytes.
## What It Does

```
┌──────────┐   WebSocket    ┌─────────┐   WebSocket    ┌──────────┐
│  Source  │ ───frames────► │  Relay  │ ───frames────► │ Receiver │
│ (renders)│ ◄───inputs──── │ (:9100) │ ◄───inputs──── │(~300 LOC)│
└──────────┘                └─────────┘                └──────────┘
```
The source runs a DreamStack app (signal graph + springs + renderer), captures output as bytes, and streams it. The receiver is a thin client that renders whatever bytes arrive — no framework, no runtime.
## Binary Protocol

Every message = 16-byte header + payload.

```
┌──────┬───────┬──────┬───────────┬───────┬────────┬────────────┐
│ type │ flags │ seq  │ timestamp │ width │ height │ payload_len│
│  u8  │  u8   │ u16  │    u32    │  u16  │  u16   │    u32     │
└──────┴───────┴──────┴───────────┴───────┴────────┴────────────┘
```
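As a minimal sketch, the header above can be packed and unpacked like this. Little-endian field encoding is an assumption (the byte order is not specified here; the crate's real implementation lives in `codec.rs`):

```rust
/// Sketch of the 16-byte message header. Field order follows the
/// diagram; little-endian byte order is an assumption.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Header {
    msg_type: u8,
    flags: u8,
    seq: u16,
    timestamp: u32,
    width: u16,
    height: u16,
    payload_len: u32,
}

impl Header {
    fn encode(&self) -> [u8; 16] {
        let mut b = [0u8; 16];
        b[0] = self.msg_type;
        b[1] = self.flags;
        b[2..4].copy_from_slice(&self.seq.to_le_bytes());
        b[4..8].copy_from_slice(&self.timestamp.to_le_bytes());
        b[8..10].copy_from_slice(&self.width.to_le_bytes());
        b[10..12].copy_from_slice(&self.height.to_le_bytes());
        b[12..16].copy_from_slice(&self.payload_len.to_le_bytes());
        b
    }

    fn decode(b: &[u8; 16]) -> Self {
        Header {
            msg_type: b[0],
            flags: b[1],
            seq: u16::from_le_bytes([b[2], b[3]]),
            timestamp: u32::from_le_bytes([b[4], b[5], b[6], b[7]]),
            width: u16::from_le_bytes([b[8], b[9]]),
            height: u16::from_le_bytes([b[10], b[11]]),
            payload_len: u32::from_le_bytes([b[12], b[13], b[14], b[15]]),
        }
    }
}
```

The fixed 16-byte size means a receiver can always read the header first, learn `payload_len`, and then know exactly how many more bytes to wait for.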
## Frame Types (source → receiver)

| Code | Type | Description |
|---|---|---|
| 0x01 | Pixels | Raw RGBA framebuffer |
| 0x02 | CompressedPixels | PNG/WebP (future) |
| 0x03 | DeltaPixels | XOR delta + RLE |
| 0x10 | AudioPcm | Float32 PCM samples |
| 0x11 | AudioCompressed | Opus (future) |
| 0x20 | Haptic | Vibration command |
| 0x30 | SignalSync | Full signal state JSON |
| 0x31 | SignalDiff | Changed signals only |
| 0x40 | NeuralFrame | Neural-generated pixels |
| 0x41 | NeuralAudio | Neural speech/music |
| 0x42 | NeuralActuator | Learned motor control |
| 0x43 | NeuralLatent | Latent space scene |
| 0xFE | Ping | Keep-alive |
| 0xFF | End | Stream termination |
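These codes map naturally onto a `repr(u8)` enum. This is a sketch of an assumed shape, not the definitions in `protocol.rs`:

```rust
/// Frame type codes from the table above, as a Rust enum.
/// Discriminants match the on-wire `type` byte.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum FrameType {
    Pixels = 0x01,
    CompressedPixels = 0x02,
    DeltaPixels = 0x03,
    AudioPcm = 0x10,
    AudioCompressed = 0x11,
    Haptic = 0x20,
    SignalSync = 0x30,
    SignalDiff = 0x31,
    NeuralFrame = 0x40,
    NeuralAudio = 0x41,
    NeuralActuator = 0x42,
    NeuralLatent = 0x43,
    Ping = 0xFE,
    End = 0xFF,
}
```

Casting with `as u8` yields the wire code directly, e.g. `FrameType::DeltaPixels as u8` is `0x03`.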
## Input Types (receiver → source)

| Code | Type | Payload |
|---|---|---|
| 0x01 | Pointer | x(u16) y(u16) buttons(u8) |
| 0x02 | PointerDown | same |
| 0x03 | PointerUp | same |
| 0x10 | KeyDown | keycode(u16) modifiers(u8) |
| 0x11 | KeyUp | same |
| 0x20 | Touch | id(u8) x(u16) y(u16) |
| 0x30 | GamepadAxis | axis(u8) value(f32) |
| 0x40 | Midi | status(u8) d1(u8) d2(u8) |
| 0x50 | Scroll | dx(i16) dy(i16) |
| 0x60 | Resize | width(u16) height(u16) |
| 0x90 | BciInput | (future) |
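For example, a Pointer payload is 5 bytes per the table (x, y, buttons). A hypothetical receiver-side encoder, again assuming little-endian fields:

```rust
/// Encode a Pointer input payload: x(u16) y(u16) buttons(u8) = 5 bytes.
/// Little-endian byte order is an assumption; the frame header
/// (type 0x01, FLAG_INPUT set) would be prepended separately.
fn encode_pointer(x: u16, y: u16, buttons: u8) -> Vec<u8> {
    let mut payload = Vec::with_capacity(5);
    payload.extend_from_slice(&x.to_le_bytes());
    payload.extend_from_slice(&y.to_le_bytes());
    payload.push(buttons);
    payload
}
```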
## Flags
| Bit | Meaning |
|---|---|
| 0 | FLAG_INPUT — message is input (receiver→source) |
| 1 | FLAG_KEYFRAME — full state, no delta |
| 2 | FLAG_COMPRESSED — payload is compressed |
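The three flag bits combine by bitwise OR in the header's `flags` byte. A small sketch (constant names are assumptions; see `protocol.rs` for the real ones):

```rust
// Flag bits from the table above, as bit masks.
const FLAG_INPUT: u8 = 1 << 0;      // message is input (receiver→source)
const FLAG_KEYFRAME: u8 = 1 << 1;   // full state, no delta
const FLAG_COMPRESSED: u8 = 1 << 2; // payload is compressed

/// True if `flags` has the given bit set.
fn has_flag(flags: u8, flag: u8) -> bool {
    flags & flag != 0
}
```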
## Streaming Modes
| Mode | Frame Size | Bandwidth @30fps | Use Case |
|---|---|---|---|
| Pixel | 938 KB | ~28 MB/s | Full fidelity, any renderer |
| Delta | 50-300 KB | ~1-9 MB/s | Low-motion scenes |
| Signal | ~80 B | ~2 KB/s | DreamStack-native, receiver renders |
| Neural | 938 KB | ~28 MB/s | Model-generated output |
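Delta mode is small because XOR-ing against the previous frame turns every unchanged pixel into zero bytes, which run-length encoding then collapses. A minimal sketch of the idea, using a simple `(count, value)` pair format that is an illustrative assumption, not the crate's actual on-wire encoding:

```rust
/// XOR each byte of the current frame against the previous frame.
/// Unchanged regions become runs of zeros.
fn xor_delta(prev: &[u8], curr: &[u8]) -> Vec<u8> {
    prev.iter().zip(curr).map(|(a, b)| a ^ b).collect()
}

/// Run-length encode as (count, value) pairs, runs capped at 255.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let byte = data[i];
        let mut run: u8 = 1;
        while i + (run as usize) < data.len()
            && data[i + run as usize] == byte
            && run < u8::MAX
        {
            run += 1;
        }
        out.push(run);
        out.push(byte);
        i += run as usize;
    }
    out
}
```

On a static scene the delta is all zeros, so a whole framebuffer shrinks to a handful of `(count, 0)` pairs; that is where the 938 KB → 50-300 KB range comes from.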
## Quick Start

```sh
# Start the relay server
cargo run -p ds-stream

# Serve the examples
python3 -m http.server 8080 --directory examples

# Open in browser
# Tab 1: http://localhost:8080/stream-source.html
# Tab 2: http://localhost:8080/stream-receiver.html
```
Click mode buttons on the source to switch between Pixel / Delta / Signal / Neural. Toggle audio. Open multiple receiver tabs.
## Crate Structure

```
engine/ds-stream/
├── Cargo.toml
└── src/
    ├── lib.rs       # crate root
    ├── protocol.rs  # types, header, events (412 lines)
    ├── codec.rs     # encode/decode, delta, builders (247 lines)
    ├── relay.rs     # WebSocket relay server (255 lines)
    └── main.rs      # CLI entry point
```
## Tests

```sh
cargo test -p ds-stream   # 17 tests
```

Covers: header roundtrip, event roundtrip, delta compression, frame type enums, flag checks, partial buffer handling, message size calculation.
## Next Steps

### Near-term

- WebRTC transport — Replace WebSocket with a WebRTC DataChannel for sub-10ms latency and NAT traversal. The protocol is transport-agnostic; only the relay changes.
- Opus audio compression — Replace raw PCM (`AudioPcm`) with Opus encoding (`AudioCompressed`). 28 KB/s → ~6 KB/s for voice, near-transparent quality.
- Adaptive quality — The source monitors receiver lag (via ACK frames) and auto-downgrades: full pixels → delta → signal diff. Graceful degradation on slow networks.
- WASM codec — Compile `protocol.rs` + `codec.rs` to WASM so source and receiver share the exact same binary codec. No JS reimplementation drift.
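The adaptive-quality idea reduces to a small decision function. A sketch under stated assumptions — the lag thresholds and the ack-lag metric are illustrative, not crate API:

```rust
/// Streaming quality tiers, highest fidelity first.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Quality {
    FullPixels, // raw RGBA frames
    DeltaPixels, // XOR delta + RLE
    SignalDiff, // changed signals only, receiver renders
}

/// Pick a tier from observed receiver lag (ms behind the latest ACK).
/// Thresholds here are illustrative assumptions.
fn pick_quality(lag_ms: u32) -> Quality {
    match lag_ms {
        0..=50 => Quality::FullPixels,
        51..=200 => Quality::DeltaPixels,
        _ => Quality::SignalDiff,
    }
}
```

A production version would likely add hysteresis so the tier does not flap around a threshold.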
### Medium-term

- Persistent signal state — The source sends `SignalSync` keyframes periodically. New receivers joining mid-stream get full state immediately, then switch to diffs.
- Touch/gamepad input — Wire up the Touch, GamepadAxis, and GamepadButton input types on the receiver. Enables mobile and controller interaction.
- Frame compression — PNG/WebP encoding for pixel frames (`CompressedPixels`). Canvas `toBlob('image/webp', 0.8)` gives ~50-100 KB/frame vs 938 KB raw.
- Haptic output — The receiver calls `navigator.vibrate()` on Haptic frames. Spring impact → buzz.
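The periodic-keyframe item above boils down to one scheduling decision on the source. A hedged sketch (function name and the interval of 60 frames are illustrative assumptions):

```rust
/// Decide whether this frame should be a full SignalSync keyframe:
/// either the periodic interval elapsed, or a new receiver just
/// joined and needs full state before it can apply diffs.
fn should_send_keyframe(seq: u16, interval: u16, new_receiver_joined: bool) -> bool {
    new_receiver_joined || seq % interval == 0
}
```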
### Long-term (neural path)

- Train a pixel model — Collect (signal_state, screenshot) pairs from the springs demo. Train a small CNN/NeRF to predict the framebuffer from signal state. Replace `neuralRender()` with real inference.
- Latent space streaming — Instead of pixels, stream a compressed latent representation (`NeuralLatent`). The receiver runs a lightweight decoder model. ~1 KB/frame for HD content.
- Voice input — The receiver captures microphone audio and streams it as `VoiceInput`. The source interprets it via a speech-to-intent model to drive signals.
- Multi-source composition — Multiple sources stream to the same relay. The receiver composites layers; each source owns a region or z-layer of the output.