ds-physics v0.50.0 (138 tests)
- v0.28: apply_body_torque, is_body_sleeping, get_body_angle
- v0.30: set_body_gravity, set_linear_damping, count_awake_bodies
- v0.40: joints (distance/pin), raycast, kinematic, time scale, world stats
- v0.50: point query, explosion, velocity/position set, contacts, gravity, collision groups

ds-stream v0.50.0 (201 tests)
- v0.28: BufferPool, PacketJitterBuffer, RttTracker
- v0.30: FrameRingBuffer, PacketLossDetector, ConnectionQuality
- v0.40: QualityAdapter, SourceMixer, FrameDeduplicator, BackpressureController, HeartbeatMonitor, CompressionTracker, FecEncoder, StreamSnapshot, AdaptivePriorityQueue
- v0.50: StreamCipher, ChannelMux/Demux, FramePacer, CongestionWindow, FlowController, ProtocolNegotiator, ReplayRecorder, BandwidthShaper

ds-stream-wasm v0.50.0 (111 tests)
- WASM bindings for all stream features

ds-screencast v0.50.0
- CLI: --jitter-buffer, --latency-window, --ring-buffer, --loss-threshold, --adaptive, --dedup, --backpressure, --heartbeat-ms, --fec, --encrypt-key, --channels, --pacing-ms, --max-bps, --replay-file
# ds-stream — Universal Bitstream Streaming
Any input bitstream → any output bitstream. Neural nets will generate the pixels. The receiver just renders bytes.
## What It Does
```
┌───────────┐    WebSocket    ┌─────────┐    WebSocket    ┌────────────┐
│  Source   │ ───frames─────► │  Relay  │ ───frames─────► │  Receiver  │
│ (renders) │ ◄───inputs───── │ (:9100) │ ◄───inputs───── │ (~300 LOC) │
└───────────┘                 └─────────┘                 └────────────┘
```
The source runs a DreamStack app (signal graph + springs + renderer), captures output as bytes, and streams it. The receiver is a thin client that renders whatever bytes arrive — no framework, no runtime.
## Binary Protocol
Every message = 16-byte header + payload.
```
┌──────┬───────┬──────┬───────────┬───────┬────────┬────────────┐
│ type │ flags │ seq  │ timestamp │ width │ height │ payload_len│
│  u8  │  u8   │ u16  │    u32    │  u16  │  u16   │    u32     │
└──────┴───────┴──────┴───────────┴───────┴────────┴────────────┘
```
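The header can be packed and unpacked in a few lines. Here is a minimal sketch, assuming little-endian field order and the field names from the diagram; the authoritative layout lives in `protocol.rs`:

```rust
// Sketch of the 16-byte message header. Field names follow the diagram above;
// little-endian byte order is an assumption -- check protocol.rs for the real codec.
#[derive(Debug, PartialEq)]
struct Header {
    msg_type: u8,
    flags: u8,
    seq: u16,
    timestamp: u32,
    width: u16,
    height: u16,
    payload_len: u32,
}

impl Header {
    fn encode(&self) -> [u8; 16] {
        let mut b = [0u8; 16];
        b[0] = self.msg_type;
        b[1] = self.flags;
        b[2..4].copy_from_slice(&self.seq.to_le_bytes());
        b[4..8].copy_from_slice(&self.timestamp.to_le_bytes());
        b[8..10].copy_from_slice(&self.width.to_le_bytes());
        b[10..12].copy_from_slice(&self.height.to_le_bytes());
        b[12..16].copy_from_slice(&self.payload_len.to_le_bytes());
        b
    }

    fn decode(b: &[u8; 16]) -> Header {
        Header {
            msg_type: b[0],
            flags: b[1],
            seq: u16::from_le_bytes([b[2], b[3]]),
            timestamp: u32::from_le_bytes([b[4], b[5], b[6], b[7]]),
            width: u16::from_le_bytes([b[8], b[9]]),
            height: u16::from_le_bytes([b[10], b[11]]),
            payload_len: u32::from_le_bytes([b[12], b[13], b[14], b[15]]),
        }
    }
}

fn main() {
    let h = Header { msg_type: 0x01, flags: 0, seq: 7, timestamp: 123,
                     width: 640, height: 375, payload_len: 960_000 };
    let bytes = h.encode();
    assert_eq!(Header::decode(&bytes), h); // header roundtrip
}
```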
### Frame Types (source → receiver)
| Code | Type | Description |
|---|---|---|
| 0x01 | Pixels | Raw RGBA framebuffer |
| 0x02 | CompressedPixels | PNG/WebP (future) |
| 0x03 | DeltaPixels | XOR delta + RLE |
| 0x10 | AudioPcm | Float32 PCM samples |
| 0x11 | AudioCompressed | Opus (future) |
| 0x20 | Haptic | Vibration command |
| 0x30 | SignalSync | Full signal state JSON |
| 0x31 | SignalDiff | Changed signals only |
| 0x40 | NeuralFrame | Neural-generated pixels |
| 0x41 | NeuralAudio | Neural speech/music |
| 0x42 | NeuralActuator | Learned motor control |
| 0x43 | NeuralLatent | Latent space scene |
| 0xFE | Ping | Keep-alive |
| 0xFF | End | Stream termination |
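The DeltaPixels (0x03) idea — XOR against the previous frame, then run-length encode the mostly-zero result — can be illustrated directly. This is a sketch, not the crate's actual wire format; the `(run, byte)` pair encoding is an assumption, and `codec.rs` is authoritative:

```rust
// XOR the current frame against the previous one: unchanged pixels become zero,
// so low-motion frames produce long zero runs that RLE compresses well.
fn xor_delta(prev: &[u8], curr: &[u8]) -> Vec<u8> {
    prev.iter().zip(curr).map(|(p, c)| p ^ c).collect()
}

// Simplified RLE: emit (run_length, byte) pairs, runs capped at 255.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let b = data[i];
        let mut run = 1usize;
        while i + run < data.len() && data[i + run] == b && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(b);
        i += run;
    }
    out
}

fn main() {
    let prev = [10u8, 10, 10, 10, 20, 20];
    let curr = [10u8, 10, 10, 10, 21, 20]; // one byte changed
    let delta = xor_delta(&prev, &curr);
    assert_eq!(delta, vec![0, 0, 0, 0, 1, 0]);
    assert_eq!(rle_encode(&delta), vec![4, 0, 1, 1, 1, 0]);
}
```

Decoding reverses both steps: expand the runs, then XOR back onto the previous frame.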
### Input Types (receiver → source)
| Code | Type | Payload |
|---|---|---|
| 0x01 | Pointer | x(u16) y(u16) buttons(u8) |
| 0x02 | PointerDown | same |
| 0x03 | PointerUp | same |
| 0x10 | KeyDown | keycode(u16) modifiers(u8) |
| 0x11 | KeyUp | same |
| 0x20 | Touch | id(u8) x(u16) y(u16) |
| 0x30 | GamepadAxis | axis(u8) value(f32) |
| 0x40 | Midi | status(u8) d1(u8) d2(u8) |
| 0x50 | Scroll | dx(i16) dy(i16) |
| 0x60 | Resize | width(u16) height(u16) |
| 0x90 | BciInput | (future) |
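As a concrete example of the payload column, the Pointer (0x01) payload is five bytes. A sketch assuming little-endian integers (see `protocol.rs` for the real encoding):

```rust
// Pointer payload per the table: x(u16) y(u16) buttons(u8) = 5 bytes.
// Little-endian byte order is an assumption here.
fn encode_pointer(x: u16, y: u16, buttons: u8) -> [u8; 5] {
    let mut b = [0u8; 5];
    b[0..2].copy_from_slice(&x.to_le_bytes());
    b[2..4].copy_from_slice(&y.to_le_bytes());
    b[4] = buttons;
    b
}

fn main() {
    // x = 320 (0x0140), y = 200 (0x00C8), left button bit set.
    let payload = encode_pointer(320, 200, 0b001);
    assert_eq!(payload, [0x40, 0x01, 0xC8, 0x00, 0x01]);
}
```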
### Flags
| Bit | Meaning |
|---|---|
| 0 | FLAG_INPUT — message is input (receiver→source) |
| 1 | FLAG_KEYFRAME — full state, no delta |
| 2 | FLAG_COMPRESSED — payload is compressed |
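The flag names come from the table above; the mask values follow directly from the bit positions. A minimal sketch of testing and combining them:

```rust
// Flag bits from the table: bit 0, 1, 2 as single-bit masks.
const FLAG_INPUT: u8 = 1 << 0;      // message is input (receiver → source)
const FLAG_KEYFRAME: u8 = 1 << 1;   // full state, no delta
const FLAG_COMPRESSED: u8 = 1 << 2; // payload is compressed

fn main() {
    let flags = FLAG_KEYFRAME | FLAG_COMPRESSED; // a compressed keyframe from the source
    assert_eq!(flags, 0b110);
    assert!(flags & FLAG_KEYFRAME != 0);
    assert!(flags & FLAG_INPUT == 0); // not an input message
}
```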
## Streaming Modes
| Mode | Frame Size | Bandwidth @30fps | Use Case |
|---|---|---|---|
| Pixel | 938 KB | ~28 MB/s | Full fidelity, any renderer |
| Delta | 50-300 KB | ~1-9 MB/s | Low-motion scenes |
| Signal | ~80 B | ~2 KB/s | DreamStack-native, receiver renders |
| Neural | 938 KB | ~28 MB/s | Model-generated output |
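The Pixel row's numbers are consistent with a 640×375 RGBA framebuffer — an assumed resolution for illustration; the actual demo resolution may differ:

```rust
// Back-of-envelope check of the bandwidth column: bytes per frame = w * h * 4
// (RGBA), bytes per second = frame size * fps. 640x375 is an assumed resolution
// chosen because it matches the ~938 KB figure in the table.
fn main() {
    let (w, h, bpp, fps) = (640u64, 375, 4, 30);
    let frame_bytes = w * h * bpp;
    assert_eq!(frame_bytes, 960_000); // ~938 KB per frame

    let bytes_per_sec = frame_bytes * fps;
    assert_eq!(bytes_per_sec, 28_800_000); // ~28 MB/s, matching the Pixel row
}
```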
## Quick Start

```sh
# Start the relay server
cargo run -p ds-stream

# Serve the examples
python3 -m http.server 8080 --directory examples

# Open in browser
# Tab 1: http://localhost:8080/stream-source.html
# Tab 2: http://localhost:8080/stream-receiver.html
```
Click mode buttons on the source to switch between Pixel / Delta / Signal / Neural. Toggle audio. Open multiple receiver tabs.
## Crate Structure

```
engine/ds-stream/
├── Cargo.toml
└── src/
    ├── lib.rs       # crate root
    ├── protocol.rs  # types, header, events (412 lines)
    ├── codec.rs     # encode/decode, delta, builders (247 lines)
    ├── relay.rs     # WebSocket relay server (255 lines)
    └── main.rs      # CLI entry point
```
## Tests

```sh
cargo test -p ds-stream   # 17 tests
```

Covers: header roundtrip, event roundtrip, delta compression, frame type enums, flag checks, partial buffer handling, message size calculation.
## Next Steps

### Near-term

- **WebRTC transport** — Replace WebSocket with WebRTC DataChannel for sub-10ms latency and NAT traversal. The protocol is transport-agnostic; only the relay changes.
- **Opus audio compression** — Replace raw PCM (`AudioPcm`) with Opus encoding (`AudioCompressed`). 28 KB/s → ~6 KB/s for voice, near-transparent quality.
- **Adaptive quality** — Source monitors receiver lag (via ACK frames) and auto-downgrades: full pixels → delta → signal diff. Graceful degradation on slow networks.
- **WASM codec** — Compile `protocol.rs` + `codec.rs` to WASM so source and receiver share the exact same binary codec. No JS reimplementation drift.
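The adaptive-quality downgrade could be as simple as picking a streaming mode from measured receiver lag. A hypothetical sketch — the thresholds, `Mode` names, and `pick_mode` function are all invented, since this feature is not yet built:

```rust
// Hypothetical adaptive-quality policy: map measured receiver lag (e.g. from
// ACK frames) to a streaming mode. Thresholds are illustrative, not tuned.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Mode {
    Pixel,      // full RGBA frames
    Delta,      // XOR delta + RLE
    SignalDiff, // signals only; receiver renders
}

fn pick_mode(lag_ms: u32) -> Mode {
    match lag_ms {
        0..=50 => Mode::Pixel,   // receiver keeping up: full fidelity
        51..=200 => Mode::Delta, // mild lag: shrink frames
        _ => Mode::SignalDiff,   // heavy lag: ~80 B/frame fallback
    }
}

fn main() {
    assert_eq!(pick_mode(10), Mode::Pixel);
    assert_eq!(pick_mode(120), Mode::Delta);
    assert_eq!(pick_mode(500), Mode::SignalDiff);
}
```

A real implementation would also want hysteresis so the mode doesn't oscillate around a threshold.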
### Medium-term

- **Persistent signal state** — Source sends `SignalSync` keyframes periodically. New receivers joining mid-stream get full state immediately, then switch to diffs.
- **Touch/gamepad input** — Wire up Touch, GamepadAxis, GamepadButton input types on the receiver. Enable mobile and controller interaction.
- **Frame compression** — PNG/WebP encoding for pixel frames (`CompressedPixels`). Canvas `toBlob('image/webp', 0.8)` gives ~50-100 KB/frame vs 938 KB raw.
- **Haptic output** — Receiver calls `navigator.vibrate()` on Haptic frames. Spring impact → buzz.
### Long-term (neural path)

- **Train a pixel model** — Collect (signal_state, screenshot) pairs from the springs demo. Train a small CNN/NeRF to predict the framebuffer from signal state. Replace `neuralRender()` with real inference.
- **Latent space streaming** — Instead of pixels, stream a compressed latent representation (`NeuralLatent`). The receiver runs a lightweight decoder model. ~1 KB/frame for HD content.
- **Voice input** — Receiver captures microphone audio and streams it as `VoiceInput`. Source interprets it via a speech-to-intent model to drive signals.
- **Multi-source composition** — Multiple sources stream to the same relay. The receiver composites layers. Each source owns a region or z-layer of the output.