# ds-stream — Universal Bitstream Streaming

> Any input bitstream → any output bitstream.
> Neural nets will generate the pixels. The receiver just renders bytes.

## What It Does

```
┌───────────┐   WebSocket   ┌─────────┐   WebSocket   ┌───────────┐
│  Source   │ ──frames────► │  Relay  │ ──frames────► │ Receiver  │
│ (renders) │ ◄──inputs──── │ (:9100) │ ◄──inputs──── │ (~300 LOC)│
└───────────┘               └─────────┘               └───────────┘
```

The **source** runs a DreamStack app (signal graph + springs + renderer), captures output as bytes, and streams it. The **receiver** is a thin client that renders whatever bytes arrive — no framework, no runtime.

## Binary Protocol

Every message = **16-byte header** + payload.

```
┌──────┬───────┬──────┬───────────┬───────┬────────┬────────────┐
│ type │ flags │ seq  │ timestamp │ width │ height │ payload_len│
│ u8   │ u8    │ u16  │ u32       │ u16   │ u16    │ u32        │
└──────┴───────┴──────┴───────────┴───────┴────────┴────────────┘
```

### Frame Types (source → receiver)

| Code | Type | Description |
|------|------|-------------|
| `0x01` | Pixels | Raw RGBA framebuffer |
| `0x02` | CompressedPixels | PNG/WebP (future) |
| `0x03` | DeltaPixels | XOR delta + RLE |
| `0x10` | AudioPcm | Float32 PCM samples |
| `0x11` | AudioCompressed | Opus (future) |
| `0x20` | Haptic | Vibration command |
| `0x30` | SignalSync | Full signal state JSON |
| `0x31` | SignalDiff | Changed signals only |
| `0x40` | NeuralFrame | Neural-generated pixels |
| `0x41` | NeuralAudio | Neural speech/music |
| `0x42` | NeuralActuator | Learned motor control |
| `0x43` | NeuralLatent | Latent space scene |
| `0xFE` | Ping | Keep-alive |
| `0xFF` | End | Stream termination |

### Input Types (receiver → source)

| Code | Type | Payload |
|------|------|---------|
| `0x01` | Pointer | x(u16) y(u16) buttons(u8) |
| `0x02` | PointerDown | same |
| `0x03` | PointerUp | same |
| `0x10` | KeyDown | keycode(u16) modifiers(u8) |
| `0x11` | KeyUp | same |
| `0x20` | Touch | id(u8) x(u16) y(u16) |
| `0x30` | GamepadAxis | axis(u8) value(f32) |
| `0x40` | Midi | status(u8) d1(u8) d2(u8) |
| `0x50` | Scroll | dx(i16) dy(i16) |
| `0x60` | Resize | width(u16) height(u16) |
| `0x90` | BciInput | (future) |

### Flags

| Bit | Meaning |
|-----|---------|
| 0 | `FLAG_INPUT` — message is input (receiver→source) |
| 1 | `FLAG_KEYFRAME` — full state, no delta |
| 2 | `FLAG_COMPRESSED` — payload is compressed |

## Streaming Modes

| Mode | Frame Size | Bandwidth @30fps | Use Case |
|------|------------|------------------|----------|
| **Pixel** | 938 KB | ~28 MB/s | Full fidelity, any renderer |
| **Delta** | 50-300 KB | ~1-9 MB/s | Low-motion scenes |
| **Signal** | ~80 B | ~2 KB/s | DreamStack-native, receiver renders |
| **Neural** | 938 KB | ~28 MB/s | Model-generated output |

## Quick Start

```bash
# Start the relay server
cargo run -p ds-stream

# Serve the examples
python3 -m http.server 8080 --directory examples

# Open in browser
# Tab 1: http://localhost:8080/stream-source.html
# Tab 2: http://localhost:8080/stream-receiver.html
```

Click mode buttons on the source to switch between Pixel / Delta / Signal / Neural. Toggle audio. Open multiple receiver tabs.

## Crate Structure

```
engine/ds-stream/
├── Cargo.toml
└── src/
    ├── lib.rs       # crate root
    ├── protocol.rs  # types, header, events (412 lines)
    ├── codec.rs     # encode/decode, delta, builders (247 lines)
    ├── relay.rs     # WebSocket relay server (255 lines)
    └── main.rs      # CLI entry point
```

## Tests

```bash
cargo test -p ds-stream   # 17 tests
```

Covers: header roundtrip, event roundtrip, delta compression, frame type enums, flag checks, partial buffer handling, message size calculation.

---

## Next Steps

### Near-term

1. **WebRTC transport** — Replace WebSocket with WebRTC DataChannel for sub-10ms latency and NAT traversal. The protocol is transport-agnostic; only the relay changes.
2. **Opus audio compression** — Replace raw PCM (`AudioPcm`) with Opus encoding (`AudioCompressed`). 28 KB/s → ~6 KB/s for voice, near-transparent quality.
3. **Adaptive quality** — Source monitors receiver lag (via ACK frames) and auto-downgrades: full pixels → delta → signal diff. Graceful degradation on slow networks.
4. **WASM codec** — Compile `protocol.rs` + `codec.rs` to WASM so source and receiver share the exact same binary codec. No JS reimplementation drift.

### Medium-term

5. **Persistent signal state** — Source sends `SignalSync` keyframes periodically. New receivers joining mid-stream get full state immediately, then switch to diffs.
6. **Touch/gamepad input** — Wire up Touch, GamepadAxis, GamepadButton input types on the receiver. Enables mobile and controller interaction.
7. **Frame compression** — PNG/WebP encoding for pixel frames (`CompressedPixels`). Canvas `toBlob('image/webp', 0.8)` gives ~50-100 KB/frame vs 938 KB raw.
8. **Haptic output** — Receiver calls `navigator.vibrate()` on Haptic frames. Spring impact → buzz.

### Long-term (neural path)

9. **Train a pixel model** — Collect (signal_state, screenshot) pairs from the springs demo. Train a small CNN/NeRF to predict the framebuffer from signal state. Replace `neuralRender()` with real inference.
10. **Latent space streaming** — Instead of pixels, stream a compressed latent representation (`NeuralLatent`). Receiver runs a lightweight decoder model. ~1 KB/frame for HD content.
11. **Voice input** — Receiver captures microphone audio, streams it as `VoiceInput`. Source interprets it via a speech-to-intent model to drive signals.
12. **Multi-source composition** — Multiple sources stream to the same relay. Receiver composites layers. Each source owns a region or z-layer of the output.
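As an appendix, the 16-byte header from the Binary Protocol section can be sketched in Rust. This is a minimal illustration, not the actual `protocol.rs` code: field names, the little-endian byte order, and the struct shape are assumptions; only the field widths and order come from the diagram above.

```rust
/// Hypothetical sketch of the 16-byte message header.
/// Field order follows the protocol diagram; little-endian is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Header {
    pub msg_type: u8,     // frame or input type code (e.g. 0x01 = Pixels)
    pub flags: u8,        // FLAG_INPUT / FLAG_KEYFRAME / FLAG_COMPRESSED bits
    pub seq: u16,         // sequence number
    pub timestamp: u32,   // sender timestamp
    pub width: u16,       // frame width in pixels
    pub height: u16,      // frame height in pixels
    pub payload_len: u32, // bytes of payload following the header
}

impl Header {
    pub fn encode(&self) -> [u8; 16] {
        let mut b = [0u8; 16];
        b[0] = self.msg_type;
        b[1] = self.flags;
        b[2..4].copy_from_slice(&self.seq.to_le_bytes());
        b[4..8].copy_from_slice(&self.timestamp.to_le_bytes());
        b[8..10].copy_from_slice(&self.width.to_le_bytes());
        b[10..12].copy_from_slice(&self.height.to_le_bytes());
        b[12..16].copy_from_slice(&self.payload_len.to_le_bytes());
        b
    }

    pub fn decode(b: &[u8; 16]) -> Header {
        Header {
            msg_type: b[0],
            flags: b[1],
            seq: u16::from_le_bytes([b[2], b[3]]),
            timestamp: u32::from_le_bytes([b[4], b[5], b[6], b[7]]),
            width: u16::from_le_bytes([b[8], b[9]]),
            height: u16::from_le_bytes([b[10], b[11]]),
            payload_len: u32::from_le_bytes([b[12], b[13], b[14], b[15]]),
        }
    }
}
```

A receiver would read 16 bytes, decode, then read `payload_len` more bytes before dispatching on `msg_type` — which is also why partial-buffer handling appears in the test list.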
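The `DeltaPixels` (`0x03`) mode above is described as "XOR delta + RLE". The sketch below shows the idea under stated assumptions: it is not the `codec.rs` implementation, and the (run-length: u8, value: u8) pair encoding is invented here for illustration. XORing a frame against the previous one turns unchanged regions into long zero runs, which RLE then collapses — hence the small frames for low-motion scenes.

```rust
/// Hypothetical XOR-delta + RLE codec sketch.
/// Encoded form: a sequence of (run_length: u8, xor_value: u8) pairs.
pub fn xor_rle_encode(prev: &[u8], curr: &[u8]) -> Vec<u8> {
    assert_eq!(prev.len(), curr.len());
    let mut out = Vec::new();
    let mut i = 0;
    while i < curr.len() {
        let v = curr[i] ^ prev[i]; // 0 wherever the pixel byte is unchanged
        let mut run: u8 = 1;
        while i + (run as usize) < curr.len()
            && run < u8::MAX
            && (curr[i + run as usize] ^ prev[i + run as usize]) == v
        {
            run += 1;
        }
        out.push(run);
        out.push(v);
        i += run as usize;
    }
    out
}

/// Rebuild the current frame from the previous frame plus the delta.
pub fn xor_rle_decode(prev: &[u8], encoded: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(prev.len());
    for pair in encoded.chunks_exact(2) {
        let (run, v) = (pair[0] as usize, pair[1]);
        for _ in 0..run {
            let idx = out.len(); // decode position tracks the frame offset
            out.push(prev[idx] ^ v);
        }
    }
    out
}
```

Worst case (every byte different from its neighbor) the output doubles in size, which is why the mode table shows Delta spanning 50-300 KB rather than a fixed figure, and why a `FLAG_KEYFRAME` full frame is still needed periodically.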