canvas/docs/arch-gesture-system-v2.md
2026-03-11 18:42:08 -07:00

344 lines
13 KiB
Markdown

# Shared Input System v2 — Architecture
## Overview
v2 is a 4-layer shared input pipeline that combines pointer gestures and keyboard events into one resolver/dispatcher stack. It lives in `packages/canvas/src/gestures/` and replaces the v1/v2 split where pointer logic lived in the gesture runtime while keyboard ownership was still scattered across separate hooks.
```
DOM Events (pointer, touch, wheel, keydown, keyup)
|
v
Layer 1: NORMALIZE NormalizedPointer
| + raw keyboard state
v
Layer 2: RECOGNIZE PointerGestureEvent | KeyInputEvent
| (pointer gestures + direct key events)
v
Layer 3: RESOLVE ResolvedAction
| (priority-sorted context stack + specificity scoring)
v
Layer 4: DISPATCH Side Effects
(phase-aware handlers + provider-owned keyboard ownership)
```
## File Map
| File | Lines | Layer | Responsibility |
|------|-------|-------|----------------|
| `types.ts` | 179 | All | Type definitions (no runtime code) |
| `normalize.ts` | 73 | 1 | PointerEvent -> NormalizedPointer, modifier extraction |
| `timed-state.ts` | 159 | 2 | Pure state machine for tap counting + long-press |
| `timed-state-runner.ts` | 88 | 2 | Timer bridge wrapping pure TimedState |
| `specificity.ts` | 99 | 3 | Pattern-to-event scoring algorithm |
| `mapper.ts` | 159 | 3 | Context stack resolution with bucketed lookup |
| `contexts.ts` | 203 | 3 | Built-in contexts (palm rejection, default, input modes) |
| `dispatcher.ts` | 106 | 4 | Action handler registry + phase routing |
| `inertia.ts` | 138 | 4 | Pure momentum simulation (pan + zoom) |
| `GestureProvider.tsx` | 52 | React | Context provider for shared gesture state |
| `useGestureSystem.ts` | 81 | React | Consumer hook: context stack CRUD + mapping index |
| `useCanvasGestures.ts` | 635 | React | Main viewport hook (drag, pinch, wheel, inertia) |
| `useNodeGestures.ts` | 192 | React | Node-level hook (tap, double-tap, long-press, right-click) |
| `useInputModeGestureContext.ts` | 49 | React | Auto-push/pop contexts when input mode changes |
| `index.ts` | 71 | - | Barrel export |
## Layer 1: Normalize
**Files:** `normalize.ts`
**Pure functions, no DOM listeners.**
```typescript
extractModifiers(e) -> Modifiers { shift, ctrl, alt, meta, custom? }
normalizePointer(e: PointerEvent) -> NormalizedPointer
```
`classifyPointer` (from `core/input-classifier.ts`) determines `InputSource`: `'finger' | 'pencil' | 'mouse'` based on `pointerType` and pressure heuristics.
`Modifiers.custom` remains available for non-keyboard flags only. Held keyboard state is explicit in `heldKeys.byKey` / `heldKeys.byCode`.
## Layer 2: Recognize
**Files:** `timed-state.ts`, `timed-state-runner.ts`
### TimedState Machine (pure)
Per-pointer state machine that tracks tap sequences and long-press timing. Keyboard does not go through this state machine; it enters the runtime as direct `KeyInputEvent`s.
Returns `TimedStateResult` with:
- `state`: next state
- `emit?`: gesture type to emit (`'tap'`, `'double-tap'`, `'triple-tap'`, `'long-press'`)
- `scheduleTimer?` / `cancelTimer?`: timer commands for the caller
```
States: idle -> pressed -> { long-pressed | released } -> idle
down: idle -> pressed (schedule long-press timer)
released -> pressed (increment tapCount)
up: pressed -> released (emit tap/double-tap/triple-tap)
move: pressed -> idle (cancel — becomes drag via @use-gesture)
timer: pressed -> long-pressed (emit long-press)
settle: released -> idle (finalize tap sequence)
cancel: any -> idle (cleanup)
```
**Key design:** The state machine is pure — no `setTimeout` inside. The `TimedStateRunner` class manages actual timers and calls `transition()` on timer fire.
### @use-gesture Integration
`@use-gesture/react` handles drag, pinch, and wheel recognition. The gesture hooks configure it with source-aware thresholds:
| Source | Drag threshold | Tap threshold |
|--------|---------------|---------------|
| finger | 10px | 10px |
| pencil | 2px | 3px |
| mouse | 3px | 5px |
## Layer 3: Resolve
**Files:** `specificity.ts`, `mapper.ts`, `contexts.ts`
### Specificity Scoring
Scores how well an `InputPattern` matches an `InputEvent`. Higher = more specific. Returns `-1` on mismatch.
| Dimension | Score | Why |
|-----------|-------|-----|
| `type` | 128 | Must outweigh all others combined |
| `key` | 64 | Specific key binding |
| `code` | 64 | Physical key binding |
| `subjectKind` | 32 | Narrows to element category |
| `modifiers` (positive, e.g. `shift: true`) | 16 each | Requires key pressed |
| `modifiers` (negative, e.g. `shift: false`) | 8 each | Requires key NOT pressed |
| `heldKeys` (positive) | 16 each | Held-key keyboard context |
| `heldKeys` (negative) | 8 each | Explicit absence of held key |
| `source` | 4 | Input device |
| `button` | 2 | Mouse button |
Keyboard matching is not encoded as `GestureType`. It matches `kind: 'key'` plus `phase`, `key`, and/or `code`.
Max possible: 128 + 32 + N*16 + 4 + 2 = 166 + N*16
### MappingIndex
Contexts are indexed by pointer gesture type or keyboard phase into buckets for O(1) lookup:
```typescript
type MappingIndex = ContextIndex[] // sorted by priority ascending (0 = highest)
interface ContextIndex {
contextId: string
priority: number
enabled: boolean
buckets: Map<string | '__wildcard__', Binding[]>
}
```
Pointer buckets use `pointer:${type}`. Keyboard buckets use `key:${phase}`.
### Resolution Algorithm
```
resolve(event, index, guardContext):
for each context (priority ascending = highest first):
skip if !enabled
lookup bucket for event.type (+ wildcard bucket)
for each binding in bucket:
score = specificity(pattern, event)
skip if -1 (no match)
skip if binding.when(guard) returns false
track best match (highest score)
if best match found:
if consumeInput → return immediately (block lower contexts)
else → record as candidate
return highest-priority candidate or null
```
### Built-in Contexts
**Palm Rejection** (priority 0 — highest):
- When stylus is active, blocks all finger taps/long-press via `when: ctx => ctx.isStylusActive`
- Remaps finger drag-on-node to pan
- Pinch + scroll intentionally NOT blocked
**Default** (priority 100 — lowest):
- Taps: select node/edge, clear selection, shift-toggle
- Drags: move-node, pan (per source), lasso/rect-select
- Held-key pointer modifiers: `Space+drag`
- Right-click: context-menu
- Double/triple-tap: fit-to-view, toggle-lock
- Long-press: context-menu on node, create-node on background
- Pinch: zoom on background, split-node on node
- Scroll: zoom
- Keyboard shortcuts: search, clipboard, history, delete, escape
**Shared keyboard contexts**:
- active-interaction cancellation
- search navigation
- keyboard navigate mode
- keyboard manipulate mode
**Input Mode Contexts** (priority 5):
- `PICK_NODE_CONTEXT`: tap-node -> resolve-pick-node
- `PICK_NODES_CONTEXT`: tap-node -> resolve-pick-node, double-tap-bg -> finish
- `PICK_POINT_CONTEXT`: tap-anywhere -> resolve-pick-point
### GuardContext
Runtime state available to `when` predicates on bindings:
```typescript
interface GuardContext {
isStylusActive: boolean
fingerCount: number
isDragging: boolean
isResizing: boolean
isSplitting: boolean
inputMode: InputMode
keyboardInteractionMode: 'navigate' | 'manipulate'
selectedNodeIds: ReadonlySet<string>
focusedNodeId: string | null
isSearchActive: boolean
commandLineVisible: boolean
heldKeys: HeldKeysState
custom: Record<string, unknown> // extensible
}
```
Rebuilt from Jotai atoms each render (cheap — just reads).
## Layer 4: Dispatch
**Files:** `dispatcher.ts`, `inertia.ts`
### Action Handlers
Two forms:
```typescript
// Simple pointer handler — fires on pointer 'start' and 'instant'
registerAction('select-node', (event: PointerGestureEvent) => { ... })
// Keyboard handler — fires on key 'down'
registerAction('delete-selection', (event: KeyInputEvent) => { ... })
// Phase-aware — different logic per phase (for continuous gestures)
registerAction('pan', {
onStart: (e) => { ... },
onMove: (e) => { ... },
onEnd: (e) => { startInertia(e.velocity) },
onCancel: () => { cancelInertia() },
})
```
Handler registry is a plain `Map<string, ActionHandler>` — not Jotai atoms. Synchronous, immediate invocation.
`'none'` actionId is a built-in no-op (used for intentional blocking in palm rejection).
### Inertia
Pure engines with no `requestAnimationFrame` dependency — the hook manages the rAF loop.
- `VelocitySampler`: ring buffer of recent velocity samples
- `PanInertia`: friction-decayed momentum (`friction = 0.92`)
- `ZoomInertia`: friction-decayed zoom with snap-to-100% (`friction = 0.88`, `snapThreshold = 0.03`)
## React Integration
### Component Hierarchy
```
Canvas.tsx
└─ GestureProvider (holds GestureSystemAPI + onAction callback)
├─ Viewport.tsx
│ └─ useCanvasGestures(ref, mappingIndex, onAction)
│ ├─ @use-gesture for drag/pinch/wheel
│ ├─ TimedStateRunner for background taps
│ ├─ Inertia engines for momentum
│ └─ resolve() + dispatch() on every gesture event
└─ Node.tsx (for each node)
└─ useNodeGestures(nodeId, mappingIndex, onAction)
├─ TimedStateRunner for node taps
└─ resolve() + dispatch() for tap/double-tap/long-press/right-click
```
### MappingIndex Flow
```
useGestureSystem()
builds: [palmRejection?] + [staticContexts] + [dynamicContexts] + DEFAULT_CONTEXT
useMemo → buildMappingIndex() → MappingIndex
GestureProvider holds in React context
Viewport reads via useGestureContext() → passes to useCanvasGestures
Node reads via useGestureContext() → passes to useNodeGestures
```
Both hooks call `resolve(event, mappingIndex, guardContext)` with the same index — consistent resolution across viewport and node-level gestures.
### Consumer API
```typescript
const gestureSystem = useGestureSystem({ palmRejection: true })
// Push a custom context at runtime
gestureSystem.pushContext({
id: 'drawing-mode',
priority: 50,
enabled: true,
bindings: [
{ id: 'pencil-draw', pattern: { type: 'drag', source: 'pencil' }, actionId: 'draw', consumeInput: true },
],
})
// Remove it
gestureSystem.removeContext('drawing-mode')
```
## Pointer Ownership
- Each active pointer is owned by at most one gesture stream
- When `@use-gesture` recognizes a pinch (2+ pointers), single-pointer drags are cancelled
- Node drag uses `setPointerCapture` with `capture: false` on viewport to prevent event stealing
- Local gesture scopes (resize handles, edge creation, split) maintain their own pointer maps and suppress pipeline gestures via GuardContext flags
## Relationship to v1
Two action systems coexist:
| | v1 (`core/action-registry.ts`) | v2 (`gestures/dispatcher.ts`) |
|-|------|------|
| **Handler type** | `(ActionContext, ActionHelpers) => void` | `(GestureEvent) => void` or PhaseHandler |
| **Registration** | `ActionDefinition` objects with metadata | Simple `registerAction(id, handler)` |
| **Execution** | `executeAction(id, context, helpers)` | `dispatch(event, resolution)` |
| **Scope** | Commands, keyboard shortcuts, menu items | Pointer/touch gesture pipeline |
v1 actions (`BuiltInActionId`) have rich metadata (label, description, category, icon) and receive `ActionHelpers` for state manipulation. v2 handlers are lightweight — just the gesture event.
Both registries are global Maps. They can reference the same action IDs — a gesture resolution can trigger a v1 action by ID.
## Test Coverage
| Test | Module | What |
|------|--------|------|
| `specificity.test.ts` | Scoring | Pattern matching, weight ordering, mismatch rejection |
| `timed-state.test.ts` | State machine | Transitions, tap counting, long-press, cancel |
| `mapper.test.ts` | Resolution | Bucketing, priority, consumeInput, guard evaluation |
| `inertia.test.ts` | Momentum | Friction decay, velocity sampling, snap-to-zoom |
| `palm-rejection.test.ts` | Palm rejection | Stylus blocking, pinch passthrough |
All tests are pure unit tests (no DOM, no React) — run via `pnpm exec vitest run` from `packages/canvas/`.
## Key Design Principles
1. **Pure core, imperative shell**: normalize, timed-state, specificity, mapper, inertia are all pure functions. DOM listeners, timers, and rAF live in hooks.
2. **Refs for hot path**: Drag position, velocity samples, inertia engines use `useRef` — no re-renders during 60fps gesture streams.
3. **Specificity over priority**: Within a context, the most specific match wins automatically. Between contexts, explicit priority ordering resolves conflicts.
4. **consumeInput for hard blocks**: Palm rejection doesn't "disable" finger input — it matches finger gestures at priority 0 with `consumeInput: true`, preventing default context from seeing them.
5. **Context stack is data**: MappingContexts are plain objects. Push/pop at runtime for modal states (pick mode, edge creation, drawing mode) without touching handler code.
6. **Custom modifiers for extensibility**: `Modifiers.custom` bag supports arbitrary named modifiers (iPad toolbar buttons, held keys) with full specificity scoring — no special-casing needed.