Shared Input System v2 — Architecture
Overview
v2 is a 4-layer shared input pipeline that combines pointer gestures and keyboard events into one resolver/dispatcher stack. It lives in packages/canvas/src/gestures/ and replaces the v1/v2 split where pointer logic lived in the gesture runtime while keyboard ownership was still scattered across separate hooks.
DOM Events (pointer, touch, wheel, keydown, keyup)
        |
        v
Layer 1: NORMALIZE      NormalizedPointer
        |               + raw keyboard state
        v
Layer 2: RECOGNIZE      PointerGestureEvent | KeyInputEvent
        |               (pointer gestures + direct key events)
        v
Layer 3: RESOLVE        ResolvedAction
        |               (priority-sorted context stack + specificity scoring)
        v
Layer 4: DISPATCH       Side Effects
                        (phase-aware handlers + provider-owned keyboard ownership)
File Map
| File | Lines | Layer | Responsibility |
|---|---|---|---|
| types.ts | 179 | All | Type definitions (no runtime code) |
| normalize.ts | 73 | 1 | PointerEvent -> NormalizedPointer, modifier extraction |
| timed-state.ts | 159 | 2 | Pure state machine for tap counting + long-press |
| timed-state-runner.ts | 88 | 2 | Timer bridge wrapping pure TimedState |
| specificity.ts | 99 | 3 | Pattern-to-event scoring algorithm |
| mapper.ts | 159 | 3 | Context stack resolution with bucketed lookup |
| contexts.ts | 203 | 3 | Built-in contexts (palm rejection, default, input modes) |
| dispatcher.ts | 106 | 4 | Action handler registry + phase routing |
| inertia.ts | 138 | 4 | Pure momentum simulation (pan + zoom) |
| GestureProvider.tsx | 52 | React | Context provider for shared gesture state |
| useGestureSystem.ts | 81 | React | Consumer hook: context stack CRUD + mapping index |
| useCanvasGestures.ts | 635 | React | Main viewport hook (drag, pinch, wheel, inertia) |
| useNodeGestures.ts | 192 | React | Node-level hook (tap, double-tap, long-press, right-click) |
| useInputModeGestureContext.ts | 49 | React | Auto-push/pop contexts when input mode changes |
| index.ts | 71 | - | Barrel export |
Layer 1: Normalize
Files: normalize.ts
Pure functions, no DOM listeners.
extractModifiers(e) -> Modifiers { shift, ctrl, alt, meta, custom? }
normalizePointer(e: PointerEvent) -> NormalizedPointer
classifyPointer (from core/input-classifier.ts) determines InputSource: 'finger' | 'pencil' | 'mouse' based on pointerType and pressure heuristics.
Modifiers.custom remains available for non-keyboard flags only. Held keyboard state is explicit in heldKeys.byKey / heldKeys.byCode.
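As a rough illustration of Layer 1, the sketch below shows what a pure normalize step could look like. The field names on NormalizedPointer, the PointerLike stand-in for the DOM PointerEvent, and classifyPointerSketch are assumptions made for this example; the real shapes live in types.ts and core/input-classifier.ts, and the real classifier also uses pressure heuristics that are not reproduced here.

```typescript
// Hypothetical shapes for illustration; the real definitions live in types.ts.
interface Modifiers {
  shift: boolean
  ctrl: boolean
  alt: boolean
  meta: boolean
  custom?: Record<string, boolean>
}

type InputSource = 'finger' | 'pencil' | 'mouse'

interface NormalizedPointer {
  pointerId: number
  x: number
  y: number
  source: InputSource
  modifiers: Modifiers
}

// Structural subset of the DOM PointerEvent, so the sketch runs without a DOM.
interface PointerLike {
  pointerId: number
  clientX: number
  clientY: number
  pointerType: string
  pressure: number
  shiftKey: boolean
  ctrlKey: boolean
  altKey: boolean
  metaKey: boolean
}

// Copies the four standard modifier flags; custom flags are left unset.
function extractModifiers(e: PointerLike): Modifiers {
  return { shift: e.shiftKey, ctrl: e.ctrlKey, alt: e.altKey, meta: e.metaKey }
}

// Stand-in for classifyPointer: maps pointerType only and ignores pressure.
function classifyPointerSketch(e: PointerLike): InputSource {
  if (e.pointerType === 'pen') return 'pencil'
  if (e.pointerType === 'touch') return 'finger'
  return 'mouse'
}

function normalizePointer(e: PointerLike): NormalizedPointer {
  return {
    pointerId: e.pointerId,
    x: e.clientX,
    y: e.clientY,
    source: classifyPointerSketch(e),
    modifiers: extractModifiers(e),
  }
}
```

Because nothing here registers listeners or reads global state, the function can be exercised with plain object literals in unit tests.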
Layer 2: Recognize
Files: timed-state.ts, timed-state-runner.ts
TimedState Machine (pure)
Per-pointer state machine that tracks tap sequences and long-press timing. Keyboard does not go through this state machine; it enters the runtime as direct KeyInputEvents.
Returns TimedStateResult with:
- state: next state
- emit?: gesture type to emit ('tap', 'double-tap', 'triple-tap', 'long-press')
- scheduleTimer?/cancelTimer?: timer commands for the caller
States: idle -> pressed -> { long-pressed | released } -> idle
down: idle -> pressed (schedule long-press timer)
released -> pressed (increment tapCount)
up: pressed -> released (emit tap/double-tap/triple-tap)
move: pressed -> idle (cancel — becomes drag via @use-gesture)
timer: pressed -> long-pressed (emit long-press)
settle: released -> idle (finalize tap sequence)
cancel: any -> idle (cleanup)
Key design: The state machine is pure — no setTimeout inside. The TimedStateRunner class manages actual timers and calls transition() on timer fire.
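The transition table above can be sketched as a single pure function. This is a minimal version under stated assumptions: the state shape, the timer names ('long-press' and 'settle'), the long-pressed -> idle transition on up, and clamping counts above three to 'triple-tap' are all guesses made for the example, not the contents of timed-state.ts.

```typescript
type TapGesture = 'tap' | 'double-tap' | 'triple-tap' | 'long-press'
type TimerName = 'long-press' | 'settle' // hypothetical timer names
type TimedEvent = 'down' | 'up' | 'move' | 'timer' | 'settle' | 'cancel'

type TimedState =
  | { kind: 'idle' }
  | { kind: 'pressed'; tapCount: number }
  | { kind: 'long-pressed' }
  | { kind: 'released'; tapCount: number }

interface TimedStateResult {
  state: TimedState
  emit?: TapGesture
  scheduleTimer?: TimerName
  cancelTimer?: TimerName
}

const IDLE: TimedState = { kind: 'idle' }

// Assumption: counts above three keep reporting 'triple-tap'.
function tapName(count: number): TapGesture {
  if (count <= 1) return 'tap'
  return count === 2 ? 'double-tap' : 'triple-tap'
}

// Pure: returns the next state plus timer commands; never touches setTimeout.
function transition(state: TimedState, event: TimedEvent): TimedStateResult {
  if (event === 'cancel') return { state: IDLE } // caller clears its timers
  switch (state.kind) {
    case 'idle':
      if (event === 'down') {
        return { state: { kind: 'pressed', tapCount: 1 }, scheduleTimer: 'long-press' }
      }
      break
    case 'pressed':
      if (event === 'up') {
        return {
          state: { kind: 'released', tapCount: state.tapCount },
          emit: tapName(state.tapCount),
          cancelTimer: 'long-press',
          scheduleTimer: 'settle',
        }
      }
      if (event === 'move') return { state: IDLE, cancelTimer: 'long-press' }
      if (event === 'timer') return { state: { kind: 'long-pressed' }, emit: 'long-press' }
      break
    case 'released':
      if (event === 'down') {
        return {
          state: { kind: 'pressed', tapCount: state.tapCount + 1 },
          cancelTimer: 'settle',
          scheduleTimer: 'long-press',
        }
      }
      if (event === 'settle') return { state: IDLE }
      break
    case 'long-pressed':
      if (event === 'up') return { state: IDLE } // assumption: release ends the press
      break
  }
  return { state } // events that do not apply leave the state unchanged
}
```

A runner in the style of TimedStateRunner would own the real timers, translate scheduleTimer/cancelTimer into setTimeout/clearTimeout calls, and feed 'timer' or 'settle' back into transition() when they fire.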
@use-gesture Integration
@use-gesture/react handles drag, pinch, and wheel recognition. The gesture hooks configure it with source-aware thresholds:
| Source | Drag threshold | Tap threshold |
|---|---|---|
| finger | 10px | 10px |
| pencil | 2px | 3px |
| mouse | 3px | 5px |
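The threshold table can be expressed as a lookup keyed by input source. The sketch below is illustrative only: the THRESHOLDS constant, the isDrag helper, and the choice of an inclusive comparison on Euclidean distance are assumptions, and the real hooks pass their thresholds to @use-gesture rather than computing distances themselves.

```typescript
type InputSource = 'finger' | 'pencil' | 'mouse'

interface SourceThresholds {
  drag: number // px of movement before a press becomes a drag
  tap: number // px of movement still tolerated for a tap
}

// Values copied from the table above.
const THRESHOLDS: Record<InputSource, SourceThresholds> = {
  finger: { drag: 10, tap: 10 },
  pencil: { drag: 2, tap: 3 },
  mouse: { drag: 3, tap: 5 },
}

// Hypothetical helper: has the pointer moved far enough to count as a drag?
function isDrag(source: InputSource, dx: number, dy: number): boolean {
  return Math.hypot(dx, dy) >= THRESHOLDS[source].drag
}
```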
Layer 3: Resolve
Files: specificity.ts, mapper.ts, contexts.ts
Specificity Scoring
Scores how well an InputPattern matches an InputEvent. Higher = more specific. Returns -1 on mismatch.
| Dimension | Score | Why |
|---|---|---|
| type | 128 | Must outweigh all others combined |
| key | 64 | Specific key binding |
| code | 64 | Physical key binding |
| subjectKind | 32 | Narrows to element category |
| modifiers (positive, e.g. shift: true) | 16 each | Requires key pressed |
| modifiers (negative, e.g. shift: false) | 8 each | Requires key NOT pressed |
| heldKeys (positive) | 16 each | Held-key keyboard context |
| heldKeys (negative) | 8 each | Explicit absence of held key |
| source | 4 | Input device |
| button | 2 | Mouse button |
Keyboard matching is not encoded as GestureType. It matches kind: 'key' plus phase, key, and/or code.
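Max possible, excluding key and code: 128 + 32 + N × 16 + 4 + 2 = 166 + N × 16, where N is the number of positive modifier and held-key constraints.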
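To make the weights concrete, here is a reduced scoring sketch that covers only the pointer-side dimensions (type, subjectKind, modifiers, source, button). The type names InputEventLike and InputPatternLike are invented for the example, and key, code, and heldKeys are left out for brevity; specificity.ts may structure this differently.

```typescript
type ModifierName = 'shift' | 'ctrl' | 'alt' | 'meta'
const MODIFIER_NAMES: ModifierName[] = ['shift', 'ctrl', 'alt', 'meta']

// Hypothetical, simplified event and pattern shapes.
interface InputEventLike {
  type: string
  subjectKind?: string
  source?: string
  button?: number
  modifiers: Record<ModifierName, boolean>
}

interface InputPatternLike {
  type: string
  subjectKind?: string
  source?: string
  button?: number
  modifiers?: Partial<Record<ModifierName, boolean>>
}

// Returns -1 on any mismatch, otherwise the sum of the matched weights.
function specificity(pattern: InputPatternLike, event: InputEventLike): number {
  if (pattern.type !== event.type) return -1
  let score = 128

  if (pattern.subjectKind !== undefined) {
    if (pattern.subjectKind !== event.subjectKind) return -1
    score += 32
  }

  for (const name of MODIFIER_NAMES) {
    const required = pattern.modifiers?.[name]
    if (required === undefined) continue // pattern does not care about this key
    if (event.modifiers[name] !== required) return -1
    score += required ? 16 : 8 // positive constraints outrank negative ones
  }

  if (pattern.source !== undefined) {
    if (pattern.source !== event.source) return -1
    score += 4
  }

  if (pattern.button !== undefined) {
    if (pattern.button !== event.button) return -1
    score += 2
  }

  return score
}
```

With these weights, a shift-tap on a node matched by { type: 'tap', subjectKind: 'node', modifiers: { shift: true } } scores 128 + 32 + 16 = 176 and beats a bare { type: 'tap' } pattern at 128.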
MappingIndex
Contexts are indexed by pointer gesture type or keyboard phase into buckets for O(1) lookup:
type MappingIndex = ContextIndex[] // sorted by priority ascending (0 = highest)

interface ContextIndex {
  contextId: string
  priority: number
  enabled: boolean
  buckets: Map<string | '__wildcard__', Binding[]>
}
Pointer buckets use pointer:${type}. Keyboard buckets use key:${phase}.
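A possible shape for the index builder is sketched below. The PatternLike fields and the rule that a pattern without a type or phase lands in the wildcard bucket are assumptions for illustration; buildMappingIndex in mapper.ts may differ in detail.

```typescript
// Hypothetical, simplified pattern: pointer patterns carry `type`, key patterns carry `phase`.
interface PatternLike {
  kind?: 'pointer' | 'key'
  type?: string
  phase?: string
}

interface Binding {
  id: string
  pattern: PatternLike
  actionId: string
}

interface MappingContext {
  id: string
  priority: number
  enabled: boolean
  bindings: Binding[]
}

interface ContextIndex {
  contextId: string
  priority: number
  enabled: boolean
  buckets: Map<string, Binding[]>
}

type MappingIndex = ContextIndex[]

const WILDCARD = '__wildcard__'

// Assumption: patterns that omit type/phase are stored in the wildcard bucket.
function bucketKey(pattern: PatternLike): string {
  if (pattern.kind === 'key') {
    return pattern.phase !== undefined ? `key:${pattern.phase}` : WILDCARD
  }
  return pattern.type !== undefined ? `pointer:${pattern.type}` : WILDCARD
}

function buildMappingIndex(contexts: MappingContext[]): MappingIndex {
  return contexts
    .map((ctx) => {
      const buckets = new Map<string, Binding[]>()
      for (const binding of ctx.bindings) {
        const key = bucketKey(binding.pattern)
        const bucket = buckets.get(key) ?? []
        bucket.push(binding)
        buckets.set(key, bucket)
      }
      return { contextId: ctx.id, priority: ctx.priority, enabled: ctx.enabled, buckets }
    })
    .sort((a, b) => a.priority - b.priority) // ascending: priority 0 comes first
}
```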
Resolution Algorithm
resolve(event, index, guardContext):
  for each context (priority ascending = highest first):
    skip if !enabled
    lookup bucket for event.type (+ wildcard bucket)
    for each binding in bucket:
      score = specificity(pattern, event)
      skip if -1 (no match)
      skip if binding.when(guard) returns false
      track best match (highest score)
    if best match found:
      if consumeInput → return immediately (block lower contexts)
      else → record as candidate
  return highest-priority candidate or null
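The pseudocode above translates fairly directly into TypeScript. The sketch below follows it step by step but uses a reduced stand-in score() function, pointer buckets only, and a one-field GuardContext; the ResolvedAction shape is also an assumption made for the example.

```typescript
interface EventLike { type: string; source?: string }
interface GuardContext { isStylusActive: boolean }

interface Binding {
  id: string
  pattern: { type?: string; source?: string }
  actionId: string
  consumeInput?: boolean
  when?: (guard: GuardContext) => boolean
}

interface ContextIndex {
  contextId: string
  priority: number
  enabled: boolean
  buckets: Map<string, Binding[]>
}

// Hypothetical result shape.
interface ResolvedAction { actionId: string; bindingId: string; contextId: string; score: number }

// Stand-in for the real specificity scorer: only type (128) and source (4).
function score(pattern: Binding['pattern'], event: EventLike): number {
  let total = 0
  if (pattern.type !== undefined) {
    if (pattern.type !== event.type) return -1
    total += 128
  }
  if (pattern.source !== undefined) {
    if (pattern.source !== event.source) return -1
    total += 4
  }
  return total
}

// `index` is assumed to be sorted by priority ascending (0 = highest).
function resolve(event: EventLike, index: ContextIndex[], guard: GuardContext): ResolvedAction | null {
  let candidate: ResolvedAction | null = null
  for (const ctx of index) {
    if (!ctx.enabled) continue
    const typed = ctx.buckets.get(`pointer:${event.type}`) ?? []
    const wildcard = ctx.buckets.get('__wildcard__') ?? []
    let best: { binding: Binding; score: number } | null = null
    for (const binding of typed.concat(wildcard)) {
      const s = score(binding.pattern, event)
      if (s < 0) continue // pattern does not match
      if (binding.when && !binding.when(guard)) continue // guard rejected it
      if (best === null || s > best.score) best = { binding, score: s }
    }
    if (best === null) continue
    const resolved: ResolvedAction = {
      actionId: best.binding.actionId,
      bindingId: best.binding.id,
      contextId: ctx.contextId,
      score: best.score,
    }
    if (best.binding.consumeInput) return resolved // hard block: lower contexts never run
    if (candidate === null) candidate = resolved // first hit is the highest-priority one
  }
  return candidate
}
```

With a palm-rejection context at priority 0 that maps finger taps to 'none' under when: g => g.isStylusActive and consumeInput: true, a finger tap resolves to 'none' while the stylus is active and falls through to the default context otherwise.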
Built-in Contexts
Palm Rejection (priority 0 — highest):
- When stylus is active, blocks all finger taps/long-press via when: ctx => ctx.isStylusActive
- Remaps finger drag-on-node to pan
- Pinch + scroll intentionally NOT blocked
Default (priority 100 — lowest):
- Taps: select node/edge, clear selection, shift-toggle
- Drags: move-node, pan (per source), lasso/rect-select
- Held-key pointer modifiers: Space+drag
- Right-click: context-menu
- Double/triple-tap: fit-to-view, toggle-lock
- Long-press: context-menu on node, create-node on background
- Pinch: zoom on background, split-node on node
- Scroll: zoom
- Keyboard shortcuts: search, clipboard, history, delete, escape
Shared keyboard contexts:
- active-interaction cancellation
- search navigation
- keyboard navigate mode
- keyboard manipulate mode
Input Mode Contexts (priority 5):
- PICK_NODE_CONTEXT: tap-node -> resolve-pick-node
- PICK_NODES_CONTEXT: tap-node -> resolve-pick-node, double-tap-bg -> finish
- PICK_POINT_CONTEXT: tap-anywhere -> resolve-pick-point
GuardContext
Runtime state available to when predicates on bindings:
interface GuardContext {
  isStylusActive: boolean
  fingerCount: number
  isDragging: boolean
  isResizing: boolean
  isSplitting: boolean
  inputMode: InputMode
  keyboardInteractionMode: 'navigate' | 'manipulate'
  selectedNodeIds: ReadonlySet<string>
  focusedNodeId: string | null
  isSearchActive: boolean
  commandLineVisible: boolean
  heldKeys: HeldKeysState
  custom: Record<string, unknown> // extensible
}
Rebuilt from Jotai atoms each render (cheap — just reads).
Layer 4: Dispatch
Files: dispatcher.ts, inertia.ts
Action Handlers
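Two forms: a plain function (for pointer or keyboard events), or a phase-aware object for continuous gestures: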
// Simple pointer handler: fires on pointer 'start' and 'instant'
registerAction('select-node', (event: PointerGestureEvent) => { ... })

// Keyboard handler: fires on key 'down'
registerAction('delete-selection', (event: KeyInputEvent) => { ... })

// Phase-aware: different logic per phase (for continuous gestures)
registerAction('pan', {
  onStart: (e) => { ... },
  onMove: (e) => { ... },
  onEnd: (e) => { startInertia(e.velocity) },
  onCancel: () => { cancelInertia() },
})
Handler registry is a plain Map<string, ActionHandler> — not Jotai atoms. Synchronous, immediate invocation.
'none' actionId is a built-in no-op (used for intentional blocking in palm rejection).
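The registry and phase routing described above could be sketched roughly as follows. The Phase union, the GestureEventLike shape, the boolean return value, and the decision to ignore 'instant', 'down', and 'up' for phase-aware handlers are assumptions made for this example rather than the behaviour of dispatcher.ts.

```typescript
// Hypothetical phase set: pointer phases plus key 'down' / 'up'.
type Phase = 'start' | 'move' | 'end' | 'cancel' | 'instant' | 'down' | 'up'

interface GestureEventLike { type: string; phase: Phase }

type SimpleHandler = (event: GestureEventLike) => void

interface PhaseHandler {
  onStart?: (event: GestureEventLike) => void
  onMove?: (event: GestureEventLike) => void
  onEnd?: (event: GestureEventLike) => void
  onCancel?: () => void
}

type ActionHandler = SimpleHandler | PhaseHandler

// Plain Map registry: synchronous lookups, no reactive state involved.
const handlers = new Map<string, ActionHandler>()

function registerAction(id: string, handler: ActionHandler): void {
  handlers.set(id, handler)
}

// Returns true when the action was recognised (including the 'none' no-op).
function dispatch(event: GestureEventLike, resolution: { actionId: string }): boolean {
  if (resolution.actionId === 'none') return true // built-in no-op
  const handler = handlers.get(resolution.actionId)
  if (handler === undefined) return false

  if (typeof handler === 'function') {
    // Simple handlers fire once: pointer 'start' / 'instant', or key 'down'.
    if (event.phase === 'start' || event.phase === 'instant' || event.phase === 'down') {
      handler(event)
    }
    return true
  }

  switch (event.phase) {
    case 'start': handler.onStart?.(event); break
    case 'move': handler.onMove?.(event); break
    case 'end': handler.onEnd?.(event); break
    case 'cancel': handler.onCancel?.(); break
    default: break // assumption: other phases are ignored by phase-aware handlers
  }
  return true
}
```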
Inertia
Pure engines with no requestAnimationFrame dependency — the hook manages the rAF loop.
- VelocitySampler: ring buffer of recent velocity samples
- PanInertia: friction-decayed momentum (friction = 0.92)
- ZoomInertia: friction-decayed zoom with snap-to-100% (friction = 0.88, snapThreshold = 0.03)
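Friction decay of this kind can be sketched as a pure per-frame step. The version below assumes velocity is expressed in pixels per frame, that motion stops below a made-up minSpeed cutoff, and that zoom snapping is a simple distance check against 1.0; stepPan and snapZoom are illustrative names, not the API of inertia.ts.

```typescript
interface Vec2 { x: number; y: number }

const PAN_FRICTION = 0.92
const ZOOM_SNAP_THRESHOLD = 0.03

// One frame of pan momentum: returns the displacement to apply and the decayed
// velocity, or null once the motion has effectively stopped.
function stepPan(
  velocity: Vec2,
  friction: number = PAN_FRICTION,
  minSpeed: number = 0.01, // hypothetical rest cutoff
): { delta: Vec2; velocity: Vec2 } | null {
  if (Math.hypot(velocity.x, velocity.y) < minSpeed) return null
  return {
    delta: { x: velocity.x, y: velocity.y },
    velocity: { x: velocity.x * friction, y: velocity.y * friction },
  }
}

// Snap-to-100%: a zoom level close enough to 1.0 is replaced by exactly 1.
function snapZoom(zoom: number, threshold: number = ZOOM_SNAP_THRESHOLD): number {
  return Math.abs(zoom - 1) <= threshold ? 1 : zoom
}
```

A hook would call stepPan once per requestAnimationFrame tick, apply delta to the viewport, and stop the loop when it returns null. With friction 0.92 the total travel approaches v / (1 - 0.92) = 12.5 × the initial per-frame velocity.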
React Integration
Component Hierarchy
Canvas.tsx
└─ GestureProvider (holds GestureSystemAPI + onAction callback)
├─ Viewport.tsx
│ └─ useCanvasGestures(ref, mappingIndex, onAction)
│ ├─ @use-gesture for drag/pinch/wheel
│ ├─ TimedStateRunner for background taps
│ ├─ Inertia engines for momentum
│ └─ resolve() + dispatch() on every gesture event
│
└─ Node.tsx (for each node)
└─ useNodeGestures(nodeId, mappingIndex, onAction)
├─ TimedStateRunner for node taps
└─ resolve() + dispatch() for tap/double-tap/long-press/right-click
MappingIndex Flow
useGestureSystem()
builds: [palmRejection?] + [staticContexts] + [dynamicContexts] + DEFAULT_CONTEXT
useMemo → buildMappingIndex() → MappingIndex
↓
GestureProvider holds in React context
↓
Viewport reads via useGestureContext() → passes to useCanvasGestures
Node reads via useGestureContext() → passes to useNodeGestures
Both hooks call resolve(event, mappingIndex, guardContext) with the same index — consistent resolution across viewport and node-level gestures.
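The composition order in the flow above can be written as a small pure function. In this sketch the context IDs ('palm-rejection', 'default'), the option names, and the replace-on-duplicate-id behaviour of pushContext are assumptions; only the ordering and the priorities 0 and 100 come from this document.

```typescript
interface MappingContext {
  id: string
  priority: number
  enabled: boolean
}

// Placeholder built-ins; the real ones in contexts.ts also carry bindings.
const PALM_REJECTION_CONTEXT: MappingContext = { id: 'palm-rejection', priority: 0, enabled: true }
const DEFAULT_CONTEXT: MappingContext = { id: 'default', priority: 100, enabled: true }

interface ComposeOptions {
  palmRejection?: boolean
  staticContexts?: MappingContext[]
  dynamicContexts?: MappingContext[]
}

// [palmRejection?] + [staticContexts] + [dynamicContexts] + DEFAULT_CONTEXT
function composeContexts(options: ComposeOptions): MappingContext[] {
  return [
    ...(options.palmRejection ? [PALM_REJECTION_CONTEXT] : []),
    ...(options.staticContexts ?? []),
    ...(options.dynamicContexts ?? []),
    DEFAULT_CONTEXT,
  ]
}

// Immutable stack helpers, so a React hook can detect changes by reference.
function pushContext(stack: MappingContext[], ctx: MappingContext): MappingContext[] {
  return [...stack.filter((c) => c.id !== ctx.id), ctx] // assumption: same id replaces
}

function removeContext(stack: MappingContext[], id: string): MappingContext[] {
  return stack.filter((c) => c.id !== id)
}
```

Returning new arrays from pushContext and removeContext is what would let a useMemo around the index builder recompute only when the context list actually changes.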
Consumer API
const gestureSystem = useGestureSystem({ palmRejection: true })

// Push a custom context at runtime
gestureSystem.pushContext({
  id: 'drawing-mode',
  priority: 50,
  enabled: true,
  bindings: [
    { id: 'pencil-draw', pattern: { type: 'drag', source: 'pencil' }, actionId: 'draw', consumeInput: true },
  ],
})

// Remove it
gestureSystem.removeContext('drawing-mode')
Pointer Ownership
- Each active pointer is owned by at most one gesture stream
- When @use-gesture recognizes a pinch (2+ pointers), single-pointer drags are cancelled
- Node drag uses setPointerCapture with capture: false on viewport to prevent event stealing
- Local gesture scopes (resize handles, edge creation, split) maintain their own pointer maps and suppress pipeline gestures via GuardContext flags
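The "at most one owner per pointer" rule can be modelled with a map from pointer ID to the owning gesture stream. The sketch below is purely illustrative: the stream names and the claimPointer, releasePointer, and takeOverForPinch helpers are hypothetical and do not describe how the hooks or @use-gesture track ownership internally.

```typescript
// pointerId -> name of the gesture stream that currently owns it
type PointerOwners = Map<number, string>

// Claims a pointer for a stream; fails if another stream already owns it.
function claimPointer(owners: PointerOwners, pointerId: number, stream: string): boolean {
  const current = owners.get(pointerId)
  if (current !== undefined && current !== stream) return false
  owners.set(pointerId, stream)
  return true
}

function releasePointer(owners: PointerOwners, pointerId: number): void {
  owners.delete(pointerId)
}

// A pinch takes over all of its pointers. Returns the streams that lost a
// pointer so the caller can cancel their in-flight single-pointer gestures.
function takeOverForPinch(owners: PointerOwners, pointerIds: number[]): string[] {
  const cancelled: string[] = []
  for (const id of pointerIds) {
    const current = owners.get(id)
    if (current !== undefined && current !== 'pinch' && cancelled.indexOf(current) === -1) {
      cancelled.push(current)
    }
    owners.set(id, 'pinch')
  }
  return cancelled
}
```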
Relationship to v1
Two action systems coexist:
| | v1 (core/action-registry.ts) | v2 (gestures/dispatcher.ts) |
|---|---|---|
| Handler type | (ActionContext, ActionHelpers) => void | (GestureEvent) => void or PhaseHandler |
| Registration | ActionDefinition objects with metadata | Simple registerAction(id, handler) |
| Execution | executeAction(id, context, helpers) | dispatch(event, resolution) |
| Scope | Commands, keyboard shortcuts, menu items | Pointer/touch gesture pipeline |
v1 actions (BuiltInActionId) have rich metadata (label, description, category, icon) and receive ActionHelpers for state manipulation. v2 handlers are lightweight — just the gesture event.
Both registries are global Maps. They can reference the same action IDs — a gesture resolution can trigger a v1 action by ID.
Test Coverage
| Test | Module | What |
|---|---|---|
| specificity.test.ts | Scoring | Pattern matching, weight ordering, mismatch rejection |
| timed-state.test.ts | State machine | Transitions, tap counting, long-press, cancel |
| mapper.test.ts | Resolution | Bucketing, priority, consumeInput, guard evaluation |
| inertia.test.ts | Momentum | Friction decay, velocity sampling, snap-to-zoom |
| palm-rejection.test.ts | Palm rejection | Stylus blocking, pinch passthrough |
All tests are pure unit tests (no DOM, no React) — run via pnpm exec vitest run from packages/canvas/.
Key Design Principles
- Pure core, imperative shell: normalize, timed-state, specificity, mapper, inertia are all pure functions. DOM listeners, timers, and rAF live in hooks.
- Refs for hot path: Drag position, velocity samples, and inertia engines use useRef, so there are no re-renders during 60fps gesture streams.
- Specificity over priority: Within a context, the most specific match wins automatically. Between contexts, explicit priority ordering resolves conflicts.
- consumeInput for hard blocks: Palm rejection doesn't "disable" finger input; it matches finger gestures at priority 0 with consumeInput: true, preventing the default context from seeing them.
- Context stack is data: MappingContexts are plain objects. Push/pop at runtime for modal states (pick mode, edge creation, drawing mode) without touching handler code.
- Custom modifiers for extensibility: The Modifiers.custom bag supports arbitrary named non-keyboard modifiers (for example iPad toolbar buttons) with full specificity scoring, so no special-casing is needed. Held keys are tracked separately in heldKeys.