canvas/docs/arch-gesture-system-v2.md
2026-03-11 18:42:08 -07:00

Shared Input System v2 — Architecture

Overview

v2 is a four-layer shared input pipeline that routes pointer gestures and keyboard events through a single resolver/dispatcher stack. It lives in packages/canvas/src/gestures/ and replaces the earlier v1/v2 split, in which pointer logic lived in the gesture runtime while keyboard handling was still scattered across separate hooks.

DOM Events (pointer, touch, wheel, keydown, keyup)
  |
  v
Layer 1: NORMALIZE     NormalizedPointer
  |                     + raw keyboard state
  v
Layer 2: RECOGNIZE      PointerGestureEvent | KeyInputEvent
  |                     (pointer gestures + direct key events)
  v
Layer 3: RESOLVE        ResolvedAction
  |                     (priority-sorted context stack + specificity scoring)
  v
Layer 4: DISPATCH       Side Effects
                        (phase-aware handlers + provider-owned keyboard ownership)

File Map

| File | Lines | Layer | Responsibility |
| --- | --- | --- | --- |
| types.ts | 179 | All | Type definitions (no runtime code) |
| normalize.ts | 73 | 1 | PointerEvent -> NormalizedPointer, modifier extraction |
| timed-state.ts | 159 | 2 | Pure state machine for tap counting + long-press |
| timed-state-runner.ts | 88 | 2 | Timer bridge wrapping pure TimedState |
| specificity.ts | 99 | 3 | Pattern-to-event scoring algorithm |
| mapper.ts | 159 | 3 | Context stack resolution with bucketed lookup |
| contexts.ts | 203 | 3 | Built-in contexts (palm rejection, default, input modes) |
| dispatcher.ts | 106 | 4 | Action handler registry + phase routing |
| inertia.ts | 138 | 4 | Pure momentum simulation (pan + zoom) |
| GestureProvider.tsx | 52 | React | Context provider for shared gesture state |
| useGestureSystem.ts | 81 | React | Consumer hook: context stack CRUD + mapping index |
| useCanvasGestures.ts | 635 | React | Main viewport hook (drag, pinch, wheel, inertia) |
| useNodeGestures.ts | 192 | React | Node-level hook (tap, double-tap, long-press, right-click) |
| useInputModeGestureContext.ts | 49 | React | Auto-push/pop contexts when input mode changes |
| index.ts | 71 | - | Barrel export |

Layer 1: Normalize

Files: normalize.ts

Pure functions, no DOM listeners.

extractModifiers(e) -> Modifiers { shift, ctrl, alt, meta, custom? }
normalizePointer(e: PointerEvent) -> NormalizedPointer

classifyPointer (from core/input-classifier.ts) determines InputSource: 'finger' | 'pencil' | 'mouse' based on pointerType and pressure heuristics.

Modifiers.custom remains available for non-keyboard flags only. Held keyboard state is explicit in heldKeys.byKey / heldKeys.byCode.
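As a rough illustration of this layer, the sketch below shows one way the two functions could be written. Only the Modifiers fields and the three InputSource values come from this document. The NormalizedPointer fields, the PointerLike stand-in (used so the example runs without a DOM), and the inlined classification rule are assumptions; the real classifyPointer also applies pressure heuristics that are not shown here.

```typescript
type InputSource = 'finger' | 'pencil' | 'mouse'

interface Modifiers {
  shift: boolean
  ctrl: boolean
  alt: boolean
  meta: boolean
  custom?: Record<string, boolean>
}

// Assumed stand-in for the subset of PointerEvent this sketch reads.
interface PointerLike {
  pointerId: number
  pointerType: string
  clientX: number
  clientY: number
  pressure: number
  button: number
  shiftKey: boolean
  ctrlKey: boolean
  altKey: boolean
  metaKey: boolean
}

// Assumed field set; the real type is defined in types.ts.
interface NormalizedPointer {
  pointerId: number
  x: number
  y: number
  pressure: number
  button: number
  source: InputSource
  modifiers: Modifiers
}

function extractModifiers(e: PointerLike): Modifiers {
  return { shift: e.shiftKey, ctrl: e.ctrlKey, alt: e.altKey, meta: e.metaKey }
}

// Simplified classification by pointerType only (assumption).
function classifyPointerSketch(e: PointerLike): InputSource {
  if (e.pointerType === 'pen') return 'pencil'
  if (e.pointerType === 'touch') return 'finger'
  return 'mouse'
}

function normalizePointer(e: PointerLike): NormalizedPointer {
  return {
    pointerId: e.pointerId,
    x: e.clientX,
    y: e.clientY,
    pressure: e.pressure,
    button: e.button,
    source: classifyPointerSketch(e),
    modifiers: extractModifiers(e),
  }
}
```

Because both functions only read fields from their argument, they can be unit-tested with plain object literals.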

Layer 2: Recognize

Files: timed-state.ts, timed-state-runner.ts

TimedState Machine (pure)

Per-pointer state machine that tracks tap sequences and long-press timing. Keyboard does not go through this state machine; it enters the runtime as direct KeyInputEvents.

Each call to transition() returns a TimedStateResult with:

  • state: next state
  • emit?: gesture type to emit ('tap', 'double-tap', 'triple-tap', 'long-press')
  • scheduleTimer? / cancelTimer?: timer commands for the caller

States: idle -> pressed -> { long-pressed | released } -> idle

down:     idle -> pressed (schedule long-press timer)
          released -> pressed (increment tapCount)
up:       pressed -> released (emit tap/double-tap/triple-tap)
move:     pressed -> idle (cancel — becomes drag via @use-gesture)
timer:    pressed -> long-pressed (emit long-press)
settle:   released -> idle (finalize tap sequence)
cancel:   any -> idle (cleanup)

Key design: The state machine is pure — no setTimeout inside. The TimedStateRunner class manages actual timers and calls transition() on timer fire.
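The transition table above could be expressed as a pure function along the following lines. This is a minimal sketch, not the contents of timed-state.ts: the shape of TimedState, the timer names ('long-press', 'settle'), the tapCount bookkeeping, and the handling of an up event after a long-press are all assumptions.

```typescript
type TimedEvent = 'down' | 'up' | 'move' | 'timer' | 'settle' | 'cancel'
type Phase = 'idle' | 'pressed' | 'long-pressed' | 'released'
type Emit = 'tap' | 'double-tap' | 'triple-tap' | 'long-press'
type TimerName = 'long-press' | 'settle' // assumed timer identifiers

interface TimedState {
  phase: Phase
  tapCount: number
}

interface TimedStateResult {
  state: TimedState
  emit?: Emit
  scheduleTimer?: TimerName
  cancelTimer?: TimerName
}

const IDLE: TimedState = { phase: 'idle', tapCount: 0 }

function tapName(count: number): Emit {
  return count >= 3 ? 'triple-tap' : count === 2 ? 'double-tap' : 'tap'
}

// Pure: no timers are started here, the result only describes what the caller should do.
function transition(s: TimedState, ev: TimedEvent): TimedStateResult {
  if (ev === 'cancel') return { state: IDLE }
  switch (s.phase) {
    case 'idle':
      if (ev === 'down') {
        return { state: { phase: 'pressed', tapCount: 1 }, scheduleTimer: 'long-press' }
      }
      break
    case 'pressed':
      if (ev === 'up') {
        return {
          state: { phase: 'released', tapCount: s.tapCount },
          emit: tapName(s.tapCount),
          cancelTimer: 'long-press',
          scheduleTimer: 'settle',
        }
      }
      if (ev === 'move') return { state: IDLE, cancelTimer: 'long-press' }
      if (ev === 'timer') {
        return { state: { phase: 'long-pressed', tapCount: 0 }, emit: 'long-press' }
      }
      break
    case 'released':
      if (ev === 'down') {
        return {
          state: { phase: 'pressed', tapCount: s.tapCount + 1 },
          cancelTimer: 'settle',
          scheduleTimer: 'long-press',
        }
      }
      if (ev === 'settle') return { state: IDLE }
      break
    case 'long-pressed':
      if (ev === 'up') return { state: IDLE } // assumed: release after long-press resets
      break
  }
  return { state: s } // events that do not apply in this phase are ignored
}
```

A runner in the style of TimedStateRunner would own the setTimeout handles, act on scheduleTimer / cancelTimer, and feed 'timer' or 'settle' back into transition() when a timer fires.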

@use-gesture Integration

@use-gesture/react handles drag, pinch, and wheel recognition. The gesture hooks configure it with source-aware thresholds:

| Source | Drag threshold | Tap threshold |
| --- | --- | --- |
| finger | 10px | 10px |
| pencil | 2px | 3px |
| mouse | 3px | 5px |
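The thresholds above could be kept as a lookup keyed by input source. The sketch below is illustrative only: the constant name, the isDrag helper, and the reading of "tap threshold" as the travel still tolerated for a tap are assumptions, and the wiring into @use-gesture is not shown.

```typescript
type InputSource = 'finger' | 'pencil' | 'mouse'

interface Thresholds {
  drag: number // px of travel before a press is treated as a drag
  tap: number // px of travel still tolerated for a tap (assumed meaning)
}

const THRESHOLDS: Record<InputSource, Thresholds> = {
  finger: { drag: 10, tap: 10 },
  pencil: { drag: 2, tap: 3 },
  mouse: { drag: 3, tap: 5 },
}

// Hypothetical helper: has the pointer travelled far enough to count as a drag?
function isDrag(source: InputSource, dx: number, dy: number): boolean {
  return Math.hypot(dx, dy) >= THRESHOLDS[source].drag
}
```

The larger finger thresholds reflect that touch contact is less precise than a pencil tip or a mouse cursor.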

Layer 3: Resolve

Files: specificity.ts, mapper.ts, contexts.ts

Specificity Scoring

Scores how well an InputPattern matches an InputEvent. Higher = more specific. Returns -1 on mismatch.

| Dimension | Score | Why |
| --- | --- | --- |
| type | 128 | Must outweigh all others combined |
| key | 64 | Specific key binding |
| code | 64 | Physical key binding |
| subjectKind | 32 | Narrows to element category |
| modifiers (positive, e.g. shift: true) | 16 each | Requires key pressed |
| modifiers (negative, e.g. shift: false) | 8 each | Requires key NOT pressed |
| heldKeys (positive) | 16 each | Held-key keyboard context |
| heldKeys (negative) | 8 each | Explicit absence of held key |
| source | 4 | Input device |
| button | 2 | Mouse button |

Keyboard matching is not encoded as GestureType. It matches kind: 'key' plus phase, key, and/or code.

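Max possible for a pointer pattern (key and code do not apply): 128 + 32 + N×16 + 4 + 2 = 166 + N×16, where N is the number of positive modifier and held-key constraints in the pattern.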
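To make the weights concrete, here is a reduced sketch of a scoring function for pointer patterns. It covers only type, subjectKind, the four built-in modifiers, source, and button; key, code, heldKeys, and custom modifiers are left out. The type names and field layout are assumptions, and only the weights and the -1 mismatch convention come from the table above.

```typescript
type InputSource = 'finger' | 'pencil' | 'mouse'
type ModifierName = 'shift' | 'ctrl' | 'alt' | 'meta'

const MODIFIER_NAMES: ModifierName[] = ['shift', 'ctrl', 'alt', 'meta']

interface PointerEventLike {
  type: string
  subjectKind: string
  source: InputSource
  button: number
  modifiers: Record<ModifierName, boolean>
}

// Every field is optional: an omitted field matches anything and scores nothing.
interface PointerPattern {
  type?: string
  subjectKind?: string
  source?: InputSource
  button?: number
  modifiers?: Partial<Record<ModifierName, boolean>>
}

function specificity(pattern: PointerPattern, event: PointerEventLike): number {
  let score = 0
  if (pattern.type !== undefined) {
    if (pattern.type !== event.type) return -1
    score += 128
  }
  if (pattern.subjectKind !== undefined) {
    if (pattern.subjectKind !== event.subjectKind) return -1
    score += 32
  }
  for (const name of MODIFIER_NAMES) {
    const wanted = pattern.modifiers?.[name]
    if (wanted === undefined) continue
    if (event.modifiers[name] !== wanted) return -1
    score += wanted ? 16 : 8 // positive constraints outrank negative ones
  }
  if (pattern.source !== undefined) {
    if (pattern.source !== event.source) return -1
    score += 4
  }
  if (pattern.button !== undefined) {
    if (pattern.button !== event.button) return -1
    score += 2
  }
  return score
}
```

Under these weights, a shift+tap-on-node pattern scores 128 + 32 + 16 = 176 and so beats a plain tap-on-node pattern at 160 whenever both match the same event.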

MappingIndex

Contexts are indexed by pointer gesture type or keyboard phase into buckets for O(1) lookup:

type MappingIndex = ContextIndex[]  // sorted by priority ascending (0 = highest)

interface ContextIndex {
  contextId: string
  priority: number
  enabled: boolean
  buckets: Map<string | '__wildcard__', Binding[]>
}

Pointer buckets use pointer:${type}. Keyboard buckets use key:${phase}.
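A possible shape for the index builder, using the bucket keys described above. This is a sketch under assumptions: the Pattern and MappingContext fields are illustrative, and treating a pattern with no type (or a key pattern with no phase) as a wildcard is a guess at how the wildcard bucket gets populated.

```typescript
const WILDCARD = '__wildcard__'

interface Pattern {
  kind?: 'key' // absent means a pointer pattern (assumption)
  type?: string
  phase?: 'down' | 'up'
  key?: string
  code?: string
}

interface Binding {
  id: string
  pattern: Pattern
  actionId: string
  consumeInput?: boolean
}

interface MappingContext {
  id: string
  priority: number
  enabled: boolean
  bindings: Binding[]
}

interface ContextIndex {
  contextId: string
  priority: number
  enabled: boolean
  buckets: Map<string, Binding[]>
}

type MappingIndex = ContextIndex[]

function bucketKey(p: Pattern): string {
  if (p.kind === 'key') return p.phase ? `key:${p.phase}` : WILDCARD
  return p.type ? `pointer:${p.type}` : WILDCARD
}

function buildMappingIndex(contexts: MappingContext[]): MappingIndex {
  return contexts
    .map((ctx): ContextIndex => {
      const buckets = new Map<string, Binding[]>()
      for (const binding of ctx.bindings) {
        const key = bucketKey(binding.pattern)
        const list = buckets.get(key) ?? []
        list.push(binding)
        buckets.set(key, list)
      }
      return { contextId: ctx.id, priority: ctx.priority, enabled: ctx.enabled, buckets }
    })
    .sort((a, b) => a.priority - b.priority) // 0 = highest priority comes first
}
```

Grouping bindings up front means resolution only scores the bindings in the matching bucket plus the wildcard bucket, rather than every binding in every context.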

Resolution Algorithm

resolve(event, index, guardContext):
  for each context (priority ascending = highest first):
    skip if !enabled
    lookup bucket for event.type (+ wildcard bucket)
    for each binding in bucket:
      score = specificity(pattern, event)
      skip if -1 (no match)
      skip if binding.when(guard) returns false
      track best match (highest score)
    if best match found:
      if consumeInput → return immediately (block lower contexts)
      else → record as candidate
  return highest-priority candidate or null
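The pseudocode above translates fairly directly into TypeScript. The sketch below handles pointer events only and uses a reduced scoring function (type and source) to stay short. ResolvedAction's fields and the one-field GuardContext are assumptions. It follows the pseudocode literally, so a consumeInput match in any context returns immediately, and otherwise the first candidate found (the highest-priority one) is returned.

```typescript
interface GuardContext {
  isStylusActive: boolean
}

interface PointerEventLike {
  type: string
  source: string
}

interface Pattern {
  type?: string
  source?: string
}

interface Binding {
  id: string
  pattern: Pattern
  actionId: string
  consumeInput?: boolean
  when?: (guard: GuardContext) => boolean
}

interface ContextIndex {
  contextId: string
  priority: number
  enabled: boolean
  buckets: Map<string, Binding[]>
}

// Assumed result shape.
interface ResolvedAction {
  actionId: string
  bindingId: string
  contextId: string
  score: number
}

// Reduced scoring for the sketch: type (128) and source (4) only.
function specificity(p: Pattern, e: PointerEventLike): number {
  let score = 0
  if (p.type !== undefined) {
    if (p.type !== e.type) return -1
    score += 128
  }
  if (p.source !== undefined) {
    if (p.source !== e.source) return -1
    score += 4
  }
  return score
}

function resolve(
  event: PointerEventLike,
  index: ContextIndex[], // already sorted, highest priority first
  guard: GuardContext,
): ResolvedAction | null {
  let candidate: ResolvedAction | null = null
  for (const ctx of index) {
    if (!ctx.enabled) continue
    const bindings = [
      ...(ctx.buckets.get(`pointer:${event.type}`) ?? []),
      ...(ctx.buckets.get('__wildcard__') ?? []),
    ]
    let best: Binding | null = null
    let bestScore = -1
    for (const binding of bindings) {
      const score = specificity(binding.pattern, event)
      if (score < 0) continue // pattern does not match
      if (binding.when && !binding.when(guard)) continue // guard rejected
      if (score > bestScore) {
        best = binding
        bestScore = score
      }
    }
    if (best === null) continue
    const resolved: ResolvedAction = {
      actionId: best.actionId,
      bindingId: best.id,
      contextId: ctx.contextId,
      score: bestScore,
    }
    if (best.consumeInput) return resolved // blocks all lower contexts
    if (candidate === null) candidate = resolved
  }
  return candidate
}
```

Guards are evaluated only after a pattern has matched, so a binding whose when predicate fails simply drops out and lower-scoring bindings in the same context can still win.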

Built-in Contexts

Palm Rejection (priority 0 — highest):

  • When stylus is active, blocks all finger taps/long-press via when: ctx => ctx.isStylusActive
  • Remaps finger drag-on-node to pan
  • Pinch + scroll intentionally NOT blocked
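Expressed as data, the palm rejection context might look roughly like the sketch below. The binding ids, the pattern field names, and the choice to set consumeInput on the pan remap are assumptions; the real definition lives in contexts.ts. Double-tap and triple-tap would be blocked the same way as the single tap shown here.

```typescript
interface GuardContext {
  isStylusActive: boolean
}

interface Pattern {
  type?: string
  source?: string
  subjectKind?: string
}

interface Binding {
  id: string
  pattern: Pattern
  actionId: string
  consumeInput?: boolean
  when?: (ctx: GuardContext) => boolean
}

interface MappingContext {
  id: string
  priority: number
  enabled: boolean
  bindings: Binding[]
}

const stylusActive = (ctx: GuardContext): boolean => ctx.isStylusActive

const PALM_REJECTION_SKETCH: MappingContext = {
  id: 'palm-rejection',
  priority: 0, // highest priority, consulted before every other context
  enabled: true,
  bindings: [
    // Swallow finger taps and long-presses while the stylus is in use.
    {
      id: 'block-finger-tap',
      pattern: { type: 'tap', source: 'finger' },
      actionId: 'none',
      consumeInput: true,
      when: stylusActive,
    },
    {
      id: 'block-finger-long-press',
      pattern: { type: 'long-press', source: 'finger' },
      actionId: 'none',
      consumeInput: true,
      when: stylusActive,
    },
    // A finger drag that starts on a node pans instead of moving the node.
    {
      id: 'finger-drag-node-pans',
      pattern: { type: 'drag', source: 'finger', subjectKind: 'node' },
      actionId: 'pan',
      consumeInput: true,
      when: stylusActive,
    },
    // No bindings for pinch or scroll, so those fall through to lower contexts.
  ],
}
```

Because the blocking is ordinary bindings with a guard, finger input behaves normally again as soon as isStylusActive turns false, with no context push or pop required.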

Default (priority 100 — lowest):

  • Taps: select node/edge, clear selection, shift-toggle
  • Drags: move-node, pan (per source), lasso/rect-select
  • Held-key pointer modifiers: Space+drag
  • Right-click: context-menu
  • Double/triple-tap: fit-to-view, toggle-lock
  • Long-press: context-menu on node, create-node on background
  • Pinch: zoom on background, split-node on node
  • Scroll: zoom
  • Keyboard shortcuts: search, clipboard, history, delete, escape

Shared keyboard contexts:

  • active-interaction cancellation
  • search navigation
  • keyboard navigate mode
  • keyboard manipulate mode

Input Mode Contexts (priority 5):

  • PICK_NODE_CONTEXT: tap-node -> resolve-pick-node
  • PICK_NODES_CONTEXT: tap-node -> resolve-pick-node, double-tap-bg -> finish
  • PICK_POINT_CONTEXT: tap-anywhere -> resolve-pick-point

GuardContext

Runtime state available to when predicates on bindings:

interface GuardContext {
  isStylusActive: boolean
  fingerCount: number
  isDragging: boolean
  isResizing: boolean
  isSplitting: boolean
  inputMode: InputMode
  keyboardInteractionMode: 'navigate' | 'manipulate'
  selectedNodeIds: ReadonlySet<string>
  focusedNodeId: string | null
  isSearchActive: boolean
  commandLineVisible: boolean
  heldKeys: HeldKeysState
  custom: Record<string, unknown>  // extensible
}

Rebuilt from Jotai atoms each render (cheap — just reads).

Layer 4: Dispatch

Files: dispatcher.ts, inertia.ts

Action Handlers

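Handlers take one of two forms: a plain function (for pointer or keyboard events) or a phase-aware object (for continuous gestures):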

// Simple pointer handler — fires on pointer 'start' and 'instant'
registerAction('select-node', (event: PointerGestureEvent) => { ... })

// Keyboard handler — fires on key 'down'
registerAction('delete-selection', (event: KeyInputEvent) => { ... })

// Phase-aware — different logic per phase (for continuous gestures)
registerAction('pan', {
  onStart: (e) => { ... },
  onMove: (e) => { ... },
  onEnd: (e) => { startInertia(e.velocity) },
  onCancel: () => { cancelInertia() },
})

Handler registry is a plain Map<string, ActionHandler> — not Jotai atoms. Synchronous, immediate invocation.

'none' actionId is a built-in no-op (used for intentional blocking in palm rejection).
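A minimal sketch of how such a registry and its phase routing could work for pointer events. The Phase union, the GestureEventLike type, the boolean return value of dispatch, and the decision not to route 'instant' to phase-aware handlers are assumptions made for illustration; keyboard routing is omitted.

```typescript
type Phase = 'start' | 'move' | 'end' | 'cancel' | 'instant'

interface GestureEventLike {
  phase: Phase
}

type SimpleHandler = (event: GestureEventLike) => void

interface PhaseHandler {
  onStart?: SimpleHandler
  onMove?: SimpleHandler
  onEnd?: SimpleHandler
  onCancel?: SimpleHandler
}

type ActionHandler = SimpleHandler | PhaseHandler

// Plain Map, with the built-in no-op registered up front.
const handlers = new Map<string, ActionHandler>([['none', () => {}]])

function registerAction(id: string, handler: ActionHandler): void {
  handlers.set(id, handler)
}

// Returns false when no handler is registered for the resolved action.
function dispatch(event: GestureEventLike, resolution: { actionId: string }): boolean {
  const handler = handlers.get(resolution.actionId)
  if (!handler) return false
  if (typeof handler === 'function') {
    // Simple handlers fire once, on 'start' or 'instant'.
    if (event.phase === 'start' || event.phase === 'instant') handler(event)
    return true
  }
  switch (event.phase) {
    case 'start':
      handler.onStart?.(event)
      break
    case 'move':
      handler.onMove?.(event)
      break
    case 'end':
      handler.onEnd?.(event)
      break
    case 'cancel':
      handler.onCancel?.(event)
      break
    case 'instant':
      break // not routed to phase-aware handlers in this sketch
  }
  return true
}
```

Keeping the registry as a plain Map means dispatch is a synchronous lookup and call, with no store subscription on the gesture hot path.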

Inertia

Pure engines with no requestAnimationFrame dependency — the hook manages the rAF loop.

  • VelocitySampler: ring buffer of recent velocity samples
  • PanInertia: friction-decayed momentum (friction = 0.92)
  • ZoomInertia: friction-decayed zoom with snap-to-100% (friction = 0.88, snapThreshold = 0.03)
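To show what "pure engine" means here, the sketch below implements friction-decayed pan momentum with the 0.92 friction value quoted above. The class name, the tick() API, the per-frame velocity units, and the minSpeed cutoff are assumptions; the real PanInertia in inertia.ts may differ.

```typescript
class PanInertiaSketch {
  private vx = 0
  private vy = 0
  private readonly friction: number
  private readonly minSpeed: number

  // friction = 0.92 comes from this document; minSpeed is an assumed stop cutoff.
  constructor(friction = 0.92, minSpeed = 0.01) {
    this.friction = friction
    this.minSpeed = minSpeed
  }

  // Velocity is taken to be in px per frame (assumption).
  start(vx: number, vy: number): void {
    this.vx = vx
    this.vy = vy
  }

  stop(): void {
    this.vx = 0
    this.vy = 0
  }

  // Advances one frame. Returns the pan delta to apply, or null once momentum has died out.
  tick(): { dx: number; dy: number } | null {
    if (Math.hypot(this.vx, this.vy) < this.minSpeed) return null
    const delta = { dx: this.vx, dy: this.vy }
    this.vx *= this.friction
    this.vy *= this.friction
    return delta
  }
}
```

The engine never schedules anything itself: a hook would call tick() from its own requestAnimationFrame loop and stop the loop when tick() returns null, which also lets unit tests step through frames in a plain loop.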

React Integration

Component Hierarchy

Canvas.tsx
  └─ GestureProvider (holds GestureSystemAPI + onAction callback)
       ├─ Viewport.tsx
       │    └─ useCanvasGestures(ref, mappingIndex, onAction)
       │         ├─ @use-gesture for drag/pinch/wheel
       │         ├─ TimedStateRunner for background taps
       │         ├─ Inertia engines for momentum
       │         └─ resolve() + dispatch() on every gesture event
       │
       └─ Node.tsx (for each node)
            └─ useNodeGestures(nodeId, mappingIndex, onAction)
                 ├─ TimedStateRunner for node taps
                 └─ resolve() + dispatch() for tap/double-tap/long-press/right-click

MappingIndex Flow

useGestureSystem()
  builds: [palmRejection?] + [staticContexts] + [dynamicContexts] + DEFAULT_CONTEXT
  useMemo → buildMappingIndex() → MappingIndex
    ↓
GestureProvider holds in React context
    ↓
Viewport reads via useGestureContext() → passes to useCanvasGestures
Node reads via useGestureContext() → passes to useNodeGestures

Both hooks call resolve(event, mappingIndex, guardContext) with the same index — consistent resolution across viewport and node-level gestures.

Consumer API

const gestureSystem = useGestureSystem({ palmRejection: true })

// Push a custom context at runtime
gestureSystem.pushContext({
  id: 'drawing-mode',
  priority: 50,
  enabled: true,
  bindings: [
    { id: 'pencil-draw', pattern: { type: 'drag', source: 'pencil' }, actionId: 'draw', consumeInput: true },
  ],
})

// Remove it
gestureSystem.removeContext('drawing-mode')

Pointer Ownership

  • Each active pointer is owned by at most one gesture stream
  • When @use-gesture recognizes a pinch (2+ pointers), single-pointer drags are cancelled
  • Node drag uses setPointerCapture with capture: false on viewport to prevent event stealing
  • Local gesture scopes (resize handles, edge creation, split) maintain their own pointer maps and suppress pipeline gestures via GuardContext flags

Relationship to v1

Two action systems coexist:

| | v1 (core/action-registry.ts) | v2 (gestures/dispatcher.ts) |
| --- | --- | --- |
| Handler type | (ActionContext, ActionHelpers) => void | (GestureEvent) => void or PhaseHandler |
| Registration | ActionDefinition objects with metadata | Simple registerAction(id, handler) |
| Execution | executeAction(id, context, helpers) | dispatch(event, resolution) |
| Scope | Commands, keyboard shortcuts, menu items | Pointer/touch gesture pipeline |

v1 actions (BuiltInActionId) have rich metadata (label, description, category, icon) and receive ActionHelpers for state manipulation. v2 handlers are lightweight — just the gesture event.

Both registries are global Maps. They can reference the same action IDs — a gesture resolution can trigger a v1 action by ID.

Test Coverage

| Test | Module | What |
| --- | --- | --- |
| specificity.test.ts | Scoring | Pattern matching, weight ordering, mismatch rejection |
| timed-state.test.ts | State machine | Transitions, tap counting, long-press, cancel |
| mapper.test.ts | Resolution | Bucketing, priority, consumeInput, guard evaluation |
| inertia.test.ts | Momentum | Friction decay, velocity sampling, snap-to-zoom |
| palm-rejection.test.ts | Palm rejection | Stylus blocking, pinch passthrough |

All tests are pure unit tests (no DOM, no React) — run via pnpm exec vitest run from packages/canvas/.

Key Design Principles

  1. Pure core, imperative shell: normalize, timed-state, specificity, mapper, inertia are all pure functions. DOM listeners, timers, and rAF live in hooks.

  2. Refs for hot path: Drag position, velocity samples, inertia engines use useRef — no re-renders during 60fps gesture streams.

  3. Specificity over priority: Within a context, the most specific match wins automatically. Between contexts, explicit priority ordering resolves conflicts.

  4. consumeInput for hard blocks: Palm rejection doesn't "disable" finger input — it matches finger gestures at priority 0 with consumeInput: true, preventing default context from seeing them.

  5. Context stack is data: MappingContexts are plain objects. Push/pop at runtime for modal states (pick mode, edge creation, drawing mode) without touching handler code.

  6. Custom modifiers for extensibility: The Modifiers.custom bag supports arbitrary named non-keyboard modifiers (for example iPad toolbar buttons) with full specificity scoring, so no special-casing is needed. Held keyboard keys are tracked separately in heldKeys.