Hand–Object Interaction (HOI) is a key interaction component in Virtual Reality (VR). However, designing HOI still requires manual effort to decide how objects should be selected and manipulated while also accounting for user abilities, which leads to time-consuming refinement. We present HOICraft, a VLM-based in-situ HOI authoring tool that enables part-level interaction design in VR. HOICraft assists designers by recommending interactable elements of 3D objects, customizing HOI design properties, and mapping hand movements to virtual object behavior. We conducted a formative study with three expert VR designers to identify five representative HOI designs that support diverse user experiences. Building upon preference data from 20 participants, we developed an HOI mapping module with in-context learning. In a user study with 12 VR interaction designers, HOI mapping from HOICraft significantly reduced trial-and-error iterations compared to manual authoring. Finally, we assessed the usability of HOICraft, demonstrating its effectiveness for HOI design in VR.
From a formative study with professional VR developers, we distilled five representative HOI designs spanning part selection (physics/collider, gesture, contact) and manipulation response (direct vs. animation) to support diverse contexts and experience levels.
We propose a VLM-based in-situ VR authoring tool that takes a segmented 3D object and a textual design intent to help designers create part-level interactive objects in VR. The system first recommends interactive parts, then lets designers customize HOI properties and finalize the HOI mapping via ranked recommendations.
Our HOI mapping module leverages empirical user-preference data and an intent-to-metric selector to suggest suitable HOI designs via ranking-based (or binary) decisions.
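As a rough illustration of how an intent-to-metric selector and a preference-based ranking could fit together, consider the minimal sketch below. It is not HOICraft's implementation: the module described above relies on a VLM with in-context learning, whereas this sketch substitutes a simple keyword lookup for the selector. The design names (`design_a`, ...), metric names (`ease`, `realism`), keyword lists, and scores are all hypothetical placeholders, not data from the study.

```python
# Hypothetical preference table: mean rating per (design, metric).
# Names and numbers are placeholders, not results from the paper.
PREFERENCES = {
    "design_a": {"ease": 4.2, "realism": 3.1},
    "design_b": {"ease": 3.5, "realism": 4.4},
    "design_c": {"ease": 2.9, "realism": 3.8},
}

# Hypothetical keyword lookup standing in for the intent-to-metric selector.
METRIC_KEYWORDS = {
    "ease": ("easy", "beginner", "simple"),
    "realism": ("realistic", "immersive", "natural"),
}


def select_metric(intent: str, default: str = "ease") -> str:
    """Pick the metric whose keywords appear in the textual design intent."""
    text = intent.lower()
    for metric, keywords in METRIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return metric
    return default


def rank_designs(intent: str, top_n: int = 2):
    """Return the chosen metric, the top-1 design, and up to top_n alternatives."""
    metric = select_metric(intent)
    ranked = sorted(
        PREFERENCES,
        key=lambda design: PREFERENCES[design][metric],
        reverse=True,
    )
    return metric, ranked[0], ranked[1 : 1 + top_n]


metric, best, alternatives = rank_designs("a realistic door handle")
print(metric, best, alternatives)
```

A binary decision, as opposed to a full ranking, would correspond to comparing only two candidate designs under the selected metric.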
The UI guides designers through four steps: intent input, part selection, HOI design customization, and HOI mapping. Designers can refine interactive parts via priority-based recommendations or manual toggles, tune HOI parameters with sliders/toggles and test changes in-scene, and review a top-1 mapping plus top-N alternatives with rationale and pros/cons.