Kinect is fun. Kinect is new. Kinect is awesome. However, doing anything with Kinect right now requires developer work – and even with the available building blocks to rely on, it takes a developer a long time to get up to speed. This means that interaction designers cannot do Kinect development without a top-notch developer, and developers are needed even to fine-tune gesture settings (tolerance, timers, etc.) and visualizations. In essence, designers cannot “play” with their design, cannot just turn a knob and see the end result.
Luckily, Xbox game menus already provide plenty of good interaction models. If the Kinect WPF Toolkit (KWT) allows creating similar experiences without deep knowledge of Kinect, or even without coding, it has reached its goal. At this stage, we are only dealing with single-user scenarios.
Below you can read the first draft of the main idea.
Note: The Kinect WPF Toolkit borrows code from Joshua Blake’s amazing InfoStrat Motion Framework, but does not build on it as of now. I am especially grateful for Joshua’s encouragement, and for the huge amount of time his image processing and DirectCanvas code has saved me so far.
Vision
The goal of the Toolkit is to give Interaction Designers the means to create Kinect-driven applications (not games!) with the same ease with which they create mouse- and touch-driven UIs in Expression Blend today.
Key components
There are two key component types in KWT: recognition components and visualizers. Recognition components deal with detecting users, hands, gestures, poses and so on. Visualizer components help visualize what the camera sees or what the recognizers have detected.
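To make the split concrete, here is a rough C# sketch of what the two component families could look like. Every name in it is made up for illustration; none of this is existing KWT (or NITE) API.

```csharp
// Hypothetical sketch of the two KWT component families; names are illustrative only.
public interface IRecognizer
{
    // Recognition components watch the incoming Kinect data and raise events
    // (user entered, pose held, gesture performed, ...).
    void ProcessFrame(object frame);
}

public interface IVisualizer
{
    // Visualizer components draw what the camera sees or what a recognizer has detected.
    void Render(System.Windows.Media.DrawingContext drawingContext);
}
```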
Recognition components
User recognition – events raised when a user enters or leaves the scene, and for the number of users. Events for PrimeSense calibration. There is always only one “Active” user.
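A possible shape for this component, purely as an illustration (all member names are assumptions, not existing KWT API):

```csharp
using System;

// Hypothetical user-recognition component; member names are illustrative assumptions.
public class UserRecognizer
{
    public event EventHandler UserEntered;           // someone stepped into the scene
    public event EventHandler UserLeft;              // a tracked user left the scene
    public event EventHandler CalibrationCompleted;  // PrimeSense calibration finished for a user

    public int UserCount { get; private set; }       // number of users currently visible
    public int ActiveUserId { get; private set; }    // there is always only one active user
}
```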
Pose recognition – Xbox equivalent: the 45-degree “Kinect Menu” gesture.
The app should be notified when the user’s body, or parts of it, enters or leaves a specified position, or stays there for a certain time. The app should also be able to visualize the time spent in the pose and the time left before the pose is acted upon. Ideally, poses are defined by example: simply performing the pose in front of the Kinect, or choosing a frame from a recording. The designer should also be able to set a “tolerance” value.
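As a thought experiment, the tolerance and hold-timer logic could look roughly like this. The flat joint array, the per-coordinate tolerance check and all names are assumptions made for illustration; this is not KWT code.

```csharp
using System;

// Minimal sketch of pose-by-example detection with a designer-settable tolerance and hold time.
public class PoseDetector
{
    private readonly float[] recordedPose;  // joint coordinates captured while performing the pose
    private readonly float tolerance;       // allowed deviation per coordinate, in meters (assumed)
    private readonly TimeSpan holdTime;     // how long the pose must be held before it triggers
    private DateTime? poseStart;

    public event EventHandler PoseCompleted;

    // 0.0 .. 1.0, so a visualizer can show time spent in the pose and time left.
    public double Progress { get; private set; }

    public PoseDetector(float[] recordedPose, float tolerance, TimeSpan holdTime)
    {
        this.recordedPose = recordedPose;
        this.tolerance = tolerance;
        this.holdTime = holdTime;
    }

    public void Update(float[] currentPose, DateTime now)
    {
        bool inPose = true;
        for (int i = 0; i < recordedPose.Length; i++)
        {
            if (Math.Abs(currentPose[i] - recordedPose[i]) > tolerance)
            {
                inPose = false;
                break;
            }
        }

        if (!inPose)
        {
            poseStart = null;   // left the pose: reset the timer
            Progress = 0;
            return;
        }

        if (poseStart == null) poseStart = now;
        Progress = Math.Min(1.0, (now - poseStart.Value).TotalSeconds / holdTime.TotalSeconds);

        if (Progress >= 1.0)
        {
            var handler = PoseCompleted;
            if (handler != null) handler(this, EventArgs.Empty);
            poseStart = null;   // require leaving and re-entering the pose before firing again
        }
    }
}
```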
“Pointer” – Xbox equivalent: the Kinect experience, menu navigation in Kinectimals and Kinect Adventures
One of the user’s body parts (typically, but not necessarily, a hand) acts as the pointer on screen. When it is over an interactive element, the element reacts, similar to a “mouseover” effect. Keeping the hand over such an element for a given time activates it (a click).
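The dwell-to-activate behavior boils down to a small timer. A sketch, where the class name and the two-second dwell duration are assumptions for illustration:

```csharp
using System;

// Sketch of hover-to-activate ("dwell click") logic; names and timing are illustrative.
public class HoverActivator
{
    private readonly TimeSpan dwellTime = TimeSpan.FromSeconds(2); // assumed dwell duration
    private object hoveredElement;
    private DateTime hoverStart;

    public event EventHandler ElementActivated;

    // Call whenever the pointer (e.g. the tracked hand) moves; pass null when over nothing.
    public void UpdateHover(object elementUnderPointer, DateTime now)
    {
        if (elementUnderPointer != hoveredElement)
        {
            hoveredElement = elementUnderPointer;  // new element: restart the dwell timer
            hoverStart = now;
            return;
        }

        if (hoveredElement != null && now - hoverStart >= dwellTime)
        {
            var handler = ElementActivated;
            if (handler != null) handler(this, EventArgs.Empty); // acts as a "click"
            hoverStart = now;                      // avoid re-triggering every frame
        }
    }
}
```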
Two-hand gestures – equivalent: multitouch pinch-zoom and rotate on Surface
Both of the user’s hands are tracked. When a hand is at a certain distance from the shoulder, it is considered to be “touching” the screen.
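In code, the “touching” test is essentially one distance comparison per hand. A sketch, assuming Z grows toward the camera and a 0.4 m threshold; both the coordinate convention and the number are illustrative guesses:

```csharp
// Sketch: a hand counts as "touching" when extended far enough in front of its shoulder.
public struct Vector3 { public float X, Y, Z; }

public static class TwoHandTouch
{
    public static bool IsTouching(Vector3 hand, Vector3 shoulder, float threshold = 0.4f)
    {
        float forwardExtension = shoulder.Z - hand.Z; // how far the hand is in front of the shoulder
        return forwardExtension > threshold;          // threshold should be designer-configurable
    }
}
```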
Gesture detector – detects gestures such as Push, Swipe, Steady, Wave, Circle, etc. (as in NITE) with any body part. Thresholds should be configurable, relative to the person’s size.
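One way to express “relative to the person’s size” is to store thresholds as fractions of the user’s height instead of absolute meters. A tiny illustrative sketch; the names and numbers are assumptions:

```csharp
// Sketch of a size-relative gesture threshold: a swipe must cover some fraction of the
// user's height rather than a fixed distance in meters.
public class SwipeSettings
{
    public double MinTravelFractionOfHeight = 0.25; // swipe must span 25% of the user's height
    public double MaxDurationSeconds = 0.5;         // and complete within half a second

    public double MinTravelMeters(double userHeightMeters)
    {
        return userHeightMeters * MinTravelFractionOfHeight;
    }
}
```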
3D joystick
Body parts (such as a hand, the head or the center of mass) can be used as a simple 3D joystick. The zero point can be calibrated using a calibration gesture. The designer should be able to set dead zones and a mapping function. The latter is needed for things like height mapping for the center of mass: standing on tiptoe raises it a lot less than crouching lowers it.
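A sketch of the joystick idea, with a calibrated zero point, a dead zone and a pluggable mapping function. The 3x amplification of upward motion is just an illustrative guess at how the tiptoe/crouch asymmetry could be compensated; none of these names are existing API.

```csharp
using System;

// Sketch of a 3D joystick driven by a body part, showing only the vertical axis.
public class Joystick3D
{
    private double zeroX, zeroY, zeroZ;     // captured by a calibration gesture
    public double DeadZone = 0.05;          // meters; smaller movements are ignored (assumed value)
    public Func<double, double> MapY = DefaultHeightMap;

    public void Calibrate(double x, double y, double z)
    {
        zeroX = x; zeroY = y; zeroZ = z;
    }

    public double AxisY(double y)
    {
        double delta = y - zeroY;
        if (Math.Abs(delta) < DeadZone) return 0;
        return MapY(delta);
    }

    // Assumed mapping: amplify upward motion so tiptoe and crouch feel symmetric.
    private static double DefaultHeightMap(double delta)
    {
        return delta > 0 ? delta * 3.0 : delta;
    }
}
```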