Gestures, manipulations, and interactions (HTML)

With touch interactions, your app can translate and use physical gestures to emulate the direct manipulation of UI elements.

Touch interactions provide a natural, real-world experience when users interact with the elements on the screen. By contrast, interacting with an object through its properties window or other dialog box is considered indirect manipulation. Windows also supports touch interactions across input modes and devices, including touch, mouse, and pen.

The Windows Runtime platform APIs support user interactions through three types of interaction events: pointer, gesture, and manipulation.

  • Pointer events are used to get basic contact info such as location and device type, extended info such as pressure and contact geometry, and to support more complex interactions.
  • Gesture events are used to handle static single-finger interactions such as tapping and press-and-hold (double-tap and right-tap are derived from these basic gestures).
  • Manipulation events are used for dynamic multi-touch interactions such as pinching and stretching, and interactions that use inertia and velocity data such as panning/scrolling, zooming, and rotating.

Note: The info provided by the manipulation events doesn't identify the interaction. It specifies input data such as position, translation delta, and velocity. This data is then used to determine the manipulation and perform the interaction.
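To make the note about manipulation data concrete, here is a minimal sketch in plain JavaScript of how a translation delta and a velocity could be derived from two successive contact positions. The sample shape ({ x, y, t }) and the function name manipulationDelta are assumptions made for illustration only; they are not part of the Windows Runtime API, which reports this kind of data for you.

```javascript
// Hypothetical sample shape: { x, y, t }, with x and y in pixels and t in milliseconds.
function manipulationDelta(prev, curr) {
  const dx = curr.x - prev.x;
  const dy = curr.y - prev.y;
  const dt = curr.t - prev.t;
  // Guard against two samples that share a timestamp.
  const vx = dt > 0 ? dx / dt : 0;
  const vy = dt > 0 ? dy / dt : 0;
  return {
    translation: { x: dx, y: dy },
    velocity: { x: vx, y: vy } // pixels per millisecond
  };
}

const delta = manipulationDelta({ x: 100, y: 200, t: 0 }, { x: 130, y: 160, t: 20 });
// delta.translation is { x: 30, y: -40 }
// delta.velocity is { x: 1.5, y: -2 }
```

The result only describes the motion. Deciding whether that motion is a pan, a zoom, or a rotation is a separate step.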


Next, we describe the relationship between gestures, manipulations, and interactions.


Gestures

A gesture is the physical act or motion performed on, or by, the input device (finger, fingers, pen/stylus, mouse, and so on). For example, to launch, activate, or invoke a command, you would use a single-finger tap on a touch or touchpad device (equivalent to a left-click with a mouse, a tap with a pen, or Enter on a keyboard).

Here is the basic set of touch gestures for manipulating the UI and performing an interaction.

Name            Type                  Description
Tap             Static gesture        One finger touches the screen and lifts up.
Press and hold  Static gesture        One finger touches the screen and stays in place.
Slide           Manipulation gesture  One or more fingers touch the screen and move in the same direction.
Swipe           Manipulation gesture  One or more fingers touch the screen and move a short distance in the same direction.
Turn            Manipulation gesture  Two or more fingers touch the screen and move in a clockwise or counter-clockwise arc.
Pinch           Manipulation gesture  Two or more fingers touch the screen and move closer together.
Stretch         Manipulation gesture  Two or more fingers touch the screen and move farther apart.
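As a rough illustration of the two static gestures in the table, the following sketch separates a tap from a press and hold by how long a single contact stays in place. The thresholds HOLD_MS and MOVE_TOLERANCE, the { x, y, t } sample shape, and the function name are assumptions chosen for this example; the platform uses its own values and reports these gestures for you.

```javascript
// Assumed thresholds, chosen only for illustration.
const HOLD_MS = 500;       // contact at least this long counts as press and hold
const MOVE_TOLERANCE = 10; // pixels of drift still treated as "in place"

// down and up are hypothetical samples: { x, y, t } in pixels and milliseconds.
function classifyStaticGesture(down, up) {
  const drift = Math.hypot(up.x - down.x, up.y - down.y);
  if (drift > MOVE_TOLERANCE) {
    return "manipulation"; // the finger moved, so this is not a static gesture
  }
  return (up.t - down.t) >= HOLD_MS ? "press and hold" : "tap";
}

classifyStaticGesture({ x: 50, y: 50, t: 0 }, { x: 52, y: 51, t: 120 });  // "tap"
classifyStaticGesture({ x: 50, y: 50, t: 0 }, { x: 52, y: 51, t: 800 });  // "press and hold"
classifyStaticGesture({ x: 50, y: 50, t: 0 }, { x: 120, y: 50, t: 300 }); // "manipulation"
```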



Manipulations

A manipulation is the immediate, ongoing reaction or response an object or UI has to a gesture. For example, both the slide and swipe gestures typically cause an element or UI to move in some way.

The final outcome of a manipulation, how it is manifested by the object on the screen and in the UI, is the interaction.


Interactions

Interactions depend on how a manipulation is interpreted and the command or action that results from the manipulation. For example, objects can be moved through both the slide and swipe gestures, but the results differ depending on whether a distance threshold is crossed. Slide can be used to drag an object or pan a view while swipe can be used to select an item or display the AppBar.

This section describes some common interactions.


Learning

The press and hold gesture displays detailed info or teaching visuals (for example, a tooltip or context menu) without committing to an action or command. Panning is still possible if a sliding gesture is started while the visual is displayed. For more info, see Guidelines for visual feedback.

Learn interaction


Commanding

The tap gesture invokes a primary action, for example launching an app or executing a command.

Command interaction


Panning

The slide gesture is used primarily for panning interactions but can also be used for moving, drawing, or writing. Panning is a touch-optimized technique for navigating short distances over small sets of content within a single view (such as the folder structure of a computer, a library of documents, or a photo album). Equivalent to scrolling with a mouse or keyboard, panning is necessary only when the amount of content in the view causes the content area to overflow the viewable area. For more info, see Guidelines for panning.
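The overflow condition described above can be sketched in a few lines. This is a simplified one-axis model, not platform code: the function names, the pixel units, and the sign convention for the slide delta are all assumptions made for the example.

```javascript
// Panning is needed only when the content is larger than the viewable area.
function needsPanning(contentSize, viewportSize) {
  return contentSize > viewportSize;
}

// Apply a slide delta (finger movement in pixels along the pan axis) to the
// current offset into the content, keeping the result within the valid range.
// Assumed convention: a negative delta moves the view further into the content.
function panBy(offset, delta, contentSize, viewportSize) {
  const maxOffset = Math.max(0, contentSize - viewportSize);
  return Math.min(maxOffset, Math.max(0, offset - delta));
}

needsPanning(2400, 800);        // true
panBy(100, -250, 2400, 800);    // 350
panBy(100, 300, 2400, 800);     // 0 (clamped at the start of the content)
panBy(1500, -400, 2400, 800);   // 1600 (clamped at the end of the content)
needsPanning(600, 800);         // false, the content fits and there is nothing to pan
```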

Panning interaction


Zooming

The pinch and stretch gestures are used for three types of interactions: optical zoom, resizing, and Semantic Zoom.

Optical zoom and resizing

Optical zoom adjusts the magnification level of the entire content area to get a more detailed view of the content. In contrast, resizing is a technique for adjusting the relative size of one or more objects within a content area without changing the view into the content area. The top two images here show an optical zoom, and the bottom two images show resizing a rectangle on the screen without changing the size of any other objects. For more info, see Guidelines for optical zoom and resizing.
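The difference between the two interactions can be shown with a small model. In this sketch, a pinch or stretch produces a scale factor from the change in distance between two contacts; optical zoom applies that factor to the whole view, while resizing applies it to a single object. The view shape ({ zoom, objects }) and all function names are hypothetical.

```javascript
function distanceBetween(a, b) {
  return Math.hypot(b.x - a.x, b.y - a.y);
}

// Scale factor for a two-finger gesture: greater than 1 for a stretch,
// less than 1 for a pinch. Returns 1 if the fingers started at the same point.
function pinchScale(startA, startB, endA, endB) {
  const startDistance = distanceBetween(startA, startB);
  return startDistance > 0 ? distanceBetween(endA, endB) / startDistance : 1;
}

// Optical zoom: one magnification level for the entire content area.
function opticalZoom(view, scale) {
  return { ...view, zoom: view.zoom * scale };
}

// Resizing: only the chosen object changes size; the view is untouched.
function resizeObject(view, id, scale) {
  const objects = view.objects.map(o =>
    o.id === id ? { ...o, width: o.width * scale, height: o.height * scale } : o
  );
  return { ...view, objects };
}

const view = {
  zoom: 1,
  objects: [{ id: "a", width: 40, height: 30 }, { id: "b", width: 10, height: 10 }]
};
const scale = pinchScale({ x: 100, y: 100 }, { x: 200, y: 100 },
                         { x: 50, y: 100 }, { x: 250, y: 100 }); // 2 (a stretch)

const zoomed = opticalZoom(view, scale);        // zoom is 2, object sizes unchanged
const resized = resizeObject(view, "a", scale); // "a" is 80 x 60, "b" and zoom unchanged
```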

Optical zoom interaction

Resizing interaction

Semantic Zoom

Semantic Zoom is a touch-optimized technique for presenting and navigating structured data or content within a single view (such as the folder structure of a computer, a library of documents, or a photo album) without the need for panning, scrolling, or tree view controls. Semantic Zoom provides two different views of the same content by letting you see more detail as you zoom in and less detail as you zoom out. For more information, see Guidelines for Semantic Zoom.
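As a loose illustration of "two views of the same content", the sketch below builds a zoomed-out view (groups with counts) and a zoomed-in view (the items in one group) from the same list. Grouping by first letter, the function names, and the sample file names are assumptions for the example; this models the idea only and is not the Semantic Zoom control.

```javascript
// Zoomed-out view: one entry per group, with less detail.
function zoomedOutView(items) {
  const groups = new Map();
  for (const name of items) {
    const letter = name[0].toUpperCase();
    groups.set(letter, (groups.get(letter) || 0) + 1);
  }
  // Map preserves insertion order, so groups appear in first-seen order.
  return [...groups].map(([letter, count]) => ({ letter, count }));
}

// Zoomed-in view: the full items that belong to one group.
function zoomedInView(items, letter) {
  return items.filter(name => name[0].toUpperCase() === letter);
}

const photos = ["Alps.jpg", "Andes.jpg", "Beach.jpg"];
zoomedOutView(photos);      // [{ letter: "A", count: 2 }, { letter: "B", count: 1 }]
zoomedInView(photos, "A");  // ["Alps.jpg", "Andes.jpg"]
```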

Semantic zoom interaction


Rotating

The rotate gesture simulates the experience of rotating a piece of paper on a flat surface. To perform it, place two fingers on the object and either pivot one finger around the other or pivot both fingers around a center point, swiveling the hand in the desired direction. You can use two fingers from the same hand, or one from each hand. For more information, see Guidelines for rotation.
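One way to turn two moving contacts into a rotation is to compare the angle of the line between them before and after the movement. The sketch below does this with Math.atan2; the function names and the { x, y } point shape are assumptions for illustration, and the platform's own rotation data may be computed differently.

```javascript
// Angle, in radians, of the line from point a to point b.
function angleBetween(a, b) {
  return Math.atan2(b.y - a.y, b.x - a.x);
}

// Rotation, in degrees, between the starting and ending finger positions.
// In screen coordinates, where y grows downward, a positive result is clockwise.
function rotationDegrees(startA, startB, endA, endB) {
  const raw = angleBetween(endA, endB) - angleBetween(startA, startB);
  // Wrap the difference into the range -180 to 180 degrees so that crossing
  // the +/-180 boundary does not show up as a near-full turn.
  const wrapped = Math.atan2(Math.sin(raw), Math.cos(raw));
  return wrapped * 180 / Math.PI;
}

// One finger stays at the origin while the other pivots a quarter turn around it.
const turn = rotationDegrees({ x: 0, y: 0 }, { x: 100, y: 0 },
                             { x: 0, y: 0 }, { x: 0, y: 100 });
// turn is approximately 90
```

Because only the angle of the line between the two contacts matters, the same function covers both cases described above: one finger pivoting around the other, and both fingers pivoting around a center point.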

Rotation interaction

Selecting and moving

The slide and swipe gestures are used in a cross-slide manipulation, a movement perpendicular to the panning direction of the content area. This is interpreted as either a selection or, if a distance threshold is crossed, a move (drag) interaction. This diagram describes these processes. For more info, see Guidelines for cross-slide.
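The decision described above can be sketched as a small classifier. Here the movement is split into a component along the panning direction and a component perpendicular to it; perpendicular movement is treated as a selection unless it exceeds a distance threshold, in which case it becomes a drag. The DRAG_THRESHOLD value, the function name, and the returned labels are assumptions for this example only.

```javascript
const DRAG_THRESHOLD = 20; // hypothetical distance threshold, in pixels

// dx, dy: total finger movement in pixels.
// panAxis: "horizontal" or "vertical", the panning direction of the content area.
function classifyCrossSlide(dx, dy, panAxis) {
  if (dx === 0 && dy === 0) return "none";
  const along = panAxis === "horizontal" ? dx : dy;
  const across = panAxis === "horizontal" ? dy : dx;
  // Movement that is mostly along the panning axis is left to panning.
  if (Math.abs(along) >= Math.abs(across)) return "pan";
  // Perpendicular movement: a short distance selects, a longer one drags.
  return Math.abs(across) > DRAG_THRESHOLD ? "drag" : "select";
}

classifyCrossSlide(3, 12, "horizontal");  // "select"
classifyCrossSlide(2, 60, "horizontal");  // "drag"
classifyCrossSlide(80, 5, "horizontal");  // "pan"
classifyCrossSlide(-40, 4, "vertical");   // "drag"
```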

Select and drag and drop interactions

Displaying command bars

The swipe gesture reveals various command bars or the login screen.

App commands are revealed by swiping from the bottom or top edge of the screen. Use the AppBar to display app commands.

Display app commands

System commands are revealed by swiping from the right edge, and recently used apps are revealed by swiping from the left edge. Swiping from the top edge to the bottom edge reveals docking or closing commands.
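The edge-to-result mapping described in this section can be summarized as a lookup. The sketch below is purely a model of that mapping and does not show how Windows detects edge swipes. The EDGE_MARGIN value, the function name, and the screen shape ({ width, height }) are assumptions.

```javascript
const EDGE_MARGIN = 20; // hypothetical width of the edge region, in pixels

// start, end: { x, y } positions where the swipe began and finished.
// screen: { width, height } in pixels.
function edgeSwipeResult(start, end, screen) {
  const fromTop = start.y <= EDGE_MARGIN;
  const fromBottom = start.y >= screen.height - EDGE_MARGIN;
  const fromLeft = start.x <= EDGE_MARGIN;
  const fromRight = start.x >= screen.width - EDGE_MARGIN;
  const reachedBottom = end.y >= screen.height - EDGE_MARGIN;

  // A swipe that runs from the top edge all the way to the bottom edge.
  if (fromTop && reachedBottom) return "docking or closing commands";
  if (fromTop || fromBottom) return "app commands";
  if (fromRight) return "system commands";
  if (fromLeft) return "recently used apps";
  return "none";
}

const screen = { width: 1366, height: 768 };
edgeSwipeResult({ x: 600, y: 5 }, { x: 600, y: 80 }, screen);     // "app commands"
edgeSwipeResult({ x: 600, y: 5 }, { x: 600, y: 760 }, screen);    // "docking or closing commands"
edgeSwipeResult({ x: 1360, y: 400 }, { x: 1250, y: 400 }, screen); // "system commands"
edgeSwipeResult({ x: 3, y: 400 }, { x: 120, y: 400 }, screen);     // "recently used apps"
```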

Display system commands


Related topics

Responding to user interaction

Touch interaction design





Samples (DOM)

HTML scrolling, panning and zooming sample

Input: DOM pointer event handling sample

Input: Instantiable gestures sample

Samples (Windows Store app APIs)

Input: Gestures and manipulations with GestureRecognizer

Input: XAML user input events sample

XAML scrolling, panning, and zooming sample

Samples (DirectX)

DirectX touch input sample

Input: Manipulations and gestures (C++) sample