Get started using AI-backed features and APIs in your Windows app
Article
Windows Copilot Runtime offers a variety of AI-backed features and APIs that let you to tap into AI functionality without the need to find, run, or optimize your own Machine Learning (ML) model. The models that power Windows Copilot Runtime on Copilot+ PCs run locally and in the background at all times.
Windows Copilot Runtime features and APIs for Windows apps
Windows Copilot Runtime includes the following features and AI-backed APIs (in the Windows App SDK) powered by models running locally on the Windows device.
Phi Silica: Not yet available. The Phi Silica APIs will ship in the Windows App SDK. Similar to OpenAI's GPT Large Language Model (LLM) that powers ChatGPT, Phi is a Small Language Model (SLM) developed by Microsoft Research to perform language-processing tasks on a local device. Phi Silica is specifically designed for Windows devices with a Neural Processing Unit (NPU), allowing text generation and conversation features to run in a high performance, hardware-accelerated way directly on the device.
Text Recognition with OCR: Not yet available. The Text Recognition APIs (also referred to as Optical Character Recognition or OCR) will ship in the Windows App SDK. These APIs enable the recognition of text in an image and the conversion of different types of documents (such as scanned paper documents, PDF files, or images captured by a digital camera) into editable and searchable data on a local device.
Imaging APIs: Not yet available. The AI-enhanced Imaging APIs will ship in the Windows App SDK. These APIs perform a variety of actions such as intelligently scaling images and identifying objects within images.
Studio Effects: Available in Windows 11, version 22H2 or newer (Build 22623.885+), on Copilot+ PCs. Windows devices with compatible Neural Processing Units (NPUs) integrate Studio Effects into the built-in device camera and microphone settings. Apply special effects that utilize AI, including: Background Blur, Eye Contact correction, Automatic Framing, Portrait Light correction, Creative Filters, or Voice Focus for filtering out background noise.
Recall: Available for preview via Windows Insiders Program on Copilot+ PCs. Recall enables users to quickly find things from their past activity, such as documents, images, websites and more. Developers can enrich the user's Recall experience with their app by adding contextual information to the underlying vector database with the User Activity API. This integration will help users pick up where they left off in your app, improving app engagement and user's seamless flow between Windows and your app.
Live Caption Translations help everyone on Windows, including those who are deaf or hard of hearing, better understand audio by viewing captions of spoken content (even when the audio content is in a language different from the system's preferred language).
Cloud-based, AI-backed APIs for Windows apps
You may also be interested in using APIs that run models in the cloud to power AI features that can be added to your Windows app. A few examples of cloud-based AI-backed APIs offered by Microsoft or OpenAI include:
Azure OpenAI Service: If you want your Windows app to access OpenAI models, such as GPT-4, GPT-4 Turbo with Vision, GPT-3.5-Turbo, DALLE-3 or the Embeddings model series, with the added security and enterprise capabilities of Azure, you can find guidance in this Azure OpenAI documentation.
Azure AI Services: Azure offers an entire suite of AI services available through REST APIs and client library SDKs in popular development languages. For more information, see each service's documentation. These cloud-based services help developers and organizations rapidly create intelligent, cutting-edge, market-ready, and responsible applications with out-of-the-box and prebuilt and customizable APIs and models. Example applications include natural language processing for conversations, search, monitoring, translation, speech, vision, and decision-making.
Considerations for using local versus cloud-based AI-backed APIs in your Windows app
When deciding between using an API in your Windows app that relies on running an ML model locally versus in the cloud, there are several advantages and disadvantages to consider.
Resource Availability
Local Device: Running a model depends on the resources available on the device being used, including the CPU, GPU, NPU, memory, and storage capacity. This can be limiting if the device does not have high computational power or sufficient storage. Small Language Models (SLMs), like Phi, are more ideal for use locally on a device.
Cloud: Cloud platforms, such as Azure, offer scalable resources. You can use as much computational power or storage as you need and only pay for what you use. Large Language Models (LLMs), like the OpenAI language models, require more resources, but are also more powerful.
Data Privacy and Security
Local Device: Since data remains on the device, running a model locally can be more secure and private. The responsibility of data security rests on the user.
Cloud: Cloud providers offer robust security measures, but data needs to be transferred to the cloud, which might raise data privacy concerns in some cases.
Accessibility and Collaboration
Local Device: The model and data are accessible only on the device unless shared manually. This has the potential to make collaboration on model data more challenging.
Cloud: The model and data can be accessed from anywhere with internet connectivity. This may be better for collaboration scenarios.
Cost
Local Device: There is no additional cost beyond the initial investment in the device.
Cloud: While cloud platforms operate on a pay-as-you-go model, costs can accumulate based on the resources used and the duration of usage.
Maintenance and Updates
Local Device: The user is responsible for maintaining the system and installing updates.
Cloud: Maintenance, system updates, and new feature updates are handled by the cloud service provider, reducing maintenance overhead for the user.
Learn how to get started with Microsoft Copilot, understand its main functions and benefits, and explore strategies for effective interaction. This module covers GPT model fundamentals, setup, and tips for engaging conversations.