What are the AI capabilities for Azure HorizonDB (Preview)

Generative AI is transforming how applications interact with data. As organizations move beyond basic chatbots toward retrieval-augmented generation (RAG), autonomous agents, and intelligent search, one thing is clear: data is the foundation of intelligence. Raw data becomes knowledge when it's structured, embedded, and made searchable, and knowledge becomes intelligence when AI models can reason over it, retrieve what's relevant, and take action.

Azure HorizonDB brings this full stack into PostgreSQL. Instead of stitching together separate services for embeddings, vector search, reranking, and orchestration, you get a single database that handles your operational data and your AI workloads together, with SQL as the interface.

Key concepts

If you're new to generative AI, this section introduces the core concepts that underpin AI applications and agents. Each concept builds on the previous one.

Large language models (LLMs)

A large language model (LLM) is an AI model trained on massive amounts of text data to understand and generate human-like language. LLMs use deep learning architectures (primarily transformers) with billions or trillions of parameters that capture complex patterns in language. Models like GPT-4o, GPT-5, and open-source alternatives (Llama, Mistral) can perform a wide range of tasks: text generation, summarization, translation, code generation, question answering, and more.

LLMs are powerful but have a key limitation: they only know what was in their training data. They can't access your private business data, and their knowledge has a cutoff date. This limitation is what makes the next concept, RAG, essential.

Retrieval-augmented generation (RAG)

Retrieval-augmented generation (RAG) is a pattern that addresses the limitation of large language models (LLMs) by grounding their responses in your actual data. Instead of relying solely on what the model learned during training, a RAG system retrieves relevant documents from a data source and passes them as context to the LLM before it generates a response.

A typical RAG flow has three steps:

  1. Retrieve: Search your data (using vector search, keyword search, or hybrid techniques) to find content relevant to the user's query.
  2. Augment: Include the retrieved content in the prompt sent to the LLM, providing factual context the model wouldn't otherwise have.
  3. Generate: The LLM produces a response grounded in the retrieved information, which reduces fabricated or outdated answers.
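
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not Azure HorizonDB code: the `DOCS` list, the word-overlap scoring in `retrieve`, and the `fake_llm` stub are hypothetical stand-ins for a real search index and a real model call.

```python
# Toy RAG flow: retrieve -> augment -> generate.
# DOCS, retrieve, build_prompt, and fake_llm are hypothetical names for illustration.

DOCS = [
    "Returns are accepted within 30 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
    "Gift cards never expire and cannot be refunded.",
]

def tokenize(text):
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}

def retrieve(query, docs, k=1):
    """Step 1: rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_docs):
    """Step 2: place the retrieved text in the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def fake_llm(prompt):
    """Step 3 stand-in: a real system would call a language model here."""
    context_lines = [line[2:] for line in prompt.splitlines() if line.startswith("- ")]
    return "Based on the provided context: " + " ".join(context_lines)

question = "How many days does shipping take?"
context_docs = retrieve(question, DOCS)
answer = fake_llm(build_prompt(question, context_docs))
print(answer)
```

In a production system, the word-overlap ranking would typically be replaced by vector, keyword, or hybrid search, and the stub by a call to a hosted model.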

RAG is the foundation of most production AI applications, from customer-facing Q&A systems to internal knowledge assistants. The quality of a RAG system depends heavily on how well your data is prepared, embedded, indexed, and searched.

To learn more, see Retrieval-augmented generation (RAG).

AI agents

AI agents go beyond RAG by adding a reasoning loop. Where a RAG application follows a fixed retrieve-then-generate pipeline, an agent uses an LLM to plan, decide which tools to call, retrieve information, evaluate results, and self-correct, completing multistep tasks autonomously. Agents combine a model, instructions, tools, and persistent memory to operate across sessions and workflows. Because agents need durable storage, access to knowledge, and scalable infrastructure, the choice of database is critical to their design.
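
The reasoning loop can be sketched as follows. This is a simplified illustration under stated assumptions: the `plan` function is a rule-based stand-in for an LLM planner, the tools (`lookup_order`, `draft_reply`) are hypothetical, and `memory` is an in-process list where a real agent would use durable storage.

```python
# Toy agent loop: plan -> call a tool -> record the result -> repeat.
# Every name below is hypothetical; a real agent would ask an LLM to plan.

def lookup_order(order_id):
    """Hypothetical tool: return the status of an order."""
    return {"A100": "shipped"}.get(order_id, "unknown")

def draft_reply(status):
    """Hypothetical tool: turn a status into a customer-facing message."""
    return f"Your order status is: {status}."

TOOLS = {"lookup_order": lookup_order, "draft_reply": draft_reply}

def plan(goal, memory):
    """Stand-in for the LLM: choose the next action from what is in memory."""
    tools_used = [step["tool"] for step in memory]
    if "lookup_order" not in tools_used:
        return {"tool": "lookup_order", "arg": goal["order_id"]}
    if "draft_reply" not in tools_used:
        return {"tool": "draft_reply", "arg": memory[-1]["result"]}
    return None  # Nothing left to do: the goal is complete.

def run_agent(goal, max_steps=5):
    """Run the loop until the planner says it is done or the step budget runs out."""
    memory = []  # In production this would be persisted across sessions.
    for _ in range(max_steps):
        action = plan(goal, memory)
        if action is None:
            break
        result = TOOLS[action["tool"]](action["arg"])
        memory.append({**action, "result": result})
    return memory

steps = run_agent({"order_id": "A100"})
for step in steps:
    print(step["tool"], "->", step["result"])
```

The `max_steps` budget is a common safeguard against a planner that never decides it is finished.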

To learn more, see What are AI agents?

Vectors, embeddings, and vector search

A vector is a mathematical object: an ordered array of numbers that represents a point in multidimensional space. In AI, vectors encode the meaning of content (text, images, records) so that machines can compare, search, and reason over it numerically.

An embedding is a specific type of vector produced by a machine learning model, where semantically similar content maps to nearby points in vector space. For example, the phrases "lightweight laptop for travel" and "ultraportable notebook under 1 kg" produce embeddings that are geometrically close together, even though they share no words. Embedding models such as text-embedding-3-small or text-embedding-ada-002 perform this conversion, taking raw text (or other content) as input and outputting a dense vector of floating-point numbers.

The proximity between vectors is measured using vector similarity functions like cosine similarity, inner product, or Euclidean distance. Vector search uses this property to find content by meaning rather than keywords. At query time, the user's question is converted into a vector using the same embedding model, and the database finds the stored vectors closest to the query vector, returning the most semantically relevant results. Vector search is the core retrieval mechanism behind RAG. When combined with keyword search and other techniques like semantic reranking, it forms a comprehensive retrieval strategy. For a detailed look at all available retrieval techniques, see Retrieval foundations: vector, full-text, and hybrid search in Azure HorizonDB (Preview).
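
The similarity functions named above, and a brute-force nearest-neighbor search built on one of them, can be sketched in plain Python. The three-dimensional vectors in `STORE` are made-up values for illustration only; real embedding models output hundreds or thousands of dimensions, and a database would use an index rather than scanning every row.

```python
import math

def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Inner product divided by the product of the vector lengths."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )

def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical embeddings, hand-picked so that the two laptop phrases are close.
STORE = {
    "lightweight laptop for travel": [0.9, 0.1, 0.0],
    "ultraportable notebook under 1 kg": [0.8, 0.2, 0.1],
    "cast iron skillet": [0.0, 0.1, 0.9],
}

def search(query_vec, store, k=2):
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# In practice the query vector comes from the same embedding model as the stored vectors.
query_vec = [0.85, 0.15, 0.05]
results = search(query_vec, STORE)
print(results)
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default for text embeddings; higher is more similar, whereas for Euclidean distance lower is more similar.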

To see an interactive visualization of how vector similarity works, see Vectors comparison.

The role of databases in AI

Every AI pattern, whether it's RAG, agents, or fine-tuning, starts with data. But the relationship between AI and databases goes deeper than simple storage. As AI applications move from prototypes to production, the database becomes the critical infrastructure layer that determines scalability, reliability, and data freshness.

  • Data is the source of knowledge. LLMs are only as good as the context they receive. Your business data (product catalogs, support tickets, policy documents, customer records, and more) needs to be chunked, embedded, indexed, and kept in sync. The database orchestrates this entire data-to-knowledge pipeline.
  • Persistent memory for stateful applications. Chatbots and agents need to remember conversation history, user preferences, and task progress across sessions. Without durable, transactional storage, every interaction starts from zero.
  • Unified multimodal storage. AI workloads involve relational records, JSON documents, vector embeddings, graph relationships, and geospatial data. Managing these types of data across separate specialized systems introduces synchronization complexity, consistency risks, and operational overhead. A database that handles all of these natively removes that fragmentation.
  • Production-grade reliability. Prototype AI apps can use in-memory stores or flat files. Production systems need ACID transactions, point-in-time recovery, high availability, and security. Mature database systems provide these capabilities out of the box.
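
As a rough illustration of the "chunked ... and kept in sync" part of the first bullet, the sketch below splits text into overlapping word windows and uses a content hash to decide which chunks need a new embedding. The window sizes, the function names, and the hash-keyed `index` dictionary are assumptions made for this example; they do not describe how Azure HorizonDB pipelines work internally.

```python
import hashlib

def chunk_words(text, size=8, overlap=2):
    """Split text into windows of `size` words that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # The last window already reached the end of the text.
    return chunks

def fingerprint(chunk):
    """Stable content hash used to detect whether a chunk has changed."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()

def needs_embedding(chunks, index):
    """Return the chunks whose fingerprint is not yet present in the index."""
    return [chunk for chunk in chunks if fingerprint(chunk) not in index]

text = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(text)

# Hypothetical index: fingerprint -> embedding. Only the first chunk is embedded so far.
index = {fingerprint(chunks[0]): [0.0, 0.0, 0.0]}
stale = needs_embedding(chunks, index)
print(len(chunks), "chunks,", len(stale), "need embedding")
```

Overlapping windows help keep a sentence that straddles a boundary retrievable from at least one chunk, and hashing avoids recomputing embeddings for content that has not changed.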

PostgreSQL is uniquely suited for AI workloads because it handles relational data, JSON, vectors, graphs, and full-text search in a single transactional system backed by decades of ecosystem maturity, extensibility, and broad framework support. Azure HorizonDB builds on PostgreSQL with managed infrastructure, built-in AI functions, model management, and durable pipelines purpose-built for AI workloads. For a deeper dive, see Why PostgreSQL and Azure HorizonDB for AI agents.

AI capabilities in Azure HorizonDB

Diagram showing AI capabilities in Azure HorizonDB organized as a top-down flow: Build AI agents and apps, AI functions in SQL, Data preparation and pipelines, Search and retrieval with subsections for improving performance and enhancing relevance, all on the Azure HorizonDB PostgreSQL foundation.

AI functions in SQL

Call AI models directly from SQL queries with no application code required.

Data preparation and pipelines

Prepare your data for AI retrieval by using automated, fault-tolerant workflows.

Search and retrieval

Find the right information using multiple retrieval strategies, individually or combined.

Improve search performance

As your dataset grows, indexing strategies become critical for maintaining fast query response times.

Enhance search relevance

Retrieval is only the first step. Enhance accuracy and depth with second-stage scoring and structured knowledge.

Build AI agents and apps

Connect Azure HorizonDB to agent frameworks, orchestration services, and tools.

Samples and tutorials

Get started

Azure HorizonDB gives you a single platform to go from raw data to production AI: embedding generation, vector and hybrid search, semantic reranking, knowledge graphs, durable pipelines, and agent integration, all within PostgreSQL and all accessible through SQL. Whether you're building your first RAG application or deploying multi-agent systems at scale, the capabilities described in this article work together as a complete, integrated stack. Explore the linked articles to dive deeper into each capability.

To go further, visit the PostgreSQL Hub for Azure Developers: a one-stop shop for curated code samples, solution accelerators, tutorials, structured learning pathways, and a growing developer community where you can connect with Microsoft and ecosystem experts.