Azure AI Foundry, explained in one diagram | Azure AI Foundry Architecture

A one-diagram walkthrough of Azure AI Foundry architecture — Hub, Projects, Model Catalog, Prompt Flow, and key Azure service connections.

📅 26 June 2026 · 0:50 · ✍️ Rahul Kumar

What Is Azure AI Foundry and Why Does It Matter for Enterprise Architects

Azure AI Foundry is Microsoft's unified platform for building, deploying, evaluating, and governing enterprise AI applications at scale. It consolidates capabilities that were previously spread across Azure OpenAI Studio, Azure Machine Learning, and standalone Cognitive Services into a single workspace with shared identity, networking, and governance controls. For enterprise architects evaluating AI platforms in 2025 and beyond, understanding the Foundry architecture is essential: it is the layer through which Microsoft expects organizations to operationalize AI at production scale.

The single-diagram explainer format is effective precisely because Foundry's architecture has a clear hierarchy: a Hub at the top, Projects beneath it, and a set of connected Azure services and model catalogs that Projects consume. Once that hierarchy clicks, the rest of the platform makes sense.

The Hub: Governance and Shared Infrastructure

The Azure AI Foundry Hub is the top-level organizational construct. It is an Azure resource that encapsulates shared infrastructure for one or more AI teams or workloads within your organization. A Hub owns and manages the following shared resources:

  • Azure Storage Account — stores datasets, model artifacts, prompt flow outputs, and evaluation results
  • Azure Key Vault — centralized secret management for API keys, connection strings, and credentials shared across all Projects under the Hub
  • Azure Container Registry — stores custom container images used in training and inference pipelines
  • Application Insights — unified monitoring and telemetry across all Projects
  • Managed Virtual Network — optional private networking that governs how Hub-connected resources communicate, enabling full private endpoint isolation

From a governance perspective, the Hub is where your platform team establishes baseline security controls. Role-based access control (RBAC) assignments made at the Hub level are inherited by every Project beneath it, so Hub-level roles suit the platform team, while separation between business units comes from scoping everyone else's roles to individual Projects. This lets business units share infrastructure costs without sharing data access.
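To make the inheritance rule concrete, here is a minimal sketch in plain Python. It does not call any Azure SDK; the `Hub` and `Project` classes, the principal names, and the role names are all hypothetical and only illustrate how a Hub-scoped assignment reaches every Project while a Project-scoped one does not.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    # Role assignments made directly on this Project: principal -> roles.
    assignments: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class Hub:
    name: str
    # Role assignments made at Hub scope; every Project inherits these.
    assignments: dict[str, set[str]] = field(default_factory=dict)
    projects: dict[str, Project] = field(default_factory=dict)

    def add_project(self, name: str) -> Project:
        project = Project(name)
        self.projects[name] = project
        return project

    def effective_roles(self, principal: str, project_name: str) -> set[str]:
        """Union of inherited Hub-level roles and direct Project-level roles."""
        project = self.projects[project_name]
        inherited = self.assignments.get(principal, set())
        direct = project.assignments.get(principal, set())
        return inherited | direct


# Hypothetical Hub with two Projects.
hub = Hub("retail-hub")
hub.add_project("chatbot")
hub.add_project("forecasting")

# A platform engineer is assigned at Hub scope, so the role applies everywhere.
hub.assignments["platform-eng"] = {"Reader"}
# A data scientist is assigned on a single Project only.
hub.projects["chatbot"].assignments["data-sci"] = {"Contributor"}

print(hub.effective_roles("platform-eng", "forecasting"))  # inherited from the Hub
print(hub.effective_roles("data-sci", "chatbot"))          # direct assignment
print(hub.effective_roles("data-sci", "forecasting"))      # empty set: no access
```

The last call returns an empty set, which is the isolation property the platform team relies on: broad roles go on the Hub sparingly, everything else is scoped to a Project.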

Projects: The Developer and Data Scientist Workspace

Each Azure AI Foundry Project sits beneath a Hub and represents a scoped workspace for a specific application, team, or use case. A Project inherits the Hub's shared infrastructure but has its own isolated context for experiments, deployments, and connection configurations. Key characteristics of a Project include:

  • Isolated namespace for model deployments, prompt flows, and evaluations
  • Scoped RBAC — a data scientist can be granted access to one Project without seeing others under the same Hub
  • Independent connection overrides — a Project can define its own Azure OpenAI endpoint or Azure AI Search instance separate from Hub defaults
  • Tracing and telemetry scoped to the Project, making it possible to track cost and performance per application

The Hub-Project model is designed to mirror how enterprise AI teams are organized: a central platform team manages the Hub, while individual product teams own their Projects. In effect, it applies Azure Landing Zone principles to AI workloads.
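The "Hub defaults with Project overrides" behavior for connections can be pictured as a layered lookup. The sketch below uses Python's standard `collections.ChainMap`; the connection names and URLs are made up for illustration, and this is not how Foundry stores connections internally.

```python
from collections import ChainMap

# Hub-level default connections (hypothetical names and URLs).
hub_connections = {
    "azure_openai": "https://hub-default-openai.example.com",
    "ai_search": "https://hub-default-search.example.com",
}

# A Project overrides only the connections it needs to change.
project_overrides = {
    "ai_search": "https://team-a-search.example.com",
}

# ChainMap checks the first mapping, then falls back to the next one.
effective = ChainMap(project_overrides, hub_connections)

print(effective["ai_search"])     # https://team-a-search.example.com
print(effective["azure_openai"])  # https://hub-default-openai.example.com
```

The Project sees its own Azure AI Search instance but still picks up the Hub's Azure OpenAI endpoint, without copying the Hub configuration.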

The Model Catalog: One Interface, Many Providers

The Azure AI Foundry Model Catalog is the central registry for foundation models available within the platform. It is not limited to Azure OpenAI models. The catalog includes models from multiple sources:

  • Azure OpenAI Service — GPT-4o, GPT-4, o-series reasoning models, DALL-E, Whisper, and embeddings models
  • Meta — Llama 3 and Llama 3.1 families for organizations that need open-weight models with data residency control
  • Mistral AI — Mistral Large, Mistral Small, and instruction-tuned variants
  • Hugging Face — a broad library of open-source models deployable on managed compute
  • Cohere — embedding and rerank models suited for retrieval-augmented generation pipelines
  • Microsoft Research — Phi-3 and Phi-3.5 small language models optimized for edge and cost-sensitive inference

The catalog supports two deployment modes: Serverless API (pay-per-token, no infrastructure management) and Managed Compute (dedicated GPU clusters, full control over throughput and isolation). For enterprise architects, this distinction drives both cost and compliance decisions — regulated industries often require managed compute to control data residency and audit trails.
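The cost side of that decision comes down to a break-even calculation: pay-per-token cost grows with volume, while dedicated compute is a fixed monthly cost. The sketch below shows the arithmetic only. The prices are hypothetical placeholders, not Azure list prices, and the model ignores compliance requirements, which may force managed compute regardless of cost.

```python
HOURS_PER_MONTH = 730.0  # average number of hours in a month


def serverless_cost(tokens: int, price_per_million: float) -> float:
    """Pay-per-token: cost scales linearly with usage."""
    return tokens / 1_000_000 * price_per_million


def managed_cost(hourly_rate: float, instances: int = 1) -> float:
    """Dedicated compute: fixed cost regardless of token volume."""
    return hourly_rate * instances * HOURS_PER_MONTH


def break_even_tokens(price_per_million: float, hourly_rate: float,
                      instances: int = 1) -> float:
    """Monthly token volume at which both modes cost the same."""
    return managed_cost(hourly_rate, instances) / price_per_million * 1_000_000


# Hypothetical prices, chosen only to keep the arithmetic easy to follow.
PRICE_PER_MILLION = 2.0  # currency units per million tokens
HOURLY_RATE = 4.0        # currency units per instance-hour

print(managed_cost(HOURLY_RATE))                          # 2920.0
print(serverless_cost(500_000_000, PRICE_PER_MILLION))    # 1000.0
print(break_even_tokens(PRICE_PER_MILLION, HOURLY_RATE))  # 1460000000.0
```

With these placeholder numbers, a workload below roughly 1.46 billion tokens per month is cheaper on serverless, and a steadier, heavier workload favors managed compute.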

Azure AI Services: Pre-Built Cognitive Capabilities

Azure AI Services plugs into Foundry Projects as a set of pre-built, API-accessible capabilities. These are not custom model deployments; they are production-grade, Microsoft-managed endpoints for specific modalities:

  • Azure AI Vision — image analysis, OCR, face detection, and object recognition
  • Azure AI Speech — speech-to-text, text-to-speech, and real-time translation
  • Azure AI Language — entity recognition, sentiment analysis, summarization, and PII detection
  • Azure AI Translator — multi-language translation at scale
  • Azure AI Document Intelligence — structured data extraction from invoices, forms, and contracts

In the Foundry diagram, Azure AI Services sits as a connected service layer that Projects reference via Connections — a Foundry abstraction that stores endpoint URLs and credentials in Key Vault and surfaces them as named references within the workspace.
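The key idea behind a Connection is indirection: application code refers to a name, the endpoint lives in the connection record, and the credential lives in a secret store. The following sketch models that pattern with an in-memory stand-in for the vault. Every class, name, and URL here is hypothetical; a real setup would use Key Vault and the Foundry tooling instead.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    name: str
    endpoint: str
    secret_name: str  # pointer into the vault, never the secret itself


class InMemoryVault:
    """Stand-in for a secret store; a real setup would call Key Vault."""

    def __init__(self) -> None:
        self._secrets: dict[str, str] = {}

    def set_secret(self, name: str, value: str) -> None:
        self._secrets[name] = value

    def get_secret(self, name: str) -> str:
        return self._secrets[name]


class ConnectionRegistry:
    """Maps connection names to endpoints and looks up credentials on demand."""

    def __init__(self, vault: InMemoryVault) -> None:
        self._vault = vault
        self._connections: dict[str, Connection] = {}

    def register(self, connection: Connection) -> None:
        self._connections[connection.name] = connection

    def resolve(self, name: str) -> tuple[str, str]:
        connection = self._connections[name]
        credential = self._vault.get_secret(connection.secret_name)
        return connection.endpoint, credential


vault = InMemoryVault()
vault.set_secret("vision-key", "not-a-real-key")

registry = ConnectionRegistry(vault)
registry.register(Connection("vision", "https://vision.example.com", "vision-key"))

endpoint, credential = registry.resolve("vision")
print(endpoint)  # https://vision.example.com
```

Because callers only ever hold the name `"vision"`, rotating the key or repointing the endpoint does not require touching application code.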

Prompt Flow: Orchestration and Evaluation Built In

Prompt Flow is Foundry's built-in LLM orchestration and evaluation engine. It provides a directed acyclic graph (DAG) authoring environment where architects and developers can chain LLM calls, tool invocations, Python code, and retrieval steps into reproducible, version-controlled pipelines. Key capabilities include:

  • Flow authoring — visual or code-based composition of multi-step AI pipelines
  • Evaluation flows — first-class support for measuring groundedness, relevance, fluency, and custom metrics against test datasets
  • Variants — built-in A/B testing for prompt versions and model configurations
  • Batch runs — execute a flow against large datasets and collect structured output for offline evaluation
  • Deployment as endpoints — a finalized flow can be deployed as a managed online endpoint, callable from any application

For organizations that need to move beyond single-turn chat completions into complex, multi-step reasoning pipelines with audit trails and quality gates, Prompt Flow is the production path inside Foundry.
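To show what "a DAG of steps" means in practice, here is a small sketch that runs a three-node flow in dependency order using `graphlib.TopologicalSorter` from the Python standard library (3.9+). It is not the Prompt Flow SDK: the `run_flow` helper, the node names, and the stubbed LLM call are assumptions made for illustration.

```python
from __future__ import annotations

from graphlib import TopologicalSorter
from typing import Any, Callable

# Each node is (function, list of upstream node names).
Node = tuple[Callable[[dict[str, Any]], Any], list[str]]


def run_flow(nodes: dict[str, Node], flow_input: dict[str, Any]) -> dict[str, Any]:
    """Execute nodes in topological order, passing each its upstream outputs."""
    graph = {name: set(deps) for name, (_, deps) in nodes.items()}
    results: dict[str, Any] = {"input": flow_input}
    for name in TopologicalSorter(graph).static_order():
        if name == "input":  # the flow input is a source, not a node to run
            continue
        func, deps = nodes[name]
        results[name] = func({dep: results[dep] for dep in deps})
    return results


def retrieve(upstream: dict[str, Any]) -> list[str]:
    # Stub retrieval step; a real flow would query a search index here.
    return [f"doc about {upstream['input']['question']}"]


def build_prompt(upstream: dict[str, Any]) -> str:
    context = "\n".join(upstream["retrieve"])
    return f"Context:\n{context}\nQuestion: {upstream['input']['question']}"


def fake_llm(upstream: dict[str, Any]) -> str:
    # Stub model call so the sketch runs without any external service.
    return f"[stub answer] prompt had {len(upstream['build_prompt'])} characters"


flow: dict[str, Node] = {
    "retrieve": (retrieve, ["input"]),
    "build_prompt": (build_prompt, ["input", "retrieve"]),
    "llm": (fake_llm, ["build_prompt"]),
}

results = run_flow(flow, {"question": "what is a hub?"})
print(results["llm"])
```

Because every intermediate output is kept in `results`, the same structure lends itself to the batch runs and evaluation steps described above: each node's output can be logged and scored per input row.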

Key Connected Services: Azure OpenAI, Azure AI Search, and Azure ML

Three Azure services deserve special attention in the Foundry architecture diagram because they represent the most common integration points for enterprise AI applications:

  • Azure OpenAI Service — when a Foundry Project deploys a GPT or embedding model, the underlying resource is an Azure OpenAI deployment. Foundry wraps it with project-scoped access control and exposes it through the catalog, but the actual inference endpoint is Azure OpenAI.
  • Azure AI Search — the primary vector store and retrieval engine for RAG patterns within Foundry. Projects connect to an Azure AI Search instance to index documents, run hybrid search (keyword plus vector), and feed retrieved context to LLM calls inside Prompt Flow.
  • Azure Machine Learning — Foundry shares infrastructure lineage with Azure ML. Custom training jobs, MLflow experiment tracking, and compute cluster management are available through the ML integration layer, making Foundry the right surface for teams that need both fine-tuning workflows and prompt engineering in the same governance boundary.
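Hybrid search has to merge two differently scored result lists into one ranking. Azure AI Search documents Reciprocal Rank Fusion (RRF) as its merge method for hybrid queries; the sketch below is a generic RRF implementation with made-up document IDs and the commonly used constant `k = 60`, not the service's own code.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each document scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a keyword query and a vector query.
keyword_results = ["doc-a", "doc-b", "doc-c"]
vector_results = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([keyword_results, vector_results])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Documents that rank well in both lists (`doc-b` and `doc-a` here) rise to the top, which is why hybrid retrieval tends to feed more relevant context into the LLM call than either method alone.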

Azure AI Foundry vs Azure OpenAI Studio: When to Use Each

This is the question enterprise architects most frequently ask when evaluating the platform:

  • Use Azure OpenAI Studio when your team's scope is limited to OpenAI models specifically, you need rapid prototyping without organizational governance overhead, and you are not integrating with other Azure AI services in the same workflow
  • Use Azure AI Foundry when you need multi-model flexibility, when multiple teams or projects share infrastructure, when you need built-in evaluation and observability, when RAG pipelines require tight Azure AI Search integration, or when your compliance posture demands centralized RBAC and private networking across all AI workloads

In practice, Microsoft is actively consolidating the Azure OpenAI Studio experience into Foundry. The architectural direction is clear: Foundry is the long-term platform.

Architectural Takeaways

The Foundry Hub-Project model maps onto enterprise concepts you already understand: management groups, subscriptions, and resource groups in the landing zone world have rough counterparts in Hubs, Projects, and Connections in the Foundry world. Three design decisions matter most. First, where you draw the Hub boundary, typically per business unit or data classification level. Second, which model deployment mode you choose: serverless for variable load, managed compute for throughput guarantees and data isolation. Third, whether you build orchestration inside Prompt Flow or bring your own orchestration framework via the SDK. All three decisions have cost, compliance, and operational implications that compound at enterprise scale.

About the Author

Rahul Kumar is a Senior Cloud and AI Architect at Microsoft with 13+ years of enterprise experience across Azure, AWS, and GCP.
