I recently set up Azure AI Foundry for a project and picked up a few things along the way that I wish I’d known sooner.
OpenAI-Compatible Endpoints
The first surprise was that you don’t have to use the Azure SDK to talk to models hosted on Foundry. The service exposes OpenAI-compatible endpoints, which means you can plug it into any client that speaks the OpenAI API — chatbot frontends, CLI tools like opencode, or your own apps. Just point the base URL to your Foundry endpoint and use your Azure API key. No Azure-specific integration code needed.
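To make that concrete, here is a minimal stdlib-only sketch of the request any OpenAI-compatible client ends up building. The endpoint URL, model name, and key below are placeholders, not real values, and the exact base-URL path for your Foundry resource may differ:

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-style chat completion request against a Foundry endpoint.

    Any OpenAI-compatible client does essentially this under the hood:
    POST <base_url>/chat/completions with the API key in a header.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Azure accepts the key in an "api-key" header; OpenAI-style
            # clients send "Authorization: Bearer <key>" instead, which the
            # OpenAI-compatible endpoints typically also accept.
            "api-key": api_key,
        },
        method="POST",
    )

# Placeholder endpoint and key -- substitute your own resource's values.
req = build_chat_request(
    "https://YOUR-RESOURCE.openai.azure.com/openai/v1",
    "YOUR-API-KEY",
    "gpt-4o-mini",
    [{"role": "user", "content": "Hello"}],
)
# urllib.request.urlopen(req) would actually send it; omitted here.
```

An off-the-shelf OpenAI client does the same thing: you hand it the base URL and key, and everything else is standard.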
One Endpoint, Multiple Models
Another nice thing is that a single Foundry API endpoint can serve different models. You don’t need to set up separate deployments with separate URLs for each model. You specify which model you want in the request, and the endpoint routes it accordingly. This makes it much easier to experiment with different models without changing your client configuration every time.
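In practice, switching models is just a different `model` value in the request body; the URL never changes. A small sketch (the endpoint URL and model names here are illustrative, not guaranteed to exist on your resource):

```python
import json

# One endpoint for every model on the resource (hypothetical URL).
ENDPOINT = "https://YOUR-RESOURCE.services.ai.azure.com/models/chat/completions"

def chat_body(model, prompt):
    # Same endpoint for every model; only the "model" field changes.
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")

# Two different models, one URL -- no separate deployments,
# no client reconfiguration between experiments.
a = chat_body("gpt-4o-mini", "Summarize this.")
b = chat_body("Mistral-Large-2411", "Summarize this.")
```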
Chat Completions vs Responses
This one tripped me up briefly. Azure Foundry supports both the Chat Completions API and the newer Responses API. Chat Completions is the familiar /chat/completions endpoint — you send a list of messages, you get a response back. It’s stateless and straightforward.
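Stateless means the client owns the conversation: every turn resends the full message list. A sketch of that loop, with the model's replies stubbed in since no real call is made:

```python
# Because /chat/completions is stateless, the client keeps the transcript
# and resends the whole list every turn. The assistant replies below are
# stubs standing in for real API responses.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def turn(history, user_text, assistant_text):
    """One round trip: append the user message, 'call' the API, append the reply."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})  # stubbed reply
    return history

turn(history, "What is Azure AI Foundry?", "A platform for hosting and serving models.")
turn(history, "Does it speak the OpenAI API?", "Yes, via OpenAI-compatible endpoints.")
# Every request carries the full history; the server remembers nothing between calls.
```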
The Responses API is newer and more powerful. It supports built-in tool use, structured outputs, and can maintain state across multiple turns. If you’re building something simple, Chat Completions is fine. If you need agentic behavior or richer interactions, the Responses API is the way to go.
One caveat: as of now, the Responses API on Azure Foundry only seems to work with OpenAI models. If you’re hoping to use it with other models hosted on Foundry, you’ll likely need to stick with Chat Completions for those.
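A rough side-by-side of the two payload shapes, assuming the standard OpenAI field names (`input` and `previous_response_id` for the Responses API); the response id here is made up:

```python
# Chat Completions: stateless, a full "messages" list on every request.
chat_payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Plan my trip."}],
}

# Responses API: "input" instead of "messages", and state can be chained
# by passing the previous response's id rather than resending the history.
responses_payload = {
    "model": "gpt-4o-mini",
    "input": "Plan my trip.",
    "previous_response_id": "resp_abc123",  # hypothetical id from an earlier turn
}
```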
Takeaway
Azure AI Foundry is more flexible than I expected. The OpenAI compatibility alone makes it worth considering if you’re already using OpenAI-based tooling and want to switch to Azure-hosted models without rewriting your client code.