Azure to AWS Bedrock Migration Checklist for OpenAI

By
Brennan Lewis
June 17, 2026
7 min

For CTOs, platform leads, and the engineers who will actually do the work of moving an Azure OpenAI workload to Bedrock.

Pointing an application at a new endpoint is the easy part. The hard part is proving the workflow still works.

That is where we see teams under-scope the work. They treat the move as cloud plumbing when it is really an application, governance, and quality migration.

OpenAI and AWS have created a real new option. OpenAI models, Codex, and Managed Agents are now available on AWS, and AWS describes the Bedrock path as inheriting controls like IAM, PrivateLink, guardrails, encryption, and CloudTrail logging.

If you are AWS-heavy, migration is the right direction. The work is making sure it happens in the right order, with the right controls, and without breaking production workflows that already depend on Azure OpenAI.

First, decide the migration path

Not every workload should move the same way. Here is how we sort the candidates.

Some workloads can move quickly because they are mostly model calls wrapped inside an application your team already controls. Some need a more careful migration because they depend on structured outputs, retrieval, tool calls, or downstream automation. The most complex workflows should move as runtime rebuilds, especially when they depend on memory, approvals, policy enforcement, traces, and long-running state.

The limited-preview status of the Bedrock path matters, but it should shape the migration plan rather than stall it. Use preview-stage maturity, region coverage, SLAs, and roadmap stability to decide sequencing: pilot first, prove behavior, then move production workloads as the target path matures.

The migration candidates with the strongest business case are the ones that gain from sitting inside AWS's control plane: workloads tied to S3, Redshift, DynamoDB, Lambda, AWS-hosted internal APIs, IAM, PrivateLink, CloudTrail, KMS, guardrails, centralized observability, or AWS procurement commitments.

Start there. Then run the checklist below.

1. Freeze the current workload

Before changing anything, document the source system as it actually runs.

Not the architecture diagram from six months ago. The real current state.

Capture the model and deployment, API shape, prompts, structured output schemas, tool definitions, retrieval sources, auth path, network path, logs, rate limits, safety filters, latency, cost, approval points, and failure modes.

This is where a lot of "simple" migrations get more complicated.

A workflow may look like a summarizer, but underneath it has a custom JSON schema, two retrieval calls, a human approval path, a retry loop, and downstream automation that breaks if one field changes. That is not an endpoint swap. That is a production system.
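One way to keep the inventory honest is to record it as data rather than prose, so gaps are visible. The sketch below is a minimal illustration, not a prescribed format: the `WorkloadInventory` class, its field names, and the `gpt-4o-prod` deployment name are all hypothetical and cover only part of the capture list above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class WorkloadInventory:
    """Snapshot of one Azure OpenAI workload as it runs today (hypothetical schema)."""
    name: str
    model_deployment: str = ""
    api_shape: str = ""  # for example "chat.completions" or "responses"
    prompts: list[str] = field(default_factory=list)
    output_schemas: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)
    retrieval_sources: list[str] = field(default_factory=list)
    auth_path: str = ""
    network_path: str = ""
    approval_points: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)


def missing_fields(inv: WorkloadInventory) -> list[str]:
    """Return the names of fields that are still empty strings or empty lists."""
    return [f.name for f in fields(inv) if not getattr(inv, f.name)]


if __name__ == "__main__":
    inv = WorkloadInventory(name="summarizer", model_deployment="gpt-4o-prod")
    # Anything printed here is undocumented current state.
    print(missing_fields(inv))
```

If a field genuinely does not apply, recording an explicit entry such as "none" is safer than leaving it empty, because an empty field is indistinguishable from one nobody checked.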

2. Build behavior parity tests

Behavior parity testing is a critical step that teams often overlook when preparing a migration.

The parity harness should include a golden dataset of real examples. For each example, define what good output looks like. Not just "the answer seems fine." Actual checks.

For example: JSON schema validity, required fields, classification labels, citation behavior, refusal behavior, tool-call selection, extraction accuracy, hallucination checks, latency, cost, and human-review escalation.

Run the Azure OpenAI version and the Bedrock target side by side. Compare outputs. Review failures. Decide what differences matter.

Some behavior differences will be acceptable. Some will be better. Some will quietly break downstream systems. You want to find those before customers, employees, or business processes do.
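A parity harness along these lines can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a full eval framework: `source_fn` and `target_fn` are hypothetical callables standing in for the Azure OpenAI and Bedrock calls, the two checks cover only schema validity and classification labels, and a "regression" here means a check the source passes and the target fails.

```python
import json
from typing import Callable

Check = Callable[[str], bool]


def _parse_object(output: str):
    """Return the parsed JSON object, or None if output is not a JSON object."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


def valid_json_with_fields(required: set[str]) -> Check:
    """Check that the output is a JSON object containing every required field."""
    def check(output: str) -> bool:
        data = _parse_object(output)
        return data is not None and required.issubset(data)
    return check


def label_in(allowed: set[str], key: str = "label") -> Check:
    """Check that the classification label is one of the allowed values."""
    def check(output: str) -> bool:
        data = _parse_object(output)
        if data is None:
            return False
        value = data.get(key)
        return isinstance(value, str) and value in allowed
    return check


def run_parity(golden, source_fn, target_fn, checks: dict[str, Check]) -> list[dict]:
    """Run both systems on each golden example and list checks the target regressed on."""
    report = []
    for example_id, prompt in golden:
        src_out, tgt_out = source_fn(prompt), target_fn(prompt)
        regressions = [
            name for name, check in checks.items()
            if check(src_out) and not check(tgt_out)
        ]
        report.append({"id": example_id, "regressions": regressions})
    return report
```

Checks such as citation behavior, refusal behavior, or extraction accuracy would be added as further entries in `checks`. Latency and cost are better recorded per call than expressed as pass/fail functions of the output text.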

A migration is a controlled rollout, not a switch flip

Move only after behavior, controls, tools, and rollback are proven.

  • Inventory: current state
  • Parity: eval harness
  • Map: API + models
  • Controls: identity + logs
  • Shadow: dual run
  • Cut over: production

No traffic moves without parity, observability, owner signoff, and rollback.
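That rule is simple enough to encode as an explicit gate in a deployment script. The following is a minimal sketch; the `CutoverGate` class and its four flag names are hypothetical, and how each flag gets set (a CI job, a signed ticket, a rollback drill) is left to the team.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CutoverGate:
    """The four preconditions for moving any traffic (hypothetical names)."""
    parity_passed: bool
    observability_ready: bool
    owner_signed_off: bool
    rollback_tested: bool

    def blockers(self) -> list[str]:
        """Return the names of preconditions that are not yet met."""
        return [name for name, ok in asdict(self).items() if not ok]


if __name__ == "__main__":
    gate = CutoverGate(
        parity_passed=True,
        observability_ready=True,
        owner_signed_off=False,
        rollback_tested=True,
    )
    # An empty list means traffic may move; anything else blocks the cutover.
    print(gate.blockers())
```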

3. Map the API and model surface

The next step is mapping the current API behavior to the target implementation.

This is not just about whether the target model is "good enough." It is about whether your application depends on specific behavior.

Check prompt format, system message handling, structured outputs, tool/function calling, streaming, multi-turn state, file inputs, embeddings, errors, rate limits, timeouts, and content filtering.

If the workload is using OpenAI's newer Responses-style primitives, map those explicitly. If it is using older Chat Completions patterns, decide whether migration is also the right time to modernize the interface.

Avoid bundling too many changes together.

If you change the model, API shape, prompt structure, retrieval flow, and tool contracts all at once, you will not know what caused the regression.
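A lightweight guard can make that discipline mechanical. The sketch below assumes each release is described by a small config dict with one version string per dimension. The dimension names, the example values, and the one-change-per-release limit are all illustrative assumptions, not a rule from any tool.

```python
# Dimensions of a workload that can change between releases (hypothetical names).
DIMENSIONS = ("model", "api_shape", "prompt", "retrieval", "tool_contracts")


def changed_dimensions(before: dict, after: dict) -> list[str]:
    """List the dimensions whose value differs between two release configs."""
    return [d for d in DIMENSIONS if before.get(d) != after.get(d)]


def check_release(before: dict, after: dict, max_changes: int = 1) -> list[str]:
    """Raise if a release changes more dimensions than allowed; else return them."""
    changed = changed_dimensions(before, after)
    if len(changed) > max_changes:
        raise ValueError(
            f"release changes {len(changed)} dimensions at once: {', '.join(changed)}"
        )
    return changed


if __name__ == "__main__":
    base = {"model": "azure-deployment", "api_shape": "chat", "prompt": "v3",
            "retrieval": "r1", "tool_contracts": "t1"}
    step = dict(base, model="bedrock-target")
    print(check_release(base, step))
```

When a regression shows up in the parity results, a release that changed one dimension points at its own cause.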

4. Remap controls

The strongest reason to move some workloads to Bedrock is not the model. It is the control plane around the model.

For AWS-heavy enterprises, the target architecture may simplify IAM-based access, PrivateLink networking, CloudTrail logging, KMS encryption, guardrails, cost allocation, centralized observability, and procurement alignment with cloud commitments.

But every control has to be mapped from the current Azure implementation.

If the source system uses Azure RBAC, Managed Identity, Foundry projects, Azure monitoring, or Microsoft security review patterns, write down the AWS equivalent. Do not assume the words are interchangeable.

Identity is a good example. "The agent has access" is not an implementation plan. Which role? Which policy? Which tool? Which data source? Which action needs human approval? Which logs prove what happened?

The migration plan should answer those questions before production.
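To make "which policy?" concrete, here is a sketch of a least-privilege IAM policy document built in Python. `bedrock:InvokeModel` is a real IAM action and the foundation-model ARN format is the standard one for Bedrock, but `EXAMPLE_MODEL_ID` is a placeholder, and whether the OpenAI models on Bedrock are addressed through a foundation-model ARN or another resource type is an assumption to confirm against current AWS documentation.

```python
import json


def invoke_only_policy(region: str, model_id: str) -> dict:
    """Build an IAM policy that allows invoking one Bedrock model and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeOneModelOnly",
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],
                # Assumption: the target model is exposed as a foundation-model resource.
                "Resource": [
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
                ],
            }
        ],
    }


if __name__ == "__main__":
    # EXAMPLE_MODEL_ID is a placeholder, not a real model identifier.
    print(json.dumps(invoke_only_policy("us-east-1", "EXAMPLE_MODEL_ID"), indent=2))
```

Attaching a policy like this to a dedicated role per workload gives CloudTrail a specific principal to record, which is what lets the logs answer who called what.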

5. Migrate tools and data paths

Most useful OpenAI workloads are not just model calls.

They retrieve data, call APIs, write records, create tickets, summarize documents, update systems, or hand work to people.

That means the tool and data migration matters as much as the model endpoint.

For each tool, document the owner, input schema, output schema, auth method, permission boundary, retry behavior, failure mode, audit log, and test cases. For each data source, document where it lives, who can access it, how it is filtered, how freshness is handled, and how sensitive data is protected.

This is where AWS-native architecture can help. A workflow pulling from S3, Redshift, Lambda, and internal AWS-hosted APIs may be cleaner when the model access, logs, permissions, and network path also sit inside AWS.

But a cleaner target architecture still has to be built and proven, not assumed.
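Part of that proof can be automated with per-tool contract tests. The sketch below is an assumption-laden illustration: `ToolContract`, `check_tool`, and the ticket example are hypothetical, and the type checks are deliberately shallow (top-level keys and Python types only), so a real suite would likely use a proper schema validator.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolContract:
    """Documented contract for one tool (hypothetical, partial field set)."""
    name: str
    owner: str
    input_types: dict[str, type]
    output_types: dict[str, type]


def violations(payload: dict[str, Any], expected: dict[str, type]) -> list[str]:
    """List missing keys and wrongly typed values in a payload."""
    problems = []
    for key, expected_type in expected.items():
        if key not in payload:
            problems.append(f"missing: {key}")
        elif not isinstance(payload[key], expected_type):
            problems.append(f"wrong type: {key}")
    return problems


def check_tool(contract: ToolContract, tool: Callable[[dict], dict],
               test_input: dict) -> list[str]:
    """Validate the test input, call the tool, then validate its output."""
    bad_input = violations(test_input, contract.input_types)
    if bad_input:
        return bad_input
    return violations(tool(test_input), contract.output_types)
```

Running the same contract and test cases against the tool as wired in Azure and as wired in AWS shows whether the migration changed the tool's behavior, separately from whether it changed the model's.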

6. Treat agents differently

Some workloads should not follow the standard migration path.

If the workflow depends on memory, multiple tools, approvals, pause/resume behavior, policy enforcement, browser execution, code execution, long-running state, or action audit, it is not just an Azure OpenAI migration.

It is an agent runtime rebuild.

OpenAI's Stateful Runtime Environment for Agents in Amazon Bedrock points directly at this distinction. The Amazon Bedrock AgentCore documentation describes primitives like Memory, Gateway, Identity, Code Interpreter, Browser, Observability, Evaluations, and Policy.

Those are not small features. They are the scaffolding a real agent workflow needs.

A finance exception workflow is a good example. It may need to read an invoice, check purchase order history, compare against policy, ask for approval, update an ERP queue, and log every action.

When we migrate workflows like this, moving the model call is the least interesting part. The real work is defining the agent's tools, permissions, state, approvals, traces, evals, and rollback behavior. For anything with this shape, plan in weeks, not days.
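The finance exception example can be reduced to a toy state machine to show what "state, approvals, and traces" mean in practice. This is a plain-Python sketch with hypothetical step names and no real agent framework behind it. It models only three things: persistent position in the workflow, a pause at the approval step, and an audit log entry for every action.

```python
from dataclasses import dataclass, field

# Ordered steps of the finance exception workflow (hypothetical names).
STEPS = [
    "read_invoice",
    "check_po_history",
    "compare_policy",
    "request_approval",
    "update_erp_queue",
]


@dataclass
class ExceptionRun:
    """Resumable state for one invoice exception."""
    invoice_id: str
    next_step: int = 0
    approved: bool = False
    audit_log: list[str] = field(default_factory=list)


def advance(run: ExceptionRun) -> str:
    """Execute steps until finished or until paused waiting for human approval."""
    while run.next_step < len(STEPS):
        step = STEPS[run.next_step]
        if step == "request_approval" and not run.approved:
            run.audit_log.append(f"{run.invoice_id}: paused at {step}")
            return "awaiting_approval"
        run.audit_log.append(f"{run.invoice_id}: {step}")
        run.next_step += 1
    return "done"


if __name__ == "__main__":
    run = ExceptionRun("INV-1")
    print(advance(run))   # pauses before the ERP update
    run.approved = True   # a human approves out of band
    print(advance(run))   # resumes from the saved step
    print(run.audit_log)
```

In a real migration, `ExceptionRun` would live in durable storage and each step would call a permissioned tool, which is exactly the part an endpoint swap does not cover.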

7. Roll out in shadow before cutover

Do not migrate by flipping all traffic.

Run the target Bedrock implementation in shadow mode first. Feed it the same inputs. Compare outputs. Measure failures. Review edge cases. Tune prompts and schemas. Validate logs. Confirm cost and latency. Then move a small percentage of traffic.

A sane rollout usually moves from offline replay through shadow mode, internal users, a small production slice, and a monitored ramp, followed by a rollback window and a post-cutover review.

The migration owner should be explicit. The success metric should be explicit. The rollback trigger should be explicit.

Without those three things, the migration is still under-scoped.
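Two pieces of that rollout are easy to make explicit in code: how a small slice of traffic is selected, and what trips the rollback. The sketch below is illustrative only. The function names are hypothetical, and the 2% error-rate threshold and 200-request minimum are placeholder values the migration owner would replace with the agreed success metric.

```python
import hashlib


def routes_to_target(request_id: str, percent: int) -> bool:
    """Deterministically route roughly `percent`% of request IDs to the Bedrock target."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 99] per request ID
    return bucket < percent


def should_roll_back(target_errors: int, target_requests: int,
                     max_error_rate: float = 0.02, min_requests: int = 200) -> bool:
    """Trigger rollback once enough traffic is seen and the error rate exceeds the limit."""
    if target_requests < min_requests:
        return False  # too little data to judge either way
    return target_errors / target_requests > max_error_rate


if __name__ == "__main__":
    print(routes_to_target("req-123", 5))
    print(should_roll_back(target_errors=10, target_requests=200))
```

Hashing the request ID instead of sampling randomly keeps a given request on the same side across retries, and raising `percent` only ever adds requests to the target slice, so the ramp never flips a request back to the old path.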

The real checklist

The checklist is not "does the Bedrock call work?"

The checklist is:

  • Does the output still satisfy the business process?
  • Do structured outputs still validate?
  • Do tools still execute safely?
  • Do permissions match the target operating model?
  • Do logs explain what happened?
  • Do evals catch regressions?
  • Does the team know how to roll back?
  • Does the business owner know what changed?

OpenAI on AWS creates a meaningful migration path for enterprises that already run on AWS.

The value comes from moving the right workloads into an operating environment where they can be governed, observed, scaled, and improved.

That makes this an engineering migration, not an endpoint decision.