AWS re:Invent 2025: My Serverless & Agentic AI Takeaways
- Ran Isenberg
- 4 days ago
- 18 min read
Updated: 3 days ago

Now that AWS re:Invent 2025 has wrapped up, let’s dive into the most exciting new services and features from a Serverless and Agentic AI developer perspective.
This year truly felt like the year of Agentic AI. AWS re:Invent put AI front and center, with over 50 sessions mentioning MCP in their abstracts and more than 850 sessions related to AI—more than any other topic at the event.
My AWS re:Invent Session: Scaling Serverless with Platform Engineering: A Blueprint for Success
This is a shameless self-plug :)
If you missed my breakout session with Anton Aleksander, the recording is up and ready.
Having been part of the amazing Platform Engineering team at CyberArk since its inception and seeing it grow into a team of over 100 engineers, it's finally time to share our journey, lessons learned, and insights on scaling serverless adoption.
Our session, "Scaling Serverless with Platform Engineering: A Blueprint for Success", talks about the challenges of adopting and scaling Serverless in a SaaS and enterprise organization. I'll share our journey and what we did to save millions of dollars and reduce the time to production by 99% for every new SaaS service.
Pre-re:Invent Announcements
Let's review the most exciting announcements and improvements released before AWS re:Invent. This year we have seen plenty. Honestly, this is one of the best years we've had in a long time.
Amazon Bedrock AgentCore is now generally available
all AgentCore services now have support for Virtual Private Cloud (VPC), AWS PrivateLink, AWS CloudFormation, and resource tagging, enabling developers to deploy AI agents with enhanced enterprise security and infrastructure automation capabilities. AgentCore Runtime builds on its preview capabilities of industry-leading eight-hour execution windows and complete session isolation by adding support for the Agent-to-Agent (A2A) protocol, with broader A2A support coming soon across all AgentCore services
Finally, there's support for CloudFormation, meaning that CDK support will follow soon (it might already be out by the time I publish this). I'm also very excited about the new A2A protocol support. With nine regions now supported, recommending AgentCore for any agentic/MCP/A2A flows is becoming much easier. I'm honestly not sure you should build your own MCP server on Lambda or Fargate like I showcase in my article and my MCP blueprint GitHub repo.
URL and host header rewrite with AWS Application Load Balancers
You can use this new feature to implement regex matches based on request parameters and rewrite both host headers and URLs before routing to your targets. Rewrites were already possible with exact value matching, but regex gives you a more powerful capability.
AWS Lambda increases maximum payload size from 256 KB to 1 MB for asynchronous invocations
AWS Lambda increases asynchronous invocations maximum payload size from 256 KB to 1 MB, allowing customers to ingest richer, complex payloads for their event-driven workloads without the need to split, compress, or externalize data. Customers invoke their Lambda functions asynchronously using either Lambda API directly, or by receiving push-based events from various AWS services like Amazon S3, Amazon CloudWatch, Amazon SNS, Amazon EventBridge, AWS Step Functions.
SQS recently increased its payload limit to 1MB, and this new limit is now spreading to other services in the event-driven architecture domain.
This seems mostly useful at the moment for async direct invocation of a Lambda function (using the function ARN), since SNS and EventBridge haven't increased their limits from 256 KB yet.
The pricing model is the same as for SQS, which also had its payload quota increased to 1 MB recently—you pay for individual payload sizes beyond 256 KB at an additional request per 64 KB, up to 1 MB.
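To make the billing rule above concrete, here is a minimal sketch of the arithmetic. It assumes that 1 KB means 1,024 bytes, that the first 256 KB is covered by a single request charge, and that a partially filled 64 KB chunk is rounded up; the function name is mine, not an AWS API.

```python
import math

KB = 1024                 # assumption: binary kilobytes
BASE_BYTES = 256 * KB     # covered by the first request charge
CHUNK_BYTES = 64 * KB     # each extra chunk adds one request charge
MAX_BYTES = 1024 * KB     # the new 1 MB ceiling


def billed_requests(payload_bytes: int) -> int:
    """Request charges for a single payload under the rule described above."""
    if not 0 <= payload_bytes <= MAX_BYTES:
        raise ValueError("payload must be between 0 bytes and 1 MB")
    extra_bytes = max(0, payload_bytes - BASE_BYTES)
    return 1 + math.ceil(extra_bytes / CHUNK_BYTES)
```

Under these assumptions, a full 1 MB payload is billed as 13 requests (1 for the first 256 KB plus 12 chunks of 64 KB), so it is worth checking your real payload distribution before dropping compression or S3 offloading.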
Now, while this change is welcome, I don't think you should use async direct Lambda invocation (using a function ARN), as it is a poor abstraction and increases coupling between your services' components. I wrote a lengthy article that explains the issues; you can find it here. I'd use SQS with a DLQ instead for async event-driven architecture between services or microservices.
Amazon ECS now supports built-in Linear and Canary deployments
With linear deployments, you can gradually shift traffic from your current service revision to the new revision in equal percentage increments over a specified time period. You configure the step percentage (for example, 10%) to control how much traffic shifts at each increment, and set a step bake time to wait between each traffic shift for monitoring and validation. This allows you to validate your new application version at multiple stages with increasing amounts of production traffic. With canary deployments, you can route a small percentage of production traffic to your new service revision while the majority of traffic remains on the current stable version. You set a canary bake time to monitor the new revision's performance, after which Amazon ECS shifts the remaining traffic to the new revision.
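To visualize the two strategies described above, here is a small sketch that prints a traffic-shift timeline. It is a simplification of my own: it assumes the first shift happens at minute zero, that the last linear step is clamped to 100%, and it ignores alarms and rollbacks. Real ECS timing may differ.

```python
def linear_schedule(step_percent: int, bake_minutes: int) -> list[tuple[int, int]]:
    """Return (elapsed_minutes, percent_on_new_revision) for each linear shift."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be in the range 1..100")
    schedule = []
    shifted, elapsed = 0, 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)  # clamp the final step
        schedule.append((elapsed, shifted))
        elapsed += bake_minutes                     # wait before the next shift
    return schedule


def canary_schedule(canary_percent: int, bake_minutes: int) -> list[tuple[int, int]]:
    """Return the two shifts of a canary rollout: a small slice, then everything."""
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be in the range 1..99")
    return [(0, canary_percent), (bake_minutes, 100)]


if __name__ == "__main__":
    for minute, percent in linear_schedule(step_percent=10, bake_minutes=5):
        print(f"t+{minute:>2} min: {percent}% on the new revision")
```

With a 10% step and a 5-minute bake time, this model reaches 100% on the tenth shift, 45 minutes after the first one.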
Amazon Bedrock AgentCore Browser now reduces CAPTCHAs with Web Bot Auth (Preview)
Amazon Bedrock AgentCore Browser provides a fast, secure, cloud-based browser for AI agents to interact with websites at scale. It now enables agents to establish trusted, accountable access quickly and reduce CAPTCHA interruptions in automated workflows through Web Bot Auth, a draft IETF protocol that cryptographically identifies AI agents to websites.
AWS Step Functions announces a new metrics dashboard
With this launch, you can now view usage and billing metrics in one dashboard on the AWS Step Functions console. Metrics are available at both account and state-machine level. You can now view these metrics for both standard and express workflows. In addition, existing metrics, such as ApproximateOpenMapRunCount, are available on the metrics dashboard.
Application Load Balancer supports client credentials flow with JWT verification
This is a very interesting feature for ALB users, such as those who utilise Fargate ECS with ALBs. It makes life easier and removes the need for custom authentication code on your compute container.
enabling secure machine-to-machine (M2M) and service-to-service (S2S) communications. This feature allows ALB to verify JSON Web Tokens (JWTs) included in request headers, validating token signatures, expiration times, and claims without requiring modifications to application code.
By offloading OAuth 2.0 token validation to ALB, customers can significantly reduce architectural complexity and streamline their security implementation.
If you want to get started with Fargate ECS and host a web application, I wrote an article and provided a complete GitHub repo with CDK code. Check them both out here.
AWS Lambda enhances event processing with provisioned mode for SQS event-source mapping
AWS gives us more control over the SQS->Lambda pattern for increased performance.
With provisioned mode, you can configure both minimum and maximum numbers of event pollers for your SQS ESM. Each event poller represents a unit of compute that handles queue polling, event batching, and filtering before invoking Lambda functions. Each event poller can handle up to 1 MB/sec of throughput, up to 10 concurrent invokes, or up to 10 SQS polling API calls per second. By setting a minimum number of event pollers, you enable your application to maintain a baseline processing capacity that can immediately handle sudden traffic increases.
AWS recommends that you set the minimum event pollers required to handle your known peak workload requirements.
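As a rough way to apply that recommendation, the sketch below sizes the minimum pollers from the three per-poller limits quoted above. It assumes the limits are independent and that the binding one decides the count; the floor of one poller is my placeholder, so check the documentation for the actual allowed range.

```python
import math

# Per-poller limits taken from the announcement text quoted above.
MB_PER_SEC_PER_POLLER = 1.0
CONCURRENT_INVOKES_PER_POLLER = 10
POLLS_PER_SEC_PER_POLLER = 10


def min_event_pollers(
    peak_mb_per_sec: float,
    peak_concurrent_invokes: int,
    peak_polls_per_sec: float,
) -> int:
    """Estimate the pollers needed so that no single limit is exceeded at peak."""
    return max(
        1,  # assumption: at least one poller
        math.ceil(peak_mb_per_sec / MB_PER_SEC_PER_POLLER),
        math.ceil(peak_concurrent_invokes / CONCURRENT_INVOKES_PER_POLLER),
        math.ceil(peak_polls_per_sec / POLLS_PER_SEC_PER_POLLER),
    )
```

For example, a peak of 4.5 MB/sec, 120 concurrent invokes, and 30 polling calls per second is bound by concurrency and yields 12 pollers in this model.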
Amazon MWAA Serverless
New managed and serverless services are always welcome! This time, it's an implementation of Apache Airflow: you don't manage or care about infrastructure, and you pay only when it runs. I wasn't able to find out whether it can scale to zero; I believe it does not, so it's almost fully serverless, but still very useful!
Also, you still need to define a VPC, which "ruins" the serverless "feel" quite a bit.
you can focus on your workflow logic rather than monitoring for provisioned capacity. You can now submit your Airflow workflows for execution on a schedule or on demand, paying only for the actual compute time used during each task’s execution. The service automatically handles all infrastructure scaling so that your workflows run efficiently regardless of load.
Python 3.14 runtime now available in AWS Lambda
Pretty straightforward, but note that some advanced Python 3.13/3.14 features are not enabled.
The just-in-time (JIT) compiler is not available in the Lambda runtime because it’s still in an experimental phase. Free-threaded mode, running Python without the global interpreter lock, is supported in Python 3.14, but it is not enabled in the Lambda runtime due to potential performance impact. To use these features in Lambda, you can deploy your own Python runtime build with these features enabled, using a container image or custom runtime.
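If you want to confirm what a given runtime actually has enabled, a small probe like the one below can help. It is a sketch that relies on CPython introspection hooks (`sys._is_gil_enabled` from 3.13 and `sys._jit` from 3.14); these are underscore-prefixed and may change, so the code falls back to conservative defaults when they are missing.

```python
import sys
import sysconfig


def runtime_features() -> dict:
    """Report whether free-threading and the JIT are active in this interpreter."""
    gil_check = getattr(sys, "_is_gil_enabled", None)  # present on 3.13+
    jit = getattr(sys, "_jit", None)                   # present on 3.14+
    jit_enabled = False
    if jit is not None and hasattr(jit, "is_enabled"):
        jit_enabled = bool(jit.is_enabled())
    return {
        "python_version": sys.version.split()[0],
        # True only when the interpreter was built with free-threading support.
        "free_threaded_build": bool(sysconfig.get_config_var("Py_GIL_DISABLED")),
        # Older interpreters always run with the GIL, hence the default of True.
        "gil_enabled": bool(gil_check()) if callable(gil_check) else True,
        "jit_enabled": jit_enabled,
    }


if __name__ == "__main__":
    print(runtime_features())
```

Logging this once from a function's init code is a cheap way to verify a custom runtime or container image build.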
Streamlined multi-tenant application development with tenant isolation mode in AWS Lambda
Tenant isolation is one of the most critical security aspects for any SaaS provider. Today, AWS Lambda makes it easier to manage tenants in a Lambda pool model and harder to make mistakes.
Is it perfect? No, not yet, I'll share my thoughts on a review blog post (hopefully very soon).
And yes, it comes at an extra cost of more cold starts, BUT it might be worth it to many of you.
TL;DR - you can safely use your Lambda memory for caching, as each tenant has its own Lambda instance. It comes at the cost of more concurrent functions and more cold starts, as functions no longer serve multiple tenants. You also need a Lambda authorizer to set it up properly (extract the tenant ID and pass it to the Lambda function with the correct header name). Also, it still does not address tenant isolation issues when accessing tenant data in a shared DB that stores data from multiple tenants (more on that in my blog post after the re:Invent madness is over).
To summarize - I love the fact that AWS tackles this issue natively in Lambda, but it's still not perfect, and we're getting there.
This built-in capability processes function invocations in separate execution environments for each tenant, enabling you to meet strict isolation requirements without additional implementation effort to manage tenant-specific resources within function code.
Check out this article that covers the feature in more detail and provides a GitHub repo.
Building responsive APIs with Amazon API Gateway response streaming
AWS announced support for response streaming in Amazon API Gateway to significantly improve the responsiveness of your REST APIs by progressively streaming response payloads back to the client. With this new capability, you can use streamed responses to enhance user experience when building LLM-driven applications (such as AI agents and chatbots), improve time-to-first-byte (TTFB) performance for web and mobile applications, stream large files, and perform long-running operations while reporting incremental progress using protocols such as server-sent events (SSE). Each 10MB of response data, rounded up to the nearest 10MB, is billed as a single request
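For the incremental-progress use case, the wire format itself is simple. The sketch below only shows how server-sent event frames are built; it does not cover the API Gateway or Lambda streaming wiring, and the event names `progress` and `done` are my own choices.

```python
import json
from typing import Iterable, Iterator


def sse_frames(progress_updates: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE frame per update, followed by a final completion frame."""
    for update in progress_updates:
        # An SSE frame is a set of "field: value" lines ended by a blank line.
        yield f"event: progress\ndata: {json.dumps(update)}\n\n"
    yield "event: done\ndata: {}\n\n"


if __name__ == "__main__":
    for frame in sse_frames([{"percent": 25}, {"percent": 100}]):
        print(frame, end="")
```

A client using the browser `EventSource` API can listen for these named events and update a progress bar as each frame arrives.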
Amazon S3 adds new bucket-level setting to standardize encryption types used in your buckets
Amazon S3 now supports a new default encryption configuration setting to enforce Amazon S3 managed server-side encryption (SSE-S3) or server-side encryption with AWS KMS keys (SSE-KMS) for all write requests to your buckets. This new bucket-level setting helps you standardize the server-side encryption types that can be used with your buckets. Using the PutBucketEncryption API, you can disable server-side encryption with customer-provided keys (SSE-C) on specific buckets or in your AWS CloudFormation templates.
Amazon DynamoDB now supports multi-attribute composite keys in global secondary indexes
A very nice quality of life update!
With multi-attribute keys, you no longer need to manually concatenate values into synthetic keys, which sometimes result in the need to backfill data before adding new indexes. Instead, you can create primary keys using up to eight existing attributes, making it easier to model diverse access patterns and adapt to new query requirements.
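For context, this is roughly what the manual workaround looks like today. The sketch shows the old synthetic-key approach, not the new multi-attribute API; the attribute names (`tenant_id`, `region`, `gsi1pk`) are hypothetical.

```python
def synthetic_key(*parts: str, sep: str = "#") -> str:
    """Concatenate attribute values into a single index key value."""
    if any(sep in part for part in parts):
        raise ValueError("a key part must not contain the separator")
    return sep.join(parts)


item = {"tenant_id": "acme", "region": "eu", "order_date": "2025-12-01"}
# The synthetic attribute has to be written on every item, which is why
# adding a new index later can force a backfill of existing data.
item["gsi1pk"] = synthetic_key(item["tenant_id"], item["region"])
```

With multi-attribute keys, the index can reference `tenant_id` and `region` directly, so the extra attribute and the backfill both go away.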
Simplify access to external services using AWS IAM Outbound Identity Federation
This announcement introduces IAM outbound identity federation, which lets AWS workloads authenticate to external systems using short-lived, cryptographically signed JWTs instead of storing long-term secrets like API keys and passwords. External services can validate these tokens to trust the AWS identity securely and grant access without credential sprawl.
Amazon API Gateway adds Developer Portal capabilities
I'm conflicted about this one. I think that most developer portal solutions today provide way more than just API documentation. So, it's a nice start by AWS but unless this becomes a proper product and does more than the API gateway it is limited to, it's just a nice to have and nothing more.
The documentation generation is easily solvable without it by using Powertools for AWS Lambda (if you are using Lambda, at least; check out my blog post).
Amazon API Gateway launches Portals that now enable businesses to create fully managed, AWS native developer portals that serve as the central hub for AWS assets such as REST APIs for discovery, documentation, governance, and monetization across their AWS infrastructure.
AWS Lambda networking over IPv6
You can transition to IPv6 to future-proof your overall architecture by preparing ahead of the broader transition to IPv6, and establish compatibility with IPv6 clients or services.
IPv6 also eliminates the need for a NAT gateway when the Lambda functions need internet connectivity from a private subnet in your Amazon Virtual Private Cloud (Amazon VPC).
Accelerate workflow development with enhanced local testing in AWS Step Functions
AWS improved the Step Functions TestState API, making it easier to run local tests in the IDE (no console, please!).
The enhanced TestState API introduces three significant capabilities for local testing in AWS Step Functions. It now supports full mocking of state outputs and errors with strict, present, or no validation, allowing true unit testing without calling downstream services. It also enables testing of all state types—including Map, Parallel, Activity tasks, and .sync/.waitForTaskToken integrations—and allows developers to test individual states within a full state machine by name, including retries, Map iterations, and error paths.
Node.js 24 runtime now available in AWS Lambda
You can now develop AWS Lambda functions using Node.js 24, either as a managed runtime or using the container base image. Node.js 24 is in active LTS status and ready for production use. It is expected to be supported with security patches and bugfixes until April 2028.
The blog post highlights the Node.js 24 improvements.
AWS Service Quotas now supports automatic quota management
AWS Service Quotas tracks your services' quotas, but NOW it also increases them automatically for you. This is a huge QOL update, as you don't need to open a support ticket to increase the quotas. The announcement doesn't specify which quotas are supported, but it points to the service documentation for further information.
Amazon CloudFront announces support for mutual TLS authentication
Amazon CloudFront now supports mutual TLS (mTLS) at the edge, so only clients presenting trusted X.509 certificates can access distributions, enabling strong client authentication for B2B APIs and IoT without custom access layers. The feature works with third-party CAs or AWS Private Certificate Authority, is available at no extra cost, and can be configured via Console, CLI, SDK, CDK, or CloudFormation.
AWS Lambda announces enhanced error handling capabilities for Kafka event processing
AWS Lambda launches enhanced error handling capabilities for Amazon Managed Streaming for Apache Kafka (MSK) and self-managed Apache Kafka (SMK) event sources.
With this launch, developers can now exercise precise control over failed event processing and leverage Kafka topics as an additional on-failure destination when using Provisioned mode for Kafka ESM. Customers can now define specific retry limits and time boundaries for retry, automatically discarding failed records beyond these limits to customer-specified destination. They can now also set automatic retries of failed records in the batch and enhance their function code to report individual failed messages, optimizing the retry process.
Amazon Route 53 announces accelerated recovery for managing public DNS records
As we know from recent AWS outages, outages in us-east-1 are critical. Route 53's control plane runs in a single region (us-east-1), so if it goes down, you can't change your DNS records. Proper DR solutions use Route 53's health checks to route traffic to another region automatically, but if you didn't build such a system, this feature will allow your systems to come back up in another region, BUT only after 60 minutes.
Accelerated recovery targets a 60-minute recovery time objective (RTO) for regaining the ability to make DNS changes to your DNS records in Route 53 public hosted zones, if AWS services in US East (N. Virginia) become temporarily unavailable.
Amazon Bedrock introduces Reserved Service tier
Amazon Bedrock launched a Reserved tier that lets you pre-reserve token-per-minute capacity with tunable input/output rates, a 99.5% uptime target, automatic overflow to on-demand, and monthly billing, now available for Anthropic Claude Sonnet 4.5 (and only that model, for now!).
Announcing AWS CDK Mixins (Preview): Composable Abstractions for AWS Resources
CDK Mixins enable you to apply sophisticated features to any construct whether L1, L2, or custom without being locked into specific implementations.
CDK Mixins are composable, reusable abstractions that can be applied to constructs after they are created. Think of them as modular capabilities that you can mix and match to build exactly the infrastructure you need. Unlike traditional L2 constructs that bundle all features together, Mixins give you fine-grained control over which abstractions apply.
I highly recommend you check out the detailed post with code examples:
The difference between aspects and mixins:
According to the Mixins RFC, it is recommended to use Mixins to make changes and to use Aspects to validate behaviors.
From a governance standpoint, as a platform engineer, I still advocate using custom constructs that encapsulate all the options, rather than letting developers use mixins or aspects.
Apply fine-grained access control with Bedrock AgentCore Gateway interceptors
I attended Dhawalkumar Patel and Bill Tarr's chalk talk (which was excellent BTW), where they covered AgentCore in more detail. I initially missed this announcement, but it's a huge one for those who write agentic/MCP gateways and need more granular authorization and control over MCP requests and responses for their tenants and users.
The challenge is securing MCP tool access based on the calling principal’s access permissions and contextually responding to ListTools, InvokeTool, and Search calls to AgentCore Gateway.
Interceptors are Lambda functions with your custom code that can run either on the request side (once the request hits the gateway) or on the response side (before the response is returned to the caller). It's an agentic spin on an API Gateway Lambda authorizer. You get complete control. For example, you can use it to reduce the number of tools returned to an agent calling on behalf of a user who has permissions to access only three of the five tools the MCP serves. The article is quite detailed, and I highly recommend it.
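The tool-filtering example can be sketched as a pure function over an MCP `tools/list` response. This only shows the core logic: the envelope that AgentCore Gateway passes to an interceptor is not shown here, and the tool names and the permission set are hypothetical.

```python
import copy


def filter_tools_list(response: dict, allowed_tools: set[str]) -> dict:
    """Return a copy of a tools/list response that keeps only permitted tools."""
    filtered = copy.deepcopy(response)  # never mutate the upstream response
    result = filtered.get("result")
    if isinstance(result, dict) and isinstance(result.get("tools"), list):
        result["tools"] = [
            tool for tool in result["tools"] if tool.get("name") in allowed_tools
        ]
    return filtered


if __name__ == "__main__":
    upstream = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {
            "tools": [
                {"name": name}
                for name in ("search", "get_order", "refund", "delete_user", "export")
            ]
        },
    }
    # Hypothetical permissions resolved for the calling user.
    user_permissions = {"search", "get_order", "export"}
    print(filter_tools_list(upstream, user_permissions)["result"]["tools"])
```

In a real response-side interceptor, the allowed set would come from the caller's identity, and the same check should also be enforced on the request side so that a hidden tool cannot be invoked directly.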
AWS re:Invent Announcements
Plenty of announcements this year with a couple of them being truly innovative!
Introducing AWS Lambda Managed Instances: Serverless simplicity with EC2 flexibility
Starting re:Invent with a bang!
My TL;DR - a very unique offering. Run a Lambda function on configurable EC2 instances you don't manage (serverless FTW!). Use it if you have a high, steady traffic load, and it makes sense cost-wise. The 15-minute runtime limit still applies, and you are now required to use a VPC by default.
As for scale and performance, it has a positive impact on cold start and utilization, perhaps allowing those with high scaling requirements to get them at a better cost:
Lambda automatically routes requests to preprovisioned execution environments on the instances, eliminating cold starts that can affect first-request latency. Each execution environment can handle multiple concurrent requests through the multiconcurrency feature, maximizing resource utilization across your functions.
As for the underlying infrastructure, it's EC2, but you don't actually need to manage it, which keeps this solution truly serverless:
AWS handles instance provisioning, OS patching, security updates, load balancing across instances, and automatic scaling based on demand.
Lambda Managed Instances pricing has three parts: you pay per Lambda request, you pay for the EC2 servers that run your code, and you pay an extra 15% fee for AWS to manage those servers for you. You don't pay per execution time like normal Lambda, and using multiple concurrent requests on the same instance can reduce total cost and cold starts.
Check out Marcin's cost calculator for an easy way to understand when it's better to use managed instances.
Does this mean it's cheaper than regular Lambda at scale?
Sometimes, yes. If you have steady, high traffic, it can be cheaper because EC2 pricing (especially with Savings Plans or Reserved Instances) is often lower than per-request execution costs—but for low or spiky traffic, regular Lambda is usually cheaper.
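To reason about the break-even point yourself, the three-part pricing described above can be modeled in a few lines. This is a simplified sketch: all prices are parameters that you must fill in from the official pricing pages, and it assumes the 15% management fee applies to whatever hourly instance price you pass in.

```python
def standard_lambda_cost(
    requests: int,
    avg_duration_s: float,
    memory_gb: float,
    price_per_request: float,
    price_per_gb_second: float,
) -> float:
    """Regular Lambda: pay per request plus per GB-second of execution time."""
    compute = requests * avg_duration_s * memory_gb * price_per_gb_second
    return requests * price_per_request + compute


def managed_instances_cost(
    requests: int,
    instance_hours: float,
    ec2_price_per_hour: float,
    price_per_request: float,
    management_fee: float = 0.15,
) -> float:
    """Managed Instances: pay per request, for the instances, and a management fee."""
    instances = instance_hours * ec2_price_per_hour * (1 + management_fee)
    return requests * price_per_request + instances


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real AWS prices.
    monthly_requests = 500_000_000
    regular = standard_lambda_cost(monthly_requests, 0.1, 1.0, 2e-7, 1.7e-5)
    managed = managed_instances_cost(monthly_requests, 3 * 730, 0.20, 2e-7)
    print(f"regular: {regular:,.2f}  managed: {managed:,.2f}")
```

The model ignores the VPC-related costs mentioned below and assumes the instances are sized correctly for the load, so treat its output as a first approximation.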
As for other minor caveats:
Overall, a more complicated setup. You need to set up a VPC (Lambda uses its default VPC), which incurs an extra cost and makes the setup more complex.
I don't know yet how it's implemented, whether every request is its own thread or process on the EC2 instance. If they are threads, that might be an issue due to global variable/cache usage in your code that you need to address (e.g., you may now need a mutex).
The Lambda function still has the 15-minute runtime limit.
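If concurrent requests do turn out to share a process as threads (an open question, as noted above), a module-level cache would need protection along these lines. This is a minimal sketch with a hypothetical `loader` callback.

```python
import threading

_cache: dict[str, str] = {}
_cache_lock = threading.Lock()


def get_or_load(key: str, loader) -> str:
    """Return the cached value, loading it at most once across all threads."""
    with _cache_lock:  # only one thread may check and fill the cache at a time
        if key not in _cache:
            _cache[key] = loader(key)
        return _cache[key]
```

Holding the lock while loading keeps the example short but serializes slow loads; a per-key lock would be the next refinement if that becomes a bottleneck.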
AWS announces preview of AWS Interconnect - multicloud
AWS is making multicloud simpler for everyone. Well done.
It enables customers to quickly establish private, secure, high-speed network connections with dedicated bandwidth and built-in resiliency between their Amazon VPCs and other cloud environments. Interconnect - multicloud makes it easy to connect AWS networking services such as AWS Transit Gateway, AWS Cloud WAN, and Amazon VPC to other Cloud Service Providers (CSPs)
It is available in preview in five AWS Regions. You can enable this capability using the AWS Management Console.
Amazon EKS Capabilities
AWS is taking managed K8s to the next level with this release. Application developers get ready-to-use platform capabilities that enable faster workload deployment and scaling across the organization, while platform teams can offload operational tasks to AWS.
This, however, comes at an added cost (see pricing) that might be offset by the fact that your engineers can focus on services rather than maintaining the K8s cluster.
Three capabilities are available at launch including continuous deployment with Argo CD, AWS resource management through AWS Controllers for Kubernetes (ACK), and dynamic resource orchestration using Kube Resource Orchestrator (KRO).
AWS Lambda durable functions
Think of Step Functions but without the YAML definitions. You can now write a new, durable Lambda function: you write your own code and add a new SDK to define wait steps and sleep steps, just as you'd expect a real step function to behave. Oh, and it can run for almost a year (without paying for idle compute during waits), and you retain the Lambda developer experience. However, every step can last only 15 minutes, like any Lambda function.
After enabling a function for durable execution, you add the new open source durable execution SDK to your function code. You then use SDK primitives like “steps” to add automatic checkpointing and retries to your business logic and “waits” to efficiently suspend execution without compute charges.
Error handling and replay are quite smooth:
When execution terminates unexpectedly, Lambda resumes from the last checkpoint, replaying your event handler from the beginning while skipping completed operations.
As someone who used Step Functions only when he had to, this is an interesting offering. I will need to see what the DevEx is like and how easy it is to test locally, but this might be my go-to for long-running serverless processes. Expect a blog post on the matter.
AWS DevOps Agent helps you accelerate incident response and improve system reliability (preview)
AWS DevOps Agent is a “frontier agent” that acts as an autonomous on-call engineer: when an alert fires, it automatically gathers metrics, logs, deployment history, and other telemetry from tools like Amazon CloudWatch, GitHub, Splunk, and Datadog (it also supports MCP servers to connect to your own applications) to identify root causes and suggest targeted mitigation steps, thereby cutting mean time to resolution.
Frontier agents represent a new class of AI agents that are autonomous, massively scalable, and work for hours or days without constant intervention.
New AWS Security Agent secures applications proactively from design to deployment (preview)
a frontier agent that proactively secures your applications throughout the development lifecycle. It conducts automated application security reviews tailored to your organizational requirements and delivers context-aware penetration testing on demand.
A new agent that supports design review, pen testing or code review from a security point of view. On paper, it sounds quite promising!
The design review capability analyzes architectural documents and product specifications to identify security risks before code is written.
The code review capability analyzes pull requests in GitHub to identify security vulnerabilities and organizational policy violations. AWS Security Agent detects OWASP Top Ten common vulnerabilities such as SQL injection, cross-site scripting, and inadequate input validation.
The on-demand penetration testing capability executes comprehensive security testing to discover and validate vulnerabilities through multistep attack scenarios. AWS Security Agent systematically discovers the application’s attack surface through reconnaissance and endpoint enumeration, then deploys specialized agents to execute security testing across 13 risk categories, including authentication, authorization, and injection attacks. It uses your source code, API spec and business documentation to build a test plan.
Amazon Nova 2 Sonic: Our new speech-to-speech model for conversational AI
Nova 2 Sonic delivers expressive voices, masculine and feminine voices in each of the supported languages with native expressivity, natural turn-taking, and seamless handling of user interruptions.
Amazon Bedrock AgentCore adds quality evaluations and policy controls for deploying trusted AI agents
Amazon Bedrock AgentCore now adds fine-grained policy controls and quality evaluations.
Policies are written in the Cedar language (also used by Amazon Verified Permissions) and specify what agents are allowed to do before they run any tool, while evaluations continuously assess agent behavior (helpfulness, correctness, safety, tool usage, etc.). For other custom use cases or requirements, you should use the newly introduced interceptors feature.
You can define which tools and data agents can access—whether they are APIs, AWS Lambda functions, Model Context Protocol (MCP) servers, or third-party services—what actions they can perform, and under what conditions.
AgentCore Evaluations is a fully managed service that helps you continuously monitor and analyze agent performance based on real-world behavior.
With AgentCore Evaluations, you can use built-in evaluators for common quality dimensions such as correctness, helpfulness, tool selection accuracy, safety, goal success rate, and context relevance. You can also create custom model-based scoring systems configured with your choice of prompt and model for business
They also announced more advanced capabilities (like long-term episodic memory and bidirectional streaming) — letting agents learn from past interactions and support natural, interruptible voice interactions.
Amazon Bedrock adds 18 fully managed open weight models, including the new Mistral Large 3 and Ministral 3 models
Pretty straightforward.
Accelerate AI development using Amazon SageMaker AI with serverless MLflow
Amazon SageMaker AI with MLflow now includes a serverless capability that eliminates infrastructure management. This new MLflow capability transforms experiment tracking into an immediate, on-demand experience with automatic scaling that removes the need for capacity planning.
Build reliable AI agents for UI workflow automation with Amazon Nova Act, now generally available
Amazon Nova Act is now GA. Act delivers over 90% task reliability at scale while offering the fastest time to value and ease of implementation compared to other AI frameworks.
Nova Act addresses the challenge of building reliable browser automation at enterprise scale. Powered by a custom Amazon Nova 2 Lite model, Nova Act excels at driving browsers, supports calling APIs, and escalates to humans when needed. The service has core capabilities for web quality assurance (QA) testing, data entry, data extraction, and checkout flows.
Reinforcement Fine-tuning in Amazon Bedrock
Amazon Bedrock now supports reinforcement fine-tuning (RFT), which lets you train models using feedback (rewards) rather than large labeled datasets — giving developers a simpler and cheaper way to customize AI for their needs.
On average, RFT improves model accuracy by ~66% over base models, making smaller, faster, more cost-effective models that still deliver high-quality outputs.
Graviton5—the company’s most powerful and efficient CPU
New AWS Graviton5-based Amazon EC2 M9g instances deliver up to 25% higher performance than the previous generation. With 192 cores per chip and 5x larger cache, customers can scale up workloads and improve application performance while reducing infrastructure cost.