
Logicata AI Bot
April 8, 2026
The Logicata AI Bot automatically transcribes our weekly LogiCast AWS News Podcasts and summarises them into informative blog posts using AWS Elemental MediaConvert, Amazon Transcribe and Amazon Bedrock, co-ordinated by AWS Step Functions.
Welcome back to another episode of the LogiCast AWS News Podcast! In Season 5, Episode 13, hosts Karl Robinson and Jon Goodall of Logicata dive into a comprehensive mix of AWS announcements, from game-changing DevOps automation to infrastructure updates and critical security lessons. Let’s explore what’s captured their attention this week.
The Game-Changing Potential of AWS DevOps Agent
The most exciting development this week is the general availability of AWS DevOps Agent—a tool that Karl and Jon believe could fundamentally transform how cloud teams handle incident response. DevOps Agent moved from preview to GA last week, following its announcement at AWS re:Invent 2024.
According to Jon, who attended a recent session on the tool, “it’s on my list of things to do. Now there’s nothing in that session that was kind of NDA. It was just one of those, you know, you gotta be in the know to be in it, but it’s all stuff that was available in the preview. Um, and it looks really, really interesting.”
What makes DevOps Agent particularly compelling is its ability to automate the investigation phase of incident response. Jon explains the transformative potential: “One of the things that we do for our customer base… we kind of look after infrastructure for other people and that involves getting the pager and waking up in the middle of the night, which is no fun for anybody. What DevOps agent is able to do is it doesn’t fully replace an SRE or a person that’s gonna pick the phone up at the end of the day, right? It’s not doing that yet.”
However, the investigation capabilities alone represent a massive quality-of-life improvement. Jon notes that investigations constitute approximately 80 percent of incident response work: “You wake up at 3 o’clock in the morning, your phone’s screaming, you’ve got no idea what’s going on, you read the error message… you work out what’s going on, then you start digging into the logs and you go, the problem’s over here, the problem’s over there… then you work out what the problem is.”
The DevOps Agent can shortcut this entire investigation process. Rather than manually correlating logs, monitoring canaries, and analyzing exit codes and response rates, the agent performs these tasks automatically. As Jon explains, “you don’t have to go and look through every single CloudWatch log file, you don’t have to go and look at all your canaries and see, well this endpoint’s down and that endpoint’s down and correlate them to the logs… It’s done that for you and more importantly, because it’s an LLM you haven’t had to teach it how to do that.”
Customization Through Skills
One particularly powerful feature is the ability to customize the agent’s behavior through “skills”—essentially runbooks that teach the agent about your specific infrastructure. Jon clarifies: “It’s a run book effectively, right? It’s if this, do that, run all of these steps so you can give it some if you’ve got known failure modes and things that you need to investigate.”
AWS and other organizations have already created sample skills, and the agent’s agentic architecture means it continues learning from your infrastructure telemetry automatically. This combination of taught runbooks and autonomous learning eliminates the need to manually teach the agent every possible failure scenario.
The Trust Gap and Future Potential
However, Karl and Jon identify a critical gap in the current offering: while DevOps Agent can investigate and identify problems, it cannot yet execute fixes. Jon emphasizes the trust component of this limitation: “The thing that it’s not doing is it’s not then going off and going, oh, I could fix that. And then you know, increasing the disk or whatever, it’s not doing that. Hopefully it will because that is an absolute game changer, but there’s a huge amount of trust that you’ve got to have there.”
Jon suggests the ideal middle ground: “I think a good middle ground from here would be something like automated investigations and then it can only do pre-approved actions, i.e. it must run an SSM document or something like that, something that you’ve had to build and tell it what to do.”
This graduated approach—investigations first, then pre-approved remediation actions—strikes a balance between automation and safety. Nevertheless, the investigation capabilities alone are genuinely exciting for teams managing infrastructure at scale.
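Jon’s “pre-approved actions” idea amounts to an allowlist gate in front of remediation: the agent may only invoke SSM documents the team has explicitly built and signed off on. A minimal sketch of that gate, with hypothetical document names (the allowlist and names are examples, not anything AWS ships):

```python
# Sketch of an allowlist gate for pre-approved remediation.
# Document names below are hypothetical, team-authored SSM documents.
APPROVED_DOCUMENTS = {
    "Custom-ExpandDataVolume",
    "Custom-RestartAppService",
}


def run_preapproved(document_name: str, instance_ids: list) -> bool:
    """Execute an SSM document only if it has been pre-approved."""
    if document_name not in APPROVED_DOCUMENTS:
        # Refuse anything the team has not explicitly signed off on.
        return False
    import boto3  # real AWS call kept behind the gate for offline testing
    ssm = boto3.client("ssm")
    ssm.send_command(InstanceIds=instance_ids, DocumentName=document_name)
    return True
```

The point of the design is that the agent’s autonomy stops at the allowlist boundary: anything outside it is rejected before an API call is ever made.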
Amazon ECS Managed Instances Get Enhanced Features
AWS announced two significant updates to ECS managed instances: support for Amazon EC2 instance store and the introduction of managed daemons.
Instance Store Support: A Blast from the Past
The addition of instance store support for ECS managed instances represents an interesting revival of an older AWS technology. Jon provides context: “Instance store was the sort of thing that happened on pre-Nitro EC2s where you could bring them up and they didn’t have EBS volumes attached to them… the disks were directly attached, they’re in the same rack and that kind of deal, uh, and they were really fast.”
Karl frames this cyclically: “Cloud computing is a bit like fashion, it’s kind of cyclical, isn’t it? These things just come back every once in a while. Ooh, instance store, you know, let’s have a new instance store trend like flared jeans or something like that.”
The practical use case for instance store is narrow but real: extremely latency-sensitive containerized workloads that benefit from direct-attached storage. Jon notes, however: “I would think the number of people that need this is vanishingly small, though.”
Managed Daemons: Filling an ECS Gap
More practically significant is the introduction of managed daemons for ECS managed instances. This feature draws conceptual parallels to Kubernetes DaemonSets—ensuring exactly one instance of a service runs on every node in your cluster.
Karl was initially surprised that ECS didn’t already support this: “I didn’t realize ECS didn’t already do this, um, so this is something… I didn’t realize ECS didn’t, that’s really weird.”
Jon explains that this capability likely already existed for traditional EC2 compute but wasn’t available specifically for managed instances. The feature fills a genuine gap: “I think what this is doing, as you say, is it’s something that was available before and it’s just bringing managed instances in line with EC2 self-managed.”
For teams using ECS managed instances, this means you can now easily ensure that monitoring agents, log collectors, or security tools run on every node without manually managing replicas or dealing with placement complexity.
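In ECS terms, this per-node pattern is the `DAEMON` scheduling strategy, the analogue of a Kubernetes DaemonSet. A minimal sketch of what a daemon service definition looks like via the ECS API (cluster, service, and task-definition names are hypothetical):

```python
# Minimal sketch: running a per-node agent on ECS with the DAEMON
# scheduling strategy. Names used below are hypothetical examples.
def daemon_service_args(cluster: str, service: str, task_def: str) -> dict:
    """Build create_service arguments for a one-task-per-instance daemon."""
    return {
        "cluster": cluster,
        "serviceName": service,
        "taskDefinition": task_def,
        # DAEMON places exactly one task on each container instance.
        # No desiredCount is given: ECS derives it from the fleet size.
        "schedulingStrategy": "DAEMON",
    }


# The real call would look like:
# import boto3
# boto3.client("ecs").create_service(
#     **daemon_service_args("prod-cluster", "log-forwarder", "fluent-bit:3"))
```

The notable design point is the absent `desiredCount`: with the daemon strategy, ECS tracks instances joining and leaving the cluster for you, which is exactly the replica bookkeeping this feature removes.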
Lambda Heavy: Convergence of Serverless and Traditional Computing
An AWS Compute blog post about building high-performance applications with Lambda managed instances sparked discussion about the evolving definition of serverless computing. This represents another feature in AWS’s “managed instances” rollout—infrastructure that AWS manages but that sits somewhere between pure serverless and self-managed.
Jon expresses mixed feelings about this direction: “I still have very mixed feelings about lambda heavy, as we’re calling it now. I have very mixed feelings about this, cause to me, the whole point of lambda, and this was the strapline of the services and I think it still is, is run apps without thinking about infrastructure.”
He traces Lambda’s evolution: “The billing when it first came out was per 100 milliseconds, and then it was per 10, and now it’s per millisecond… Cold starts on certain runtimes, when it was a new service, were a real problem. They were a real problem, and now they’re basically not.”
The introduction of Lambda managed instances represents a philosophical shift. Instead of pure abstraction, AWS now offers deployment options with explicit capacity management. Jon questions whether this solves a real problem: “What you can do with this is you can say, here is my capacity… if it’s something that you need really high CPU capacity because lambda is just kind of one knob, you turn one knob and you get a certain thing.”
However, he’s skeptical about the target audience: “For your general day to day workloads, for your, I’m running an API system, I’m running, you know, it’s a static site and there’s an API behind it that hits API gateway and lambdas do things, this doesn’t make sense for most people.”
Karl and Jon ultimately agree that if you need Lambda managed instances, you probably already know it. For everyone else, traditional Lambda, Fargate, or other compute options remain more appropriate.
Valkey Migration: Open Source Economics in Action
An AWS database blog post highlighted migration best practices for moving from Amazon ElastiCache for Redis to ElastiCache for Valkey. This represents an interesting case study in open source economics and cloud cost optimization.
Jon notes that Valkey is relatively recent, arriving roughly a year ago as Amazon’s answer to Redis licensing changes. “It’s Amazon’s own, uh, caching engine. And, uh, it’s cheaper than Redis, so that would be your primary motivator for, uh, wanting to migrate to Valkey from Redis: that would be to, uh, save money.”
The article includes a case study from “a global leader in the travel technology industry”—anonymized likely for contractual reasons. The migration results were impressive: costs dropped from just under $1,000 per day to just under $800 per day, a 20 percent reduction. They achieved this with minimal downtime and elevated latencies only during the migration window itself.
The motivation behind the migration, according to the case study, was maintaining open source technology. However, Karl and Jon are pragmatic about the real driver: “This realistically wasn’t a, it must stay open source requirement, this is a migration because we want to save 30% on our bill, thanks very much.”
Valkey’s strategy is to maintain perfect API compatibility with open source Redis, enabling what Jon describes as a straightforward migration path: “If they can, um, pattern match it and just say forever, this is going to match the functionality of Redis open source and you can just drop it straight in.”
For organizations with significant ElastiCache for Redis spend, this represents a genuine opportunity for cost optimization without application changes. Jon recommends a cautious approach, though: “I wouldn’t recommend doing that with your production workloads, admittedly. I recommend you try that with staging as a quick way of kind of giving it a test.”
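Because Valkey keeps wire compatibility with open-source Redis, the same client code should behave identically against either engine, which makes Jon’s “try it against staging” advice easy to script. The smoke test below takes any Redis-style client object, so it can be rehearsed offline with a fake before pointing it at a real staging endpoint (the endpoint shown in the comment is an invented example):

```python
# A tiny drop-in compatibility smoke test for a Redis-compatible cache.
def cache_smoke_test(client) -> bool:
    """Exercise basic set/get/delete against a Redis-style client."""
    client.set("smoke:key", "hello")
    ok = client.get("smoke:key") == "hello"
    client.delete("smoke:key")
    return ok and client.get("smoke:key") is None


class FakeCache:
    """In-memory stand-in mimicking the subset of the client we use."""
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


# Against a staging Valkey cluster, the call is identical (hypothetical host):
# import redis
# cache_smoke_test(redis.Redis(host="staging-valkey.example.com",
#                              port=6379, decode_responses=True))
```

If the same test passes against both the existing Redis cluster and the Valkey staging cluster, that is a cheap first signal that the drop-in claim holds for your workload’s command set.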
Supply Chain Attacks: The New Weak Link
The week’s final article covered a data breach affecting the European Commission, which exposed 350 gigabytes of data from multiple databases hosted on AWS. Initial reporting was vague, but further investigation revealed the actual attack vector: hackers exploited an API key compromised during a supply chain attack on Aqua Security’s Trivy vulnerability scanner.
The attack chain illustrates a broader security trend. Karl notes that direct attacks have become increasingly difficult: “Everything else is reasonably well hardened… it’s like the case of… In the 70s, if you wanted to nick someone’s car, you just nicked the car. Now you have to break into their house to steal the keys because you can’t nick the car.”
Supply chain attacks represent the modern equivalent of stealing keys before attempting to enter the car. Rather than breaking into AWS or compromising the European Commission’s infrastructure directly, attackers compromised a development tool that the Commission used, stealing API credentials in the process.
Jon emphasizes that this represents an AWS and cloud infrastructure issue only tangentially: “From an AWS and cloud perspective, OK, just another Tuesday, right, there’s nothing kind of particularly new or novel about this.”
The real lesson, according to both hosts, involves understanding where the actual weak link lies. Karl points out that the attacker likely didn’t target AWS specifically: “Odds are it wasn’t an AWS problem, odds are the commission did something wrong.” While this sounds harsh, the technical reality supports this assessment—the Commission’s mistake was likely insufficient rotation of API keys or excessive permissions granted to development tools.
Karl suggests the practical reality of defense: “We’re seeing more and more supply chain attacks and it’s a thing to be concerned about, but there’s not a whole heap you can do unless you start rolling your own custom libraries for everything, and just why would you do that?”
The supply chain attack landscape will likely continue evolving. Organizations can minimize risk through practices like limiting API key permissions, rotating credentials regularly, and monitoring unusual API activity—but eliminating supply chain risk entirely while using modern development tools remains essentially impossible.
Conclusion
This week’s AWS news covers exciting progress in automation with DevOps Agent, incremental improvements to ECS and Lambda, cost optimization opportunities with Valkey, and important security lessons from real-world breaches. The common thread running through these announcements is AWS’s continued investment in tools that abstract infrastructure management while maintaining developer control.
For teams managing infrastructure at scale, DevOps Agent’s investigation capabilities represent a meaningful quality-of-life improvement. For cost-conscious organizations running Redis caching layers, Valkey presents a straightforward migration path. And for security teams, the European Commission breach serves as a reminder that cloud infrastructure itself is increasingly secure—the weak link lies in supply chain dependencies and credential management.
As Karl notes, “if you’re not sure, you don’t” need many of these advanced features. But for those in the specific use cases these tools address, the improvements are genuinely valuable.
This is an AI generated piece of content, based on the Logicast Podcast Season 5, Episode 13.





