Get the top HN stories in your inbox every day.
0xbadcafebee
As someone who's been doing Infra stuff for two decades, this is very exciting. There is a lot of mindless BS we have to deal with due to shitty tools and services, and AI could save us a lot of time that we'd rather use to create meaningful value.
There is still benefit for non-Infra people. But non-Infra people don't understand system design, so the benefits are limited. Imagine a "mechanic AI". Yes, you could ask it all sorts of mechanic questions, and maybe it could even do some work on the car. But if you wanted to, say, replace the entire engine with a different one, that is a systemic change and has farther reaching implications than an AI will explain, much less perform competently. You need a mechanic to stop you and say, uh, no, please don't change the engine; explain to me what you're trying to do and I'll help you find a better solution. Then you need a real mechanic to manage changing the tires on the moving bus so it doesn't crash into the school. But having an AI could make the mechanic do all of that smoother.
Another thing I'd love to see more AI use of, is people asking the AI for advice. Most devs seem to avoid asking Infra people for architectural/design advice. This leads to them putting together a system using their limited knowledge, and it turns out to be an inferior design to what an Infra person would have suggested. Hopefully they will ask AI for advice in the future.
nickpapciak
Glad you find it interesting. One surprising use we’re seeing right now is technical people without deep infrastructure expertise asking Datafruit how things should be done.
Something we’ve been dealing with is trying to get the agents to not over-complicate their designs, because they have a tendency to do so. But with good prompting they can be very helpful assistants!
0xbadcafebee
Yeah it's def gonna be hard. So much of engineering is an amalgam of contexts, restrictions, intentions, best practice, and what you can get away with. An agent honed by a team of experts to keep all those things in mind (and force the user to answer important questions) would be invaluable.
Might be good to train multiple "personalities": one's a startup codebro that will tell you the easiest way to do anything; another will only give you the best practice and won't let you cheat yourself. Let the user decide who they want advice from.
Going further: input the business's requirements first, let that help decide? Just today I was on a call where somebody wants to manually deploy a single EC2 instance to run a big service. My first question is, if it goes down and it takes 2+ days to bring it back, is the business okay with that? That'll change my advice.
nickpapciak
Yes definitely! That's why we do believe the agents, for the time being, will act as great junior devs that you can offload work onto, while as they get better they can slowly get promoted into more active roles.
The personalities approach sounds fun to experiment with. I'm wondering if you could use SAEs to scan for a "startup codebro" feature in language models. Alas this is not something we get to look into until we think that fine-tuning our own models is the best way to make them better. For now we are betting on in-context learning.
Business requirements are also incredibly valuable. Notion, Slack, and Confluence hold a lot of context, but it can be hard to find. This is something that I think the subagents architecture is great for though.
paool
Funnily enough, the same scenario holds true for actual programmers vs vibe coders.
Even if you manage to prompt an app, you'll still have no idea how the system works.
cddotdotslash
I can see the value, but to do the things you're describing, the AI needs to be given fairly highly-privileged credentials.
> Right now, Datafruit receives read-only access to your infrastructure
> "Grant @User write access to analytics S3 bucket for 24 hours" > -> Creates temporary IAM role, sends least-privilege credentials, auto-revokes tomorrow
These statements directly conflict with one another.
So it needs "iam:CreateRole," "iam:AttachRolePolicy," and other similar permissions. Those are not "read-only." And, they make it effectively admin in the account.
What safeguards are in place to make sure it doesn't delete other roles, or make production-impacting changes?
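To make the read-only versus write distinction concrete, here is a minimal sketch of a check that flags non-read actions in a requested permission set. The prefix list and the helper names (`is_read_only`, `write_actions`) are hypothetical and purely illustrative. The prefix rule is a rough heuristic, not how IAM itself classifies actions, and nothing here is confirmed to be how Datafruit does it.

```python
# Heuristic only: treats actions whose name starts with Get/List/Describe as
# read-only. Real IAM access levels are more nuanced (some "Get" actions
# return secrets), so treat this as an illustration, not a security control.
READ_ONLY_PREFIXES = ("Get", "List", "Describe")


def is_read_only(action: str) -> bool:
    """Return True if an action such as "s3:GetObject" looks read-only."""
    service, _, name = action.partition(":")
    # Reject malformed names and wildcards; a wildcard may expand to writes.
    if not service or not name or "*" in name:
        return False
    return name.startswith(READ_ONLY_PREFIXES)


def write_actions(actions: list[str]) -> list[str]:
    """Return the actions that do not pass the read-only heuristic."""
    return [a for a in actions if not is_read_only(a)]


requested = [
    "s3:GetObject",
    "ec2:DescribeInstances",
    "iam:CreateRole",
    "iam:AttachRolePolicy",
]
flagged = write_actions(requested)
print(flagged)
```

Under this heuristic the two IAM actions are flagged, which is the conflict being pointed out: a principal that can create roles and attach policies is not read-only.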
nickpapciak
Ahh. To clarify, changes like granting users access would be done by our agent modifying IaC, so you would still have to manually apply the changes. Every potentially destructive change being an IaC change helps allow the humans to always stay in the loop. This admittedly makes the agents a little more annoying to work with, but safer.
Kwpolska
So you’re modifying Terraform? How is your tool better than just using an AI-enabled IDE and asking it to apply the change?
How is the auto-revoke handled? Will it require human intervention to merge a PR/apply the Terraform configuration, or will it do it automatically?
nickpapciak
Lots of people have asked us this! We try to do more than just being an AI-enabled IDE by giving the agent access to your infrastructure and observability tools. So you can query over your AWS, get information about metrics over the past few days, etc etc. We also plan to integrate with more DevOps tools as our customers ask for them. We also try to be less like an IDE, and more like an autonomous agent. We've noticed that DevOps engineers actually like being engineers, and enjoy some infrastructure tasks, while there are others that they would rather automate away. Not sure if you have experienced this sentiment?
Also, auto-revoke right now can be handled by creating a role in Terraform that can be assumed and expires after a certain time. But we’re exploring deeper integrations with identity providers like Okta to handle this better.
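One way to get access that expires on its own is a policy with a date condition, so the grant stops working without anyone revoking it. The sketch below builds such a policy document in Python. It assumes an S3 write grant and uses the IAM `DateLessThan` operator with the `aws:CurrentTime` condition key. The function name, bucket name, and chosen action are hypothetical, and this is not confirmed to be how Datafruit's Terraform role is written.

```python
import json
from datetime import datetime, timedelta, timezone


def time_boxed_write_policy(bucket: str, hours: int, now: datetime) -> dict:
    """Build an IAM policy document allowing S3 writes until now + hours."""
    expires = now + timedelta(hours=hours)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # After this instant the Allow no longer matches any request.
                "Condition": {
                    "DateLessThan": {
                        "aws:CurrentTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ")
                    }
                },
            }
        ],
    }


# Fixed start time so the example is reproducible.
start = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
policy = time_boxed_write_policy("analytics", 24, start)
print(json.dumps(policy, indent=2))
```

The policy still has to be applied by a human through the IaC workflow. Only the expiry is automatic, and the stale statement should eventually be cleaned up in a follow-up change.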
primitivesuave
IMO it is a smart decision to implement this as a self-hosted system, and have the AI make PRs against the IaC configuration - for devops matters, human-in-the-loop is a high priority. I'm curious how well this would work if I'm using Pulumi or the AWS CDK (both are well-known to LLMs).
I consulted for an early stage company that was trying to do this during the GPT-3 era. Despite the founders' stellar reputation and impressive startup pedigree, it was exceedingly difficult to get customers to provide meaningful read access to their AWS infrastructure, let alone the ability to make changes.
nickpapciak
LLMs are pretty awesome at Terraform, probably because there is just so much training data. They are also pretty good at the AWS CDK and Pulumi to a bit of a lesser extent, but I think giving them access to documentation is what helps make them the most accurate. Without good documentation the models start to hallucinate a bit.
And yeah, we are noticing that it’s difficult to convince people to give us access to their infrastructure. I hope that a BYOC model will help with that.
elpakal
Congrats on the launch. As a former CI build engineer, I’m very curious about this and look forward to watching your progress. One question:
> we’ve talked to a couple of startups where the Claude Code + AWS CLI combo has taken their infra down
Do you care to share what language model(s) you use?
nickpapciak
Thank you! We currently mainly use Claude Sonnet and then Opus for more difficult tasks. We experimented with GPT 5 when it came out but we might need to do some more experiments to see if it’s better. Better evals is something we are working on before we experiment too much with different models!
debarshri
I think you are underestimating the nuances of non-FAANG infrastructure. Also, based on my previous experience, you will meet developer resistance (maybe AI can help you beat that). By being broad you are also competing with purpose-built solutions like FinOps, DevSecOps, etc., which also seem to have agents now.
It is workflow automation at the end of the day. I would rather pick SOAR or AI-SOC, where automation like this is very common, e.g. blinkops or torq.
nickpapciak
That's fair. For what it's worth, our agents are being used by small startups in the YC batch and they have been helpful for them.
We have not spent as much time working in the security space, and I do think that purpose-built solutions are better if you only care about security. We are purposefully trying to stay broad, which might mean that our agents lack depth in specific verticals.
debarshri
I wouldn't index on these startups. The people who would pay big bucks are in enterprise. That's largely your market.
nickpapciak
Totally agree, enterprise is where the most $ is to be made, but from what we've found they care a lot about doing one specific thing very well. This has been something we've been thinking about. For now we've enjoyed working with startups as they have very interesting challenges that only appear at smaller scale.
Kwpolska
> (1) automated infrastructure audits— agents periodically scan your environment to find cost optimization opportunities, detect infrastructure drift, and validate your infra against compliance requirements.
Why does that need an AI? I’m pretty sure many tools for those things exist, and they predate LLMs.
nickpapciak
Glad you mentioned this! We do use open source rule-based scanners internally to make it more deterministic. This is also a new feature, and we'd probably want to integrate with existing tools rather than competing with them. We do think there are some benefits of using LLMs though.
I think the power language models introduce is being able to more tightly integrate app-code with the infrastructure. They can read YAML, shell scripts, or ad-hoc wiki policies and map them to compliance checks, for example.
Albert-Lam
Congrats on the launch! Excited to see you guys adopt a BYOC distribution model
nickpapciak
thank you!
thecolorblue
I have loved the integration of AI feedback in github PRs. Would I be able to test out Datafruit github integration on an open source project?
nickpapciak
Potentially! But we do not do PR reviews at the moment. Are infra code reviews something you are interested in?
roggenbuck
Really great stuff! Congrats on the launch!
tealpod
Sounds good.
BTW, your website is heavy, for a basic set of components it shouldn't be taking 100% CPU.
vivzkestrel
There have been a lot of attempts to make products like these, but this kind of product almost always has one problem: nobody is really sure about the access privileges it requires to operate and what it does on its backend with such privileges.
nickpapciak
That's an interesting approach. For us, we give it read-only privileges which gives the agent the context of your infrastructure, without giving it the capabilities to break things. But I do see a world where we give it more access, but add additional safeguards.
solatic
My 2 cents (DevOps Engineer with a decade of experience, MBA, optimistic about tools like Claude Code and pessimistic about what I'm seeing here):
You need to be very clear about the persona who you're building for, what their pain point is, and why they're willing to spend money to solve it. So far it seems like you took an emerging technology (agentic workflows), applied it to a novel area (DevOps), built a UX around it, and tried to immediately start selling. This is the product trap of a solution in search of a problem.
Are you trying to sell to large companies? The problem that large companies have is cultural/organizational, not tooling. For any change, you need to get about a dozen people to review, understand, wait for people to come back from vacation, ping people because it fell off their desk, sign off, get them to prioritize, answer questions again from the engineer the task was assigned to, wait for another round of reviews and approvals, and maybe finally somebody will get the fix applied in production. DevOps is (or at least, it originally used to be) focused on finding and alleviating the bottlenecks; the actual process of finding data or applying changes is not the bottleneck in large companies and so therefore it is not a solution to the pain point that different folk in large companies have. If your value proposition is that large company executives could replace Infrastructure employee salaries with a cheaper agentic workflow, you need to re-read my prior point - if large companies have all this process and approvals for human beings making changes, why would they ever let an agentic workflow YOLO the changes without approval? And yes, I know, your agent proposes Terraform PRs for making changes to keep a human in the loop - but now you slayed one of the Hydra's heads and three more have popped up in its place: the customer needs the Terraform PR to be reviewed by a human committee, some of whose members are on vacation, some of whose members missed the PR request because they had other priorities and it fell off their desk, etc. etc. Doesn't really sound like you solved anything. The fundamental difference between what you built and something like Claude Code is that Claude Code doesn't need a human committee to review on every iteration it executes on an engineer's laptop, only the review of the One Benevolent Laptop User who is incentivized to get good output from Claude Code and provide human review as quickly as (literally) humanly possible.
Are you trying to sell to small companies that don't have DevOps Engineers? What's the competitive space here? The options usually look something like, (a) pay a premium for a PaaS, (b) spend on the salary for your first DevOps Engineer in the hopes that they will save more on low-level infra bills compared to their salary, so you're posing now (c) some kind of DevOps agentic workflow that is cheaper than a DevOps Engineer salary but will provide similar infra cost savings? So your agentic workflow will actually lift and shift to better/cheaper infra primitives and own day-to-day maintenance, responding to infra issues which your customers - who aren't DevOps Engineers, and don't know anything about infra, and are trying to outsource these concerns to you - which your customers don't know how to handle? I would argue that if you really did achieve that, then you should be building an agentic-workflow-maintained PaaS that, by virtue of using agents instead of humans, can undercut traditional PaaS on cost while offering a maybe better UX somehow. If you're asking your customers to review infra changes that they don't understand, then they need to hire a DevOps Engineer for the expertise to review it, and then you have a much less interesting value proposition.
nickpapciak
That’s actually a really interesting point. We started out building out basically an “agentic PaaS” exactly as you described, but quickly found difficulty in securing more customers and moving up-market (from seed stage to series A+) for it. Because a PaaS did not have sufficient abstractions + the customers were too afraid to give us control, because even if it was their cloud, if we went under there was a sense that they “lost” their deployment platform. (This was the sentiment we were able to piece together from talking to many people).
Right now most of our value, as you said, is in augmenting an infra engineer at a growth stage company to limit some of the operational burdens they deal with. For the companies we’ve been selling to, the customers are SWEs who have been forced to learn infra when needs arise. But overall they are fairly competent and technical. And Claude Code or other agentic coding tools are not always sufficient or safe to use. Our customers have told us anecdotally that Claude Code gets stuck in a hallucination loop of nothingness on certain tasks and that Datafruit was able to solve them.
That being said, we have lost sales because people are content with Claude Code. So this is something we are thinking about.
solatic
So if you want to target an infra engineer at a growth company, usually in growth companies where there is only one, maybe two infra/non-product engineers, I would recommend that you start from the following axioms:
1. Infra engineers always want to apply changes by themselves, but tooling can always recommend changes
2. What are all the kinds of work that infra engineers would love to do, that *do* add value, but that they haven't built yet because they can't prioritize it?
3. How do you build an agent that:
a. Understands architectural context
b. Offers to set up (i.e. Terraform PR) value-adding infra
c. That the human infra engineer can easily maintain
d. That the human infra engineer will appreciate as being value-adding and not time-wasting or unnecessary-expense?
Maybe the key isn't to provide an agent that will power a PaaS; maybe the key is to give early infra engineers the productivity to build their own in-house PaaS. Then your value-add above Claude Code is clear, because Claude Code is a generic enough tool that it doesn't even make any recommendations; because a DevOps agent works within an axiomatic framework of improving visibility, reducing costs, improving release velocity, improving security, etc., so it can even start (after understanding the architecture, i.e. by hooking up MCP servers and writing an INFRA.md) by making recommendations and then just ask the customer if they like the PRs it is proposing. Does that resonate with you?
nickpapciak
Yes, some of this definitely resonates. We really want the agents to suggest their own, novel projects, beyond security or cost optimization. I think this is more feasible for coding agents to do in infra rather than dev work because a lot of the dev work depends strictly on what the customers want, whereas infra work can be more internal and developer focused so there’s opportunities to suggest improvements to the internal system.
I think in the near-term, however, the problem we have identified is that while developers at growth stage have been vastly accelerated, the infra engineers have not been. So our tool is almost helping them “catch up” to the new rapid pace of development. This is dangerous due to the complexity and need for infrastructure to be robust, hence why we are really focused on making it safe to use.
At larger enterprisey companies, AI has not yet been an extreme productivity boost for the developers like it has been for growth stage companies. But I do believe that an enterprise adoption wave is coming.
Hey HN! We’re Abhi, Venkat, Tom, and Nick and we are building Datafruit (https://datafruit.dev/), an AI DevOps agent. We’re like Devin for DevOps. You can ask Datafruit to check your cloud spend, look for loose security policies, make changes to your IaC, and it can reason across your deployment standards, design docs, and DevOps practices.
Demo video: https://www.youtube.com/watch?v=2FitSggI7tg.
Right now, we have two main methods to interact with Datafruit:
(1) automated infrastructure audits— agents periodically scan your environment to find cost optimization opportunities, detect infrastructure drift, and validate your infra against compliance requirements.
(2) chat interface (available as a web UI and through slack) — ask the agent questions for real-time insights, or assign tasks directly, such as investigating spend anomalies, reviewing security posture, or applying changes to IaC resources.
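As an illustration of what "detect infrastructure drift" means in the audit mode above, here is a minimal sketch that compares the state declared in IaC with the state observed in the cloud. The dictionary shapes, resource names, and the `detect_drift` helper are hypothetical. A real audit would read Terraform state and cloud APIs, and this is not Datafruit's implementation.

```python
def detect_drift(declared: dict, actual: dict) -> dict:
    """Compare declared (IaC) resources with observed ones, keyed by name."""
    drift = {}
    for name in sorted(declared.keys() | actual.keys()):
        want = declared.get(name)
        have = actual.get(name)
        if want is None:
            drift[name] = "unmanaged"  # exists in the cloud, not in IaC
        elif have is None:
            drift[name] = "missing"  # declared in IaC, not found in the cloud
        elif want != have:
            changed = sorted(
                k for k in want.keys() | have.keys() if want.get(k) != have.get(k)
            )
            drift[name] = "changed: " + ", ".join(changed)
    return drift


declared = {
    "web": {"instance_type": "t3.small", "public": False},
    "db": {"instance_type": "t3.medium"},
}
actual = {
    "web": {"instance_type": "t3.large", "public": False},
    "cache": {"instance_type": "t3.micro"},
}
report = detect_drift(declared, actual)
print(report)
```

A diff like this is deterministic and needs no language model. The model's role, if any, is in explaining the findings or proposing a fix.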
Working at FAANG and various high-growth startups, we realized that infra work requires an enormous amount of context, often more than traditional software engineering. The business decisions, codebase, and cloud itself are all extremely important in any task that has been assigned. To maximize the success of the agents, we do a fair amount of context engineering. Not hallucinating is super important!
One thing which has worked incredibly well for us is a multi-agent system where we have specialized sub-agents with access to specific tool calls and documentation for their specialty. Agents choose to “handoff” to each other when they feel like another agent would be more specialized for the task. However, all agents share the same context (https://cognition.ai/blog/dont-build-multi-agents). We’re pretty happy with this approach, and believe it could work in other disciplines which require high amounts of specialized expertise.
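The handoff pattern described above can be sketched as follows: specialist agents pass a task to one another while appending to one shared context, so no agent loses what the previous one saw. All class and function names here are hypothetical. Keyword matching stands in for the model deciding when to hand off, and the real system is certainly more involved.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContext:
    """Single transcript that every agent reads from and appends to."""
    messages: list[str] = field(default_factory=list)


@dataclass
class Agent:
    name: str
    keywords: tuple[str, ...]

    def can_handle(self, task: str) -> bool:
        # Stand-in for an LLM judging whether the task fits its specialty.
        return any(k in task.lower() for k in self.keywords)


def run(task: str, agents: list[Agent], start: Agent, ctx: SharedContext,
        max_hops: int = 3) -> str:
    """Route a task between agents; return the name of the final agent."""
    current = start
    for _ in range(max_hops):
        if current.can_handle(task):
            ctx.messages.append(f"{current.name}: handled '{task}'")
            return current.name
        nxt = next(
            (a for a in agents if a is not current and a.can_handle(task)), None
        )
        if nxt is None:
            break
        ctx.messages.append(f"{current.name}: handoff to {nxt.name}")
        current = nxt
    ctx.messages.append(f"{current.name}: no specialist found for '{task}'")
    return current.name


cost = Agent("cost", ("spend", "cost"))
security = Agent("security", ("iam", "policy"))
ctx = SharedContext()
handled_by = run(
    "Review IAM policy for the analytics bucket", [cost, security], cost, ctx
)
print(handled_by, ctx.messages)
```

Because both agents write to the same `SharedContext`, the security agent sees that the task started with the cost agent, which is the property the shared-context design is meant to preserve.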
Infrastructure is probably the most mission-critical part of any software organization, and needs extremely heavy guardrails to keep it safe. Language models are not yet at the point where they can be trusted to make changes (we’ve talked to a couple of startups where the Claude Code + AWS CLI combo has taken their infra down). Right now, Datafruit receives read-only access to your infrastructure and can only make changes through pull requests to your IaC repositories. The agent also operates in a sandboxed virtual environment so that it could not write cloud CLI commands if it wanted to!
Where LLMs can add significant value is in reducing the constant operational inefficiencies that eat up cloud spend and delay deadlines—the small-but-urgent ops work. Once Datafruit indexes your environment, you can ask it to do things like:
We charge a straightforward subscription model for a managed version, but we also offer a bring-your-own-cloud model. All of Datafruit can be deployed on Kubernetes using Helm charts for enterprise customers where data can’t leave your VPC. For the time being, we’re installing the product ourselves on customers' clouds. It doesn’t exist in a self-serve form yet. We’ll get there eventually, but in the meantime if you’re interested we’d love for you guys to email us at founders@datafruit.dev.
We would love to hear your thoughts! If you work with cloud infra, we are especially interested in learning about what kinds of work you do which you wish could be offloaded onto an agent.