Building a Reliable AI Infrastructure Layer

Pranav Pusarla

Rox runs thousands of different agents across its products. That means millions of model calls per day that need to execute reliably around the clock. At that volume, infrastructure matters.

We rely on a combination of standard techniques to keep things running smoothly: rate limiting, fallbacks, and dynamic routing. However, as traffic scales, you begin to realize that standard techniques break under the pressure. That's where the fun begins: breaking the problem down to the basics and innovating on the fly.

Here are a few of our stories.

Rate Limiting

Every platform that uses AI agents depends on multiple model providers: OpenAI, Anthropic, and open-source models, to name a few. With that usage come rate limits in two common forms: RPM (requests per minute) and TPM (tokens per minute). What providers don't always tell you is that once you hit those limits, they quietly start throttling your service, creating unpredictability at exactly the wrong moments.

So we built our own internal rate limiting service. The goal was simple: take back control and protect the reliability of our product. We used Redis via the pyrate-limiter package, wrapped with a retry layer and jitter to handle traffic spikes gracefully.
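The retry layer itself is not shown in this post. As a minimal sketch of the idea, assuming exponential backoff with full jitter (a common choice, not necessarily the one running in production), it could look like the following. The names `RateLimited`, `backoff_delay`, and `call_with_retry` are hypothetical and do not come from pyrate-limiter.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised when the limiter rejects a call."""


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(fn, max_attempts=5, sleep=time.sleep, rng=random):
    """Call fn(); on RateLimited, sleep a jittered delay and try again."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(backoff_delay(attempt, rng=rng))
```

The random delay spreads retries out, so callers that were rejected at the same moment do not all come back at the same moment.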

We shipped it to production. The next morning, we had a SEV-0.

Redis memory usage had gone through the roof. Core parts of our product were coming to a halt. After a few hours of digging, we found the culprit: pyrate-limiter uses a Leaky Bucket algorithm that stores every request as an individual timestamp. For a single request carrying 20,000 tokens, the package was making 20,000 separate Redis calls: one per token.

Ex. pseudocode from package

for i=1,TOKENS do
    redis.call('ZADD', bucket, now, i)
end

You can imagine our reaction. We were routing millions of tokens through this thing.

We went back to basics, researched industry approaches, and landed on the rolling window algorithm: a method that groups all requests within a time window into a single bucket, storing usage as a simple key-value pair. We wrote the bucket implementation from scratch using our own custom Lua scripts and patched it into the package. The Redis call overhead dropped dramatically. While it is an approximation compared to the full precision of Leaky Bucket, you can increase the number of buckets to get comparable accuracy. For our use case, it was the perfect fit.
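As a rough illustration of the bucketed approach described above, here is an in-memory Python sketch. It is not our Lua implementation: the class name `BucketedWindowLimiter`, its parameters, and the in-process dictionary standing in for Redis are all assumptions made for the example.

```python
import time
from collections import defaultdict


class BucketedWindowLimiter:
    """Approximate rolling-window limiter: one counter per time bucket.

    In-memory stand-in for illustration; in Redis each bucket could be a
    single key updated with INCRBY and given an expiry.
    """

    def __init__(self, limit, window_s=60.0, buckets=6, clock=time.monotonic):
        self.limit = limit
        self.buckets = buckets
        self.bucket_s = window_s / buckets
        self.clock = clock
        self.counts = defaultdict(int)  # bucket index -> units used

    def try_acquire(self, cost=1):
        """Reserve `cost` units (tokens or requests) if the window has room."""
        current = int(self.clock() // self.bucket_s)
        oldest = current - self.buckets + 1
        # Drop buckets that have slid out of the window.
        for idx in [i for i in self.counts if i < oldest]:
            del self.counts[idx]
        used = sum(self.counts.values())
        if used + cost > self.limit:
            return False
        self.counts[current] += cost  # one write, regardless of cost
        return True
```

A 20,000-token request costs one counter update here instead of 20,000 stored entries. Raising `buckets` makes the window slide in smaller steps, at the price of a few more keys per limit.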

[Diagram of Rolling Window]


We shipped the fix. It's holding up well.

Search Provider Reliability

One of our most popular features is Clever Columns, which lets users generate answers to any question across a list of companies or accounts (see Mehul’s and Brian’s posts). Under the hood, it works by scraping public URLs through search providers and extracting the information needed to answer each query.

But what happens when a search provider goes down? We built a fallback system with multiple layers of redundancy. What we didn't anticipate were the problems that came with juggling multiple external services.
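To make the fallback idea concrete, here is a minimal sketch that tries providers in order and returns the first successful result. The function `search_with_fallback`, the `AllProvidersFailed` error, and the `(name, search_fn)` pairs are hypothetical; the real system has more layers than this single chain.

```python
class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def search_with_fallback(query, providers):
    """Try each (name, search_fn) pair in order; return the first success."""
    errors = []
    for name, search_fn in providers:
        try:
            return name, search_fn(query)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append((name, exc))
    raise AllProvidersFailed(errors)
```

Returning the provider name alongside the result makes it easy to log which layer actually served a request.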

External providers offer SDKs to make integration easy and secure, and for a while that convenience was great. Then we started noticing something strange: bizarre lag spikes under traffic, potential memory leaks, and unusually high CPU usage. After days of investigation, we found the source.

One of our primary search providers was opening a brand new HTTP connection for every single request. It was baked right into their SDK. Every new connection means a full TCP and TLS handshake, expensive in both latency and compute. The fix was connection pooling: we patched their internal parameters on our end to use our own pooling client. It worked, until the same issue surfaced with another search provider.

Ex. code that caused bug

limits=httpx.Limits(max_keepalive_connections=0)

That kicked off one of the longest internal debates we've ever had. One side wanted to ditch the SDK approach entirely. Patching each new provider's internals wasn't scalable. It was better to own the logic ourselves and talk directly to the source of truth: the API. The other side was worried about the maintenance burden. If we bypass the SDK, we have to track every upstream API change ourselves.

We chose to trust our engineers’ capabilities and build from the ground up. We scrapped the SDK, built our own connection pool from scratch, and created lightweight provider clients that talk directly to each API.
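The shape of that design can be sketched with the standard library alone. This is an illustration under assumptions, not our production code: `ConnectionPool` and `ProviderClient` are hypothetical names, and the connection objects are assumed to expose hypothetical `request(method, path, params)` and `close()` methods.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keeps idle connections for reuse instead of opening one per request."""

    def __init__(self, factory, max_idle=10):
        self._factory = factory  # callable that opens a new connection
        self._idle = queue.LifoQueue(maxsize=max_idle)

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get_nowait()  # reuse: no new handshake
        except queue.Empty:
            conn = self._factory()  # pool is empty, open a new one
        try:
            yield conn
        except Exception:
            conn.close()  # do not return a possibly broken connection
            raise
        else:
            try:
                self._idle.put_nowait(conn)
            except queue.Full:
                conn.close()  # pool is full, discard the extra


class ProviderClient:
    """Hypothetical thin client: builds requests, borrows pooled connections."""

    def __init__(self, pool, path="/search"):
        self._pool = pool
        self._path = path

    def search(self, query):
        with self._pool.connection() as conn:
            return conn.request("GET", self._path, {"q": query})
```

Each provider gets its own small client, while the handshake cost is paid once per pooled connection rather than once per request.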

[Diagram of HTTP client pooling and providers]


A week later, it was in production, ready before the weekend. Within days after that, we were able to scale our search providers by 3x without worrying about external provider behavior.

Reliability as a Service

There's a phrase that gets repeated a lot in our office: "Think from first principles." It sounds simple, but it becomes essential when you're building the layer that holds everything else up.

Whenever we start a new infrastructure problem, we ask ourselves three questions:

  • Why does this need to exist?

  • How will this affect the user's experience?

  • How do we build this to scale from the very beginning?

In a space moving as fast as AI, what makes any platform truly usable isn't just the intelligence of the models but the reliability underneath them. The two problems we described here are just a small part of the engineering framework we're building every day.

Get started today


Rox is committed to the privacy and security of its users. Customer data processed through the Rox platform is encrypted in transit and at rest using AES-256 encryption and is never used to train generalized machine learning models. Rox maintains SOC 2 Type II compliance and undergoes independent third-party security audits on an annual basis. All AI-generated outputs, including but not limited to prospect recommendations, message drafts, meeting summaries, and pipeline scoring, are provided for informational purposes and should be reviewed by authorized personnel before any action is taken. Performance metrics referenced on this website, including pipeline generation figures, response rates, and revenue impact, reflect results reported by individual customers under specific configurations and may not be representative of all deployments. Actual results will vary based on factors including but not limited to data quality, CRM configuration, outreach volume, market conditions, and target audience. Rox does not guarantee specific revenue outcomes. The Rox platform integrates with third-party services including Salesforce, HubSpot, Gmail, Microsoft Outlook, Slack, and others; availability and functionality of third-party integrations are subject to the respective providers' terms of service and may change without notice. Features described as "autopilot," "autonomous," or "automated" operate within user-defined parameters and require initial configuration and ongoing oversight. Rox, the Rox logo, and "Revenue on Autopilot" are trademarks of Rox, Inc. All other trademarks are the property of their respective owners. Service availability is subject to the terms outlined in your enterprise agreement. For questions regarding data processing, compliance certifications, or platform capabilities, contact security@rox.com.

Copyright © 2026 Rox. All rights reserved. 251 Rhode Island St, Suite 205, San Francisco, CA 94103
