Building a Sales OS: A Conversation with Chris Larson, AnyTeam’s Head of AI

Oct 21, 2025

We sat down with Chris Larson, our Head of AI, to discuss why sales is the perfect domain for AI agents, the architectural case for desktop-first, and where he believes AI applications are headed. Chris is an applied engineer and scientist with a Ph.D. from Cornell University. He is also an adjunct professor at Georgetown University, where he teaches courses on artificial intelligence and machine learning.

The Domain

Why is sales a good use case for AI?

I would say that the GTM function in general is filled with great use cases for AI, in the sense that it's heavily oriented around communication and knowledge work. From prospecting, to outreach, to account research, to relationship building and management, to customer success, all of these activities can be, and already are being, augmented by language models.

Now, amongst all of the great use cases, the account executive stands above the rest in terms of leverage at a B2B company, because AEs are the linchpin for top-line revenue. On top of this, they spend the majority of their time on two activities that are ripe for augmentation with AI: researching companies, products, and people, and meeting prospective and active buyers on video calls. And what's more, they operate right at an efficiency frontier. This is the use case we're focused on at AnyTeam.

The opportunity here isn't to replace humans, it's to build an assistant that understands your pipeline and makes those research hours essentially disappear. With AnyTeam, you can focus on the human aspects of your job and leave the repetitive, time-consuming analysis to us.

What does it mean to be a "Sales OS"?

OK, so "OS" is a deliberate metaphor. Operating systems abstract away physical resources and orchestrate processes on a computer. AnyTeam is trying to do something analogous for GTM activities. We want to be the connective tissue that links everything an AE does: meeting assistance, research, email drafting, CRM updates. Being an OS means that we truly inject ourselves into the user's workflow as a set of agents that live on the user's device.

Integrations with CRM data are table stakes at this point. What we're building cuts across the entire GTM stack: something that can answer questions, provide assistance during meetings, and execute actions such as updating CRM records and drafting and sending emails. The goal is that software stops being this collection of disconnected apps that you have to context switch between, and becomes something more personalized that actually understands your workflow.
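To make the "execute actions" part concrete, here is a minimal sketch of how an agent's action surface might be declared as tools. The tool names and fields are illustrative only, not AnyTeam's actual schema.

```python
# Illustrative only: an OpenAI-style tool schema for the kinds of actions
# described above. Names and fields are hypothetical, not AnyTeam's API.
SALES_AGENT_TOOLS = [
    {
        "name": "answer_question",
        "description": "Answer a rep's question using account research and meeting history.",
        "parameters": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    },
    {
        "name": "update_crm_record",
        "description": "Update a field on a CRM record (e.g. stage, next step, notes).",
        "parameters": {
            "type": "object",
            "properties": {
                "record_id": {"type": "string"},
                "field": {"type": "string"},
                "value": {"type": "string"},
            },
            "required": ["record_id", "field", "value"],
        },
    },
    {
        "name": "draft_email",
        "description": "Draft a follow-up email for the rep to review before sending.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {"type": "string"},
                "talking_points": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["recipient", "talking_points"],
        },
    },
]
```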

How do you build reliable AI for a domain with no ground truth?

Yeah, that's a big challenge. Coding agents are actually an instructive comparison. The key difference is that code generation is a verifiable domain: you have compilers and tests that tell you definitively when you're wrong. Sales doesn't have that luxury. There's no signal that says "this deal didn't compile."

So the feedback loop is looser in sales, but the underlying methodology is similar. The real test for any agent, whether it's coding or sales or anything else, is whether it can ask itself the right questions and then go find the answers using the tools it has access to. What we're building is a domain-specific reasoning stack that doesn't rely on deterministic outcomes; instead, it's highly targeted at a domain where we can impart strong priors onto the agent as a surrogate for that missing ground truth.
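One way to picture what "imparting strong priors" could look like in practice: with no compiler to verify an answer, domain heuristics can serve as a rubric the agent checks its own draft against. This is a hypothetical sketch, not AnyTeam's actual stack; the rubric items and the `llm` callable are stand-ins.

```python
# Hypothetical sketch: absent a compiler or test suite, domain priors act as a
# self-check rubric the agent applies to its own draft before answering.
DOMAIN_PRIORS = [
    "Every claim about the account must cite a source the agent actually retrieved.",
    "Next steps must name an owner and a date.",
    "Never infer budget or timeline that no stakeholder has stated.",
]

def critique_against_priors(llm, draft: str) -> list[str]:
    """Ask the model to check a draft answer against the domain priors."""
    prompt = (
        "Review the draft below. For each rule it violates, output one line "
        "starting with 'VIOLATION:'. If none, output 'OK'.\n\n"
        "Rules:\n- " + "\n- ".join(DOMAIN_PRIORS) + "\n\nDraft:\n" + draft
    )
    reply = llm(prompt)
    return [line for line in reply.splitlines() if line.startswith("VIOLATION:")]

def answer_with_priors(llm, question: str, max_revisions: int = 2) -> str:
    """Draft, self-critique against the priors, and revise a bounded number of times."""
    draft = llm(f"Answer this sales question: {question}")
    for _ in range(max_revisions):
        violations = critique_against_priors(llm, draft)
        if not violations:
            break
        draft = llm(
            "Revise the draft to fix these issues:\n"
            + "\n".join(violations) + "\n\nDraft:\n" + draft
        )
    return draft
```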

The Architecture

Most AI companies are building cloud-first. What's the architectural argument for desktop-first?

For anyone new to the world of language models, the most important thing to understand is that they are hallucinatory creatures. They are literally designed to make up plausible completions of sentences. While it is surprising how much knowledge and intelligence emerges from this process, it comes with a footgun. To automate tasks on a computer with a language model, the model needs to be reliable and accurate, and that's not possible without a lot of poking and prodding, so to speak.

Tons of amazing work has been done at the foundation-model level on alignment and reasoning, and on top of that foundation, we now have a framework to mimic the internal deliberation humans use when making decisions and solving problems, as well as mature tooling for models to access external data and systems. The progress has been incredible, but the application layer has not yet fully integrated these capabilities.

The biggest gap today, I think, has more to do with UX and software packaging than with model capabilities. The fundamental problem is that agents need the same types of information that a human assistant would need to perform the same task. Our pre-LLM software systems were not designed for this use case.

So the essence of the argument is that the browser and operating system are where a language model is best equipped to augment human capabilities, because the agent can compose generic tasks on top of these systems. And perhaps most importantly, this includes the ability to fetch context, allowing the agent to ground itself. 

There is also a more immediate problem we solve by being on the user’s desktop, which is that the agent is invisible to the external systems it is accessing. Getting around auth walls, while maintaining user privacy and data security, is a huge advantage.

So from this perspective, moving to the edge is an obvious choice. And you'll note that the horizontal AI platforms, including OpenAI and Anthropic, are positioning ChatGPT and Claude as desktop-first apps. I suspect their strategy is to unlock the power of their language models for their users. The way to do that is to make context acquisition seamless for the user, and eventually to take actions against other systems. We're at the beginning of this migration back to the edge.

You've said you can't just "bolt AI onto your app." What do you mean by that?

As it turns out, you can't just bolt AI onto the side of your legacy web or SaaS application and expect transformational results. No, it needs to be infused into the application in order to work. Fundamentally, it boils down to data access: how do we provide LLM systems with access to the right pieces of information at the right time while maintaining privacy, security, and good UX?

The more immediate consideration for us at AnyTeam is to build a magical experience in which the agent just gets it. Users are given control over the agent, similar to custom GPTs, but behind the scenes, the agent proactively grounds itself based on the user's context. We can dive into the nuts and bolts of how that works, but the overarching goal is an agent that behaves the way a human co-pilot working alongside you on your computer would. What you see, it sees. What you hear, it hears. That's the idea.

Economics & Engineering

You mentioned local inference is "essentially free." How does that change what you can build?

Local processing gives us competitive advantages on both performance and cost, and there is always a tension between those two things. The compute that sits dormant on our laptops is an untapped resource. Modern MacBooks can easily run tens of applications and hundreds of threads at a time without breaking a sweat; these things are supercomputers by yesterday's standards. For AnyTeam, Apple Silicon and Metal in particular have allowed us to run speech transcription, embedding generation, and even smaller text generation models directly on-device. And we do that without setting your laptop on fire. For the PC users out there, we're actively porting our application to Windows right now.

So ultimately this means we're able to take dollars that we'd otherwise be spending on a third-party speech-to-text service or a cloud provider, and reallocate them elsewhere within our app. One way to visualize this is a two-dimensional plot with the quality of our AI features on the x-axis and cost on the y-axis. Running local inference models lets us shift the cost-performance Pareto frontier down and to the right: more quality for less spend.
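To make the on-device piece concrete, here is a rough sketch using off-the-shelf open-source packages: mlx-whisper for Metal-accelerated transcription and sentence-transformers on the MPS backend for embeddings. These libraries and model names are illustrative stand-ins, not the stack AnyTeam actually ships.

```python
# Illustrative sketch of on-device inference on Apple Silicon using open-source
# packages; these are stand-ins, not AnyTeam's production stack.
import mlx_whisper                          # Metal-accelerated Whisper via MLX
from sentence_transformers import SentenceTransformer

# Speech-to-text entirely on the laptop's GPU: no audio leaves the device.
result = mlx_whisper.transcribe(
    "meeting_audio.wav",
    path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
)
transcript = result["text"]

# Embeddings for local retrieval, also on-device via the MPS (Metal) backend.
embedder = SentenceTransformer("all-MiniLM-L6-v2", device="mps")
chunks = transcript.split(". ")
chunk_embeddings = embedder.encode(chunks, normalize_embeddings=True)

print(f"{len(transcript)} chars transcribed, {len(chunk_embeddings)} chunks embedded locally")
```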

What's the hardest part of orchestrating a hybrid edge-cloud system?

Our hardest problems are the same ones every other AI application company is facing. Dealing with the unreliable latency and availability of third-party AI services is a constant battle.

Beyond that, we're also trying to orchestrate stuff happening on the desktop with stuff happening in the cloud. In our system, the desktop agent is the oracle, in the sense that it has access to all of the available context. But the cloud agent is responsible for executing the majority of our account research, chat, and meeting guidance responses. So how do we give our cloud agent good priors, basically steer it in the right direction, without leaking data from the desktop? We came up with a message-passing arrangement in which the desktop-based agent sends update instructions that allow the cloud-based agent to make the appropriate updates without leaking data.
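A minimal sketch of what such a message-passing arrangement could look like, under the assumption that the desktop agent distills its local context into schema-constrained "update instructions" rather than forwarding raw documents. The field names and classes are hypothetical.

```python
# Hypothetical sketch of the desktop-to-cloud message passing described above:
# the desktop agent, which holds the raw context, emits only schema-constrained
# update instructions; transcripts, emails, and documents never leave the device.
from dataclasses import dataclass, field

ALLOWED_FIELDS = {"deal_stage", "competitor_mentioned", "buyer_priority", "next_meeting_topic"}

@dataclass
class UpdateInstruction:
    field_name: str     # must be one of ALLOWED_FIELDS
    value: str          # a short, derived signal, never raw source text
    confidence: float   # desktop agent's confidence in the signal

@dataclass
class CloudAgentState:
    priors: dict = field(default_factory=dict)

    def apply(self, instruction: UpdateInstruction) -> None:
        """Update cloud-side priors from a sanitized instruction."""
        if instruction.field_name not in ALLOWED_FIELDS:
            raise ValueError(f"Refusing unknown field: {instruction.field_name}")
        self.priors[instruction.field_name] = (instruction.value, instruction.confidence)

# Desktop side: derive signals locally, send only the distilled instruction.
state = CloudAgentState()
state.apply(UpdateInstruction("buyer_priority", "security over speed", 0.8))
state.apply(UpdateInstruction("deal_stage", "technical evaluation", 0.9))
print(state.priors)
```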

An agent that "sees what you see" requires enormous trust. How do you earn that?

Yeah, the first thing we focused on was making sure that our application would be compatible with users' expectations around privacy and security. We also made a deliberate decision not to build core functionality that relies on data we might not have access to in the majority of cases. We're also very explicit about permissions: when we ask for access to your calendar, email, or microphone, you know exactly what you're granting and why. And importantly, you can revoke those permissions at any time. The architecture is designed so the agent degrades gracefully: if you don't give it access to a certain context, it just provides less personalized assistance. It doesn't break.
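As a toy illustration of that graceful degradation, a context assembler might only pull from sources the user has granted and simply omit the rest. The source names and permission flags below are hypothetical.

```python
# Illustrative sketch of graceful degradation: context is assembled only from
# sources the user has explicitly granted; missing grants reduce personalization
# instead of breaking the agent. Source names are hypothetical.
PERMISSIONS = {"calendar": True, "email": False, "microphone": True}

SOURCES = {
    "calendar": lambda: "Next meeting: Capital One, Thursday 2pm",
    "email": lambda: "Last thread: pricing questions from Jane",
    "microphone": lambda: "Live transcript of the current call",
}

def assemble_context() -> str:
    granted = [fetch() for name, fetch in SOURCES.items() if PERMISSIONS.get(name)]
    if not granted:
        return "No personal context granted; falling back to generic assistance."
    return "\n".join(granted)

print(assemble_context())  # email is omitted because that permission was revoked
```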

Looking Forward

You've argued hybrid edge-cloud is inevitable. What forces that?

The economics are going to force it. You can't run everything through massive reasoning models forever. It's not affordable, it's not fast enough, and in a lot of cases, it's not private enough.

Right now, so much AI value is being subsidized by venture dollars. Companies are burning capital on inference costs because they're still figuring out product-market fit. But eventually we're going to know the real cost of a token, and companies that aren't well-positioned on LLM COGS are going to struggle. Efficiency is going to become the defining constraint.

Once we get there, the natural course will be to use specialized models to handle smaller, well-defined tasks: things like summarization, retrieval, classification, maybe even some reasoning for certain domains. Then you have larger models in the cloud that handle orchestration and more complex reasoning.
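A rough sketch of how that split might look in code: a router sends narrow, well-defined tasks to a small local model and everything else to a larger cloud model. The task names and model handles are placeholders, not a real routing policy.

```python
# Hypothetical sketch of the hybrid split described above: small, specialized
# models handle well-defined tasks locally; a larger cloud model handles
# orchestration and open-ended reasoning. Model handles are placeholders.
LOCAL_TASKS = {"summarize", "classify", "embed"}

def route(task: str, payload: str, local_model, cloud_model) -> str:
    """Send narrow, well-defined tasks to a local model; everything else to the cloud."""
    if task in LOCAL_TASKS:
        return local_model(task, payload)   # cheap, fast, private
    return cloud_model(task, payload)       # costlier, but handles open-ended reasoning

# Toy usage with stand-in model callables.
local = lambda task, text: f"[local:{task}] {text[:40]}"
cloud = lambda task, text: f"[cloud:{task}] {text[:40]}"
print(route("summarize", "Long meeting transcript...", local, cloud))
print(route("plan_outreach", "Multi-step account plan...", local, cloud))
```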

What new capabilities are you building into AnyTeam?

I would say that memory is probably the most fascinating systems problem I'm working on right now. Neural networks were originally inspired by biology, but biological memory manifests in these higher-level brain structures that we just don't have good artificial analogues for yet.

I think about it in computing terms. Our agent has an L1 cache, which maps to the active transcript, current conversation, and screen context. L2 is recent history: meeting summaries, the past few days of conversations, documents. These easily fit in the context window.

But L3 and beyond is where things get less trivial. You need a structured, persistent layer that can do several things simultaneously. It needs to summarize and compress interactions over time without losing details that could become relevant again. It needs to understand that Jane at Capital One reports to Bob, who you met at that conference in May, and that they're evaluating you against Competitor X. And it needs temporal reasoning: knowing that last week's conversation is likely more relevant than one from three months ago, but that sometimes that old email is exactly what you need.

Humans do this effortlessly. A great executive assistant doesn't recall everything verbatim. They remember what matters and compress their countless interactions into intuition. "Oh, you're meeting with Capital One today? They always ghost for two weeks after initial meetings, they care way more about security than speed, and Jane mentioned budget approval needs to happen before Q4."

That's what we're building toward. Not just retrieval from a vector database, but actual synthesis. So when a prospect you talked to three months ago suddenly replies, the agent doesn't need to resynthesize this context from scratch. It knows the full relationship history, the deal context, what mattered in those conversations, the communication patterns.
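To ground the L1/L2/L3 analogy, here is a toy sketch of a tiered memory structure: L1 and L2 hold verbatim and recent material that fits in the context window, while L3 stores compressed summaries scored by temporal decay plus entity overlap, so an old but strongly matching item can still resurface. The scoring and classes are illustrative stand-ins for the real synthesis problem, not AnyTeam's implementation.

```python
# Illustrative sketch of the memory tiers described above; the scoring and
# compression are toy stand-ins for the actual synthesis problem.
import math
import time
from dataclasses import dataclass

@dataclass
class MemoryItem:
    text: str            # compressed summary, not a verbatim transcript
    timestamp: float     # seconds since epoch
    entities: set        # e.g. {"Jane", "Capital One", "Competitor X"}

class TieredMemory:
    def __init__(self):
        self.l1: list[str] = []          # active transcript / screen context (verbatim)
        self.l2: list[str] = []          # recent summaries, still fit in the context window
        self.l3: list[MemoryItem] = []   # long-term, compressed, persistent

    def score(self, item: MemoryItem, query_entities: set, now: float) -> float:
        """Temporal decay plus entity overlap: recent wins by default, but a strong
        match can resurface an old item (that three-month-old email)."""
        age_days = (now - item.timestamp) / 86400
        recency = math.exp(-age_days / 30)          # roughly a month-scale decay
        overlap = len(item.entities & query_entities)
        return recency + overlap

    def recall(self, query_entities: set, k: int = 3) -> list[str]:
        now = time.time()
        ranked = sorted(self.l3, key=lambda m: self.score(m, query_entities, now), reverse=True)
        return self.l1 + self.l2 + [m.text for m in ranked[:k]]

# Toy usage: a three-month-old insight resurfaces because the entities match.
memory = TieredMemory()
memory.l3.append(MemoryItem(
    "Capital One ghosts ~2 weeks after intro meetings; cares about security over speed.",
    time.time() - 90 * 86400,
    {"Capital One", "Jane"},
))
print(memory.recall({"Capital One"}))
```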