Local AI: is it viable? - CTOMultiplier

For years we have been migrating our tools and our data to the cloud. In the early days the question was raised of whether moving to the cloud was the right path for every company, but over time it established itself as the default option for the vast majority. Today, with the arrival of AI, I wonder whether we will retrace part of that path in order to use AI locally.

With the rise of the cloud we have seen more and more functionality move from running on our PCs to running in the cloud. Among the advantages of the cloud was being able to access software as a service (or SaaS), removing the need to buy and install an application and keep it updated, paying by subscription instead of by purchase. On top of that, we gained transparent backups of our data, along with the ability to share and collaborate easily.

But we also accepted some trade-offs: we depended on a permanent Internet connection, our data was exposed to being inspected or used by the provider, and we accepted significant vendor lock-in, because that data lived on a third party’s infrastructure.

With AI we started straight in the cloud, mainly because the cost of the hardware needed to run AI models was out of reach for most people. But, as with any technology, the cost comes down over time. That is exactly what is happening with open source models: a high-end personal computer can already run models that were the state of the art little more than a year ago.

It is true that the largest models still require expensive hardware, but there are use cases that are perfectly feasible locally, such as semantic search over documents, question answering, translation and certain programming use cases, although programming is where cloud models still hold a clear advantage.

What is pushing us towards local AI

Right now we are seeing certain factors that are making it easier to adopt local AI:

The price of AI in the cloud has risen sharply, largely because it is no longer being subsidised. This is already a problem for companies that adopted it wholesale: practically overnight, they are watching the price they pay for AI multiply. In many cases this is leading companies to cut back their usage, with the resulting loss of the productivity they had already gained.
Keeping control over data privacy is getting harder and the risks are growing. To get the most out of our data with AI we have to give it access. For the first time we have a technology that simplifies integration between different systems, removing the need to build costly ad-hoc integrations. However, granting an AI owned by a third party access to all of our systems is a risk that should not be taken lightly. I feel that some companies are making this decision as if it were just one more cloud use case. Yet we are talking about a new technology capable of extracting value from our data in ways that were not possible before.
Geopolitics has put technological sovereignty centre stage. Right now the major AI labs are American or Chinese, with everything that implies. In Europe there are many companies that, due to regulation, cannot take their data outside the European Union, and they are running into problems adopting AI for this reason. Local AI can be a solution to many of these problems.
Hardware manufacturers are betting more and more on machines capable of running AI locally. First it was Apple with Apple silicon and unified memory, and now Nvidia and Microsoft have unveiled machines to run AI easily on Windows. Meanwhile, Apple has just announced Siri AI, which will use a hybrid execution model where simple queries run on the device and the more complex ones travel to Apple’s cloud. Although the model is Google’s, Apple guarantees that no data will leave its servers and that Google will not have access to it.
Today’s infrastructure cannot keep up with all the demand, and it is not unusual for models to have availability failures that interrupt normal use. Personally, I can say that in recent months I have hit several errors every week because AI model APIs were overloaded, and I have had to pause programming tasks until the service recovered.

What is holding local AI back

That said, local AI has some drawbacks which, although they can be mitigated and improved over time, I am not sure will disappear completely:

Hardware cost: As of today, local models require mid-to-high-end machines to run, with Apple silicon Macs being the easiest machines on which to run AI. While taking advantage of the largest models requires a fair amount of RAM, with 16GB we can already run models that let us do question answering, summarise emails, or translate. It is an open question whether the day will come when we can run state-of-the-art models that handle large context windows entirely locally, or whether we will have to keep using the cloud for these cases.
Technical knowledge: Much of the current software for running AI locally requires a certain amount of technical knowledge, which makes it hard for non-technical users to adopt. Even choosing the right model for our hardware and use case is not a trivial task. That said, models like qwen3.5 4b or gemma4 e4b work quite well for many use cases on machines with 16GB of RAM.
Accuracy: The quality and accuracy of the responses tend to be lower than those of the big labs’ models, and local models are also more sensitive to prompt quality. If we are used to using ChatGPT or Claude, we know that we can often give them a vague prompt and still get good answers. With small models we have to work harder on the prompt, and even so the quality of the answer may not be the same.
Context size: Since personal machines have less memory, we have a limit on the context size we can use in each query. The context size we can use locally is around one or two orders of magnitude below what we can use in the cloud.
Speed: Cloud models tend to respond faster, while locally it is common to have to wait longer, although this varies a lot depending on the hardware, the model and the use case. The user experience of remote models is noticeably better than that of local ones.
Access to data: Because of the cloud usage we have built up, part of our data is hosted on platforms that local AI cannot always access. This is a form of vendor lock-in that makes using local AI with our data more complex.
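
To make the hardware cost and context size points above a little more concrete, here is a rough back-of-the-envelope sketch in Python. It estimates how much memory a model needs from two parts: the weights (number of parameters times bits per weight) and the key/value cache, which grows linearly with the context size. The architecture figures in EXAMPLE_4B are hypothetical and do not describe any real model, the 70% "usable RAM" fraction is my own assumption, and the estimate ignores activations and runtime overhead, so treat the result as an order of magnitude rather than a measurement.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class ModelSpec:
    """Architecture figures needed for the estimate (hypothetical values)."""
    params_billion: float  # total parameters, in billions
    n_layers: int          # number of transformer layers
    n_kv_heads: int        # key/value heads per layer
    head_dim: int          # dimension of each head


def weights_bytes(spec: ModelSpec, bits_per_weight: float) -> float:
    """Memory taken by the weights at a given quantization level."""
    return spec.params_billion * 1e9 * bits_per_weight / 8


def kv_cache_bytes(spec: ModelSpec, context_tokens: int,
                   bytes_per_value: int = 2) -> int:
    """Memory taken by the key/value cache for a given context size.

    The leading 2 accounts for storing one key and one value per token,
    per layer, per KV head. bytes_per_value=2 assumes 16-bit cache entries.
    """
    return (2 * spec.n_layers * spec.n_kv_heads * spec.head_dim
            * context_tokens * bytes_per_value)


def estimate_gib(spec: ModelSpec, bits_per_weight: float,
                 context_tokens: int) -> float:
    """Weights plus KV cache, in GiB (ignores activations and overhead)."""
    total = weights_bytes(spec, bits_per_weight) + kv_cache_bytes(spec, context_tokens)
    return total / GIB


def fits(spec: ModelSpec, bits_per_weight: float, context_tokens: int,
         ram_gib: float, usable_fraction: float = 0.7) -> bool:
    """True if the estimate fits in the assumed usable share of RAM.

    usable_fraction is an assumption: the OS and other apps need memory too.
    """
    return estimate_gib(spec, bits_per_weight, context_tokens) <= ram_gib * usable_fraction


# Hypothetical 4-billion-parameter model, for illustration only.
EXAMPLE_4B = ModelSpec(params_billion=4.0, n_layers=32, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for context in (8_192, 131_072):
        need = estimate_gib(EXAMPLE_4B, bits_per_weight=4, context_tokens=context)
        ok = fits(EXAMPLE_4B, 4, context, ram_gib=16)
        print(f"context={context:>7} tokens: ~{need:.1f} GiB, fits in 16 GiB: {ok}")
```

Under these assumptions, the weights of a small quantized model are the minor part of the bill once the context gets large: the cache grows with every token we keep in context, which is why the context size, rather than the model itself, tends to be the first limit we hit on a 16GB machine.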

I believe that, as of today, the strongest argument for companies regarding local AI is keeping full control over data privacy. On this point its advantage over cloud AI is indisputable. Removing the worry about data control also unlocks more use cases that companies are currently limiting. Having an AI with full access to your documents, your emails, your code, and so on, without having to worry about what happens to your data once it leaves your perimeter, lets you get the most value out of AI’s current potential.

–

PS: We are putting these ideas into practice, which is why we have built an email client that runs AI 100% locally. If you are interested, you can visit: → https://getemailops.com