What is Your Enterprise AI Strategy?
Artificial general intelligence, the “platform shift”, and what your business should do about it.
When ChatGPT was released in November 2022, everyone was forced to come up with an AI strategy for their business. Two years later, people are starting to look back at what they have accomplished with AI — and at how they should adapt their strategy.
This article talks about:
- The anticipated future advancements in AI (AGI roadmap)
- What is the “platform shift” that AI might enable
- What your business should do about it
The AGI roadmap
OpenAI, Anthropic, Google DeepMind, and others are all hoping to reach some form of Artificial General Intelligence (AGI).
What is AGI?
I’m glad you asked. No one really knows. Most people agree it means human-level intelligence across a wide variety of domains. But that is a vague benchmark, because humans range from very bad to very good at everything humans do. And humans are embodied: how can you get human-level intelligence outside the context of a bipedal, human-shaped being?
It is easier to define what AGI is not.
- It is not narrowly targeted to a limited number of topics. It should be able to generalize intelligence about one topic to other similar topics.
- It is not Artificial Super Intelligence. Many people think AI systems will soon far surpass human-level intelligence, but AGI is not the singularity.
The plan to get to AGI
On July 9th, 2024, OpenAI shared a five-step long-term strategy for reaching AGI with its employees, which was later leaked to the public.
1. Chatbots: AI systems with conversational language abilities, capable of engaging in natural, human-like dialogue across various topics.
2. Reasoners: AI systems that can solve complex problems with proficiency comparable to human experts, such as individuals with doctorate-level education.
3. Agents: AI systems capable of operating autonomously for extended periods, acting on a user’s behalf to perform tasks, make decisions, and adapt to changing circumstances without constant human oversight.
4. Innovators: AI systems that can develop groundbreaking ideas and solutions across various fields, pushing the boundaries of human knowledge and driving innovation independently.
5. Organizations: AI systems functioning as entire entities, possessing strategic thinking, operational efficiency, and adaptability to manage complex operations and achieve organizational goals without human intervention.
For all the problems defining what AGI is exactly, this roadmap makes sense. I can see how one step leads to the next, even if I am not 100% convinced that “AI organizations” = AGI.
Where are we now?
- We are clearly past Chatbots. There are already too many chatbots. ✅
- As I write this, OpenAI has “reasoning models” generally available to the public. By most accounts, they are pretty good. ✅
- Next is agents. The buzzword of 2025 will be Agentic AI.
AI agents use the computer for you. This is the platform shift enabled by AI: users write or say what they want the computer to do, the software understands that input, and then (hopefully) takes the correct action.
AI as a Platform Shift
One reason all the tech giants are racing to be the leader in AI is because they see AI as a platform shift away from the current status quo, and they do not want to be left behind.
Previous platform shifts
Each time there is a platform shift in computing, the industry re-organizes itself into new winners and losers. Well, certain companies always seem to end up as winners, but that is a different article… Some of the previous platform shifts were:
- Mainframe to Personal Computing (PC)
- Command Line Interfaces (CLI) to Graphical User Interfaces (GUI)
- Desktop Apps to Web Apps
- PC to Mobile Computing
- On-premises computing to cloud computing
GUI to Natural Language Interface (NLI)
Large Language Models (LLMs) are the machine learning models that power AI tools like ChatGPT, Claude, and Gemini. As the name suggests, these are models of human language. Essentially, they work by predicting the most likely next word (token) given the preceding sequence of text. We have never before had a model of language this good.
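To make that concrete, here is a minimal sketch of next-token prediction using the open-source GPT-2 model via Hugging Face’s transformers library; any causal language model works the same way.

```python
# Minimal next-token prediction sketch with GPT-2 (Hugging Face transformers).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# The last position's logits give a distribution over the next token;
# take the argmax to see the single most likely continuation.
next_token_id = logits[0, -1].argmax()
print(tokenizer.decode(next_token_id))
```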
Now everyone is trying to build the best product leveraging this new technology. So far, we have mostly gotten chatbots and automated search summaries. The wide adoption of Retrieval Augmented Generation (RAG) has allowed people to build chatbots that let users talk to a computer that has “knowledge” (i.e., context) about any topic.
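As an illustration, here is a toy version of the RAG pattern with no external dependencies. Retrieval here is naive word overlap; a real system would use an embedding model and a vector database, and the assembled prompt would be sent to an LLM rather than returned.

```python
# Toy RAG sketch: retrieve relevant text, then augment the prompt with it.
# Retrieval is naive word overlap; real systems use embeddings + a vector DB.

def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(question, documents))
    # In a real system this prompt would be sent to an LLM; we return it
    # here so the sketch runs without any API keys.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm Eastern, Monday through Friday.",
]
print(build_rag_prompt("When can I get a refund?", docs))
```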
But chatbots are not the answer, no matter how much “knowledge” or context they have. Now that we have an excellent model of language, computers should be able to understand intentions and desired actions. The killer use case for LLMs is a computer that understands the spoken word and takes the appropriate actions.
Side note about AI hardware
Dedicated AI-enabled devices
Some companies, like Humane and Rabbit, are attempting to make and sell separate AI-enabled hardware that you can talk to and that will take actions for you. The Meta Orion smart-glasses prototype has some AI features, but you need to carry around a puck containing the chips that do all the computing.
AI on the device you are already addicted to
But the most likely scenario is that the only piece of AI hardware you need is your phone or laptop. These already have (or soon will have) all the compute you need for everyday AI tasks. Few people want to carry around both a phone and another device that is just a better Siri or Google Assistant. The whole benefit of the smartphone was that I no longer had to carry an iPod, a digital camera, and a cell phone like it was 2007.
How computers use computers
There are a few different ways Agents could be implemented:
- Your AI agent has access to specific APIs for performing tasks in other applications
- Your AI agent clicks around a website for you
- Your AI agent talks to other AI agents
Example
You tell your agentic AI assistant “order me a Lyft home”. For this example, assume that the AI completely understands this request and does not hallucinate anything.
Option 1: Your AI device (likely your phone) has direct access to submit ride requests to Lyft without opening the Lyft app. This is like Apple’s App Intents. (A rough sketch of this pattern follows the options below.)
Option 2: Your AI opens up a web browser and clicks around the Lyft website to submit the ride request. This is what the Rabbit R1 device is doing.
Option 3: Lyft also has an AI chatbot that allows users to request a ride via natural language interface (NLI). Your AI talks to the Lyft AI to submit the ride request. As of now, there are no examples of this approach.
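To make Option 1 concrete, here is a rough sketch of the tool-calling pattern using OpenAI’s Python SDK. The request_lyft_ride tool and its handling are hypothetical stand-ins; Lyft’s real API would differ.

```python
# A rough sketch of Option 1 via tool calling (OpenAI Python SDK).
# `request_lyft_ride` is a hypothetical tool; Lyft's real API differs.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "request_lyft_ride",
        "description": "Request a Lyft ride to take the user somewhere.",
        "parameters": {
            "type": "object",
            "properties": {"destination": {"type": "string"}},
            "required": ["destination"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Order me a Lyft home."}],
    tools=tools,
)

# The model never calls Lyft itself; it returns a structured tool call
# that your agent code executes against the real ride-request API.
for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "request_lyft_ride":
        args = json.loads(call.function.arguments)
        print("Would request a ride to:", args["destination"])
```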
Endgame
Clearly, the ideal end result for consumers is an AI agent built right into the OS of your current devices, with direct access to the APIs of other applications. This saves people from carrying around another device and would be the most accurate way to accomplish tasks. You get one NLI to the entire device instead of thousands of separate chatbots for every separate application.
There is an argument for a device that lets people use their phones less. That sounds great. But what if that device were your phone?
This opens up a new problem, which Nilay Patel at The Verge termed “the DoorDash problem.” Apple and Google need a company like DoorDash to exist so that you can talk to your phone and have food delivered. But if the OS of your smartphone puts an NLI between you and the DoorDash app, it will kill the traffic to DoorDash and significantly reduce its revenue — and then DoorDash ceases to exist. There is just as much business-model innovation needed as technological innovation.
What to do about all this?
For a moment, assume this picture for the future of AI is correct — what should your business do about it?
If you sell software
If you sell software, start thinking about how a Natural Language Interface (NLI) should work for it. How are users going to talk to your application? Which tasks will they want to accomplish with the NLI versus the GUI, and what questions might they ask about it?
- Start with RAG to answer questions about how to do things within your software
- Determine which actions should be the first to integrate with an NLI and create a proof-of-concept for those features.
- Determine which APIs would need to be exposed to other AI agents, like Apple App Intents (a minimal sketch follows this list). Evaluate how that type of usage affects revenue — if you sell ads, how many fewer ads will people see if users are just talking to your app?
- Skip fine-tuning your own LLM. This will only be needed in very niche sectors with specialized jargon, like legal, tax, and healthcare. The general foundation models are great for most use cases.
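For the NLI and API bullets above, a minimal, provider-agnostic sketch of exposing one application action to an agent might look like the following. All names here are hypothetical; the parameter schema mirrors the JSON-Schema style most tool-calling APIs use.

```python
# A minimal, hypothetical sketch of exposing one application action
# so an OS-level assistant or external agent can discover and invoke it.

def create_invoice(customer_id: str, amount: float) -> dict:
    """The same business logic your GUI already calls."""
    return {"status": "created", "customer_id": customer_id, "amount": amount}

# Declarative registry of exposed actions, JSON-Schema style.
ACTIONS = {
    "create_invoice": {
        "handler": create_invoice,
        "description": "Create an invoice for a customer.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "amount": {"type": "number"},
            },
            "required": ["customer_id", "amount"],
        },
    },
}

def dispatch(action_name: str, arguments: dict) -> dict:
    """Execute an agent's tool call against the registered handler."""
    return ACTIONS[action_name]["handler"](**arguments)

print(dispatch("create_invoice", {"customer_id": "C-42", "amount": 99.0}))
```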
If you do not sell software
1. You do not need to develop your own agents.
Hopefully each piece of software that you use is building some sort of NLI for their product. This removes the need for you to build agents to use other people’s software.
What about AI agents that can work across multiple software systems?
Use Robotic Process Automation (RPA) instead. It is more reliable and does not hallucinate. Someday soon the RPA industry will develop agents to do these tasks, but let the large RPA companies take on the risk of developing that technology.
2. You do not need to be an expert in Retrieval Augmented Generation (RAG).
RAG will quickly be commoditized. All the major cloud platforms offer a quick and easy way to set it up. You can already chat with documents just by uploading them to OneDrive.
3. Make your people more productive.
We have not yet discussed the utility of chatbots in this article. Chatbots are great for productivity. Give employees access to tools like Copilot, Gemini, ChatGPT, and Claude. Not everyone knows how to be more effective with these tools today, but people learn quickly once they see coworkers using them.
4. Focus on traditional ML and optimization use cases.
I am biased, because this is my specialty. But there is still a ton of value to be gained from investing in more traditional machine learning and optimization projects to improve your business, rather than focusing only on generative AI.