2025 is expected to be the year AI becomes a reality, delivering specific, tangible business benefits.
But according to a new State of AI Development report from AI development platform Vellum, we’re not there yet: only 25% of businesses have deployed AI in production, and a quarter of those have yet to see any measurable impact.
This suggests that many businesses have not yet identified viable use cases for AI, keeping them (at least for now) in a pre-build holding pattern.
“It reinforces that it’s still early days, despite all the hype and discussion going on,” Akash Sharma, Vellum CEO, told VentureBeat. “There’s a lot of noise in the industry, new models and model providers coming out, new RAG techniques; we just want to get a lay of the land on how companies are deploying AI into production.”
Businesses must identify specific use cases to see success
Vellum interviewed more than 1,250 AI developers and builders to get a real sense of what’s happening in the AI trenches.
According to the report, most companies are still at earlier stages of their AI journeys: building and evaluating strategies and proofs of concept (PoC) (53%), beta testing (14%) and, at the earliest stage, talking to users and gathering requirements (7.9%).
Currently, businesses are focused on building document parsing and analysis tools and customer service chatbots, according to Vellum. But they are also interested in applications including natural-language analytics, content generation, recommendation systems, code generation and automation, and research automation.
Developers report competitive advantage (31.6%), cost and time savings (27.1%) and higher user adoption rates (12.6%) as the biggest impacts they’ve seen so far. Interestingly, however, 24.2% have yet to see any meaningful impact from their investments.
Sharma emphasized the importance of prioritizing use cases from the beginning. “We’ve anecdotally heard from people that they just want to use AI for the sake of using AI,” he said. “There is an experimental budget associated with that.”
While this makes Wall Street and investors happy, it doesn’t mean AI is actually contributing anything, he points out. “One thing that everyone should be thinking about is, ‘How do we find the right use cases?’ Usually, once companies are able to identify use cases, take them to production and see a clear ROI, they gain more momentum, they get past the hype. That results in more internal expertise, more investment.”
OpenAI is still on top, but a mixture of models will be the future
When it comes to the models used, OpenAI maintains the lead (no surprise there), especially with GPT-4o and GPT-4o mini. But Sharma points out that 2024 brought many more options, whether directly from model creators or through platforms such as Azure and AWS Bedrock. Providers hosting open-source models such as Llama 3.2 70B, including Groq, Fireworks AI and Together AI, are also gaining traction.
“Open-source models are getting better,” Sharma said. “OpenAI’s closed-source competitors are catching up in terms of quality.”
Ultimately, however, businesses won’t stick to just one model and that’s it — they’ll increasingly lean toward multi-model systems, he predicts.
“People choose the best model for each task at hand,” Sharma said. “While building an agent, you can have many prompts, and for each individual prompt the developer wants the best quality, lowest cost and lowest latency, and that may or may not come from OpenAI.”
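To illustrate the kind of per-prompt routing Sharma describes, here is a minimal, hypothetical sketch in Python; the model names, scores and weights are placeholder assumptions for illustration, not figures from the report.

```python
# Illustrative sketch: routing each prompt in an agent to whichever model
# best balances quality, cost and latency. Model names, scores and weights
# are hypothetical placeholders, not recommendations.

from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str
    quality: float   # 0-1, higher is better (e.g. from internal evals)
    cost: float      # $ per 1M tokens, lower is better
    latency: float   # seconds per call, lower is better

CANDIDATES = [
    ModelOption("gpt-4o", quality=0.95, cost=5.00, latency=1.2),
    ModelOption("gpt-4o-mini", quality=0.85, cost=0.60, latency=0.6),
    ModelOption("llama-3.2-70b", quality=0.88, cost=0.90, latency=0.8),
]

def pick_model(quality_weight: float, cost_weight: float, latency_weight: float) -> ModelOption:
    """Score each candidate and return the best trade-off for this prompt."""
    def score(m: ModelOption) -> float:
        return (quality_weight * m.quality
                - cost_weight * m.cost
                - latency_weight * m.latency)
    return max(CANDIDATES, key=score)

# A latency- and cost-sensitive chatbot step favors the cheaper, faster model...
print(pick_model(quality_weight=1.0, cost_weight=0.2, latency_weight=0.5).name)
# ...while a quality-critical reasoning step favors the strongest model.
print(pick_model(quality_weight=5.0, cost_weight=0.01, latency_weight=0.05).name)
```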
Similarly, the future of AI is undoubtedly multimodal, with Vellum seeing increased adoption of tools that can handle different modalities. Text is the undisputed top use case, followed by file creation (PDFs or Word documents), images, audio and video.
Retrieval-augmented generation (RAG) is also a go-to when it comes to retrieving information, and more than half of developers use vector databases to simplify search. Leading open-source and proprietary options include Pinecone, MongoDB, Qdrant, Elasticsearch, pgvector, Weaviate and Chroma.
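As a simplified illustration of the retrieval step in a RAG pipeline, here is a short Python sketch using Chroma, one of the vector databases named above; the collection name, documents and query are invented examples.

```python
# Minimal sketch of the retrieval step in a RAG pipeline using Chroma.
# The documents and query are made-up examples; in production the retrieved
# passages would be fed into an LLM prompt as grounding context.

import chromadb

client = chromadb.Client()  # in-memory client; persistent clients also exist
collection = client.create_collection(name="support_docs")

# Index a few documents; Chroma embeds them with its default embedding function.
collection.add(
    documents=[
        "Refunds are processed within 5 business days.",
        "Enterprise plans include 24/7 phone support.",
        "Password resets can be triggered from the login page.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Retrieve the passages most relevant to the user's question.
results = collection.query(query_texts=["How long do refunds take?"], n_results=2)
context = "\n".join(results["documents"][0])

# The retrieved context is then prepended to the LLM prompt, e.g.:
prompt = f"Answer using only this context:\n{context}\n\nQuestion: How long do refunds take?"
print(prompt)
```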
Everyone is involved (not just engineering)
Interestingly, AI development is moving beyond IT and spreading across the business (as the old saying goes, ‘it takes a village’). Vellum found that while engineers are the most involved in AI projects (82.3%), they are joined by leadership and executives (60.8%), subject matter experts (57.5%), product teams (55.4%) and design departments (38.2%).
This is due to the ease of using AI (as well as the general excitement around it), says Sharma.
“This is the first time we’ve seen software developed in a more cross-functional way, especially because prompts can be written in natural language,” he said. “Traditional software is often more deterministic. AI is not deterministic, which brings more people into the development fold.”
However, businesses continue to face major challenges – particularly around AI hallucinations and prompts; model speed and performance; data access and security; and getting buy-in from key stakeholders.
At the same time, while many non-technical users are involved, there is still a shortage of deep technical expertise, Sharma points out. “The way to connect all the different moving parts is still a skill that many developers don’t have today,” he said. “That’s a common challenge.”
However, many of these challenges can be overcome through tooling: platforms and services that help developers evaluate complex AI systems, Sharma pointed out. Developers can build tooling in-house or use third-party platforms and frameworks; even so, Vellum found that nearly 18% of developers define prompts and orchestration logic without any tooling at all.
Sharma points out that “the lack of technical skills becomes easier to overcome if you have the right tools to guide you on the development journey.” In addition to Vellum, frameworks and platforms used by survey participants include LangChain, LlamaIndex, Langfuse, CrewAI and Voiceflow.
Evaluations and continuous monitoring are essential
Another way to overcome common issues (including hallucinations) is to create evaluations: using specific criteria to test the correctness of a given answer. “But even then, [developers] don’t do the evals as consistently as they should,” Sharma said.
Especially when it comes to advanced agent systems, businesses need a robust evaluation process, he said. AI agents have a high degree of non-determinism, Sharma points out, as they call on external systems and perform autonomous actions.
“People are trying to build relatively advanced systems, agent systems, and need a lot of test cases and some kind of automated test framework to make sure it works reliably in production,” Sharma said.
While other developers are taking advantage of automated evaluation tools, A/B testing and open-source evaluation frameworks, Vellum found that more than three quarters are still doing manual testing and reviews.
“Manual testing takes time, doesn’t it? And the sample size of manual testing is usually lower than that of automated testing,” said Sharma. “There may be a challenge in just knowing the techniques, how to do automated, at-scale assessments.”
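For readers wondering what an automated, at-scale evaluation might look like in practice, here is a minimal Python sketch of an eval harness; `run_agent` is a hypothetical stand-in for whatever system is being tested, and the test cases and pass criteria are made up for illustration.

```python
# Illustrative sketch of an automated evaluation harness for an LLM agent.
# `run_agent` is a hypothetical placeholder; the test cases and pass
# criteria are invented examples, not part of Vellum's report.

from typing import Callable

def run_agent(question: str) -> str:
    # Placeholder: call your model or agent here.
    raise NotImplementedError

TEST_CASES = [
    # (input question, criterion the answer must satisfy)
    ("How long do refunds take?", lambda a: "5 business days" in a),
    ("Does the enterprise plan include phone support?", lambda a: "yes" in a.lower()),
]

def evaluate(agent: Callable[[str], str]) -> float:
    """Run every test case against the agent and return the pass rate."""
    passed = 0
    for question, criterion in TEST_CASES:
        try:
            ok = criterion(agent(question))
        except Exception:
            ok = False  # errors count as failures
        print(f"{'PASS' if ok else 'FAIL'}: {question}")
        passed += ok
    return passed / len(TEST_CASES)

if __name__ == "__main__":
    print(f"Pass rate: {evaluate(run_agent):.0%}")
```

Because the checks are just functions, the same harness scales from a handful of cases run by hand to hundreds of cases run in CI, which is the gap between manual and automated testing the survey highlights.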
Finally, he emphasized the importance of embracing a mix of systems that work symbiotically, from the cloud to application programming interfaces (APIs). “Consider treating AI as just one tool in the toolkit and not the magical solution to everything,” he said.