Tackling the not-so-hidden risks of ChatGPT and generative AI
The hype around ChatGPT and generative AI is real, and the widespread excitement about the technology makes it clear that it will be key to transforming digital experiences. Enterprise executives have suddenly recognized it as a new paradigm for search and for the digital experience, and as what users and customers now expect: a conversational experience that provides not just results but advice, in a way Google never could.
“Customers can use the technology to engage in more complex interactions, making pure self-service completely viable and profitable,” says Louis Têtu, chairman and CEO of Coveo. “AI is not about automation and efficiency; it is about augmentation and proficiency. The more I can help you become more proficient on your own, the more satisfied you will be, and the less cost I will engage to satisfy you.”
But while business leaders are putting pressure on their troops to add generative AI and do it quickly, the technology, as it stands, has several risks — what Têtu calls “the 9 headaches of CIOs with GenAI.”
Why generative AI must be handled with care
Behind the hype lie risks that must be addressed before the technology can be unleashed in a sensitive enterprise environment. GenAI requires infrastructure that can ingest the right context and feed it into the models so they generate high-quality answers while remaining cost-effective at large scale, in a way that respects privacy, security, permissions and proprietary content. Here’s a look at why it’s so hard to lock down a standard large language model (LLM) solution or generate high-quality answers, and why CIOs are rubbing their temples.
Unsecure environment, lack of confidentiality and privacy
Right off the bat, one of the greatest challenges of building and implementing an enterprise AI search and answering solution is the difficulty of securing the environment throughout the generative AI process, as well as maintaining permissions and privacy.
“Does a pharmaceutical company want to upload its IP and customer information into OpenAI? Of course not,” Têtu says. “Do I want my customer service people to start loading it up as prompts into GPT, and let GPT store it and use it for someone else? You must isolate the corpus of data that is secure before you even start generating an answer, and control how the generative platform will use the data.”
Hallucinations, outdated answers, non-compliance and validity
LLMs have no concept of “truth,” hard rules or factual accuracy, only language understanding, and they are trained on finite data that is frozen in time, generally in the past. There is therefore a genuine danger of hallucinations, in which the model connects a set of facts to create a novel yet false answer, and enterprise brands cannot afford to hallucinate with customers and other stakeholders. Compliance and credibility both demand veracity and verifiability: answers must stay connected to the source of truth behind the generative process, and they must be current rather than outdated.
“The enterprise world has constraints, obligations, compliance requirements,” Têtu explains. “If you’re Boeing and a Singapore Airlines engineer with a grounded Dreamliner in Malaysia is trying to solve an engine problem, you cannot hallucinate. On top of that, compliance demands that whatever answer you create, you need to keep the linkage to the source of truth, securely.”
Part of the challenge, and the opportunity, is that enterprises also have a huge amount of content spread across multiple sources, from documents to wikis, intranets, engineering files and customer and service information. The value of generative AI grows dramatically if you can tap into these multiple sources of content to generate answers, but doing so exacerbates the security risk unless access is well controlled.
Siloed search and conversational channels
The worlds of search, personalization, answers and conversations are converging into a new, more modern expectation for the digital experience: one that unifies search relevance and discovery with personalized answers and conversations. The traditional search box has simply become bigger, and ‘intent’ is now expressed through either traditional queries or long-form questions. Enterprises struggle to unify conversational and search channels, which makes the user experience inconsistent and adds unnecessary friction.
Treating answering separately is the mistake a majority of enterprises are making right now, Têtu says. Under pressure from executives, they have developed a separate generative AI solution, forgetting that the existing search infrastructure will probably still serve the vast majority of interactions.
“Search is not going away,” he explains. “Not everyone will ask a long-form question. You need to make sure that search and chat will generate the same answer, both grounded in the same corpus of results. The way to do that is to make sure they consume the same secure index for all search, extraction, embedding and the vectordb, and the same relevance logic for search and prompt engineering. In other terms, the LLM portion is really only the answering formulation portion.”
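The shared-index idea in the quote above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Coveo's implementation: the names `Document`, `SecureIndex`, `search` and `build_chat_prompt` are hypothetical, the index is a toy in-memory list, and relevance is plain keyword overlap standing in for real ranking and embeddings. The point it shows is that both channels call the same permission-aware retrieval step, so a chat answer can only be grounded in results the same user could have found through search.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to see this document


def tokenize(text):
    """Lowercase word tokens; a stand-in for real query understanding."""
    return set(re.findall(r"\w+", text.lower()))


class SecureIndex:
    """Toy in-memory index (hypothetical); filters by permission before ranking."""

    def __init__(self, documents):
        self._documents = list(documents)

    def retrieve(self, query, user_groups, top_k=3):
        terms = tokenize(query)
        groups = set(user_groups)
        scored = []
        for doc in self._documents:
            # Permission check comes first: unauthorized content is never ranked.
            if not (doc.allowed_groups & groups):
                continue
            overlap = len(terms & tokenize(doc.text))
            if overlap:
                scored.append((overlap, doc))
        # Highest overlap first; doc_id breaks ties deterministically.
        scored.sort(key=lambda pair: (-pair[0], pair[1].doc_id))
        return [doc for _, doc in scored[:top_k]]


def search(index, query, user_groups):
    """Search channel: return the ranked document ids."""
    return [doc.doc_id for doc in index.retrieve(query, user_groups)]


def build_chat_prompt(index, question, user_groups):
    """Chat channel: ground the LLM prompt in the same retrieved documents."""
    passages = index.retrieve(question, user_groups)
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in passages)
    prompt = (
        "Answer using only these passages:\n"
        f"{context}\n\nQuestion: {question}"
    )
    return prompt, [d.doc_id for d in passages]
```

In this sketch the LLM call itself is deliberately left out: whatever model formulates the answer, it only ever sees the passages returned by `retrieve`, which is the sense in which the LLM is "only the answering formulation portion."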
The cost of an enterprise solution
A big reason execs should worry about chat and search silos is cost. “The cost of a generative answer right now is approximately 1,000x more expensive than a query event,” Têtu says. “A rich digital experience triggers about 10 queries, so it’s a 100-to-one cost ratio. While this will improve, you would not want to multiply the cost of your search infrastructure by 100.”
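The arithmetic behind the quoted ratio can be made explicit. The sketch below assumes one reading of the quote: a single generated answer costs about 1,000 times one query event, a session already involves about 10 query events, and so adding one generated answer per session costs roughly 100 times what the session's queries cost. The function name and the one-answer-per-session assumption are illustrative, not from the source.

```python
def answer_to_session_cost_ratio(answer_cost_in_queries, queries_per_session):
    """Cost of one generated answer relative to a whole session's query cost.

    Both inputs are expressed in units of "one query event".
    """
    if queries_per_session <= 0:
        raise ValueError("queries_per_session must be positive")
    return answer_cost_in_queries / queries_per_session


# Figures quoted in the article: ~1,000x per answer, ~10 queries per session.
ratio = answer_to_session_cost_ratio(1000, 10)
print(f"{ratio:.0f}-to-one")  # prints "100-to-one"
```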
Addressing the risks of generative AI
How do you adopt generative AI within the enterprise and ensure you deliver a trusted, relevant, accurate content experience that is coherent across all search and conversational channels and is secure, current, verifiable and cost-effective? That’s the real challenge but, Têtu says, it’s doable.
“You need to be thoughtful about the architecture of how you inject these conversational channels as part of the overall digital experience, and how you feed generative AI. The science here is really in the prompt engineering and grounding data.”
Security, privacy and relevance
First, a platform should guarantee secure access to data with an infrastructure that respects permission and data security rules. On top of that, it requires relevance. While traditional search is content-centric, AI is the technology that lets you understand who’s on the other end, along with their intent and context, and thus deliver relevant results and recommendations.
Coveo addressed that issue with Coveo Relevance Cloud. It unifies data across systems to generate a unified, enriched index and a holistic understanding of content, and uses powerful AI to understand each user who interacts with it, including their context and intent. Its AI ranking algorithms match each query intent to the most relevant content results. This then forms the secure and relevant corpus of content that feeds its new Coveo Relevance Generative Answering technology, built on top of Coveo Relevance Cloud.
Verifiable and credible answers
Because the solution draws on real-time data sources from its unified index, it can feed generative answers that are grounded in and consistent with enterprise content. The answers maintain links to their sources of truth, allowing users to verify the accuracy and credibility of generated answers while discovering more supporting content.
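One simple way to keep a generated answer linked to its sources can be sketched as follows. This is an illustrative assumption, not a description of Coveo's product: it supposes the model was prompted with numbered passages and asked to cite them with markers such as `[1]`, and the `Passage` type, the `cited_sources` function and the example URLs are all hypothetical. The check maps each marker back to the retrieved passage it refers to and rejects any citation that points at nothing.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source_url: str  # link back to the source of truth
    text: str


def cited_sources(answer_text, passages):
    """Resolve [n] markers in an answer to the URLs of the retrieved passages.

    Raises ValueError if the answer cites a passage that was never retrieved,
    which would mean the citation cannot be verified.
    """
    urls = []
    for marker in re.findall(r"\[(\d+)\]", answer_text):
        position = int(marker) - 1  # markers are 1-based
        if not 0 <= position < len(passages):
            raise ValueError(f"citation [{marker}] has no matching passage")
        url = passages[position].source_url
        if url not in urls:  # keep first-seen order, drop repeats
            urls.append(url)
    return urls
```

A caller could display the returned URLs beside the answer so readers can open the underlying documents, and could treat an answer with no resolvable citations as unverified.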
Corralling every source of content
Instead of applying gen AI to a small FAQ repository or knowledge base, enterprises can extend its reach dramatically by indexing all sources of content across the organization, from engineering and customer files to supply chain information, customer service data and every other area where content is generated, and by using all of that data to generate answers.
In fact, Coveo is currently the only company able to deploy generative AI from multiple sources of enterprise data securely, from Salesforce, Adobe, ServiceNow, SAP, Oracle and Microsoft to the swath of databases and apps in an enterprise. The solution is even able to access external content such as YouTube — consider all the companies for whom YouTube offers relevant content for customers.
Bringing down costs
By combining advanced indexing, search and relevance algorithms with LLM technology, enterprises get results for a fraction of the cost typically associated with generative chat.
“Coveo is very uniquely positioned because we already have the corpus of data, it’s secure and it’s relevant to the issues that the user has,” Têtu says. “We can bring it all together, feed it in an LLM, and generate an answer at a much lower cost, much faster, that’s accurate to the context, secure, current, respects permissions and so on, and we get an answer that’s truly coherent, relevant and trustworthy.”