In this blog, we will address the primary challenges of generative AI applications, define retrieval augmented generation (RAG) and explore its potential as a solution, and then explain how Precanto's data centralization facilitates the application of RAG to FP&A. Although Precanto’s RAG-based solution is currently in beta, we believe it is crucial to engage with those interested in AI and ML in the FP&A sphere.
What is Retrieval Augmented Generation?
In simple terms, Retrieval-Augmented Generation (RAG) is a process where an AI application enhances its generated outputs with information retrieved from external sources. Instead of relying solely on the data it has been trained on, the AI retrieves context-specific information relevant to the use case to provide more accurate and informed responses.
IBM Research defines RAG as “an AI framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information and to give users insight into LLMs' generative process.”
Figure 1. Conceptual Flow of Using RAG With LLMs: This diagram shows the architectural flow of the generation of a RAG-based response.
Image source: What is RAG? - Retrieval-Augmented Generation Explained - AWS
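To make that flow concrete, here is a minimal, self-contained sketch of the retrieve-then-generate loop in Python. The toy knowledge base, the keyword-overlap retriever, and the stubbed generate function are all illustrative assumptions for this post; a production system would use a vector store and a real LLM client.

```python
# Minimal retrieve-then-generate loop (toy example; all names illustrative).
import re
from typing import List

KNOWLEDGE_BASE = [
    "Planned hires in the United States for 2024: 484.",
    "Fully loaded cost of planned United States hires: $65,340,000.",
    "Sales compensation plan updated in Q1 2024.",
]

def tokens(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank documents by how many words they share with the query (toy scoring)."""
    q = tokens(query)
    return sorted(docs, key=lambda d: -len(q & tokens(d)))[:k]

def build_prompt(query: str, context: List[str]) -> str:
    """Ground the model by prepending the retrieved facts to the question."""
    facts = "\n".join(f"- {c}" for c in context)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an API or local model client)."""
    return f"[model output conditioned on]\n{prompt}"

query = "How many hires are we going to make in the United States this year?"
print(generate(build_prompt(query, retrieve(query, KNOWLEDGE_BASE))))
```

The key design point is that the model never answers from its training data alone: every response is conditioned on the retrieved facts prepended to the prompt.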
The Benefits of RAG
Enhanced precision and fact-based foundation
A contextually grounded response is especially important for any finance application because the tolerance for numerical error in finance is zero. Any forecasting error, especially at a publicly traded company, could result in an earnings miss, which CFOs and FP&A leaders look to avoid. By querying and citing context-specific data, the AI can ground its numerical responses in accurate, verifiable figures.
Advanced grasp of contextual nuances
The strength of large language models (LLMs) lies in their capacity to generate tailored responses to specific questions or needs, in contrast to relying on more generalized information retrieval methods (e.g. a standardized downloadable data report). Their training on vast quantities of natural language data empowers them to comprehend context and efficiently guide users to the desired results. This is particularly valuable in financial planning and analysis, where multiple questions can lead to the same intended answer.
For example, let’s take a typical question from a VP of FP&A: “How many hires are we going to make in the United States this year? How much is that going to cost?”
An out-of-the-box, generalized platform solution may try to answer this question broadly, like this response from ChatGPT:
“To estimate the cost of hiring, you'd typically consider expenses like recruitment advertising, employee referral bonuses, hiring software or services, background checks, onboarding and training, salaries or wages, benefits, and any other associated overhead costs.”
The VP of FP&A isn’t asking for help in figuring out how to find the answer – they want to know the exact number for their specific company. They want someone, or something, else to do the work, save them time, and get them to the correct end point.
A general solution can’t do that.
One might suggest augmenting the out-of-the-box model with fine-tuning, “the process of adjusting the parameters of a pre-trained large language model to a specific task or domain” (Fine-Tuning LLMs: Overview, Methods, and Best Practices).
Using the example above, let’s say we’ve fine-tuned a foundational model to understand some of the business and technological nuances, and it produces the following response to the same input query, “How many hires are we going to make in the United States this year? How much is that going to cost?”:
“To find this information, you will need to navigate to your database and create a query to take the sum of the values in the headcount_plan and applicant_tracking_system tables and filter for locations in the United States. The headcount values can be found on column headcount and the cost can be found on column fully_loaded_cost.”
This can be helpful if you’re savvy with SQL and have a working knowledge of your database, but it’s still not giving a director of finance the answer that makes their life easier.
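For readers who do want to follow those directions, here is roughly what that query looks like, run against a toy in-memory database. The schema (table and column names) is lifted from the hypothetical response above, and the figures are invented so the arithmetic matches the example later in this post.

```python
# Toy version of the query the fine-tuned model describes. The schema and
# numbers are illustrative assumptions, not any real system's data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE headcount_plan (location TEXT, headcount INTEGER, fully_loaded_cost REAL);
CREATE TABLE applicant_tracking_system (location TEXT, headcount INTEGER, fully_loaded_cost REAL);
INSERT INTO headcount_plan VALUES ('United States', 300, 40500000);
INSERT INTO applicant_tracking_system VALUES ('United States', 184, 24840000);
""")

row = conn.execute("""
SELECT SUM(headcount), SUM(fully_loaded_cost) FROM (
  SELECT headcount, fully_loaded_cost FROM headcount_plan
  WHERE location = 'United States'
  UNION ALL
  SELECT headcount, fully_loaded_cost FROM applicant_tracking_system
  WHERE location = 'United States'
)
""").fetchone()

print(f"Hires: {row[0]}, cost: ${row[1]:,.0f}")  # Hires: 484, cost: $65,340,000
```

Even in this simplified form, the user has to know the schema, write the SQL, and interpret the result themselves, which is exactly the work they were hoping to hand off.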
This is where RAG comes in.
Say the generative AI application has an integrated knowledge base to dynamically retrieve information and generate outputs that align with the context of the query and the pertinent data.
A hypothetical response from a RAG-based application may look something like this, with the same input query as before, “How many hires are we going to make in the United States this year? How much is that going to cost?”:
“In your question, you asked about the number and cost of hires in the United States this year. Here’s what was found: ‘The total number of hires planned and hires made in the location of the United States within the accounting periods in 2024 is 484. The sum of the fully loaded cost of the hired employees and the planned hires is $65,340,000.’”
With this response, the user gains valuable insights into their data, and with only one query. This is particularly crucial for FP&A, where dedicating time to gathering data reports, building additional spreadsheet workflows, and locating specific data points detracts from the ability to focus on strategic thinking and executing key decisions.
Of course, information can only be retrieved if it exists in the first place. With Precanto, fully loaded costs of employees, calculated from comprehensive formulae and predictive analytics, can be readily computed and provided as context to a RAG stream.
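As a rough illustration, a computed metric like fully loaded cost can be rendered into text chunks that a retriever can index. The flat burden-rate formula below is an assumption made for this sketch, not Precanto’s actual cost model.

```python
# Sketch: turn a computed metric into retrievable context. The burden-rate
# formula is an illustrative assumption, not a real cost model.
from dataclasses import dataclass
from typing import List

BURDEN_RATE = 0.35  # assumed: benefits, payroll taxes, overhead as % of base

@dataclass
class Hire:
    role: str
    base_salary: float
    location: str

def fully_loaded_cost(hire: Hire) -> float:
    return hire.base_salary * (1 + BURDEN_RATE)

def to_context(hires: List[Hire]) -> str:
    """Render computed costs as a text chunk a retriever can index."""
    return "\n".join(
        f"{h.role} ({h.location}): fully loaded cost ${fully_loaded_cost(h):,.0f}"
        for h in hires
    )

print(to_context([Hire("Data Engineer", 150_000, "United States")]))
# Data Engineer (United States): fully loaded cost $202,500
```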
Customizable outputs
From a technology standpoint, RAG offers the advantage of producing outputs tailored to specific business cases such as headcount spend, sales compensation, and payroll taxes, with the flexibility to be dynamically customized for each user's requirements. Whether you prioritize detailed reports with high verbosity or concise numeric summaries, prompt engineering within the RAG framework enables this level of customization, greatly enhancing its value compared to standalone prompt engineering. This adaptability can significantly accelerate the time-to-value for any analytical use case.
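As a sketch of what that customization might look like in practice, the same retrieved facts can be rendered through different prompt templates depending on the user’s preference. The template wording below is illustrative.

```python
# Sketch: per-user output customization via prompt templates (illustrative).
TEMPLATES = {
    "verbose": (
        "Using the facts below, write a detailed narrative answer with "
        "supporting figures.\n\nFacts:\n{facts}\n\nQuestion: {question}"
    ),
    "concise": (
        "Using the facts below, answer with numbers only, in one line.\n\n"
        "Facts:\n{facts}\n\nQuestion: {question}"
    ),
}

def make_prompt(style: str, facts: str, question: str) -> str:
    return TEMPLATES[style].format(facts=facts, question=question)

facts = "- 2024 US planned hires: 484\n- Fully loaded cost: $65,340,000"
question = "How many hires are we making in the US this year, and at what cost?"
print(make_prompt("concise", facts, question))
```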
Output transparency
Retrieval-Augmented Generation (RAG) enhances output transparency by retrieving information from specific sources during the generation process, enabling the model to provide citations and references for its responses. By offering explanations or justifications and ensuring the use of external data for accuracy, RAG aligns the model's output with the latest available information.
This approach increases accountability, allowing users and developers to hold the model responsible for its responses, and facilitates error analysis by pointing to the exact sources used. Additionally, RAG's contextual retrieval improves the quality and clarity of the model's output, providing more nuanced and tailored responses that help users better understand the sources and reasons behind the model's answers.
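One way to implement this, sketched below, is to carry a source identifier with every retrieved chunk so the final answer can cite exactly which records backed each figure. The source names here are invented for illustration.

```python
# Sketch: citation-carrying retrieval. Source names are made up.
from typing import NamedTuple, List

class Chunk(NamedTuple):
    text: str
    source: str

retrieved: List[Chunk] = [
    Chunk("2024 US planned hires: 484", "headcount_plan (2024-03 snapshot)"),
    Chunk("Fully loaded cost: $65,340,000", "cost_model v2"),
]

answer = "484 hires at a fully loaded cost of $65,340,000."
citations = "\n".join(f"[{i + 1}] {c.source}" for i, c in enumerate(retrieved))
print(f"{answer}\n\nSources:\n{citations}")
```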
How RAG Aligns with Precanto’s Mission
Opportunity for cost optimization and enhanced productivity
Incorporating real-time data feeds into your AI framework yields immediate advantages for FP&A practitioners. While a large language model alone can produce human-like responses and directional guidance, a RAG approach grounds those responses in your organization’s own, current data. By accelerating data access, practitioners can promptly make informed decisions and efficiently return to core responsibilities.
This is the type of service and usability we’re striving to build for our customers here at Precanto.
Data Centrality
A key advantage of utilizing Precanto is the consolidation of all financial planning data into a unified platform that provides actionable insights. As your organization expands and data volumes increase, the ability to leverage and manage this data becomes increasingly critical.
Closing Thoughts
The increasing usage of LLMs and RAG-based applications is exciting and presents promising opportunities for the finance sector. As the field expands rapidly, we remain committed to advancing research and development in the intersection of FP&A and AI.
For more information on how Precanto is looking to stay at the forefront of FP&A and AI/ML, follow us on LinkedIn: https://www.linkedin.com/company/precanto/.
Also feel free to reach out to me, Rodrigo, with any and all of your questions about Precanto, the cross-space between FP&A and AI/ML, or anything related to data science.
Rodrigo Cochran, Machine Learning and Data Science @ Precanto
Email: rodrigo@precanto.com
LinkedIn: https://www.linkedin.com/in/rodrigo-cochran/
Sources
What is RAG?
https://aws.amazon.com/what-is/retrieval-augmented-generation/
Large Language Models Explained
https://www.nvidia.com/en-us/glossary/large-language-models/
What is retrieval-augmented generation?
https://research.ibm.com/blog/retrieval-augmented-generation-RAG
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
https://arxiv.org/abs/2201.11903
Prompt Engineering
https://platform.openai.com/docs/guides/prompt-engineering
Getting started with LLM fine-tuning
Retrieval-Augmented Generation for Large Language Models: A Survey
https://arxiv.org/abs/2312.10997