Introduction
Large-document summarization has progressively become one of the most important use cases across several industries. In the pharmaceutical sector, for instance, consider a scenario where experiments are conducted on Drugs A, B, and C across ten different controlled groups. To gain meaningful insights from these studies, an efficient summarization method is needed to extract key findings, compare outcomes, and generate answers that are specific to a user's query. Similarly, in financial analysis, consider an analyst reviewing quarterly reports, earnings transcripts, and market data for multiple companies across different sectors. To make informed investment decisions, an effective summarization method is needed to extract key financial metrics, trends, and risk factors from these extensive documents.
Most reading endeavors performed by humans across a range of domains rely on the ability to reason over large collections of pages, reaching conclusions and inferences that go beyond what is stated in the source. As LLMs are increasingly asked to perform this kind of sensemaking and intelligent analysis, it becomes clear that they struggle to do it well on their own.
Currently, summarising or answering query-specific questions with LLMs and naïve Retrieval Augmented Generation (RAG) has significant limitations in most scenarios where global context matters.
For example, suppose we had to summarise the Harry Potter series and ask some interesting questions about it. A naïve approach tends to return a vague answer, while a query-focused approach returns one that is well structured and relevant. Which would you find more useful? The same applies to summarising last quarter's sales story or your product distribution: would you prefer a simple answer, or a structured and precise one?
The limitations of Naïve RAG
- Local Context Limitation: Naïve RAG is designed for situations where answers are contained locally within specific regions of text. It struggles with questions that require understanding the entire context of a file (for example, a book or a series of documents that share a lineage).
- Context Window Constraints: LLMs have limited context windows, which can be insufficient for summarizing large volumes of text. In practice this means the model can only evaluate a small piece of information at a time, and as the saying goes, a little knowledge is a dangerous thing; we can therefore expect vague or hallucinated answers. Even expanding these windows may not solve the problem, as information can be "lost in the middle" of longer contexts.
- Scalability Issues: Prior query-focused summarization (QFS) methods fail to scale to the quantities of text indexed by typical RAG systems.
These limitations call for an alternative system that offers global understanding, contextual relevance, and scalability. That is where Query-Focused Summarization comes in: it extracts answers that are relevant to the user's query, not just in terms of the source text but also in an abstract sense. A promising strategy for achieving QFS is GraphRAG, which combines knowledge graphs, LLMs, and RAG.
What is Query-Focused Summarization (QFS)?
Query-Focused Summarization (QFS) is an advanced natural language processing task that aims to generate concise summaries of large text corpora in response to specific user queries. It differs from traditional summarization by tailoring the output to address particular questions or information needs. Key aspects of QFS include:
- Global scope: QFS can handle questions directed at an entire text corpus, such as "What are the main themes in the dataset?", synthesizing information across diverse sections and hierarchical levels of the corpus.
- Scalability: It is designed to work with large quantities of text, often exceeding the context limits of typical language models.
- Abstractive approach: QFS generates natural language summaries rather than just concatenating excerpts from the source text.
- Query-specific: The summarization process is guided by the user's query, ensuring relevance and focus in the generated summary.
QFS represents an evolution beyond simple retrieval-augmented generation (RAG) systems, as it can synthesize information from across an entire corpus to answer broad, thematic questions.
The GraphRAG approach
GraphRAG is an innovative approach to query-focused summarization that uses an LLM to build a graph-based text index, creating a knowledge graph from source documents and generating community summaries. By leveraging graph modularity and community detection algorithms, it enables comprehensive global summarization of large text corpora, overcoming the local-context limitation of traditional RAG. The following steps outline the process:
1. Source Documents → Text Chunks:
Input texts are split into manageable chunks, balancing efficiency and quality. The chunk size affects the number of LLM calls required and the quality of entity extraction. Experiments show that smaller chunk sizes (e.g., 600 tokens) can extract almost twice as many entity references compared to larger sizes (e.g., 2400 tokens).
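To make this step concrete, here is a minimal token-based chunking sketch (not the official GraphRAG code); it assumes the `tiktoken` tokenizer is available, and the 600-token window with a 100-token overlap is an illustrative choice.

```python
import tiktoken


def chunk_text(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    # Token-based sliding window; the size and overlap values are illustrative.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(enc.decode(window))
    return chunks
```

Smaller windows mean more LLM extraction calls but, as noted above, tend to recover more entity references per token of source text.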
2. Text Chunks → Element Instances:
An LLM prompt identifies and extracts instances of graph nodes (entities) and edges (relationships) from each text chunk. This process uses a multipart prompt that first identifies entities (name, type, description) and then relationships between entities. The prompt can be customized for specific domains using few-shot examples.
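A hedged sketch of what such a multipart extraction call might look like is below; `call_llm` is a hypothetical helper standing in for any chat-completion function, and the JSON output schema is an assumption for illustration, not the paper's exact prompt.

```python
import json

# Illustrative multipart prompt: entities first, then relationships (assumed format).
EXTRACTION_PROMPT = """Identify all entities in the text below.
For each entity return: name, type, description.
Then list relationships between entities as: source, target, description, strength (1-10).
Return JSON with keys "entities" and "relationships".

Text:
{chunk}
"""


def extract_elements(chunk: str, call_llm) -> dict:
    # `call_llm` is a placeholder for any LLM completion call (assumption).
    raw = call_llm(EXTRACTION_PROMPT.format(chunk=chunk))
    return json.loads(raw)  # e.g. {"entities": [...], "relationships": [...]}
```

Few-shot examples for a specific domain would simply be prepended to the prompt above.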
3. Element Instances → Element Summaries:
The extracted instances are summarized into single blocks of descriptive text for each graph element (entity node, relationship edge, and claim covariate). This step involves further LLM summarization over matching groups of instances.
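Below is an illustrative sketch of this grouping-and-summarizing step for entity descriptions; the function names and the `call_llm` helper are hypothetical.

```python
from collections import defaultdict


def summarize_elements(instances: list[dict], call_llm) -> dict[str, str]:
    # Group every description collected for the same entity name across chunks.
    grouped = defaultdict(list)
    for inst in instances:
        for ent in inst["entities"]:
            grouped[ent["name"]].append(ent["description"])

    # Ask the LLM to merge the grouped notes into one descriptive block per entity.
    summaries = {}
    for name, descriptions in grouped.items():
        prompt = (f"Write a single coherent description of '{name}' "
                  "based on these notes:\n- " + "\n- ".join(descriptions))
        summaries[name] = call_llm(prompt)
    return summaries
```

The same pattern applies to relationship edges and claim covariates.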
4. Element Summaries → Graph Communities:
The resulting index is modeled as a homogeneous undirected weighted graph. Then the graph is partitioned into hierarchical communities of closely-related nodes. This approach leverages the inherent modularity of graphs to create a scalable structure for summarization.
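The sketch below builds such a weighted graph and partitions it. GraphRAG uses hierarchical Leiden community detection; here the Louvain implementation from networkx stands in as a readily available alternative, so treat this as an approximation rather than the paper's method.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities


def build_communities(entity_summaries: dict[str, str], relationships: list[dict]):
    # Homogeneous undirected weighted graph of entities and their relationships.
    G = nx.Graph()
    for name, description in entity_summaries.items():
        G.add_node(name, description=description)
    for rel in relationships:
        G.add_edge(rel["source"], rel["target"], weight=rel.get("strength", 1))

    # Louvain as a stand-in for hierarchical Leiden; returns sets of related nodes.
    return louvain_communities(G, weight="weight", seed=42)
```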
5. Graph Communities → Community Summaries
Report-like summaries are created for each community in the hierarchy. For leaf-level communities, element summaries are prioritized and added to the LLM context window until the token limit is reached. Higher-level communities are summarized based on their sub-communities.
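A rough sketch of leaf-level community summarization under a token budget follows; the word-count token estimate and the `call_llm` helper are simplifications for illustration.

```python
def summarize_community(community: set[str], entity_summaries: dict[str, str],
                        call_llm, token_budget: int = 3000) -> str:
    # Pack element summaries into the context until the budget is reached.
    context, used = [], 0
    for name in community:
        piece = f"{name}: {entity_summaries.get(name, '')}"
        cost = len(piece.split())  # crude word-count stand-in for real token counting
        if used + cost > token_budget:
            break
        context.append(piece)
        used += cost

    prompt = ("Write a structured, report-like summary of this community of "
              "related entities:\n" + "\n".join(context))
    return call_llm(prompt)
```

Higher-level communities would reuse the same routine over their sub-community reports instead of raw element summaries.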
6. Community Summaries → Community Answers → Global Answers:
Given a question, each community summary is used to generate a partial response. These partial responses are then combined into a final global answer to the user's query using a map-reduce approach.
This pipeline enables query-focused summarization over the entire corpus, addressing the limitations of traditional RAG systems in handling global questions. The use of graph-based indexing and hierarchical community detection allows for efficient processing of large datasets, while the LLM-generated summaries provide comprehensive coverage of the underlying information.
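To make the final map-reduce step concrete, here is a minimal sketch; the self-reported helpfulness score and the `call_llm` helper are assumptions for illustration, not the paper's exact prompts.

```python
def global_answer(query: str, community_summaries: list[str],
                  call_llm, top_k: int = 5) -> str:
    # Map: answer the query from each community summary independently.
    partials = []
    for summary in community_summaries:
        prompt = ("Using only this community summary, answer the question and "
                  "rate the answer's helpfulness on one line as 'SCORE: <0-100>'.\n"
                  f"Summary:\n{summary}\n\nQuestion: {query}")
        partials.append(call_llm(prompt))

    def score(answer: str) -> int:
        # Naive parse of the self-reported helpfulness score (assumed convention).
        for line in answer.splitlines():
            if line.upper().startswith("SCORE:"):
                digits = "".join(ch for ch in line if ch.isdigit())
                return int(digits) if digits else 0
        return 0

    # Reduce: combine the most helpful partial answers into one global answer.
    best = sorted(partials, key=score, reverse=True)[:top_k]
    reduce_prompt = ("Combine these partial answers into one comprehensive answer "
                     f"to the question: {query}\n\n" + "\n---\n".join(best))
    return call_llm(reduce_prompt)
```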
Table 1: Comparison between Naïve RAG and GraphRAG
Example:
Given a podcast dataset, here is an example comparing the behavior of GraphRAG and Naïve RAG.
Query: "What are the main themes in the podcast dataset?"
Naïve RAG Answer: “The podcast dataset includes discussions on technology, entertainment, and current events. Some popular topics include:
- The latest iPhone release and its features
- An interview with a Hollywood actor about their new movie
- A debate on the impact of social media on mental health
- A discussion about a recent political election”
While this answer provides some specific examples, it lacks the broader thematic organization and comprehensive coverage of the Graph RAG approach.
Graph RAG Answer:
The podcast dataset covers several main themes:
- Technology and Innovation
  - Discussions on AI advancements and their societal impact
  - Emerging trends in consumer electronics and gadgets
  - Debates on privacy and data security in the digital age
- Entertainment and Pop Culture
  - Analysis of popular TV shows and streaming content
  - Interviews with actors, directors, and content creators
  - Exploration of social media trends and influencer culture
- Health and Wellness
  - Conversations about mental health awareness and self-care
  - Discussions on nutrition, fitness, and holistic well-being
  - Expert insights on medical breakthroughs and public health issues
- Current Affairs and Politics
  - Analysis of global political events and their implications
  - Debates on social justice issues and activism
  - Discussions on economic policies and their impact on society
This answer demonstrates a comprehensive overview of the dataset's themes, showing the ability of Graph RAG to synthesize information from across the entire corpus.
Table 2: Comparison between the answers of Naïve RAG and Graph RAG
Conclusion
As we stand on the cusp of 2025, Graph RAG represents a significant leap forward in our ability to make sense of vast amounts of information. QFS is not a luxury but a necessity for AI transformation, where the value of information is maximised with minimal rambling and hallucination.