🧠

Research Mind

LLM-Powered Research Copilot for Literature Review and Evidence-Grounded Question Answering

Status: Active Development
Timeline: 6 months
Role: Full Stack Engineer

Problem Statement

Literature review is time-consuming. Researchers must:

• Search across many papers manually, often spending hours locating relevant passages
• Extract and synthesize information from unstructured PDFs without tool support
• Compare findings across papers to identify gaps and contradictions
• Generate insights that stay grounded in evidence and free of hallucinated claims

Target Users: Graduate students, researchers, and academics conducting systematic literature reviews or rapid research synthesis.

Research Motivation

Why RAG?

Retrieval-Augmented Generation (RAG) pairs a retriever with a language model: passages relevant to the query are fetched from the source documents and passed to the model as context. Grounding answers this way reduces hallucinations and lets each response point back to its evidence.

Why Hybrid Search?

Hybrid retrieval (BM25 + semantic embeddings) captures both lexical and semantic relevance, improving retrieval accuracy compared to single-mode search.

System Architecture

Backend Pipeline

Document Ingestion: Upload PDFs → Extract text with PyPDF

Chunking: Split text into overlapping chunks (512 tokens, 50 token overlap)

Embedding: Generate embeddings using HuggingFace sentence-transformers

Indexing: Store embeddings in FAISS and build a BM25 keyword index over the same chunks

Retrieval: Hybrid search combining semantic similarity + keyword matching

Generation: LLM processes retrieved documents + user query → grounded response
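
The retrieval step in the pipeline above can be illustrated with a small sketch. The source does not say how the semantic and keyword results are combined, so this example assumes Reciprocal Rank Fusion (RRF), one common way to merge ranked lists. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the chunk ids are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical outputs of the two retrievers for one query.
semantic_hits = ["c3", "c1", "c7"]  # e.g. from a FAISS similarity search
keyword_hits = ["c1", "c9", "c3"]   # e.g. from a BM25 index

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Rank-based fusion avoids having to normalize BM25 scores and embedding similarities onto a shared scale, which is why it is a convenient default for a sketch like this. Chunks found by both retrievers (`c1`, `c3`) rise above chunks found by only one.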

Frontend Interface

PDF Upload: Drag-and-drop interface for document management

Query Interface: Real-time question answering with retrieval transparency

Citation Display: Show source documents and retrieved chunks with highlights

Paper Comparison: Multi-document view for synthesizing findings

Technical Implementation

Backend Stack

• Framework: FastAPI
• Embeddings: HuggingFace sentence-transformers
• Vector Store: FAISS
• Retrieval: BM25 + semantic hybrid search
• LLM: GPT-4 / Claude API
• PDF Processing: PyPDF, LangChain

Frontend Stack

• Framework: React with TypeScript
• State Management: React Context
• Styling: Tailwind CSS
• API Integration: Axios
• PDF Viewer: react-pdf
• Deployment: Vercel

Technical Challenges & Solutions

Challenge 1: Handling Large PDFs

Problem: Large research papers (50+ pages) exceed LLM token limits, and naive fixed-size chunking splits tables and figures apart.

Solution: Implemented sliding-window chunking with overlap, handling tables and figures separately so their structure is preserved and every chunk stays within the LLM token limit.
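
A minimal sketch of the sliding-window part of this approach, using the 512-token window and 50-token overlap listed in the backend pipeline. Whitespace splitting stands in for a real tokenizer, and the function name `chunk_tokens` is hypothetical. Table and figure handling is not shown.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into overlapping windows.

    Consecutive windows share `overlap` tokens, so a sentence cut at a
    window boundary still appears intact in one of the two chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Assumption: whitespace tokens stand in for model tokens.
text = " ".join(f"w{i}" for i in range(1000))
tokens = text.split()

chunks = chunk_tokens(tokens)
print(len(chunks), [len(c) for c in chunks])
```

With 1000 tokens the windows start at offsets 0, 462, and 924, so the last chunk is shorter than the others. A production version would count tokens with the embedding model's own tokenizer instead of whitespace.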

Challenge 2: Hallucination Reduction

Problem: The LLM generates plausible but unsupported answers when retrieval fails.

Solution: Implemented retrieval validation: the LLM generates an answer only if the confidence score exceeds a threshold; otherwise the system suggests retrieving more documents.
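
The gating logic might look roughly like the sketch below. The source does not define how the confidence score is computed, so this example assumes it is a per-chunk retrieval similarity in [0, 1]. The threshold of 0.5, the `RetrievedChunk` type, and the `generate` callback are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    chunk_id: str
    text: str
    score: float  # assumed: similarity in [0, 1], higher is better


def answer_or_abstain(question, chunks, generate, threshold=0.5):
    """Call the LLM only when at least one chunk clears the threshold."""
    supported = [c for c in chunks if c.score > threshold]
    if not supported:
        # Abstain instead of letting the model answer without evidence.
        return {
            "answered": False,
            "message": "Not enough supporting evidence; try adding more documents.",
            "sources": [],
        }
    answer = generate(question, [c.text for c in supported])
    return {
        "answered": True,
        "message": answer,
        "sources": [c.chunk_id for c in supported],
    }


def fake_generate(question, contexts):
    # Stand-in for a real LLM call; reports how many contexts it received.
    return f"Answer based on {len(contexts)} chunk(s)."


weak = [RetrievedChunk("c1", "unrelated text", 0.21)]
strong = [
    RetrievedChunk("c2", "relevant text", 0.83),
    RetrievedChunk("c3", "noise", 0.30),
]

print(answer_or_abstain("What is RAG?", weak, fake_generate))
print(answer_or_abstain("What is RAG?", strong, fake_generate))
```

Returning the ids of the chunks that passed the filter keeps the answer traceable to its sources, which matches the citation display described for the frontend.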

Challenge 3: Cross-Document Reasoning

Problem: Queries that require synthesis across multiple papers often retrieve irrelevant sections.

Solution: Developed a multi-query expansion strategy in which the LLM rephrases each question into several variants, retrieving more diverse passages and improving recall for comparative analysis.
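
As a rough illustration of multi-query expansion, the sketch below runs the original question plus several rephrasings through a retriever and keeps the best score seen for each chunk. The `rephrase` and `search` callbacks, the canned index, and the merge-by-maximum rule are assumptions made for the example; the source does not describe how results from the variants are merged.

```python
def multi_query_retrieve(question, rephrase, search, n_variants=3, top_k=5):
    """Retrieve with the original question plus rephrasings, then merge."""
    queries = [question] + list(rephrase(question, n_variants))
    best = {}  # chunk_id -> best score seen across all queries
    for q in queries:
        for chunk_id, score in search(q, top_k):
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    # Highest score first; ties broken by chunk id for determinism.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


def fake_rephrase(question, n):
    # Stand-in for an LLM call that returns n rephrasings of the question.
    variants = [
        "Which methods do the papers compare?",
        "How do the reported results differ between papers?",
        "What limitations does each paper mention?",
    ]
    return variants[:n]


# Hypothetical retriever output, keyed by query text.
FAKE_INDEX = {
    "How do the papers differ?": [("a1", 0.62), ("b4", 0.40)],
    "Which methods do the papers compare?": [("a1", 0.70), ("c2", 0.55)],
    "How do the reported results differ between papers?": [("b4", 0.66)],
    "What limitations does each paper mention?": [("d9", 0.48)],
}


def fake_search(query, top_k):
    return FAKE_INDEX.get(query, [])[:top_k]


results = multi_query_retrieve("How do the papers differ?", fake_rephrase, fake_search)
print(results)
```

In this toy data the original question reaches only two chunks, while the variants surface two more (`c2`, `d9`) from other papers, which is the recall gain the strategy aims for.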

Methodology

Evaluation Framework

Evaluated system performance across three dimensions:

• Retrieval Quality: NDCG@5 and MRR, measuring the relevance of retrieved documents
• Generation Quality: ROUGE and BERTScore, comparing generated answers to gold references
• Factuality: Manual annotation of hallucination rates and citation accuracy
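
For reference, the two retrieval metrics can be computed as in the sketch below. It uses the linear-gain form of DCG and, as a simplification, derives the ideal ranking from the retrieved list only. The source does not state which variant was used, and the relevance labels here are made up for illustration.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevances."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=5):
    """DCG of the system ranking divided by DCG of the ideal ranking.

    Simplification: the ideal ranking reorders the retrieved list only,
    ignoring relevant documents that were never retrieved.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(ranked_relevance_lists):
    """Average of 1/rank of the first relevant result per query (0 if none)."""
    if not ranked_relevance_lists:
        return 0.0
    total = 0.0
    for rels in ranked_relevance_lists:
        for rank, rel in enumerate(rels, start=1):
            if rel > 0:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance_lists)


# Illustrative labels: 1 = relevant, 0 = not relevant, in ranked order.
queries = [
    [0, 1, 0, 0, 1],  # first relevant result at rank 2
    [1, 0, 0, 0, 0],  # first relevant result at rank 1
    [0, 0, 0, 0, 0],  # nothing relevant retrieved
]

print("MRR:", mean_reciprocal_rank(queries))
print("NDCG@5:", [round(ndcg_at_k(q), 3) for q in queries])
```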

Dataset & Benchmarking

Tested on 50 research papers from arXiv (NLP domain) with 200+ curated questions. Compared against a keyword-search baseline and semantic-only (embedding) retrieval.

Results & Impact

• Retrieval Accuracy: +45% (hybrid search vs. semantic-only)
• Hallucination Reduction: -68% (with confidence-based filtering)
• User Satisfaction: 8.2/10 (from researcher feedback, n=15)

Key Findings

• Hybrid retrieval outperforms semantic-only and keyword-only search across all metrics
• Multi-query expansion improves cross-document reasoning by 31% for comparative questions
• Confidence-based filtering reduces hallucinations while maintaining answer quality
• Users prefer cited answers with source transparency over unsourced summaries

Future Work

• Multi-agent orchestration for automated literature synthesis and research report generation
• Structured extraction of claims, methodologies, and results for meta-analysis workflows
• Interactive visualization of citation networks and research gaps across document collections
• Integration with reference management systems (Zotero, Mendeley) for seamless workflow
• Fine-tuned retrievers for domain-specific papers (biomedical, physics, computer science)
• Hallucination evaluation framework with automated factuality scoring

Links & Resources

GitHub Repository

View source code and documentation
