Research Mind
LLM-Powered Research Copilot for Literature Review and Evidence-Grounded Question Answering
Status: Active Development
Timeline: 6 months
Role: Full Stack Engineer
Problem Statement
Literature review is a time-consuming process that requires researchers to:
- Search across multiple papers manually, spending hours finding relevant information
- Extract and synthesize information from unstructured PDFs without tool support
- Compare findings across papers to identify gaps and contradictions
- Generate insights while maintaining evidence-grounding and avoiding hallucinations
Target Users: Graduate students, researchers, and academics conducting systematic literature reviews or rapid research synthesis.
Research Motivation
Why RAG?
Retrieval-Augmented Generation (RAG) pairs a retriever with a language model so that answers are grounded in source documents, which reduces hallucinations and makes each response traceable to evidence.
Why Hybrid Search?
Hybrid retrieval (BM25 + semantic embeddings) captures both lexical and semantic relevance, improving retrieval accuracy compared to single-mode search.
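One common way to combine the two signals is a weighted sum of normalized scores. The sketch below is illustrative only: the function names, the min-max normalization, and the `alpha` weight are assumptions, not details taken from this project.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    bm25_scores: Dict[str, float],
    semantic_scores: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight on the semantic signal
) -> List[Tuple[str, float]]:
    """Fuse lexical and semantic scores into one ranking (best first)."""
    bm25_n = min_max(bm25_scores)
    sem_n = min_max(semantic_scores)
    fused = {
        doc_id: alpha * sem_n.get(doc_id, 0.0)
        + (1.0 - alpha) * bm25_n.get(doc_id, 0.0)
        for doc_id in set(bm25_n) | set(sem_n)
    }
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Normalizing first matters because raw BM25 scores are unbounded while cosine similarities are not, so an unnormalized sum would let one signal dominate. Setting `alpha=0` reduces this sketch to keyword-only ranking, and `alpha=1` to semantic-only ranking.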
System Architecture
Backend Pipeline
Document Ingestion: Upload PDFs → Extract text with PyPDF
Chunking: Split text into overlapping chunks (512 tokens, 50-token overlap)
Embedding: Generate embeddings using HuggingFace sentence-transformers
Indexing: Store embeddings in FAISS, with a separate BM25 index for keyword matching
Retrieval: Hybrid search combining semantic similarity + keyword matching
Generation: LLM processes retrieved documents + user query → grounded response
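The chunking step above can be sketched as a sliding window over a token list. This is a minimal illustration using the 512/50 settings from the pipeline; the function name is hypothetical, and it assumes the text has already been tokenized (a real pipeline would likely use the embedding model's tokenizer rather than a whitespace split).

```python
from typing import List


def chunk_tokens(
    tokens: List[str], chunk_size: int = 512, overlap: int = 50
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk made only of overlap tokens is produced.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

For example, `chunk_tokens(text.split())` would chunk on whitespace-separated words. The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.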
Frontend Interface
PDF Upload: Drag-and-drop interface for document management
Query Interface: Real-time question answering with retrieval transparency
Citation Display: Show source documents and retrieved chunks with highlights
Paper Comparison: Multi-document view for synthesizing findings
Technical Implementation
Backend Stack
- Framework: FastAPI
- Embeddings: HuggingFace sentence-transformers
- Vector Store: FAISS
- Retrieval: BM25 + semantic hybrid search
- LLM: GPT-4 / Claude API
- PDF Processing: PyPDF, LangChain
Frontend Stack
- Framework: React with TypeScript
- State Management: React Context
- UI Components: TailwindCSS
- API Integration: Axios
- PDF Viewer: react-pdf
- Deployment: Vercel
Technical Challenges & Solutions
Challenge 1: Handling Large PDFs
Problem: Large research papers (50+ pages) exceed LLM token limits, and naive chunking splits tables and figures mid-structure.
Solution: Implemented sliding-window chunking with overlap, handling tables and figures separately to preserve their structure while keeping each chunk within LLM token limits.
Challenge 2: Hallucination Reduction
Problem: The LLM generates plausible but unsupported answers when retrieval fails.
Solution: Implemented retrieval validation, in which the LLM generates an answer only if the confidence score exceeds a threshold; otherwise, the system suggests retrieving more documents.
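A confidence gate of this kind might look like the sketch below. The write-up does not say how the confidence score is computed, so this example assumes it is each chunk's retrieval similarity in [0, 1]; the `RetrievedChunk` type, the `generate` callback, and the 0.6 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RetrievedChunk:
    text: str
    score: float  # assumed: retrieval similarity in [0, 1]


def answer_or_abstain(
    query: str,
    chunks: List[RetrievedChunk],
    generate: Callable[[str, List[RetrievedChunk]], str],
    threshold: float = 0.6,  # hypothetical cutoff, would need tuning
) -> Dict[str, object]:
    """Call the LLM only when some chunk clears the confidence threshold."""
    confident = [c for c in chunks if c.score > threshold]
    if not confident:
        # Abstain instead of letting the model answer without evidence.
        return {
            "answered": False,
            "answer": None,
            "message": "Not enough supporting evidence; try adding more documents.",
        }
    return {
        "answered": True,
        "answer": generate(query, confident),
        "sources": confident,
    }
```

Passing only the chunks that clear the threshold to `generate` keeps weak matches out of the prompt, and returning them as `sources` lets the interface show which passages back the answer.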
Challenge 3: Cross-Document Reasoning
Problem: Queries requiring synthesis across multiple papers often retrieve irrelevant sections.
Solution: Developed a multi-query expansion strategy in which the LLM rephrases each question to retrieve diverse perspectives, improving recall for comparative analysis.
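As a rough sketch of multi-query expansion, the retriever can be run once per rephrasing and the results merged. The `rephrase` and `retrieve` callbacks below stand in for the LLM call and the retriever; they, and the keep-the-best-score merge rule, are assumptions for illustration rather than the project's actual implementation.

```python
from typing import Callable, Dict, List, Tuple


def multi_query_retrieve(
    query: str,
    rephrase: Callable[[str], List[str]],  # hypothetical LLM rephrasing step
    retrieve: Callable[[str], List[Tuple[str, float]]],  # (chunk_id, score)
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Retrieve for the query and its rephrasings, then merge the results."""
    variants = [query] + rephrase(query)
    best: Dict[str, float] = {}
    for variant in variants:
        for chunk_id, score in retrieve(variant):
            # Keep each chunk once, with the best score any variant gave it.
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]
```

Deduplicating by chunk id keeps a passage that matches several rephrasings from crowding out passages from other papers, which is the case that matters for comparative questions.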
Methodology
Evaluation Framework
Evaluated system performance across three dimensions:
- Retrieval Quality: NDCG@5 and MRR, measuring the relevance of retrieved documents
- Generation Quality: ROUGE and BERTScore, comparing generated answers to gold references
- Factuality: Manual annotation of hallucination rates and citation accuracy
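For reference, the two retrieval metrics can be computed as in the sketch below. It assumes each query yields a list of graded relevance labels in ranked order, and it simplifies NDCG by building the ideal ranking from the retrieved list only (a full evaluation would use all judged documents for the query). The function names are illustrative.

```python
import math
from typing import List, Sequence


def dcg_at_k(rels: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(rels[:k]))


def ndcg_at_k(rels: Sequence[float], k: int = 5) -> float:
    """DCG normalized by the DCG of the best possible ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(rels_per_query: List[Sequence[float]]) -> float:
    """Average of 1/rank of the first relevant result for each query."""
    if not rels_per_query:
        return 0.0
    total = 0.0
    for rels in rels_per_query:
        for rank, rel in enumerate(rels, start=1):
            if rel > 0:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rels_per_query)
```

NDCG rewards placing the most relevant chunks near the top of the list, while MRR looks only at how early the first relevant chunk appears, so the two can disagree on which retriever is better.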
Dataset & Benchmarking
Tested on 50 research papers from arXiv (NLP domain) with 200+ curated questions. Compared against a keyword-search baseline and single-embedding retrieval.
Results & Impact
Retrieval Accuracy: +45% (hybrid search vs. semantic-only)
Hallucination Reduction: -68% (with confidence-based filtering)
User Satisfaction: 8.2/10 (from researcher feedback, n=15)
Key Findings
- Hybrid retrieval outperforms semantic-only and keyword-only search across all metrics
- Multi-query expansion improves cross-document reasoning by 31% for comparative questions
- Confidence-based filtering reduces hallucinations while maintaining answer quality
- Users prefer cited answers with source transparency over unsourced summaries
Future Work
- Multi-agent orchestration for automated literature synthesis and research report generation
- Structured extraction of claims, methodologies, and results for meta-analysis workflows
- Interactive visualization of citation networks and research gaps across document collections
- Integration with reference management systems (Zotero, Mendeley) for seamless workflow
- Fine-tuned retrievers for domain-specific papers (biomedical, physics, computer science)
- Hallucination evaluation framework with automated factuality scoring