System for Enhancing Accuracy of Noisy Text using Deep Network Language Models.
Developed a system utilizing BART & MarianMT to enhance OCR-recognized text accuracy, achieving a 35% reduction in WER.
To me, AI is more than algorithms and models; it’s a force for augmenting human intelligence, automating the mundane, and unlocking new creative possibilities. Whether it’s fine-tuning LLMs, optimizing MLOps pipelines, or researching model generalizability, I’m constantly driven by the challenge of pushing AI past its perceived limits.
My journey spans LLMs, NLP, MLOps, and AI-driven automation, where I’ve built tools that enhance workflows, optimize decision-making, and unlock new possibilities. Whether it’s researching model generalizability, architecting MLOps pipelines, or competing in AI hackathons, I’m constantly exploring how technology can be more efficient, adaptable, and impactful.
But beyond AI’s ability to optimize and automate, I believe its true power lies in personalization. My dream? To harness AI’s potential to transform education through personalized narratives, making learning dynamic, adaptive, and as unique as the student engaging with it. The future of AI isn’t just about efficiency; it’s about making technology work for people in ways we haven’t yet imagined. I plan to build powerful AI-powered educational tools that empower students to rewrite the narrative of their own success.
Oct 2023-Present
Developed collaborative learning models to classify student utterances into three Community Agreements. Optimized 18 Mistral model variants (Mistral+SVM, zero-shot, few-shot, and LoRA fine-tuned) on human-coded and ASR data, with the Mistral+SVM models outperforming RoBERTa by 14%. Deployed an AI moderation agent in schools using AWS’s LLaMA 70B, improving collaborative discourse by 50% with integrated security measures.
Nov 2022-Aug 2023
Designed and implemented the SimuBridge backend in C# to emulate PLCs, optimizing data flow into DeviceBridge. Integrated SimuBridge with the OPC server and DeviceBridge, accelerating DeviceBridge stress testing by 50%.
Dec 2022-Apr 2023
Built a Python-based financial data reconciliation system POC to align unordered company and government GST datasets, using character-level similarity matching to detect invoice discrepancies. Designed an automated pipeline for GST matching, data reordering, and field reconciliation, generating structured Excel reports for accurate financial validation and reducing processing time by 90%.
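To illustrate the character-level matching idea, here is a minimal sketch, not the actual POC code. It uses difflib.SequenceMatcher from Python's standard library as the similarity measure. The field names invoice_no and amount, the 0.8 threshold, and the greedy pairing strategy are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] after light normalization."""
    return SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()


def match_invoices(company, government, threshold=0.8):
    """Greedily pair each company record with its most similar unused
    government record, then list the fields whose values disagree.

    `company` and `government` are lists of dicts; the "invoice_no" key
    is a hypothetical field name used as the matching key.
    """
    unused = list(range(len(government)))
    pairs, unmatched = [], []
    for rec in company:
        # Pick the unused government record with the highest similarity.
        best = max(
            unused,
            key=lambda j: similarity(rec["invoice_no"], government[j]["invoice_no"]),
            default=None,
        )
        if best is not None and similarity(
            rec["invoice_no"], government[best]["invoice_no"]
        ) >= threshold:
            unused.remove(best)
            gov = government[best]
            # Field reconciliation: report every non-key field that differs.
            diffs = [f for f in rec if f != "invoice_no" and rec[f] != gov.get(f)]
            pairs.append((rec, gov, diffs))
        else:
            unmatched.append(rec)
    return pairs, unmatched


if __name__ == "__main__":
    company = [
        {"invoice_no": "INV-001", "amount": 1000.0},
        {"invoice_no": "INV-002", "amount": 250.0},
    ]
    government = [
        {"invoice_no": "inv002", "amount": 250.0},
        {"invoice_no": "INV/001", "amount": 1100.0},
    ]
    for rec, gov, diffs in match_invoices(company, government)[0]:
        print(rec["invoice_no"], "<->", gov["invoice_no"], "mismatched:", diffs)
```

Greedy pairing keeps the sketch short; a production pipeline would more likely score all candidate pairs and resolve conflicts globally.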
Dec 2021-Jan 2022
Conducted hands-on research on noise reduction in textual signals by designing filters. Developed expertise in converting text into sequence signals using statistical measures and FFT to minimize textual noise. Explored filtering techniques in both time and frequency domains.
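A rough sketch of the text-as-signal idea follows, with one filter in each domain. It is illustrative only and not the research code. Using token length as the per-token statistic is an assumption, and the naive DFT stands in for a real FFT so the example needs only the standard library.

```python
import cmath


def text_to_signal(text):
    """Map each token to one number. Token length is an assumed
    stand-in for whatever statistical measure is actually used."""
    return [float(len(tok)) for tok in text.split()]


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (an FFT computes the same result faster)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def low_pass(signal, keep):
    """Frequency-domain filter: zero every bin whose frequency index
    exceeds `keep`, mirroring the mask so the output stays real."""
    n = len(signal)
    spectrum = dft(signal)
    filtered = [c if min(m, n - m) <= keep else 0 for m, c in enumerate(spectrum)]
    return [v.real for v in dft(filtered, inverse=True)]


def moving_average(signal, window=3):
    """Time-domain filter: centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


if __name__ == "__main__":
    signal = text_to_signal("the qu1ck brown f0x jumps over the lazy dog")
    print("raw:        ", signal)
    print("low-pass:   ", [round(v, 2) for v in low_pass(signal, keep=2)])
    print("moving avg: ", [round(v, 2) for v in moving_average(signal)])
```

Both filters smooth rapid fluctuations in the sequence. Which statistic and cutoff actually separate noise from content is an empirical question this sketch does not answer.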
Here are some of the projects I have worked on!
Developed a system utilizing BART & MarianMT to enhance OCR-recognized text accuracy, achieving a 35% reduction in WER.
Analyzed BART and MarianMT performance on synthetic vs. natural errors, identifying a break-even point at 26% synthetic error introduction.
Compared BART and MarianMT for grammatical vs. spelling error correction, with BART outperforming on spelling errors by 24.6%.
Investigated methods to improve collaborative discourse classification across diverse educational datasets.
Explored how AI-driven feedback can foster productive uncertainty in small group learning activities.
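For readers unfamiliar with the WER metric cited for the OCR correction project above, here is a minimal sketch of how word error rate and a relative reduction can be computed. The sample sentences are invented for illustration, and reading the reported figure as a relative (rather than absolute) reduction is an assumption.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error rate dropped relative to the baseline."""
    return (before - after) / before


if __name__ == "__main__":
    truth = "the quick brown fox jumps"
    ocr_output = "the qu1ck brown f0x jumps"   # made-up OCR noise
    corrected = "the quick brown f0x jumps"    # made-up model output
    before, after = wer(truth, ocr_output), wer(truth, corrected)
    print(f"WER before: {before:.2f}, after: {after:.2f}")
    print(f"relative reduction: {relative_reduction(before, after):.0%}")
```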
Here are some of my minor achievements as a content creator
Rohit.Raju@colorado.edu
1600 Amphitheatre Parkway
Mountain View, CA
94043 US