Processed 30 GB of Stack Overflow posts using PySpark NLP and Spark ML. Achieved 95%+ accuracy across 50 logistic regression models to uncover user trends and engagement patterns.
Performed NLP on 200K+ AI-related articles from Goldman Sachs data to uncover AI impact across industries, tracking trends, risks, and adoption success stories.
Applied supervised and unsupervised ML (e.g., Isolation Forest) to detect malicious vs. benign network traffic and identify anomalous behavior in cybersecurity data.
Developed a job-resume matching engine using factorization-based recommendation models. Matches top candidates to jobs and vice versa with high accuracy.
Web app that lets users perform machine learning tasks like preprocessing, classification, regression, and clustering without writing code.
Users upload images of medications (pills/strips/syrups) and get back detailed medical info, powered by OCR and search APIs.
AI-driven legal chatbot with multilingual support. Uses RAG and ChromaDB to answer questions about Indian law in plain language.
IEEE-published research paper on ML models for detecting cyberbullying on Twitter and on how tweet engagement metrics amplify it.
Web tool to upload PDFs/CSVs and get instant answers using a document QA pipeline. Built with Streamlit and NLP libraries.