About Me

Greetings! I am a Master’s student in Data Science at UC San Diego, passionate about building interpretable, reliable, and impactful AI systems. My work spans machine learning, symbolic regression, and large-scale data engineering, applied in both scientific research and industry settings.

I currently hold multiple roles that reflect this balance:

  • 🌎 Graduate Researcher, Climate Analytics Lab (UCSD) – working with Prof. Duncan Watson-Parris on interpretable machine learning for climate science. I design symbolic regression pipelines (PySR, SINDy, KAN) to uncover physically meaningful relationships from satellite climate data, benchmarking against deep learning approaches and hybrid neural-symbolic architectures.
  • 🤖 Specialist, Human Frontier Collective at Scale AI – collaborating with the Scale AI research division to stress-test and refine frontier generative AI systems, contributing to model diagnostics, optimization, and responsible AI evaluation.
  • 🎓 Teaching Assistant, UCSD (DSC200 – Data Science Programming) – supporting instruction and mentoring graduate students in programming best practices, data structures, and applied problem-solving.

Research & Publications

  • Accepted paper (2025): BirdCLEF Acoustic Biodiversity Challenge – scalable one-vs-rest detection pipeline for avian bioacoustics (to appear in CLEF 2025 proceedings).
  • Other publications:
    • Leveraging Data Analytics in Azure for Effective Churn Management (1st Author)
    • A Comprehensive Examination of Toxic Tweet Classification on Twitter (1st Author)
    • Analysis of Traditional & Deep Learning Architectures in NLP
    • Evolution of DDoS Detection: Traditional vs. Modern ML Models

Education

University of California San Diego (Sept 2024 – Mar 2026)
M.S. Data Science, GPA: 3.9/4.0

Indian Institute of Technology Madras (May 2023 – Apr 2024)
Diploma in Data Science, GPA: 4.0/4.0

VIT University, Chennai (Sept 2020 – Jul 2024)
B.Tech. Computer Science & Engineering, GPA: 3.96/4.0


Selected Projects

  • BirdCLEF+ 2025 – Developed per-species XGBoost classifiers using spectrogram statistics and metadata; achieved >0.90 AUC across species.
  • Symbolic Regression for Cloud-Aerosol Interactions – Benchmarked interpretable models vs. deep neural nets, balancing R² performance with explainability.
  • NeuroFraudGAN – Synthesized financial transaction data with GANs, reducing class imbalance by 40% and achieving 96% fraud detection accuracy.
  • Data Ambiguity Quantification with Optimal Transport – Built a framework to rigorously bound generalization loss, improving sample prioritization and model robustness.

Technical Strengths

  • Programming: Python, R, C/C++, SQL
  • Machine Learning & AI: PyTorch, TensorFlow, PySR, KAN, Scikit-learn, Hugging Face
  • Data Engineering & Cloud: Azure, AWS, Databricks, Spark, Kafka, MongoDB
  • Visualization & Analysis: Tableau, Power BI, Matplotlib, Statistical Inference

I’m motivated by challenges at the intersection of AI, science, and data engineering. Whether advancing climate science with interpretable ML or shaping the future of frontier generative AI, I aim to bridge research and application with rigor and creativity.