Mia S. Dagati

Data Scientist & Research Assistant
Michigan State University  ·  Expected Aug 2026

2 degrees in progress  ·  7 presentations  ·  4 years of research  ·  3 publications
Award

First Place — MSU Undergraduate Research & Arts Forum, Animal Science & Agriculture (2024)


I'm a dual-degree student at Michigan State University completing a B.S. in Statistics and a B.S. in Data Science (CMSE), expected August 2026. I build reproducible, policy-ready data systems at the intersection of applied statistics, data engineering, and environmental sustainability.

Over four years of research I've contributed to a $1.2M IoT-based irrigation optimization study in Michigan apple orchards, built a national-scale ETL and classification pipeline mapping embodied material stocks across utility-scale solar arrays, and published a peer-reviewed article in AgriEngineering. I've presented at ASABE, ISSST 2026, and multiple MSU forums, with an IEEE PVSC 2026 submission pending, and have served as an undergraduate learning assistant in MSU's statistics department.

I work primarily in R and Python, with experience in geospatial data integration, ETL pipelines, time-series analysis, and machine learning. I hold a CyberAmbassador certification from the NSF-funded CyberAmbassadors Program and am active in ASABE, the MSU AI Club, Women in Computing, and the MSU Powerlifting Team.


Personal Life & Hobbies

I lived in Ontario, Canada, until I was 11, when my family moved to Macomb County, Michigan, about 30 minutes north of Detroit, where I completed middle and high school. I'm the first in my family to go away to university. My mom commuted to a local college, and my dad never went to college at all, so leaving home was a big deal for all of us. I chose Michigan State because I wanted to stay close enough to come home on weekends, and because MSU's research opportunities were hard to pass up. Both reasons turned out to be right. My dad was born in Italy, as were both sets of my grandparents, so getting to visit has meant a lot to me personally. Most of what I love outside of research traces back to the same thing: being outside and caring about the world I'm a part of.

Hiking & Backpacking
Sharp Top Trail, Blue Ridge Parkway, VA
Zion National Park, UT
Watkins Glen State Park, NY
Letchworth State Park, NY
Cliffs of Moher, Ireland
Acadia National Park, ME

Getting outside is the thing I look forward to most. There is something about the beauty and vastness of nature that never gets old for me. Acadia is one place that has stayed with me: the fall foliage, the ocean, the quiet of it. The Grand Canyon and Zion are hard to top for sheer scale, but some of my favorite days have been on trails closer to home. I did a 20-mile backpacking trip on the North Country Trail near Manistee, Michigan, that I think about often. Ireland surprised me completely. Standing on the Cliffs of Moher felt like being at the edge of the world.

Grand Canyon  ·  Zion National Park  ·  Acadia National Park  ·  Cliffs of Moher, Ireland  ·  Sharp Top Trail, Blue Ridge Parkway  ·  Horseshoe Bend, AZ  ·  Letchworth State Park, NY  ·  Watkins Glen State Park, NY  ·  Eternal Flame Falls, NY  ·  NCT Backpack, Manistee MI (20 mi)  ·  Sleeping Bear Dunes, MI  ·  Hocking Hills, OH
Travel

Italy twice, France, Spain, Germany, Czech Republic, Ireland, and a handful of Caribbean islands. My dad was born in Italy and both sets of grandparents were born there, so those trips have been personal in a way that is hard to put into words. I collect old books wherever I go. My oldest is a French set from 1763, more than 260 years old and still intact.

Powerlifting

I have been lifting for six years and got into powerlifting specifically in my sophomore year of college. Competing with the MSU Powerlifting Team has helped me develop the discipline I carry into every other part of my life.

Antique book collecting
Pickleball
Sustainability advocate
Family oriented
First-generation university student

Work Experience

Dec 2024 – Present
Research Assistant IV
MSU Civil & Environmental Engineering
  • Designed and implemented a multi-source data integration pipeline in R, joining geospatial array metadata, manufacturer datasheets, and spatiotemporal records into a unified, reproducible analytical dataset using tidyverse, sf, and tigris workflows.
  • Developed and applied a rule-based classification framework to infer First Solar CdTe module series from installation year and geometric metadata, incorporating confidence scoring through cross-validation against Landsat-derived estimates.
  • Engineered a tiered module count estimation approach combining panel-level geometry, array-level geometry, and manufacturer specification sheet fallbacks to produce array-level material inventory estimates at national scale.
  • Prepared and delivered research findings to faculty, collaborators, and external stakeholders through written reports, presentations, and conference submissions including IEEE PVSC 2026 and ISSST 2026.
  • Compiled a first-of-its-kind per-module material intensity reference table for First Solar CdTe Series 2–7, aggregating CdTe mass, front/back glass thickness, aluminum frame, steel back rail, EVA encapsulant, and copper leadwire values from manufacturer spec sheets, life cycle inventory (LCI) reports, and First Solar user manuals. LCI per-m² values were converted to per-module using series-specific module areas and cross-validated against known datasheet specifications.
Outcomes
National solar material inventory  ·  Reproducible ETL pipeline  ·  ISSST 2026 conference paper  ·  Geospatial data integration  ·  Python / ETL  ·  Stakeholder communication
Aug 2025 – Dec 2025
Undergraduate Learning Assistant
MSU Dept. of Statistics & Probability
  • Assisted in leading formal classroom instruction twice weekly for a 60-student course, delivering student-centered lessons on R programming, reproducible coding workflows, and data analysis techniques in a high-expectation learning environment.
  • Held weekly office hours providing individual and small-group tutoring, offering personalized academic feedback and hands-on R coding support to help students work through assignments and build technical confidence.
  • Graded student homework submissions with a focus on code quality, reproducibility, and workflow structure, providing detailed written feedback to support iterative improvement and long-term learning.
  • Guided students through foundational machine learning implementation and statistical analysis using R and R Markdown, emphasizing clean, documented, and reproducible analytical pipelines.
Outcomes
Taught full semester course section  ·  Pedagogy & instruction  ·  R / R Markdown  ·  Technical communication
Jun 2024 – Aug 2024
Research Intern
MSU College of Engineering
  • Continued METER TEROS 12 IoT sensor data collection and time-series analysis across two MSU agricultural research farms over the summer, maintaining the sensor network and processing incoming soil moisture, temperature, and EC data.
  • Developed the Ubidots farmer-facing dashboard further, surfacing real-time irrigation recommendations based on ET calculations and soil moisture thresholds in a simple, accessible format requiring no technical interpretation from the farmer.
  • Managed day-to-day field operations and personnel scheduling across both research farm locations throughout the summer research period.
  • Presented research progress and findings to engineering faculty and project stakeholders, communicating technical results clearly to both technical and non-technical audiences.
Outcomes
Farmer-facing decision tool prototype  ·  ASABE presentation (July 2024)  ·  Web development  ·  IoT systems
Jan 2023 – Jan 2025
Research Assistant I–III
MSU Biosystems & Agricultural Engineering
  • Contributed to a two-year, $1.2M green agriculture research study evaluating irrigation methods and scheduling to minimize water consumption, prevent soil degradation, and optimize crop yield in Michigan apple orchards, with work directly aligned with sustainable land and resource management.
  • Designed and executed deployments of METER TEROS 12 IoT sensors across multiple MSU agricultural research farms, integrating time-series weather data to calculate crop evapotranspiration values and building a farmer-facing Ubidots dashboard delivering real-time soil moisture status and plain-language irrigation recommendations.
  • Conducted a controlled laboratory study estimating nitrate dynamics in sandy soil using METER TEROS 12 electrical conductivity sensors, establishing a statistically significant positive correlation between nitrate concentration and EC and demonstrating measurable downward nitrate transport through soil profiles using a custom lysimeter setup; findings were presented at the 2024 ASABE Annual International Meeting in Anaheim, CA.
  • Managed day-to-day operations across two MSU agricultural research farms, creating schedules and assigning tasks to keep project personnel and field operations running efficiently across the full duration of the study.
  • Authored technical reports and grant proposals; supervised and mentored undergraduate student researchers across multiple project phases, providing training in data collection, statistical analysis, and scientific communication.
Outcomes
Contributed to $1.2M study  ·  AgriEngineering publication (2025)  ·  Real-time farmer decision tool  ·  First Place, MSU URAF 2024  ·  IoT sensor deployment  ·  Time-series analysis  ·  Team leadership  ·  R / statistical modeling

Education

Michigan State University
East Lansing, MI  ·  Expected August 2026
B.S. Statistics, Minor in Data Science
B.S. Data Science, Computational Mathematics, Science & Engineering (CMSE)

Relevant Coursework: Probability & Statistics I–II, Bayesian Statistical Methods, Statistics for Biologists, Introduction to Data Science, Computational Modeling & Data Analysis I–II, Fundamentals of Data Science Methods, Matrix Algebra, Differential Equations, Multivariable Calculus

MSU Resident Scholarship (2022–2026)  ·  CyberAmbassador Certification (2024)  ·  Graduate Research Mentor Training  ·  College of Engineering Travel Grant (2026)  ·  CANR Undergraduate Travel Grant (2024)  ·  CANR Undergraduate Research Grant (2024)
Dakota High School
Macomb, MI  ·  Diploma, June 2022

AP Coursework: Calculus BC, Statistics, Biology, Environmental Science, Microeconomics, U.S. History, World History, English Literature

National Honor Society  ·  Rho Kappa National Social Studies Honor Society

Publications

Peer-Reviewed Journal Article
IoT-Enabled Soil Moisture and Conductivity Monitoring Under Controlled and Field Fertigation Systems
Dagati, M.S., et al.  ·  AgriEngineering, July 2025  ·  Third Author
Manuscript in Preparation  ·  First Author
Mapping the Spatial Distribution of Embodied Material Stock within Thin-Film CdTe Utility-Scale Solar Arrays in the United States
Dagati, M.S., et al.  ·  Target journal: Resources, Conservation and Recycling (RCR)
IEEE PVSC 2026 conference abstract
Conference Proceedings  ·  First Author
Estimating Nitrate Dynamics in Sandy Soil Using Electrical Conductivity Sensors
Dagati, M. and Dong, Y.  ·  2024 ASABE Annual International Meeting  ·  Anaheim, CA  ·  July 2024  ·  Paper No. 2400819
Software / Code Repository  ·  Contributor (In Progress)
GMSEUS — Geospatial Mapping of Solar Energy in the United States
GitHub  ·  stidjaco/GMSEUS  ·  Contributing code workflow for material stock mapping from thin-film CdTe utility-scale solar arrays
Dataset in Preparation  ·  First Author
Per-Module Material Intensity Reference Table for First Solar CdTe Series 2–7
Dagati, M.S., et al.  ·  Target repository: Figshare  ·  To be published upon manuscript submission to Resources, Conservation and Recycling (RCR)

Presentations

Conference Presentation
Mapping the Spatial Distribution of Embodied Material Stock within Thin-Film CdTe Utility-Scale Solar Arrays in the United States
ISSST 2026  ·  Rochester, NY  ·  June 2026
Conference Presentation  ·  Accepted
Mapping the Material Stock Spatial Distribution from Thin-Film CdTe Utility-Scale Photovoltaic Arrays in the United States
IEEE Conference  ·  2026  ·  Accepted; not attending
Conference Presentation
IoT-based Irrigation Optimization Research
ASABE Annual International Meeting  ·  Anaheim, CA  ·  July 2024
Photo: Presenting at the 2024 ASABE Annual International Meeting, Anaheim, CA
Symposium Presentation
Optimal Irrigation in Apple Orchards
Mid-Michigan Symposium for Undergraduate Research Experiences  ·  East Lansing, MI  ·  July 2024
Campus Presentation  ·  First Place — Animal Science & Agriculture
Irrigation Optimization Research
MSU Undergraduate Research & Arts Forum  ·  April 2024
Photo: Accepting the First Place Award with MSU President Kevin Guskiewicz, MSU Undergraduate Research & Arts Forum, 2024
Campus Presentation
Irrigation Optimization Research
MSU Undergraduate Research and Arts Forum  ·  April 2025
Campus Presentation
Green Energy Social Services Building, Costa Rica
MSU College of Engineering Design Day  ·  East Lansing, MI  ·  February 2023

Technical Skills

Programming

  • R (fluent)
  • Python
  • SQL
  • MATLAB
  • CAD

Data & Pipeline Tools

  • pandas, NumPy
  • tidyverse, ggplot2
  • Git / GitHub
  • Jupyter, R Markdown

Modeling & ML

  • OLS / Logistic Regression
  • LASSO, Ridge, GBM
  • XGBoost, Random Forest
  • SVM, GAMs

Data Engineering

  • ETL Pipeline Development
  • Feature Engineering
  • Time-Series Processing
  • Geospatial Data Integration

Soft Skills

Communication

  • Technical writing
  • Stakeholder presentations
  • Cross-functional communication
  • Scientific communication

Leadership & Mentorship

  • Undergraduate mentorship
  • Research team supervision
  • Instructional facilitation
  • Personnel management

Project Management

  • Multi-site field operations
  • Grant & proposal writing
  • Organizational planning
  • Long-horizon study design

Research Practice

  • Reproducible workflows
  • Data collection protocols
  • Peer review & publication
  • Conference presentation

Certifications

CyberAmbassador Certification
CyberAmbassadors Program  ·  NSF Award #1730137  ·  November 2024

A professional skills certification in Communication, Teamwork, and Leadership for STEM professionals, designed to help scientists and engineers work more effectively across disciplines and with non-technical audiences. Hosted at Michigan State University and funded by the National Science Foundation.

Graduate Research Mentor Training
Michigan State University  ·  November 22, 2024

A structured training program equipping graduate students and researchers with evidence-based mentoring practices for supervising undergraduate researchers. Developed by the National Research Mentor Network, CIMER, and the Tau Beta Pi Association.

SQL (Basic)
In Progress
HackerRank

An industry-recognized certification validating foundational SQL skills including querying, filtering, aggregation, and joins — administered through HackerRank's standardized assessment platform.

Research & Projects

Solar Material Inventory Mapping

Research

MSU Civil & Environmental Engineering  ·  Dec 2024 – Present

Developed a data integration and inference framework classifying utility-scale CdTe solar arrays by First Solar module series across the contiguous United States. Built a national-scale ETL pipeline joining GM-SEUS geospatial metadata, manufacturer datasheets, and spatiotemporal records to produce validated, policy-ready material inventory outputs with direct applications to solar recycling infrastructure and circular economy planning.
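A minimal sketch of how an array-level material estimate might be assembled, assuming a tiered module-count fallback (panel geometry first, then array geometry, then a spec-sheet default) multiplied by a per-module material intensity. The record fields, the ground coverage ratio, the capacity-based fallback, and every number here are hypothetical illustrations and are not taken from the actual pipeline.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ArrayRecord:
    """Hypothetical per-array record; field names are illustrative only."""
    panel_area_m2: Optional[float]  # total mapped panel area, if available
    array_area_m2: Optional[float]  # array footprint, if available
    capacity_w: Optional[float]     # nameplate DC capacity, if available


def estimate_module_count(
    rec: ArrayRecord,
    module_area_m2: float,
    module_power_w: float,
    ground_coverage_ratio: float = 0.4,  # assumed value, not from the source
) -> Tuple[Optional[int], str]:
    """Return (module_count, tier) from the first tier that has usable data."""
    if rec.panel_area_m2:
        return round(rec.panel_area_m2 / module_area_m2), "panel_geometry"
    if rec.array_area_m2:
        panel_area = rec.array_area_m2 * ground_coverage_ratio
        return round(panel_area / module_area_m2), "array_geometry"
    if rec.capacity_w:
        return round(rec.capacity_w / module_power_w), "spec_sheet"
    return None, "no_estimate"


def material_stock_kg(module_count: int, intensity_kg_per_module: float) -> float:
    """Array-level stock = module count x per-module material intensity."""
    return module_count * intensity_kg_per_module


# Toy record with only an array footprint, so the second tier is used.
example = ArrayRecord(panel_area_m2=None, array_area_m2=18_000.0, capacity_w=None)
count, tier = estimate_module_count(example, module_area_m2=0.72, module_power_w=100.0)
print(count, tier)
```

Returning the tier alongside the count keeps the provenance of each estimate visible, which is one way a confidence score could later be attached to each array.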

ETL Pipelines  ·  Geospatial Data  ·  Solar Energy  ·  R  ·  Python
Preliminary Results
Proportional geographic concentration of thin-film CdTe solar array materials by type across the contiguous U.S.

IoT-Enabled Irrigation Optimization

Research

MSU Biosystems & Agricultural Engineering & MSU College of Engineering  ·  Jan 2023 – Jan 2025

Contributed to a two-year, $1.2M study evaluating irrigation methods and scheduling to minimize water consumption, prevent soil degradation, and optimize crop yield in Michigan apple orchards. Designed and executed IoT sensor deployments across MSU research farms using METER TEROS 12 sensors, collecting soil moisture, temperature, and electrical conductivity readings every 5 minutes. Sensors were wired to a custom-designed, solar-powered breadboard; readings were logged locally on an Arduino Uno and streamed to Ubidots over local WiFi.

Integrated time-series weather data to calculate crop evapotranspiration (ET) values, which were used alongside soil moisture readings to determine whether irrigation was needed and how long to run it. This fed into a simple farmer-facing Ubidots widget that displayed a direct recommendation — "Yes, water for X minutes" or "No irrigation needed" — removing the need for any technical interpretation on the farmer's end. The dashboard also displayed real-time soil moisture status (on-target, above, or below field capacity) and EC values for fertilizer management.
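The decision logic described above could look roughly like the following sketch. It assumes a simple rule: irrigate only when soil moisture falls below a fraction of field capacity, and run long enough to replace the crop ET at a known application rate. The trigger fraction, the runtime formula, and all parameter names are assumptions made for illustration; the study's actual thresholds and calculations are not stated here.

```python
def irrigation_recommendation(
    soil_moisture_vwc: float,
    field_capacity_vwc: float,
    crop_et_mm: float,
    application_rate_mm_per_hr: float,
    trigger_fraction: float = 0.8,  # assumed threshold, not from the study
) -> str:
    """Turn sensor and ET inputs into a plain-language message."""
    if application_rate_mm_per_hr <= 0:
        raise ValueError("application rate must be positive")
    # Soil is wet enough: no action needed.
    if soil_moisture_vwc >= trigger_fraction * field_capacity_vwc:
        return "No irrigation needed"
    # Otherwise, run long enough to replace the water lost to ET.
    minutes = round(crop_et_mm / application_rate_mm_per_hr * 60)
    return f"Yes, water for {minutes} minutes"


# Toy inputs: volumetric water content as a fraction, ET in mm.
print(irrigation_recommendation(0.18, 0.30, crop_et_mm=5.0, application_rate_mm_per_hr=10.0))
print(irrigation_recommendation(0.28, 0.30, crop_et_mm=5.0, application_rate_mm_per_hr=10.0))
```

Keeping the output to a single sentence mirrors the goal of the widget: the farmer reads an instruction, not a sensor value.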

IoT sensor installation at MSU Northwest Michigan Horticultural Research Station, Traverse City, MI
METER TEROS 12  ·  IoT Sensors  ·  Time-Series Analysis  ·  Evapotranspiration  ·  Weather Data  ·  C++ (Arduino)  ·  Ubidots  ·  Decision Support  ·  Web Development

First Solar CdTe Per-Module Material Intensity Reference Table

Research

MSU Civil & Environmental Engineering  ·  Supporting work for Solar Material Inventory Mapping

Compiled a first-of-its-kind per-module material intensity reference table for First Solar CdTe Series 2 through 7, a dataset that had never been unified in this format before. Industry sources typically report material values per m², making direct per-module comparison across series difficult. This table aggregates module size, total weight, glass thickness (front and back), CdTe mass, aluminum frame, steel back rail, EVA encapsulant, copper leadwire, and other materials across all five series (S2, S3, S4, S6, and S7) using manufacturer spec sheets, life cycle inventory (LCI) reports scraped from DOE databases, and First Solar user manuals.

LCI per-m² values were converted to per-module using series-specific module areas from official datasheets, with derived glass thickness values cross-validated against known S4 specs (LCI-derived 3.24–3.38 mm vs. datasheet 3.2 mm). CdTe mass was calculated from film thickness, CdTe density (5,850 kg/m³), and module area using S2/S3 at 3.0 µm and S4/S6/S7 at 2.5 µm per First Solar internal communication cited in OSTI:2308831. Copper leadwire was derived from conductor cross-section, wire length, wire count, and copper density (8,960 kg/m³) from datasheet wiring specifications.
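The three conversions above reduce to short formulas, sketched here under stated assumptions. The densities and film thicknesses come from the text; the module area of 0.72 m² and the leadwire dimensions are placeholder inputs chosen only to make the example run, not values from the reference table.

```python
CDTE_DENSITY_KG_M3 = 5850.0    # from the text
COPPER_DENSITY_KG_M3 = 8960.0  # from the text


def per_module_from_lci(value_per_m2: float, module_area_m2: float) -> float:
    """Convert an LCI value reported per square metre to a per-module value."""
    return value_per_m2 * module_area_m2


def cdte_mass_g(film_thickness_um: float, module_area_m2: float) -> float:
    """CdTe mass (g) = film thickness x density x module area."""
    thickness_m = film_thickness_um * 1e-6
    return thickness_m * CDTE_DENSITY_KG_M3 * module_area_m2 * 1000.0


def copper_leadwire_mass_g(
    cross_section_mm2: float, length_m: float, wire_count: int
) -> float:
    """Copper mass (g) = cross-section x length x wire count x density."""
    cross_section_m2 = cross_section_mm2 * 1e-6
    volume_m3 = cross_section_m2 * length_m * wire_count
    return volume_m3 * COPPER_DENSITY_KG_M3 * 1000.0


# Placeholder inputs (assumptions): 0.72 m2 module area, 2.5 um film,
# two 0.6 m leadwires with a 2.5 mm2 conductor.
print(round(cdte_mass_g(2.5, 0.72), 2))                  # grams of CdTe
print(round(copper_leadwire_mass_g(2.5, 0.6, 2), 2))     # grams of copper
print(round(per_module_from_lci(8.0, 0.72), 2))          # per-module value
```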

Data Compilation  ·  LCI Analysis  ·  Material Science  ·  Solar PV  ·  Excel  ·  Circular Economy
Dataset not yet open source — will be published to Figshare upon manuscript submission.

Nitrate Dynamics Estimation in Sandy Soil Using EC Sensors

Research

MSU Biosystems & Agricultural Engineering  ·  2023–2024  ·  ASABE Annual International Meeting, Anaheim, CA

Designed and executed a controlled laboratory study to estimate nitrate movement and concentration in sandy soil using TEROS 12 electrical conductivity sensors. Built a custom lysimeter system using modified 5-gallon buckets with dual-depth sensor placements to track nitrate transport through soil profiles under controlled flush conditions. Established a statistically significant positive correlation between nitrate concentration and EC (0.0032 mS/cm per 1 mg/L-NO₃), and demonstrated that observed EC fluctuations were driven by nitrate movement rather than changes in soil moisture, providing a low-cost, farmer-accessible method for monitoring soil nitrate leaching in agricultural settings.
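As an illustration of how such a calibration could be used, the sketch below fits a straight line of EC against nitrate concentration by ordinary least squares and then inverts it to estimate nitrate from a new EC reading. The data points are synthetic, generated from the reported slope of 0.0032 mS/cm per mg/L plus an assumed baseline EC of 0.10 mS/cm; they are not measurements from the study.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nitrate_from_ec(ec_ms_cm: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to estimate nitrate (mg/L) from EC."""
    return (ec_ms_cm - intercept) / slope


# Synthetic calibration points (assumed baseline of 0.10 mS/cm, no noise).
nitrate_mg_l = [0.0, 25.0, 50.0, 100.0, 200.0]
ec_ms_cm = [0.10 + 0.0032 * c for c in nitrate_mg_l]

slope, intercept = fit_line(nitrate_mg_l, ec_ms_cm)
print(round(slope, 4), round(intercept, 2))
print(round(nitrate_from_ec(0.26, slope, intercept), 1))
```

With real sensor data the fit would carry noise, so a significance test and residual check would accompany the slope; this sketch only shows the calibration-and-inversion idea.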

IoT Sensors  ·  Excel  ·  Time-Series  ·  Soil Science  ·  Statistical Modeling  ·  Agricultural Engineering

IoT Household Appliance Energy Pipeline

Academic

Regression, ETL & Predictive Modeling

Built a reproducible ETL pipeline in R ingesting IoT time-series sensor data, engineering lag and temporal features, and outputting clean datasets ready for downstream analytics. Applied rolling-origin cross-validation to prevent data leakage across time windows and benchmarked KNN, OLS, LASSO, Ridge, GBM, SVM, and Random Forest models with HAC-robust inference and VIF-based feature selection.
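The project itself was written in R; the Python sketch below only illustrates the two ideas named above, lag features and rolling-origin cross-validation, using an expanding training window as one common variant. Function names, window sizes, and the toy series are assumptions for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def add_lags(
    series: Sequence[float], lags: Sequence[int]
) -> List[Tuple[List[float], float]]:
    """Build (lag_features, target) rows; early rows without full history are dropped."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        rows.append(([series[t - k] for k in lags], series[t]))
    return rows


def rolling_origin_splits(
    n_obs: int, initial_train: int, horizon: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs where the test block always follows the train block."""
    origin = initial_train
    while origin + horizon <= n_obs:
        yield list(range(origin)), list(range(origin, origin + horizon))
        origin += horizon  # move the forecast origin forward


rows = add_lags([1.0, 2.0, 3.0, 4.0, 5.0], lags=[1, 2])
splits = list(rolling_origin_splits(n_obs=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(len(train_idx), test_idx)
```

Because every test index is later than every training index, no future observation can leak into a model fit, which is the property that ordinary shuffled k-fold does not guarantee for time series.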

R  ·  ETL  ·  Time-Series  ·  Random Forest  ·  GBM  ·  HAC Inference

Titanic Survival Prediction

Academic

Binary Classification Modeling

Built a binary classification pipeline in R Markdown implementing logistic regression, KNN, decision trees, Random Forest, GBM, LASSO/Ridge, XGBoost, and GAMs. Performed stratified median imputation, one-hot encoding, and principled feature exclusion based on bias-variance trade-off reasoning; evaluated models via ROC/AUC and cross-validation.
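Stratified median imputation can be sketched as filling each missing value with the median of its own group rather than the overall median. The project was done in R Markdown; this Python version is illustrative only, and the choice of strata (passenger class here) and the toy ages are assumptions, since the source does not state which variables were used.

```python
from collections import defaultdict
from statistics import median
from typing import Hashable, List, Optional, Sequence


def stratified_median_impute(
    values: Sequence[Optional[float]], strata: Sequence[Hashable]
) -> List[float]:
    """Replace None with the median of observed values in the same stratum."""
    by_group = defaultdict(list)
    for v, g in zip(values, strata):
        if v is not None:
            by_group[g].append(v)
    # Fallback for strata with no observed values at all.
    overall = median(v for v in values if v is not None)
    medians = {g: median(vs) for g, vs in by_group.items()}
    return [
        v if v is not None else medians.get(g, overall)
        for v, g in zip(values, strata)
    ]


# Toy data: ages with gaps, grouped by a hypothetical passenger class.
ages = [22.0, None, 38.0, 40.0, None, 30.0]
classes = [3, 3, 1, 1, 1, 3]
print(stratified_median_impute(ages, classes))
```

In a cross-validated pipeline the group medians would be computed on the training fold only and then applied to the held-out fold, so the imputation does not leak test information.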

R Markdown  ·  XGBoost  ·  GAMs  ·  ROC/AUC  ·  Classification  ·  Feature Engineering

Neural Decoding of Face Identity from Macaque Brain Activity

Academic

Multiclass Classification & Neural Decoding  ·  November 2025  ·  with Lindsey Myers

Investigated whether face identity could be decoded from neural spike activity recorded in the anterior medial (AM) face patch of macaque monkeys using the Freiwald-Tsao dataset. Preprocessed 2,685 trials × 400 ms of spike rasters into population-level spike count feature matrices, then trained and compared multinomial logistic regression and random forest classifiers via stratified 5-fold cross-validation. Logistic regression achieved 81% CV accuracy and 77.5% test accuracy, demonstrating that face identity is highly linearly decodable from AM population activity. Confusion matrices revealed which identities were most frequently misclassified, providing insight into the representational geometry of face space in the AM region.
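Two of the preprocessing ideas above can be sketched without any libraries: collapsing a spike raster into per-neuron spike counts, and assigning trials to folds so that each fold keeps roughly the same class balance. The project used scikit-learn; the plain-Python version below is a simplified stand-in (no shuffling), and the raster layout and toy labels are assumptions for illustration.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def spike_counts(raster: Sequence[Sequence[Sequence[int]]]) -> List[List[int]]:
    """Sum 0/1 spike bins over time; raster is indexed [trial][neuron][time_bin]."""
    return [[sum(bins) for bins in trial] for trial in raster]


def stratified_folds(labels: Sequence[Hashable], k: int) -> List[int]:
    """Assign each trial a fold id, dealing trials of each class round-robin."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    fold_of = [0] * len(labels)
    for indices in by_class.values():
        for position, i in enumerate(indices):
            fold_of[i] = position % k
    return fold_of


# Toy raster: one trial, two neurons, three time bins.
features = spike_counts([[[0, 1, 1], [0, 0, 0]]])
# Toy labels: 10 trials of identity "a" and 5 of identity "b".
labels = ["a"] * 10 + ["b"] * 5
folds = stratified_folds(labels, k=5)
print(features, folds)
```

Each of the five folds here receives two "a" trials and one "b" trial, so held-out accuracy is not distorted by a fold that happens to miss an identity.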

Python  ·  scikit-learn  ·  Neural Data  ·  Logistic Regression  ·  Random Forest  ·  Stratified CV  ·  Neuroscience

Urban Delivery Route Optimization — Lansing, MI

Academic

Graph Algorithms & Computational Modeling  ·  Group Project

Modeled the city of Lansing, MI as a real street graph using osmnx and networkx to solve an urban last-mile delivery problem. Implemented a greedy nearest-neighbor algorithm traversing delivery nodes starting from a post office origin, incorporating real road travel times and edge speeds for realistic routing. Results demonstrated meaningful reductions in total travel time versus random traversal order, validating the greedy approach for real-time logistics applications where optimal solutions are computationally infeasible.
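The greedy nearest-neighbor step can be sketched independently of the street graph. In the project, travel times came from an osmnx/networkx road network; here they are replaced by a small hand-written lookup table, and all stop names and times are made up for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def greedy_route(
    travel_time: Dict[Hashable, Dict[Hashable, float]],
    origin: Hashable,
    stops: Iterable[Hashable],
) -> Tuple[List[Hashable], float]:
    """Visit stops by always moving to the closest unvisited one."""
    route = [origin]
    remaining = set(stops)
    total = 0.0
    current = origin
    while remaining:
        # Ties are broken by name so the result is deterministic.
        nxt = min(remaining, key=lambda s: (travel_time[current][s], str(s)))
        total += travel_time[current][nxt]
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return route, total


# Hypothetical travel times in minutes between a post office and three stops.
times = {
    "post_office": {"A": 4.0, "B": 2.0, "C": 7.0},
    "A": {"post_office": 4.0, "B": 3.0, "C": 5.0},
    "B": {"post_office": 2.0, "A": 3.0, "C": 6.0},
    "C": {"post_office": 7.0, "A": 5.0, "B": 6.0},
}
route, total = greedy_route(times, "post_office", ["A", "B", "C"])
print(route, total)
```

The heuristic runs in quadratic time in the number of stops and gives no optimality guarantee, which is the trade-off that makes it usable when an exact tour is too expensive to compute.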

Python  ·  osmnx  ·  networkx  ·  Graph Algorithms  ·  Greedy Algorithm  ·  Route Optimization

Contact

Let's connect.

Open to research collaborations, data science opportunities, and academic discussions.