-- ABOUT ME --

I’m Shruthi Arunkumar, currently pursuing my Master’s in Data Science at the University of Pennsylvania (2024–2026). I’ve always been drawn to the challenge of finding meaning in complex data — whether it’s spotting hidden patterns, optimizing queries, or designing efficient databases.

My journey has taken me from working as a Data Analyst Intern, where I translated raw data into insights, to serving as a Graduate Teaching Assistant and Blockchain Bootcamp Instructor, where I discovered the joy of simplifying technical concepts and sharing them with others.

I’m especially fascinated by the intersection of data science, blockchain, and databases, and by how these fields can shape secure, intelligent, and scalable systems. At the heart of my work are a curiosity to explore, a drive to solve problems, and a belief that well-analyzed data can spark real-world impact.

-- EXPERIENCE --

Blockchain Bootcamp Instructor — My Ivy Global Academy (Mar–Apr 2025)
  • Led an introductory blockchain bootcamp for high school students, covering the fundamentals of blockchain technology, cryptocurrency, and basic smart contract concepts.
  • Developed beginner-friendly materials to simplify complex topics, ensuring accessibility for students with no prior experience in blockchain or coding.
  • Delivered live lessons and demonstrations on blockchain fundamentals, including how decentralized networks work and an overview of popular cryptocurrencies.
  • Conducted interactive Q&A sessions to address student questions, clarify concepts, and ensure understanding of core principles.
  • Created engaging hands-on activities to help students grasp basic coding skills and blockchain applications in a supportive, low-pressure environment.
Graduate Teaching Assistant — Wharton School, UPenn (Aug–Dec 2024)
  • Collaborated with Professor Donna Redel in designing and curating course content for LGST 6440, a graduate-level course on blockchain and cryptocurrencies.
  • Managed and organized course materials on Canvas, ensuring seamless access to lecture notes, readings, assignments, and other resources for students.
  • Coordinated guest lectures, liaising with industry professionals and academics to bring diverse perspectives to the course. Managed logistics and scheduling, and ensured technical support for virtual and in-person sessions.
  • Assisted in grading assignments, providing detailed feedback to students on blockchain-related projects, papers, and exams.
  • Facilitated interactive classroom activities, such as group discussions and case study analyses, to deepen students' understanding of blockchain use cases and emerging trends.
Data Analyst Intern — Sakaye Infotech Inc (Jan–Nov 2023)
  • Gathered and processed data from various sources, including spreadsheets, databases, and external datasets, ensuring accuracy, consistency, and completeness through thorough data cleaning.
  • Applied statistical techniques such as descriptive statistics, regression analysis, and hypothesis testing to analyze datasets, uncover trends, and make data-driven predictions using tools like Python (Pandas, NumPy), Power BI, and Excel.
  • Created visually compelling reports and dashboards, effectively summarizing complex data findings and presenting them to team members and stakeholders for informed decision-making.
  • Conducted data quality assessments to identify anomalies and outliers, using statistical methods such as variance analysis and outlier detection to improve data reliability and business processes.
  • Designed and implemented data-driven solutions to optimize key performance indicators (KPIs), improve operational efficiency, and enhance the company's analytical capabilities.

-- SKILLS --

Python, R, SQL, Baan IV, SAP ABAP, JavaScript, Pandas, NumPy, Matplotlib, Seaborn, Power BI, Scikit-learn, TensorFlow, XGBoost, Random Forest, Git, Jupyter, MS Excel, Google Workspace, Microsoft Suite

-- RESUME --

Download my full resume here.

-- PROJECTS --

Predicting Student Depression Using ML
  • Performed EDA and feature engineering on 27k+ survey records to identify key factors (academic, lifestyle, demographic) linked to depression.
  • Cleaned and preprocessed data (missing values, categorical encoding, normalization), and applied SMOTE to balance the dataset.
  • Built and optimized models (Logistic Regression, XGBoost, Random Forest, Neural Networks), achieving an AUC of 0.92.
  • Conducted correlation analysis to uncover relationships between academic pressure, work hours, and study satisfaction.
  • Applied PCA to reduce dimensionality and identify the most impactful features.
  • Evaluated model performance with ROC curves, confusion matrices, and F1-score to ensure balanced results.
Flight Price Prediction Using ML
  • Analyzed 1M+ flight records using EDA, PCA, visualizations, and hypothesis testing to uncover key price drivers (CO₂, aircraft type, duration, etc.).
  • Built and optimized predictive models including Linear Regression, CatBoost, LightGBM, Random Forest, and Neural Networks, achieving high accuracy in price prediction.
  • Performed data cleaning, feature engineering (temporal features, categorical encoding), and applied PCA for dimensionality reduction.
  • Applied bootstrapping and RandomizedSearchCV for robust hyperparameter tuning.
  • Evaluated models with RMSE, R², and feature importance; delivered insights to guide airline pricing decisions.
Digital Wallet Transaction Analytics
  • Processed and cleaned raw digital wallet transaction data using Python (pandas, NumPy) to ensure high data quality.
  • Analyzed customer spending patterns, transaction frequency, and preferred payment methods to uncover behavioral trends.
  • Evaluated merchant performance by defining KPIs, ranking top merchants, and identifying revenue-driving partners.
  • Built clear and insightful visualizations with Matplotlib to highlight key insights such as top categories and payment methods.
  • Delivered actionable business insights that support data-driven decisions in customer engagement and merchant strategy.
MedSync: Blockchain-Driven EHR on Cloud | Flask, Python, Microsoft Azure
  • Deployed a private blockchain on the cloud to securely manage patient records, ensuring data integrity and preventing unauthorized access.
  • Developed a Flask-based web application to provide a user-friendly interface for patients and healthcare providers to access medical records in real time.
  • Implemented smart contracts to automate permission management and track access logs for compliance with healthcare regulations.
  • Integrated cloud storage solutions on Microsoft Azure to enable scalable, reliable, and secure data storage.
Calorie Burnt Prediction | Python
  • Developed a calorie-burn estimator by cleaning, merging, and analyzing exercise datasets using Python libraries such as pandas and NumPy.
  • Created detailed visualizations with Matplotlib and Seaborn to highlight patterns in workout activity and calorie expenditure.
  • Presented findings in a clear and reproducible Jupyter Notebook, emphasizing transparency and effective storytelling with data.

-- PUBLICATIONS --

MedSync: Blockchain-Driven Electronic Health Record on Cloud
Published in Elsevier's book on Smart and Sustainable Health Technologies.

Developed a blockchain-based EHR system hosted on the cloud to ensure secure, private, and interoperable data exchange between healthcare providers and patients. Highlights include patient-controlled access, HIPAA compliance, and scalable cloud infrastructure.

Blockchain as the Backbone of a Connected Ecosystem of Smart Hospitals
Published as a book chapter with Wiley on applications of blockchain in healthcare.

Explores how blockchain enables secure data sharing, smart contracts, and patient-driven access control within smart hospitals. Demonstrates how decentralization and transparency improve efficiency, interoperability, and trust in healthcare networks.

AI Applications in Production
Co-authored chapter in CRC Press book on real-world industrial AI systems.

Presents how AI technologies such as digital twins, computer vision, robotics, and NLP are transforming modern manufacturing. Discusses real-time monitoring, predictive maintenance, and the rise of human-AI collaboration in smart factories.

-- CERTIFICATIONS --

Google Data Analytics, Python, SQL, PostgreSQL, SAP ABAP

-- CONTACT ME --

shru235@seas.upenn.edu

linkedin.com/in/shruthiarunkumar

github.com/Shruthi-Arun

Google Scholar

© 2025 Shruthi Arunkumar, All Rights Reserved | Designed by Shruthi Arunkumar