Sanaa Igbokwe

Sanaa Igbokwe is a data analyst who illuminates insights and drives decision-making by unraveling complex datasets with precision and creativity. They know that data analysis is not just about numbers; it's about understanding human behavior and environmental justice.

View My LinkedIn Profile

Portfolio

Course Projects

COVID-19 Patient ICU and Immunosuppressed Risk Predictions - Spring 2023

In collaboration with two peers, I worked on a project spanning data wrangling, exploration, analysis, and machine learning in PySpark. Our objective was to build predictive models for patient ICU admission and immunosuppression risk using a large COVID-19 dataset provided by the Mexican government. My role was to apply and fine-tune regression techniques in PySpark, covering model development, training, and testing.

My logistic regression model achieved 91.3% accuracy for ICU admission predictions and 95.8% accuracy for identifying immunosuppressed individuals. Our presentation included prediction analysis, actionable recommendations, and explanations of key metrics such as precision, recall, and area under the ROC curve. One challenge was that intubation correlated only weakly with immunosuppressive disorders, so we shifted our focus away from intubation prediction.
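For readers unfamiliar with the metrics named above, here is a minimal sketch in plain Python of how accuracy, precision, recall, and area under the ROC curve can be computed. It is not the project's PySpark code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_accuracy(y_true, y_pred):
    """Return (precision, recall, accuracy) for binary predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example receives
    the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the project dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(precision_recall_accuracy(y_true, y_pred))  # (0.75, 0.75, 0.75)
print(roc_auc(y_true, scores))                    # 0.9375
```

Accuracy alone can look strong on imbalanced data, which is one reason to report precision, recall, and AUC alongside it.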

See Code on Google Colab

UK Gender Pay Gap Analysis - Fall 2022

In a group project, two other students and I cleaned three datasets (about 27,000 values) in Python, collected from a United Kingdom open-source database, and conducted extensive exploratory analysis. Using matplotlib and seaborn for visualization, we identified patterns in gender pay disparities across a wide range of UK companies and industries from 2018 to 2021.

After research and analysis, we produced a comprehensive report that meticulously documented the code, analysis methodologies, and actionable conclusions, providing a solid foundation for further research and policy-making initiatives.

See Final Report with Code and Visualizations.

2021 UK Gender Pay Gap - Fall 2022

Working independently, I explored and visualized data in R to tell a visual story about the 2021 gender pay disparity in the United Kingdom, using dplyr, ggplot2, and a geocoding API.

Data Visualization

KPOP Search Engine - Spring 2022

For the end-of-year Python project, students were tasked with producing a novel program in Jupyter Notebook using the programming skills covered since the beginning of the semester. The project had to be novel, run to around 300 lines of code, and use techniques such as APIs, functions, for loops, and widgets to create a program with a clear purpose and real functionality. To showcase what I had learned with Python and Jupyter Notebook, I built a program around a subject I was interested in: music. To add more novelty, I narrowed the focus to Korean pop (K-pop), with K-pop fans as the target audience.

See Full README & Code on GitHub