Michelle Liu

1122 University Ave, Berkeley, CA 94704 · (858) 900-4846 · michelleliu1027@gmail.com

• Experienced in data ETL, including data extraction, cleaning, transformation, and visualization;
• Proficient in Python and SQL;
• Completed data-related internships at Mango TV and ByteDance;
• Former Data Science Tutor @ UC San Diego with clear logic and strong debugging ability;
• Bachelor's degree in Data Science @ UC San Diego;
• Currently pursuing an M.Eng. degree in Industrial Engineering and Operations Research @ UC Berkeley



Education

University of California, Berkeley

Master of Engineering
Major: Industrial Engineering & Operations Research
Honor: I'm still working on that...

GPA: I'm still working on that...

08/2021 - Present

University of California, San Diego

Bachelor of Science
Major: Data Science
Honor: Cum Laude

GPA: 3.864/4.000

09/2017 - 06/2021

Experience

Data Science Tutor

Halıcıoğlu Data Science Institute, UC San Diego

• Held weekly office hours to help students debug their code and better understand class concepts;
• Created Python assignments using Jupyter Notebook and Git;
• Tutored Python basics (NumPy, Pandas, Regex...), Machine Learning concepts, Data Visualization, HTML, etc.;
• Assisted professors in creating exam questions and in proctoring and grading exams;

09/2019 - 03/2021

Live Stream Strategy and Operation Intern

ByteDance Ltd.

• Assisted mentors in evaluating strategic feasibility by exporting historical streamer-related data using both SQL and the ByteDance Data Platform, and performing EDA in Python.
• Monitored dashboard for data abnormalities;
• Collaborated with the R&D department to identify streamers whose revenue had declined over the past week, and surveyed those streamers to understand their needs; ran a word-frequency analysis on the survey responses using the Python Jieba package to find the most frequent words;
• Followed up on ongoing A/B tests of different strategies, and communicated with the R&D department on updating and deploying strategy experiments.

07/2020 - 08/2020

Data Analyst Intern

Mango TV

• Assisted the Operator Network Center in analyzing behavioral data of users who canceled their memberships in June; communicated frequently with different teams to define metrics that would contribute to the analysis;
• Exported data using SQL, cleaned messy data and explored anomalies through EDA in Python, and trained an unsupervised clustering model (KMeans) with the scikit-learn package to discover latent features.
• Presented weekly and monthly reports on subscription information, visualized with Tableau and PowerPoint; created report templates for future work;
• The analysis provided data support for the operations department in developing member-retention strategies;

07/2019 - 09/2019

Projects

Battery Lifetime Prediction

• This was the capstone project for my undergraduate program, done in collaboration with Intel; it investigated the features that affect the estimated remaining battery time of personal computers, and built a Gradient Boosting Regressor to predict a battery's remaining minutes.
• Collected data using the Microsoft API in C scripts; preprocessed raw data and performed EDA using Pandas;
• Assisted partners with prediction model training and hypothesis testing;
• Created a visual presentation page using JavaScript, CSS, and HTML;

09/2020 - 03/2021

Reconstructing Superheroes based on Frida Kahlo’s art

• This project generated superheroes in Frida Kahlo’s art style, based on a DCGAN model;
• Scraped Frida Kahlo’s art from WikiArt using BeautifulSoup and Requests in a Python script.

04/2020 - 05/2020

2015 U.S. Flight Delay Forecast Project

• Collected information on U.S. flights in 2015, and found through feature engineering that flight delays were related to location, time, and airline; built a decision-tree-based model to predict flight delays from these features, achieving an accuracy of 62%.
• Selected as the outstanding project representative of the semester.

11/2019 - 12/2019

Interactive Data Visualization

• Designed a dashboard that lets users interact with data through real-time mouse movements.
• Built the visualizations with the Highcharts library, cleaned the original data and designed the data structure in JavaScript, laid out the web page with HTML, and styled it with CSS.

10/2019 - 11/2019

Facial Recognition and Emotion Detection System

• Developed a data uploader in Python that transforms named pictures into a table with columns (name and extracted features) for the later training stage of the project.
• Extracted facial landmark features with the Dlib library from a sampled dataset of emotion-labeled pictures, and applied supervised learning algorithms to those landmarks to train an emotion detection model, achieving a highest accuracy of 89.43% with logistic regression.

03/2019 - 05/2019

Interests

Apart from being a nerdy programmer, I am also a CrossFit enthusiast. My goal for this year is to earn the CrossFit Level 1 Certificate!