
How Can a Wellness Company Play it Smart?

  • Writer: sherry salek
  • Jul 21, 2022
  • 5 min read
Google Data Analytics Capstone: Case Study 2
Shadi Salek
2022-04-16

The Scenario

Bellabeat is a high-tech manufacturer of health-focused products for women. Urška Sršen and Sando Mur founded Bellabeat in 2013. Urška Sršen believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. I have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights I discover will then help guide marketing strategy for the company. I will present my analysis to the Bellabeat executive team along with my high-level recommendations for Bellabeat’s marketing strategy.

Step 1. Ask

Define the problem

Key stakeholders

  • Urška Sršen: Bellabeat’s co-founder and Chief Creative Officer

  • Sando Mur: Mathematician and Bellabeat’s co-founder; key member of the Bellabeat executive team

What do my stakeholders say their problem is?

Bellabeat is a small company, but it has the potential to become a larger player in the global smart device market. Urška believes that analyzing smart device fitness data could help unlock new growth opportunities for the company.

How can I help the stakeholders to resolve their question? (Business Task)

Analyze smart device usage data to gain insight into how people are already using their smart devices. Then use this information to make high-level recommendations for how these trends can inform Bellabeat's marketing strategy.

Business Objectives
  • What are some trends in smart device usage?

  • How could these trends apply to Bellabeat customers?

  • How could these trends help influence Bellabeat marketing strategy?

Step 2. Prepare

Data source
  • Fitbit Fitness Tracker Data (CC0: Public Domain, meaning the creator has waived their rights to the work under copyright law; made available through Mobius): This dataset was generated by respondents to a survey distributed via Amazon Mechanical Turk between 03.12.2016 and 05.12.2016.

  • Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. (The dataset description says thirty; the daily activity file reviewed in Step 3 contains 33 distinct Ids.)

Data format
  • Because the data was gathered by other people and lives outside the company, it is secondary, external data.

  • The dataset comprises 18 files in total, in .csv format, organized in long format.

  • The data is quantitative, because it consists of specific, objective numerical measures.

  • The data is organized in rows and columns, so it is structured data.

  • The data is long data: each row is one time point per subject, so each subject has data in multiple rows.

ROCCC Data
  • Reliable: The data source is not reliable (Amazon Mechanical Turk)

  • Original: The data is not original (Not Primary data)

  • Comprehensive: The data has some critical information

  • Current: The data was collected between 03.12.2016 and 05.12.2016, so it is not current.

  • Cited: The data is not primary so it may not be credible.

Data Ethics

The dataset keeps subjects’ information and activity private. Thirty-three eligible Fitbit users consented to the submission of personal tracker data.

Data limitations
  • Millions of people use Fitbit to track their health, so a sample of thirty-three Fitbit users is not representative of the population and introduces sampling bias.

  • The 31-day time frame is short, so not enough data has been collected.

  • The data is not current; it dates from 2016.

Step 3. Process

The data I’ll be working with is as follows:

  • dailyActivity_merged.csv

  • dailyCalories_merged.csv

  • sleepDay_merged.csv

  • weightLogInfo_merged.csv

  • dailyIntensities_merged.csv

Setting up my R environment

Setting up my R environment by loading the following files and using a naming convention:

Installing and loading the tidyverse, skimr, janitor, lubridate, dplyr, and sqldf packages:
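For readers without an R setup, the loading step can be sketched in plain Python using only the standard library (a swapped-in language; the original analysis used the R packages listed above). The `load_csv` helper and the inline sample rows are hypothetical stand-ins, and the column names other than `Id` and `ActivityDate` are assumptions about the file layout.

```python
import csv
import io


def load_csv(source):
    """Read CSV rows into a list of dicts; accepts a file path or a text stream."""
    if isinstance(source, str):
        with open(source, newline="") as handle:
            return list(csv.DictReader(handle))
    return list(csv.DictReader(source))


# Made-up rows standing in for dailyActivity_merged.csv (not real data).
sample_file = io.StringIO(
    "Id,ActivityDate,TotalSteps,Calories\n"
    "1,4/12/2016,9000,2100\n"
    "1,4/13/2016,4000,1800\n"
    "2,4/12/2016,0,1500\n"
)

# Naming convention: one snake_case variable per source file.
daily_activity = load_csv(sample_file)
print(len(daily_activity), "rows with columns", list(daily_activity[0]))
```

With the real files, the same helper would be called with a path such as `load_csv("dailyActivity_merged.csv")`. Every value is read as a string, so numeric columns need converting before analysis.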

Data Review

The daily_activity file has most of the tracker-recorded data, such as calories, intensities, and steps information.

The review shows that we have 940 observations and 15 variables, and that 33 different people logged their daily activities, calorie expenditure, and steps over 31 days.

daily_calories

The daily_calories file has the same information as the daily_activity file, with the same number of observations.

daily_intensities

The daily_intensities file has the same information as the daily_activity file, with the same number of observations.

weight_log

The weight_log file has fewer observations (67), and there is a Boolean field (Fat).

Observation
  • All the files have Id as a common field.

  • The daily_activity table appears to have the same observations and values as the calories and intensities tables, so we should confirm that the values actually match for each ‘Id’ number, which can be our primary key. First we make a temporary table from daily_activity with the same columns as daily_calories. Let’s check it out:
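One way to run this kind of check is to project the wide table onto the narrow table's columns and compare the two sets of records. The sketch below does that in plain Python rather than the R and SQL used in the original; `project` and `tables_match` are hypothetical helper names, the rows are made up, and the shared column names are assumptions.

```python
def project(rows, fields):
    """Return the chosen fields of every row as a sorted list of tuples."""
    return sorted(tuple(row[field] for field in fields) for row in rows)


def tables_match(wide_rows, narrow_rows, fields):
    """True when both tables hold exactly the same records for `fields`."""
    return project(wide_rows, fields) == project(narrow_rows, fields)


# Made-up rows for illustration only (not real data).
daily_activity = [
    {"Id": "1", "ActivityDate": "4/12/2016", "TotalSteps": "9000", "Calories": "2100"},
    {"Id": "2", "ActivityDate": "4/12/2016", "TotalSteps": "0", "Calories": "1500"},
]
daily_calories = [
    {"Id": "2", "ActivityDate": "4/12/2016", "Calories": "1500"},
    {"Id": "1", "ActivityDate": "4/12/2016", "Calories": "2100"},
]

shared = ["Id", "ActivityDate", "Calories"]
print(tables_match(daily_activity, daily_calories, shared))
```

Sorting before comparing makes the check independent of row order while still counting repeated rows, so a table with an extra duplicate would not pass.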

Now we are sure the daily_calories table is the same as the daily_activity. We do the same with the daily_intensities.

Now we are sure the daily_activity table is the same as the daily_intensities. We can now work on daily_activity, sleep_day and weight_log.

Now we want to check the Id column in all three tables:

Then we check the number of observations in each table:
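Both checks amount to counting distinct Ids and counting rows per table. A minimal Python sketch follows (the original used R); `profile` is a hypothetical helper and the rows are made up, so the printed counts are not the real ones.

```python
def profile(tables, id_field="Id"):
    """Map each table name to (distinct Ids, number of observations)."""
    return {
        name: (len({row[id_field] for row in rows}), len(rows))
        for name, rows in tables.items()
    }


# Made-up rows for illustration only (not real data).
tables = {
    "daily_activity": [{"Id": "1"}, {"Id": "1"}, {"Id": "2"}, {"Id": "3"}],
    "sleep_day": [{"Id": "1"}, {"Id": "2"}],
    "weight_log": [{"Id": "2"}],
}

for name, (participants, observations) in profile(tables).items():
    print(f"{name}: {participants} participants, {observations} observations")
```

A table with far fewer distinct Ids than the others, as weight_log appears to have here, is a sign that conclusions drawn from it apply to only part of the sample.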

Looking at the daily_activity_csv file and using a filter, I noticed a lot of missing data. There are many nulls and missing pieces of information altogether.

After filtering, I found that on some days there is no record of steps, activity, or even calories, yet sedentary time is recorded as 1440 minutes. After searching Google, I found the Fitbit community center and learned that sedentary time is only calculated on days when the tracker is worn. Sedentary/active time is calculated from movement, and you need to be inactive for 10 consecutive minutes before the period is considered stationary. There is a setting on some Fitbit trackers that logs a day on which the device is not worn as 100%, or 1440 minutes, of sedentary time. Since this is a case study and I do not have contact with stakeholders, I decided to move on with the data.
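Days like these can at least be flagged so their effect on later results is visible. The Python sketch below (the original used R) marks a day as probably not worn when it has zero steps and a full 1440 sedentary minutes. The rule itself, the helper name `likely_not_worn`, the column names `TotalSteps` and `SedentaryMinutes`, and the sample rows are all assumptions for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1440 minutes in a full day


def likely_not_worn(row):
    """Flag a day with zero steps and a full day of sedentary minutes."""
    return (
        int(row["TotalSteps"]) == 0
        and int(row["SedentaryMinutes"]) == MINUTES_PER_DAY
    )


# Made-up rows for illustration only (not real data).
rows = [
    {"Id": "1", "TotalSteps": "9000", "SedentaryMinutes": "700"},
    {"Id": "1", "TotalSteps": "0", "SedentaryMinutes": "1440"},
    {"Id": "2", "TotalSteps": "0", "SedentaryMinutes": "1100"},
]

flagged = [row for row in rows if likely_not_worn(row)]
print(f"{len(flagged)} of {len(rows)} days look like non-wear days")
```

Flagging instead of deleting keeps the decision reversible: the analysis can be run with and without the flagged days to see how much they move the results.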

The ActivityDate column in daily_activity is stored as character, so we change it to the date format.

We do the same with the SleepDay column in sleep_day and Date column in weight_log:
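The conversion can be sketched in Python as follows (the original used R). The two input formats, a plain `month/day/year` string and the same with a 12-hour timestamp, are assumptions about how the files store dates, and `parse_date` is a hypothetical helper.

```python
from datetime import date, datetime

# Assumed layouts: "4/12/2016 12:00:00 AM" and "4/12/2016".
FORMATS = ("%m/%d/%Y %I:%M:%S %p", "%m/%d/%Y")


def parse_date(text):
    """Convert a character date (with or without a timestamp) to a date."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # try the next known layout
    raise ValueError(f"Unrecognized date string: {text!r}")


print(parse_date("4/12/2016"))
print(parse_date("4/12/2016 12:00:00 AM"))
```

Reducing all three tables to a plain date gives them a common key for later joins; raising on an unknown layout means a bad value fails loudly instead of silently becoming a missing value.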

Renaming the data columns:

Organizing and grouping the data:

Checking for duplicates:
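A duplicate check counts rows that are identical in every column and then keeps only the first copy of each. The Python sketch below illustrates the idea (the original used R); the helper names and the sample rows are hypothetical.

```python
from collections import Counter


def row_key(row):
    """Order-independent, hashable fingerprint of a row."""
    return tuple(sorted(row.items()))


def count_duplicates(rows):
    """Number of rows that repeat an earlier, identical row."""
    counts = Counter(row_key(row) for row in rows)
    return sum(n - 1 for n in counts.values())


def drop_duplicates(rows):
    """Keep the first occurrence of each distinct row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = row_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up rows for illustration only (not real data).
sleep_day = [
    {"Id": "1", "Date": "2016-04-12", "TotalMinutesAsleep": "420"},
    {"Id": "1", "Date": "2016-04-12", "TotalMinutesAsleep": "420"},
    {"Id": "2", "Date": "2016-04-12", "TotalMinutesAsleep": "380"},
]

print(count_duplicates(sleep_day), "duplicate row(s)")
sleep_day = drop_duplicates(sleep_day)
```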

Step 4. Analyze

Identifying trends and relationships:

For daily sleep:

For weight log:
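The per-table summaries in this step boil down to a few descriptive statistics per column. A minimal Python sketch is shown here (the original used R); `summarize` is a hypothetical helper and the minutes-asleep values are made up.

```python
import statistics


def summarize(values):
    """Minimum, median, mean, and maximum of a numeric column."""
    ordered = sorted(values)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "max": ordered[-1],
    }


# Made-up minutes asleep for four nights (not real data).
minutes_asleep = [300, 420, 450, 480]
summary = summarize(minutes_asleep)
print(summary)
```

Comparing the mean with the median is a quick check for skew: a mean well below the median, as in this toy column, points to a few unusually short nights pulling the average down.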

Step 5. Data Visualization and Share

Negative relationship between total steps and sedentary minutes:

Calories generally trend positively with total steps:

We can say that physical activity, such as walking, is important for burning calories.


We can see a linear trend between the amount of time slept and the total time someone spends in bed:

This suggests that by tracking the time we’re inactive, the devices can record when you fall asleep at night and when you stir in the morning.


Now we merge sleep day and daily activity together:
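A merge like this is an inner join on participant and day. The Python sketch below illustrates it (the original used R); it assumes both tables already share `Id` and `Date` columns after the renaming step, and the helper name and sample rows are hypothetical.

```python
def merge_on_id_and_date(activity_rows, sleep_rows):
    """Inner join: keep activity rows that have a sleep record for the same Id and Date."""
    sleep_index = {(row["Id"], row["Date"]): row for row in sleep_rows}
    merged = []
    for row in activity_rows:
        match = sleep_index.get((row["Id"], row["Date"]))
        if match is not None:
            merged.append({**row, **match})
    return merged


# Made-up rows for illustration only (not real data).
daily_activity = [
    {"Id": "1", "Date": "2016-04-12", "SedentaryMinutes": 700},
    {"Id": "1", "Date": "2016-04-13", "SedentaryMinutes": 800},
    {"Id": "2", "Date": "2016-04-12", "SedentaryMinutes": 1440},
]
sleep_day = [
    {"Id": "1", "Date": "2016-04-12", "TotalTimeInBed": 450},
    {"Id": "2", "Date": "2016-04-12", "TotalTimeInBed": 400},
]

activity_sleep = merge_on_id_and_date(daily_activity, sleep_day)
print(len(activity_sleep), "matched participant-days")
```

An inner join silently drops days without a sleep record, so the merged table can be much smaller than daily_activity. The index keeps one sleep row per key, which is why duplicates should be removed before merging.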

Now we plot sedentary time and time in bed together:

It looks like sedentary time and total time in bed are not related much at all.

There’s a strong positive correlation between very active minutes and calories burned.
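The direction and strength of the relationships described in this section can be put into numbers with Pearson's correlation coefficient, which ranges from -1 (perfect negative) to +1 (perfect positive). The Python sketch below computes it by hand (the original analysis relied on R plots); `pearson_r` is a hypothetical helper and all values are made up, so the printed coefficients say nothing about the real data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(spread_x * spread_y)


# Made-up daily values for illustration only (not real data).
total_steps = [2000, 5000, 8000, 12000]
sedentary_minutes = [1200, 1000, 900, 700]
very_active_minutes = [0, 10, 30, 60]
calories = [1500, 1900, 2300, 3000]

print("steps vs sedentary:", round(pearson_r(total_steps, sedentary_minutes), 2))
print("very active vs calories:", round(pearson_r(very_active_minutes, calories), 2))
```

A coefficient near zero would support a statement such as sedentary time and time in bed being "not related much at all", while values close to -1 or +1 would back up the negative and positive trends seen in the plots.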

Step 6. Act and Recommendation

Final Recommendation to Bellabeat:

1. For a high-quality analysis, the data needs to be more accurate and complete, with a larger sample size and a longer time frame.

2. Membership as motivation is critical for Bellabeat to make sure users participate in activities and record data.

3. Including a function that alerts users who tend to have a high number of sedentary minutes would be a good idea, so that they are notified and start moving.

4. Sleep pattern tracking is another feature that Bellabeat can use in its devices to estimate individual sleep need, which helps people determine whether they had enough sleep.

5. Syncing is another issue that Bellabeat needs to consider, since the data showed many null and zero values, which may be related to devices not syncing properly.

6. A motivational marketing strategy of body positivity in the media and on the Bellabeat website might empower customers to enter their weight into the app.

©2022 by Sherry Salek | Data Analyst. Proudly created with Wix.com
