Date: 2019-07-28 11:03

Lyft Data Science Assignment

Thank you for taking the time to complete Lyft’s Data Science Assignment!

Assignment

Lyft ridesharing is a two-sided marketplace with drivers and passengers. Every day new drivers join the platform and existing drivers either drive or they do not. Suppose you are working as a Data Scientist on the Driver Retention team whose primary goal is to reduce the rate of churn of activated drivers (a driver becomes ‘activated’ once they complete their first ride).

The team would like to understand churn better. Explore the data to provide the team with a deeper understanding of churn at Lyft. Your summary should include:

● The definition (with justification) for a driver to be considered churned.

● An assessment of the current business impact of churn on Lyft.

● Insights on factors affecting churn.

● Insights on segments of drivers more likely to churn.
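One possible starting point for the churn definition is to look at how long drivers typically go between rides and to treat an unusually long idle period as churn. The sketch below is only an illustration under assumptions that are not part of the assignment: the helper names (inactivity_gaps_days, gap_percentile, is_churned) are hypothetical, pickup times are assumed to be already parsed into datetime objects, and the threshold is a free parameter that still has to be justified from the data.

```python
import math
from datetime import datetime

SECONDS_PER_DAY = 86400


def inactivity_gaps_days(pickups):
    """Gaps, in days, between consecutive rides of a single driver."""
    ordered = sorted(pickups)
    return [
        (later - earlier).total_seconds() / SECONDS_PER_DAY
        for earlier, later in zip(ordered, ordered[1:])
    ]


def gap_percentile(gaps, q):
    """Nearest-rank percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(gaps)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def is_churned(pickups, data_end, threshold_days):
    """Assumed rule: churned if idle for more than threshold_days at data_end."""
    idle_days = (data_end - max(pickups)).total_seconds() / SECONDS_PER_DAY
    return idle_days > threshold_days
```

Pooling the gaps of all drivers and reading off a high percentile is one way to argue for a threshold; drivers whose last ride falls close to the end of the data window cannot be classified reliably and may need separate handling.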

Next, the team would like to size the opportunity of reducing churn in order to prioritize their roadmap. The team is considering the following two hypotheses:

i. Doubling the number of rides in an activated driver’s first week.

ii. Another hypothesis you recommend.

Using the data, help the team prioritize these two hypotheses. You should cover:

● How big the opportunities are.

● What might be the longer-term consequences on the marketplace of each hypothesis.

● Which segments of drivers are most likely affected by each hypothesis.

● Which hypothesis you have more confidence in.
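A rough first look at hypothesis (i) could compare churn between drivers with few and many first-week rides. The following is a minimal sketch, not a prescribed method: the function names and the split parameter are hypothetical, the first week is assumed to be the 7 days starting at the driver's first ride (the activation event described above), and the churned flag is assumed to come from whatever churn definition you settle on.

```python
from datetime import datetime, timedelta


def first_week_rides(pickups):
    """Number of rides within 7 days of the driver's first ride."""
    activated_at = min(pickups)
    cutoff = activated_at + timedelta(days=7)
    return sum(1 for picked_up in pickups if picked_up < cutoff)


def churn_rate_by_group(drivers, split):
    """Churn rate for drivers below and at-or-above `split` first-week rides.

    `drivers` is an iterable of (first_week_ride_count, churned) pairs.
    A group with no drivers gets a rate of None.
    """
    counts = {"low": [0, 0], "high": [0, 0]}  # [drivers, churned drivers]
    for rides, churned in drivers:
        group = "high" if rides >= split else "low"
        counts[group][0] += 1
        counts[group][1] += int(churned)
    return {
        group: (churned / total if total else None)
        for group, (total, churned) in counts.items()
    }
```

Any gap between the two groups is correlational: drivers who take more rides in their first week may differ in other ways, so the gap is at best an upper bound on what doubling first-week rides could achieve.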

Finally, suppose the team wants to test the following hypothesis: “eliminating the Prime Time feature will decrease driver churn”. Design an experiment to do so. Your design should include:

● How you will divide observational units into control and treatment, and a description of the treatment and control conditions.

● What are some potential second-order effects on the experience of drivers and passengers during this experiment.

● What are the primary and secondary metrics you will track.

● How long you will run the experiment and how you will choose the winning variant.
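For the question of how long to run the experiment, a back-of-the-envelope power calculation can give a first estimate. The sketch below uses the standard normal-approximation sample size formula for comparing two proportions; it assumes drivers are randomized individually and respond independently, which may not hold in a shared marketplace (for example, if randomization is by region or time period, a different calculation is needed). The function names, default alpha and power, and all inputs are illustrative assumptions, not values from the assignment.

```python
import math
from statistics import NormalDist


def drivers_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Drivers needed per arm to detect a change in churn rate.

    Two-sided test of two proportions, normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    effect = p_control - p_treatment
    if effect == 0:
        raise ValueError("churn rates must differ to size the experiment")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def enrollment_weeks(n_per_arm, eligible_drivers_per_week):
    """Weeks needed to enroll both arms at a given weekly intake."""
    return math.ceil(2 * n_per_arm / eligible_drivers_per_week)
```

Enrollment time is only part of the duration: after the last driver is enrolled, the experiment still has to run for at least the inactivity window used in the churn definition before churn can be observed.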

Submission Instructions

1. Please do not write your name on any submission documents.

2. Using the data provided, aim to spend roughly 5-8 hours answering the questions.

3. Prepare a 20-minute presentation for a panel of Data Scientists. At Lyft, we believe Data Scientists are most effective when they're telling a story with data. Slides are typically most effective, but you are welcome to use other formats (e.g. iPython-markdown, R-markdown, Word doc) if you prefer, as long as you convert them to PDF.

4. Include all of your working materials (including all code) in a separate PDF.

5. Keep in mind that we will be grading the assignment based on its technical soundness and depth, business applications and insights, structure and organization, completeness and polish.

Data Provided

data/driver_ids.csv

    driver_id            Unique identifier for a driver
    driver_onboard_date  Date on which driver was onboarded

data/ride_ids.csv

    driver_id            Unique identifier for a driver
    ride_id              Unique identifier for a ride that was completed by the driver
    ride_distance        Ride distance in meters
    ride_duration        Ride duration in seconds
    ride_prime_time      PrimeTime applied on the ride

data/ride_timestamps.csv

    ride_id              Unique identifier for a ride that was completed by the driver
    ride_picked_up_at    Timestamp for when driver picked up the passenger
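Since ride timestamps and driver ids live in separate files, most analyses will start by joining them on ride_id. Here is a minimal sketch using only the Python standard library. It assumes the column names match the field list above and that timestamps look like "2016-04-23 02:16:40"; the assignment does not state the timestamp format, so TIMESTAMP_FORMAT is a guess to adjust, and the function name pickups_by_driver is hypothetical.

```python
import csv
from collections import defaultdict
from datetime import datetime

# Assumption: the timestamp format is not specified in the assignment.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def pickups_by_driver(ride_ids_lines, ride_timestamps_lines):
    """Join the two ride files on ride_id.

    Both arguments are iterables of CSV lines (open files work).
    Returns {driver_id: sorted list of pickup datetimes}; rides without
    a pickup timestamp are skipped.
    """
    picked_up_at = {}
    for row in csv.DictReader(ride_timestamps_lines):
        picked_up_at[row["ride_id"]] = datetime.strptime(
            row["ride_picked_up_at"], TIMESTAMP_FORMAT
        )

    pickups = defaultdict(list)
    for row in csv.DictReader(ride_ids_lines):
        timestamp = picked_up_at.get(row["ride_id"])
        if timestamp is not None:
            pickups[row["driver_id"]].append(timestamp)

    return {driver: sorted(times) for driver, times in pickups.items()}
```

With the files from the assignment this could be called as pickups_by_driver(open("data/ride_ids.csv", newline=""), open("data/ride_timestamps.csv", newline="")). It is worth counting how many rides are dropped for lack of a timestamp, and how many drivers in data/driver_ids.csv have no rides at all, before drawing conclusions.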

