Bikeshare Project Analysis

Author: Johnz0 | Published 2019-11-15 21:46


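Introduction: With the rise of bike sharing, this project explores data from the bikeshare systems of three major US cities: Chicago, New York City, and Washington, DC. The goal is to give bikeshare companies some key figures, such as which start station is the most popular and which trip is the most popular, to help them decide where to deploy bikes.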

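1. Analysis Steps

Write code to import the data and answer interesting questions by computing descriptive statistics.
Write a script that takes raw input and creates an interactive experience in the terminal to present these statistics.
Pose the questions
Terminal application script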

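2. Questions

Which month is the most common in the start times (the Start Time column)?
Which day of the week (for example Monday, Tuesday) is the most common in the start times?
Which hour of the day is the most common in the start times?
What is the total trip duration (Trip Duration), and what is the average trip duration?
Which start station (Start Station) is the most popular, and which end station (End Station) is the most popular?
Which trip is the most popular (that is, which combination of start station and end station is the most common)?
How many users are there of each user type?
How many users are there of each gender?
What are the earliest, the most recent, and the most common year of birth?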
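
Most of these questions come down to "which value occurs most often in a column". The following is a minimal sketch of that idea on a tiny, made-up table, not the project's real data. The column names follow the questions above; the rows and the helper `most_common` are assumptions introduced only for illustration.

```python
import pandas as pd

# Made-up rows for illustration only; the real data comes from the city CSV files.
trips = pd.DataFrame({
    "Start Time": ["2017-01-02 08:15:00", "2017-01-09 08:40:00",
                   "2017-02-07 17:05:00", "2017-01-16 08:55:00"],
    "Start Station": ["A St", "A St", "B Ave", "A St"],
    "End Station": ["B Ave", "B Ave", "A St", "C Rd"],
    "Trip Duration": [300, 420, 600, 480],
})
trips["Start Time"] = pd.to_datetime(trips["Start Time"])


def most_common(series):
    """Hypothetical helper: return the most frequent value (the smallest one on a tie)."""
    return series.mode()[0]


common_month = most_common(trips["Start Time"].dt.month)
common_day = most_common(trips["Start Time"].dt.day_name())
common_hour = most_common(trips["Start Time"].dt.hour)

# Group on both station columns so each (start, end) pair is counted as one trip.
common_trip = trips.groupby(["Start Station", "End Station"]).size().idxmax()

total_duration = trips["Trip Duration"].sum()
mean_duration = trips["Trip Duration"].mean()

print(common_month, common_day, common_hour, common_trip)
print(total_duration, mean_duration)
```

Grouping on the two station columns is one possible way to find the most popular trip; concatenating the two names into a single string column, as the script below does, is another.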

3. Implementation

Tool: Python
Editor: PyCharm

import time
import pandas as pd
import numpy as np


CITY_DATA = { 'chicago': 'chicago.csv',
              'new york city': 'new_york_city.csv',
              'washington': 'washington.csv' }

def get_filters():
    """
    Asks user to specify a city, month, and day to analyze.

    Returns:
        (str) city - name of the city to analyze
        (str) month - name of the month to filter by, or "all" to apply no month filter
        (str) day - name of the day of week to filter by, or "all" to apply no day filter
    """
    print('Hello! Let\'s explore some US bikeshare data!')
    # get user input for city (chicago, new york city, washington);
    # keep asking until the (lowercased) input matches a key of CITY_DATA
    city = input("Which city do you want to analyze? Enter: chicago, new york city, washington\n").lower()
    while city not in CITY_DATA:
        city = input('Invalid input======\nWould you like to see data for chicago, '
                     'new york city, or washington?\n').lower()

    # get user input for month (all, january, february, ... , june)
    months = ['all', 'january', 'february', 'march', 'april', 'may', 'june']
    month = input("Which month do you want to analyze? Enter: all, january, february, "
                  "march, april, may, june\n").lower()
    while month not in months:
        month = input('Invalid input======\nWhich month do you want to analyze? Enter: all, january, february, '
                      'march, april, may, june\n').lower()

    # get user input for day of week (all, monday, tuesday, ... sunday)
    days = ['all', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']
    day = input("Which day of week do you want to analyze? Enter: "
                "all, monday, tuesday, wednesday, thursday, friday, saturday, sunday\n").lower()
    while day not in days:
        day = input("Invalid input======\nWhich day of week do you want to analyze? Enter: "
                    "all, monday, tuesday, wednesday, thursday, friday, saturday, sunday\n").lower()

    print('-'*40)
    return city, month, day


def load_data(city, month, day):
    """
    Loads data for the specified city and filters by month and day if applicable.

    Args:
        (str) city - name of the city to analyze
        (str) month - name of the month to filter by, or "all" to apply no month filter
        (str) day - name of the day of week to filter by, or "all" to apply no day filter
    Returns:
        df - Pandas DataFrame containing city data filtered by month and day
    """
    # load data file into a dataframe
    df = pd.read_csv(CITY_DATA[city])

    # convert the Start Time column to datetime
    df['Start Time'] = pd.to_datetime(df['Start Time'])

    # extract month and day of week from Start Time to create new columns
    # (dt.weekday_name was removed in pandas 1.0; dt.day_name() is its replacement)
    df['month'] = df['Start Time'].dt.month
    df['day_of_week'] = df['Start Time'].dt.day_name()

    # filter by month if applicable
    if month != 'all':
        # use the index of the months list to get the corresponding int
        months = ['january', 'february', 'march', 'april', 'may', 'june']
        month = months.index(month) + 1

        # filter by month to create the new dataframe
        df = df[df['month'] == month]

    # filter by day of week if applicable
    if day != 'all':
        # filter by day of week to create the new dataframe
        df = df[df['day_of_week'] == day.title()]
    return df


def time_stats(df):
    """Displays statistics on the most frequent times of travel."""

    print('\nCalculating The Most Frequent Times of Travel...\n')
    start_time = time.time()

    # display the most common month
    common_month = df['month'].mode()[0]
    print('The most common month: ', common_month)

    # display the most common day of week
    common_day_of_week = df['day_of_week'].mode()[0]
    print('The most common day of week: ', common_day_of_week)

    # display the most common start hour
    df['start_hour'] = df['Start Time'].dt.hour
    common_start_hour = df['start_hour'].mode()[0]
    print('The most common start hour: ', common_start_hour)


    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*40)


def station_stats(df):
    """Displays statistics on the most popular stations and trip."""

    print('\nCalculating The Most Popular Stations and Trip...\n')
    start_time = time.time()

    # display most commonly used start station
    common_start_station = df['Start Station'].mode()[0]
    print('The most commonly used start station: ', common_start_station)

    # display most commonly used end station
    common_end_station = df['End Station'].mode()[0]
    print('The most commonly used end station: ', common_end_station)

    # display most frequent combination of start station and end station trip
    # (the separator keeps the two station names apart in the combined string)
    df['Station'] = df['Start Station'] + ' -> ' + df['End Station']
    frequent_station = df['Station'].mode()[0]
    print('The most frequent trip: ', frequent_station)

    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*40)


def trip_duration_stats(df):
    """Displays statistics on the total and average trip duration."""

    print('\nCalculating Trip Duration...\n')
    start_time = time.time()

    # display total travel time
    total_travel_time = df['Trip Duration'].sum()
    print('The total travel time: ', total_travel_time)

    # display mean travel time
    mean_travel_time = df['Trip Duration'].mean()
    print('The mean travel time: ', mean_travel_time)

    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*40)


def user_stats(df):
    """Displays statistics on bikeshare users."""

    print('\nCalculating User Stats...\n')
    start_time = time.time()

    # Display counts of user types
    count_user_types = df['User Type'].value_counts()
    print('Counts of user types: ', count_user_types)

    # Display counts of gender
    try:
        count_gender = df['Gender'].value_counts()
        print('Counts of gender: ', count_gender)
    except KeyError:
        print('Counts of gender: sorry, this city has no gender data.')

    # Display earliest, most recent, and most common year of birth
    try:
        earliest_birth = df['Birth Year'].min()
        most_recent_birth = df['Birth Year'].max()
        most_common_birth = df['Birth Year'].mode()[0]
        print('Earliest year of birth:', earliest_birth)
        print('Most recent year of birth:', most_recent_birth)
        print('Most common year of birth:', most_common_birth)
    except KeyError:
        print('Sorry, this city has no Birth Year data.')

    print("\nThis took %s seconds." % (time.time() - start_time))
    print('-'*40)


def main():
    while True:
        city, month, day = get_filters()
        df = load_data(city, month, day)

        time_stats(df)
        station_stats(df)
        trip_duration_stats(df)
        user_stats(df)

        restart = input('\nWould you like to restart? Enter yes or no.\n')
        if restart.lower() != 'yes':
            break


if __name__ == "__main__":
    main()
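
The three prompts in `get_filters` all follow the same pattern: ask, normalize the reply, and ask again until it is one of the allowed values. As a hedged sketch of how that pattern could be factored out, the helper below takes the allowed values as a parameter. The name `ask_choice` and its `read` parameter are hypothetical and not part of the original script; `read` defaults to the built-in `input` and exists only so the helper can be tried without typing at a terminal.

```python
def ask_choice(prompt, choices, read=input):
    """Hypothetical helper: prompt until the stripped, lowercased reply is in choices."""
    while True:
        answer = read(prompt).strip().lower()
        if answer in choices:
            return answer
        print("Invalid input, please choose one of: " + ", ".join(choices))


# Possible use inside get_filters (not executed here, since it waits for input):
# city = ask_choice("Which city do you want to analyze?\n", list(CITY_DATA))
```

Normalizing every reply inside the loop means a retry such as "Chicago" is accepted in the same way as the first attempt.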

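4. Interactive Experience

The file is a script that takes raw input and creates an interactive experience in the terminal to answer questions about the dataset.
Enter the question you want to look at:

[Screenshot: 输入.png]
The answers are printed:

[Screenshot: 答案.png]
PS: The script can still be improved; this is only a simple version. A visualization step could also be added, so that the script takes the requested data and automatically generates the charts needed, which would be very convenient.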
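
As a rough idea of what a first visualization step might look like while staying in the terminal, the sketch below turns a list of values into a plain-text bar chart. It is only an assumption about one possible direction, not part of the original script: `text_bar_chart` is a hypothetical helper, the start hours are made up, and a real plotting library would be the natural choice for actual charts.

```python
from collections import Counter


def text_bar_chart(values, width=30):
    """Hypothetical helper: one line per distinct value, bar length scaled to its count.

    Assumes values is non-empty.
    """
    counts = Counter(values)
    largest = max(counts.values())
    lines = []
    for value in sorted(counts):
        # Scale each bar against the largest count; always draw at least one mark.
        bar = "#" * max(1, round(counts[value] / largest * width))
        lines.append(f"{value:>4} | {bar} {counts[value]}")
    return lines


start_hours = [8, 8, 8, 17, 17, 9]  # made-up start hours for illustration
chart = text_bar_chart(start_hours, width=6)
print("\n".join(chart))
```

With the script's DataFrame, the input could be something like `df['start_hour'].tolist()` once that column has been created in `time_stats`.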
