Project for Getting and Cleaning Data
Author: Benjamin Chan (https://github.com/benjamin-chan/GettingAndCleaningData)
Parameters for the project
The purpose of this project is to demonstrate your ability to collect, work with, and clean a data set. The goal is to prepare tidy data that can be used for later analysis. You will be graded by your peers on a series of yes/no questions related to the project. You will be required to submit: 1) a tidy data set as described below, 2) a link to a Github repository with your script for performing the analysis, and 3) a code book that describes the variables, the data, and any transformations or work that you performed to clean up the data called CodeBook.md. You should also include a README.md in the repo with your scripts. This repo explains how all of the scripts work and how they are connected.
The data set is described at:
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
Here are the data for the project:
https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
You should create one R script called run_analysis.R that does the following.
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set.
- Appropriately labels the data set with descriptive activity names.
- Creates a second, independent tidy data set with the average of each variable for each activity and each subject.
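The five steps above can be sketched on toy data. This is a minimal illustration in base R, not the project's actual script: the column names (accMeanX, accStdX, accMaxX), the values, and the two-activity subset are made up for the example, and the real data set has many more variables.

```r
# Toy stand-ins for the training and test sets (hypothetical names and values).
train <- data.frame(subject  = c(1, 1, 2),
                    activity = c(1, 2, 1),
                    accMeanX = c(0.20, 0.40, 0.60),
                    accStdX  = c(0.01, 0.02, 0.03),
                    accMaxX  = c(0.90, 0.80, 0.70))
test <- data.frame(subject  = c(3, 3),
                   activity = c(2, 2),
                   accMeanX = c(0.10, 0.30),
                   accStdX  = c(0.05, 0.07),
                   accMaxX  = c(0.50, 0.60))

# Step 1: merge the training and test sets by stacking their rows.
merged <- rbind(train, test)

# Step 2: keep the identifier columns plus the mean and std measurements.
measures <- names(merged)[grepl("Mean|Std", names(merged))]
merged <- merged[, c("subject", "activity", measures)]

# Steps 3 and 4: replace activity codes with descriptive names
# (only two codes are used in this toy example).
merged$activity <- factor(merged$activity, levels = c(1, 2),
                          labels = c("WALKING", "WALKING_UPSTAIRS"))

# Step 5: average each variable for each activity and each subject.
tidy <- aggregate(merged[, measures],
                  by = list(subject = merged$subject,
                            activity = merged$activity),
                  FUN = mean)
print(tidy)
```

Each row of tidy holds one subject and activity pair, with one averaged column per retained measurement.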
Good luck!
Steps to reproduce this project
- Open the R script run_analysis.r using a text editor.
- Change the parameter of the setwd function call to the working directory/folder (i.e., the folder where the R script file is saved).
- Run the R script run_analysis.r. It calls the R Markdown file, run_analysis.Rmd, which contains the bulk of the code.
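As a rough sketch of what such a driver could look like, the snippet below sets the working directory and knits the R Markdown file. It is an assumption about the general shape, not the contents of the actual run_analysis.r: the function name run_report is hypothetical, and it relies on the knitr package being installed.

```r
# Hypothetical driver: knit the R Markdown file from the folder that holds it.
run_report <- function(script_dir, rmd = "run_analysis.Rmd") {
  if (!requireNamespace("knitr", quietly = TRUE)) {
    stop("Package 'knitr' is required to knit ", rmd)
  }
  old_dir <- setwd(script_dir)  # same role as the setwd call in the script
  on.exit(setwd(old_dir))       # restore the previous working directory on exit
  knitr::knit(rmd)              # runs the code chunks and writes a .md file
}

# Usage (the path is a placeholder):
# run_report("path/to/folder/containing/the/script")
```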
Outputs produced
- Tidy dataset file DatasetHumanActivityRecognitionUsingSmartphones.txt (tab-delimited text)
- Codebook file codebook.md (Markdown)
Test script
a <- c(1,1,2,3,4,5,6,7,8,9)
Preliminaries
Load packages.
packages <- c("data.table", "reshape2")
sapply(packages, require, character.only = TRUE, quietly = TRUE)
## data.table reshape2
## TRUE TRUE
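The call above returns FALSE for a package that is not installed and carries on. A possible variant, sketched here as an assumption rather than as part of the original script, checks availability first and names any missing package. The variable names available and missing_pkgs are introduced for illustration.

```r
packages <- c("data.table", "reshape2")

# requireNamespace() reports whether a package can be loaded without attaching it.
available <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- packages[!available]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "),
          ". Install them with install.packages() before running the script.")
} else {
  # Attach the packages only once all of them are known to be present.
  invisible(lapply(packages, library, character.only = TRUE))
}
```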
Set path.
path <- getwd()
path
## [1] "C:/Users/chanb/Documents/Repositories/Coursera/GettingAndCleaningData/Project"
Get the data
Download the file. Put it in the Data folder. This was already done on 2014-04-11; save time and don't evaluate again.
url <- "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip"
f <- "Dataset.zip"
if (!file.exists(path)) {
  dir.create(path)
}
download.file(url, file.path(path, f))
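Since the download is meant to run only once, one option is to guard it with a check for the existing file. The sketch below is an illustration under assumptions: fetch_once is a hypothetical helper, and its downloader argument exists only so the logic can be tried without network access. Passing mode = "wb" asks download.file for a binary transfer, which matters for zip archives on Windows.

```r
# Hypothetical helper: download `url` to `dest` unless `dest` already exists.
fetch_once <- function(url, dest, downloader = utils::download.file) {
  if (file.exists(dest)) {
    return(invisible(FALSE))          # file already present, nothing downloaded
  }
  # Create the target folder if needed; stay quiet if it already exists.
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  downloader(url, dest, mode = "wb")  # binary mode keeps the zip intact
  invisible(TRUE)
}

# Usage with the names from the text:
# fetch_once(url, file.path(path, f))
```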
Unzip the file. This was already done on 2014-04-11; save time and don't evaluate again.
executable <- file.path("C:", "Program Files (x86)", "7-Zip", "7z.exe")
parameters <- "x"
cmd <- paste(paste0("\"", executable, "\""), parameters, paste0("\"", file.path(path,
f), "\""))
system(cmd)
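The 7-Zip call above depends on a Windows install path. A portable alternative, which swaps in base R's unzip function instead of the approach used above, might look like the following sketch. The helper name extract_archive is hypothetical.

```r
# Hypothetical helper: extract `zipfile` into `exdir` with base R's unzip().
extract_archive <- function(zipfile, exdir = dirname(zipfile)) {
  if (!file.exists(zipfile)) {
    stop("Archive not found: ", zipfile)
  }
  utils::unzip(zipfile, exdir = exdir)  # returns the extracted paths invisibly
}

# Usage with the names from the text:
# extract_archive(file.path(path, f), exdir = path)
```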
The archive put the files in a folder named UCI HAR Dataset. Set this folder as the input path. List the files here.
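A minimal sketch of this step, assuming the archive was extracted into the working directory. The variable name pathIn is chosen for illustration, and the listing is skipped when the folder is not present.

```r
path <- getwd()                               # as set earlier in the text
pathIn <- file.path(path, "UCI HAR Dataset")  # input folder created by the archive

if (dir.exists(pathIn)) {
  # Walk into subfolders so the nested data files are listed as well.
  print(list.files(pathIn, recursive = TRUE))
} else {
  message("Folder not found: ", pathIn)
}
```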