This file describes the variables, the data, and any transformations or work that I have performed to clean up the data.
-
Source of the data: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
-
Full description of the data: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
-
Script run_analysis.R performs the following steps to clean the data:
- Read X_train.txt and X_test.txt and concatenate them into data_X (10299x561 data frame)
- Read y_train.txt and y_test.txt and concatenate them into data_Y (10299x1 data frame)
- Read subject_train.txt and subject_test.txt and concatenate them into data_Subject (10299x1 data frame)
- Read features.txt into data_Features (561x2 data frame)
- Select only the features whose names contain "-mean()" or "-std()" and store their indices in relevant_Features (66 integer values)
- Subset data_X to the columns listed in relevant_Features (10299x66 data frame)
- Read activity_labels.txt into data_Activity (6x2 data frame)
- Replace the activity codes in data_Y with the corresponding activity names from data_Activity
- Assign descriptive column names: the feature names from data_Features for data_X, "Activity" for data_Y, and "Subject" for data_Subject
- Combine data_Subject, data_X, and data_Y into data_All (10299x68 data frame)
  - The "Subject" column contains integers that range from 1 to 30 inclusive
  - The "Activity" column contains 6 kinds of activity names (walking, walkingupstairs, walkingdownstairs, sitting, standing, laying)
  - The last 66 columns contain measurements that range from -1 to 1 exclusive
- Write data_All to "mergedExtractData.txt" in the current working directory
- Create a new tidy data set, data_All_Average, with the average of each measurement for each activity and each subject (180x68 data frame, one row per subject/activity pair)
  - 30 unique subjects
  - 6 unique activities
- Write data_All_Average to "mergedExtractAverageData.txt" in the current working directory
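
The selection, labelling, and averaging steps above can be sketched in R on a small toy data set. This is only an illustration under stated assumptions, not the actual contents of run_analysis.R. The variable names come from the list above, but the toy values, the grep() pattern, and the use of match() and aggregate() are assumptions about how such steps might be written.

```r
# Toy stand-ins for the real files (all values invented for illustration).
# On the real data these would come from read.table() calls on the .txt files.
data_Features <- data.frame(
  id   = 1:5,
  name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-mad()-X",
           "tBodyAcc-meanFreq()-X", "tBodyGyro-mean()-Z"),
  stringsAsFactors = FALSE
)
data_Activity <- data.frame(
  id   = 1:2,
  name = c("WALKING", "WALKING_UPSTAIRS"),
  stringsAsFactors = FALSE
)
data_X       <- as.data.frame(matrix((1:40) / 100, nrow = 8, ncol = 5))
data_Y       <- data.frame(V1 = c(1, 2, 1, 2, 1, 2, 1, 2))
data_Subject <- data.frame(V1 = c(1, 1, 1, 1, 2, 2, 2, 2))

# Keep only features whose names contain the literal "-mean()" or "-std()".
# The escaped parentheses exclude names such as "-meanFreq()".
relevant_Features <- grep("-(mean|std)\\(\\)", data_Features$name)
data_X <- data_X[, relevant_Features]

# Replace activity codes with names (assumed cleanup: lower case, no underscores).
activity_names <- tolower(gsub("_", "", data_Activity$name))
data_Y[, 1] <- activity_names[match(data_Y[, 1], data_Activity$id)]

# Assign column names and combine into one data frame.
names(data_X)       <- data_Features$name[relevant_Features]
names(data_Y)       <- "Activity"
names(data_Subject) <- "Subject"
data_All <- cbind(data_Subject, data_X, data_Y)

# Average every measurement for each subject/activity pair.
measure_cols <- setdiff(names(data_All), c("Subject", "Activity"))
data_All_Average <- aggregate(
  data_All[, measure_cols],
  by  = list(Subject = data_All$Subject, Activity = data_All$Activity),
  FUN = mean
)
```

With 2 toy subjects and 2 toy activities, data_All_Average has 2 x 2 = 4 rows. On the full data set, the same grouping gives 30 x 6 = 180 rows.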