Stat 517 Project #1

Your name

Running Time of R

Description

proc.time determines how much real and CPU time (in seconds) the currently running R process has already taken.

Examples

ptm <- proc.time()
for (i in 1:50) mad(stats::runif(500))
# finding the elapsed seconds for the R execution
proc.time() - ptm

##    user  system elapsed
##    0.17    0.00    0.17
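The difference of two proc.time() readings is a named vector, so its parts can be pulled out individually. The following is a minimal sketch under that assumption; the variable names delta and cpu are illustrative only and not part of the assignment.

```r
ptm <- proc.time()                        # reading before the work
for (i in 1:50) mad(stats::runif(500))    # same workload as the example above
delta <- proc.time() - ptm                # reading after minus reading before

delta[["elapsed"]]                        # wall-clock seconds
cpu <- delta[["user.self"]] + delta[["sys.self"]]
cpu                                       # CPU seconds used by this R process
```

The printed "user" and "system" columns correspond to the user.self and sys.self entries of this vector.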

Please call Sys.time() at the start and end of each data set, and report the running time of any R code chunk within a data set only if it takes a long time to run (say, more than 1000 seconds).
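One possible way to follow the reporting rule is to wrap a chunk in system.time() and print its running time only when it exceeds the threshold. The helper name report_if_slow and its threshold argument are hypothetical, introduced here only for illustration; this is a sketch, not a required part of the project.

```r
# Hypothetical helper: time an expression and report the elapsed seconds
# only when they exceed `threshold` (1000 seconds by default).
report_if_slow <- function(expr, threshold = 1000) {
  elapsed <- system.time(expr)[["elapsed"]]   # evaluates expr and times it
  if (elapsed > threshold) {
    message(sprintf("Chunk took %.1f seconds", elapsed))
  }
  invisible(elapsed)                          # return the timing silently
}

# A fast chunk: nothing is reported, but the timing is still returned.
elapsed <- report_if_slow(for (i in 1:50) mad(stats::runif(500)))
```

Returning the elapsed time invisibly keeps the output quiet for fast chunks while still letting the value be stored if it is needed later.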

On the First Data Set

# Record the beginning time for the entire Project #1
Sys.time()

## [1] "2017-09-19 13:28:05 PDT"

#adult=read.table("http://www.webpages.uidaho.edu/~stevel/517/Project1/dataset1.txt",sep=',')

adult=read.csv("http://www.webpages.uidaho.edu/~stevel/Datasets/adult.csv",sep=',')

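set.seed(1234)  # fix the seed so the split is reproducible
n=32561         # expected number of rows in adult
# label each row 1 (train) or 2 (test) at random
idx=train_test_2_split=sample(1:2,n,replace=TRUE)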

train=adult[idx==1,]
test=adult[idx==2,]
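Because each row receives an independent random label, the two parts are only approximately equal in size (the summary(train) output totals 16179 of the 32561 rows). The sketch below repeats the same splitting idea on a small made-up data frame, toy, which is an assumption standing in for adult, and checks that the two parts are disjoint and together cover every row.

```r
set.seed(1234)
toy <- data.frame(id = 1:10, x = rnorm(10))     # made-up stand-in for adult
idx <- sample(1:2, nrow(toy), replace = TRUE)   # random label 1 or 2 per row

train <- toy[idx == 1, ]
test  <- toy[idx == 2, ]

table(idx)                                      # sizes of the two groups
# Every row lands in exactly one of the two parts.
stopifnot(nrow(train) + nrow(test) == nrow(toy))
stopifnot(length(intersect(train$id, test$id)) == 0)
```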

summary(train)

##       age              workclass              fnlwgt
##  Min.   :17.00   Private         :11315   Min.   :  12285
##  1st Qu.:28.00   Self-emp-not-inc: 1226   1st Qu.: 117569
##  Median :37.00   Local-gov       : 1009   Median : 178505
##  Mean   :38.47   ?               :  937   Mean   : 189469
##  3rd Qu.:47.00   State-gov       :  650   3rd Qu.: 236910
##  Max.   :90.00   Self-emp-inc    :  581   Max.   :1484705
##                  (Other)         :  461
##      education       education_num         marital_status
##  HS-grad     :5186   Min.   : 1.00   Divorced             :2242
##  Some-college:3581   1st Qu.: 9.00   Married-AF-spouse    :  11
##  Bachelors   :2721   Median :10.00   Married-civ-spouse   :7393
##  Masters     : 860   Mean   :10.11   Married-spouse-absent: 209
##  Assoc-voc   : 708   3rd Qu.:13.00   Never-married        :5316
##  11th        : 588   Max.   :16.00   Separated            : 511
##  (Other)     :2535                   Widowed              : 497
##       occupation           relationship
##  Exec-managerial:2088   Husband       :6476
##  Prof-specialty :2051   Not-in-family :4190
##  Craft-repair   :2003   Other-relative: 481
##  Adm-clerical   :1887   Own-child     :2544
##  Sales          :1785   Unmarried     :1687
##  Other-service  :1641   Wife          : 801
##  (Other)        :4724
##            race                 sex        capital_gain
##  Amer-Indian-Eskimo:  161   Female: 5416   Min.   :    0
##  Asian-Pac-Islander:  517   Male  :10763   1st Qu.:    0
##  Black             : 1547                  Median :    0
##  Other             :  131                  Mean   : 1064
##  White             :13823                  3rd Qu.:    0
##                                            Max.   :99999
##
##   capital_loss     hours_per_week     native_country        salary
##  Min.   :   0.00   Min.   : 1.0     United-States:14527   <=50K:12277
##  1st Qu.:   0.00   1st Qu.:40.0     Mexico       :  288   >50K : 3902
##  Median :   0.00   Median :40.0     ?            :  278
##  Mean   :  90.46   Mean   :40.4     Philippines  :   96
##  3rd Qu.:   0.00   3rd Qu.:45.0     Puerto-Rico  :   68
##  Max.   :4356.00   Max.   :99.0     Germany      :   66
##                                     (Other)      :  856

colnames(adult)=c('age','workclass','fnlwgt','education','education_num','marital_status','occupation','relationship','race','sex','capital_gain','capital_loss','hours_per_week','native_country','salary')

write.csv(adult, file="adult.csv")
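By default, write.csv also writes the row names as an unnamed first column, which read.csv then brings back as an extra column named X. The sketch below shows the difference on a two-row made-up data frame written to temporary files; toy, f_default, and f_clean are illustrative names, not part of the project data.

```r
toy <- data.frame(age = c(39, 50), salary = c("<=50K", ">50K"))

f_default <- tempfile(fileext = ".csv")
f_clean   <- tempfile(fileext = ".csv")

write.csv(toy, file = f_default)                    # row names become a first column
write.csv(toy, file = f_clean, row.names = FALSE)   # data columns only

names(read.csv(f_default))   # "X" "age" "salary"
names(read.csv(f_clean))     # "age" "salary"
```

Passing row.names = FALSE is worth considering whenever the file will be read back with a fixed set of column names.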

Sys.time()

## [1] "2017-09-19 13:28:23 PDT"

On the Second Data Set

Sys.time()

## [1] "2017-09-19 13:28:23 PDT"

#####################

# Salary=read.csv("http://www.webpages.uidaho.edu/~stevel/517/Project1/dataset2.CSV",header=TRUE,sep=',')

salary=read.csv("http://www.webpages.uidaho.edu/~stevel/Datasets/salary_uk.csv",header=TRUE,sep=',')

# Record the ending time for the entire Project #1
Sys.time()

## [1] "2017-09-19 13:29:18 PDT"
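The total running time of the project is the difference between the first and last time stamps printed in this report. As a small worked example, the sketch below parses those two stamps and subtracts them; the time zone is set to "UTC" purely as an assumption to make the parsing portable, which does not affect the difference because both stamps share the same zone.

```r
start <- as.POSIXct("2017-09-19 13:28:05", tz = "UTC")   # first stamp in the report
end   <- as.POSIXct("2017-09-19 13:29:18", tz = "UTC")   # last stamp in the report

total <- difftime(end, start, units = "secs")
total               # Time difference of 73 secs
as.numeric(total)   # 73
```

In a live session the same result comes from storing Sys.time() in a variable at the beginning and subtracting it from Sys.time() at the end.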