r - Aggregate columns in data.table for descriptive statistics
I am looking at a student data set at the individual student level.
What I want to do is a descriptive analysis at the faculty/degree level.
That is, some students are doing 2 degrees (double degrees, e.g. Bachelor of IT and Bachelor of Science), so those students generate 2 degrees.
My data looks like the below. The faculty assignments (whether fac1 or fac2) are arbitrary.
    studid  fac1      fac2     success  sex     ave_mark
    1       it        arts     0        male    65
    2       science            1        male    35
    3       law                0        male    98
    4       it        science  0        female  55
    5       commerce  it       0        female  20
    6       commerce  it       1        male    80
This was generated with:

    library(data.table)

    students <- data.table(
      studid   = c(1:6),
      fac1     = c("it", "science", "law", "it", "commerce", "commerce"),
      fac2     = c("arts", "", "", "science", "it", "it"),
      success  = c(0, 1, 0, 0, 0, 1),
      sex      = c("male", "male", "male", "female", "female", "male"),
      ave_mark = c(65, 35, 98, 55, 20, 80)
    )
How do I go about producing the following (made-up figures), creating a faculty column that incorporates both the fac1 and fac2 columns? I have been trying to use the lapply function across fac1 and fac2 but keep hitting dead ends (i.e. students[, lapply(.SD, mean), by=agg.by, .SDcols=c('fac1', 'fac2')]).
    faculty   mean_success  ave_mark
    it        0.65          65
    science   1             50
    law       0.76          50
    arts      0.55          50
    commerce  0.40          10
Any assistance would be appreciated.
This seems to be what you are looking for.
    library(reshape2)

    # Stack fac1 and fac2 into a single faculty column, then drop blank faculties
    dt <- melt(students, measure.vars=c("fac1","fac2"), value.name="faculty")[nchar(faculty)>0]

    # Aggregate by the new faculty column
    dt[, list(mean_success=mean(success), ave_mark=mean(ave_mark)), by=faculty]
    #     faculty mean_success ave_mark
    # 1:       it         0.25       55
    # 2:  science         0.50       45
    # 3:      law         0.00       98
    # 4: commerce         0.50       50
    # 5:     arts         0.00       65
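So this uses the melt(...) function in package reshape2 to collapse the 2 faculty columns into one, replicating all the other columns. Unfortunately, this results in some rows with a blank faculty (one for each student with a single degree), so we have to get rid of those using [nchar(faculty)>0]. After that, it's a simple matter to aggregate based on the (new) faculty column.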
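The same three steps (stack, filter, aggregate) can also be sketched step by step using only data.table. This assumes a reasonably recent data.table, which provides its own melt method for data.tables; the names long, slot and n_students are made up for illustration and do not appear in the question.

```r
library(data.table)

students <- data.table(
  studid   = 1:6,
  fac1     = c("it", "science", "law", "it", "commerce", "commerce"),
  fac2     = c("arts", "", "", "science", "it", "it"),
  success  = c(0, 1, 0, 0, 0, 1),
  sex      = c("male", "male", "male", "female", "female", "male"),
  ave_mark = c(65, 35, 98, 55, 20, 80)
)

# Step 1: stack fac1 and fac2 into one "faculty" column. All other
# columns are treated as id columns and repeated for each stacked row.
long <- melt(students,
             measure.vars  = c("fac1", "fac2"),
             variable.name = "slot",     # illustrative name: fac1 or fac2
             value.name    = "faculty")

# Step 2: drop the rows that came from an empty fac2 (single-degree students).
long <- long[faculty != ""]

# Step 3: aggregate per faculty. .N counts the enrolments behind each mean.
result <- long[, .(n_students   = .N,
                   mean_success = mean(success),
                   ave_mark     = mean(ave_mark)),
               by = faculty]
print(result)
```

A double-degree student contributes to both of their faculties here, so the per-faculty counts add up to more than the number of students. The means should match the output shown in the answer; n_students is only an extra column to show how many enrolments sit behind each figure.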