R! posIXCT in sqldf -
first time question, if missed apologize:
i imported excel file r! using xlconnect, str() function follow:
data.frame': 931 obs. of 5 variables: $ media : chr "eem" "eem" "eem" "eem" ... $ month : posixct, format: "2014-08-01" "2014-08-01" "2014-08-01" "2014-08-01" ... $ request_row : num 8 25 26 37 38 44 53 62 69 83 ... $ total_click : num 12 9 9 8 8 8 7 7 7 7 ... $ match_type : chr "s" "s" "s" "s" ...
when use following sqldf no rows selected, anyway wrong:
sqldf(" select media, sum(total_click) , avg(request_row), min(request_row) , max(request_row), count(distinct(media)) all_data request_row < 100 , month='2014-09-01' group 1,2 order 2,6 desc ") <0 rows> (or 0-length row.names)
thanks help
vj
its not clear intended code shown has these problems:
month
used in datamonth
used in sql statementsqlite has no date or time types , if send posixct value sqlite interpreted number of seconds since unix epoch (in gmt time zone). comparison of month character string won't work. can convert number of seconds yy-mm-dd using sqlite
strftime
ordate
functions. alternately use database has datetime types. sqldf supports h2 database , supports date , time types.the statement trying group
media
,sum(total_click)
. grouping aggregated value not legal although perhaps done nesting selects depending on intended.since statement grouping
media
expressoincount(distinct(media)) from
all_data1 since there can 1
media
in such group.
you need clarify intent if drop or fix various points can this:
sqldf("select media, sum(total_click) sum_total_click, avg(request_row) avg_request_row, min(request_row) min_request_row, max(request_row) max_request_row all_data request_row < 100 , date(month, 'unixepoch', 'localtime') = '2014-08-01' group 1 order 2 desc")
which gives:
media sum_total_click avg_request_row min_request_row max_request_row 1 eem 38 24 8 37
rh2 use rh2 package , h2 database instead sure have java , rh2 installed (rh2 includes h2 database not need separately installed) , then:
library(rh2) library(sqldf) sqldf("...")
where ... replaced same sql statement except date comparison simplifies line:
and month = '2014-08-01'
data: when posting r tag please show data using dput
. in case used:
all_data <- structure(list(media = c("eem", "eem", "eem", "eem"), month = structure(c(1406865600, 1406865600, 1406865600, 1406865600), class = c("posixct", "posixt" ), tzone = ""), request_row = c(8, 25, 26, 37), total_click = c(12, 9, 9, 8), match_type = c("s", "s", "s", "s")), .names = c("media", "month", "request_row", "total_click", "match_type"), row.names = c(na, -4l), class = "data.frame")
update: misc revisions.
Comments
Post a Comment