Improve code efficiency

Asked: 2015-10-30T23:39:54 Author: user2943039

I've been working on code that reads in all the sheets of an Excel workbook, where the first two columns in each sheet are "Date" and "Time", and the next two columns are either "Level" and "Temperature", or "LEVEL" and "TEMPERATURE". The code works, but I am working on improving my coding clarity and efficiency, so any advice in those regards would be greatly appreciated.

My function 1) reads the data into a list of dataframes, 2) gets rid of any NA columns that were accidentally read in, 3) combines "Date" and "Time" into "DateTime" for each dataframe, 4) rounds "DateTime" to the nearest 5 minutes for each dataframe, 5) replaces "Date" and "Time" in each dataframe with "DateTime". I have started getting more comfortable with lapply, but I am wondering whether I can make the code more efficient instead of having so many lines with lapply.

library(readxl)
library(plyr)

read_excel_allsheets <- function(filename) {
  # 1) Read every sheet into a list of dataframes named after the sheets
  sheets <- readxl::excel_sheets(filename)
  data <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X))
  names(data) <- sheets
  # 2) Drop columns that contain any NA (the accidentally read columns)
  clean <- lapply(data, function(y) y[, colSums(is.na(y)) == 0])
  # 3) Combine "Date" (column 1) and "Time" (column 2) into one date-time
  date <- lapply(clean, "[[", 1)
  time <- lapply(clean, "[[", 2)
  time <- lapply(time, function(z) format(z, format = "%H:%M"))
  datetime <- Map(paste, date, time)
  datetime <- lapply(datetime, function(a) as.POSIXct(a, format = "%Y-%m-%d %H:%M"))
  # 4) Round to the nearest 5 minutes (300 seconds)
  rounded <- lapply(datetime, function(b) as.POSIXlt(round(as.numeric(b) / (5 * 60)) * (5 * 60), origin = "1970-01-01"))
  # 5) Append "DateTime" and drop the original "Date" and "Time" columns
  addDateTime <- mapply(cbind, clean, "DateTime" = rounded, SIMPLIFY = FALSE)
  final <- lapply(addDateTime, function(z) z[!(names(z) %in% c("Date", "Time"))])
  return(final)
}
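One way to cut down on the number of lapply lines is to put steps 2 to 5 into a single function that handles one dataframe, and then call lapply once. The following is only a sketch under a few assumptions: `clean_sheet` and its `step` argument are hypothetical names that are not in the code above, the "Date" and "Time" columns are assumed to be POSIXct in UTC (as in the dput output further down), and the result is returned as POSIXct in UTC rather than POSIXlt in the local time zone. Seconds are dropped before rounding, which mirrors what format(z, format = "%H:%M") does above. The small `sheet` dataframe reuses the first three rows of the sample data plus one all-NA column.

```r
# Hypothetical helper (not from the original code): clean ONE dataframe.
clean_sheet <- function(df, step = 5 * 60) {
  # Drop columns that contain any NA
  df <- df[, colSums(is.na(df)) == 0, drop = FALSE]
  # Seconds since midnight from "Time" (assumes UTC); the date part is ignored
  secs <- as.numeric(df$Time) %% 86400
  # Drop the seconds, as format(..., "%H:%M") does in the original code
  secs <- secs - secs %% 60
  # Add the time of day to the date and round to the nearest `step` seconds
  stamp <- as.numeric(df$Date) + secs
  df$DateTime <- as.POSIXct(round(stamp / step) * step,
                            origin = "1970-01-01", tz = "UTC")
  # Remove the original "Date" and "Time" columns
  df[, !(names(df) %in% c("Date", "Time")), drop = FALSE]
}

# Small example built from the first rows of the sample data
sheet <- data.frame(
  Date = as.POSIXct(rep(1305504000, 3), origin = "1970-01-01", tz = "UTC"),
  Time = as.POSIXct(c(-2209121912, -2209121612, -2209121312),
                    origin = "1970-01-01", tz = "UTC"),
  Level = c(106.9038, 106.9059, 106.89),
  Temperature = c(6.176, 6.173, 6.172),
  Extra = NA_real_
)
out <- clean_sheet(sheet)
print(out)
```

With a helper like this, the body of read_excel_allsheets could shrink to reading the sheets, naming the list, and a single lapply(data, clean_sheet). Whether this actually runs faster would need to be timed on the real files, since reading the workbook is likely to dominate.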

Next, I would like to plot all of my data. So, I 1) run my code for a file, 2) combine the list of dataframes into one dataframe while maintaining an "ID" for each dataframe as a column, 3) combine the lowercase and uppercase versions of the variable columns, 4) add two new columns that split the "ID". Each ID is something like B1CC or B2CO, where I want to split the "ID" like so: "B1" and "CC". Now I can use ggplot very easily.

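mysheets <- read_excel_allsheets(filename)
# Stack the list into one dataframe; the sheet names end up in the ".id" column
df <- ldply(mysheets)
# Merge the lowercase and uppercase versions of each variable
df$Temp <- rowSums(df[, c("Temperature", "TEMPERATURE")], na.rm = TRUE)
df$Lev <- rowSums(df[, c("Level", "LEVEL")], na.rm = TRUE)
df <- df[!names(df) %in% c("Level", "LEVEL", "Temperature", "TEMPERATURE")]

# Split the ID: strip the first two characters for "exp", the last two for "plot"
# (the pattern has no capture group, so the replacement is "" rather than "\\1")
df$exp <- gsub("^[[:alnum:]]{2}", "", df$.id)
df$plot <- gsub("[[:alnum:]]{2}$", "", df$.id)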
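For reference, the column merge and the ID split can also be written without rowSums and regular expressions. This is a sketch, not a drop-in replacement: the small `df` below is a hypothetical stand-in for the ldply output, using IDs and values that appear in this post, and the substr call assumes the first part of every ID is exactly two characters long. Unlike rowSums with na.rm = TRUE, ifelse leaves NA (not 0) when both versions of a column are missing.

```r
# Toy stand-in for the ldply() output (hypothetical, for illustration only)
df <- data.frame(
  .id = c("B1CC", "B1CC", "B2CO"),
  Level = c(106.9038, 106.9059, NA),
  LEVEL = c(NA, NA, 117.5149),
  Temperature = c(6.176, 6.173, NA),
  TEMPERATURE = c(NA, NA, 5.661),
  stringsAsFactors = FALSE
)

# Take the lowercase-named column where present, otherwise the uppercase one
df$Lev  <- ifelse(is.na(df$Level), df$LEVEL, df$Level)
df$Temp <- ifelse(is.na(df$Temperature), df$TEMPERATURE, df$Temperature)
df <- df[!(names(df) %in% c("Level", "LEVEL", "Temperature", "TEMPERATURE"))]

# Split the ID by position: first two characters, then the remainder
df$plot <- substr(df$.id, 1, 2)
df$exp  <- substring(df$.id, 3)
print(df)
```

Position-based splitting avoids running a regular expression over every row, which may matter a little with many large sheets, but it only gives the same result as the gsub calls when the IDs are four characters long.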

Here are the data for the first two dataframes, but there are over 50 of them, and each is relatively big, and there are many files to read. Therefore, I'm looking to improve efficiency (in terms of time to run) where I can. Any help or advice is greatly appreciated!

dput(head(x[[1]]))
structure(list(Date = structure(c(1305504000, 1305504000, 1305504000, 
1305504000, 1305504000, 1305504000), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Time = structure(c(-2209121912, -2209121612, 
-2209121312, -2209121012, -2209120712, -2209120412), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), Level = c(106.9038, 106.9059, 106.89, 
106.9121, 106.8522, 106.8813), Temperature = c(6.176, 6.173, 
6.172, 6.168, 6.166, 6.165)), .Names = c("Date", "Time", "Level", 
"Temperature"), row.names = c(NA, 6L), class = c("tbl_df", "tbl", 
"data.frame"))

dput(head(x[[2]]))
structure(list(Date = structure(c(1305504000, 1305504000, 1305504000, 
1305504000, 1305504000, 1305504000), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Time = structure(c(-2209121988, -2209121688, 
-2209121388, -2209121088, -2209120788, -2209120488), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), LEVEL = c(117.5149, 117.511, 117.5031, 
117.5272, 117.4523, 117.4524), TEMPERATURE = c(5.661, 5.651, 
5.645, 5.644, 5.644, 5.645), `NA` = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), `NA` = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), `NA` = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), `NA` = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), `NA` = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_)), .Names = c("Date", "Time", "LEVEL", 
"TEMPERATURE", NA, NA, NA, NA, NA), row.names = c(NA, 6L), class =    
c("tbl_df", "tbl", "data.frame"))

Author: user2943039. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/33439746/improve-code-efficiency
