DataCamp Course <Efficient Code> Chapter 3. Looking Inside C

Author: Jason数据分析生信教室 | Published: 2021-07-22 08:03

Efficient Code course contents

Chapter 1. Benchmarking
Chapter 2. Fundamentals of efficient R
Chapter 3. Looking inside the code
Chapter 4. Multithreaded computing

Using profvis

The profvis package records and visualizes the resources consumed by each step of a computation. Usage is as follows:

# Load the data set
data(movies, package = "ggplot2movies") 

# Load the profvis package
library(profvis)

# Profile the following code with the profvis function
profvis({
  # Load and select data
  comedies <- movies[movies$Comedy == 1, ]

  # Plot data of interest
  plot(comedies$year, comedies$rating)

  # Loess regression line
  model <- loess(rating ~ year, data = comedies)
  j <- order(comedies$year)
  
  # Add fitted line to the plot
  lines(comedies$year[j], model$fitted[j], col = "red")
})  # Remember the closing brackets!
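profvis builds on R's sampling profiler, so a rough text-only version of the same idea can be sketched with base R's Rprof() and summaryRprof(). The helper slow_grow() and the value 2e4 below are hypothetical, chosen only to give the profiler something slow enough to sample; this is a minimal sketch, not a replacement for the profvis view.

```r
# Hypothetical workload for illustration: growing a vector one element
# at a time forces repeated copying, which shows up in the profile
slow_grow <- function(n) {
  x <- numeric(0)
  for (i in seq_len(n)) x <- c(x, i)
  x
}

# Write profiling samples to a temporary file
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
result <- slow_grow(2e4)
Rprof(NULL)  # stop profiling

# Summarise the time spent per function
prof <- summaryRprof(prof_file)
head(prof$by.self)
```

The by.self table lists the functions in which the sampler most often found the interpreter, which is the same information profvis draws as a flame graph.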

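This is something of a review of the previous chapter: create a data frame and a matrix with the same layout, then compare how long each takes to build.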

# Load the microbenchmark package
library(microbenchmark)

# The previous data frame solution is defined
# d() simulates 6 dice rolls
d <- function() {
  data.frame(
    d1 = sample(1:6, 3, replace = TRUE),
    d2 = sample(1:6, 3, replace = TRUE)
  )
}

# Complete the matrix solution
m <- function() {
  matrix(sample(1:6, 6, replace = TRUE), ncol=2)
}

# Use microbenchmark to time m() and d()
microbenchmark(
 data.frame_solution = d(),
 matrix_solution     = m()
)
Unit: microseconds
                expr     min       lq      mean   median       uq      max neval
 data.frame_solution 102.613 122.4190 179.33370 140.9915 171.0330 2420.308   100
     matrix_solution   4.764   5.7765  32.10798   7.5525  10.4335 2346.358   100
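As a quick sanity check, the sketch below confirms that d() and m() return objects of the same shape, then times them with base R's system.time() as a rough fallback when microbenchmark is not installed. The repetition count n_rep is an arbitrary value chosen for illustration.

```r
# d() and m() as defined above
d <- function() {
  data.frame(
    d1 = sample(1:6, 3, replace = TRUE),
    d2 = sample(1:6, 3, replace = TRUE)
  )
}
m <- function() {
  matrix(sample(1:6, 6, replace = TRUE), ncol = 2)
}

# Both results have 3 rows and 2 columns
dim(d())
dim(m())

# Rough timing with base R; n_rep is an arbitrary illustrative value
n_rep <- 1000
time_df  <- system.time(for (i in seq_len(n_rep)) d())
time_mat <- system.time(for (i in seq_len(n_rep)) m())
time_df["elapsed"]
time_mat["elapsed"]
```

system.time() is much coarser than microbenchmark, so the loop is there to accumulate enough work to measure at all.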

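rowSums() is faster than apply(). In the comparison below, app() and rolls come from the course exercise and are not defined here; app() is presumably the earlier apply()-based solution.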

# Define the new solution
r_sum <- function(x) {
    rowSums(x)
}
# Compare the methods
microbenchmark(
    app_sol = app(rolls),
    r_sum_sol = r_sum(rolls)
)
Unit: microseconds
      expr    min      lq     mean  median     uq      max neval
   app_sol 20.394 22.6620 44.38427 23.5745 25.788 1953.120   100
 r_sum_sol  5.078  5.9265 20.99668  6.4335  7.164 1381.345   100
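Since app() and rolls are not defined in the text, the sketch below fills them in with assumed definitions (an apply()-based row sum and a small matrix of dice rolls) purely to show that the two approaches agree. Both definitions are guesses made for illustration, not the course's own code.

```r
# Assumed for illustration: a 3 x 2 matrix of dice rolls
rolls <- matrix(sample(1:6, 6, replace = TRUE), ncol = 2)

# Assumed for illustration: the apply()-based solution
app <- function(x) {
  apply(x, 1, sum)
}

# The rowSums() solution from the text
r_sum <- function(x) {
  rowSums(x)
}

# The values agree; all.equal() is used because apply() keeps the
# integer type here while rowSums() returns doubles
all.equal(app(rolls), r_sum(rolls))
```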

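The difference between & and &&

A & B: B is evaluated even when A is FALSE; the result is FALSE.
A && B: if A is FALSE, B is skipped and the result is FALSE.
Skipping B saves computation, especially when B is expensive to evaluate.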
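The short-circuit behaviour can be observed directly by counting evaluations. In the sketch below, counter and check() are hypothetical helpers introduced only for this demonstration.

```r
# Hypothetical helpers for illustration: check() records each evaluation
counter <- new.env()
counter$n <- 0
check <- function(value) {
  counter$n <- counter$n + 1
  value
}

# Single &: both sides are evaluated
counter$n <- 0
res_single <- check(FALSE) & check(TRUE)
calls_single <- counter$n

# Double &&: the right side is skipped once the left side is FALSE
counter$n <- 0
res_double <- check(FALSE) && check(TRUE)
calls_double <- counter$n

calls_single  # 2
calls_double  # 1
```

Both expressions return FALSE; only the number of evaluations differs.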

# Example data
is_double
[1] FALSE  TRUE  TRUE
# Define the previous solution
move <- function(is_double) {
    if (is_double[1] & is_double[2] & is_double[3]) {
        current <- 11 # Go To Jail
    }
}
# Define the improved solution
improved_move <- function(is_double) {
    if (is_double[1] && is_double[2] && is_double[3]) {
        current <- 11 # Go To Jail
    }
}
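# microbenchmark both solutions
# Very occasionally the improved solution is actually a little slower
# This is just random chance
# Note: the bare names below time only the lookup of the function objects;
# to time the calls, pass move(is_double) and improved_move(is_double)
microbenchmark(move, improved_move, times = 1e5)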
Unit: nanoseconds
          expr min lq     mean median uq   max neval
          move  20 24 29.39710     25 27 67893 1e+05
 improved_move  20 24 27.30683     25 26 18635 1e+05
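The benchmark above passes the bare names move and improved_move, so each measured expression only evaluates a symbol and never runs the function body. A sketch of a call that times the functions themselves might look like this; the requireNamespace() guard and the smaller times value are choices made for illustration.

```r
# Example data and both solutions, as defined above
is_double <- c(FALSE, TRUE, TRUE)
move <- function(is_double) {
  if (is_double[1] & is_double[2] & is_double[3]) {
    current <- 11 # Go To Jail
  }
}
improved_move <- function(is_double) {
  if (is_double[1] && is_double[2] && is_double[3]) {
    current <- 11 # Go To Jail
  }
}

# Time the calls rather than the function objects;
# skipped quietly if microbenchmark is not installed
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    move = move(is_double),
    improved_move = improved_move(is_double),
    times = 1e4
  ))
}
```

With only three length-one comparisons, the gap between & and && is likely to be small in either case; the saving grows when the skipped operands are expensive.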
