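Introduction
Data normalization and standardization are both forms of scaling, usually referred to as Normalization or Standardization. This post records R implementations of several scaling methods. More posts are available at https://zouhua.top/.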
Scaling implementations in R
- Median scale normalization
- Robust scale normalization
- Unit scale normalization
- z-scale normalization
- Min-Max normalization
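# Method 1: median scale normalization
# Assumes `dat` is a numeric matrix or data frame with features in rows and samples in columns.
MDA_fun <- function(features){
  # features: one feature vector x = (x1, x2, ..., xn)
  value <- as.numeric(features)
  # mad() returns the median absolute deviation scaled by R's default constant 1.4826
  d_mad <- mad(value)
  x_scale <- (value - median(value)) / d_mad
  return(x_scale)
}
# apply() over rows returns the result transposed: samples in rows, features in columns
dat_s1_MDA <- apply(dat, 1, MDA_fun)
rownames(dat_s1_MDA) <- colnames(dat)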
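As a quick check of the median scaling, the sketch below builds a small hypothetical features-by-samples matrix (`toy_dat` and `median_scale` are names invented for this illustration) and applies the same formula row-wise. It shows two things that are easy to miss: the result of `apply(..., 1, ...)` comes back transposed, and each feature ends up with median 0.

```r
# Hypothetical toy data: 2 features (rows) x 5 samples (columns)
toy_dat <- matrix(c(1, 2, 3, 4, 100,
                    10, 20, 30, 40, 50),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("f1", "f2"), paste0("s", 1:5)))

# Same formula as MDA_fun, repeated so the snippet runs on its own
median_scale <- function(x) {
  x <- as.numeric(x)
  (x - median(x)) / mad(x)   # mad() uses constant = 1.4826 by default
}

toy_scaled <- apply(toy_dat, 1, median_scale)
rownames(toy_scaled) <- colnames(toy_dat)

dim(toy_scaled)               # 5 rows (samples) by 2 columns (features)
apply(toy_scaled, 2, median)  # the median of each feature is now 0
```

If the original features-by-samples layout is needed afterwards, wrapping the result in `t()` restores it.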
# Method 2: robust scale normalization
Robust_fun <- function(features){
  # features: one feature vector x = (x1, x2, ..., xn)
  value <- as.numeric(features)
  # quantile() returns the 0%, 25%, 50%, 75%, and 100% quantiles by default
  q_value <- as.numeric(quantile(value))
  # keep only the values strictly between the first and third quartiles
  remain_value <- value[value > q_value[2] & value < q_value[4]]
  mean_value <- mean(remain_value)
  sd_value <- sd(remain_value)
  x_scale <- (value - mean_value) / sd_value
  return(x_scale)
}
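To see what the trimming step does, here is a minimal sketch on a hypothetical feature with one outlier (the vector `x` is made up for illustration). It repeats the trimmed mean/sd calculation from `Robust_fun` and, for comparison only, computes a commonly used alternative that centres on the median and divides by the interquartile range. That alternative is not the method defined above.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1000)   # hypothetical feature with one outlier

# Trimmed version, as in Robust_fun: mean/sd of values strictly inside (Q1, Q3)
q <- as.numeric(quantile(x))               # 0%, 25%, 50%, 75%, 100%
inner <- x[x > q[2] & x < q[4]]            # the outlier is dropped here
trimmed_scaled <- (x - mean(inner)) / sd(inner)

# Alternative for comparison (not the method above): median and IQR
iqr_scaled <- (x - median(x)) / IQR(x)

# Plain z-score for reference: the outlier inflates both mean and sd
z_scaled <- (x - mean(x)) / sd(x)
```

Note that the trimmed version needs at least two values strictly between the quartiles; otherwise `sd()` returns `NA`.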
# Method 3: unit scale normalization
Unit_fun <- function(samples){
  # samples: one sample vector v = (v1, v2, ..., vn)
  value <- as.numeric(samples)
  # divide by the Euclidean (L2) norm so the scaled vector has length 1
  x_scale <- value / sqrt(sum(value^2))
  return(x_scale)
}
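A short sketch of unit scaling applied column-wise, assuming samples are stored in columns (the matrix `toy_dat` is hypothetical). It checks that every scaled column has Euclidean length 1.

```r
# Hypothetical data; matrix() fills by column, so the columns are (3,4), (1,2), (2,0)
toy_dat <- matrix(c(3, 4, 1, 2, 2, 0), nrow = 2)

# Same formula as Unit_fun, applied to each column (MARGIN = 2)
unit_scaled <- apply(toy_dat, 2, function(v) v / sqrt(sum(v^2)))

unit_scaled[, 1]         # 0.6 0.8
colSums(unit_scaled^2)   # every column now has squared length 1
```

Unlike the row-wise case, `apply(..., 2, ...)` keeps the original orientation. An all-zero column would produce `NaN`, since its norm is 0.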
# Method 4: z-scale normalization
Zscore_fun <- function(features){
  # features: one feature vector x = (x1, x2, ..., xn)
  value <- as.numeric(features)
  mean_value <- mean(value)
  # sd() is the sample standard deviation (denominator n - 1)
  sd_value <- sd(value)
  x_scale <- (value - mean_value) / sd_value
  return(x_scale)
}
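Base R's `scale()` computes the same z-score, but column-wise. The sketch below, on a hypothetical features-by-samples matrix, checks that a row-wise manual z-score agrees with `scale()` applied to the transposed data.

```r
# Hypothetical data: 2 features (rows) x 5 samples (columns)
toy_dat <- matrix(c(1, 2, 3, 4, 5,
                    10, 20, 30, 40, 60),
                  nrow = 2, byrow = TRUE)

# Manual row-wise z-score (same formula as Zscore_fun), transposed back
manual <- t(apply(toy_dat, 1, function(x) (x - mean(x)) / sd(x)))

# scale() centres and scales columns, so transpose before and after
builtin <- t(scale(t(toy_dat)))

all.equal(manual, builtin, check.attributes = FALSE)   # TRUE
```

A constant feature has a standard deviation of 0 and yields `NaN` with either approach.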
# Method 5: Min-Max normalization
Min_Max_fun <- function(features){
  # features: one feature vector x = (x1, x2, ..., xn)
  value <- as.numeric(features)
  min_value <- min(value)
  max_value <- max(value)
  # maps the minimum to 0 and the maximum to 1
  x_scale <- (value - min_value) / (max_value - min_value)
  return(x_scale)
}
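Min-Max scaling divides by `max - min`, which is 0 for a constant feature. One possible guard is sketched below; `min_max_safe` is a hypothetical helper, and mapping constant features to 0 is an assumption made here for illustration, not part of the original method. The sketch also assumes there are no missing values.

```r
# Hypothetical variant of Min_Max_fun that guards against a zero range
min_max_safe <- function(x) {
  x <- as.numeric(x)
  rng <- range(x)                    # c(min, max)
  if (rng[1] == rng[2]) {
    return(rep(0, length(x)))        # assumption: constant features map to 0
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

min_max_safe(c(2, 4, 6, 10))   # 0, 0.25, 0.5, 1
min_max_safe(c(5, 5, 5))       # 0, 0, 0 instead of NaN
```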
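Methods 1, 2, 4, and 5 all subtract one statistic and then divide by another, while method 3 divides a vector by its own length (its Euclidean norm). The former are typically applied to row vectors (features) and the latter to column vectors (samples), though this is not a hard rule.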
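The two families can be written out roughly as follows, for a feature vector $x = (x_1, \dots, x_n)$ and a sample vector $v$. The subscript $\mathrm{IQ}$ is notation introduced here (not in the original) for statistics computed only on the values strictly between the first and third quartiles, and $\mathrm{MAD}$ follows R's `mad()`, which includes the constant 1.4826.

```latex
% Methods 1, 2, 4, 5: subtract a statistic a(x), divide by a statistic b(x)
x_i' = \frac{x_i - a(x)}{b(x)}, \qquad
(a, b) \in \bigl\{
  (\operatorname{median}, \operatorname{MAD}),\;
  (\operatorname{mean}_{\mathrm{IQ}}, \operatorname{sd}_{\mathrm{IQ}}),\;
  (\operatorname{mean}, \operatorname{sd}),\;
  (\min, \max - \min)
\bigr\}

% Method 3: divide by the Euclidean norm of the vector
v_j' = \frac{v_j}{\lVert v \rVert_2} = \frac{v_j}{\sqrt{\sum_k v_k^2}}
```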