High-Throughput Screening: A Python Script for Calculating the Z'-Factor
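Formula:
Z'-factor = 1 - 3 * (σp + σn) / |μp - μn|
where μp and σp are the mean and standard deviation of the positive controls, and μn and σn are those of the negative controls.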
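As a quick check of the formula, here is a minimal sketch of it as a function. The name `z_prime_factor` and the `ddof` parameter are illustrative assumptions, not part of the original script; `ddof=0` matches NumPy's default (population) standard deviation.

```python
import numpy as np


def z_prime_factor(pos, neg, ddof=0):
    """Return 1 - 3 * (σp + σn) / |μp - μn| for two sets of control readings.

    ddof=0 gives the population standard deviation (NumPy's default);
    ddof=1 gives the sample standard deviation.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    separation = abs(pos.mean() - neg.mean())
    if separation == 0:
        # The formula divides by |μp - μn|, so equal means are undefined
        raise ValueError("positive and negative control means are equal")
    return 1 - 3 * (pos.std(ddof=ddof) + neg.std(ddof=ddof)) / separation


# Toy readings: σp = σn = 10, μp = 100, μn = 20, so Z' = 1 - 3 * 20 / 80
example = z_prime_factor([90, 110], [10, 30])
print(example)  # 0.25
```

The explicit check for equal means is a design choice of this sketch; without it NumPy would emit a division warning and return infinity or NaN.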
The code is as follows:
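#!/usr/bin/env python
# -*- encoding: utf-8 -*-
"""
@File : z_factor.py
@Time : 2022/09/02 17:02:20
@Author : aqy
@Contact : aqy0716@163.com
@Department : GZlab
@Desc : None

1. Load the CSV file and build the data set. The first row holds the column
   names; column 1 is the positive control and column 2 the negative control.
2. Use the statistics functions to compute the means and standard deviations.
3. Compute the Z'-factor.
"""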
# Imports
from csv import reader

import numpy as np
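filename = input("Enter the file name: ")
# Read every row of the CSV file (header included) into a list of lists
with open(filename, 'rt', encoding='UTF-8') as raw_data:
    readers = reader(raw_data, delimiter=',')
    x = list(readers)
# All cells are still strings at this point
data = np.array(x)
print(data)
print(data.shape)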
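# Skip the header row; column 0 is the positive control, column 1 the negative control
pos = data[1:, 0]
neg = data[1:, 1]
print(pos, neg)
# Convert the string cells to integers
pos = [int(v) for v in pos]
neg = [int(v) for v in neg]
# Means of the positive and negative controls (μp, μn)
up = np.mean(pos)
un = np.mean(neg)
print(up, un)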
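# Standard deviations of the positive and negative controls (σp, σn).
# np.std defaults to the population standard deviation (ddof=0);
# pass ddof=1 for the sample standard deviation.
qp = np.std(pos)
qn = np.std(neg)
print(qp, qn)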
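# Reference notes: mean, variance, and standard deviation of a list with NumPy
# a = [1, 2, 3, 4, 5, 6]
# # mean
# a_mean = np.mean(a)
# # variance (population, ddof=0 by default)
# a_var = np.var(a)
# # standard deviation (sample, ddof=1)
# a_std = np.std(a, ddof=1)
# print("mean: %f" % a_mean)
# print("variance: %f" % a_var)
# print("standard deviation: %f" % a_std)
# # An axis argument can also be passed:
# # axis=0 computes the value for each column,
# # axis=1 computes the value for each row,
# # with no axis the value is computed over all elements
# x_mean = np.mean(x, axis=0)
# x_var = np.var(x, axis=0)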
# Z'-factor = 1 - 3 * (σp + σn) / |μp - μn|
numerator = 3 * (qp + qn)
denominator = abs(up - un)
z = 1 - (numerator / denominator)
print("z-factor is : ", z)
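The loading step above reads the file with `csv.reader` and converts the cells by hand. As an alternative, the same two columns can be read in one call with `np.loadtxt`. This is only a sketch: the function name `load_controls` is hypothetical, and it assumes the file has exactly one header row followed by purely numeric cells.

```python
import io

import numpy as np


def load_controls(source):
    """Read a two-column CSV (header row, then numbers) into two float arrays.

    `source` may be a file path or an open text file object.
    """
    # skiprows=1 drops the header; unpack=True returns one array per column
    pos, neg = np.loadtxt(source, delimiter=",", skiprows=1, unpack=True)
    return pos, neg


# In-memory stand-in for a CSV file with the layout the script expects
sample = io.StringIO("positive,negative\n2000,50\n2001,59\n1985,23\n")
pos, neg = load_controls(sample)
print(pos, neg)
```

Note that `np.loadtxt` returns floats rather than the integers produced by the script, which makes no difference to the means and standard deviations.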
Sample data:
positive,negtive
2000,50
2001,59
1985,23
1869,95
1794,64
2103,75
The output is as follows:
Microsoft Windows [Version 10.0.22000.856]
(c) Microsoft Corporation. All rights reserved.
D:\Coding\python_gzlab_docu>D:/ruanjian/anaconda3/Scripts/activate.bat
(base) D:\Coding\python_gzlab_docu>D:/ruanjian/anaconda3/python.exe "d:/Coding/python_gzlab_docu/gzlab_python_do/Statistical Analysis/z_factor/z_factor.py"
Enter the file name: D:\Coding\python_gzlab_docu\gzlab_python_do\Statistical Analysis\z_factor\drug.csv
[['positive' 'negtive']
 ['2000' '50']
 ['2001' '59']
 ['1985' '23']
 ['1869' '95']
 ['1794' '64']
 ['2103' '75']]
(7, 2)
['2000' '2001' '1985' '1869' '1794' '2103'] ['50' '59' '23' '95' '64' '75']
1958.6666666666667 61.0
100.15099711047425 22.098265391956296
z-factor is : 0.8067375087788732
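The standard deviations printed above (100.15 and 22.10) are what `np.std` returns by default, that is, the population standard deviation (`ddof=0`). With only six wells per control, the sample standard deviation (`ddof=1`) is noticeably larger. Which one to use depends on your lab's convention; the sketch below only shows how much the choice shifts the result on the sample data.

```python
import numpy as np

# Sample data from this post
pos = np.array([2000, 2001, 1985, 1869, 1794, 2103])
neg = np.array([50, 59, 23, 95, 64, 75])

results = {}
for ddof in (0, 1):
    # Sum of the two standard deviations under the chosen ddof
    spread = pos.std(ddof=ddof) + neg.std(ddof=ddof)
    results[ddof] = 1 - 3 * spread / abs(pos.mean() - neg.mean())
    print(f"ddof={ddof}: Z' = {results[ddof]:.4f}")
```

On this data, `ddof=0` reproduces the 0.8067 shown in the output, while `ddof=1` gives about 0.788.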
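Interpreting the Z'-factor
z > 0: the system is usable
Within that range, 0 < z < 0.5: the screening system is acceptable
z > 0.5: the screening system is fairly ideal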
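The thresholds above can be turned into a small helper. This is a sketch under stated assumptions: the function name and the label strings are made up for illustration, and since the list leaves z = 0.5 and z ≤ 0 unspecified, the code counts exactly 0.5 as "ideal" and labels z ≤ 0 as "not usable".

```python
def classify_z_prime(z):
    """Map a Z'-factor to the categories listed above."""
    if z >= 0.5:
        # Assumption: the boundary value 0.5 is counted as ideal
        return "ideal"
    if z > 0:
        return "acceptable"
    # Assumption: anything at or below zero is treated as not usable
    return "not usable"


print(classify_z_prime(0.8067375087788732))  # ideal
```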