Structured arrays in NumPy


Author: 昵称违法 | Published: 2019-12-27 09:48

I. JIT speedup comparison


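# When a pandas computation cannot be vectorized and has to be written as an explicit loop, a NumPy structured array has a clear advantage.
# This example compares the time taken by a for loop over a structured array with and without JIT acceleration.
# The structured array in this example has 26,400 rows.
# Before a structured array can be used with JIT, any field whose dtype is object must be changed to a type NumPy supports natively [for example, strings default to object].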
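
The dtype conversion mentioned above can be sketched as follows. This is only an illustration under assumptions: the field names `day` and `open`, the sample values, and the fixed width `U10` are all hypothetical, and copying field by field is just one way to do the conversion.

```python
import numpy as np

# Hypothetical structured array with an object-typed string field
# (strings defaulting to object, as noted above).
src = np.array(
    [("2019-12-26", 10.5), ("2019-12-27", 11.0)],
    dtype=[("day", "O"), ("open", "<f8")],
)

# New dtype: every object field becomes a fixed-width unicode string
# (U10 is an assumed width); all other fields keep their type.
new_descr = [
    (name, "U10" if src.dtype[name] == np.dtype("O") else src.dtype[name])
    for name in src.dtype.names
]

# Allocate the target array and copy the data one field at a time.
fixed = np.empty(src.shape, dtype=new_descr)
for name in src.dtype.names:
    fixed[name] = src[name]

print(fixed.dtype)
```

The width has to be large enough for the longest string, otherwise values are truncated on assignment.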


import numba as nb

# JIT-compiled version
@nb.jit
def update(struct_array):
    for row in struct_array:
        row['open'] = 200
        row['high'] = 250
        #print(row['day'])

# Plain Python version with the same loop body
def update1(struct_array):
    for row in struct_array:
        row['open'] = 200
        row['high'] = 250
        #print(row['day'])

# struct_array3 is the 26,400-row structured array used in this example
%timeit update(struct_array3)

%timeit update1(struct_array3)

Results (JIT version first, then the plain loop; roughly a 450x speedup):

68.6 µs ± 1.93 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
31.3 ms ± 957 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

II. Querying data in a structured array

1. Prefer a view over a copy wherever possible. A view is a reference to the same memory and needs no new allocation, so it is faster.

# Check whether the memory is shared
year = struct_array['年']
year.base is struct_array
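
A small self-contained check of the view-versus-copy distinction might look like the sketch below. The array contents and the `value` field are made up for illustration. `np.shares_memory` is shown alongside the `.base` test because `.base` only points at `struct_array` when `struct_array` owns its own data.

```python
import numpy as np

struct_array = np.array(
    [(2015, 1.0), (2016, 2.0), (2017, 3.0)],
    dtype=[("年", "<i8"), ("value", "<f8")],
)

year_view = struct_array["年"]          # field access returns a view
year_copy = struct_array["年"].copy()   # explicit copy allocates new memory

print(year_view.base is struct_array)             # True
print(np.shares_memory(year_view, struct_array))  # True
print(np.shares_memory(year_copy, struct_array))  # False

# Writing through the view changes the original array.
year_view[0] = 2014
print(struct_array["年"][0])  # 2014
```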

2. Query a 1-D array and return the matching values

res = numpy.isin(struct_array['年'],[2016,2017])
struct_array['年'][res]

==========================
res:
array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
       False, False, False, False, False, False, False, False, False])

struct_array['年'][res] :
array([2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016,
       2016, 2016, 2016, 2016, 2016, 2016, 2016], dtype=int64)
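
The same pattern on a tiny made-up array, where the mask and the result are easy to check by eye. The sample values are assumptions, not the author's data.

```python
import numpy as np

years = np.array([2015, 2015, 2017, 2016, 2018])

# Boolean mask with the same length as years.
mask = np.isin(years, [2016, 2017])
print(mask.tolist())          # [False, False, True, True, False]
print(years[mask].tolist())   # [2017, 2016]

# invert=True selects the elements that are NOT in the list.
others = years[np.isin(years, [2016, 2017], invert=True)]
print(others.tolist())        # [2015, 2015, 2018]
```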

3. Query structured data with where and isin

First use where to get the indices, then use those indices to select the rows.

year = struct_array['年']
# numpy.where returns a tuple of index arrays, not a boolean array
idx = numpy.where(numpy.isin(year,[2016,2017]))
display(idx)
display(year[idx])
final_result = struct_array[idx]
display(final_result)


==========================
(array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
        35], dtype=int64),)
array([2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016,
       2016, 2016, 2016, 2016, 2016, 2016, 2016], dtype=int64)
array([(18, 2017, 0.1, 104.51972096, -10.82555183, '憨斑鸠'),
       (19, 2017, 0.2, 104.3501145 , -10.96938717, '憨斑鸠'),
       (20, 2017, 0.3, 103.54367631, -11.35928169, '憨斑鸠'),
       (21, 2017, 0.4, 107.41392689,  -9.6743072 , '憨斑鸠'),
       (22, 2017, 0.5, 108.28510005,  -9.85590002, '憨斑鸠'),
       (23, 2017, 0.6, 104.48715011,  -9.62250469, '憨斑鸠'),
       (24, 2017, 0.7, 100.66455001,  -9.81848412, '憨斑鸠'),
       (25, 2017, 0.8,  99.66175183,  -9.55695774, '憨斑鸠'),
       (26, 2017, 0.9, 100.40963599,  -7.01453746, '憨斑鸠'),
       (27, 2016, 0.1, 104.70750137, -22.43171061, '憨斑鸠'),
       (28, 2016, 0.2, 103.04499966, -22.55541852, '憨斑鸠'),
       (29, 2016, 0.3,  99.48432722, -23.29792662, '憨斑鸠'),
       (30, 2016, 0.4,  98.85926603, -23.8461711 , '憨斑鸠'),
       (31, 2016, 0.5,  99.34908936, -22.30951175, '憨斑鸠'),
       (32, 2016, 0.6,  97.82385895, -21.96773831, '憨斑鸠'),
       (33, 2016, 0.7,  97.66852514, -22.1247624 , '憨斑鸠'),
       (34, 2016, 0.8,  97.0840451 , -19.36211832, '憨斑鸠'),
       (35, 2016, 0.9,  96.74356454, -19.47856185, '憨斑鸠')],
      dtype=[('Unnamed: 0', '<i8'), ('年', '<i8'), ('百分比', '<f8'), ('个股最终收益', '<f8'), ('个股最大回撤', '<f8'), ('名字', 'O')])
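
As a side note on the where/isin query, the boolean mask from `isin` can also index the structured array directly, and both forms return a copy rather than a view. A minimal sketch with made-up rows (the `value` field is hypothetical):

```python
import numpy as np

struct_array = np.array(
    [(2015, 1.0), (2016, 2.0), (2017, 3.0), (2018, 4.0)],
    dtype=[("年", "<i8"), ("value", "<f8")],
)

mask = np.isin(struct_array["年"], [2016, 2017])
idx = np.where(mask)            # tuple holding one index array

by_index = struct_array[idx]    # rows selected by integer indices
by_mask = struct_array[mask]    # rows selected by the mask itself

print(idx[0].tolist())                          # [1, 2]
print(by_index.tolist() == by_mask.tolist())    # True
print(np.shares_memory(by_mask, struct_array))  # False: the result is a copy
```

Because the selection is a copy, writing to `by_mask` does not modify `struct_array`, unlike the single-field view in point 1.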

III. An open question

A JIT-compiled function fn1 references numpy. When fn1 is then run through joblib, it fails with an error saying numpy is not defined.
To be investigated.
