Structured arrays in NumPy

Author: 昵称违法 | Published 2019-12-27 09:48

    I. JIT speedup comparison

    
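    # When a pandas computation cannot be vectorized and has to run as a loop,
    # a NumPy structured array has a clear advantage.
    # This example compares the time of a for loop over a structured array with and without jit.
    # The structured array in this example has 26400 rows.
    # Before jit can be applied to a structured array, any field with dtype object must be
    # changed to a type NumPy supports natively [strings, for example, default to object].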
    
    
    import numba as nb
    
    @nb.jit
    def update(struct_array):
        for row in struct_array:
            row['open'] = 200
            row['high'] = 250
            #print(row['day'])
    
    def update1(struct_array):
        for row in struct_array:
            row['open'] = 200
            row['high'] = 250
            #print(row['day'])
            
    %timeit update(struct_array3)
    
    %timeit update1(struct_array3)
    

    Results (first line: update with jit; second line: update1 without jit)

    68.6 µs ± 1.93 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    31.3 ms ± 957 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
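
The note above about replacing object-dtype fields before applying jit can be sketched as follows. This is only a minimal illustration: the sample data, the field names, the helper `replace_object_fields`, and the width of 16 characters are all assumptions, not part of the original example. Whether numba accepts the resulting fixed-width string field depends on the numba version, so treat that as something to verify.

```python
import numpy as np

# Hypothetical sample data; the field names are illustrative only.
src = np.array(
    [(1, 10.0, "a"), (2, 11.5, "bb")],
    dtype=[("day", "i8"), ("open", "f8"), ("name", "O")],
)

def replace_object_fields(arr, width=16):
    """Return a copy of arr in which every object field becomes fixed-width unicode."""
    new_descr = []
    for name in arr.dtype.names:
        field_dtype = arr.dtype[name]
        if field_dtype == np.dtype("O"):
            # Assumption: every object field holds strings no longer than `width`.
            new_descr.append((name, f"U{width}"))
        else:
            new_descr.append((name, field_dtype))
    out = np.empty(arr.shape, dtype=new_descr)
    for name in arr.dtype.names:
        out[name] = arr[name]  # copy field by field, casting where needed
    return out

converted = replace_object_fields(src)
print(converted.dtype)
```

Strings longer than the chosen width are truncated silently, so the width should be picked from the actual data.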
    

    II. Querying data in a structured array

    1. Prefer a view over a copy whenever possible. A view is a reference and allocates no new memory, so it is faster.

    # Check whether the field shares memory with the original array
    year = struct_array['年']
    year.base is struct_array
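
To make the view/copy distinction concrete, here is a small self-contained sketch. The array and its field names (`year`, `ret`) are made up for illustration. It uses `np.shares_memory`, which also works when the source array is itself a view and the `.base` check would not match.

```python
import numpy as np

# Hypothetical two-field array; the names "year" and "ret" are illustrative.
arr = np.zeros(4, dtype=[("year", "i8"), ("ret", "f8")])

year_view = arr["year"]            # field access: a view onto arr's buffer
year_copy = arr["year"].copy()     # explicit copy: new memory

year_view[0] = 2017                # writes through to arr
year_copy[1] = 2016                # does not touch arr

shares_view = np.shares_memory(year_view, arr)
shares_copy = np.shares_memory(year_copy, arr)

# Boolean-mask indexing returns a copy, not a view.
masked = arr[arr["year"] == 2017]
shares_masked = np.shares_memory(masked, arr)

print(shares_view, shares_copy, shares_masked)
```

Selecting a single field by name gives a view, while selecting rows with a mask or an index array gives a copy, so the queries in the next items allocate new memory.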
    

    2. Query a 1-D array and return the actual matching values

    import numpy

    # res is a boolean mask: True where the year is 2016 or 2017
    res = numpy.isin(struct_array['年'],[2016,2017])
    struct_array['年'][res]
    
    ==========================
    res:
    array([False, False, False, False, False, False, False, False, False,
           False, False, False, False, False, False, False, False, False,
            True,  True,  True,  True,  True,  True,  True,  True,  True,
            True,  True,  True,  True,  True,  True,  True,  True,  True,
           False, False, False, False, False, False, False, False, False])
    
    struct_array['年'][res] :
    array([2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016,
           2016, 2016, 2016, 2016, 2016, 2016, 2016], dtype=int64)
    

    3. Query structured data with where and isin

    First use where to get the matching indices, then use those indices to select the rows
    
    year = struct_array['年']
    # numpy.where returns a tuple of index arrays, not a boolean mask
    idx = numpy.where(numpy.isin(year,[2016,2017]))
    display(idx)
    display(year[idx])
    final_result = struct_array[idx]
    display(final_result)
    
    
    ==========================
    (array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
            35], dtype=int64),)
    array([2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016,
           2016, 2016, 2016, 2016, 2016, 2016, 2016], dtype=int64)
    array([(18, 2017, 0.1, 104.51972096, -10.82555183, '憨斑鸠'),
           (19, 2017, 0.2, 104.3501145 , -10.96938717, '憨斑鸠'),
           (20, 2017, 0.3, 103.54367631, -11.35928169, '憨斑鸠'),
           (21, 2017, 0.4, 107.41392689,  -9.6743072 , '憨斑鸠'),
           (22, 2017, 0.5, 108.28510005,  -9.85590002, '憨斑鸠'),
           (23, 2017, 0.6, 104.48715011,  -9.62250469, '憨斑鸠'),
           (24, 2017, 0.7, 100.66455001,  -9.81848412, '憨斑鸠'),
           (25, 2017, 0.8,  99.66175183,  -9.55695774, '憨斑鸠'),
           (26, 2017, 0.9, 100.40963599,  -7.01453746, '憨斑鸠'),
           (27, 2016, 0.1, 104.70750137, -22.43171061, '憨斑鸠'),
           (28, 2016, 0.2, 103.04499966, -22.55541852, '憨斑鸠'),
           (29, 2016, 0.3,  99.48432722, -23.29792662, '憨斑鸠'),
           (30, 2016, 0.4,  98.85926603, -23.8461711 , '憨斑鸠'),
           (31, 2016, 0.5,  99.34908936, -22.30951175, '憨斑鸠'),
           (32, 2016, 0.6,  97.82385895, -21.96773831, '憨斑鸠'),
           (33, 2016, 0.7,  97.66852514, -22.1247624 , '憨斑鸠'),
           (34, 2016, 0.8,  97.0840451 , -19.36211832, '憨斑鸠'),
           (35, 2016, 0.9,  96.74356454, -19.47856185, '憨斑鸠')],
          dtype=[('Unnamed: 0', '<i8'), ('年', '<i8'), ('百分比', '<f8'), ('个股最终收益', '<f8'), ('个股最大回撤', '<f8'), ('名字', 'O')])
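
As a side note, the boolean mask from `isin` can index the structured array directly, and the `where` step is only needed when the integer positions themselves are wanted. A minimal sketch with made-up data (the field names `idx`, `year`, `pct` are assumptions, not the columns of the original array):

```python
import numpy as np

# Hypothetical miniature of the table above; field names are illustrative.
data = np.array(
    [(0, 2018, 0.1), (1, 2017, 0.2), (2, 2016, 0.3), (3, 2015, 0.4)],
    dtype=[("idx", "i8"), ("year", "i8"), ("pct", "f8")],
)

mask = np.isin(data["year"], [2016, 2017])   # boolean mask, one entry per row
by_mask = data[mask]                          # select rows with the mask directly
positions = np.where(mask)[0]                 # integer positions of the matches
by_index = data[positions]                    # select rows by position

same = by_mask.tolist() == by_index.tolist()
print(positions, same)
```

Both selections return the same rows, and both are copies of the original data.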
    

    III. An open question

    A jit-compiled function fn1 references numpy. When fn1 is then run through joblib, it fails with "numpy not defined".
    Still to be investigated.


    Article link: https://www.haomeiwen.com/subject/mivroctx.html