NumPy使用教程

作者: JeremyL | 来源:发表于2019-01-11 21:12 被阅读3次

    首先需要了解Numpy是什么,那我们看下官网的解释;

    Numpy是python 中一个用于科学计算的基础包;除此之外,Numpy能够很的处理各维数组,具有强大的函数功能,能够整合其他语言代码(C/C++ 和 Fortran),也适用于线性代数、傅立叶变换或随机数处理等等。

    简单说下Numpy的安装,下面的命令就行了:

    python -m pip install --user numpy
    
    #Ubuntu & Debian也可以这样操作
    sudo apt-get install python-numpy
    

    下面开始浏览一下Numpy 使用说明-Numpy Tutorial

    基本概念

    NumPy中主要的操作对象就是数组(ndarray),数组中以表格的形式存放着同一类型的元素,我们可以根据索引对数组中元素进行操作。

    ndarray的属性:

    • ndarray.ndim:数组的维度个数
    • ndarray.shape:数组维度
    • ndarray.size:数组元素个数
    • ndarray.dtype: 元素总个数
    • ndarray.itemsize: 元素字节大小
    • ndarray.data:包含数组实际元素的缓冲区

    一个示例:

    >>> import numpy as np
    >>> a = np.arange(15).reshape(3, 5)
    >>> a
    array([[ 0,  1,  2,  3,  4],
           [ 5,  6,  7,  8,  9],
           [10, 11, 12, 13, 14]])
    >>> a.shape
    (3, 5)
    >>> a.ndim
    2
    >>> a.dtype.name
    'int64'
    >>> a.itemsize
    8
    >>> a.size
    15
    >>> type(a)
    <type 'numpy.ndarray'>
    >>> b = np.array([6, 7, 8])
    >>> b
    array([6, 7, 8])
    >>> type(b)
    <type 'numpy.ndarray'>
    

    数组的创建:

    使用array:

    >>> import numpy as np
    >>> a = np.array([2,3,4])
    >>> a
    array([2, 3, 4])
    >>> a.dtype
    dtype('int64')
    >>> b = np.array([1.2, 3.5, 5.1])
    >>> b.dtype
    dtype('float64')
    
    >>> a = np.array(1,2,3,4)    # WRONG,提供的是多个参数
    >>> a = np.array([1,2,3,4])  # RIGHT,提供的是一个列表
    

    使用占位符创建数组

    zeros,ones,empty

    >>> np.zeros( (3,4) )
    array([[ 0.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  0.]])
    >>> np.ones( (2,3,4), dtype=np.int16 ) # dtype can also be specified
    array([[[ 1, 1, 1, 1],
            [ 1, 1, 1, 1],
            [ 1, 1, 1, 1]],
           [[ 1, 1, 1, 1],
            [ 1, 1, 1, 1],
            [ 1, 1, 1, 1]]], dtype=int16)
    >>> np.empty( (2,3) ) # uninitialized, output may vary
    array([[  3.73603959e-262,   6.02658058e-154,   6.55490914e-260],
           [  5.30498948e-313,   3.14673309e-307,   1.00000000e+000]])
    

    其它函数;

    array, zeros, zeros_like, ones, ones_like, empty, empty_like, arange,linspace, numpy.random.rand, numpy.random.randn, fromfunction,fromfile

    打印数组

    一维数组打印出来就是一行

    二维数组打印出来就是矩阵

    三维数组打印出来就是列表矩阵

    >>> a = np.arange(6)   # 1d array
    >>> print(a)
    [0 1 2 3 4 5]
    >>>
    >>> b = np.arange(12).reshape(4,3)   # 2d array
    >>> print(b)
    [[ 0  1  2]
     [ 3  4  5]
     [ 6  7  8]
     [ 9 10 11]]
    >>>
    >>> c = np.arange(24).reshape(2,3,4)   # 3d array
    >>> print(c)
    [[[ 0  1  2  3]
      [ 4  5  6  7]
      [ 8  9 10 11]]
     [[12 13 14 15]
      [16 17 18 19]
      [20 21 22 23]]]
    

    如果一个数组太大,NumPy在打印时会自动隐藏一部分:

    >>> print(np.arange(10000))
    [   0    1    2 ..., 9997 9998 9999]
    >>>
    >>> print(np.arange(10000).reshape(100,100))
    [[   0    1    2 ...,   97   98   99]
     [ 100  101  102 ...,  197  198  199]
     [ 200  201  202 ...,  297  298  299]
     ...,
     [9700 9701 9702 ..., 9797 9798 9799]
     [9800 9801 9802 ..., 9897 9898 9899]
     [9900 9901 9902 ..., 9997 9998 9999]]
    

    如果想打印整个数组,使用set_printoptions()

    >>> np.set_printoptions(threshold=np.nan)
    

    基本操作

    算术运算

    >>> a = np.array( [20,30,40,50] )
    >>> b = np.arange( 4 )
    >>> b
    array([0, 1, 2, 3])
    >>> c = a-b
    >>> c
    array([20, 29, 38, 47])
    >>> b**2
    array([0, 1, 4, 9])
    >>> 10*np.sin(a)
    array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
    >>> a<35
    array([ True, True, False, False])
    

    数组的乘法运算:*,@,dot

    >>> A = np.array( [[1,1],
    ...             [0,1]] )
    >>> B = np.array( [[2,0],
    ...             [3,4]] )
    >>> A * B # elementwise product,对应位置数字直接相乘
    array([[2, 0],
           [0, 4]])
    >>> A @ B # matrix product,矩阵乘法
    array([[5, 4],
           [3, 4]])
    >>> A.dot(B) # another matrix product,矩阵乘法
    array([[5, 4],
           [3, 4]])
    

    +=,*=

    >>> a = np.ones((2,3), dtype=int)
    >>> b = np.random.random((2,3))
    >>> a *= 3
    >>> a
    array([[3, 3, 3],
           [3, 3, 3]])
    >>> b += a
    >>> b
    array([[ 3.417022  ,  3.72032449,  3.00011437],
           [ 3.30233257,  3.14675589,  3.09233859]])
    >>> a += b                  # b is not automatically converted to integer type
    Traceback (most recent call last):
      ...
    TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
    

    对不同数组进行操作时,结果数组对应着更精确的一种;

    >>> a = np.ones(3, dtype=np.int32)
    >>> b = np.linspace(0,pi,3)
    >>> b.dtype.name
    'float64'
    >>> c = a+b
    >>> c
    array([ 1.        ,  2.57079633,  4.14159265])
    >>> c.dtype.name
    'float64'
    >>> d = np.exp(c*1j)
    >>> d
    array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
           -0.54030231-0.84147098j])
    >>> d.dtype.name
    'complex128'
    

    一元操作符:

    >>> a = np.random.random((2,3))
    >>> a
    array([[ 0.18626021,  0.34556073,  0.39676747],
           [ 0.53881673,  0.41919451,  0.6852195 ]])
    >>> a.sum()
    2.5718191614547998
    >>> a.min()
    0.1862602113776709
    >>> a.max()
    0.6852195003967595
    

    对数组的行列进行操作,指定axis参数:

    >>> b = np.arange(12).reshape(3,4)
    >>> b
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    >>>
    >>> b.sum(axis=0)                            # sum of each column
    array([12, 15, 18, 21])
    >>>
    >>> b.min(axis=1)                            # min of each row
    array([0, 4, 8])
    >>>
    >>> b.cumsum(axis=1)                         # cumulative sum along each row
    array([[ 0,  1,  3,  6],
           [ 4,  9, 15, 22],
           [ 8, 17, 27, 38]])
    

    对数组的常规计算

    >>> B = np.arange(3)
    >>> B
    array([0, 1, 2])
    >>> np.exp(B)
    array([ 1.        ,  2.71828183,  7.3890561 ])
    >>> np.sqrt(B)
    array([ 0.        ,  1.        ,  1.41421356])
    >>> C = np.array([2., -1., 4.])
    >>> np.add(B, C)
    array([ 2.,  0.,  6.])
    

    其他函数:

    all, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil,clip, conj, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, inv,lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod,re, round, sort, std, sum, trace, transpose, var, vdot, vectorize, where

    索引、切片、迭代

    一维数组可以像列表一样进行索引,切片,迭代

    >>> a = np.arange(10)**3
    >>> a
    array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
    >>> a[2]
    8
    >>> a[2:5]
    array([ 8, 27, 64])
    >>> a[:6:2] = -1000    # equivalent to a[0:6:2] = -1000; from start to position 6, exclusive, set every 2nd element to -1000
    >>> a
    array([-1000,     1, -1000,    27, -1000,   125,   216,   343,   512,   729])
    >>> a[ : :-1]                                 # reversed a
    array([  729,   512,   343,   216,   125, -1000,    27, -1000,     1, -1000])
    >>> for i in a:
    ...     print(i**(1/3.))
    ...
    nan
    1.0
    nan
    3.0
    nan
    5.0
    6.0
    7.0
    8.0
    9.0
    

    多位数组

    >>> def f(x,y):
    ...     return 10*x+y
    ...
    >>> b = np.fromfunction(f,(5,4),dtype=int)
    >>> b
    array([[ 0,  1,  2,  3],
           [10, 11, 12, 13],
           [20, 21, 22, 23],
           [30, 31, 32, 33],
           [40, 41, 42, 43]])
    >>> b[2,3]
    23
    >>> b[0:5, 1]                       # each row in the second column of b
    array([ 1, 11, 21, 31, 41])
    >>> b[ : ,1]                        # equivalent to the previous example
    array([ 1, 11, 21, 31, 41])
    >>> b[1:3, : ]                      # each column in the second and third row of b
    array([[10, 11, 12, 13],
           [20, 21, 22, 23]])
    
    >>> b[-1]                                  # the last row. Equivalent to b[-1,:]
    array([40, 41, 42, 43])
    

    多维数组索引的一些写法

    • x[1,2,...] is equivalent to x[1,2,:,:,:],
    • x[...,3] to x[:,:,:,:,3] and
    • x[4,...,5,:] to x[4,:,:,5,:].
    >>> c = np.array( [[[  0,  1,  2],               # a 3D array (two stacked 2D arrays)
    ...                 [ 10, 12, 13]],
    ...                [[100,101,102],
    ...                 [110,112,113]]])
    >>> c.shape
    (2, 2, 3)
    >>> c[1,...]                                   # same as c[1,:,:] or c[1]
    array([[100, 101, 102],
           [110, 112, 113]])
    >>> c[...,2]                                   # same as c[:,:,2]
    array([[  2,  13],
           [102, 113]])
    

    迭代:对于多位数组,根据数组第一个维度进行迭代

    >>> for row in b:
    ...     print(row)
    ...
    [0 1 2 3]
    [10 11 12 13]
    [20 21 22 23]
    [30 31 32 33]
    [40 41 42 43]
    

    如果想对数组的每个元素都进行迭代,可以使用flat

    >>> for element in b.flat:
    ...     print(element)
    ...
    0
    1
    2
    3
    10
    11
    12
    13
    20
    21
    22
    23
    30
    31
    32
    33
    40
    41
    42
    43
    

    其它函数:

    Indexing, Indexing (reference), newaxis, ndenumerate, indices

    数组维度的操作

    改变数组的维度

    >>> a = np.floor(10*np.random.random((3,4)))
    >>> a
    array([[ 2.,  8.,  0.,  6.],
           [ 4.,  5.,  1.,  1.],
           [ 8.,  9.,  3.,  6.]])
    >>> a.shape
    (3, 4)
    
    >>> a.ravel()  # returns the array, flattened
    array([ 2.,  8.,  0.,  6.,  4.,  5.,  1.,  1.,  8.,  9.,  3.,  6.])
    >>> a.reshape(6,2)  # returns the array with a modified shape
    array([[ 2.,  8.],
           [ 0.,  6.],
           [ 4.,  5.],
           [ 1.,  1.],
           [ 8.,  9.],
           [ 3.,  6.]])
    >>> a.T  # returns the array, transposed
    array([[ 2.,  4.,  8.],
           [ 8.,  5.,  9.],
           [ 0.,  1.,  3.],
           [ 6.,  1.,  6.]])
    >>> a.T.shape
    (4, 3)
    >>> a.shape
    (3, 4)
    
    >>> a
    array([[ 2.,  8.,  0.,  6.],
           [ 4.,  5.,  1.,  1.],
           [ 8.,  9.,  3.,  6.]])
    >>> a.resize((2,6))  #修改数组本身
    >>> a
    array([[ 2.,  8.,  0.,  6.,  4.,  5.],
           [ 1.,  1.,  8.,  9.,  3.,  6.]])
    

    当进行维度调整时,如果给了参数-1,那剩下的维度自行排列;

    >>> a.reshape(3,-1)
    array([[ 2.,  8.,  0.,  6.],
           [ 4.,  5.,  1.,  1.],
           [ 8.,  9.,  3.,  6.]])
    

    函数参考:

    ndarray.shape, reshape, resize, ravel

    不同数组的组合:

    >>> a = np.floor(10*np.random.random((2,2)))
    >>> a
    array([[ 8.,  8.],
           [ 0.,  0.]])
    >>> b = np.floor(10*np.random.random((2,2)))
    >>> b
    array([[ 1.,  8.],
           [ 0.,  4.]])
    >>> np.vstack((a,b)) #按列,只适用于二维数组
    array([[ 8.,  8.],
           [ 0.,  0.],
           [ 1.,  8.],
           [ 0.,  4.]])
    >>> np.hstack((a,b))  #按行,只适用于二维数组
    array([[ 8.,  8.,  1.,  8.],
           [ 0.,  0.,  0.,  4.]])
    

    column_stack,row_stack可以将1维数组组合成二维

    >>> from numpy import newaxis
    >>> np.column_stack((a,b))     # with 2D arrays
    array([[ 8.,  8.,  1.,  8.],
           [ 0.,  0.,  0.,  4.]])
    >>> a = np.array([4.,2.])
    >>> b = np.array([3.,8.])
    >>> np.column_stack((a,b))     # returns a 2D array
    array([[ 4., 3.],
           [ 2., 8.]])
    >>> np.hstack((a,b))           # the result is different
    array([ 4., 2., 3., 8.])
    >>> a[:,newaxis]               # this allows to have a 2D columns vector
    array([[ 4.],
           [ 2.]])
    >>> np.column_stack((a[:,newaxis],b[:,newaxis]))
    array([[ 4.,  3.],
           [ 2.,  8.]])
    >>> np.hstack((a[:,newaxis],b[:,newaxis]))   # the result is the same
    array([[ 4.,  3.],
           [ 2.,  8.]])
    

    r_ 和 c_可以创建1维数组

    >>> np.r_[1:4,0,4]
    array([1, 2, 3, 0, 4])
    

    函数参考:

    hstack, vstack, column_stack, concatenate, c_, r_

    数组的拆分

    hsplit,vsplit,array_split

    >>> a = np.floor(10*np.random.random((2,12)))
    >>> a
    array([[ 9.,  5.,  6.,  3.,  6.,  8.,  0.,  7.,  9.,  7.,  2.,  7.],
           [ 1.,  4.,  9.,  2.,  2.,  1.,  0.,  6.,  2.,  2.,  4.,  0.]])
    >>> np.hsplit(a,3)   # Split a into 3
    [array([[ 9.,  5.,  6.,  3.],
           [ 1.,  4.,  9.,  2.]]), array([[ 6.,  8.,  0.,  7.],
           [ 2.,  1.,  0.,  6.]]), array([[ 9.,  7.,  2.,  7.],
           [ 2.,  2.,  4.,  0.]])]
    >>> np.hsplit(a,(3,4))   # Split a after the third and the fourth column
    [array([[ 9.,  5.,  6.],
           [ 1.,  4.,  9.]]), array([[ 3.],
           [ 2.]]), array([[ 6.,  8.,  0.,  7.,  9.,  7.,  2.,  7.],
           [ 2.,  1.,  0.,  6.,  2.,  2.,  4.,  0.]])]
    

    拷贝和查看

    对数组进行操作,有时候是创建了一个新数组,有时还是指向原来的地址;需要分辨清楚;

    函数和方法总结

    Array Creation

    arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity,linspace, logspace, mgrid, ogrid, ones, ones_like, r, zeros, zeros_like

    Conversions

    ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat

    Manipulations

    array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit,hstack, ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze,swapaxes, take, transpose, vsplit, vstack

    Questions

    all, any, nonzero, where

    Ordering

    argmax, argmin, argsort, max, min, ptp, searchsorted, sort

    Operations

    choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put,putmask, real, sum

    Basic Statistics

    cov, mean, std, var

    Basic Linear Algebra

    cross, dot, outer, linalg.svd, vdot

    数组索引

    前面使用的索引都是单个整数;下面展示将索引整理成数组来索引数组内容。

    >>> a = np.arange(12)**2                       
    >>> i = np.array( [ 1,1,3,8,5 ] )   #索引组成的一维数组          
    >>> a[i]                                     
    array([ 1,  1,  9, 64, 25])
    >>>
    >>> j = np.array( [ [ 3, 4], [ 9, 7 ] ] )    
    >>> a[j]                                       
    array([[ 9, 16],
           [81, 49]])
    

    当然,被索引的数组也可以是多维数组;此时的索引在多维数组的第一个维度上进行索引

    >>> palette = np.array( [ [0,0,0],                # black
    ...                       [255,0,0],              # red
    ...                       [0,255,0],              # green
    ...                       [0,0,255],              # blue
    ...                       [255,255,255] ] )       # white
    >>> image = np.array( [ [ 0, 1, 2, 0 ],           # each value corresponds to a color in the palette
    ...                     [ 0, 3, 4, 0 ]  ] )
    >>> palette[image]                           
    array([[[  0,   0,   0],
            [255,   0,   0],
            [  0, 255,   0],
            [  0,   0,   0]],
           [[  0,   0,   0],
            [  0,   0, 255],
            [255, 255, 255],
            [  0,   0,   0]]])
    
    >>> a = np.arange(12).reshape(3,4)
    >>> a
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    >>> i = np.array( [ [0,1],              # indices for the first dim of a
    ...                 [1,2] ] )
    >>> j = np.array( [ [2,1],              # indices for the second dim
    ...                 [3,3] ] )
    >>>
    >>> a[i,j]                        # i and j must have equal shape
    array([[ 2,  5],
           [ 7, 11]])
    >>>
    >>> a[i,2]
    array([[ 2,  6],
           [ 6, 10]])
    >>>
    >>> a[:,j]         #相当于对a的每行都进行迭代inex         
    array([[[ 2,  1],
            [ 3,  3]],
           [[ 6,  5],
            [ 7,  7]],
           [[10,  9],
            [11, 11]]])
    

    使用布尔值数组进行数组索引

    >>> a = np.arange(12).reshape(3,4)
    >>> b = a > 4
    >>> b                                          # b is a boolean with a's shape
    array([[False, False, False, False],
           [False,  True,  True,  True],
           [ True,  True,  True,  True]])
    >>> a[b]                                       # 1d array with the selected elements
    array([ 5,  6,  7,  8,  9, 10, 11])
    

    使用布尔值组成的数组

    >>> a = np.arange(12).reshape(3,4)
    >>> a
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    >>> b1 = np.array([False,True,True])             # first dim selection
    >>> b2 = np.array([True,False,True,False])       # second dim selection
    >>>
    >>> a[b1,:]                                   # selecting rows
    array([[ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    >>>
    >>> a[b1]                                     # same thing
    array([[ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    >>>
    >>> a[:,b2]                                   # selecting columns
    array([[ 0,  2],
           [ 4,  6],
           [ 8, 10]])
    >>>
    >>> a[b1,b2]                                  # a weird thing to do
    array([ 4, 10])
    

    线性代数

    >>> import numpy as np
    >>> a = np.array([[1.0, 2.0], [3.0, 4.0]])
    >>> print(a)
    [[ 1.  2.]
     [ 3.  4.]]
    
    >>> a.transpose()
    array([[ 1.,  3.],
           [ 2.,  4.]])
    
    >>> np.linalg.inv(a)
    array([[-2. ,  1. ],
           [ 1.5, -0.5]])
    
    >>> u = np.eye(2) # unit 2x2 matrix; "eye" represents "I"
    >>> u
    array([[ 1.,  0.],
           [ 0.,  1.]])
    >>> j = np.array([[0.0, -1.0], [1.0, 0.0]])
    
    >>> j @ j        # matrix product
    array([[-1.,  0.],
           [ 0., -1.]])
    
    >>> np.trace(u)  # trace
    2.0
    
    >>> y = np.array([[5.], [7.]])
    >>> np.linalg.solve(a, y)
    array([[-3.],
           [ 4.]])
    
    >>> np.linalg.eig(j)
    (array([ 0.+1.j,  0.-1.j]), array([[ 0.70710678+0.j        ,  0.70710678-0.j        ],
           [ 0.00000000-0.70710678j,  0.00000000+0.70710678j]]))
    

    向量堆积

    x = np.arange(0,10,2)                     # x=([0,2,4,6,8])
    y = np.arange(5)                          # y=([0,1,2,3,4])
    m = np.vstack([x,y])                      # m=([[0,2,4,6,8],
                                              #     [0,1,2,3,4]])
    xy = np.hstack([x,y])                     # xy =([0,2,4,6,8,0,1,2,3,4])
    

    直方图

    >>> import numpy as np
    >>> import matplotlib.pyplot as plt
    >>> # Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
    >>> mu, sigma = 2, 0.5
    >>> v = np.random.normal(mu,sigma,10000)
    >>> # Plot a normalized histogram with 50 bins
    >>> plt.hist(v, bins=50, density=1)       # matplotlib version (plot)
    >>> plt.show()
    
    ../_images/quickstart-2_00_00.png
    >>> # Compute the histogram with numpy and then plot it
    >>> (n, bins) = np.histogram(v, bins=50, density=True)  # NumPy version (no plot)
    >>> plt.plot(.5*(bins[1:]+bins[:-1]), n)
    >>> plt.show()
    
    image

    参考学习资料:

    相关文章

      网友评论

        本文标题:NumPy使用教程

        本文链接:https://www.haomeiwen.com/subject/vmyfdqtx.html