7 Activation Functions - PyTorch Dissected

Author: readilen | Published 2018-11-04 10:53

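PyTorch implements most of the common activation functions, and you can also define your own. The implementations live in torch.nn.functional. Every activation function also has a corresponding module class, but that class ultimately calls into torch.nn.functional. Once you have read the definitions, you will be able to write a custom activation yourself. Let's start with the earliest activation functions.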
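As a minimal sketch of that pattern, here is a custom activation written first as a plain function and then wrapped in a module class. The names `swish` and `Swish` are hypothetical and chosen only for illustration; they are not taken from the PyTorch source quoted in this article.

```python
import torch
import torch.nn as nn


def swish(input):
    """Hypothetical custom activation: x * sigmoid(x), applied element-wise."""
    return input * torch.sigmoid(input)


class Swish(nn.Module):
    """Module wrapper that only delegates to the function above,
    mirroring how the built-in activation modules call torch.nn.functional."""

    def forward(self, input):
        return swish(input)


x = torch.tensor([-2.0, 0.0, 2.0])
y = Swish()(x)
print(y)
```

Because the function is built from differentiable tensor operations, autograd handles the backward pass and no hand-written gradient is needed.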

sigmoid

def sigmoid(input):
    r"""sigmoid(input) -> Tensor

    Applies the element-wise function :math:`\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}`

    See :class:`~torch.nn.Sigmoid` for more details.
    """
    warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
    return input.sigmoid()

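Figure: Sigmoid

The source shows that this activation simply calls the tensor's own sigmoid method. Its range is (0, 1), so every value in the data is squashed into that interval, which makes it a good fit for mapping to probabilities. Used as an activation function, however, it has the following drawbacks:

Neurons saturate easily: once the input falls outside roughly [-5, 5] the gradient is almost 0, so weight updates become very slow.
The output is not zero-centered: it is always positive, so the negative half of the range is effectively thrown away.
It is a little expensive to compute, since every evaluation needs an exponential, and it pays to be a miser about memory and compute.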
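The saturation point can be checked with autograd. This is a small sketch; the three sample inputs are arbitrary choices, not values from the original article.

```python
import torch

# Inputs at the center and in the saturated tail of the sigmoid.
x = torch.tensor([0.0, 5.0, 10.0], requires_grad=True)
y = torch.sigmoid(x)

# Summing gives a scalar, so backward() yields dy/dx for every element.
y.sum().backward()

print(y)       # every output lies strictly between 0 and 1
print(x.grad)  # largest at x = 0 (0.25), shrinking quickly as x grows
```

The gradient of sigmoid is s(x) * (1 - s(x)), which peaks at 0.25 and is already below 0.01 at x = 5.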

tanh

def tanh(input):
    r"""tanh(input) -> Tensor

    Applies element-wise,
    :math:`\text{Tanh}(x) = \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}`

    See :class:`~torch.nn.Tanh` for more details.
    """
    warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
    return input.tanh()

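Figure: tanh

The range of this function is (-1, 1) and it is centered on 0, which avoids sigmoid's zero-centering problem. It still saturates, though, so the vanishing-gradient problem remains, and going by the formula above it is even more expensive to compute.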
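A short sketch of both properties, zero-centered output and saturating gradient, on an arbitrary symmetric grid of inputs. It also checks the identity tanh(x) = 2 * sigmoid(2x) - 1, which shows how closely the two functions are related.

```python
import torch

# 13 evenly spaced points from -6 to 6; index 6 is exactly x = 0.
x = torch.linspace(-6.0, 6.0, steps=13, requires_grad=True)
y = torch.tanh(x)
y.sum().backward()

# tanh is a rescaled, shifted sigmoid.
xd = x.detach()
same = torch.allclose(y.detach(), 2 * torch.sigmoid(2 * xd) - 1, atol=1e-6)

print(same)
print(y.mean())                          # close to 0 for symmetric inputs
print(x.grad[0], x.grad[6], x.grad[-1])  # tiny at both ends, 1 at x = 0
```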

relu

def relu(input, inplace=False):
    if inplace:
        return torch.relu_(input)
    return torch.relu(input)

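Figure: ReLU

relu is defined as max(0, x). It does not saturate for positive inputs, which relieves the vanishing-gradient problem, and it is cheap to compute because it is linear on the positive side. Networks using it have been reported to converge roughly 6 times faster than with Sigmoid/tanh, and some sources also suggest it resembles how biological neurons fire. It does introduce a new problem: neurons can die. The function zeroes out every negative input, so the gradient there is 0 as well, and a neuron whose input stays negative stops updating.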
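The dying-neuron effect follows directly from the gradient, as this small sketch with arbitrary sample inputs shows.

```python
import torch

x = torch.tensor([-3.0, -0.5, 0.5, 3.0], requires_grad=True)
y = torch.relu(x)
y.sum().backward()

print(y)       # negative inputs are clamped to 0
print(x.grad)  # gradient is 0 for negative inputs and 1 for positive ones
```

Any weight that only ever feeds negative values into relu receives a zero gradient and cannot recover through gradient descent alone.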

Leaky Relu

def leaky_relu(input, negative_slope=0.01, inplace=False):
    r"""
    leaky_relu(input, negative_slope=0.01, inplace=False) -> Tensor

    Applies element-wise,
    :math:`\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} * \min(0, x)`

    See :class:`~torch.nn.LeakyReLU` for more details.
    """
    if inplace:
        return torch._C._nn.leaky_relu_(input, negative_slope)
    return torch._C._nn.leaky_relu(input, negative_slope)

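Figure: Leaky ReLU

To handle negative inputs, ReLU gained a variant defined as max(0.01*x, x), where 0.01 is the default negative_slope. It does not saturate, it is cheap to compute, and neurons no longer die because negative inputs still receive a small gradient.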
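The following sketch compares F.leaky_relu with the max(0.01*x, x) form and inspects the gradient. The sample inputs are arbitrary.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-3.0, -0.5, 0.5, 3.0], requires_grad=True)
y = F.leaky_relu(x, negative_slope=0.01)
y.sum().backward()

# For a slope between 0 and 1 the two formulations agree.
xd = x.detach()
manual = torch.max(0.01 * xd, xd)

print(y)
print(x.grad)  # 0.01 on the negative side instead of 0
```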

PRelu

def prelu(input, weight):
    r"""prelu(input, weight) -> Tensor

    Applies element-wise the function
    :math:`\text{PReLU}(x) = \max(0,x) + \text{weight} * \min(0,x)` where weight is a
    learnable parameter.

    See :class:`~torch.nn.PReLU` for more details.
    """
    return torch.prelu(input, weight)

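Figure: PReLU

This function is defined as max(ax, x). Unlike the fixed slope in Leaky ReLU, the parameter a (the weight argument above) is learnable and is updated during training.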
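To see that the slope really is a trainable parameter, this sketch uses the nn.PReLU module, which by default holds a single slope initialised to 0.25, and checks that the slope receives a gradient. The inputs are arbitrary.

```python
import torch
import torch.nn as nn

act = nn.PReLU()  # one learnable slope, initialised to 0.25 by default
x = torch.tensor([-2.0, -1.0, 1.0, 2.0])
y = act(x)
y.sum().backward()

print(y)                # negative inputs are scaled by the current slope
print(act.weight.grad)  # equals the sum of the negative inputs, here -3
```

An optimizer that is given act.parameters() would therefore update the slope together with the other weights.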

ELU (Exponential Linear Unit)

def elu(input, alpha=1., inplace=False):
    r"""Applies element-wise,
    :math:`\text{ELU}(x) = \max(0,x) + \min(0, \alpha * (\exp(x) - 1))`.

    See :class:`~torch.nn.ELU` for more details.
    """
    if inplace:
        return torch._C._nn.elu_(input, alpha)
    return torch._C._nn.elu(input, alpha)

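Figure: ELU

This function is defined as x for x > 0 and a*(exp(x)-1) otherwise. It keeps all the advantages of ReLU, though it costs a bit more to compute. Its outputs have a mean closer to 0, and with a = 1 its first derivative is continuous everywhere, so the curve looks nice and smooth. Negative inputs are handled well and it is quite robust. Nice! After we cover batch normalization we will show a small example in which it actually beat batch normalization.
Other variants followed.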
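A quick check of the piecewise form against F.elu, plus a look at the gradient on either side of 0. This is a sketch with arbitrary sample points and alpha = 1.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-5.0, -1.0, -1e-3, 1e-3, 1.0], requires_grad=True)
y = F.elu(x, alpha=1.0)
y.sum().backward()

# Piecewise form: x for x > 0, alpha * (exp(x) - 1) otherwise.
xd = x.detach()
manual = torch.where(xd > 0, xd, torch.exp(xd) - 1)

print(y)       # negative outputs level off above -alpha
print(x.grad)  # nearly equal just left and right of 0, so no kink
```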

SELU

def selu(input, inplace=False):
    r"""selu(input, inplace=False) -> Tensor

    Applies element-wise,
    :math:`\text{SELU}(x) = scale * (\max(0,x) + \min(0, \alpha * (\exp(x) - 1)))`,
    with :math:`\alpha=1.6732632423543772848170429916717` and
    :math:`scale=1.0507009873554804934193349852946`.

    See :class:`~torch.nn.SELU` for more details.
    """
    if inplace:
        return torch.selu_(input)
    return torch.selu(input)

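Figure: SELU

As the docstring shows, SELU is ELU with a fixed alpha and an extra fixed scale factor.

There are other variants as well, such as relu6 and celu.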
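The relationships between these variants can be checked numerically. In this sketch the two constants are copied from the selu docstring quoted above, and the input grid is arbitrary.

```python
import torch
import torch.nn.functional as F

# Constants copied from the selu docstring.
alpha = 1.6732632423543772848170429916717
scale = 1.0507009873554804934193349852946

x = torch.linspace(-3.0, 3.0, steps=7)

# SELU is a scaled ELU with a fixed alpha.
selu_matches = torch.allclose(F.selu(x), scale * F.elu(x, alpha=alpha), atol=1e-6)

# relu6 is relu capped at 6.
relu6_matches = torch.equal(F.relu6(x * 3), torch.clamp(x * 3, 0.0, 6.0))

# celu with alpha = 1 coincides with elu with alpha = 1.
celu_matches = torch.allclose(F.celu(x, alpha=1.0), F.elu(x, alpha=1.0), atol=1e-6)

print(selu_matches, relu6_matches, celu_matches)
```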

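Here are some rules of thumb for choosing among these activation functions:

Start with ReLU, then tune the learning rate gradually.
Try Leaky ReLU or ELU.
Give tanh a try, but do not expect much.
Do not bother with sigmoid.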
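Trying the candidates in that order is easy when the activation is a swappable argument. The helper make_mlp and the layer sizes below are hypothetical, made up for this sketch only.

```python
import torch
import torch.nn as nn


def make_mlp(activation):
    """Hypothetical helper: a tiny MLP whose activation module is swappable."""
    return nn.Sequential(nn.Linear(8, 16), activation, nn.Linear(16, 1))


# Candidates in the order suggested by the list above.
candidates = {
    "relu": nn.ReLU(),
    "leaky_relu": nn.LeakyReLU(0.01),
    "elu": nn.ELU(),
    "tanh": nn.Tanh(),
}

x = torch.randn(4, 8)
shapes = {name: make_mlp(act)(x).shape for name, act in candidates.items()}
print(shapes)
```

In a real experiment each model would be trained and compared on a validation set; the sketch only shows that swapping the activation requires no other change to the network.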

Article link: https://www.haomeiwen.com/subject/jetttqtx.html
