Day 1: Installing and Using Python, and a Machine Learning Overview

Author: 離枝 | Published 2017-02-21 17:58 | Read 0 times

    Tasks

    • Reading
      • Machine Learning in Action (Chapter 1)
    • Practice
      • Install a Python environment
      • Import NumPy, BeautifulSoup, and the other modules needed for data mining and machine learning
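One quick way to confirm the environment is ready is to try importing each module and report what is missing. The sketch below is only an illustration: the helper name `check_modules` and the module list are assumptions, not part of the original post. Note that the BeautifulSoup package is imported under the name `bs4`.

```python
import importlib

# Import names to check; "bs4" is the import name of the BeautifulSoup package.
# This list is an assumption based on the modules mentioned in the task.
MODULES = ["numpy", "bs4", "matplotlib"]

def check_modules(names):
    """Return {module name: version string, or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report

if __name__ == "__main__":
    for name, version in check_modules(MODULES).items():
        print(name, "->", version if version else "NOT INSTALLED")
```

Any module reported as not installed can usually be added with `pip install <package name>`.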

    Practice

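    # Official matplotlib demo: 3D surface with filled contours projected onto the walls
    from mpl_toolkits.mplot3d import axes3d
    import matplotlib.pyplot as plt
    from matplotlib import cm
    
    fig = plt.figure()
    # Newer matplotlib releases reject fig.gca(projection='3d');
    # add_subplot(111, projection='3d') works on old and new versions.
    ax = fig.add_subplot(111, projection='3d')
    X, Y, Z = axes3d.get_test_data(0.05)  # sample surface data bundled with matplotlib
    ax.plot_surface(X, Y, Z, rstride=8, cstride=8, alpha=0.3)
    # Project filled contours onto the z, x, and y walls of the plot
    ax.contourf(X, Y, Z, zdir='z', offset=-100, cmap=cm.coolwarm)
    ax.contourf(X, Y, Z, zdir='x', offset=-40, cmap=cm.coolwarm)
    ax.contourf(X, Y, Z, zdir='y', offset=40, cmap=cm.coolwarm)
    
    # Axis labels and limits, chosen so the projected contours stay visible
    ax.set_xlabel('X')
    ax.set_xlim(-40, 40)
    ax.set_ylabel('Y')
    ax.set_ylim(-40, 40)
    ax.set_zlabel('Z')
    ax.set_zlim(-100, 100)
    plt.show()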
    

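    Reading

    Machine Learning in Action, Chapter 1: Machine learning basics

    • Key points of this chapter
      • Machine learning lets us extract meaningful information from "raw" datasets so that we can gain insight from them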

      With machine learning we can gain insight from a dataset, not cyborg rote memorization, and not the creation of sentient beings.
      Machine learning is turning data into information.

    • Machine learning is a discipline that applies statistics; statistics is needed because many real-world problems are not deterministic

    Machine learning uses statistics. There are many problems where the solution isn’t deterministic. That is, we don’t know enough about the problem or don’t have enough computing power to properly model the problem. For these problems we need statistics.

    • Categories of machine learning problems


    • General steps of machine learning
      • Collect data
      • Prepare the input data
      • Analyze the input data
      • Train the algorithm
      • Test the algorithm
      • Use it
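The six steps above can be sketched end to end in a few lines of NumPy. Everything in this example is an assumption made for illustration: the data is synthetic, and the nearest-centroid classifier is simply the smallest model that fits all six steps, not an algorithm taken from the book chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Collect data: two made-up classes of 2-D points (synthetic, not from the book).
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
class_b = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))

# 2. Prepare the input data: one feature matrix and one label vector.
X = np.vstack([class_a, class_b])
y = np.array([0] * 50 + [1] * 50)

# 3. Analyze the input data: a quick look at the shape and per-feature ranges.
print("shape:", X.shape, "min:", X.min(axis=0), "max:", X.max(axis=0))

# Hold out 20 of the 100 shuffled examples for testing.
order = rng.permutation(len(X))
test_idx, train_idx = order[:20], order[20:]

# 4. Train the algorithm: store the mean (centroid) of each class.
centroids = np.array(
    [X[train_idx][y[train_idx] == c].mean(axis=0) for c in (0, 1)]
)

def predict(points):
    """Assign each point to the class whose centroid is nearest."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

# 5. Test the algorithm: accuracy on the held-out examples.
accuracy = (predict(X[test_idx]) == y[test_idx]).mean()
print("test accuracy:", accuracy)

# 6. Use it: classify a new, unseen point.
print("prediction:", predict(np.array([[2.8, 3.1]]))[0])
```

In a real project each step is usually far larger (data collection and cleaning often dominate), but the order of the steps stays the same.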
    • Glossary
    • expert systems: systems that can handle specialized problems the way an expert in that field would

    By creating a computer program to recognize birds, we’ve replaced an ornithologist with a computer. The ornithologist is a bird expert, so we’ve created an expert system.

    • features/attributes: similar to labels; they describe the properties of a thing

    A feature can take the following kinds of values:

    • numeric
    • binary
    • enumeration

    • classification: deciding which category an instance belongs to

    For the moment, assume we have all that information. How do we then decide if a bird at our feeder is an Ivory-billed Woodpecker or something else? This task is called classification.
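Before a classifier can work with features of different kinds (numeric, binary, enumeration), they are usually turned into a single numeric vector. The sketch below shows one common way to do that. The bird attributes, the color list, and the `encode` helper are all invented for illustration; one-hot encoding of the enumeration feature is a design choice made here, not something the chapter prescribes.

```python
import numpy as np

# Hypothetical enumeration feature: the allowed back colors (invented list).
BACK_COLORS = ["brown", "gray", "black", "green"]

def encode(weight_g, webbed_feet, back_color):
    """Turn one bird description into a numeric feature vector.

    weight_g    -> numeric feature, kept as a float
    webbed_feet -> binary feature, stored as 0.0 or 1.0
    back_color  -> enumeration feature, one-hot encoded over BACK_COLORS
    """
    color_one_hot = [1.0 if back_color == c else 0.0 for c in BACK_COLORS]
    return np.array([float(weight_g), 1.0 if webbed_feet else 0.0] + color_one_hot)

# Made-up example values.
vector = encode(weight_g=1000.1, webbed_feet=False, back_color="brown")
print(vector)
```

One-hot encoding avoids implying an order among the colors, which a single integer code such as brown=0, gray=1 would do.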

    • regression: predicting a numeric value by capturing the pattern in how the values change

    Regression is the prediction of a numeric value.
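As a minimal illustration of predicting a numeric value, the sketch below fits a straight line to a handful of points with `numpy.polyfit` and uses it to predict a new value. The data points are invented, and a straight line is only an assumed model shape.

```python
import numpy as np

# Invented example data: y is roughly 2 * x + 1 with a little noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Least-squares fit of a straight line y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

def predict(new_x):
    """Predict the numeric target for a new input using the fitted line."""
    return slope * new_x + intercept

print("slope:", slope, "intercept:", intercept)
print("prediction at x = 5:", predict(5.0))
```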

    • training set: the data used to train the algorithm

    A training set is the set of training examples we’ll use to train our machine learning algorithms.

    • target variable: the value the machine learning algorithm tries to predict

    The target variable is what we’ll be trying to predict with our machine learning algorithms. In classification the target variable takes on a nominal value, and in the task of regression its value could be continuous.

    • test set: a dataset split off from the available data and kept separate from the training set, used to test the algorithm's accuracy

    To test machine learning algorithms what’s usually done is to have a training set of data and a separate dataset, called a test set.
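A simple way to obtain a separate test set is to shuffle the examples and hold some of them out. The helper below is a hypothetical sketch: the name `split_train_test`, the default 20% test fraction, and the toy arrays are assumptions, not something specified in the book excerpt.

```python
import numpy as np

def split_train_test(X, y, test_fraction=0.2, seed=0):
    """Shuffle the examples, then hold out a fraction of them as the test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Toy data: 10 examples with 2 features each, and one target value per example.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

X_train, y_train, X_test, y_test = split_train_test(X, y)
print(len(X_train), "training examples,", len(X_test), "test examples")
```

The algorithm is trained only on `X_train` and `y_train`; comparing its predictions on `X_test` with `y_test` then gives an estimate of how well it handles data it has not seen.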

    • supervised learning: machine learning guided by humans, for example by labeling the data with the values to predict

    This set of problems is known as supervised because we’re telling the algorithm what to predict.

    • unsupervised learning: machine learning without human-provided labels, for example clustering

    In unsupervised learning, there’s no label or target value given for the data. A task where we group similar items together is known as clustering.
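To make clustering concrete, here is a small k-means sketch in NumPy that groups unlabeled points. It rests on several assumptions: the points are invented, k-means is just one of many clustering algorithms, and the initial centers are picked at evenly spaced positions in the array to keep the example deterministic, whereas practical implementations typically use random or k-means++ initialization.

```python
import numpy as np

def k_means(points, k, n_iter=20):
    """Group points into k clusters; returns (labels, centers)."""
    # Simplified, deterministic initialization: evenly spaced rows of the input.
    start = np.linspace(0, len(points) - 1, k).astype(int)
    centers = points[start].astype(float)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers

# Invented, unlabeled data: two visibly separate groups of points.
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [8, 8], [8, 9], [9, 8], [9, 9]], dtype=float)

labels, centers = k_means(points, k=2)
print("labels:", labels)
print("centers:", centers)
```

No target values are passed in at any point: the grouping comes only from the distances between the points themselves.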

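    Resources

    Python official website
    Python official website: Windows downloads
    Blog post on preparing a Python environment
    Learning Python on Codecademy
    The ultimate guide to installing Python and pip on Windows
    Installing and configuring NumPy, SciPy, and Matplotlib for Python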


        Article link: https://www.haomeiwen.com/subject/ncsiittx.html