Day 1: Installing and Using Python, and an Overview of Machine Learning

Author: 離枝 | Published 2017-02-21 17:58

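Tasks

Reading
- Machine Learning in Action (Chapter 1)
Practice
- Set up a Python environment
- Install and import NumPy, BeautifulSoup, and the other modules needed for data mining and machine learning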

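Practice

Installing Python
Download the Python release for your operating system (for example, Windows) from the official Python website
On Windows, add Python to the PATH environment variable
Install an IDE such as PyCharm (optional)
Installing third-party Python modules
Download pip
Extract it to a folder, change into the extracted directory, and run python setup.py install in cmd
Add <install drive>:\<Python version>\Scripts to the PATH environment variable
Install the beautifulsoup4 module from cmd: pip install beautifulsoup4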
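Installing numpy
- Download the numpy wheel file built for 64-bit Python 3
- Install wheel with pip: pip install wheel
- In the folder that contains the numpy wheel, run pip install numpy-1.12.0+mkl-cp36-cp36m-win_amd64.whl in cmd
- Check that numpy installed correctly: in the Python shell, run import numpy and then numpy.random.rand(4, 4)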
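Installing scikit-learn
- scikit-learn depends on scipy, so download and install scipy first
- Download the matching scikit-learn wheel file
- In the folder that contains the wheel file, run pip install scikit_learn-0.18.1-cp36-cp36m-win_amd64.whl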
Installing matplotlib
- Download the matplotlib wheel file (http://www.lfd.uci.edu/~gohlke/pythonlibs/#matplotlib) and install it
- Test the matplotlib installation

# Official demo: a 3D surface with filled contours projected onto each wall
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import cm

fig = plt.figure()
# fig.gca(projection='3d') no longer works in recent Matplotlib releases,
# so create the 3D axes with add_subplot instead.
ax = fig.add_subplot(111, projection='3d')
X, Y, Z = axes3d.get_test_data(0.05)
ax.plot_surface(X, Y, Z, rstride=8, cstride=8, alpha=0.3)
# Project filled contours onto the z, x, and y walls of the plot box
cset = ax.contourf(X, Y, Z, zdir='z', offset=-100, cmap=cm.coolwarm)
cset = ax.contourf(X, Y, Z, zdir='x', offset=-40, cmap=cm.coolwarm)
cset = ax.contourf(X, Y, Z, zdir='y', offset=40, cmap=cm.coolwarm)

ax.set_xlabel('X')
ax.set_xlim(-40, 40)
ax.set_ylabel('Y')
ax.set_ylim(-40, 40)
ax.set_zlabel('Z')
ax.set_zlim(-100, 100)
plt.show()
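
Before moving on, it can help to confirm that every module from the steps above imports cleanly. The following is a minimal sketch, not part of the original walkthrough: check_modules is a hypothetical helper name, and it relies only on the standard library's importlib. Note that two import names differ from their pip package names (beautifulsoup4 is imported as bs4, scikit-learn as sklearn).

```python
import importlib


def check_modules(names):
    """Return {module_name: version string, or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    # Import names, not pip package names (bs4 = beautifulsoup4, sklearn = scikit-learn).
    wanted = ["numpy", "scipy", "sklearn", "matplotlib", "bs4"]
    for name, version in check_modules(wanted).items():
        print(name, "missing" if version is None else version)
```

Any module reported as missing points back to the corresponding install step.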

Reading

Machine Learning in Action, Chapter 1: Machine Learning Basics

Key points from this chapter
- Machine learning lets us extract meaningful information from raw datasets so that we can gain insight from it
With machine learning we can gain insight from a dataset, not cyborg rote memorization, and not the creation of sentient beings
Machine learning is turning data into information
Machine learning is a discipline that applies statistics; statistics is needed because many real-world problems are not deterministic

Machine learning uses statistics. There are many problems where the solution isn’t deterministic. That is, we don’t know enough about the problem or don’t have enough computing power to properly model the problem. For these problems we need statistics.

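Categories of machine learning problems
General steps in machine learning
- Collect data
- Prepare the input data
- Analyze the input data
- Train the algorithm
- Test the algorithm
- Use it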
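
The six steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the data is synthetic (two made-up classes "a" and "b"), the model is a nearest-centroid classifier chosen for brevity rather than taken from the book, and every function name here is hypothetical.

```python
import random


def make_data(rng, n_per_class=50):
    """Step 1 (collect): synthetic 2-D points for two made-up classes."""
    data = []
    for label, (cx, cy) in {"a": (0.0, 0.0), "b": (4.0, 4.0)}.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data


def train(examples):
    """Step 4 (train): store the mean point (centroid) of each class."""
    sums = {}
    for (x, y), label in examples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(model, point):
    """Step 6 (use): pick the class whose centroid is closest to the point."""
    x, y = point
    return min(
        model,
        key=lambda lb: (x - model[lb][0]) ** 2 + (y - model[lb][1]) ** 2,
    )


def run_pipeline(seed=0):
    rng = random.Random(seed)
    data = make_data(rng)                    # 1. collect
    rng.shuffle(data)                        # 2. prepare: shuffle, then split
    split = int(0.8 * len(data))
    train_set, test_set = data[:split], data[split:]
    counts = {}                              # 3. analyze: check class balance
    for _, label in train_set:
        counts[label] = counts.get(label, 0) + 1
    model = train(train_set)                 # 4. train
    correct = sum(predict(model, p) == lb for p, lb in test_set)
    accuracy = correct / len(test_set)       # 5. test on held-out data
    return model, counts, accuracy
```

Calling run_pipeline() returns the trained centroids, the class counts seen during analysis, and the accuracy on the held-out points; predict(model, (3.5, 4.2)) then covers the final "use" step.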
Glossary
expert systems: systems that handle specialized problems in a domain the way an expert in that field would

By creating a computer program to recognize birds, we’ve replaced an ornithologist with a computer. The ornithologist is a bird expert, so we’ve created an expert system.

features/attributes: descriptions of an item's properties, used as the input to the algorithm

A feature can take the following kinds of values:

numeric
binary
enumeration
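
As a rough illustration of the three kinds of feature values, the sketch below models a single bird record. The field names (weight_g, webbed_feet, back_color), the color choices, and the to_vector helper are all assumptions made for this example, not definitions from the book.

```python
from dataclasses import dataclass
from enum import Enum


class BackColor(Enum):
    """Enumeration feature: one value from a fixed set of choices."""
    BROWN = "brown"
    GRAY = "gray"
    BLACK = "black"


@dataclass
class Bird:
    weight_g: float        # numeric feature
    webbed_feet: bool      # binary feature
    back_color: BackColor  # enumeration feature


def to_vector(bird):
    """Turn one record into a list of numbers, one-hot encoding the enum."""
    one_hot = [1.0 if bird.back_color is color else 0.0 for color in BackColor]
    return [bird.weight_g, 1.0 if bird.webbed_feet else 0.0] + one_hot
```

Most algorithms expect numbers, which is why the enumeration is expanded into one 0/1 slot per possible value instead of being passed through as a string.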
classification: deciding which category an item belongs to

For the moment, assume we have all that information. How do we then decide if a bird at our feeder is an Ivory-billed Woodpecker or something else? This task is called classification.

regression: predicting a numeric value by capturing the pattern in how the values vary

Regression is the prediction of a numeric value.
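
A minimal sketch of regression, assuming the simplest possible case: fitting a straight line to (x, y) pairs with ordinary least squares. The function names fit_line and predict_value are hypothetical, and the chapter itself does not prescribe this particular method.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_value(slope, intercept, x):
    """Predict a numeric value for a new x using the fitted line."""
    return slope * x + intercept
```

For points that lie exactly on y = 2x + 1, such as xs = [1, 2, 3, 4] and ys = [3, 5, 7, 9], fit_line recovers a slope of 2 and an intercept of 1.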

training set: the data used to train the algorithm

A training set is the set of training examples we’ll use to train our machine learning algorithms.

target variable: the value the machine learning algorithm tries to predict

The target variable is what we’ll be trying to predict with our machine learning algorithms. In classification the target variable takes on a nominal value, and in the task of regression its value could be continuous.

test set: a dataset kept separate from the training set (often held out from the same collected data), used to measure the algorithm's accuracy

To test machine learning algorithms what’s usually done is to have a training set of data and a separate dataset, called a test set.
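
One common way to obtain a separate test set is to shuffle the collected examples and hold a fraction back. The sketch below assumes a 20% hold-out and a fixed seed for repeatability; split_train_test is a hypothetical helper written for this note, not a library function.

```python
import random


def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one example in the test set.
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]
```

The two returned lists never share an example, so accuracy measured on the second list reflects data the algorithm did not see during training.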

supervised learning: machine learning in which people supply the answer to learn from, for example by labeling the data

This set of problems is known as supervised because we’re telling the algorithm what to predict.

unsupervised learning: machine learning without human-supplied labels, as in clustering problems

In unsupervised learning, there’s no label or target value given for the data. A task where we group similar items together is known as clustering.
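
To make clustering concrete, here is a small sketch of k-means on one-dimensional values. k-means is one common clustering method, picked here as an assumption for illustration; the function name kmeans_1d and the way the starting centers are chosen are likewise hypothetical.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into k clusters; returns (centers, assignments)."""
    ordered = sorted(values)
    # Start from k values spread evenly across the sorted data.
    last = len(ordered) - 1
    centers = [ordered[i * last // max(k - 1, 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        assignments = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, assignments
```

No labels are passed in: the grouping comes only from how close the values are to one another.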

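Resources

Official Python website
Python downloads for Windows (official website)
Blog post on setting up a Python environment
Learning Python on Codecademy
The ultimate guide to installing Python and pip on Windows
Installing and configuring NumPy, SciPy, and Matplotlib in Python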


Article link: https://www.haomeiwen.com/subject/ncsiittx.html
