I stopped updating the Machine Learning notes after week 10. One reason is that I simply had no time to study; the other is that everything in the remaining weeks was already covered in Reinforcement Learning, another course I took last year. If you need those notes, follow the links below: Reinforcement Learning Week 1 course notes: MDP; Reinforcement Learning Week 12 course notes: Game Theory I; Reinforcement Learning Week 13 course notes: Game Theory II & III.
The content below is what this course covers that the RL course did not.
Reinforcement Learning


- MDP: Model -> planner -> policy
- transitions ⟨s, a, r, s'⟩ -> learner -> policy

- RL = reward maximization.



- The three approaches below form a spectrum: from direct use of what is learned but indirect learning (policy search), to direct learning but indirect use of the policy (model-based).
- Policy search: directly search for the policy that returns the right action (a) for a given state (s).
- Value-function based: learn a utility function U that returns the value (v) of a state (s); applying argmax over it yields the same kind of policy that policy search looks for.
- Model-based approach: learn the transition function and reward function, which give the next state (s') and reward for the current state (s) and action (a) pair. With these we can solve the Bellman equations, obtain the value function, and from it derive the policy (a minimal sketch follows this list).
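As a concrete illustration of the model-based approach, here is a minimal value-iteration sketch. The MDP itself (states, actions, `T`, `R`, `gamma`) is a made-up toy example of my own, not something from the course.

```python
# Minimal value-iteration sketch for the model-based approach.
# The MDP below (states, actions, T, R, gamma) is a hypothetical toy example.
states = [0, 1, 2]
actions = [0, 1]
gamma = 0.9

# T[s][a] is a list of (probability, next_state) pairs; R[s] is the reward in state s.
T = {
    0: {0: [(1.0, 0)], 1: [(0.8, 1), (0.2, 0)]},
    1: {0: [(1.0, 0)], 1: [(0.9, 2), (0.1, 1)]},
    2: {0: [(1.0, 2)], 1: [(1.0, 2)]},
}
R = {0: 0.0, 1: 0.0, 2: 1.0}

U = {s: 0.0 for s in states}
for _ in range(100):
    # Bellman update: U(s) = R(s) + gamma * max_a sum_s' T(s,a,s') U(s')
    U = {s: R[s] + gamma * max(sum(p * U[s2] for p, s2 in T[s][a]) for a in actions)
         for s in states}

# Derive the policy by taking an argmax over actions.
policy = {s: max(actions, key=lambda a: sum(p * U[s2] for p, s2 in T[s][a]))
          for s in states}
print(U, policy)
```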

With Q, we can recover both U and π without knowing the transition or reward functions. This is why Q-learning works as a model-free method.
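Written out, the standard relations between Q, U, and π are:

```latex
Q(s,a) = R(s) + \gamma \sum_{s'} T(s,a,s')\,\max_{a'} Q(s',a')
U(s)   = \max_{a} Q(s,a)
\pi(s) = \operatorname{argmax}_{a} Q(s,a)
```

The last two lines are the key point: once Q is known, both U and π fall out of it without ever touching T or R.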

Estimating Q From Transitions

- The estimate Q̂(s,a) is moved toward the immediate reward plus the discounted utility of the next state, by a step controlled by the learning rate alpha (written out after this list).
- V can be estimated the same way, by blending each new sample into the running estimate with weight alpha.
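Written out (the standard tabular Q-learning update, which is what I understand the lecture to be describing), the two updates are:

```latex
\hat{Q}(s,a) \leftarrow (1-\alpha_t)\,\hat{Q}(s,a) + \alpha_t\bigl[r + \gamma \max_{a'} \hat{Q}(s',a')\bigr]
V_t = (1-\alpha_t)\,V_{t-1} + \alpha_t X_t
```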

V will converge to the expected value of X when the learning rates satisfy: the alphas sum to infinity, but the squared alphas sum to a finite number (e.g. alpha_t = 1/t).
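A quick sanity check of this (my own toy example, not from the course): average noisy samples with alpha_t = 1/t and watch the running estimate settle near the true mean.

```python
import random

# Incremental estimation of E[X] with a decaying learning rate alpha_t = 1/t.
# Toy example: X is a noisy sample around a true mean of 5.0.
random.seed(0)
true_mean = 5.0

v = 0.0
for t in range(1, 100001):
    x = true_mean + random.gauss(0, 1)   # one noisy sample
    alpha = 1.0 / t                      # sum(alpha) = inf, sum(alpha^2) < inf
    v = (1 - alpha) * v + alpha * x      # blend the new sample into the estimate

print(v)  # close to 5.0
```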

The update is a bit circular, because the target itself uses Q̂, which changes over time. But it works in practice.

Q-learning converges only if every (s, a) pair is visited infinitely often, and the learning rates alpha_t satisfy the same conditions as above: they sum to infinity while their squares sum to something finite.
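Putting the pieces together, here is a minimal tabular Q-learning update. This is my own sketch; the per-pair visit-count learning rate is one common way (not necessarily the course's) to satisfy the alpha conditions, and gamma is hypothetical.

```python
from collections import defaultdict

gamma = 0.9                   # hypothetical discount factor
Q = defaultdict(float)        # Q[(s, a)] -> estimated value
visits = defaultdict(int)     # visit counts per (s, a) pair

def q_update(s, a, r, s_next, actions):
    """Apply one Q-learning update for the observed transition (s, a, r, s')."""
    visits[(s, a)] += 1
    alpha = 1.0 / visits[(s, a)]   # decays per (s, a): sum = inf, sum of squares < inf
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * target

# Example: feed in one observed transition.
q_update(s=0, a=1, r=0.0, s_next=1, actions=[0, 1])
```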

How should the learner choose actions while learning?

- always choose a0: never learns anything about the other actions
- choose randomly: learns, but never uses what it has learned
- use Q̂ ("greedy"): can get stuck in a local minimum, e.g. keep choosing a0 just because a0 currently looks awesome under a poor estimate
- annealing: mostly follow Q̂, but explore randomly with a probability that decays over time

This is the exploration vs. exploitation trade-off; a sketch of an annealed epsilon-greedy rule follows.
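A common way to implement the annealing idea is epsilon-greedy action selection with a decaying epsilon. This is my own illustration (the decay schedule is arbitrary), not the course's prescription.

```python
import math
import random
from collections import defaultdict

def choose_action(Q, s, actions, t, eps_start=1.0, eps_min=0.05, decay=0.001):
    """Epsilon-greedy with an annealed (decaying) epsilon:
    explore a lot early on, exploit the Q-hat estimate more as t grows."""
    eps = eps_min + (eps_start - eps_min) * math.exp(-decay * t)
    if random.random() < eps:
        return random.choice(actions)                 # explore
    return max(actions, key=lambda a: Q[(s, a)])      # exploit (greedy w.r.t. Q-hat)

# Usage with an (initially empty) tabular Q-hat:
Q = defaultdict(float)
a = choose_action(Q, s=0, actions=[0, 1], t=0)
```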
Wrap up

The exam is today, and I haven't reviewed nearly enough. Wish me luck.
2016-04-30 first draft
Reader comments
Still, thanks for your suggestion. I may write an introduction to the course in Chinese, collect all of my note links into it, and submit it again.
Here is how I do it: I first watch/listen through the video to find the key points and identify the key frames. Then I pause, summarize the key points into notes, and take screenshots of the key frames to go with the notes.
This method probably only works when studying from videos at a computer, because I can stop the video at any time and write down what I have learned.
In a classroom it would not work so well.