cs231n Study Notes I

Author: 专注挖坑的汪 | Published 2020-02-18 21:39

Lecture 1: Introduction

The course begins with its own history. Official notes: cs231n.github.io

The majority of bits flying around the internet are actually visual data. This visual data is hard to understand, and it is sometimes called the dark matter of the internet.

Visual data is produced on the internet at a very fast rate, so we need technology that can understand it.

The history of vision, from a biological perspective: evolution's Big Bang took place about 543 million years ago. Back then the earth was mostly water, and a few species of animals floated around in the ocean. They did not move around much and had no eyes; when food swam by they grabbed it, and when it did not they just kept floating. Then something very important happened around 540 million years ago. From fossil studies, zoologists found that within a very short period of time, about ten million years, the number of animal species exploded. One very convincing explanation is that animals developed eyes. From then on hunting became proactive: some predators went after prey, and prey had to escape from predators, so the onset of vision started an evolutionary arms race in which animals had to evolve quickly in order to survive as a species.

Lecture 2: Image Classification Pipeline

This course mainly uses Python and NumPy. A concise tutorial on both is at http://cs231n.github.io/python-numpy-tutorial/

A somewhat more detailed NumPy tutorial: https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html

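In the tutorial, rank means the number of dimensions of an array. A rank 1 array is a single flat array such as [1,2,3,4], with shape (4,). A rank 2 array is an array whose elements are arrays, such as [[1,2,3,4]], with shape (1, 4).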

So transposing a rank 1 array does nothing: it has only one axis, so there is nothing to swap.
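A minimal sketch of the rank 1 versus rank 2 distinction, using only standard NumPy attributes. The variable names are just for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])      # rank 1: one axis
b = np.array([[1, 2, 3, 4]])    # rank 2: one row, four columns

print(a.ndim, a.shape)          # number of dimensions and shape of a
print(b.ndim, b.shape)          # number of dimensions and shape of b

# Transposing a rank 1 array leaves the shape unchanged.
print(a.T.shape)

# Transposing a rank 2 array swaps its two axes.
print(b.T.shape)

# To turn a rank 1 array into a column vector, add an axis explicitly.
col = a[:, np.newaxis]
print(col.shape)
```

If a column vector is what you actually want, the explicit `np.newaxis` step is needed; `.T` alone will not produce it from a rank 1 array.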

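Broadcasting handles arithmetic between arrays of different shapes. My first impression was that "different shapes" simply meant combining a rank 1 array with a rank 2 array, but the rules apply to any pair of compatible shapes.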

[Figures: the same computation written without broadcasting and with broadcasting]
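The contrast between the two versions can be sketched as follows. The task here (adding a vector to every row of a matrix) and the example values are assumptions chosen for illustration, not taken from the original figures.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12]])    # shape (4, 3)
v = np.array([1, 0, 1])         # shape (3,)

# Without broadcasting: add v to each row of x with an explicit loop.
y_loop = np.empty_like(x)
for i in range(x.shape[0]):
    y_loop[i, :] = x[i, :] + v

# With broadcasting: NumPy stretches v across the rows of x for us.
y_bcast = x + v

print(np.array_equal(y_loop, y_bcast))
```

Both versions compute the same result; the broadcast version is shorter and avoids a Python-level loop.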

The General Broadcasting Rules are documented at https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html

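In short, a dimension of size 1 can always be broadcast: comparing shapes from the trailing dimension backwards, two dimensions are compatible when they are equal or when one of them is 1.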
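A small sketch of the compatibility rule in practice. The shapes below are arbitrary examples picked to show one compatible case of each kind and one incompatible case.

```python
import numpy as np

a = np.ones((4, 3))
b = np.ones((3,))     # treated as shape (1, 3) when combined with a
c = np.ones((4, 1))
d = np.ones((4,))     # trailing dimension 4 does not match 3

print((a + b).shape)  # trailing dims equal (3 and 3), missing dim treated as 1
print((a + c).shape)  # trailing dims 3 and 1 are compatible
print((c + b).shape)  # (4, 1) with (3,): both arrays are stretched

# Incompatible shapes raise a ValueError instead of broadcasting.
mismatch_raised = False
try:
    a + d
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```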

Image Classification: a core task in computer vision, and a very hard problem for a machine.

The input is an image together with a fixed set of labels; the output is the label that matches the image (assign it one of these fixed category labels).

The computer really represents the image as a gigantic grid of numbers, for example 800 * 600 * 3. If the input is a picture of a cat, it is very hard for the machine to distill the cat-ness from these numbers. We refer to this problem as the semantic gap.
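To make the "grid of numbers" concrete, here is a minimal sketch. It assumes an 800 pixel wide, 600 pixel high image with 3 color channels and 8-bit values from 0 to 255; the random pixels merely stand in for a real photo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layout: (height, width, channels) with 8-bit values per channel.
image = rng.integers(0, 256, size=(600, 800, 3), dtype=np.uint8)

print(image.shape)     # dimensions of the grid
print(image.size)      # total count of numbers the classifier sees
print(image[0, 0])     # the three channel values of the top-left pixel
```

Nothing in this array says "cat"; a classifier has to recover that label from 800 * 600 * 3 = 1,440,000 raw numbers.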

Why is this a hard problem?

Because you can change the picture in very small, subtle ways, for example by moving the camera to a different viewpoint, that cause this pixel grid to change entirely. Our algorithms need to be robust to this.

Viewpoint is not the only problem; another is illumination. There can be different lighting conditions in the scene. Whether the cat appears in a very dark, moody scene or a very bright, sunlit scene, it is still a cat, and our algorithms need to be robust to that. Cats can also take on many different poses (deformation).
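A rough sketch of why illumination is hard at the pixel level. The brightness offset of 40 and the random stand-in image are assumptions for illustration; a real lighting change is more complex than a uniform shift.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(600, 800, 3), dtype=np.uint8)

# Simulate brighter lighting: add a constant, then clip back to the 0-255 range.
# The cast to int16 avoids uint8 overflow during the addition.
brighter = np.clip(image.astype(np.int16) + 40, 0, 255).astype(np.uint8)

# Fraction of values in the grid that changed.
changed = np.mean(brighter != image)
print(changed)
```

Almost every number in the grid changes, yet the label of the picture should stay the same.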

Occlusion: you might only see part of a cat, such as just the face or, in an extreme example, just a tail peeking out from under a couch cushion.

Background clutter: the foreground object, the cat, could actually look quite similar in appearance to the background.

Intraclass variation: the single notion of cat-ness actually spans a lot of different visual appearances. Cats come in different shapes, sizes, colors and ages.

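There is no obvious way to hard-code an algorithm for recognizing a cat or other classes, unlike, e.g., sorting a list of numbers. Instead we use a data-driven approach built from two functions. One is train: its input is images and labels, and its output is a model. The other is predict: its input is the model together with new images, and its output is the predicted labels.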
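The train/predict split can be sketched as below. This is only an illustration: the nearest-neighbor rule with L1 distance is one simple choice I am assuming here, and the dictionary-based model and toy three-pixel "images" are hypothetical.

```python
import numpy as np

def train(images, labels):
    """Memorize the training data; here the 'model' is just the data itself."""
    return {
        "images": np.asarray(images, dtype=np.float64),
        "labels": np.asarray(labels),
    }

def predict(model, test_images):
    """Label each test image with the label of its closest training image."""
    test_images = np.asarray(test_images, dtype=np.float64)
    # L1 distance between every test image and every training image,
    # giving an array of shape (num_test, num_train).
    diffs = test_images[:, np.newaxis, :] - model["images"][np.newaxis, :, :]
    dists = np.abs(diffs).sum(axis=2)
    nearest = dists.argmin(axis=1)
    return model["labels"][nearest]

# Toy data: four flattened "images" of three pixels each, two classes.
train_images = np.array([[0, 0, 0], [10, 10, 10],
                         [200, 200, 200], [250, 250, 250]])
train_labels = np.array([0, 0, 1, 1])

model = train(train_images, train_labels)
preds = predict(model, np.array([[5, 5, 5], [240, 240, 240]]))
print(preds)
```

The point of the split is that all knowledge about cats lives in the data passed to `train`, not in hand-written rules inside `predict`.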

I think the official notes at cs231n.github.io/classification are all you really need here.
