Lecture 1: Introduction and Word Vectors

Author: 魏鹏飞 | Published 2020-05-17 12:58

    Lecture Plan

    Lecture 1: Introduction and Word Vectors

    1. The course (10 mins)
    2. Human language and word meaning (15 mins)
    3. Word2vec introduction (15 mins)
    4. Word2vec objective function gradients (25 mins)
    5. Optimization basics (5 mins)
    6. Looking at word vectors (10 mins or less)

    Course logistics in brief

    Instructor: Christopher Manning

    • Head TA and co-instructor: Abigail See
    • TAs: Many wonderful people! See website
    • Time: TuTh 4:30–5:50, Nvidia Aud (→ video)
    • Other information: see the class webpage

    What do we hope to teach?

    1. An understanding of the effective modern methods for deep learning
      • Basics first, then key methods used in NLP: Recurrent networks, attention, etc.
    2. A big picture understanding of human languages and the difficulties in understanding and producing them
    3. An understanding of and ability to build systems (in PyTorch) for some of the major problems in NLP:
      • Word meaning, dependency parsing, machine translation, question answering

    What’s different this year?

    • Lectures (including guest lectures) covering new material: character models, transformers, safety/fairness, multitask learning
    • 5x one-week assignments instead of 3x two-week assignments
    • Assignments covering new material (NMT with attention,
      ConvNets, subword modeling)
    • Using PyTorch rather than TensorFlow
    • Assignments due before class (4:30pm), not at midnight!
    • Gentler but earlier ramp-up
    • First assignment is easy, but due one week from today!
    • No midterm

    High-Level Plan for Problem Sets

    • HW1 is hopefully an easy on-ramp – an IPython Notebook
    • HW2 is pure Python (numpy) but expects you to do
      (multivariate) calculus so you really understand the basics
    • HW3 introduces PyTorch
    • HW4 and HW5 use PyTorch on a GPU (Microsoft Azure)
      • Libraries like PyTorch, TensorFlow (and Chainer, MXNet, CNTK, Keras, etc.) are becoming the standard tools of DL
    • For FP, you either:
      • Do the default project, which is SQuAD question answering
        • Open-ended but an easier start; a good choice for most
      • Propose a custom final project, which we approve
        • You will receive feedback from a mentor (TA/prof/postdoc/PhD)
    • Can work in teams of 1–3; can use any language

    1. How do we represent the meaning of a word?

    How do we have usable meaning in a computer?

    Problems with resources like WordNet
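
    WordNet is the classic example: a thesaurus of synonym sets and hypernym ("is-a") relations. It is a great resource but misses nuance and new meanings of words, is subjective, requires human labor to build and adapt, and gives no good way to compute word similarity. A minimal sketch of querying it through NLTK (assumes nltk is installed and the wordnet corpus has been downloaded):

    ```python
    # Minimal WordNet lookup via NLTK.
    # Assumes: pip install nltk, then nltk.download('wordnet') once.
    from nltk.corpus import wordnet as wn

    poses = {'n': 'noun', 'v': 'verb', 's': 'adj (s)', 'a': 'adj', 'r': 'adv'}
    for synset in wn.synsets("good"):
        lemmas = ", ".join(l.name() for l in synset.lemmas())
        print(f"{poses[synset.pos()]}: {lemmas}")
    ```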

    Representing words as discrete symbols

    Problem with words as discrete symbols
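
    In traditional NLP, words are discrete symbols: each word is a one-hot vector with a 1 in its own position and 0s elsewhere, so the vector dimension equals the vocabulary size. The problem, illustrated below with a made-up five-word vocabulary: any two one-hot vectors are orthogonal, so the representation carries no notion of word similarity.

    ```python
    import numpy as np

    # Toy vocabulary; real vocabularies contain hundreds of thousands of words.
    vocab = ["motel", "hotel", "the", "a", "cat"]
    one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

    # "motel" and "hotel" are clearly related words, yet their one-hot
    # vectors have zero dot product -- no similarity signal at all:
    print(one_hot["motel"] @ one_hot["hotel"])  # 0.0
    ```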

    Representing words by their context
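
    This is the idea of distributional semantics: a word's meaning is given by the words that frequently appear close by. In J.R. Firth's (1957) words, "You shall know a word by the company it keeps." When a word w appears in a text, its context is the set of words within a fixed-size window around it, and many such contexts of w are used to build its representation.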

    Word vectors
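
    Under this approach, every word gets a dense real-valued vector (also called a word embedding or word representation), chosen so that words occurring in similar contexts end up with similar vectors. A purely illustrative example with made-up numbers (real vectors typically have a few hundred dimensions):

    $$v_{\text{banking}} = (0.286,\ 0.792,\ -0.177,\ -0.107,\ 0.109,\ -0.542,\ 0.349,\ 0.271)^\top$$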

    Word meaning as a neural word vector – visualization

    3. Word2vec: Overview
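
    Word2vec (Mikolov et al. 2013) is a framework for learning word vectors. The idea: start from a large corpus of text and give every word in a fixed vocabulary a vector. Then go through each position t in the text, with center word c and context ("outside") words o; use the similarity of the vectors for c and o to calculate the probability of o given c; and keep adjusting the word vectors to maximize this probability. In the skip-gram form presented here, the model predicts the context words $w_{t+j}$ within a window of fixed size m from the center word $w_t$:

    $$P(w_{t+j} \mid w_t; \theta), \qquad -m \le j \le m,\ j \ne 0$$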

    Word2vec: objective function
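
    For each position $t = 1, \dots, T$ in the corpus, predict the context words within a window of fixed size m given the center word $w_t$. The likelihood is

    $$L(\theta) = \prod_{t=1}^{T} \prod_{-m \le j \le m,\ j \ne 0} P(w_{t+j} \mid w_t; \theta)$$

    where $\theta$ is all the variables to be optimized. The objective (cost, loss) function is the average negative log-likelihood:

    $$J(\theta) = -\frac{1}{T} \log L(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \sum_{-m \le j \le m,\ j \ne 0} \log P(w_{t+j} \mid w_t; \theta)$$

    Minimizing the objective function is equivalent to maximizing predictive accuracy.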

    Word2Vec Overview with Vectors

    Word2vec: prediction function
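
    Each word w has two vectors: $v_w$ when w is a center word and $u_w$ when w is a context word. For a center word c and a context word o, the prediction is a softmax over the vocabulary V:

    $$P(o \mid c) = \frac{\exp(u_o^\top v_c)}{\sum_{w \in V} \exp(u_w^\top v_c)}$$

    The dot product compares the similarity of o and c, exponentiation makes everything positive, and the denominator normalizes over the whole vocabulary to give a probability distribution. A small numpy sketch of the computation (U and v_c are random stand-ins for learned vectors):

    ```python
    import numpy as np

    def softmax(z):
        """Numerically stable softmax: shift by the max before exponentiating."""
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    rng = np.random.default_rng(0)
    U = rng.normal(size=(5, 4))    # one context vector u_w per vocabulary word
    v_c = rng.normal(size=4)       # center-word vector
    print(softmax(U @ v_c))        # P(o | c) for each candidate word o
    ```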

    Training a model by optimizing parameters

    To train the model: Compute all vector gradients!
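
    Recall that $\theta$ represents all the model parameters in one long vector: with d-dimensional vectors, a vocabulary of V words, and two vectors per word, $\theta \in \mathbb{R}^{2dV}$. Training means computing the gradient of $J(\theta)$ with respect to every one of these parameters and walking downhill.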

    4. Word2vec derivations of gradient

    Chain Rule
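
    The derivation only needs the chain rule for composed functions: if $y = f(u)$ and $u = g(x)$, then

    $$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

    together with the basic fact $\frac{\partial}{\partial v_c} (u_w^\top v_c) = u_w$.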

    Interactive Whiteboard Session!

    Calculating all gradients!
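
    Writing $\log P(o \mid c) = u_o^\top v_c - \log \sum_{w \in V} \exp(u_w^\top v_c)$ and applying the chain rule to the second term gives the gradient with respect to the center vector:

    $$\frac{\partial}{\partial v_c} \log P(o \mid c) = u_o - \sum_{x \in V} P(x \mid c)\, u_x$$

    That is: the observed context vector minus the expected context vector under the current model, so each update pulls the model's expectation toward what was actually observed. The gradients with respect to the context vectors $u_w$ follow by the same pattern.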

    Word2vec: More details
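
    Why two vectors per word? It makes the optimization easier; the two can be averaged at the end. There are two model variants: Skip-gram (predict context words from the center word, the version presented here) and CBOW (predict the center word from a bag-of-words context). Training can also be made much more efficient with negative sampling rather than the full naive softmax; the course covers this later.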

    5. Optimization: Gradient Descent

    Gradient Descent
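
    We have a cost function $J(\theta)$ to minimize, and gradient descent repeatedly takes a small step in the direction of the negative gradient. With step size (learning rate) $\alpha$:

    $$\theta^{\text{new}} = \theta^{\text{old}} - \alpha \nabla_\theta J(\theta)$$

    A self-contained toy sketch in numpy, with a simple quadratic standing in for the word2vec cost:

    ```python
    import numpy as np

    # Toy objective J(theta) = ||theta||^2, whose gradient is 2 * theta.
    def grad_J(theta):
        return 2 * theta

    theta = np.random.randn(5)  # random initialization
    alpha = 0.1                 # step size (learning rate)
    for _ in range(100):
        theta = theta - alpha * grad_J(theta)

    print(theta)  # converges toward the minimizer, the zero vector
    ```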

    Stochastic Gradient Descent
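
    The catch: $J(\theta)$ is a sum over all windows in the corpus, possibly billions of them, so computing the exact gradient for even one update is hugely expensive. Stochastic gradient descent instead samples a single window (or a small batch), computes the gradient on that sample alone, and updates immediately. Continuing the toy sketch above, with a random data matrix standing in for corpus windows:

    ```python
    import numpy as np

    # Toy objective: the average of ||theta - x||^2 over the data,
    # which is minimized at the data mean.
    rng = np.random.default_rng(0)
    data = rng.normal(loc=3.0, size=(1000, 5))  # stand-in for corpus windows
    theta = rng.normal(size=5)
    alpha = 0.05

    for step in range(2000):
        x = data[rng.integers(len(data))]  # sample one "window" at random
        grad = 2 * (theta - x)             # gradient on that sample alone
        theta = theta - alpha * grad

    print(theta)  # noisy estimate of the data mean (about 3.0 per entry)
    ```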

    Reference link:
    https://www.youtube.com/watch?v=8rXD5-xhemo&list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z&index=1
