Coursera: Machine Learning Exercise Notes

Author: W_Honor | Published: 2017-07-21 17:37 | Views: 965



    Consider the following training set of m=4 training examples:

    x    y
    1    0.5
    2    1
    4    2
    0    0

    Consider the linear regression model hθ(x)=θ0+θ1x. What are the values of θ0 and θ1 that you would expect to obtain upon running gradient descent on this model? (Linear regression will be able to fit this data perfectly.)

    • A. θ0=0.5,θ1=0

    • B. θ0=0.5,θ1=0.5

    • C. θ0=1,θ1=1

    • D. θ0=1,θ1=0.5

    • E. θ0=0,θ1=0.5
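Every training example satisfies y = 0.5x, so the perfect fit is θ0=0, θ1=0.5. As a rough check, the sketch below runs batch gradient descent on this training set in plain Python. The learning rate, iteration count, starting point, and the function name `gradient_descent` are assumptions chosen for illustration; they are not part of the quiz.

```python
# Training set from the question (m = 4).
xs = [1.0, 2.0, 4.0, 0.0]
ys = [0.5, 1.0, 2.0, 0.0]

def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # arbitrary starting point (an assumption)
    for _ in range(iterations):
        # Prediction error on every training example.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)  # values very close to 0 and 0.5
```

With these settings the parameters settle at the line y = 0.5x, which matches the option θ0=0, θ1=0.5.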



    Let f be some function so that f(θ0,θ1) outputs a number. For this problem, f is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so f may have local optima). Suppose we use gradient descent to try to minimize f(θ0,θ1) as a function of θ0 and θ1. Which of the following statements are true? (Check all that apply.)

    • A. If θ0 and θ1 are initialized at the global minimum, then one iteration will not change their values.

    • B. Setting the learning rate α to be very small is not harmful, and can only speed up the convergence of gradient descent.

    • C. If the first few iterations of gradient descent cause f(θ0,θ1) to increase rather than decrease, then the most likely cause is that we have set the learning rate α to too large a value.

    • D. No matter how θ0 and θ1 are initialized, so long as α is sufficiently small, we can safely expect gradient descent to converge to the same solution.
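The statements above can be explored on a toy example. The sketch below uses a hypothetical function f(θ0,θ1) = (θ0² − 1)² + θ1², chosen only because it is smooth and has two separate minima at (1, 0) and (−1, 0); it is not from the quiz, and the learning rates are illustrative assumptions. It suggests that statements A and C hold, while B and D do not: a step taken at a minimum changes nothing, a large α can make f increase, a tiny α only slows progress, and different starting points can end at different minima.

```python
def f(t0, t1):
    # Hypothetical smooth function with minima at (1, 0) and (-1, 0).
    return (t0 ** 2 - 1) ** 2 + t1 ** 2

def grad_f(t0, t1):
    # Partial derivatives of f with respect to t0 and t1.
    return 4 * t0 * (t0 ** 2 - 1), 2 * t1

def step(t0, t1, alpha):
    # One gradient descent update with learning rate alpha.
    g0, g1 = grad_f(t0, t1)
    return t0 - alpha * g0, t1 - alpha * g1

def descend(t0, t1, alpha, iterations):
    for _ in range(iterations):
        t0, t1 = step(t0, t1, alpha)
    return t0, t1

# Starting at a minimum, the gradient is zero, so nothing changes.
at_minimum = step(1.0, 0.0, alpha=0.1)

# Different initializations end at different minima.
from_right = descend(0.5, 1.0, alpha=0.05, iterations=2000)
from_left = descend(-0.5, 1.0, alpha=0.05, iterations=2000)

# A learning rate that is too large makes f increase after one step.
overshoot = step(2.0, 1.0, alpha=0.5)

# A very small learning rate makes little progress in the same number of steps.
slow = descend(0.5, 1.0, alpha=1e-4, iterations=100)
fast = descend(0.5, 1.0, alpha=0.05, iterations=100)

print(at_minimum, from_right, from_left)
print(f(2.0, 1.0), f(*overshoot))
print(f(*slow), f(*fast))
```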



    For this question, assume that we are using the training set from Q1. Recall our definition of the cost function was $J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$. What is J(0,1)? In the box below, please enter your answer (Simplify fractions to decimals when entering answer, and '.' as the decimal delimiter e.g., 1.5).
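With θ0=0 and θ1=1 the hypothesis is h(x) = x, so the errors on the four examples are 0.5, 1, 2, and 0, and the squared errors sum to 5.25. Dividing by 2m = 8 gives 0.65625. A minimal sketch of the same arithmetic follows; the function name `cost` is an assumption for illustration.

```python
# Training set from the first question (m = 4).
xs = [1.0, 2.0, 4.0, 0.0]
ys = [0.5, 1.0, 2.0, 0.0]

def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / (2m)) * sum of (h(x) - y)^2."""
    m = len(xs)
    squared_errors = ((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return sum(squared_errors) / (2 * m)

print(cost(0.0, 1.0, xs, ys))  # 0.65625
```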



    Suppose m=4 students have taken some class, and the class had a midterm exam and a final exam. You have collected a dataset of their scores on the two exams, which is as follows:

    midterm exam    (midterm exam)^2    final exam
    89              7921                96
    72              5184                74
    94              8836                87
    69              4761                78

    You'd like to use polynomial regression to predict a student's final exam score from their midterm exam score. Concretely, suppose you want to fit a model of the form hθ(x)=θ0+θ1x1+θ2x2, where x1 is the midterm score and x2 is (midterm score)^2. Further, you plan to use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.

    What is the normalized feature $x_1^{(3)}$? (Hint: midterm = 94, final = 87 is training example 3.) Please round off your answer to two decimal places and enter in the text box below.

    Formula: normalized feature = (value - mean) / (max - min)

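      Analysis: the mean of the (midterm exam)^2 column is (7921+5184+8836+4761)/4 = 6675.5. For x1 itself, the mean is (89+72+94+69)/4 = 81 and the range is 94-69 = 25, so the normalized value of 94 is (94-81)/25 = 0.52.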
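The formula can be applied to either feature column in the same way. The sketch below is a plain-Python illustration; the helper name `normalize` is an assumption, not something from the course. It applies mean normalization and range scaling to the midterm scores and to their squares.

```python
# Midterm scores from the table, and the squared-score feature.
midterm = [89.0, 72.0, 94.0, 69.0]
midterm_sq = [s ** 2 for s in midterm]

def normalize(values):
    """Mean normalization followed by scaling by the range (max - min)."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - mean) / value_range for v in values]

x1 = normalize(midterm)
x2 = normalize(midterm_sq)

# Training example 3 is at index 2 (midterm = 94).
print(round(x1[2], 2))  # 0.52
```

For the first feature the mean is 81 and the range is 25, so the third example maps to (94 − 81)/25 = 0.52.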



