Data Mining with Python 1


By 不连续小姐 | Published 2018-11-10 06:23

Part 1

As we discussed in class, a common analytical optimization technique is the first derivative test. In this test, you take the first derivative of a function and solve for its zeroes. Because the slope is zero there, these points are candidates for a minimum or a maximum (a critical point can also be neither, e.g. an inflection point with zero slope).

1. Are these points global or local minima/maxima?

The first derivative test identifies LOCAL minima and maxima. Deciding whether an extremum is also global requires comparing function values, or examining the function's behavior as x → ±∞.

2. Apply the First Derivative test to the function. How many minima/maxima does this function have? Can you identify which zeroes are a minimum and which are a maximum without graphing the function?

f(x) = 3x + 10*cos(x)

Answer:
f'(x) = 3 - 10*sin(x)

Setting f'(x) = 0 gives sin(x) = 0.3. Since sin(x) is periodic, this function has infinitely many minima/maxima. The critical points are x = arcsin(0.3) + 2nπ ≈ 0.3047 + 2nπ and x = π - arcsin(0.3) + 2nπ ≈ 2.8369 + 2nπ, where n ∈ ℤ. We can classify them without graphing via the second derivative, f''(x) = -10*cos(x): it is negative at x ≈ 0.3047 + 2nπ (local maxima) and positive at x ≈ 2.8369 + 2nπ (local minima).
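The classification above can be checked numerically with the second-derivative test, f''(x) = -10*cos(x):

```python
import numpy as np

# Critical points of f(x) = 3x + 10*cos(x), where sin(x) = 0.3
x_max = np.arcsin(0.3)           # ~ 0.3047 rad
x_min = np.pi - np.arcsin(0.3)   # ~ 2.8369 rad

# Sign of f''(x) = -10*cos(x) decides max vs min
print(-10 * np.cos(x_max))  # negative -> local maximum
print(-10 * np.cos(x_min))  # positive -> local minimum
```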

3. Apply the fsolve() function, as discussed in class, to write a simple program to find minima and maxima for the above function.

from scipy.optimize import fsolve
import numpy as np

def system(x):
    # f'(x) = 3 - 10*sin(x); its roots are the critical points of f
    return 3 - 10*np.sin(x)

x_max = fsolve(system, 0)  # starting guess near the local maximum
x_min = fsolve(system, 3)  # starting guess near the local minimum
print("Local max at x = {:.4f}".format(x_max[0]))
print("Local min at x = {:.4f}".format(x_min[0]))

Local max at x = 0.3047
Local min at x = 2.8369

(Note: the derivative must be evaluated in radians, not degrees, since the original f(x) = 3x + 10*cos(x) is defined in radians.)
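As a cross-check (not required by the assignment), scipy.optimize.minimize_scalar can locate a minimum of f directly, without differentiating by hand; a sketch, bracketing the local minimum nearest the origin:

```python
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: 3*x + 10*np.cos(x)

# Search [0, pi], which brackets one local minimum
res = minimize_scalar(f, bounds=(0, np.pi), method="bounded")
print(res.x)  # ~ 2.8369, i.e. pi - arcsin(0.3)
```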

Part 2

1. Least-Squares Regression

Using the least-squares regression module discussed in class, perform a regression analysis upon the data provided in the assignment_6.csv file on Moodle. Split the data provided into training and testing data. Use whatever means you like to determine the appropriate form of the model, f(x), to use.

from scipy.optimize import leastsq
import pandas as pd
import matplotlib.pyplot as plt

# Read in the data
data = pd.read_csv("C:/Users/fatan/Downloads/assignment6.csv", names=["X", "Y"])

# Show the data structure
print(data[0:10])

# Split into training and testing sets
# The training set should be around 70% to 80% of the data
train = data[0:80]
test = data[80:100]

# Residuals of the linear model f(x) = b[1]*x + b[0]
def residual(b, x, y):
    return b[1]*x + b[0] - y

b_guess = [0, 0]

# Calculate the optimized parameters on the training set
b, _ = leastsq(residual, b_guess, args=(train.X, train.Y))
print(b)

# Regression line from the fitted parameters
line = b[1]*train.X + b[0]

# Data visualization: test-set scatter with the training-set fit
plt.scatter(test.X, test.Y)
plt.plot(train.X, line, "r", label="Training Set Linear Reg Line")
plt.xlabel("Test Set X value")
plt.ylabel("Test Set Y value")
plt.title("Linear Model Example")
plt.legend()
plt.show()
          X          Y
0  0.916092  10.973234
1  4.610461  63.649082
2  0.164516   8.143623
3  1.089609  13.759627
4  1.589659  15.190665
5  2.264226  23.217127
6  2.656766  27.918476
7  2.665267  28.458073
8  4.358936  56.519672
9  2.882788  26.703205

Fitted coefficients [intercept, slope]:
[-0.83044318  12.66276417]
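Since the assignment_6.csv file is not reproduced here, the same train/test workflow can be sketched on synthetic data resembling the output above (the true slope/intercept of 12.5 and -1.5 and the noise level are assumptions for illustration):

```python
import numpy as np
from scipy.optimize import leastsq

# Synthetic data: a noisy line resembling the assignment's data
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 100)
y = 12.5*x - 1.5 + rng.normal(0, 2, 100)

# 80/20 train/test split
train_x, test_x = x[:80], x[80:]
train_y, test_y = y[:80], y[80:]

def residual(b, x, y):
    return b[1]*x + b[0] - y

# Fit on the training set, evaluate on the held-out test set
b, _ = leastsq(residual, [0, 0], args=(train_x, train_y))
rmse = np.sqrt(np.mean(residual(b, test_x, test_y)**2))
print(b, rmse)
```

Evaluating the residuals on the held-out test set, rather than the training set, gives an honest estimate of out-of-sample error.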

[Figure: scatter of the test set with the fitted regression line]

Part 3

Naive Bayes Classifier

In Python, implement a Naïve Bayes Classifier, as discussed in class.

from sklearn.naive_bayes import GaussianNB
import numpy as np

# Training data: 2-D feature vectors with "Y"/"N" labels
x = np.array([[-3,7],[1,8], [1,1], [-1,0], [2,3], [-4,4], [-2,3], [1,2], [-1,4], [3,6], [-3,0], [-2,5]])
Y = np.array(["Y", "N", "Y", "Y", "Y", "N", "N", "Y", "Y", "Y", "N", "N"])

# Fit a Gaussian Naive Bayes model and classify a new point
model = GaussianNB()
model.fit(x, Y)
predicted = model.predict([[1,1]])
print(predicted)

['Y']
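To see the arithmetic GaussianNB performs, here is a minimal from-scratch sketch of the same classifier: for each class, it fits a per-feature Gaussian (mean and variance), then scores the query by log prior plus summed log-likelihoods. It reproduces the prediction above:

```python
import numpy as np

X = np.array([[-3,7],[1,8],[1,1],[-1,0],[2,3],[-4,4],
              [-2,3],[1,2],[-1,4],[3,6],[-3,0],[-2,5]], dtype=float)
y = np.array(["Y","N","Y","Y","Y","N","N","Y","Y","Y","N","N"])

def gaussian_nb_predict(X, y, query):
    classes = np.unique(y)
    log_post = []
    for c in classes:
        Xc = X[y == c]
        mu, var = Xc.mean(axis=0), Xc.var(axis=0)
        # log prior + sum of per-feature Gaussian log-likelihoods
        lp = np.log(len(Xc) / len(X))
        lp -= 0.5 * np.sum(np.log(2*np.pi*var) + (query - mu)**2 / var)
        log_post.append(lp)
    # Pick the class with the highest posterior score
    return classes[np.argmax(log_post)]

pred = gaussian_nb_predict(X, y, np.array([1.0, 1.0]))
print(pred)  # 'Y', matching GaussianNB above
```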

Source: https://www.haomeiwen.com/subject/vvnaxqtx.html