Walkthrough: spreadsheet, R, R, PTS Python | Dat

Author: liaoyie | Published: 2020-01-12 21:20

Create a new Excel Worksheet

Sorry guys and gals, but it's a New Year and we have new computers, so we're all going to make a new spreadsheet. Remember that the data is at www.basketball-reference.com. If you don't remember exactly how to load the data into the spreadsheet, I will help you. The players we are studying are: Luka Doncic, Shai Gilgeous-Alexander, Deandre Ayton, Marvin Bagley, Trae Young, Josh Okogie, Jaren Jackson, Kevin Knox, Wendell Carter, Mohamed Bamba.

Format the Excel Worksheet

Delete columns "G", "DATE", "AGE", "TM", "OPP", the column next to "OPP", "GS", "MP", "FG%", "3P%", "FT%", "GMSC", "+/-".

To do this you can either go to www.basketball-reference.com and recreate the .xls file, or simply delete the columns in Excel or WPS by clicking on the header of each column and pressing delete. If you choose the second option you need to delete the columns on every sheet.

Loading the file into R

1) We need to prepare R to read the file. Go to Packages on the title bar and click install package. R will ask you to choose a location. Select China (Shanghai); after R loads, select gdata from the list of packages. It will take a few minutes for R to install the package.

2) Once the package is installed you have to load it into R. Type library(gdata). If the library doesn't load, ask Mr. Wilson for help.

3) We are now ready to load the file into R. We are going to load each player into their own variable. Start with Luka Doncic; define a variable for Luka (ex: Doncic =) and then type in the command:

a. read.xls("Basketball.xls", sheet = 1)

b. Continue for all 10 players. Make sure to change the sheet number to match the player being loaded.

c. Now that you've done that, do it again! Re-define the variable and use the command: read.xls("Basketball.xls", sheet = "Doncic").
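Steps 3a and 3c can be compared side by side with a short sketch. The file name "Basketball.xls" and the sheet name "Doncic" come from the handout; that sheet 1 is the "Doncic" sheet is an assumption, and the variable names doncic_by_index and doncic_by_name are made up for illustration. The guard simply skips the load when gdata or the file is missing.

```r
# Sketch only: assumes Basketball.xls sits in the working directory
# and that its first sheet is the one named "Doncic".
xls_path <- "Basketball.xls"

if (requireNamespace("gdata", quietly = TRUE) && file.exists(xls_path)) {
  # Load the same sheet twice: once by position, once by name.
  doncic_by_index <- gdata::read.xls(xls_path, sheet = 1)
  doncic_by_name  <- gdata::read.xls(xls_path, sheet = "Doncic")

  # If the assumption holds, both data frames have the same shape.
  print(dim(doncic_by_index))
  print(dim(doncic_by_name))
} else {
  message("Install gdata and put Basketball.xls in the working directory first.")
}
```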
Loading by sheet name still works!

Figuring the average points per game for our players

Of course, for any of our players we can type in

    mean(Doncic$PTS)

and that will return the average points per game (PPG) for that one player. Try that for all 10 players now. If the program returns

    [1] NA

it means that we need to fix the data that we imported into R:

1. If you did the above steps you saw that the Doncic variable is broken. Type:

    fix(Doncic)

A screen will appear that looks like a spreadsheet. Look for words in the spreadsheet. That's what we want to delete.

2. Redefine Doncic but add the argument "na.strings". na.strings tells R to change the word we don't want (if you did number 1 you will see that the word is "Inac") to NA. So type:

    Doncic = read.xls("Basketball.xls", sheet = "Doncic", na.strings = "Inac")

Now all of the instances of the word "Inac" have been changed to NA. Type:

    fix(Doncic)

What is the result?

3. If it still didn't work, that's because R cannot give a mean of data with words (in programming, words are called strings), and mean() also returns NA when any value is missing. We need to delete all of the rows that contain NA. Type:

    Doncic = na.omit(Doncic)

Now the Doncic variable should be fixed. Repeat this process for the other players if you cannot get a value for the mean of their PTS.

We should now be able to calculate the PPG for any player. Of course we can look at the numbers and see who scored highest, but we want a visual representation. We'll use the barplot() function to create a bar graph. We can't type

    barplot(mean(Doncic$PTS),mean(Ayton$PTS).........

This won't work. We need to create a vector of values that we can use in barplot(). We are going to create a variable (PPG) and put the average points per game for every player in it. Type:

    PPG = c(mean(Doncic$PTS).........

The command c() combines values into a vector (an ordered list of values). Try to make a bar graph now by typing barplot(PPG). It should work. If it doesn't, see Mr. Wilson.
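The cleanup steps above can be tried without the spreadsheet. This is a minimal sketch on made-up numbers: read.csv(text = ...) stands in for read.xls() so the example runs on its own, and player_a and player_b are hypothetical names, not players from the assignment. The na.strings argument is used the same way as in the read.xls() call in step 2.

```r
# Toy game logs (made-up numbers); "Inac" marks a game the player missed.
player_a_csv <- "PTS,TRB\n21,7\nInac,Inac\n18,4\n30,9"
player_b_csv <- "PTS,TRB\n10,3\n14,5\nInac,Inac\n12,4"

# Without na.strings the PTS column is read as text, so it has no mean.
raw_a <- read.csv(text = player_a_csv, stringsAsFactors = FALSE)
print(is.numeric(raw_a$PTS))   # FALSE

# na.strings turns every "Inac" into NA, so the column becomes numeric.
player_a <- read.csv(text = player_a_csv, na.strings = "Inac")
player_b <- read.csv(text = player_b_csv, na.strings = "Inac")
print(mean(player_a$PTS))      # NA, because one value is still missing

# na.omit() drops every row that contains an NA.
player_a <- na.omit(player_a)
player_b <- na.omit(player_b)

# c() combines the two averages into one vector for barplot().
PPG <- c(mean(player_a$PTS), mean(player_b$PTS))
print(PPG)                     # 23 12
barplot(PPG)
```

Note that na.omit() removes a whole row if any column in it is NA, so it is worth checking how many games are left afterwards with nrow().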
You should now have a bar graph on your screen.

[Figure: bar graph of PPG]

This is a decent looking bar graph but there are several problems:

1) There are no names
2) Some of the data is taller than the graph
3) The colors are a little plain

You need to fix these things. Fortunately everything you need to know can be found in R! Type:

    ?barplot

Read the help page and find the arguments for names, limits of the x and y axes, and colors. Experiment and change all of these things and take a picture of your work.

Triple Bar Graph

[Figure: example triple bar graph]

You need to create a graph like the one above. Use total rebounds, offensive rebounds and defensive rebounds. We want to create another graph for points (PPG, FGA, FGM) and percentages (FG%, 3P%, FT%) but we don't want to work as hard as we did to make the first triple bar graph. So we're going to create a "function" for R.

Functions

Let's define a variable called Average. Let

    Average = function (stat = "stat")

In this line of code "Average" is a variable that we are defining. "function" is a command. It tells the program that we are going to create a rule for the program to execute. (stat = "stat") names the input stat and gives it a default value that is a string (a string means the input will be words instead of numbers). When we finish writing the function the input will look like this:

    Average("PTS")

The above command will tell the program to take the average PTS of all players. Before we can use that command we have to finish writing the function.

So we have

    Average = function (stat = "stat")

We are going to add a bracket to the end of our command line so that we can start writing our function.

    Average = function (stat = "stat"){

Take note of what kind of bracket we have added to the end of the command line. Make sure you type the correct bracket into R.

Okay, now that we have added the bracket on the command line we can press [enter] and R will drop us down a line without closing the function.

[Figure: the R console after pressing Enter]

On the next line we are going to define a new variable.
I will call it AVG (you can call it whatever you want).

Now here is the difficult part. We want to create a vector of averages. Remember that we want our function "Average" to return a list of averages, one for each player. If we type in

    Average("PTS")

we want 10 values, the average PTS for each player, AND we want each player's name next to their average PTS. So we need to tell our function to do that. I don't want to tell you exactly how to do this part of the project, but I will tell you this: you need to create a vector of values (averages) and you need to create an array of names. Both of these things need to be created inside of the Average function.

When you have finished the Average function show it to me so that I can verify that it works.

After you have finished this function there are three more assignments to complete that should be much faster now that you've mastered this skill.

1. Create a triple bar graph for PTS, FGA, FGM.
2. Create a triple bar graph for FG% (= FGM/FGA), FT% (= FTM/FTA), 3P% (= 3PM/3PA). You'll have to calculate those percentages yourself.
3. Create a function that allows you to load in a spreadsheet by typing in a command. For example:

[Figure: example call to the loading function]

This function allows me to load a player's spreadsheet into R by typing in their name.

MAKE SURE YOU SAVE ALL YOUR WORK SO THAT YOU CAN SHOW IT TO ME.

Source: http://www.7daixie.com/2019042612619895.html
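For the percentage graph in task 2, one possible approach is sketched below on made-up data. Everything here is an assumption for illustration: the two players are hypothetical, the helper name shooting is invented, and the column names (X3PM and X3PA in particular) may not match what R produces from your spreadsheet. The sketch treats a season percentage as total makes divided by total attempts, which is one reasonable reading of FG% = FGM/FGA.

```r
# Made-up game logs for two hypothetical players (three games each).
player_a <- data.frame(FGM = c(8, 5, 10), FGA = c(16, 12, 18),
                       FTM = c(4, 6, 2),  FTA = c(5, 6, 4),
                       X3PM = c(2, 1, 3), X3PA = c(5, 4, 6))
player_b <- data.frame(FGM = c(3, 6, 4),  FGA = c(9, 11, 10),
                       FTM = c(1, 2, 3),  FTA = c(2, 4, 4),
                       X3PM = c(1, 0, 2), X3PA = c(4, 3, 5))

# Season percentage = total makes / total attempts.
shooting <- function(log) {
  c(FG   = sum(log$FGM)  / sum(log$FGA),
    FT   = sum(log$FTM)  / sum(log$FTA),
    "3P" = sum(log$X3PM) / sum(log$X3PA))
}

# One column per player, one row per statistic.
pct <- cbind(A = shooting(player_a), B = shooting(player_b))
print(round(pct, 3))   # FG: 0.5, 0.433; FT: 0.8, 0.6; 3P: 0.4, 0.25

# beside = TRUE draws the three rows as side-by-side bars for each player.
barplot(pct, beside = TRUE, legend.text = rownames(pct), ylim = c(0, 1))
```

Averaging the per-game percentages instead would give a different number, because games with few attempts would count as much as games with many.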
