Performing Basic Statistics
Let's try some basic statistics with the data from the previous modules.
- If the data from the previous modules is no longer loaded, type data <- read.csv("http://joeystanley.com/downloads/menu.csv")
- To find a mean, type mean() into the script, with the variable you want to average inside the parentheses.
- Using the dataset from above, let's find the mean number of calories. Type mean(data$Calories) into the script. Then, press Run.
- The console should show the result: the mean number of calories across everything on the menu is 368.2692.
- To measure how strongly two variables are related, use the function cor(). It returns the correlation coefficient; it does not test for significance.
- To do this with our data, look at the correlation between Calories and Fat using: cor(data$Calories,data$Fat)
- The console should show the result: the correlation is about 0.90.
- To test whether a single correlation coefficient is statistically significant, use the function cor.test()
- To try this with our dataset, use the variables Calories and Fat. Enter into the script: cor.test(data$Calories,data$Fat). Then, press Run.
- The output of Pearson's product-moment correlation should come up in the console. There you can see the t statistic, the degrees of freedom, the p-value, the 95 percent confidence interval, and the correlation between the variables.
- Try computing the correlation between two other variables in the dataset.
- An independent 2-group t-test compares a numeric variable across the two groups of a binary factor (for example, tall versus short). If a category has more than two options, you can make it binary by selecting only rows 1 through 57; within those rows the category is binary because there are only 2 options.
- Then, use the function t.test() to run the t-test. Type: t.test(x~y), where x is the numeric variable and y is the binary factor.
- The following should come up in the console.
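
The steps above can be sketched together in one short script. This is only an illustration under stated assumptions: the data frame below is a made-up stand-in so the script runs without a download, its values are not from the menu dataset, and the Size column is a hypothetical binary factor that may not exist in the real file.

```r
# Hypothetical toy data standing in for the menu dataset (values are made up).
# To use the real data instead, replace this block with:
# data <- read.csv("http://joeystanley.com/downloads/menu.csv")
data <- data.frame(
  Calories = c(250, 300, 520, 410, 150, 600, 380, 290),
  Fat      = c(9, 12, 26, 20, 5, 33, 16, 11),
  # Size is an assumed two-level grouping column, used only for the t-test
  Size     = c("small", "small", "large", "large",
               "small", "large", "large", "small")
)

# Mean of a single numeric column
mean_cal <- mean(data$Calories)

# Pearson correlation coefficient between two numeric columns
r <- cor(data$Calories, data$Fat)

# Significance test for that correlation (t statistic, df, p-value, 95% CI)
ct <- cor.test(data$Calories, data$Fat)

# Independent 2-group t-test: numeric variable ~ binary factor
tt <- t.test(Calories ~ Size, data = data)

print(mean_cal)
print(r)
print(ct)
print(tt)
```

Note that t.test() runs Welch's version of the test by default, which does not assume the two groups have equal variances; pass var.equal = TRUE if you want the classic pooled-variance test. Individual pieces of the results can be pulled out by name, for example ct$p.value or tt$statistic.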