MAT167 Introduction to Statistics

An introduction to statistics. Includes sampling, data display, measures of central tendency, variability, and position; random variables, probability, probability distributions; sampling distributions, assessing normality, confidence intervals, hypothesis testing, ANOVA, and regression. Use of the statistics software R is taught throughout the course.

Announcements

Final grades have been posted. The solutions to the final exam are now on the website. Have a good summer and thanks for all your hard work this semester. –Anthony

Course Info

Instructor Info

Exam Dates

Resources

Quick Reference Sheet:

R Statistics Software

Using R on Campus:
R is on a few computers in the Academic Commons Computer Lab (2nd floor Santa Catalina building). Just go into the computer lab and ask Jody or Dennis (one of the lab managers) where the computers are. Printers are also available in this lab.
To install R:
Visit the R Resources page.
Basic R usage and examples:
R Basics page
R Data Sets:
  • Triola Book Data Page
  • Class Survey Data Page
  • See quick reference sheet or R intro lecture for info on how to use the data sets.
  • Technical Notes on R:
    When you start a new problem, it’s best to delete all the variables so that you don’t accidentally use old data. To do so, type rm(list=ls()). Note that you will need to reload the book data afterward if you need it.
    When you close R, you do not need to save the workspace if it asks. Saving the workspace only saves the variables you have defined.

    Solutions / Exams

    Lectures and Homework

    All homework is due at the beginning of class on Tuesday. Thus, homework assigned on Tuesday and Thursday is due at the beginning of class on the following Tuesday.

    1. Tue, Jan 20

      FOUNDATIONS

      Introductory Material. (Sections 1.1-1.4)

    2. Thur, Jan 22

      Introduction to R.

    3. Tue, Jan 27

      DESCRIPTIVE STATISTICS

      Summarizing & graphing data. (Sections 2.1-2.4)

      • Lecture: Handout

      • Homework:

        • From this point forward if the book has the TI symbol next to a problem, use R to do it.
        • As always, make sure to include your plots made with R in the HW.
        • If you get stuck using R take a look at these R examples.
        1. Sec 2.2: 1-17 odds (do 17 by hand)

        2. Sec 2.3: 1, 3.

        3. Additional Problem for 2.3: Use R to make two histograms: one of the male heights and one of the female heights in the Appendix B Data Set 1 (Mhealth and Fhealth tables in R). Include both histograms in your HW. Then write a paragraph discussing the differences between the male and female heights that you can see from the histograms (i.e., center, variation, shape, outliers, min, max). Does either histogram have a distribution that is approximately normal?

          HINTS: If you are having trouble getting the book data, see the R intro lecture or look at the back of the quick reference sheet. See the top part of this page to download the data sets.

        4. Sec 2.4: 1-4, 9 (use R), 13 (just sketch by hand), 17 (use R), 19 (use R)

          Hint for 19: To make a plot with both lines and points rather than a scatter plot, use the optional argument type="b" in the plot function, e.g. plot(t, y, type="b"). The time vector t goes from 1990 to 2000. A quick way to make t is the shortcut t=1990:2000.
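The two plotting techniques mentioned in the hints above can be sketched as follows. This is only an illustration with made-up numbers: `heights` and `y` are hypothetical stand-ins, not the book data (the real heights are in the Mhealth and Fhealth tables).

```r
# Hypothetical data: 40 simulated heights in inches (NOT the book's data)
set.seed(1)
heights <- rnorm(40, mean = 68, sd = 3)

# Histogram: use it to judge center, variation, shape, and outliers
h <- hist(heights, main = "Simulated heights", xlab = "Height (in)")

# Time vector from 1990 to 2000 using the colon shortcut
t <- 1990:2000

# Hypothetical yearly values, one for each year in t
y <- c(5, 7, 6, 9, 11, 10, 13, 12, 15, 14, 17)

# type = "b" draws both the points and the lines connecting them
plot(t, y, type = "b", main = "Hypothetical yearly values", xlab = "Year")
```

To compare two groups, make one histogram per group and use the same axis range for both so that the differences are easy to see.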

    4. Thur, Jan 29

      Summation Notation.

    5. Tue, Feb 3

      Measures of center. (Sections 3.1-3.2)

      • Lecture: Handout

      • Homework:

        1. Sec 3.2: 1-9 odds, 13, 15, 21, 23, 25, 29 (Use R if the problem has TI by it from now on.)

          Hint for 21: to get the first set of differences use:

          x=WEATHER$HIGH-WEATHER$PREDICTE

          You can find the second set of differences in the same way once you figure out the correct column name. (I admit the author’s column names are not that good).

          Hint for 23: to get the pennies for before 1983:

          x=Coins$WEIGHT[Coins$TYPE=="Pre-1983 Pennies"]

          To find the post 1983 pennies use the same method but take a look at the Coins table to see what they are called and then modify the above statement.

          Hint on 29 b: mean(x, trim=0.10)
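The three hints above use column differences, logical indexing, and a trimmed mean. Here is a minimal sketch of each on made-up data. The table `toy` and its columns GROUP and VALUE are hypothetical and are not the book's WEATHER or Coins tables.

```r
# Hypothetical table (NOT the book data): one label column, one measurement column
toy <- data.frame(
  GROUP = c("A", "A", "A", "B", "B"),
  VALUE = c(3.10, 3.12, 3.08, 2.50, 2.48)
)

# See which labels exist before writing the comparison
labels <- unique(toy$GROUP)

# Logical indexing: keep only the VALUE entries whose GROUP equals "A"
a_values <- toy$VALUE[toy$GROUP == "A"]
mean_a <- mean(a_values)

# Element-wise difference of two vectors of equal length
actual      <- c(70, 68, 75)
predicted   <- c(72, 65, 75)
differences <- actual - predicted

# Trimmed mean: trim = 0.10 drops the lowest 10% and the highest 10%
# of the sorted values before averaging
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
plain_mean   <- mean(x)               # pulled up by the outlier 100
trimmed_mean <- mean(x, trim = 0.10)  # the values 1 and 100 are dropped
```

The text inside the quotes must match the label in the table exactly, including capitalization and spaces.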

    6. Thur, Feb 5

      Measures of variation. (Section 3.3)

      • Lecture: Handout

      • Homework:

        1. Sec 3.3: 1-9 odds, 15, 21
    7. Tue, Feb 10

      Relative standing and exploratory data analysis. (Sections 3.4-3.5)

      • Lecture: Handout

      • Homework:

        1. Sec 3.4: 1, 5, 7, 9, 11, 13-27 odds

        2. Sec 3.5: 1, 3, 5, 9

        3. Additional problem: Use the following code to make two boxplots for comparing gender against bear weight and length. Then use the boxplots to discuss and compare the distribution of lengths and weights of bears in terms of their gender. (Make sure the book data is loaded into R first)

          boxplot(Bears$LENGTH ~ Bears$SEX, main="Comparison of bear length")
          boxplot(Bears$WEIGHT ~ Bears$SEX, main="Comparison of bear weight")
    8. Thur, Feb 12

      Descriptive Statistics: Case Study.

      PROBABILITY

      Probability I: Addition rule. (Sections 4.1-4.3)

      • Lecture: Handout

      • Homework:

        1. Sec 4.2: 1-25 odds, 29
        2. Sec 4.3: 1-23 odds
    9. Tue, Feb 17

      Probability II: Multiplication rule. (Sections 4.4-4.5)

      • Lecture: Handout

      • Homework:

        1. Sec 4.4: 1-21 odds
        2. Sec 4.5: 1-25 odds
    10. Thur, Feb 19

      Random variables (Sections 5.1-5.2)

      • Lecture: Handout (Print out the next lecture on counting; we may cover part of it if we have time.)

      • Homework:

        1. Sec 5.2: 1-19 odds
    11. Tue, Feb 24

      MIDTERM I (Chapters 1-4)

    12. Thur, Feb 26

      Rodeo Holiday (No Classes)

    13. Tue, Mar 3

      Counting & Binomial distribution. (Sections 4.7, 5.3-5.4)

      • Lecture: Handout A, Handout B

      • Homework:

        1. Sec 4.7: 1, 5, 7, 9, 13
        2. Sec 5.3: 1, 3, every other odd 5-33, 35 (If the book says to use a table in the appendix, use dbinom in R instead.)
        3. Sec 5.4: 1, 3, every other odd 5-17, 19
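As a rough illustration of using dbinom in place of the appendix table, here is a small sketch. The values n = 5 and p = 0.5 are made up for the example and do not come from any homework problem.

```r
# Hypothetical setting: n = 5 trials, each with success probability p = 0.5
n <- 5
p <- 0.5

# P(X = 3): probability of exactly 3 successes
p_exactly_3 <- dbinom(3, size = n, prob = p)

# P(X <= 2): cumulative probability of at most 2 successes
p_at_most_2 <- pbinom(2, size = n, prob = p)

# P(X >= 3) using the complement rule
p_at_least_3 <- 1 - p_at_most_2

# The whole probability table for x = 0, 1, ..., n, like the one in the appendix
prob_table <- data.frame(x = 0:n, prob = dbinom(0:n, size = n, prob = p))
print(prob_table)
```

dbinom gives the probability of a single value of x, while pbinom adds up the probabilities from 0 through x.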
    14. Thur, Mar 5

      Intro to the normal distribution. (Sections 6.1-6.2)

      • Lecture: Handout

      • Homework:

        1. Sec 6.2: 1-4, 5-39 odds (most of these are easy if you use the pnorm and qnorm R functions). You must make sketches to show the area.

          NOTE: From this point onward, if the book says to use a lookup table in Appendix A, use R instead. (You won’t be given tables on the tests.)

          HINT: If you use the technique I used in class, you don’t need to find z scores OR use the table in the back of the book. If the question refers to data that has a standard normal distribution, then it has a normal distribution with a mean=0 and a standard deviation=1.

          For example, to do 6.2 #10, it says to find the probability a thermometer has a reading less than -2.50 if the readings have a standard normal distribution. Thus we want to find P(x<-2.50) where x has a standard normal distribution. In R you would type:

          > pnorm(-2.50, mean=0, sd=1)
          [1] 0.006209665

          So the probability is only 0.00621!
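The same approach covers the other common cases: an area to the right, an area between two values, and finding a value from a known area. The numbers below are made up for illustration and are not taken from the homework.

```r
# Area to the right: P(x > 1.50) for a standard normal distribution
p_upper <- pnorm(1.50, mean = 0, sd = 1, lower.tail = FALSE)   # about 0.0668

# Area between two values: P(-1 < x < 1) is the difference of two left areas
p_between <- pnorm(1, mean = 0, sd = 1) - pnorm(-1, mean = 0, sd = 1)   # about 0.6827

# Inverse problem: the value that has 97.5% of the area to its left
x_cut <- qnorm(0.975, mean = 0, sd = 1)   # about 1.96
```

pnorm goes from a value to an area, and qnorm goes from an area back to a value. Sketching the curve first makes it clear which of the two you need.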

    15. Tue, Mar 10

      Normal distribution cont. (Section 6.3)

      • Lecture: Continuation of previous lecture.

      • Homework:

        1. Sec 6.3: 1, 2, 4, 5-23 odds Make sketches to show the area.
    16. Thur, Mar 12

      INFERENTIAL STATISTICS

      Sampling distributions, estimators, and the Central limit theorem (CLT). (Sections 6.4-6.5)

      • Lecture: Handout

      • Homework:

        1. Sec 6.4: 1-7 odds, 11
        2. Sec 6.5: 1-17 odds Make sketches
    17. Tue, Mar 17 & Thur, Mar 19

      Spring Break (No class)

    18. Tue, Mar 24

      Normal as approx. to the binomial and assessing normality. (Sections 6.6-6.7)

      • Lecture: Handout A, Handout B

      • Homework:

        1. Sec 6.6: 1-23 odds (Use R not the appendix tables!) Make sketches
        2. Sec 6.7: 1, 3, 9 & 13, 11 & 15
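To see how well the normal distribution approximates the binomial, the sketch below compares the two for one made-up case (n = 100, p = 0.5, finding P(X <= 55)). The requirement check of np >= 5 and n(1-p) >= 5 is the commonly used rule of thumb and is assumed here; check the lecture notes for the exact condition used in class.

```r
# Hypothetical binomial setting
n <- 100
p <- 0.5

# Rule-of-thumb check before using the approximation (assumed thresholds)
requirements_ok <- (n * p >= 5) && (n * (1 - p) >= 5)

# Mean and standard deviation of the binomial distribution
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

# Exact binomial probability P(X <= 55)
exact <- pbinom(55, size = n, prob = p)

# Normal approximation with the continuity correction:
# "at most 55" becomes the area to the left of 55.5
approx <- pnorm(55.5, mean = mu, sd = sigma)

print(c(exact = exact, approx = approx))
```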
    19. Thur, Mar 26

      Estimating a population proportion (Sections 7.1-7.2)

      • Lecture: Handout

      • Homework: (Yes, there are many problems for this HW, but these problems require practice.)

        1. Sec. 7.2: 1-35 odds
    20. Tue, Mar 31

      Estimating a population mean. (Sections 7.3-7.4)

      • Lecture: Handout

      • Homework: (Yes, there are many problems for this HW, but these problems require practice.)

        1. Sec. 7.3: 1-23 odds, 27, 29, 33
        2. Sec. 7.4: 1-13 odds, 19, 21, 23
    21. Thur, April 2

      HYPOTHESIS TESTING

      Intro to hypothesis testing (Sections 8.1-8.2)

      • Lecture: Handout

      • Homework:

        1. Sec. 8.2: 1-43 odds (skip 17-23). You don’t need to find critical values. However, if the book asks you to find the test statistic, find that using the equation.

          Hint for 29-36: If you have the test statistic and it’s a z-score, use pnorm, the cumulative probability function for the standard normal distribution, to find the tail area. See the section on p-values in the notes, which also discusses what to do.
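A minimal sketch of turning a z test statistic into a p-value with pnorm. The value z = 2.10 is made up for illustration. Which tail to use depends on the alternative hypothesis.

```r
# Hypothetical z test statistic
z <- 2.10

# Right-tailed test: area to the right of z
p_right <- pnorm(z, mean = 0, sd = 1, lower.tail = FALSE)

# Left-tailed test: area to the left of z
p_left <- pnorm(z, mean = 0, sd = 1)

# Two-tailed test: twice the area in the tail beyond |z|
p_two <- 2 * pnorm(-abs(z), mean = 0, sd = 1)

print(c(right = p_right, left = p_left, two_tailed = p_two))
```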

    22. Tue, April 7

      Testing a claim about a proportion (Section 8.3)

      • Lecture: Continuation of last lecture handout.

      • Homework:

        1. Sec. 8.3: 1-3 odds, 5(c,d,e), 9, 15, 19, 23

          Note 1: You do not need to find the test statistic or critical values. We are using the p-values.

          Note 2: R uses the continuity correction for more accurate p-values. Your p-values and test statistics will differ from the book’s answers by a few percent. The following are a few of the p-values you will get with R to help you verify your work: Q5: p-value = 0.9114, Q9: p-value < 2.2e-16, Q15: p-value = 0.5395.
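The effect of the continuity correction can be seen by running the test both ways. The sketch below assumes the test is done with R's prop.test function, and the counts (58 successes in 100 trials, claimed proportion 0.5) are made up for illustration.

```r
# Hypothetical data: 58 successes in 100 trials, testing the claim p = 0.5
x  <- 58
n  <- 100
p0 <- 0.5

# prop.test applies the continuity correction by default (correct = TRUE)
with_cc <- prop.test(x, n, p = p0, correct = TRUE)

# Turning the correction off gives the textbook-style z test result
without_cc <- prop.test(x, n, p = p0, correct = FALSE)

# The same two-tailed p-value by hand from the z test statistic
z <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)
p_by_hand <- 2 * pnorm(-abs(z))

print(c(corrected = with_cc$p.value,
        uncorrected = without_cc$p.value,
        by_hand = p_by_hand))
```

The corrected p-value is a little larger than the uncorrected one, which is why answers from R can differ slightly from the book's.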

    23. Thur, April 9

      Testing a claim about a mean (Sections 8.4-8.5)

      • Lecture: Handout

      • Homework:

        1. Sec. 8.4: 1-7 odds, 13, 15
        2. Sec. 8.5: 3-13 odds, 21, 25, 27, 31
    24. Tue, April 14

      Understanding tests and estimates

      • Lecture: Handout
      • Homework: Study for the exam. I won’t accept any email questions after 5 pm the night before the exam. Don’t start studying the night before the test.
    25. Thur, April 16

      MIDTERM II (Chapters 5-8 and 4.7)

    26. Tue, April 21

      Inferences about two proportions (Sections 9.1-9.2)

      • Lecture: Handout

      • Homework:

        1. Sec. 9.2: 1-7 odds, 15, 17, 19, 21, 25

          Note that R uses the continuity correction so the p-values will differ by a few percent from the book’s.

    27. Thur, April 23

      Inferences about two means & matched pairs (Sections 9.3-9.4)

      • Lecture: Handout

      • Homework:

        1. Sec. 9.3: 1-7 odds, 23, 25, 27, 28

          Hint for 27: Use the Coins table. To get the quarters from before 1964:

          pre=Coins$WEIGHT[Coins$TYPE=="Pre-1964 Quarters"]

          To find the post 1964 quarters use the same method but take a look at the Coins table to see what they are called and then modify the above statement. The command summary(Coins) is helpful to find the categories.

          Hint for 28: Use the Cola table. Just figure out which two columns you need.

        2. Sec. 9.4: 1, 3, 5 (manually find the test statistic & p-value only), 13, 15, 17 (b-c), 19
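For matched pairs, finding the test statistic and p-value manually can be sketched as below and then checked against R's t.test. The vectors `before` and `after` are made-up measurements, not data from the book.

```r
# Hypothetical matched pairs: the same 6 subjects measured twice
before <- c(200, 185, 210, 195, 220, 205)
after  <- c(192, 186, 201, 190, 214, 199)

# Work with the paired differences
d <- before - after
n <- length(d)

# Test statistic for H0: mean difference = 0
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Two-tailed p-value from the t distribution with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Check the manual result against the built-in paired t test
check <- t.test(before, after, paired = TRUE)

print(c(t = t_stat, p = p_value))
print(check)
```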

    28. Tue, April 28

      MODELING AND TESTING RELATIONSHIPS

      Correlation (Sections 10.1-10.2)

      • Lecture: Handout

      • Homework: Include a scatter plot for each set of data for which you find r.

        1. Sec. 10.2: 1-11 odds, 21, 23, 27, 29, 31, 33, 35
    29. Thur, April 30

      Regression (Section 10.3)

      • Lecture: Handout

      • Homework: Make sure to determine if r is significant first (via hypothesis test) SHOW WORK. Include scatter plots with regression line and residual plots.

        1. Sec. 10.3: 1-11 odds, 21, 23, 27, 29, 31, 33

          Hint for 5 and 7: You will need to determine if the linear correlation coefficient is significant. Since r and n are already given, just use the test statistic equation to manually find the p-value.
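A sketch of finding the p-value for r by hand, followed by the plots the homework asks for. The data are made up, and the use of cor.test and lm here is an assumption for illustration; see the lecture handout for the exact method used in class.

```r
# Hypothetical paired data
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.4, 7.7)

# Linear correlation coefficient and sample size
r <- cor(x, y)
n <- length(x)

# Test statistic for H0: no linear correlation, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-tailed p-value
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Check the manual p-value against the built-in test
check <- cor.test(x, y)

# Scatter plot with the regression line
fit <- lm(y ~ x)
plot(x, y, main = "Hypothetical data with regression line")
abline(fit)

# Residual plot: look for a pattern around the horizontal line at 0
plot(x, resid(fit), main = "Residuals", ylab = "Residual")
abline(h = 0)
```

When only r and n are given, as in problems 5 and 7, skip the cor line and type the given values in directly.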

      Variation and prediction intervals, multiple regression (Sections 10.4-10.5)

      • Lecture: Continuation of regression lecture.
      • Homework: No HW. These sections are optional course material. However, I highly recommend you read them.
    30. Tue, May 5

      Contingency tables (Section 11.3)

      • Lecture: Handout

      • Homework: Note that R uses Yates’s continuity correction, so your P-values may differ slightly from the book’s.

        1. Sec. 11.3: 1-5, 7, 11, 13, 17, 21
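The difference the correction makes can be sketched with a made-up 2 by 2 table. This assumes the test is run with R's chisq.test function, which applies the continuity correction by default only to 2 by 2 tables. The counts below are hypothetical.

```r
# Hypothetical 2 x 2 contingency table of counts (filled column by column)
counts <- matrix(c(12, 8, 5, 15), nrow = 2,
                 dimnames = list(Group = c("A", "B"),
                                 Outcome = c("Yes", "No")))

# Default: Yates's continuity correction is applied (correct = TRUE)
with_yates <- chisq.test(counts)

# Without the correction, the result matches the textbook formula
without_yates <- chisq.test(counts, correct = FALSE)

# Expected counts, useful for checking the test requirements
print(with_yates$expected)

print(c(corrected = with_yates$p.value,
        uncorrected = without_yates$p.value))
```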
    31. Thur, May 7

      ANOVA I (Section 12.1)

      • Lecture: Handout

      • Homework:

        1. Sec. 12.2: 1-4, 5 (skip d), 9

      ANOVA II (Section 12.2)

      • Lecture: Continuation of last lecture handout.

      • Homework: Don’t type in the data for 11-14 manually. Instead, download the Chapter 12 Data File and load it into R (just like the book data); it has data for each problem. The table name is listed next to each problem. Also, don’t forget to include the boxplots.

        1. Sec. 12.2: 11 car.crash (p-val=0.421), 12 car.crash (p-val=0.296), 13 stress (p-val=0.091), 14 skulls (p-val=0.0305), 16 (p-val=0.0369)
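A rough sketch of a one-way ANOVA with boxplots on made-up data. The table `scores` and its columns are hypothetical, and using the aov function is an assumption; the tables in the Chapter 12 Data File may be laid out differently, so follow the lecture handout for the actual problems.

```r
# Hypothetical data: 3 groups with 4 measurements each
scores <- data.frame(
  value = c(23, 25, 21, 24,  30, 28, 31, 29,  22, 20, 24, 23),
  group = factor(rep(c("g1", "g2", "g3"), each = 4))
)

# Side-by-side boxplots to compare the groups visually
boxplot(value ~ group, data = scores, main = "Hypothetical groups")

# One-way ANOVA: H0 is that all group means are equal
fit <- aov(value ~ group, data = scores)
anova_table <- summary(fit)[[1]]
print(anova_table)

# p-value for the group effect (first row of the ANOVA table)
p_value <- anova_table[["Pr(>F)"]][1]
```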
    32. Tue, May 12

      Review / Questions

      • Lecture: Handout
      • Homework: Study for the final exam.
    33. Thur, May 14

      Review / Questions

    34. Tue, May 19

      FINAL EXAM Chapters 1-12 (2 hours - early class start time)

      8:10 am to 10:10 am

      If you would like your final exam back, turn in a self-addressed stamped envelope, with 2 first-class stamps affixed, along with your final exam. Once the exams are graded, I will mail back those for which I have envelopes. If you don’t provide an envelope, your exam will be shredded for your privacy.

    Updated on:
    Thu May 21 2009 at 12 PM
