## Statistics

20 Feb
1. What is the difference between a statistic and a parameter? (5 points)

1. Describe Type I and Type II error (5 points)
1. The personnel director of a corporation will study the overtime work during the previous year for the 2,575 clerical workers. A sample of 100 of these workers will be chosen at random from the files. The average and the standard deviation of the overtime hours will be calculated and the number of these employees who worked more than 75 hours of overtime will be recorded. For each of the following, state whether it is a statistic or a parameter.  (6 pts)
1. a) The average number of overtime hours for the 2,575 employees.
2. b) The standard deviation of overtime hours for the 100 workers.
3. c) The standard error of the average.
4. d) The proportion of clerical workers in the company who worked more than 75 hours overtime last year.

1. John Rengel is the Quality Assurance Supervisor for Vino Supremo Vinyards.  He knows that 10 percent of each box of corks is undersized. (6 pts)
1. a) If he were to randomly select 120 corks from the next box, then how many of these corks would John expect to be undersized?
2. b) If he were to randomly select 120 corks from each box, then what would John calculate as the standard error of the number of undersized corks?
3. c) What is the probability that John will find 15 or more undersized corks in a random sample of 120 corks from a box?
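The cork question can be set up as a binomial count. The following is a minimal sketch, not a model answer: it assumes a sample of n = 120 corks with p = 0.10 as stated in the question, and it reads part (c) as asking about 15 or more undersized corks in such a sample. The variable and function names are illustrative only.

```python
import math

n = 120      # corks sampled (from the question)
p = 0.10     # proportion undersized (from the question)

expected_count = n * p                        # mean of a binomial count
standard_error = math.sqrt(n * p * (1 - p))   # standard deviation of the count

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Exact upper-tail probability P(X >= 15), summed term by term.
prob_15_or_more = sum(binomial_pmf(k, n, p) for k in range(15, n + 1))

print(expected_count, round(standard_error, 3), round(prob_15_or_more, 4))
```

A normal approximation with a continuity correction is a common alternative to the exact sum when only tables are available.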
1. Sixto Sanchez is the owner of Suburban Stylists. He is evaluating the service level provided to walk-in customers. Because he is enrolled in an MBA program at Eastern University, Sixto decides to sample walk-in customers for the next two weeks. He collects information from 82 walk-in customers and calculates that their average waiting time is 21 minutes with a standard deviation of 4 minutes. (12 pts)
1. a) Determine the degrees of freedom to be used in further analysis.
2. b) Calculate the two sided 95% confidence interval for the population mean of waiting times.
3. c) Calculate the two sided 90% confidence interval for the population mean of waiting times.
4. d) Calculate the two sided 99% confidence interval for the population mean of waiting times.
5. e) What t table value should be used in calculating a two sided 95% confidence interval for the population mean of waiting times if the sample selected is 25 instead of 82?
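The interval calculations in this question follow the pattern mean ± critical value × s/√n. The sketch below is an approximation under a stated assumption: because the Python standard library has no t distribution, it substitutes a standard normal quantile for the t table value, which is close for 81 degrees of freedom but slightly narrower than the t-based interval the question asks for. The function name `approx_interval` is hypothetical.

```python
import math
from statistics import NormalDist

n = 82            # walk-in customers sampled
mean_wait = 21.0  # sample mean, minutes
sd_wait = 4.0     # sample standard deviation, minutes

degrees_of_freedom = n - 1
standard_error = sd_wait / math.sqrt(n)

def approx_interval(confidence):
    """Two-sided interval using a normal quantile in place of the t value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return mean_wait - margin, mean_wait + margin

for level in (0.90, 0.95, 0.99):
    low, high = approx_interval(level)
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})")
```

For the smaller sample in part (e), the normal shortcut is less appropriate and the t table value for n − 1 degrees of freedom should be used instead.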

1. Bay Area Community College (BACC) has collected data comparing the starting salaries of its graduating students with last names beginning with the letters A through M with those whose last names begin with N through Z. The first category, A through M, provided 47 random responses. The second category, N through Z, provided 52 random responses. (12 pts)

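| Students_A_M | Students_N_Z |
|---|---|
| 32890 | 21098 |
| 23000 | 19880 |
| 29015 | 25098 |
| 31898 | 18993 |
| 29466 | 20891 |
| 27890 | 25876 |
| 30654 | 23091 |
| 31973 | 24562 |
| 21456 | 18076 |
| 22999 | 27889 |
| 32776 | 22101 |
| 30434 | 20993 |
| 34954 | 21788 |
| 36987 | 23569 |
| 20876 | 20965 |
| 27654 | 23458 |
| 29432 | 24567 |
| 31876 | 23863 |
| 30987 | 20962 |
| 29670 | 22553 |
| 22579 | 21098 |
| 35985 | 24898 |
| 32470 | 21964 |
| 28763 | 23009 |
| 29415 | 21890 |
| 35071 | 24874 |
| 25983 | 19823 |
| 21880 | 24538 |
| 35593 | 24729 |
| 22569 | 23789 |
| 29841 | 25698 |
| 32650 | 20436 |
| 29427 | 24990 |
| 29996 | 21343 |
| 31874 | 23451 |
| 23195 | 21099 |
| 34990 | 25765 |
| 29845 | 22567 |
| 29996 | 27899 |
| 33889 | 21005 |
| 25182 | 22890 |
| 36201 | 24321 |
| 22994 | 21054 |
| 35789 | 22378 |
| 23198 | 27654 |
| 29468 | 25645 |
| 29379 | 19765 |
|  | 23109 |
|  | 18654 |
|  | 22891 |
|  | 19804 |
|  | 25993 |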

1. a) If you assume that last names should not have an impact on starting salary of graduates of BACC, then what is the appropriate null hypothesis?
2. b) State the research hypothesis in words and in notation.
3. c) Calculate the appropriate test statistic.
4. d) Calculate the appropriate p-value for the test statistic.
5. e) Is the statistic statistically significant?
6. f) What type of error if any has been committed?
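One way to approach parts (c) and (d) is a two-sample t statistic. The sketch below makes several assumptions that are not in the question: it reads the salary listing as alternating A–M and N–Z values with the last five entries belonging to N–Z only (which matches the stated counts of 47 and 52), it uses the unequal-variance (Welch) form rather than a pooled test, and it approximates the two-sided p-value with the standard normal distribution because the samples are fairly large. The function name `welch_t` is illustrative.

```python
import math
from statistics import NormalDist, mean, variance

students_a_m = [
    32890, 23000, 29015, 31898, 29466, 27890, 30654, 31973, 21456, 22999,
    32776, 30434, 34954, 36987, 20876, 27654, 29432, 31876, 30987, 29670,
    22579, 35985, 32470, 28763, 29415, 35071, 25983, 21880, 35593, 22569,
    29841, 32650, 29427, 29996, 31874, 23195, 34990, 29845, 29996, 33889,
    25182, 36201, 22994, 35789, 23198, 29468, 29379,
]
students_n_z = [
    21098, 19880, 25098, 18993, 20891, 25876, 23091, 24562, 18076, 27889,
    22101, 20993, 21788, 23569, 20965, 23458, 24567, 23863, 20962, 22553,
    21098, 24898, 21964, 23009, 21890, 24874, 19823, 24538, 24729, 23789,
    25698, 20436, 24990, 21343, 23451, 21099, 25765, 22567, 27899, 21005,
    22890, 24321, 21054, 22378, 27654, 25645, 19765, 23109, 18654, 22891,
    19804, 25993,
]

def welch_t(sample_1, sample_2):
    """Two-sample t statistic without assuming equal variances."""
    se = math.sqrt(variance(sample_1) / len(sample_1)
                   + variance(sample_2) / len(sample_2))
    return (mean(sample_1) - mean(sample_2)) / se

t_stat = welch_t(students_a_m, students_n_z)

# Large-sample approximation: two-sided p-value from the standard normal.
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(round(t_stat, 2), p_value)
```

If the course expects a pooled-variance test or an exact t-based p-value, the standard error and the reference distribution would change accordingly.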

1. Plymouth Rock Securities is interested in finding out if there is a relationship between the number of new clients brought into the firm by a broker and the sales performance of the broker. A random sample of 11 brokers’ records is reviewed to determine the number of new clients enrolled last year and total sales in millions of dollars: (12 pts)

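| Broker | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Clients | 27 | 11 | 42 | 33 | 15 | 15 | 25 | 36 | 28 | 30 | 17 |
| Sales (\$ millions) | 52 | 37 | 64 | 55 | 29 | 34 | 58 | 59 | 44 | 48 | 31 |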

1. a) How closely related is the new client base to sales performance? Draw the scatterplot, compute the correlation, and describe the relationship.
2. b) Find the least-squares equation to predict sales from number of clients. Can the least squares equation be used to predict sales?
3. c) What does the slope represent?
4. d) What would a new broker who brings in 30 clients sell, on average?
5. e) How much of the variability in sales is not explained by the number of new clients?
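The correlation, least-squares line, prediction, and unexplained variability all come from the same three sums of squares. The following is a minimal sketch using the broker data from the question; the variable names are illustrative, and the scatterplot in part (a) is not covered.

```python
import math

clients = [27, 11, 42, 33, 15, 15, 25, 36, 28, 30, 17]
sales = [52, 37, 64, 55, 29, 34, 58, 59, 44, 48, 31]   # millions of dollars

n = len(clients)
mean_x = sum(clients) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in clients)
syy = sum((y - mean_y) ** 2 for y in sales)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(clients, sales))

correlation = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx                      # change in sales per additional client
intercept = mean_y - slope * mean_x

predicted_sales_30 = intercept + slope * 30   # part (d): broker with 30 clients
unexplained_share = 1 - correlation ** 2      # part (e): 1 - R^2

print(round(correlation, 3), round(slope, 3), round(intercept, 3))
```

Because `unexplained_share` is one minus the squared correlation, it applies only to this simple one-predictor regression.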

1. Jean Siskel is an entertainment analyst for West Coast Securities. He is trying to develop a model to estimate the gross earnings generated by a new movie release. He has collected the following data on 20 movies: gross earnings, production costs, promotion costs, and whether the movie is based on a bestselling novel. (12 pts)

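| Movie | Gross Earnings (Millions \$) | Production Cost (Millions \$) | Promotion Cost (Millions \$) | Novel |
|---|---|---|---|---|
| 1 | 28 | 4.2 | 1 | 0 |
| 2 | 35 | 6 | 3 | 1 |
| 3 | 50 | 5.5 | 6 | 1 |
| 4 | 20 | 3.3 | 1 | 0 |
| 5 | 75 | 12.5 | 11 | 1 |
| 6 | 60 | 9.6 | 8 | 1 |
| 7 | 15 | 2.5 | 0.5 | 0 |
| 8 | 72 | 10 | 12 | 1 |
| 9 | 45 | 6.4 | 8 | 1 |
| 10 | 37 | 7.5 | 5 | 0 |
| 11 | 30 | 5.0 | 1 | 1 |
| 12 | 63 | 10.1 | 10 | 0 |
| 13 | 58 | 7.8 | 9 | 1 |
| 14 | 50 | 6.9 | 10 | 0 |
| 15 | 24 | 3.5 | 4 | 0 |
| 16 | 82 | 11.0 | 15 | 1 |
| 17 | 48 | 10.7 | 1 | 1 |
| 18 | 34 | 6.6 | 2 | 0 |
| 19 | 50 | 8.4 | 3 | 1 |
| 20 | 45 | 10.8 | 5 | 0 |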

1. What type of variable is Novel?
2. What is the estimated multiple linear regression equation derived from this data?
3. What are the regression coefficients for each X variable? Interpret each regression coefficient.
4. Will Jean be pleased with the results?
5. Interpret the intercept value.
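The multiple regression in this question would normally be fitted with statistical software. As a rough sketch of what such software does, the code below solves the normal equations (XᵀX)b = Xᵀy directly for the movie data, treating Novel as a 0/1 indicator as it appears in the table. The helper `solve` and all variable names are hypothetical, and the approach is only one of several ways to obtain the estimates.

```python
# Columns: gross earnings, production cost, promotion cost (millions of dollars), novel (0/1)
movies = [
    (28, 4.2, 1, 0), (35, 6, 3, 1), (50, 5.5, 6, 1), (20, 3.3, 1, 0),
    (75, 12.5, 11, 1), (60, 9.6, 8, 1), (15, 2.5, 0.5, 0), (72, 10, 12, 1),
    (45, 6.4, 8, 1), (37, 7.5, 5, 0), (30, 5.0, 1, 1), (63, 10.1, 10, 0),
    (58, 7.8, 9, 1), (50, 6.9, 10, 0), (24, 3.5, 4, 0), (82, 11.0, 15, 1),
    (48, 10.7, 1, 1), (34, 6.6, 2, 0), (50, 8.4, 3, 1), (45, 10.8, 5, 0),
]

y = [row[0] for row in movies]
X = [[1.0, row[1], row[2], row[3]] for row in movies]   # leading 1.0 is the intercept column

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]

k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]

coefficients = solve(xtx, xty)   # [intercept, production, promotion, novel]
fitted = [sum(b * v for b, v in zip(coefficients, row)) for row in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print([round(b, 3) for b in coefficients])
```

The sketch produces only the point estimates; judging whether the model is satisfactory also requires R², standard errors, and significance tests, which regression software reports.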

1. The following data represent revenues in thousands of dollars for a manufacturer of small electric appliances. (15 pts)

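| Year | Quarter | Revenues |
|---|---|---|
| 1996 | 1 | 514 |
| 1996 | 2 | 822 |
| 1996 | 3 | 648 |
| 1996 | 4 | 976 |
| 1997 | 1 | 616 |
| 1997 | 2 | 884 |
| 1997 | 3 | 678 |
| 1997 | 4 | 996 |
| 1998 | 1 | 658 |
| 1998 | 2 | 850 |
| 1998 | 3 | 714 |
| 1998 | 4 | 1052 |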

1. a) Calculate the moving averages for this time series.
2. b) Find the seasonal index for each quarter.
3. c) From the fourth quarter of 1997 to the first quarter of 1998, revenues declined.  What happened on a seasonally adjusted basis?
4. d) Compute the forecast for the second quarter of 2002.

Extra credit (+2): Find the regression equation to predict the long-term trend in the seasonally adjusted revenues.
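The moving-average and seasonal-index steps in this question can be sketched with the ratio-to-moving-average method. The code below assumes a centered four-quarter moving average and takes the seasonal index for each quarter as the plain average of its ratios, without rescaling the four indices to average exactly 1; a textbook may apply that extra normalization. The variable names are illustrative, and the trend regression and forecast are not covered.

```python
revenues = [514, 822, 648, 976,    # 1996 Q1-Q4
            616, 884, 678, 996,    # 1997 Q1-Q4
            658, 850, 714, 1052]   # 1998 Q1-Q4

# Centered 4-quarter moving average: average two adjacent 4-quarter means.
# The first value lines up with 1996 Q3 (index 2), the last with 1998 Q2 (index 9).
centered_ma = []
for i in range(2, len(revenues) - 2):
    first_window = sum(revenues[i - 2:i + 2]) / 4
    second_window = sum(revenues[i - 1:i + 3]) / 4
    centered_ma.append((first_window + second_window) / 2)

# Ratio-to-moving-average, grouped by quarter (0 = Q1, ..., 3 = Q4).
ratios_by_quarter = {q: [] for q in range(4)}
for offset, ma in enumerate(centered_ma):
    index = offset + 2
    ratios_by_quarter[index % 4].append(revenues[index] / ma)

seasonal_index = {q + 1: sum(r) / len(r) for q, r in ratios_by_quarter.items()}

# Seasonally adjusted series: divide each observation by its quarter's index.
seasonally_adjusted = [rev / seasonal_index[i % 4 + 1]
                       for i, rev in enumerate(revenues)]

print({q: round(s, 3) for q, s in seasonal_index.items()})
```

Comparing `seasonally_adjusted[7]` (1997 Q4) with `seasonally_adjusted[8]` (1998 Q1) addresses part (c).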

1. Bay Area University enrolls MBA students in three cohort programs: Weeknight, Saturday, and Distance. Dean Ed Epstein wants to know if there is a difference in the average age of the students in the three programs. He has his assistant take a random sample of 5 students from each program and record their ages. (15 pts)

| Weeknight | Saturday | Distance |
|---|---|---|
| 29 | 32 | 25 |
| 27 | 33 | 24 |
| 30 | 31 | 24 |
| 27 | 34 | 25 |
| 28 | 30 | 26 |

1. a) State the null hypothesis and the research hypothesis to be tested.
2. b) Calculate the F statistic.
3. c) Should the Null hypothesis be rejected at the 5% level of significance?
4. d) Draw box plots for the different programs.
5. e) Using the least-significant difference test, identify the significant differences between the programs.
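The F statistic in part (b) is the ratio of the between-group mean square to the within-group mean square. The following is a minimal sketch of a one-way ANOVA on the age data from the question; the variable names are illustrative, and the box plots and the least-significant difference test are not covered.

```python
ages = {
    "Weeknight": [29, 27, 30, 27, 28],
    "Saturday": [32, 33, 31, 34, 30],
    "Distance": [25, 24, 24, 25, 26],
}

all_ages = [a for group in ages.values() for a in group]
grand_mean = sum(all_ages) / len(all_ages)

# Between-group variation: group size times squared distance of each group mean
# from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in ages.values())

# Within-group variation: squared distance of each age from its own group mean.
ss_within = sum((a - sum(g) / len(g)) ** 2
                for g in ages.values() for a in g)

df_between = len(ages) - 1               # number of groups minus 1
df_within = len(all_ages) - len(ages)    # observations minus number of groups

f_statistic = (ss_between / df_between) / (ss_within / df_within)

print(df_between, df_within, round(f_statistic, 2))
```

The resulting value is compared with the F table entry for `df_between` and `df_within` degrees of freedom at the chosen significance level.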

EXTRA CREDIT: Please describe the different parts of a statistical report and what is included in each section. (5 pts)