# HSH746 BIOSTATISTICS 1

# ASSIGNMENT 2 (30% OF TOTAL MARK)

## Due date: 19 May 2017

### Instructions

This assignment covers the unit content up to and including week 8.

Please note: this assessment task must be all your own work. Please do not discuss questions and answers in detail with your fellow students.

Assignments must be submitted online via the assignment folder in the unit site in DeakinSync **before 5 pm on 19 May 2017**. Assignments must be submitted as a Microsoft Word document or an editable PDF.

Some of the questions require calculations using Stata. Where you have used Stata for calculations, you should copy the Stata commands and output from the Stata results screen and paste them into your assignment so that the assessor can see how you have derived your answer. Note: this Stata output is required __in addition__ to your answer to the question. Simply pasting in the Stata output will not be considered an adequate answer on its own.

This assignment is worth 30% of the final mark for HSH746 and the marks allocated for each question are shown.

Students should ensure that they keep a spare copy of their work.

# Questions

**Read the following data description and answer the following questions**

The data set *assignment 2 sysbp.csv* contains the results of a clinical trial to test the effectiveness of a drug in lowering blood pressure. The data represent measurements of systolic blood pressure for a group of 29 women who are taking the drug and a comparison group of 29 women of similar age who are not taking the drug. The two samples have been selected to be statistically independent of each other and individuals were selected so that the data in each sample are independent.

A person is considered to have a systolic blood pressure which puts their health at risk if their systolic blood pressure is over 140 mmHg. So this data set also contains a variable indicating this risk. This variable takes the value 0 if the systolic blood pressure measurement is at or below 140 and 1 if the systolic blood pressure measurement is over 140.
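The coding rule for this risk variable can be illustrated with a short sketch. It is written in Python purely for illustration (the assignment calculations themselves are done in Stata); the function name `atrisk_flag` and the example readings are assumptions and are not taken from the data set.

```python
def atrisk_flag(sysbp: float) -> int:
    """Return 1 if the systolic blood pressure (mmHg) is over 140, otherwise 0."""
    # Exactly 140 counts as not at risk, because the rule is "over 140".
    return 1 if sysbp > 140 else 0


# Hypothetical readings chosen to show both sides of the 140 mmHg cut-off
example_readings = [98.9, 140.0, 140.1, 216.5]
print([atrisk_flag(x) for x in example_readings])  # prints [0, 0, 1, 1]
```

Note that a reading of exactly 140 mmHg is coded 0, since only values strictly above 140 are classed as at risk.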

The variables recorded in the file are:

| Variable name | Variable definition | Valid values |
|---|---|---|
| personid | Number identifying the person in each group | 0–29 |
| sysbp | Measured systolic blood pressure (mmHg) | 98.9–216.5 |
| drug | Whether or not the person was taking the drug | 0 = not taking the drug; 1 = taking the drug |
| atrisk | Whether or not the measured blood pressure was above 140 mmHg | 0 = sysbp ≤ 140; 1 = sysbp > 140 |

All the assignment questions use this data set.

These are synthetic data, but you may reference them in your answers as coming from *assignment 2 data: test of blood pressure drug*.

Note that all tables and graphs in this assignment should be presented with appropriate headings and footnotes.

1. Calculate a 95% confidence interval for the estimated difference in mean systolic blood pressure between the group taking the drug and the group not taking the drug. Present this estimate and confidence interval to 1 decimal place.

   As part of your answer, state which probability distribution you will use to calculate this confidence interval, why you will use it, and why it is appropriate for this data set. (8 marks)

2. What does this confidence interval mean in words? (1 mark)

3. What conclusion can you make from this confidence interval about whether or not the population mean systolic blood pressure differs between the two groups? Give a reason for your answer. (1 mark)

We now wish to formally test the hypothesis that using the drug is associated with a change in the mean systolic blood pressure.

4. What statistical test will you use to test whether or not using the drug is associated with a change in the mean systolic blood pressure? State why it is appropriate for this data set. (1 mark)

   Note: You have already discussed the probability distribution in question 1, so you don’t need to repeat that here.

5. Formally test the statistical hypothesis that mean systolic blood pressure is different between those on the drug and those not on the drug. Be sure to fully report the hypothesis test. Report all numbers to 2 decimal places except for degrees of freedom, which are reported as whole numbers. (5 marks)

We have examined whether or not the mean systolic blood pressure differs between the two groups. However, a more important question is whether or not the people who take the drug have a lower proportion at risk because of their blood pressure.

6. Use the chi-square test to test whether the proportion of people at risk due to their blood pressure is different for the people who take the drug compared to the people who do not take the drug. Be sure to fully report the hypothesis test and show why this test can be used with this data set. Report all numbers to 2 decimal places except for degrees of freedom, which are reported as whole numbers. Have we proved (choose one):

   - a. The null hypothesis is true
   - b. The alternative hypothesis is true
   - c. Neither a. nor b.

   (7 marks)

7. Calculate the estimated proportion of women who take the drug who are at risk because of their blood pressure, along with a 95% confidence interval. State the results as a percentage to 1 decimal place. (4 marks)

8. We have a target that the proportion of women in the population who are at risk because of their blood pressure should be no more than 0.4. Assuming the sample of women who take the drug is representative of the population of women, does the confidence interval from question 7 provide evidence for or against our having met the target for people who use the drug? State the reason for your answer. (3 marks)

**End of assignment questions**