Regression Project

The overall purpose of this project is to use simple linear regression to evaluate data.
This is an open-resource project. Feel free to use tutors, websites, books, videos, and anything else that might help you. Keep track of all the resources you use, because you’ll need to include them on your references page. Please note that plagiarism is not tolerated at all. Putting everything in your own words may take a little longer, but it will save you a lot of hassle in the long run.
The body of the report (between the Overall Introduction and the Overall Conclusion) has three parts:
PART 1:  Using a data set from the book
Finding the data
You will choose from four data sets: #1 Health Exam Data, #6 Bears, #16 Cars, or #22 Garbage. When you have selected the data set you want to work with, locate it and copy it into a notebook or file. Note that these data sets are available in MSL as StatCrunch files as well as in Appendix B of the textbook.
The next step is to explore the data. Perform linear regressions on various pairs of variables from your data set. For each pair, decide which variable will be the x value and which will be the y value. In your final submission you will need to compute the correlation for, graph, and present at least two pairs of variables, but you can quickly test many more than that. Try to find the best correlations that you can. If there are too many possible combinations to test them all easily, use logic to pick the variables that are most likely to correlate well.
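As a rough illustration of the exploration step, the sketch below computes the correlation coefficient for every pair of columns and lists the strongest pairs first. It is only an assumption about how you might screen pairs outside StatCrunch. The column names and numbers are made up for the example and do not come from any of the textbook data sets, and the helper names `pearson_r` and `rank_pairs` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_pairs(columns):
    """Return (name_a, name_b, r) for every pair, strongest |r| first."""
    results = [
        (a, b, pearson_r(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(results, key=lambda item: abs(item[2]), reverse=True)


# Made-up illustrative values, NOT taken from the textbook data sets.
example = {
    "length": [10.0, 12.0, 14.0, 16.0, 18.0],
    "weight": [20.5, 24.0, 29.5, 31.0, 37.0],
    "age": [3.0, 1.0, 4.0, 2.0, 5.0],
}

for a, b, r in rank_pairs(example):
    print(f"{a} vs {b}: r = {r:.3f}")
```

Sorting by the absolute value of r keeps strong negative correlations near the top of the list as well. A screen like this only narrows the choices; you still need to decide which pairs make logical sense.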
Once you have explored the data, select the two pairs with the highest correlation coefficients that you found. Graph them, showing the regression line. Explain what you have found. Answer the following questions:
What percentage of the variation in the y value is explained by the variation in the x value?
Does the correlation make sense to you?
Do you think there is reason to believe that there really is a relationship between the two variables?  Does the correlation indicate causation?
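For the first question above, the percentage comes from the r-squared value: in simple linear regression, squaring the correlation coefficient r gives the proportion of the variation in y that is explained by the linear relationship with x. A minimal sketch of that arithmetic follows. The function name `percent_explained` and the r values are hypothetical and chosen only for illustration.

```python
def percent_explained(r):
    """Convert a correlation coefficient r into r-squared as a percentage."""
    return 100 * r ** 2


# Hypothetical r values, for illustration only.
for r in (0.95, -0.80, 0.30):
    pct = percent_explained(r)
    print(f"r = {r:+.2f} -> about {pct:.0f}% of the variation in y explained")
```

Because r is squared, the sign of the correlation does not affect the percentage. A high percentage describes how well the line fits; by itself it does not answer the causation question.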
Drawing conclusions
For each pair of variables, summarize your discussion into a concise conclusion statement.  Concise means short, focused, clear, and meaningful.
PART 2:  Using the data set you collected in Week 3
Graphing the Data (completed previously)
Create a scatter plot of the data including the linear regression line. Label both axes clearly and use appropriate scaling.
Perform a simple linear regression (completed previously)
Analyze your data, performing a simple linear regression.  Include only the slope, intercept, r value and r-squared value.  No other data should be included.
These first two steps were completed as part of the discussion forums. However, you will need to clean up your graphs so that they look good. Items that may need attention include scaling the axes appropriately, providing a title, labeling the axes, removing unnecessary legends, and eliminating extra data.
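If you want to double-check the four values requested for the regression (slope, intercept, r value, and r-squared value), they can be computed from the standard least-squares formulas. The sketch below is one way to do that, assuming your data is stored as two plain lists. The function name `simple_linear_regression` and the sample numbers are hypothetical, not taken from any actual survey.

```python
from math import sqrt


def simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept, plus r and r-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r_squared": r ** 2,
    }


# Made-up values for illustration, NOT real survey data.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

fit = simple_linear_regression(hours, scores)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

Python 3.10 and later also provide `statistics.linear_regression` and `statistics.correlation`, which should give the same slope, intercept, and r. Either approach is only a cross-check on the results you report from your own regression output.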
Evaluate the data
Discuss your data.  Possible questions to answer include:
What were your original thoughts about this data?
What did you expect to find?
What did you learn through the survey process?
Do you think your sample is a good representation of the whole population?  Why or why not?
Is there a correlation?
Is it strong or weak?
Positive or negative?
Do you think it proves a relationship?
Do you think the data supports causation?
These will all be opinions. The important thing is to be clear, concise, and technically correct. Don’t make claims you can’t support. Use statistics to support your words.
Drawing Conclusions
Summarize your discussion into a concise, focused, clear statement.
PART 3:  Analyzing a classmate’s data
Repeat Part 2 using a graph that was posted by one of your classmates, showing their data and the linear regression results. You do not need to clean up their graph. Include it as part of your submission. Present an evaluation of the data and a conclusion. Be very sure that your evaluation and conclusion do not look at all similar to your classmate’s. Write this completely independently. Plagiarism will not be tolerated.
To summarize, the body of your report will have four graphs showing best-fit lines and regression data. There will be an evaluation and conclusion for each of the four graphs.
Use the overall introduction and conclusion sections to tie the whole report together. Don’t repeat the specific conclusions for each graph in your overall conclusion. Instead, make a concise statement summarizing the project.
