---
title: "Research Project"
---
## Overview
Throughout the semester you will work with a co-author to analyze a data set and present the results of your research. Regular homework assignments will serve as a first draft of exploration into your research questions and help you build your story. However, expect to visualize and analyze your data outside of those assignments.
You will build up your project in stages, revising multiple times. Here is the general outline, with each stage explained in detail further below.
* [Stage 1: Choose a data set and identify a few research topics](https://math615.netlify.app/project#stage-1-choose-a-data-set-and-identify-a-few-research-topics)
* [Stage 2: Introduce your project and variables of interest](https://math615.netlify.app/project#stage-2-introduce-your-research-question-and-variables-of-interest)
* [Stage 3: Explore your data and relationships](https://math615.netlify.app/project#stage-3-exploratory-data-analysis)
* [Stage 4: Analyze bivariate relationships](https://math615.netlify.app/project#stage-4-bivariate-inference)
* [Stage 5: Multivariable modeling & summarizing findings](https://math615.netlify.app/project#stage-5-multivariable-modeling-summarizing-findings)
* [Stage 6: Dissemination (Poster Presentation and brief "publication")](https://math615.netlify.app/project#stage-6-dissemination)
Your research team will produce two separate deliverables: a research poster and a short journal article called a research brief.
### Deliverable: Research Poster
You will organize your findings using a Google Slides template that leads you through a structured approach to reporting research. This short format will also help you explain your research findings concisely, in a way that will be easier to translate (fit) onto a poster.
* These slides stay in Google (are not submitted to Canvas) and are subject to the _mastery-based assessment_ discussed at the bottom of this page.
* The contents of each slide are specified in the slide template and explained below.
- You are welcome to have "staging" slides where you can dump content, thoughts, analyses that you _may_ end up using. **These extra slides should be at the END of the required slides.**
* There are also [example slides](https://drive.google.com/drive/folders/1JoRVyt_F4IqleTZFCnsPjOaFYDVNJPzP) from prior students as a reference.
### Deliverable: Research Brief
You will also disseminate your findings by submitting a short journal-style article, called a research brief, to the internal **Journal of Math 615**. These reports will be compiled into an internal journal, with selected images decorating the cover, and published on the course website for future cohorts.
See the [Guide for Authors](https://math615.netlify.app/author_guide.html) for more details.
### Peer Review
After stages 3 and 5 you will submit PDF draft copies of your research brief to the journal editor (your instructor) via Canvas for feedback, peer review and scoring. You are expected to revise your work after each set of reviews.
Peer reviews will be done in Canvas using a scoring rubric. Reviewers will be randomly assigned and blinded to the authors.
## Learning Resources
* Our Library [subject research guides](https://libguides.csuchico.edu/). There is a guide for Math 315 (the undergraduate version of this class) that can be helpful.
* UCI Library research guide: [How to write a research paper](https://guides.lib.uci.edu/scientificwriting)
* [How to peer review](https://authorservices.wiley.com/Reviewers/journal-reviewers/how-to-perform-a-peer-review/index.html) by Wiley Author Services
* [Peer review guide](https://www.bates.edu/biology/files/2010/06/Peer_review_form_v_2-5-18-GA-revised-1.pdf) by Bates College
## Details for each stage
### Stage 1: Choose a data set and identify a few research topics
Open source data repositories to peruse:
* [Data.gov](https://data.gov/)
* [IPUMS](https://www.ipums.org/)
* [Teaching of Statistics in Health Sciences](https://causeweb.org/tshs/category/dataset/) datasets
* [City of Minneapolis data](https://opendata.minneapolismn.gov/) (including police use of force)
* [Pew Research Center](https://www.pewresearch.org/tools-and-resources/) is "a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world."
* [Bureau of Labor Statistics](https://www.bls.gov/)
* [Dr.D's curated data sets](https://drive.google.com/drive/u/0/folders/1jULudBjRbHdW-uLIvmMbxRBEJJkq9crY)
* Other?
Sources not allowed:
* :x: Kaggle
* :x: UCI (or any other) Machine learning repository
:::{.callout-important}
#### Criteria for choosing a data set
* You either know something about the topic or it is something you want to learn about
* File type must be a .txt, .csv, .xlsx or .xls file
* File size is less than 1 GB
* A codebook or data dictionary that fully explains what each variable means is available.
* There are at least 200 rows (observations), but ideally between 500 and 10,000.
* There are 10 or more unique and interesting variables
* At least 4 quantitative variables
* Variables are not functions of each other (e.g. weight in lbs and weight in kg)
:::
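The countable criteria above can be screened before you submit a data set for approval. The following is a minimal sketch in Python using only the standard library; the function names, the reading of "1 GB" as 10^9 bytes, and the rule for deciding that a column is quantitative are all assumptions made for illustration. It cannot check whether a codebook exists, whether the variables are interesting, or whether some variables are functions of each other.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".csv", ".xlsx", ".xls"}
MAX_BYTES = 1_000_000_000  # "less than 1 GB", read here as 10^9 bytes (an assumption)


def is_number(value):
    """Return True if a cell's text can be read as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def check_criteria(filename, size_bytes, rows):
    """Check a data set against the countable criteria.

    rows is a list of dicts, one per observation (for example from csv.DictReader).
    A column counts as quantitative when every non-empty cell is numeric. This is
    a rough heuristic: numeric ID or category codes would pass too.
    """
    columns = list(rows[0].keys()) if rows else []
    quantitative = [
        col for col in columns
        if any(r[col] != "" for r in rows)
        and all(is_number(r[col]) for r in rows if r[col] != "")
    ]
    return {
        "allowed file type": Path(filename).suffix.lower() in ALLOWED_EXTENSIONS,
        "under 1 GB": size_bytes < MAX_BYTES,
        "at least 200 rows": len(rows) >= 200,
        "at least 10 variables": len(columns) >= 10,
        "at least 4 quantitative variables": len(quantitative) >= 4,
    }
```

Any entry that comes back `False` points to a criterion worth double checking by hand before you fill out the submission form.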
Your data set must be approved before you are allowed to work with it. Use this [submission link](https://forms.gle/TmhHaJ6PX5X9qx1C7) to upload the data (a sample or a codebook will also be sufficient) and tell me a little about it. Questions you should be prepared to answer:
1. What is the data set you are interested in analyzing?
2. How many observations (rows) and variables (columns) does it contain?
3. Where did you get it from?
4. Why are you interested in this topic?
5. What do you think you want to research with this data?
The sooner your data is approved the sooner you can work with it!
### Stage 2: Introduce your research question and variables of interest
1. Make a copy of this [Template](https://docs.google.com/presentation/d/1M98PL_S1TzHjXs00Ir86imfr1mO801tyWawk30RH0r4) and save it in the `Project -00 Poster prep` folder in our Google Drive.
2. Name your file using the last names of your research team. E.g. `Donatello, Raymond`
3. Fill out the required information on slides 1-7. The slides for this stage are pale yellow in the template.
* **Slide 01**. Title
* **Slide 02**. Introduction
* **Slide 03**. Background & lit review.
* **Slide 04**. Explain the research problem / topic area
* **Slide 05**. State your Research Question as a testable hypothesis.
* **Slide 06**. Introduce the data. Where does it come from? How was it collected? How many records does it contain?
* **Slide 07**. Description of measures (variables) being used.
:::{.callout-warning}
#### Pay attention to slide numbers!
* Please ensure that all content is on the slide as expected.
* You will not fill in the slides in strict numerical order.
:::
### Stage 3: Exploratory data analysis
#### Poster Preparation slides
Fill out the required information on slides 9, 10, 11 and 13. The slides for this stage are pale blue in the template.
* **Slide 09**. Fully describe your primary response variable
* **Slide 10**. Fully describe your primary explanatory variable
* **Slide 11**. Fully describe the relationship between your primary explanatory and primary response variables
* **Slide 13**. Fully describe the relationship of a third variable to either your explanatory or response variables
Each description should use appropriate summary statistics, a graphic, and an explanatory sentence.
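As an illustration of the summary statistics that typically accompany a quantitative variable, here is a minimal Python sketch using the standard library. The function name `describe` and the convention that missing values are stored as `None` are assumptions for this example. Quartile conventions differ between tools, so `q1` and `q3` may differ slightly from what other software reports.

```python
import statistics


def describe(values):
    """Summary statistics for one quantitative variable (missing values as None)."""
    observed = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return {
        "n": len(observed),
        "missing": len(values) - len(observed),
        "mean": statistics.mean(observed),
        "sd": statistics.stdev(observed),
        "min": observed[0],
        "q1": q1,
        "median": statistics.median(observed),
        "q3": q3,
        "max": observed[-1],
    }
```

Reporting the number of missing values alongside the center and spread makes it clear how many records the description is actually based on.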
#### Research Brief
* Write the first two sections of the manuscript: Introduction & Background, Study Design and Data collection methods.
* Submit a PDF to Canvas by the due date.
### Stage 4: Bivariate Inference
#### Poster Preparation slides only
Fill out the required information on slides 8, 12 and 14. Note that these directly correspond to the variables used on slides 11 and 13. If you have changed variables since the last stage, you will need to update the corresponding univariate descriptions and graphs. These slides are pale purple in the template.
* **Slide 08**. Describe in a few sentences what analysis tools you will use to answer your research question
* **Slide 12**. Analyze the relationship discussed on slide 11
* **Slide 14**. Analyze the relationship discussed on slide 13
Conclusions should be written in clear English, in the context of the problem, and include summary statistics, a confidence interval and a p-value. Refer to the homework for proper writing style.
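To show how the three reported pieces (summary statistic, confidence interval, p-value) fit together, here is a minimal Python sketch that compares the means of two groups. It uses a large-sample normal approximation from the standard library, which is an assumption of this sketch and a stand-in for whichever test is appropriate for your variables. The function name `compare_means` is hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def compare_means(group_a, group_b, confidence=0.95):
    """Difference in means (a - b) with a large-sample CI and two-sided p-value.

    Uses a normal approximation, which is an assumption of this sketch;
    it is only reasonable when both groups are fairly large.
    """
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference, allowing unequal group variances
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "difference": diff,
        "ci": (diff - z_crit * se, diff + z_crit * se),
        "p_value": p_value,
    }
```

The returned numbers are only the raw material: the written conclusion still needs to state, in the context of the problem, which group is higher, by how much, and how certain that estimate is.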
### Stage 5: Multivariable modeling & summarizing findings
#### Poster Preparation slides
Fill out the required information on slides 15-21. These slides are pale green in the template. You are trying to understand the relationship between your explanatory and response variables in the presence of information contained in other variables.
**Slide 15: Model Building**
* Build a multivariable model by adding additional predictors to the model.
* Explain your model building process in a few bullet points.
- What variables did you test as other explanatory variables?
- Which ones did you examine as confounders, or as effect moderators?
- How did you determine your final model?
* See the lecture notes ([ASCN Ch 10](https://norcalbiostat.github.io/AppliedStatistics_notes/model-building.html)) on model building as guidance.
* Include any variables that were found to be significantly associated with the outcome
* If you found a moderator, your model should include an interaction term with your moderating variable.
* If you have a confounding variable, you still need to keep your primary explanatory variable in the model.
**Slide 16: Multivariable Model - Summary of results**
* A table or plot of the regression coefficients (or Odds Ratios) must be presented.
* At least one coefficient, the primary explanatory variable, must be interpreted in context of the problem.
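When the model is a logistic regression, a coefficient on the log-odds scale becomes an odds ratio by exponentiating it, and the same transformation applies to the ends of its confidence interval. A minimal Python sketch, assuming a Wald-type interval with a critical value of 1.96; the function name and the example numbers are hypothetical and do not come from any real model:

```python
import math


def odds_ratio(coef, std_error, z_crit=1.96):
    """Convert a log-odds coefficient to an odds ratio with a Wald-type interval."""
    lower = coef - z_crit * std_error
    upper = coef + z_crit * std_error
    return math.exp(coef), (math.exp(lower), math.exp(upper))


# Hypothetical coefficient and standard error, for illustration only
estimate, (ci_low, ci_high) = odds_ratio(0.47, 0.15)
print(f"OR = {estimate:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 corresponds to a coefficient interval that excludes 0, so the two scales lead to the same conclusion about the association.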
**Slide 17. Model Assessment**
* If using a linear or log-linear model:
- present and interpret the model diagnostic plots
- report and interpret $R^{2}$
* If using a logistic regression model:
- describe the distribution of predicted probabilities and note if there are any concerns
- report and interpret the model accuracy, and the cutpoint used. ([Ref: ASCN Ch 12.5](https://norcalbiostat.github.io/AppliedStatistics_notes/model-performance.html))
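The two headline numbers named above can be computed directly from a model's output. The following is a minimal Python sketch; the function names are hypothetical, and treating a predicted probability equal to the cutpoint as a predicted 1 is a convention chosen for this example.

```python
def accuracy_at_cutpoint(predicted_probs, observed, cutpoint=0.5):
    """Share of observations classified correctly when p >= cutpoint predicts 1."""
    predicted = [1 if p >= cutpoint else 0 for p in predicted_probs]
    correct = sum(1 for yhat, y in zip(predicted, observed) if yhat == y)
    return correct / len(observed)


def r_squared(observed, fitted):
    """R^2 = 1 - SS_residual / SS_total for a fitted linear model."""
    y_bar = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1 - ss_res / ss_tot
```

Because accuracy changes with the cutpoint, the two should always be reported together, which is why the slide asks for both.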
**Slide 18. Discussion**
* Here you will explain what your graphical and inferential results tell you about your topic.
* Discuss whether your research hypothesis was supported. If it was not, explain why you think that might be.
* Explain the overall story/trend/what you learned when you consider your univariate, bivariate & multivariate results about your topic.
* Compare your results to previous research results. Do they agree or disagree?
**Slide 19. Implications**
* What are the practical implications of your results?
* What could others do with your findings?
* What future research needs to be conducted?
- This needs to be more specific than "other variables could be explored". Which variables and why? What other research articles indicate that those other variables are relevant?
**Slide 20. Limitations**
* Who are the results of this study generalizable to? (i.e. a subset of individuals?)
* Were there any model assumptions that were not upheld?
* If this is an observational study, you should state that the findings are associations and not causal in nature.
* Are there other factors that could explain your response variable that you did not include in your model?
**Slide 21. References**
* You can use smaller font to get all references on one slide.
* Use references from your research plan, and any additional references gathered along the way.
* Make sure these are correctly done in APA format.
* Proper citations for
- *R*: Type `citation()`
- [RStudio](https://support.rstudio.com/hc/en-us/articles/206212048-Citing-RStudio)
- [How to cite software in Text](http://blog.apastyle.org/apastyle/2015/01/how-to-cite-software-in-apa-style.html)
#### Research Brief
* Write the remaining sections of the manuscript: Data Preparation and Statistical Analysis Methods, Results, Discussion/Conclusion.
* Submit a PDF to Canvas and print a copy for your instructor by the due date.
### Stage 6: Dissemination
#### Research Poster
* You will transfer all findings into a research poster, print the poster, and then present your research to your classmates during our class final period in a poster symposium format.
* Full guidelines including examples and evaluation criteria [are written in this blog post](https://www.norcalbiostat.com/post/2022-11-26-poster-guidelines/).
* Submit the poster file as printed to Canvas by the due date.
**Draft version**
This draft is graded based on how complete the poster is. You should consider this a draft that you would circulate to your colleagues for final review and comments. There is a rubric in Canvas with details.
Save your poster as a PDF and upload to the `Poster-Draft` folder in Google Drive.
**Final Version**
Upload your final poster, as printed, in PDF format to Canvas.
**Presentation at the Poster Symposium**
When not presenting, you will walk around and learn about others' research. Ask the presenters questions and fill out an evaluation form as you go. Poster scoring follows the above evaluation criteria and will be done via Google Forms. The link to this semester's form is in Canvas. Printed copies will be available upon request.
#### Research Brief
Following the [Guide for Authors](https://math615.netlify.app/author_guide.html), you will submit both the _self-contained_ HTML file and the PDF file to Canvas by the due date.
## Project Grading Method
### Poster Preparation Slides
This work will be done through a series of revisions, with feedback from the instructor at each stage.
* Each slide will be marked with one of the following categories:
- `Not Assessable`: No content presented.
- `NR`: Needs revision. This is C level work.
- `ME`: Meets expectation. This is B level work.
- `E` : Exemplary. This is A level work.
#### Rubric / Assessment form
Each team has its own 'project assessment' Google spreadsheet (same location as your poster) that shows you what achievement stage you are at for each item. It is locked to view-only mode and only you and I have access to it. The assessment item descriptions are a work in progress. I may add, combine, or reword items as I do my revisions to better fit what I am actually looking for. At the bottom of this file you will find a personalized grading rubric containing an _estimate_ of your final score.
I will update your status column for the slides that are being assessed at that stage.
You can request a reassessment of prior work at (nearly) any time. (You can ask for it, I just may not get to it until the next time I'm reviewing slides.)
You are responsible for keeping track of your status and asking for reassessment.
#### Research Brief
The research brief will be graded in Canvas using a rubric.