Simpson's Paradox using Shiny
========================================================
author: Sandeep Anand
date: 6/20/2017
autosize: true
First Slide
========================================================
For more details on Simpson's Paradox, see the link below:
<https://sananand007.shinyapps.io/simpsonsparadox/>
Every Simpson's paradox involves at least three variables:
+ the explained
+ the observed explanatory
+ the lurking explanatory
If the effect of the observed explanatory variable on the explained variable changes direction once you account for the lurking explanatory variable, you have a Simpson's Paradox.
+ Simpson's Paradox helps explain and expose faults in many data analyses: a conclusion can reverse depending on how the data are pooled, which challenges reproducibility.
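The reversal can be checked directly on R's built-in `UCBAdmissions` table. The sketch below is an illustration, not the app's code; it assumes the usual reading of that dataset, where the admission decision is the explained variable, gender is the observed explanatory variable, and department is the lurking one. The names `pooled`, `pooled_rate`, and `by_dept` are made up for this example.

```r
# Admission rate by gender, pooled over all departments
pooled <- apply(UCBAdmissions, c(1, 2), sum)            # Admit x Gender counts
pooled_rate <- pooled["Admitted", ] / colSums(pooled)

# Admission rate by gender within each department
by_dept <- UCBAdmissions["Admitted", , ] /
  apply(UCBAdmissions, c(2, 3), sum)                    # Gender x Dept rates

round(pooled_rate, 3)
round(by_dept, 3)
```

Pooled, men are admitted at a higher rate than women (roughly 45% versus 30%), yet within most individual departments the female rate is the higher one.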
Second Slide
========================================================
For more details on the Shiny app, see the link below:
<https://sananand007.shinyapps.io/simpsonsparadox/>
- It shows the widgets used to build the Shiny app, which appears as a picture on a later slide
- It demonstrates reactive graphs, using bar plots that update with the inputs
- It shows a simple, rough implementation of the basic idea behind Simpson's Paradox
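As a rough idea of what "widgets" and "reactivity" mean here, a minimal Shiny sketch might look like the following. It is not the code of the linked app: the input id `dept`, the output id `bars`, and the helper `dept_counts` are hypothetical names chosen for this illustration, and it assumes the `shiny` package is installed.

```r
library(shiny)

# Hypothetical helper: Admit x Gender counts for one department
dept_counts <- function(dept) UCBAdmissions[, , dept]

ui <- fluidPage(
  # Widget: choose which department to display
  selectInput("dept", "Department", choices = dimnames(UCBAdmissions)$Dept),
  plotOutput("bars")
)

server <- function(input, output) {
  # renderPlot re-runs whenever input$dept changes
  output$bars <- renderPlot({
    barplot(dept_counts(input$dept), beside = TRUE, legend.text = TRUE,
            main = paste("Department", input$dept), ylab = "Applicants")
  })
}

# Launch the app only in an interactive session
if (interactive()) shinyApp(ui, server)
```

Because the bar plot reads `input$dept`, Shiny redraws it each time the selection changes, with no explicit event handling.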
Code to Show the Data being Used
========================================================
```{r Code to Show the Data Used, echo=FALSE, warning=FALSE}
data(UCBAdmissions)
dimnames(UCBAdmissions)
admindata <- data.frame(UCBAdmissions)

# Total applicants per decision and department (gender pooled)
aggdata <- setNames(
  aggregate(admindata$Freq, list(admindata$Admit, admindata$Dept), sum),
  c("Decision", "Group", "Count"))

# Label a department "Easy" if it admitted more applicants than it rejected, else "Hard"
# (aggregate() returns both decisions in the same department order)
dept.id <- unique(as.character(aggdata$Group))
admitted <- aggdata$Count[aggdata$Decision == "Admitted"]
rejected <- aggdata$Count[aggdata$Decision == "Rejected"]
grade <- ifelse(admitted > rejected, "Easy", "Hard")
newgrade.df <- data.frame(dept.id, grade = as.factor(grade))

# Final data frame with only Easy and Hard departments:
# assigning duplicate level names merges the departments into two categories
lev <- levels(admindata$Dept)
lev <- as.character(newgrade.df$grade[match(lev, newgrade.df$dept.id)])
admindata.mod.clean <- admindata
levels(admindata.mod.clean$Dept) <- lev

# Collapse counts by decision, gender, and category
aggdata2 <- setNames(
  aggregate(admindata.mod.clean$Freq,
            list(admindata.mod.clean$Admit, admindata.mod.clean$Gender,
                 admindata.mod.clean$Dept), sum),
  c("Decision", "Gender", "Category", "Count"))
aggdata2
```
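The collapsed counts become easier to read as admission rates. The sketch below is self-contained and assumes that the "admitted more than rejected" rule above marks departments A and B as Easy and the rest as Hard. The names `adm`, `counts`, and `rates` are illustrative only.

```r
adm <- as.data.frame(UCBAdmissions)

# Assumption: A and B are the departments that admit more than they reject
adm$Category <- ifelse(adm$Dept %in% c("A", "B"), "Easy", "Hard")

# Admit x Gender x Category counts, then the admitted share per Gender x Category
counts <- xtabs(Freq ~ Admit + Gender + Category, data = adm)
rates <- counts["Admitted", , ] / apply(counts, c(2, 3), sum)
round(rates, 3)
```

Within both the Easy and the Hard category the female admission rate comes out higher than the male one, even though the pooled rate favors men. Women applied in much larger numbers to the Hard departments, and that is what produces the paradox.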
Slide With Plot
========================================================
```{r, echo=FALSE}
knitr::include_graphics('./Capture-Output.png')
```