Project Submission #24

Open
wants to merge 103 commits into
base: master
9541bb4
Added author email
bpearson11 Nov 13, 2018
15b5a99
Update mkdocs.yml
bpearson11 Nov 13, 2018
c49afa8
My findings
bpearson11 Nov 13, 2018
ad12e68
Title change
bpearson11 Nov 24, 2018
c4996fb
Update index.md
bpearson11 Nov 24, 2018
f588ef4
Took out picture
bpearson11 Nov 25, 2018
b246395
Description
bpearson11 Dec 3, 2018
fffd354
Update workflow.md
bpearson11 Dec 3, 2018
a77b7dd
Added references
bpearson11 Dec 3, 2018
4c4a4e4
Update workflow.md
bpearson11 Dec 3, 2018
7ae3789
Sep. para
bpearson11 Dec 3, 2018
f236207
Updated the source info.
bpearson11 Dec 3, 2018
07795f9
Update source_analysis.md
bpearson11 Dec 3, 2018
c854ffe
Removed unnecessary info
bpearson11 Dec 3, 2018
10c169d
Updated Authorship
bpearson11 Dec 3, 2018
df234f6
Updated Analysis
bpearson11 Dec 3, 2018
0cd055b
Updated Data Analsysis
bpearson11 Dec 3, 2018
63f82c4
Updated intro
bpearson11 Dec 3, 2018
f066300
Background
bpearson11 Dec 3, 2018
a2b0d87
Updated Historiography
bpearson11 Dec 3, 2018
690a642
Updated historical background.
bpearson11 Dec 4, 2018
d55fdd8
Updated historiography
bpearson11 Dec 4, 2018
aa5d3a6
Put in Intro/Summary
bpearson11 Dec 4, 2018
b0caaa4
"Distances"
bpearson11 Dec 4, 2018
c926009
Update index.md
bpearson11 Dec 4, 2018
5e1eab4
Deleted image
bpearson11 Dec 4, 2018
9493853
Intro
bpearson11 Dec 4, 2018
7c02d0b
Background v.2
bpearson11 Dec 4, 2018
4cbe7a5
Background v0.3
bpearson11 Dec 4, 2018
2185d54
Digitial and Historical Background
bpearson11 Dec 4, 2018
bf136ca
Remove Image
bpearson11 Dec 4, 2018
afd8a9b
img
bpearson11 Dec 4, 2018
c382273
Voyant
bpearson11 Dec 4, 2018
8088568
img
bpearson11 Dec 4, 2018
39ab28c
uploaded img
bpearson11 Dec 4, 2018
c7f27c8
Edits
bpearson11 Dec 4, 2018
3cdd676
fixed tab
bpearson11 Dec 4, 2018
af5719e
Update workflow.md
bpearson11 Dec 4, 2018
2490b4b
Removed tabs 2nd time
bpearson11 Dec 4, 2018
20087d6
Updated Data Analysis
bpearson11 Dec 4, 2018
ecf4c2c
Updated finding
bpearson11 Dec 6, 2018
615bcf4
Findings.
bpearson11 Dec 6, 2018
bd73b13
Almost done w/ Finding
bpearson11 Dec 6, 2018
5473e93
Findings
bpearson11 Dec 6, 2018
74cc905
Updated 2nd paragraph. Edited for grammar
bpearson11 Dec 7, 2018
8408407
Edited grammar of 1st paragraph.
bpearson11 Dec 7, 2018
f7f4d47
Put in prob
bpearson11 Dec 7, 2018
a04f3c0
Edited 2nd para
bpearson11 Dec 7, 2018
c1bc4c6
Updated 2nd paraggraph
bpearson11 Dec 7, 2018
e044afc
Updated 3 para
bpearson11 Dec 8, 2018
533e650
Added pros
bpearson11 Dec 8, 2018
09ec8c0
Finished 3rd para.
bpearson11 Dec 8, 2018
25e0600
Edited 2nd para
bpearson11 Dec 8, 2018
0ee7be3
Finished 3rd para.
bpearson11 Dec 8, 2018
1fbb6a8
Edited 3rd para
bpearson11 Dec 8, 2018
82e552d
Put in 1st source
bpearson11 Dec 8, 2018
3f9268a
Updated source
bpearson11 Dec 8, 2018
05a7c75
Updated Bib
bpearson11 Dec 8, 2018
9b2d22a
Added 5th source
bpearson11 Dec 8, 2018
1566a7f
Added 6th source
bpearson11 Dec 8, 2018
e83928f
Added source
bpearson11 Dec 8, 2018
20c77a1
Alphabtized
bpearson11 Dec 8, 2018
9450568
Deleted instructions.
bpearson11 Dec 8, 2018
d67439c
Edited conclusion
bpearson11 Dec 8, 2018
128e76b
Edited 8th source
bpearson11 Dec 8, 2018
a9ee052
Put in name for project credit
bpearson11 Dec 8, 2018
20902fe
Added citation for Orange
bpearson11 Dec 8, 2018
ec038ef
Cited orange3
bpearson11 Dec 8, 2018
355792a
Corrected grammar in para 2 and 3
bpearson11 Dec 8, 2018
ad4435b
Edited grammar.
bpearson11 Dec 8, 2018
b65c503
Updated the wording
bpearson11 Dec 8, 2018
04d0481
Updated grammar on section
bpearson11 Dec 8, 2018
54f05bf
Updated grammar
bpearson11 Dec 8, 2018
f337cf7
Fixed the grammar
bpearson11 Dec 9, 2018
f24404e
Updated grammar on third paragraph.
bpearson11 Dec 9, 2018
6449247
Attempted to add interpretation of results.
bpearson11 Dec 9, 2018
8aa3115
Update initial_findings.md
bpearson11 Dec 9, 2018
609e62b
Updated project background
bpearson11 Dec 9, 2018
2244bc7
Finished background
bpearson11 Dec 9, 2018
68e5469
Updated
bpearson11 Dec 9, 2018
7ed6879
added I to "but was
bpearson11 Dec 10, 2018
bf2ba27
Changed my 2nd para
bpearson11 Dec 10, 2018
724b1e8
Clarified the second sentence in 2nd para
bpearson11 Dec 10, 2018
6c2d837
Added articles to 2nd para.
bpearson11 Dec 10, 2018
014de15
added not to "consigned"
bpearson11 Dec 10, 2018
6592c72
Clarified final sentence in 4th para. deleted instructions
bpearson11 Dec 10, 2018
f8c501e
Update initial_findings.md
bpearson11 Dec 10, 2018
a502a80
Updated grammar of 1st par
bpearson11 Dec 10, 2018
6c89e56
Edited last para and deleted instructions
bpearson11 Dec 10, 2018
57be939
Deleted instructions
bpearson11 Dec 10, 2018
bb3d062
Edited wording and citations
bpearson11 Dec 10, 2018
731e612
Deleted instructions
bpearson11 Dec 10, 2018
0d56165
Added citation for Prof. Thomas
bpearson11 Dec 10, 2018
f34a3a5
Added reference for Prof. Thomas
bpearson11 Dec 10, 2018
5de86d3
Finished citations for background
bpearson11 Dec 10, 2018
4e92f96
Edited grammar for 4th para.
bpearson11 Dec 10, 2018
c208dde
Deleted instructions
bpearson11 Dec 10, 2018
a4e6f1a
Deleted Finding for 11/13
bpearson11 Dec 10, 2018
2bef671
Edited citations.
bpearson11 Dec 10, 2018
fb60329
Removed instructions. Deleted past examples
bpearson11 Dec 10, 2018
8e5fa87
Removed bullet point
bpearson11 Dec 10, 2018
fef109b
Updated 4th citation
bpearson11 Dec 10, 2018
d39b13e
Added citation
bpearson11 Dec 10, 2018
26 changes: 2 additions & 24 deletions README.md
@@ -3,38 +3,16 @@ Digital Workbook, [University of South Florida](http://www.usf.edu/), Fall 2016

---

Instructor: [David J. Thomas](mailto::davidjthomas@usf.edu), [thePortus.com](http://thePortus.com/)
By: [Brandon Pearson](mailto:bpearson1@mail.usf.edu)

---

<figure>

![Replace Me, Sample Image](docs/imgs/caesarian_code.png)

<figcaption>*Put a caption to your image here, if you want*</figcaption>

</figure>

---

## Project Template

This is a starter template for final projects. When you have completed your final project, you should replace this message (README.md) with a short 1-2 paragraph description of your project.

See the [Course Workbook Project Page](https://hacking-history.readthedocs.io/project) for more information on the final project.

**REQUIREMENTS BEFORE STARTING**
+ [GitHub account](https://github.com) created
+ [GitHub Desktop](https://desktop.github.com) client installed
+ [Atom](https://atom.io) editor installed
+ [Python 3.x](https://www.python.org/) installed

---

## Past Project Examples

* [Confederate Memorials](http://confederate-memorials-project.readthedocs.io/)
* [The Slave Ledges](http://slave-ledger.readthedocs.io/en/latest/)
* [An Teanga Sean: The Celtic Languages](http://an-teanga-sean-the-celtic-languages.readthedocs.io/)
This project analyzes Hillary Clinton's emails using Orange3's hierarchical clustering widget to group similar emails. For hierarchical clustering to work, I first used Orange's "Bag of Words" widget to give each email numerical values - in this case, how often specific words appeared - and passed the result to the "Distances" widget to measure how similar the emails were to one another. After running the data through those widgets, I chose to analyze cluster C1 - a set of emails about Benghazi - to see what linked them together. I found that the only reason Orange grouped them was that they all contained references to the attack in Benghazi or to Libya. Some concerned Libyan worries about how the attack would affect international investment, while others concerned the 2012 presidential election; specifically, Republicans' desire to use Benghazi to paint Obama as weak on foreign policy.
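
The "Bag of Words" and "Distances" steps can be sketched in plain Python. This is a minimal illustration of the idea, not Orange's implementation: the sample texts are invented stand-ins for email bodies, the helper names `bag_of_words` and `cosine_distance` are hypothetical, and cosine distance is an assumption, since the project does not say which metric the "Distances" widget was set to.

```python
import math
import re
from collections import Counter

# Invented stand-ins for email bodies; not taken from the actual archive.
emails = [
    "Benghazi attack raises security concerns in Libya",
    "Libya investment concerns after the Benghazi attack",
    "Schedule for the campaign fundraising dinner",
]


def bag_of_words(text):
    """Count how often each lowercase word appears in one document."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # An empty document shares nothing with any other.
    # Clamp at zero so rounding error never yields a tiny negative distance.
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


bags = [bag_of_words(e) for e in emails]
distances = [[cosine_distance(a, b) for b in bags] for a in bags]

# Print the pairwise distance matrix, one row per email.
for row in distances:
    print(["%.2f" % d for d in row])
```

The first two sample texts share several words, so the distance between them is smaller than the distance from either to the third. A matrix like this is what a clustering step takes as input.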

---
18 changes: 6 additions & 12 deletions docs/background.md
@@ -1,25 +1,19 @@
# Project Background

---

Put a general discussion of your topic here. Don't get into the historiography or scholarship here. Just introduce to your reader what your topic is. What is the 'problem', the 'question', and/or the 'argument'.

YOU SHOULD BE LIBERAL AND INCLUDE ANY AND ALL LINKS WHEN RELEVANT
I chose to analyze WikiLeaks' archive of Hillary Clinton's emails to learn about Benghazi; I picked her emails because they are a political topic, and I am interested in politics. I learned that Libya's president, at least initially, believed militants had infiltrated protests against an anti-Islam video in order to carry out the attack in Benghazi; I also learned of his concern that foreign businesses would see the country as unstable if he was unable to eliminate the militants [(WikiLeaks, 2016)](https://wikileaks.org/clinton-emails/). In addition, I learned of the GOP's eagerness to use Benghazi for political gain, and of the annoyance of Clinton's allies that people refused to move on [(WikiLeaks, 2016)](https://wikileaks.org/clinton-emails/).

---

## Historical Background
## Historical and Digital Background

Put a discussion of any relevant historigraphy you think relates to the topic.You can discuss the historiography of specific times and places, but you can also discuss any theoretical literature you think is relevant.
Part of the historiography of text analysis is "Gender, Race, and Nationality in Black Drama, 1950-2006: Mining Differences in Language Use in Authors and their Characters" [(Argamon et al., 2009)](http://digitalhumanities.org:8081/dhq/vol/3/2/000043/000043.html). The authors discuss their use of text analysis to test whether its techniques can distinguish between authors of different gender, race, and nationality (Argamon et al., 2009). Though the scholars were successful, it was only after normalizing the data that they were able to use the techniques to determine author and character gender, which they fear puts distance between the text and the reader and limits readers' ability to interpret texts with text analysis (Argamon et al., 2009). Argamon and his team point out that focusing only on simple categories may prevent analysis of complex themes, or may reinforce the biases of authors by "confirming" what they set out to prove (Argamon et al., 2009).

You have freedom to roam from the narrow topic of your project to explore how different authors/schools of thought have impacted scholarly approaches over time. However, make sure that in the end you clearly relate how this discussion relates to the subject of your project and/or your choices in methods or interpreative models.
Another part of the historiography of text analysis, and a rebuttal to "Gender," is Dr. Meehan's "Text Minding" [(Meehan, 2009)](http://digitalhumanities.org:8081/dhq/vol/3/2/000045/000045.html). Dr. Meehan agrees with Argamon and his coauthors' argument that technology may force sources to fit models that are too simple; however, unlike Argamon, Meehan states that technology could aid research (Meehan, 2009). If scholars familiarize themselves with technology and combine it with traditional tools, they could find its shortcomings and learn how to circumvent the epistemological impediments it poses, just as Frederick Douglass was able to overcome the binarity of slavery by familiarizing himself with and exploring how slavery advocates contrasted slaves with the free (Meehan, 2009).

As you discuss different authors, you may site them using (author, page) style parenthetical notation. Make sure that a full citation in [Chicago](http://chicagomanualofstyle.org) is added to the 'docs/credits.md' page.
The relevance of these articles for using text analysis on WikiLeaks' archive of Clinton's emails is that, though scholars should be cautious about using technological tools - because they are as flawed as the people who made them - the way for scholars to avoid being constrained is to educate themselves on how technology works and combine it with other analytical aids (Argamon et al., 2009; Meehan, 2009). As I and others familiarize ourselves with Orange3, we will begin to understand the shortcomings of text analysis and of digital techniques in general (Meehan, 2009). A better understanding of technology, in turn, will allow scholars to find novel ways to analyze text, and possibly increase the breadth and depth of humanistic study (Meehan, 2009). Therefore, projects using text analysis to study Clinton's emails are useful because they give scholars the ability to conduct more sophisticated research on a subject _and_ the opportunity to familiarize themselves with technology.

---

## Digital Background
---

You should also make sure to do research on any relevant digital work, whether scholarly articles or digital projects. Make sure to check Digital Humanities Quarterly, or [DHQ](http://www.digitalhumanities.org/dhq/), [Debates in the Digital Humanities](http://dhdebates.gc.cuny.edu/), [JStor](https://jstor.org), blogs and more to find relevant work.

You don't actually have to have a sepearte 'Digital Background' section. If it feels more natural to you, you may combine them into a single discussion. This is especially a good idea if you feel that with your topic you cannot talk about historiography without also talking about digital scholarship, and vice versa.
27 changes: 21 additions & 6 deletions docs/biblio_and_credits.md
@@ -4,17 +4,32 @@

## Project Bibliography

* List your bibliography items here, whether historical and/or digital in nature, here.
* It should be in [Chicago Manual of Style](chicagomanualofstyle.org)
* Remember to *italicize* and **bold** as appropriate.
Argamon, Shlomo, Charles Cooney, Russell Horton, Mark Olsen, Sterling Stein, and Robert Voyer. “Gender, Race, and Nationality in Black Drama, 1950-2006: Mining Differences in Language Use in Authors and their Characters.” _Digital Humanities Quarterly_ 3, no. 2 (2009). Accessed December 1, 2018. http://digitalhumanities.org:8081/dhq/vol/3/2/000043/000043.html.

Brennan, Margaret, and Paula Reid. “State Dept. to comply with court order on Hillary Clinton's emails.” _CBS News_, May 19, 2015. https://www.cbsnews.com/news/state-dept-to-comply-with-court-order-on-hillary-clintons-emails/.

Curk, Tomaz, Janez Demsar, Ales Erjavec, Crt Gorup, Tomaz Hocevar, Mitar Milutinovic, Martin Mozina, et al. “Orange: Data Mining Toolbox in Python.” _Journal of Machine Learning Research_ 14 (August 2013): 2349-2353.

Currier, Cora, and Micah Lee. “In Leaked Chats, WikiLeaks Discusses Preference for GOP Over Clinton, Russia, Trolling, and Feminists They Don’t Like.” _The Intercept_, February 14, 2018. https://theintercept.com/2018/02/14/julian-assange-wikileaks-election-clinton-trump/.

Meehan, Sean R. “Text Minding: A Response to ‘Gender, Race, and Nationality in Black Drama, 1850-2000: Mining Differences in Language Use in Authors and their Characters.’” _Digital Humanities Quarterly_ 3, no. 2 (2009). Accessed December 1, 2018. http://digitalhumanities.org:8081/dhq/vol/3/2/000045/000045.html.

Orange. “Hierarchical (hierarchical).” Orange Data Mining Library. Accessed December 3, 2018. https://docs.orange.biolab.si/3/data-mining-library/reference/clustering.hierarchical.html.

Thomas, David. “The Portus.” Github. Accessed December 9, 2018. https://github.com/thePortus.

WikiLeaks. “About.” WikiLeaks. May 7, 2011. https://wikileaks.org/About.html.

WikiLeaks. “Hillary Clinton Email Archive.” WikiLeaks. February 2, 2018. https://wikileaks.org/clinton-emails/.

WikiLeaks. “What is WikiLeaks.” WikiLeaks. November 3, 2015. https://wikileaks.org/What-is-WikiLeaks.html.

---

## Project Credits

* Put your group member's credits here
* Link to any emails or github accounts (if you want)
* Leave the credits at the bottom
* This project was created by Brandon Pearson.


---

Binary file added docs/imgs/workflow.png
23 changes: 2 additions & 21 deletions docs/index.md
@@ -1,27 +1,8 @@
# Replace This Title!
# An Analysis of Clinton's Emails!

---

<figure>

![Replace Me, Sample Image](imgs/caesarian_code.png)

<figcaption>

*Put a caption to your image here, if you want*

</figcaption>

</figure>

1. Put an intro image above (if you want)
2. Change the 1st line of this file to the name of your project
3. Replace this list with the names of your group members, linking to email or github accounts (if you want)
4. Remember to also add your credits, introductions/summarys to the mkdocs.yml and README.md and docs/credits.md files.

---

Replace this, putting the introduction/summary of your project here. Leave the credits in the bottom section, however.
This project analyzes Hillary Clinton's emails using Orange3's hierarchical clustering widget to group similar emails. For hierarchical clustering to work, I first used Orange's "Bag of Words" widget to give each email numerical values - in this case, how often specific words appeared - and passed the result to the "Distances" widget to measure how similar the emails were to one another. After running the data through those widgets, I chose to analyze cluster C1 - a set of emails about Benghazi - to see what linked them together. I found that the only reason Orange grouped them was that they all contained references to the attack in Benghazi or to Libya. Some concerned Libyan worries about how the attack would affect international investment, while others concerned the 2012 presidential election; specifically, Republicans' desire to use Benghazi to paint Obama as weak on foreign policy.
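
The hierarchical clustering step itself can be sketched as repeated merging of the two closest groups. This is only an illustration under stated assumptions: the distance values are made up, the function name `average_linkage` is hypothetical, and average linkage is one of several linkage options; the project does not say which one was selected in Orange.

```python
def average_linkage(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    dist is a symmetric matrix of pairwise distances between items.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > max(n_clusters, 1):
        # Find the pair of clusters with the smallest average distance.
        _, x, y = min(
            (cluster_distance(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


# Made-up distances between four emails: 0 and 1 are close, 2 and 3 are close.
toy_distances = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]

print(average_linkage(toy_distances, 2))  # prints [[0, 1], [2, 3]]
```

Stopping at a chosen number of clusters corresponds to cutting the dendrogram at some height; a group such as C1 is one of the resulting clusters.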

---

22 changes: 2 additions & 20 deletions docs/initial_findings.md
@@ -1,25 +1,7 @@
# Initial Findings

This is where you where you discuss any (initial) conclusions you have come to. This is also a natural place to put any and all visualizations that you come up with. In fact, this is where you tell the reader what it is you discovered. You can use the first person if that helps you discuss the process. This is where you both describe to the reader what they are looking at (if they are looking at a visualization) and, most importantly, what your interpretation of the results is.
My initial finding after analyzing cluster C1 is that the only thing tying the emails in the cluster together was that each referenced Benghazi; within the subclusters, however, there were several common themes, including foreign policy, national security, and the 2012 U.S. presidential election. My analysis of one subcluster revealed Libyan President Mohammed Yousef el-Magariaf's concern that foreign investors would see the country as unstable if he was unable to find the militants who murdered Ambassador Chris Stevens in Benghazi, and his belief that several of them were being funded by al Qaeda [(WikiLeaks, 2016)](https://wikileaks.org/clinton-emails/). My examination of a second subcluster revealed that Clinton's inner circle was concerned about the effect Benghazi would have on U.S. policy in Libya, especially projects America had already begun implementing in the country [(WikiLeaks, 2016)](https://wikileaks.org/clinton-emails/). My analysis of a third revealed how eager the GOP was to make Benghazi an issue (one Romney advisor called it a gift to the campaign) and how frustrated Clinton's supporters were that the issue did not disappear; one of them asked why it had not been consigned to the "sad and sealed file of Americans killed abroad in dangerous line of duty" [(WikiLeaks, 2016)](https://wikileaks.org/clinton-emails/). Given that I learned all of this from a cluster of roughly 20 emails out of 30,000 total, I believe there is much more to learn if scholars analyze the data further; additional analysis of related clusters might reveal some of Benghazi’s effects on Libya's economy and its relationship with America.
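
One way to check what ties a cluster together is to list the words that appear in every email in it. The sketch below is an illustration only: the three texts are invented stand-ins, not quotations from the archive, and the helper names `word_set` and `shared_terms` are hypothetical rather than part of Orange.

```python
import re

# Invented stand-ins for the emails in one cluster; not actual quotations.
cluster_emails = [
    "Magariaf worries investors will see Libya as unstable after Benghazi",
    "Campaign advisers want to keep Benghazi in the news before the election",
    "How will Benghazi affect ongoing projects in Libya",
]


def word_set(text):
    """Return the set of distinct lowercase words in one document."""
    return set(re.findall(r"[a-z']+", text.lower()))


def shared_terms(documents):
    """Return the words that appear in every document, sorted."""
    sets = [word_set(d) for d in documents]
    return sorted(set.intersection(*sets)) if sets else []


print(shared_terms(cluster_emails))  # prints ['benghazi']
```

When the shared vocabulary shrinks to a single term like this, the cluster reflects a common keyword rather than a common theme, which is why the subclusters needed to be read individually.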

If you want to add an image with a caption you can do it like this....
However, a more thorough analysis demands a larger team with technical skill - this is not a project one person can tackle - and, if possible, more powerful technology. Because I conducted the research on my own and under time constraints, I was only able to look at a single cluster; a larger team with more resources could do a more in-depth analysis of Clinton's emails. Scholars will also need technical expertise: my limited knowledge and skills hampered my ability to analyze the data by preventing me from using hierarchical clustering programs that would have aided my project. Finally, scholars will need better technology - I had to limit the size of the file I ran in Orange so my computer could process it; scholars with more powerful machines than mine could do higher-quality research by analyzing a larger sample.

<figure>

![Replace Me, Sample Image](imgs/caesarian_code.png)

<figcaption>

*Put a caption to your image here, if you want*

</figcaption>

</figure>

Whether you turn your visualizations into static pictures and put them here or embed them, you MUST discuss your visualizations adequately. That means that whoever is the visualization expert must explain what they think the visualization means. You should explain anything that is not self-apparent from the picture alone. Moreover, you should at least comment on whether you think you can draw broader conclusions from any of the visualizations, either when considered individually or all together.

If your project was less visualization-centric, this is where you at least explain in plain words what you learned through the process of non-visual analysis. For example, if you used text mining not for visualization purposes, but to help you manually find interesting threads of conversation in a body of documents.

You should also make sure to be a cautious scholar, and to think about the limitations of what the visualizations can actually tell you. As indicated by the section name, you are **not** expected to make hard conclusions that upend serious historical debates. Rather, this is a place to explore what might be learned from visual exploration.

You must also comment on where you would go next. If these are initial findings, where do you think the best profit would be for any future attempts. This is also where you can talk about pitfalls that limited your ability to learn more about the topic. How might future projects overcome these difficulties?