From 87e6f0ba2ae8ecae6b65556abfdd5909cf2ffa93 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" Date: Fri, 20 Dec 2024 22:17:35 +0000 Subject: [PATCH] Render course --- docs/01-intro.md | 2 +- docs/02-team_guidelines.md | 10 +++++++--- docs/04-informatics_relationships.md | 6 +++--- docs/05-promoting_diversity.md | 4 ++-- ...for-multidisciplinary-informatics-teams.html | 17 ++++++++++------- docs/informatics-relationships.html | 12 ++++++------ docs/introduction.html | 2 +- ...romoting-diversity-equity-and-inclusion.html | 4 ++-- docs/references.html | 6 +++--- docs/search_index.json | 2 +- 10 files changed, 36 insertions(+), 29 deletions(-) diff --git a/docs/01-intro.md b/docs/01-intro.md index c24a189..830e9b9 100644 --- a/docs/01-intro.md +++ b/docs/01-intro.md @@ -40,7 +40,7 @@ Informatics work presents unique challenges due to the fact that it requires mul According to a recent article about this subject: -> "Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understand a colleague's particular expertise." [carpenter_cultivating_2021] +> "Computational biology hinges on mutual respect between scientists from different disciplines, and key elements of respect are understanding a colleague’s particular expertise and motivation." [way_field_2021] In this course we hope to give you a bit more of an understanding of the variety of perspectives that your colleagues might have. diff --git a/docs/02-team_guidelines.md b/docs/02-team_guidelines.md index 3962322..8d7872e 100644 --- a/docs/02-team_guidelines.md +++ b/docs/02-team_guidelines.md @@ -68,11 +68,15 @@ Finally, you also want to look for individuals who seem to get things done. You Be careful about assuming that your experimental collaborator can do any type of wet bench experiment or that your informatics collaborator can analyze any type of data. 
-> "Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understanding a colleague’s particular expertise" [@carpenter_cultivating_2021]. +> "Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understanding a colleague’s particular expertise" [@way_field_2021]. + +> "Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data" [@way_field_2021]. + +> "Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, proteomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular idea does not fit a computational biologist’s research agenda, and computational scientists need to clearly communicate analysis considerations, approaches, and limitations" [@way_field_2021]. + +> "Certain grant mechanisms can provide flexibility for computational biologists to develop new technologies, but the scope often focuses on method development, limiting the ability to collaborate on application-oriented projects. The current academic systems incentivize mechanism and translational discovery for biology but methodological or theoretical advances for computational sciences. This explains a common disconnect when collaborating: **Projects that require routine use of existing methodology typically provide little benefit to the computational person’s academic record no matter how unique a particular dataset.**" [@way_field_2021] -> "Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data" [@carpenter_cultivating_2021]. 
-> "Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular collaboration idea does not fit a computational biologist’s research agenda or expertise...a computational biologist’s research laboratory often must focus on novel methods development in a particular area for career advancement" [@carpenter_cultivating_2021]. ## Communication for informatics teams diff --git a/docs/04-informatics_relationships.md b/docs/04-informatics_relationships.md index ad7d42c..80d59b2 100644 --- a/docs/04-informatics_relationships.md +++ b/docs/04-informatics_relationships.md @@ -107,11 +107,11 @@ Sharing and discussing budget information early and often can help research memb It is also important to recognize that: -> "There is a common misconception that the lack of physical experimentation and laboratory supplies makes analysis work automated, quick, and inexpensive" [@carpenter_cultivating_2021]. +> "There is a common misconception that the lack of physical experimentation and laboratory supplies makes computational work automated, quick, and inexpensive." [@way_field_2021]. However: -> "In reality, even for well-established data types, analysis can often take **as much or more time** and effort to perform as generating the data at the bench. Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project" [@carpenter_cultivating_2021]. +> "In reality, even for well-established data types, analysis can often take **as much or more time** and effort to perform as generating the data at the bench. 
Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project" [@way_field_2021]. Don't forget to provide some budget for your informatics collaborators, as their time ultimately does cost money and there may be computational costs that you may not be aware of. @@ -234,7 +234,7 @@ Wherever possible also try to advocate for more nuanced academic promotion polic For your students who wish to stay in academia, it may be less important that they become as generally familiar with a wide variety of data science skills and practices as students interested in a career outside of academia, if they are for example focusing on a specific statistical method. Just like other academic fields, informatics experts will become experts in niche subject areas. Encourage these students to go to targeted conferences to build a network in their field of interest (although we still encourage if possible to allow your students to get a well-rounded exposure to different types of conferences). Also especially encourage these students to learn about grantsmanship just as you would with your other academic mentees. However this can also be useful for students interested in working for a nonprofit or for the government. -See @carpenter_cultivating_2021 and @waller_documenting_2018 for a more in-depth discussion and suggestions on how we can work to reform academic promotion practices to be more mindful of disciplinary differences for informatics experts. +See @way_field_2021 and @waller_documenting_2018 for a more in-depth discussion and suggestions on how we can work to reform academic promotion practices to be more mindful of disciplinary differences for informatics experts. 
#### Authorship considerations diff --git a/docs/05-promoting_diversity.md b/docs/05-promoting_diversity.md index 8a67140..f1009be 100644 --- a/docs/05-promoting_diversity.md +++ b/docs/05-promoting_diversity.md @@ -59,7 +59,7 @@ Many Black scientists are responsible for great human achievements that changed - [Vivien Thomas (1910–1985)](https://en.wikipedia.org/wiki/Vivien_Thomas): Vivien was a pioneer in [heart surgery at Johns Hopkins](https://www.hopkinsmedicine.org/som/giving/vtfund.html) in the 1940s when it was considered taboo. He is credited for assisting in creating a life saving surgical operation that improved the oxygenation of children with congenital heart defects. This operation was pivotal in paving the way for other heart surgical procedures. Sadly, it took more than 25 years for him to be credited for this work. -See [this article for more information about Katherine Johnson and other pioneering and world-changing Black female scientsist](https://www.biography.com/scientists/katherine-johnson-black-female-science-technology-engineering-mathematics). Also check out this [list of inspiring Black scientists today ](http://crosstalk.cell.com/blog/100-inspiring-black-scientists-in-america). +See [this article for more information about Katherine Johnson and other pioneering and world-changing Black female scientists](https://www.biography.com/scientists/katherine-johnson-black-female-science-technology-engineering-mathematics). Also check out this [list of inspiring Black scientists today ](http://crosstalk.cell.com/blog/100-inspiring-black-scientists-in-america). 
Scientists and mathematicians of Latin American or Hispanic origin have also greatly contributed and continue to contribute to scientific innovation: @@ -198,7 +198,7 @@ In a study of clinical trial participants of marginalized groups, participants s > "In addition, participants noted that the enthusiasm, commitment, or passion of the researcher to help the population of interest and to study the topic of interest influenced how trustworthy the researcher appeared to be" [@griffith_determinants_2020]. -Participants also state that the feel more trusting and more willing to participant in trials with a researcher of a similar background. Thus, improving the diversity of research teams and supporting underrepresented investigators may help to recruit more diverse participants for clinical trials [@ACRP_representation_2020]. In addition, [culture competency training](https://www.cigna.com/health-care-providers/resources/cultural-competency-training) and [Diversity, Equity, and Inclusion (DEI) training](https://www.diversityinclusiontraining.com/) can help! See [here](https://www.samhsa.gov/section-223/cultural-competency/resources) for additional cultural competency training resources. +Participants also state that they feel more trusting and more willing to participant in trials with a researcher of a similar background. Thus, improving the diversity of research teams and supporting underrepresented investigators may help to recruit more diverse participants for clinical trials [@ACRP_representation_2020]. In addition, [culture competency training](https://www.cigna.com/health-care-providers/resources/cultural-competency-training) and [Diversity, Equity, and Inclusion (DEI) training](https://www.diversityinclusiontraining.com/) can help! See [here](https://www.samhsa.gov/section-223/cultural-competency/resources) for additional cultural competency training resources. 
### **Barriers of access** diff --git a/docs/guidelines-for-multidisciplinary-informatics-teams.html b/docs/guidelines-for-multidisciplinary-informatics-teams.html index 53198b4..d0e9d6a 100644 --- a/docs/guidelines-for-multidisciplinary-informatics-teams.html +++ b/docs/guidelines-for-multidisciplinary-informatics-teams.html @@ -394,13 +394,16 @@

2.1.2 Who to look for
2.1.3 Adjust your expectations

Be careful about assuming that your experimental collaborator can do any type of wet bench experiment or that your informatics collaborator can analyze any type of data.

-

“Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understanding a colleague’s particular expertise” (Carpenter et al. 2021).

+

“Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understanding a colleague’s particular expertise” (Way et al. 2021).

-

“Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data” (Carpenter et al. 2021).

+

“Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data” (Way et al. 2021).

-

“Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular collaboration idea does not fit a computational biologist’s research agenda or expertise…a computational biologist’s research laboratory often must focus on novel methods development in a particular area for career advancement” (Carpenter et al. 2021).

+

“Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, proteomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular idea does not fit a computational biologist’s research agenda, and computational scientists need to clearly communicate analysis considerations, approaches, and limitations” (Way et al. 2021).

+
+
+

“Certain grant mechanisms can provide flexibility for computational biologists to develop new technologies, but the scope often focuses on method development, limiting the ability to collaborate on application-oriented projects. The current academic systems incentivize mechanism and translational discovery for biology but methodological or theoretical advances for computational sciences. This explains a common disconnect when collaborating: Projects that require routine use of existing methodology typically provide little benefit to the computational person’s academic record no matter how unique a particular dataset.” (Way et al. 2021)

@@ -550,7 +553,7 @@

2.4.7 Advocate authorship

Regardless of your employees’ or students’ backgrounds, make sure you advocate for authorship for each of them (particularly if they are interested in a career in research).

According to a recent report about computational biology:

-

“Despite the fact that dramatic advancements have been driven by computational biology, too often researchers choosing this path languish in career advancement, publication, and grant review” (Carpenter et al. 2021).

+

“Despite the fact that dramatic advancements have been driven by computational biology, too often researchers choosing this path languish in career advancement, publication, and grant review” (carpenter_cultivating_2021?).

It is often overlooked, but informatics experts will also need first author papers. However, keep in mind that in some fields authors are listed in different ways.

Allow your employees to generate ideas for such publications and discuss this with them. Often the work to help with other projects may not be as interesting for your employee as an idea that they come up with themselves.

@@ -593,9 +596,6 @@

References Broman, Karl. 2019. “Collaborating Reproducibly.” https://www.biostat.wisc.edu/~kbroman/presentations/rrcollab_aaas2019.pdf. -
-Carpenter, Anne E, Casey S Greene, Piero Carninci, Benilton S Cervalho, Michiel de Hoon, Stacey Finley, Jerry SH Lee, et al. 2021. “Cultivating Computational Biology.” arXiv [q-Bio.OT], 12. https://doi.org/arXiv:2104.11364. -
Haden, Jeff. 2014. “5 Ways to Ask the Perfect Question.” Inc.com. https://www.inc.com/jeff-haden/5-ways-to-ask-the-perfect-question.html.
@@ -617,6 +617,9 @@

References ———. 2014. “The Lonely Bioinformatician Revisited: Clinical Labs .” Opiniomics.org. http://www.opiniomics.org/the-lonely-bioinformatician-revisited-clinical-labs/. +
+Way, Gregory P., Casey S. Greene, Piero Carninci, Benilton S. Carvalho, Michiel De Hoon, Stacey D. Finley, Sara J. C. Gosline, et al. 2021. “A Field Guide to Cultivating Computational Biology.” PLOS Biology 19 (10): e3001419. https://doi.org/10.1371/journal.pbio.3001419. +

diff --git a/docs/informatics-relationships.html b/docs/informatics-relationships.html index 53b426b..1f05c00 100644 --- a/docs/informatics-relationships.html +++ b/docs/informatics-relationships.html @@ -438,11 +438,11 @@

4.2.2 Potential challenges
Sharing and discussing budget information early and often can help research members to understand what expectations are reasonable and how collaboration partners may best assist one another.

It is also important to recognize that:

-

“There is a common misconception that the lack of physical experimentation and laboratory supplies makes analysis work automated, quick, and inexpensive” (Carpenter et al. 2021).

+

“There is a common misconception that the lack of physical experimentation and laboratory supplies makes computational work automated, quick, and inexpensive.” (Way et al. 2021).

However:

-

“In reality, even for well-established data types, analysis can often take as much or more time and effort to perform as generating the data at the bench. Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project” (Carpenter et al. 2021).

+

“In reality, even for well-established data types, analysis can often take as much or more time and effort to perform as generating the data at the bench. Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project” (Way et al. 2021).

Don’t forget to provide some budget for your informatics collaborators, as their time ultimately does cost money and there may be computational costs that you may not be aware of.

 Ways to overcome collaboration challenges: 1) To avoid issues with communication differences, take care, educate each other, and consider co-mentoring trainees or hiring experts who speak your collaborator's language. 2) To avoid issues with different styles or goals, make unified standards and discuss what success means early and often. 3) To avoid issues with different capabilities, make sure everyone has defined roles and tasks that actually use their expertise. 4) To avoid issues with a reduced sense of responsibility, keep everyone motivated with defined due dates (discuss dates to make sure they make sense!). 5) To avoid issues with the fact that research is a dynamic process, expect change and occasional failure. 6) To avoid issues with differences in resources, be mindful of potential budget and resource differences.

@@ -547,7 +547,7 @@

4.4.4.3 Academia for informatics
  • Encourage your student to seek specialized technical skill sets
  • For your students who wish to stay in academia, it may be less important that they become as generally familiar with a wide variety of data science skills and practices as students interested in a career outside of academia, if they are for example focusing on a specific statistical method. Just like other academic fields, informatics experts will become experts in niche subject areas. Encourage these students to go to targeted conferences to build a network in their field of interest (although we still encourage if possible to allow your students to get a well-rounded exposure to different types of conferences). Also especially encourage these students to learn about grantsmanship just as you would with your other academic mentees. However this can also be useful for students interested in working for a nonprofit or for the government.

    -

    See Carpenter et al. (2021) and Waller (2018) for a more in-depth discussion and suggestions on how we can work to reform academic promotion practices to be more mindful of disciplinary differences for informatics experts.

    +

    See Way et al. (2021) and Waller (2018) for a more in-depth discussion and suggestions on how we can work to reform academic promotion practices to be more mindful of disciplinary differences for informatics experts.

    -
    -Carpenter, Anne E, Casey S Greene, Piero Carninci, Benilton S Cervalho, Michiel de Hoon, Stacey Finley, Jerry SH Lee, et al. 2021. “Cultivating Computational Biology.” arXiv [q-Bio.OT], 12. https://doi.org/arXiv:2104.11364. -
    Mäkinen, Elina I., Eliza D. Evans, and Daniel A. McFarland. 2020. “The Patterning of Collaborative Behavior and Knowledge Culminations in Interdisciplinary Research Centers.” Minerva 58 (1): 71–95. https://doi.org/10.1007/s11024-019-09381-6.
    @@ -603,6 +600,9 @@

    References Waller, Lance A. 2018. “Documenting and Evaluating Data Science Contributions in Academic Promotion in Departments of Statistics and Biostatistics.” The American Statistician 72 (1): 11–19. https://doi.org/10.1080/00031305.2017.1375988. +
    +Way, Gregory P., Casey S. Greene, Piero Carninci, Benilton S. Carvalho, Michiel De Hoon, Stacey D. Finley, Sara J. C. Gosline, et al. 2021. “A Field Guide to Cultivating Computational Biology.” PLOS Biology 19 (10): e3001419. https://doi.org/10.1371/journal.pbio.3001419. +

    diff --git a/docs/introduction.html b/docs/introduction.html index 477fb78..26ba27d 100644 --- a/docs/introduction.html +++ b/docs/introduction.html @@ -387,7 +387,7 @@

    1.5 Informatics teams challenges

    Informatics work presents unique challenges due to the fact that it requires multidisciplinary teams.

    According to a recent article about this subject:

    -

    “Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understand a colleague’s particular expertise.” [carpenter_cultivating_2021]

    +

    “Computational biology hinges on mutual respect between scientists from different disciplines, and key elements of respect are understanding a colleague’s particular expertise and motivation.” [way_field_2021]

    In this course we hope to give you a bit more of an understanding of the variety of perspectives that your colleagues might have.

    diff --git a/docs/promoting-diversity-equity-and-inclusion.html b/docs/promoting-diversity-equity-and-inclusion.html index 9bb63bd..acf61bd 100644 --- a/docs/promoting-diversity-equity-and-inclusion.html +++ b/docs/promoting-diversity-equity-and-inclusion.html @@ -404,7 +404,7 @@

    5.3 Examples of contributions by
  • Katherine Johnson (1918–2020): Katherine Johnson was a mathematician who helped enable humans to venture into space. Katherine worked for NASA in the 1960s during the Space Race. A depiction of her work and some of the discrimination that she faced while working for NASA is featured in the film Hidden figures.

  • Vivien Thomas (1910–1985): Vivien was a pioneer in heart surgery at Johns Hopkins in the 1940s when it was considered taboo. He is credited for assisting in creating a life saving surgical operation that improved the oxygenation of children with congenital heart defects. This operation was pivotal in paving the way for other heart surgical procedures. Sadly, it took more than 25 years for him to be credited for this work.

  • -

    See this article for more information about Katherine Johnson and other pioneering and world-changing Black female scientsist. Also check out this list of inspiring Black scientists today.

    +

    See this article for more information about Katherine Johnson and other pioneering and world-changing Black female scientists. Also check out this list of inspiring Black scientists today.

    Scientists and mathematicians of Latin American or Hispanic origin have also greatly contributed and continue to contribute to scientific innovation:

    • Mario Molina (1943–2020): Mario was born in Mexico and came to the United States for graduate school. He was a chemist and environmental scientist and was awarded the Nobel Prize in chemistry in 1995 for his work in discovering the environmental impact of chlorofluorocarbon (CFC) gases on the Earth’s ozone layer. This work was critical in changing policies (starting in the 1980s) to protect our environment from these chemicals which were widely used as aerosols, solvents and refrigerants. Through these policies, many countries around the world have banned or greatly reduced the manufacturing of these chemicals. NASA projections suggest that the ozone layer would have been largely depleted by 2060 if these chemicals were manufactured at previous historical rates. See here for more information on what would have likely happened to the world if CFCs were not banned. Also check out this podcast episode for more information.

    • @@ -522,7 +522,7 @@

      5.4.4 Inadequate research

      “In addition, participants noted that the enthusiasm, commitment, or passion of the researcher to help the population of interest and to study the topic of interest influenced how trustworthy the researcher appeared to be” (Griffith et al. 2020).

      -

      Participants also state that the feel more trusting and more willing to participant in trials with a researcher of a similar background. Thus, improving the diversity of research teams and supporting underrepresented investigators may help to recruit more diverse participants for clinical trials (Yates et al. 2020). In addition, culture competency training and Diversity, Equity, and Inclusion (DEI) training can help! See here for additional cultural competency training resources.

      +

      Participants also state that they feel more trusting and more willing to participant in trials with a researcher of a similar background. Thus, improving the diversity of research teams and supporting underrepresented investigators may help to recruit more diverse participants for clinical trials (Yates et al. 2020). In addition, culture competency training and Diversity, Equity, and Inclusion (DEI) training can help! See here for additional cultural competency training resources.

      5.4.5 Barriers of access

      diff --git a/docs/references.html b/docs/references.html index 1524996..0d522f4 100644 --- a/docs/references.html +++ b/docs/references.html @@ -383,9 +383,6 @@

      References
      “Asian Americans and Pacific Islanders - Facts, Not Fiction: Setting the Record Straight,” 44. https://secure-media.collegeboard.org/digitalServices/pdf/professionals/asian-americans-and-pacific-islanders-facts-not-fiction.pdf.

      -Carpenter, Anne E, Casey S Greene, Piero Carninci, Benilton S Cervalho, Michiel de Hoon, Stacey Finley, Jerry SH Lee, et al. 2021. “Cultivating Computational Biology.” arXiv [q-Bio.OT], 12. https://doi.org/arXiv:2104.11364. -
      -
      Cech, E. A., and T. J. Waidzunas. 2021. “Systemic Inequalities for LGBTQ Professionals in STEM.” Science Advances 7 (3): eabe0933. https://doi.org/10.1126/sciadv.abe0933.
      @@ -530,6 +527,9 @@

      References
      “The Lonely Bioinformatician Revisited: Clinical Labs .” Opiniomics.org. http://www.opiniomics.org/the-lonely-bioinformatician-revisited-clinical-labs/.

      +Way, Gregory P., Casey S. Greene, Piero Carninci, Benilton S. Carvalho, Michiel De Hoon, Stacey D. Finley, Sara J. C. Gosline, et al. 2021. “A Field Guide to Cultivating Computational Biology.” PLOS Biology 19 (10): e3001419. https://doi.org/10.1371/journal.pbio.3001419. +
      +
      Williams, Deborah H., and Gerhard P. Shipley. 2018. “Cultural Taboos as a Factor in the Participation Rate of Native Americans in STEM.” International Journal of STEM Education 5 (1): 17. https://doi.org/10.1186/s40594-018-0114-7.
      diff --git a/docs/search_index.json b/docs/search_index.json index 633044d..05c15c4 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["index.html", "Leadership for Cancer Informatics Research About this course 0.1 Available course formats", " Leadership for Cancer Informatics Research 2024-12-20 About this course This course is part of a series of courses for the Informatics Technology for Cancer Research (ITCR) called the Informatics Technology for Cancer Research Education Resource. This material was created by the ITCR Training Network (ITN) which is a collaborative effort of researchers around the United States to support cancer informatics and data science training through resources, technology, and events. This initiative is funded by the following grant: National Cancer Institute (NCI) UE5 CA254170. Our courses feature tools developed by ITCR Investigators and make it easier for principal investigators, scientists, and analysts to integrate cancer informatics into their workflows. Please see our website at www.itcrtraining.org for more information. Except where otherwise indicated, the contents of this course are available for use under the Creative Commons Attribution 4.0 license. You are free to adapt and share the work, but you must give appropriate credit, provide a link to the license, and indicate if changes were made. Sample attribution: Leadership for Cancer Informatics Research by Johns Hopkins Data Science Lab (CC-BY 4.0). You can download the illustrations by clicking here. 0.1 Available course formats This course is available in multiple formats which allows you to take it in the way that best suites your needs. You can take it for certificate which can be for free or fee. The material for this course can be viewed without login requirement on this Bookdown website. This format might be most appropriate for you if you rely on screen-reader technology. 
This course can be taken for free certification through Leanpub. This course can be taken on Coursera for certification here (but it is not available for free on Coursera). Our courses are open source, you can find the source material for this course on GitHub. "],["introduction.html", "Chapter 1 Introduction 1.1 Motivation 1.2 Target audience 1.3 Topics covered: 1.4 Curriculum 1.5 Informatics teams challenges 1.6 Meet the team!", " Chapter 1 Introduction 1.1 Motivation Informatics research often requires multidisciplinary teams. This requires more flexibility to communicate with team members with distinct backgrounds. Furthermore, team members often have different research and career goals. This can present unique challenges in making sure that everyone is on the same page and cohesively working together. 1.2 Target audience The course is intended for researchers who lead research teams or collaborate with others to perform multidisciplinary work. We have especially aimed the material for those with moderate to no computational experience who may lead or collaborate with informatics experts. However this material is also applicable to informatics experts working with others who have less computational experience. 1.3 Topics covered: 1.4 Curriculum We will provide you with an awareness for the specific challenges that your informatics collaborators, employees, and mentees might face, as well as ways to mitigate these challenges. By creating a better work environment for your informatics research team, you will ultimately improve the potential impact of your work. We will also discuss the major pitfalls of informatics research and discuss best practices for performing informatics research correctly and well, so that you can get the most out of your informatics projects. 1.5 Informatics teams challenges Informatics work presents unique challenges due to the fact that it requires multidisciplinary teams. 
According to a recent article about this subject: “Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understand a colleague’s particular expertise.” [carpenter_cultivating_2021] In this course we hope to give you a bit more of an understanding of the variety of perspectives that your colleagues might have. 1.6 Meet the team! In order to familiarize you with our guidelines for how to make the most out of your informatics projects we are going to introduce you to some characters of the type of people you may encounter on your journey. We are going to show these characters in situations that may be similar to what you might experience. We are doing this to make the lessons concrete and to try to make the experience more entertaining and experiential. First our fearless lab leaders who lead informatics research projects. We have Sally who is experienced with working with team members from many disciplines including informatics experts. She helps guide her lab through successful projects all the time. Next, we have Charlie. He is new to informatics research and could learn a bit about how to work with informatics experts more effectively. Now we have our informaticists. First is Jack, who is often forgotten and misunderstood by his lab leader. His lab leader does not really know what he does, how long his work takes, or how to support Jack and his career goals. This is unfortunately impeding Jack from achieving the career that he could have and from producing the good work that he is capable of. We also have Hilda, an example of a happy informaticist. She feels supported in all the ways that she needs, allowing her to be as productive and helpful as possible. Here is Francis the frustrated collaborator. She often feels misunderstood by her colleagues. 
They seem to think her work is easy and should be done faster and they often don’t discuss important aspects about the project until she needs to redo work. Thus, she is reasonably frustrated. Finally we have Harry, the helpful collaborator. He clearly communicates with his collaborators and is well organized! He teaches his collaborators about informatics and they teach him about their knowledge. He and his collaborators have very productive projects. We will now describe some guidelines for how to be an effective leader, collaborator, and mentor on informatics projects so that you can be more like Sally with mentees and employees like Hilda and collaborators like Harry. Keep in mind that while our cartoons often exaggerate situations to make them more comical, more subtle situations can still be very detrimental to your research teammates. "],["guidelines-for-multidisciplinary-informatics-teams.html", "Chapter 2 Guidelines for multidisciplinary informatics teams 2.1 Finding and creating informatics teams 2.2 Communication for informatics teams 2.3 Record keeping practices 2.4 Leadership best practices 2.5 Conclusion", " Chapter 2 Guidelines for multidisciplinary informatics teams In this lesson we will discuss general guidelines for how to create and maintain healthy relationships within a multidisciplinary informatics research team. 2.1 Finding and creating informatics teams The first step to performing a good research study is to find the right people for your research team. In this section we will provide a guide for finding good coworkers, whether they are mentees, collaborators, or employees, to work on informatics cancer projects, particularly for multidisciplinary teams. This section is based on blog posts Peng (2011) from Roger Peng and Matsui (2013) from Elizabeth C. Matsui of the Simply Statistics blog, which has many other useful discussions and resources. 
2.1.1 Start early The key to a successful multidisciplinary project is to start looking for your research teammates early. If you plan to collaborate with an expert, we suggest that you find such collaborators and discuss your plans before you even finish designing your studies. If you are new to informatics and you plan to employ or mentor informatics experts in your lab, we also suggest that you seek guidance from a senior informatics expert before you start an informatics project. 2.1.2 Who to look for Especially for projects with multidisciplinary teams, good communication is vital. Look for people who are easy to talk to. In Roger’s words: “If you don’t feel comfortable asking (stupid) questions, pointing out problems, or making suggestions, then chances are the science will not be as good as it could be.” Ideally you want to be able to ask your collaborators basic questions, so that you can be sure that you understand the fundamentals correctly. If your collaborator doesn’t explain information clearly (without jargon that you don’t understand) or doesn’t make you feel safe to ask questions, then this will likely result in you missing out on critical information. This goes hand in hand with finding someone who is compassionate or polite. You never quite know what will happen with projects in Science, thus it is ideal to work with people who can handle difficult situations well and continue to treat others with respect. An ideal collaborator also has enthusiasm to ask you about your work, including again fundamental questions. Having their newer perspective can be especially helpful for you to think about your work differently and notice anything that you might take for granted. You also want collaborators that respect and appreciate your knowledge. If there is an imbalance regarding respect, this can result in situations where one collaborator feels like a subordinate to the other. This generally makes individuals feel less comfortable to bring up problems. 
Finally, you also want to look for individuals who seem to get things done. You can get a sense of someone’s productivity by looking at their CV and asking them about their projects. Motivation in collaboration can dwindle due to responsibility being spread across groups, which is why it is so important to find productive collaborators. 2.1.3 Adjust your expectations Be careful about assuming that your experimental collaborator can do any type of wet bench experiment or that your informatics collaborator can analyze any type of data. “Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understanding a colleague’s particular expertise” (Carpenter et al. 2021). “Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data” (Carpenter et al. 2021). “Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular collaboration idea does not fit a computational biologist’s research agenda or expertise…a computational biologist’s research laboratory often must focus on novel methods development in a particular area for career advancement” (Carpenter et al. 2021). 2.2 Communication for informatics teams Communication is vital for all work relationships, but this is especially true for multidisciplinary teams. Here are a few tips for keeping communication smooth. This section is based on Broman (2019) by Karl Broman and Wang (2019) by Jiangxia Wang. 2.2.1 Talk first OK, so we kinda covered this, but we can’t emphasize it enough. 
We suggest that you start talking to your collaborators, students, or employees before you even begin a project, so that you can plan the project in the optimal way. This is critical for forming the right informatically feasible and scientifically useful questions and for collecting the right data to address such questions. Collecting the right data can be vital to the success of a project. It may not always be obvious what is possible or impossible for the experimental biologists. Is 30 samples actually feasible? How about 300? Would that be performed in different batches? What is necessary or possible for an informatics project to test certain questions with statistical methods? How long would certain analyses take? These are discussions that should happen long before reagents are purchased, before IRB submissions, and before grant submissions if possible. 2.2.2 Take your time For employees and mentees, allow time to get comfortable talking to one another. As a leader, take the lead in openly expressing areas that are new to you and facilitate an environment where teammates communicate respect for one another’s unique knowledge and perspectives. In all cases (with collaborators, employees, and mentees), build in extra time for projects to allow for teaching time. You can teach them about your domain and they can teach you about theirs. This may feel like it is taking extra time but it will ultimately pay off in the end, as you will be better prepared to work as a team and to ask the most useful and testable hypotheses. 2.2.3 Come up with questions/hypotheses together Not knowing what may be feasible in terms of data collection and analysis can make it nearly impossible to form an appropriately testable hypothesis. Furthermore, it may be difficult to know how your questions fit into the context of a field and what is actually useful to advance treatment and prevention if you are not a domain expert of the cancer or disease that you are studying. 
By working together in multidisciplinary teams we can determine the best hypotheses to advance science. Domain experts can help to ensure that the question is feasible from a standpoint of data collection, that it leads to other important questions, that it is new, that it is useful, and that the plan to test it will actually lead to interpretations that are useful. Informatics experts can help to ensure that the question is feasible from a standpoint of data collection and data analysis, that a question is testable, and that it leads to the interpretations that the domain experts hope to gain. 2.2.4 Be specific Give and ask for specific feedback. If your collaborator/employee/student says something that you do not quite understand, ask them for more specific clarification. In addition, give feedback that is specific where possible without assuming knowledge that might be necessary and avoid jargon. One way to do this is to reiterate what you think you understood and to describe concepts at a high level, and then follow this with detailed descriptions. Include as many details as possible. For example, if a collaborator simply states that the number of samples would be underpowered, this might not be enough information for you to help solve the issue. Ask for clarification about why with response questions, such as “you believe that the sample size is too small to allow for this specific statistical test (specify the test) to be utilized to identify if there are differences between these specific groups (specify the groups)?”. Once you get a better understanding about why they might have said this, you can better understand how to solve the issue. In the above example, perhaps you need to consider a different test before considering getting additional samples. Or perhaps you aren’t even testing the correct question. Clarifying this first will help. 
Additional communication tips include (based on Haden 2014 and “Ten Tips for Asking Good Questions” n.d.): Use neutral language. You want to allow your team members to have their own authentic reaction (especially when it comes to issues). Neutral language might allow them to better realize a solution before they feel threatened by a reaction of stress from you. You also don’t want to ask questions that lead your team members to a specific conclusion even if there isn’t an issue, so as to allow for more discussion. You can do so by initially avoiding including options or assumptions that you have come up with. However after these open questions you want to start including your own understandings to get more specific. Start with general high level concepts and follow this with more specific comments/questions. Try to focus your feedback and questions about one specific aspect at a time. Ask your team members to walk through a process with you step by step. Ask your teammates to think about your process as if all of you were entirely new to the process to consider what you might be missing or taking for granted. Plan meetings ahead of time so that you know exactly what you hope to communicate. Write your questions or feedback out so that you make sure you cover everything that you need to. Assess each question and feedback comment for its overall purpose or goal. 2.2.5 Be compassionate Consider the stage of the project and how your discussions may impact your coworker. For example, pointing out that there is not enough data or samples to do what your collaborator had hoped during later stages of the project can be very disappointing as it is often not possible to collect new data. Being polite and considerate when you make suggestions can make a major difference. Furthermore, suggesting an idea about how the project can still be productive can save your collaborator/coworkers/students stress and heartache. 
They may not be aware that there is public data available or additional data in your lab that can still save the project. 2.2.6 Keep contact Regular communication continues the momentum of a project and ensures that important details get discussed when necessary. It also relieves anxiety among coworkers by keeping everyone aware of the status of the project and helping to start discussions if someone needs help. 2.2.7 Schedule extra time As a project continues, new challenges will arise that will again require more time for teaching one another about the scientific process specific to your domain. Build in breathing room in the project schedule where possible, to allow for time for setbacks. Keep in mind that you may be unaware of the setbacks that you may encounter for work outside of your expertise. Creating a situation that is less stressful makes it easier for everyone to maintain positive relationships. We will discuss this more in the next chapter. In summary, we suggest that you follow these tips when communicating with your multidisciplinary team: 2.3 Record keeping practices Once you have your project rolling, it is important to keep good records of your work, your collaborators’ work, and your communication. Keeping good records takes time and discipline but it can save you more time and heartache in the end. Here are some suggestions for how to optimize your record keeping. 2.3.1 Keep organized records of work Record and communicate notes about your data collection and analyses. Be mindful of overwhelming your coworkers, but generally speaking provide extra information where possible. The more people aware of details about what samples were in what batch, the more likely important details are not missed or forgotten. For example, if you are sending data to a collaborator, send as much information as possible about how it was generated in the email in which you send it to them, even if you have already discussed the data. 
This can help ensure that no important details fall through the cracks. The best way we think you can do this in general is to use reports - one of our next suggestions. 2.3.2 Keep organized records of communication Besides recording your work, keep a record of your communications. At a minimum organize your emails for projects into separate folders with easily recognizable titles to save yourself hassle later when something comes into question. However, we highly recommend that in addition, for even better record keeping, you copy and paste emails and dates to a note-taking system. This could be as simple as a shared Google doc, or you could consider an app like these that are designed for note-taking. With many of these you can also share your notes with research teammates and you can include report documents directly in your notes. Which brings us to our next point about using reports! 2.3.3 Use reports Instead of sending informal short emails (which are useful at some points in a workflow), we suggest intermittently sending lab reports with as much information about what was done and why as possible. For informatics related work in R or Python (or other supported languages) we highly suggest using a method like R markdown or Jupyter notebooks to track what informatics steps you have performed and why. Beginning these reports with a short description of what raw data you used and when you received it can be critical for ensuring that you are using the correct data! We will describe more about how to use such reports in the final chapter of this course. 
It is also important that the experimental biologists make similar reports defining what reagents they used, when they performed the study, what samples were used, who performed the experiment, and any notes about unusual events, such as the electricity went out during the experiment, left the samples overnight but usually leave two hours, mouse #blank unexpectedly died so we lost this sample thus it is not included, or the dye seemed unusually faint in this gel. In summary, we recommend the following record keeping tips: 2.4 Leadership best practices In this section we will describe best practices for lab leaders leading multidisciplinary informatics teams to support their research team members. The section is based on a famous blog post (Watson 2013) called “The lonely bioinformatician” that describes the angst that informatics personnel often feel when they are the only person in their lab with their skill set. The blog post author, Professor Mick Watson at the University of Edinburgh, describes these individuals as “pet bioinformaticians” in his blog called opiniomics. He states: “It is possible they [the pet bioinformaticians] will become isolated and pick up bad practices as they don’t have a senior bioinformatician to guide them. It also concerns me that their career and professional development might suffer.” He also acknowledges the challenges of the opposite case: “Consider the opposite situation – how many bioinformatician PIs manage lab staff? How could we possibly guide a young post doc on how to run gels, PCRs etc nevermind more complicated laboratory SOPs?” He has since then stated for the PIs of experts who do not share the same skill-sets: “Just look after them, and recognise you can’t give them everything that they need. You can give them a lot, just not everything.” “Secondly, there is nothing wrong with being a pet bioinformatician – it can be a really stimulating role, and opens your eyes to lab-based science. 
I am not criticizing the pets either, I just urge you to look after yourselves.” And ultimately provides a guide for the “pet bioinformaticians” that can be useful for both informatics expert employees/mentees and also for leaders of such individuals as well as for informatics lab leaders who employ lab-based scientists. Extending the major themes from his guide and from his post about clinical labs (Watson 2014) here are guidelines for multidisciplinary research lab leaders: 2.4.1 Recognize that different disciplines require deep expertise Informatics is truly its own scientific discipline that requires deep expertise. To truly optimize the multidisciplinary work that you may wish to perform, you need expert-level experience of all disciplines on your team. Although you may be able to learn how to use a particular tool to seemingly test a hypothesis, this may not be correct or optimal in every instance. This is why you need expert-level informatics team members to help you with your research. If you can’t hire such an individual, or even if you hire a more junior informatics mentee or employee, you need to discuss your research with a senior informatics expert. 2.4.2 Avoid employee isolation If possible, employ more than one domain expert or at least collaborate heavily with others - especially those with experience working with human data. Alternatively hire a more senior expert (with expertise studying in the domain you intend) with a higher salary. In Mick Watson’s words: “I am aware of a few lone bioinformaticians working in clinical labs. I want to make this clear – this is a bad idea. In fact, it’s a terrible idea. Through no fault of their own, these guys will make mistakes. Those mistakes may have dire consequences if the data are then used to inform a treatment plan or diagnosis.” In any case, we highly encourage guideline #2 regardless of what option you choose. 
2.4.3 Encourage relationships with others in their domain Enable and encourage your employee to cultivate relationships with others who have similar skill-sets at your institution or local community. Ideally, help your employees or mentees find a mentor within their domain. If there is no local group of such individuals, see if your employee would be interested in starting one - such as a seminar group or journal club. Also encourage them to join online forums and attend conferences and workshops. Examples include: R ladies for support for using Bioconductor or R programming. Many of the members are also very familiar with using a variety of genomics and imaging data. You do not need to be a woman to get support from this organization. Many universities have a statistical support group, check at your institution. National Cancer Institute (NCI) hub groups has a list of cancer specific groups such as large groups like the Informatics Technology for Cancer Research (ITCR). Consider location specific groups/collaborations such as the African Esophageal Cancer Consortium (AfrECC). When in doubt, ask around. Ask at your institution, ask your former colleagues at other institutions, or try on social media like Twitter to find connections. 2.4.4 Encourage growth outside their domain On the other hand, it is important that you also cultivate and encourage your employee’s growth in your domain by again suggesting and enabling their participation in conferences and journal clubs on topics relevant to your lab. 2.4.5 Value their perspective about science in general Encourage feedback and discussion from all of your employees in scientific discussions. Make their input feel welcomed regardless of the topic. A fresh perspective can sometimes lead to really important insights about things that are taken for granted by experts. 2.4.6 Discuss expectations and hypotheses If your employee is helping with work for a grant, provide the proposal to them. 
Have a discussion with your employee about your expectations and how feasible they are, as well as to make your informatics hypotheses specific. Avoid projects where the informatics goals are vague. Also remember that many informatics tasks may take more time than you anticipate and your employee may have a better sense of how long something will take (or vice versa if you are an informatics expert employing lab scientists). Be clear with your employee in these discussions that you are unclear about how long tasks will take, if that is indeed the case. Continue to have open dialogue about expectations and goals as the work proceeds. 2.4.7 Advocate authorship and idea generation for all Regardless of your employees’ or students’ backgrounds, make sure you advocate for authorship for each of them (particularly if they are interested in a career in research). According to a recent report about computational biology: “Despite the fact that dramatic advancements have been driven by computational biology, too often researchers choosing this path languish in career advancement, publication, and grant review” (Carpenter et al. 2021). It is often overlooked, but informatics experts will also need first author papers. However, keep in mind that in some fields authors are listed in different ways. Allow your employees to generate ideas for such publications and discuss this with them. Often the work to help with other projects may not be as interesting for your employee as an idea that they come up with themselves. Often you can create one technical paper and one biological paper from each project. For technical papers, allow your lab members that largely do informatics to play a prominent/leadership role. For biological papers, ask them to play a supporting role. For experimental lab members do the opposite. Allow these lab members to have a prominent role on biological papers and a supporting role on more technical papers. 
If nothing else, even if your employee is very busy on work for mid-level authorship, give them time to write a review or a software paper for a simple package, or a comparison of informatics methods. Mick Watson suggests making sure that your employees are authoring ~2 first author publications a year if possible. If necessary you can front-load collaboration work and then give your employee more time later to spend on their own work, but be careful about not protecting some of their time for their own career advancement. Also please see the Career Paths for Informatics Mentees section (coming up soon) and read it with your employees in mind, as well. 2.4.8 Check on them! Most importantly, make sure that your employee is getting help and feedback from other experts in their domain. It can be easy for your employee to get stuck or go in the wrong direction if left in isolation. How can you prevent this from happening? Keep tabs on what they are doing in general, if they are still working on the same issue for an extensive amount of time, suggest that they seek help. Also by encouraging your team members to cultivate relationships with experts you will provide them with the opportunity to ask others for their thoughts. 2.4.9 Get external review of work Particularly in informatics, we can especially track our steps. Make sure that your employees are keeping detailed records about their work and then get them to regularly ask for feedback from others. We all make mistakes, it’s good to get external feedback early and often to ensure that the work is correct. 2.4.10 Support diverse teammate work schedules One other important thing to know is that informatics work is often best performed with long stretches of uninterrupted time to allow your informatics employees to perform “deep work”. Why is this? Some of the challenges that your informatics teammates will be working on will require a great deal of abstract thinking and troubleshooting. 
Such difficult work profits well from deep concentration. How can you accommodate this? Try to work with your informatics teammates to schedule lab meetings and be mindful of other time commitments they might have, such as classes, seminars, or other meetings. On the other hand, if you are an informatics expert mentoring experimental biologists, keep in mind that their experiments will dictate their schedule. Impromptu meetings may be difficult for them at times. Also be aware that some of their experiments may require that they stay late at night or come in very early. Thus on those days, it might be best to not overburden them with other tasks if you want them done well. 2.5 Conclusion We hope these leadership guidelines will help you to better support your lab team to be as successful as possible! In conclusion, here are some of the take-home messages: Look for collaborators early in the process, particularly those that explain clearly. Take your time and expect delays. Reduce stress by scheduling extra time where possible. Keep organized records of communication and analyses. Recognize that different disciplines require deep expertise. Thus help your mentees find mentors for their respective disciplines if they are different from your own. Advocate authorship for all of your mentees (including first authorship). References "],["informatics-project-guidelines.html", "Chapter 3 Informatics project guidelines 3.1 Identifying good informatics questions 3.2 Informatics project pitfalls 3.3 Informatics project pitfall mitigation methods 3.4 Conclusion", " Chapter 3 Informatics project guidelines 3.1 Identifying good informatics questions Once you have identified your research team, your next step is to start thinking more deeply about the specific informatics questions you would like to evaluate. Be sure to include team members of each discipline in these discussions. 
There are many important considerations to keep in mind when asking an informatics question: We suggest the following steps to take a great scientific question and make it into a great informatically testable question. 3.1.1 Steps for forming questions Start with what you know and determine what is unknown Clarify what is most important to learn about what is unknown. What key information would lead to more understanding? What would be most helpful to know to lead to a new treatment or prevention strategy? What would lead to more questions? Narrow down what is unknown into specific statements based on what you identified as important to know from step 2. Write the unknown statements into specific questions. (Look out for vague phrases!) Make the questions into actionable tests by thinking about what would be measured or observed and ultimately what your variables would be in a statistical test. Make a mock-up of what the data would look like. (Do you have necessary controls?) Evaluate if that actionable test can be assessed with statistical methods and if you have access or can collect the necessary data. Rework as necessary, possibly returning to a different question from step 5. Think about possible biases or confounders. Evaluate if the interpretation of the test would provide the insights that you are interested in. For example, say we were interested in identifying new diagnostic biomarkers for colorectal cancer. Note: this is only an illustrative example. These suggestions are based on that of: Wang (2019). 3.1.2 STEP 1 - Identify what is known and unknown First we would begin by identifying what is known and unknown: Several potential blood-based biomarkers for colorectal cancer have been identified, however many are lacking evidence due to the previous studies having small sample sizes. [source] You might ask how useful are these biomarkers for diagnosing colorectal cancer? 
So now we think about what is unknown: You know the sizes of the previous samples that have assessed these biomarkers and you know the level of sensitivity reported by previous reports. However (assuming just this knowledge for illustration purposes), it is unknown: - How sensitive and specific some of these biomarkers are with sufficient sample sizes. - How collectively these biomarkers help to identify patients with cancer. - Which biomarkers are more important. - Which biomarkers or combinations are particularly useful for determining disease progression or what treatment options might be best. 3.1.3 STEP 2 - Prioritize unknowns Step 2 then involves determining which unknowns are the most important to you. This could be what is more translatable to aiding better diagnostics in a noninvasive way. This could be to better understand cancer progression and what these biomarkers tell us about patient prognosis. Determine what unknowns best fit your interest/expertise. Let’s say that we want to know what is most translatable to aiding diagnostic tests now. 3.1.4 STEP 3 - Write specific statements Step 3 then involves writing out specific statements for what is unknown related to making these biomarkers more useful for tests now. It is unknown how useful many of these biomarkers are individually for the diagnosis of colorectal cancer in larger samples. We do not know if combining these biomarkers together is useful in the diagnosis of colorectal cancer. Perhaps combining these blood-based screens with other screens is useful. You can probably imagine many more statements, but we will keep this example simple. 3.1.5 STEP 4 - Transform into specific questions Step 4 involves transforming these into questions: At what sensitivity rate do each of these biomarkers aid in the diagnosis of colorectal cancer? Does the use of a combination of these biomarkers for the diagnosis of colorectal cancer increase diagnosis rates better than any single biomarker? 
Does the use of a combination of any of these biomarkers with other non-blood-based screens improve diagnosis rates compared to either diagnostic method alone? Look for terms or phrases that are vague in your questions and make them more specific. For example, “How helpful”, “Is it better”. Think about in what way something might be helpful or in what way something might be better. For simplicity purposes we will stick with only the second question. 3.1.6 STEP 5 - Transform into actionable tests Step 5 is to transform questions into actionable tests. For a question to be testable it must meet several requirements. We need to have variables that can be measured or observed. We need to have a variable we can modify or control, and we need to figure out what we cannot control. Now what are our variables, what can we control or observe? We will be observing diagnostic rates of colorectal cancer and biomarker expression, and we can modify or control how many biomarkers we choose to focus on to compare samples. This leads to many questions: Should we compare one biomarker vs all of the biomarkers? Which single biomarker will we choose to compare to or will we look at all of them? Do we have the sample sizes to allow for the statistical power for so many tests? How will we look at the combination of biomarkers? A total score? Will it be additive or something more complicated? For example, we could prioritize some biomarkers over others. These are good questions to ask an informatics expert about. However we are getting to a more testable question. Now let’s really think about what data we would need and what it would look like. Which brings us to step 6 where we create a mock-up of the data. 3.1.7 STEP 6 - Create a mock-up of your data Creating a mock-up of the data can make you ask yourself more questions about what you are asking and what you need to ask that question. 
Would it be that we have blood results for these biomarkers for patients where we know (based on surgical pathology) if they have cancer? What would these blood results look like? Would it be absolute expression levels of mRNA or protein? Do we have a threshold of elevated expression that we can use? Will we assign samples as yes or no in terms of meeting this threshold or will we use an absolute quantity or relative percentage over this threshold? Actually creating a mock-up of what the data might look like can reveal other important aspects that you may not yet have thought about. Thus here is the result of step 6. 3.1.8 STEP 7 - Think about statistical tests Step 7 is then to think about what statistical tests you might perform. Could we use a t-test to compare the scores among the patient groups? Would we want to account for other factors like the patient’s age or gender? Would another test be better? 3.1.9 STEP 8 - Think about interpretation Step 8 is then to think about what this would mean. What would it mean if our results showed a difference in score between the groups? What can we interpret? Do we want to be able to predict patient status? This may involve moving back a step or two. Remember that working with your research teammates can help you to come up with a better research plan before you start collecting data. By involving experts from different domains you can make the most out of your research efforts. We would also suggest that you work with your informatics experts to come up with a biological research question (or set of questions) and a more technical question (or again set of questions) for each project. This can be a good strategy to ensure that everyone in your team gets authorship and that your team is as productive as possible. For this example, your informatics employees or students might write a paper using simulated data or publicly available data to look at methods for creating biomarker scores.
Their studies could better inform you about how to think about testing the utility of colorectal biomarkers for diagnosis purposes. 3.2 Informatics project pitfalls One common misconception is that informatics research projects work out more often or are faster than wet bench experimental research projects. This is however not necessarily true and informatics projects are just as likely to fail and often take more time than one might expect. However, one advantage of having an informatics team member on a project is that there is ample free data available to add to or shift or reframe a study if necessary. This is important to keep in mind when advising your mentees and guiding the planning of their projects. Common reasons why an informatics project might fail: The goals were too vague (see the previous section about identifying good informatics questions). Sadly this happens quite often and it can easily lead informatics employees and mentees down the wrong path. The data is not of high enough quality or lacks consistency. This may be due to a faulty method, methodological differences between lab personnel, expired reagents, temperature differences on data collection days, or aging of a machine over time etc. Some of these issues can be avoided or reduced, while others are unavoidable. Do not be quick to blame your experimental research team members if the data does not look like you expect. Some variation in data is just a part of life. There is not a strong enough signal in the data to detect the effect of interest with the current data/methods. This is also a very common problem if you are not sure what the strength of the effect you are looking for might be (which is often the case in Biology). In this case you need more data or perhaps methods with greater granularity. The method of data collection becomes obsolete. This may not make the project fail per se, but it can make publication difficult. 
Staying on top of what methods are currently being used can help to avoid this. The signal does not exist. Sometimes our hypotheses are just wrong. 3.3 Informatics project pitfall mitigation methods This section is in part based on a book (Robinson and Nolis (2020)) by Emily Robinson and Jacqueline Nolis. We can mitigate some of these project weak points. (You may notice how some of these have been discussed previously.) However some of these are a bit unavoidable and it is best to have realistic expectations and flexibility about backup project ideas. Ways to mitigate project failure: Discuss with experts Discuss with trusted experts across all necessary domains about your informatics hypotheses to make sure they are feasible with the data you have or will generate before you get too far down the research path. Ask for their help to make sure that your scientific questions are not too vague. Do this as early as possible. Diversify projects It is a good idea to diversify your mentees’ and employees’ projects to enable them to have exposure to different projects, as well as more opportunities to contribute to a project that will ultimately result in a product such as an academic paper or a new software package. Safe project planning Make sure mentees and employees have at least one very solid project. For example, assign a review article, a simple software package, or a project with very promising pilot data. Co-authorship Allow lab members (especially mentees) to work together on projects. Assign one mentee or employee as the main personnel, but allow other team members to contribute in small ways to allow them to at least get co-authorship, just in case their main projects fail. Plan for ample time Plan for projects to have adequate time to account for setbacks. For example, if possible plan on the possibility that additional data may need to be collected or perhaps more data will need to be added from a data resource. 
It will take additional time to analyze the new data. Unfortunately, simply plugging in new data to an existing script hardly ever works. Instead the following tasks are required: Check the quality of the new data Reformat/wrangle the new data to match that of the existing data Evaluate how different the new data is from the old data - are they similar enough to be included in a larger analysis or does this require two analyses? Perform the analysis on the new data Adjust and reframe When a project appears to fail because the data turns out to not be adequate for answering your original question, reframe the project to answer a question that the data actually can answer. For example, if the goal of a project was to look for differential gene expression of a single gene and no significant difference is found, consider evaluating the gene expression of a pathway or network of genes that are involved in the same biological process. It is best to be transparent about your scientific process in your publications. Get new data In the worst case that the data does not appear to work for your initial goal and reframing the question does not seem possible, look for new data. Now there are many data resources available online. We have curated a list of cancer research related data with the help of the National Cancer Institute (NCI) Informatics Technology for Cancer Research (ITCR) faculty. These are also good resources for finding cancer related data: - The cBioPortal - This article Keep in mind that using new data takes time. Using an existing script on new data rarely works because data can be formatted differently and have other intrinsic differences. This must first be evaluated to know how to proceed. The following steps are required: Overall we will summarize our suggestions for avoiding project pitfalls. 
3.4 Conclusion We hope that an awareness of how informatics projects can fail, and keeping these mitigation strategies in mind when you are planning your projects, will help you to be more successful with your informatics research endeavors! In conclusion, here are some of the take-home messages: Follow the outlined steps for forming good informatics questions. Especially remember to make a mock-up of what your data might look like for a project. Remember that there are several sources of project pitfalls, some of which are unavoidable at times; however, discussing your plan early with other experts, planning for extra time, and diversifying projects can help. References "],["informatics-relationships.html", "Chapter 4 Informatics relationships 4.1 Cultivating good multidisciplinary lab relationships 4.2 Collaborating with informatics experts 4.3 Employing informatics experts 4.4 Mentoring informatics students 4.5 Conclusion", " Chapter 4 Informatics relationships 4.1 Cultivating good multidisciplinary lab relationships Now that we know a bit more about general practices for maintaining successful multidisciplinary teams and projects, we are going to take a deeper look at how to best support the relationships that we might have in our team. We will also discuss the pros and cons of each type of relationship to better guide you about decisions regarding building your team. 4.2 Collaborating with informatics experts Studies investigating biology research labs over history indicate that collaboration has been on the rise since the 1950s (Vermeulen, Parker, and Penders 2013) and that the rate continues to increase (Sonnenwald 2007). Indeed the size of biology research teams appears to have doubled from 1955 to 1990 (Vermeulen, Parker, and Penders 2013). But why? 4.2.1 The benefits of collaboration Shared cost Research often involves expensive technology, thus it is cost effective to share resources.
Shared expertise Now that technology affords answering in some cases more complex or broader research questions, it is often more effective to employ multiple contributors with different knowledge, skills, and perspectives. Researchers have noted that their own concept of their field changed as a result of working with investigators from other disciplines. Thus this can lead to innovation (Mäkinen, Evans, and McFarland 2020). Shared burden Doing part of the work for a project using the knowledge and skills that you are most comfortable with and seeking help from others who are more knowledgeable on other research aspects can be a more efficient strategy. Shared reliability Including multiple team members who can each evaluate the research can improve the reliability of a project, as mistakes can be found by other members. Shared credibility Collaborations involving experts of multiple areas can improve the perceived credibility of the work by others. 4.2.2 Potential challenges There are always challenges when collaborating with others, but some of these are particularly enhanced in multi-disciplinary teams. Here are some challenges that you may encounter when a collaboration involves informatics experts. Bad collaboration: Communication Differences Extra care needs to be taken to ensure that communication across groups is effective. Typically researchers will not meet as often with a collaborator as they would with an internal team member. Therefore, poor communication in a collaboration can lead to more costly misdirection and thus wasted time and effort. Furthermore, as investigators often have different backgrounds, differences in jargon and language can make communication more challenging. Having internal team members with some familiarity with informatics can be very beneficial for translating discussions with collaborators who are informatics experts. One solution to this is to have trainees work in both labs. 
This can be especially beneficial for the trainee who will become accustomed to two research styles and will learn a diverse set of skills. This allows the trainee to potentially have their own multi-disciplinary lab in the future (Mäkinen, Evans, and McFarland 2020). Another important method that can help resolve this issue is to have members provide educational seminars for participating members about the fundamentals of their work. Different research style and goals Beyond differences in language, differences in research style and goals can lead to conflict. “Scholars’ different styles of thought, standards, research traditions, techniques, and languages can be difficult to translate across disciplinary domains” (Mäkinen, Evans, and McFarland 2020). Making clear research standards and goals, as well as outlining clear specific tasks at the beginning of a project can help to avoid this issue. Furthermore, meeting consistently throughout the duration of a project can also help to make sure that standards are maintained. Additionally, these meetings should include discussions about intellectual property, authorship, leadership, and defining what success looks like to each of the various members. Defining these details early can avoid major conflict later. Furthermore, it is critical to keep in mind the diversity of career goals of research team members, as junior team members may have a challenging time persuading others of their independence and contributions when they work on largely collaborative projects. It is also necessary to ensure that junior members have time to devote to their own research programs (Sonnenwald 2007). Support should be provided for these junior collaborators by more senior collaborators. Different capabilities Research on multi-disciplinary collaborations has revealed that when collaborating members are unclear about how their expertise and work contribute to the project, they are less motivated and feel less valued.
Working with members of different backgrounds to determine how their expertise can contribute to the project, as opposed to simply assigning them a task, will not only help with morale, but it can also better define how a collaborator can further contribute to a project in ways that you may not already expect (Mäkinen, Evans, and McFarland 2020). Reduced sense of responsibility Another concern of collaboration is that team members may feel less responsibility or commitment to a project than for a project within their own lab. Defining tasks and expected due dates can help reduce this issue. Discussions to establish due dates should always include team members with expertise in each area of science, as tasks may not take the amount of time that another researcher would expect. It is a common misconception that informatics tasks take less time than they actually do. Research is dynamic Research always has an element of trial and error. Protocols may change and new scientific questions may emerge. Frequent meetings with all group members to understand the dynamics of the project are critical. Furthermore, flexibility and understanding are required. It should be expected that aspects of the project will change. Different levels of resources Particularly when collaborating with community members, community colleges, and institutions that are “Equity-oriented” and serving populations that have historically been marginalized or “minoritized” (Blake 2017), it is important to keep in mind that large differences in resources may exist between collaborating members. Sharing and discussing budget information early and often can help research members to understand what expectations are reasonable and how collaboration partners may best assist one another.
It is also important to recognize that: “There is a common misconception that the lack of physical experimentation and laboratory supplies makes analysis work automated, quick, and inexpensive” (Carpenter et al. 2021). However: “In reality, even for well-established data types, analysis can often take as much or more time and effort to perform as generating the data at the bench. Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project” (Carpenter et al. 2021). Don’t forget to provide some budget for your informatics collaborators, as their time ultimately does cost money and there may be computational costs that you may not be aware of. 4.3 Employing informatics experts In contrast to collaborating with informatics experts, in some cases it may be beneficial to directly employ them on your team. There are again pros and cons for this strategy. By directly employing informatics experts, rather than collaborating with an expert, research leaders will be able to meet with these experts more often. Research leaders may also have more sway in terms of guiding the direction of the experts’ work. Leaders can also potentially grow the informatics part of their research program more readily, leading to even more flexibility in the research questions that they may be able to assess. However, direct employment of informatics experts requires all of the typical responsibilities and costs of employing another lab member. It also requires the additional resource requirements for the informatics work of the particular expert. In addition, it is useful to become familiar with best practices for ethics, reliability, and reproducibility in computational work. This requires some different tactics than those of experiment-based research (often called “wet lab” research).
Although it is also useful for informatics experts to keep track of the work that they have performed in general, similar to maintaining notes about experimental research with a lab notebook, a much deeper level of detail can be tracked and maintained for computational work. What we mean by this is that the actual code and data used in their work can be saved over time. This can be invaluable for research reproducibility. Thus research leaders are advised to become familiar with best practices for data sharing and data management so that they can most effectively manage their informatics employees. This is also discussed in more detail later in the course. One other important thing to remember is that informatics work is often best performed with long stretches of uninterrupted time. This will be true for your informatics employees and mentees. Again we suggest that you work with your informatics teammates when you schedule lab meetings and be mindful of their other time commitments. Try to support them in scheduling several hours of uninterrupted time a day if possible. As a reminder, unless you employ a senior informatics expert (and even then), it is advisable that you encourage these employees to build supportive relationships with other informatics experts, particularly if they are working in a new domain. 4.4 Mentoring informatics students Mentorship is a particularly unique relational experience. While traditional mentorship has been defined by the hierarchical structure of a single mentor who teaches subordinate mentees, new styles have emerged that are not as constrained or limited as the traditional paradigm. At its optimum, mentors and mentees should learn from each other and together, and expand what each can do alone. Importantly, the more traditional paradigm, which does not value “reciprocal learning” as highly, has been shown to be less effective for a larger diversity of students (Mullen and Klimaitis 2021).
For research groups that are newer to informatics, some of these less traditional paradigms may be especially useful; we will focus on a few here. 4.4.1 Co-mentoring/collaborative/team mentoring As we described earlier, co-mentoring or collaborative mentoring of students by multiple mentors with different backgrounds can be particularly beneficial to the student and also to the partnering labs. In the case of collaborative mentoring where a mentee is mentored by two research experts in two different labs, this provides an opportunity not only to strengthen a collaboration, but also for students to gain more diverse knowledge, and to in turn provide more of the expertise that they gain back to both labs. Co-mentoring could also occur within the same lab by a research leader and an informatics expert. This could also work well in a multilevel paradigm, where an informatics expert may guide informatics related aspects of research, while an overarching research adviser may guide the student’s overall research mentorship experience. 4.4.2 Peer mentoring Peer mentoring also provides great opportunities to expand students’ expertise and skills without as many time constraints for the research leaders of a lab, particularly for skills that may be new to lab leadership. Furthermore, such paradigms are helpful for improving students’ teaching skills, collaboration skills, self-reliance, and self-confidence. Teaching a peer is often useful for helping students identify gaps in their own knowledge and assisting in their quest to “learn how to learn” (Mullen and Klimaitis 2021). Furthermore, such paradigms appear to be especially beneficial to students of historically marginalized populations (Mullen and Klimaitis 2021). However, there are challenges for research leaders from a management standpoint. Mentors should be mindful of any conflicts that may arise between students.
These can often be avoided with clear and distinct goals and projects for students, to avoid making students feel like they are competing with one another. Additionally, we highly recommend establishing a code of conduct for the lab, so that students and staff members are clear about what behavior is expected. 4.4.3 Electronic mentoring With the COVID-19 pandemic, the transition to using electronic means of contact with students and staff for research has expanded on an unprecedented scale. It is unclear currently how much this will continue in the future. However, research prior to the pandemic has shown some surprising benefits of providing mentorship through electronic means. Importantly, it appears that this eases burdens for students who are balancing course work, as it often provides more scheduling flexibility. Additionally, such mentorship is particularly helpful for historically marginalized populations who may face more hostility by going to research institutes with face-to-face interaction with others or may have additional scheduling conflicts. Even as we may return to more on-site research labs, additional availability of mentors to mentees using electronic means of contact is likely to be beneficial. Technology such as Slack can be especially useful for allowing lab members to interact with one another. We will cover more about this soon. 4.4.4 Career goals The job landscape for scientists has changed in recent decades with more opportunities outside academia in industry and government. Furthermore career goals for informatics mentees can be very different from those of other research mentees. By having informatics expertise, these trainees have additional career opportunities. Becoming aware of these opportunities yourself, as a research leader, is therefore critical for cultivating your mentees’ awareness of the diversity of opportunities available to them. This will ultimately allow your mentees to choose the career path that suits them best.
4.4.4.1 Career paths for informatics mentees Academia - Your informatics mentees may have career opportunities as principal investigators, scientists, or educators just like other cancer biology mentees. In addition to opportunities as educators for informatics and biology, they will also have opportunities for data science. Government - Your informatics mentees may have career opportunities as scientists or policy makers for research institutes just like other cancer biology mentees. However, additional agencies and institutes may have a need for their data science skills on topics outside of biology. For example your mentee may have the skills to work for a city police department. Industry - Beyond the potential career options in the pharmaceutical industry, biotech, and medicine, your informatics mentees will have data science skills that may qualify them for jobs in a variety of industries. For example your informatics mentees could find jobs at companies such as Stitch Fix or Ancestry which use methods in machine learning and bioinformatics for their products. Additionally, your mentee may also have opportunities to join a software company as a computer programmer or even as a programming educator at a company like RStudio. Nonprofit - Beyond research and management positions at nonprofits performing scientific or clinical research, informatics mentees may have opportunities at other nonprofits with other types of goals. For example, your mentee might find work at a nonprofit that advocates for civil rights and investigates social interactions on social media platforms. 4.4.4.2 Career paths outside of academia If your mentee is more interested in a career path outside of academia we suggest you read up about industry perspectives on useful skills and knowledge, so that you are better prepared to guide your mentees to get exposure and experience to the data science domains or aspects that would be most helpful to them. 
According to Brandon Rohrer, a data scientist who formerly worked at Facebook and now works at iRobot, there are 4 major categories of knowledge and skills for data science: Data Analysis - domain knowledge, research skills, and interpretation skills Data Modeling - machine learning application skills and algorithm development skills Data Engineering - data management skills, skills to make code production-level ready (ex. automation), and software engineering Data Mechanics - data formatting and cleaning and data handling (filtering, subsetting) Based on these categories, he says that there are also 3 major archetypes (with subdivisions) for data scientists: Beginner data scientists These are individuals just starting out but who are familiar with each of the 4 above areas. This is ideally how your mentee will be after training (at a minimum) if their goal is to pursue a data science career. This will allow them to identify their strengths. Generalist These are individuals who are proficient in all areas. This is helpful for becoming a data science manager or executive. Focus on all 4 aspects of data science is also good for mentees who wish to stay in research!
Specialists There are 3 major subtype specialties: Detective - strong skills in data analysis and mechanics and exposure in all 4 areas - may be especially useful for working at nonprofits Oracle - strong skills in modeling and mechanics - this is great for working at companies using machine learning Maker - strong skills in mechanics and engineering - this would make your mentee valuable for working in any of the nonacademic fields as well as academia Check out this video for more details: From another perspective, the major skill sets to focus on according to the “Build a Career in Data Science” book by Emily Robinson and Jacqueline Nolis (Robinson and Nolis 2020) are: Statistics Machine learning Programming (Python and R) Projects - hands-on experience Promoting your mentee’s exposure to each of these domains can only help them further pursue a career in data science. We also suggest that your mentees check out this blog post on surviving data science interviews, also by Brandon Rohrer. We think this could be helpful for mentees pursuing any data science career path; however, it is especially useful for those interested in industry. 4.4.4.3 Academia for informatics mentees So how do we best support informatics students that want to stay in Academia? Recognize that academic promotion for informatics experts is currently not very accommodating. Some aspects of traditional academic promotion are simply not set up to accommodate informatics experts. For example, these individuals tend to publish many more secondary author publications and create software which can reduce the time they have available for first author publications. However, first-author (or last-author) papers are still most often used as the major guideline for academic achievement. How can you and your mentees overcome this? Be sure to especially advocate for your student to get first authorship papers if they intend to stay in academia.
Wherever possible also try to advocate for more nuanced academic promotion policies that account for multidisciplinary differences at your institution. Encourage your student to seek specialized technical skill sets For your students who wish to stay in academia, it may be less important that they become as generally familiar with a wide variety of data science skills and practices as students interested in a career outside of academia, if they are for example focusing on a specific statistical method. Just like other academic fields, informatics experts will become experts in niche subject areas. Encourage these students to go to targeted conferences to build a network in their field of interest (although we still encourage you, if possible, to allow your students to get a well-rounded exposure to different types of conferences). Also especially encourage these students to learn about grantsmanship just as you would with your other academic mentees. However this can also be useful for students interested in working for a nonprofit or for the government. See Carpenter et al. (2021) and Waller (2018) for a more in-depth discussion and suggestions on how we can work to reform academic promotion practices to be more mindful of disciplinary differences for informatics experts. 4.4.4.4 Authorship considerations In addition to typical biological papers, there are also other companion types of papers that your informatics mentee can publish, including: Data resource papers - your mentee may publish an article introducing a data resource Software papers - an article where the functionality and development of a piece of software is discussed Method comparison papers - your mentee may compare existing methods New method or pipeline papers - your mentee could describe a new method that they created These other types of papers are especially good to keep in mind if your mentee will not be first author on a biological paper from your lab.
When guiding your mentee through the publication process, it is a good idea to keep in mind their career goals as you prioritize different paper ideas. For example a mentee that is interested in pursuing a data engineering career may benefit more from a software paper, while a mentee interested in staying in academia would benefit more from a new method paper or a biological paper if possible. Most importantly, remember that informatics mentees need first author publications too! 4.5 Conclusion We hope that these tips help you to mentor and lead your team in a more productive and effective way that benefits both your team members and your lab’s mission! In conclusion, here are some of the take-home messages: There are many challenges associated with collaboration, however early and regular communication can help. It is also helpful to outline expectations early on. Becoming familiar with best practices for ethics and reproducibility is helpful for employing a computational biologist, especially if you are new to computational biology. Consider learning a bit of code to better understand what your computational employees are doing if you yourself are not already familiar. Remember that mentoring is a reciprocal process. There are many alternative strategies for mentoring that you may find helpful. Remember to discuss career goals with your mentees. The way you advise them should be driven by their interests. Organize projects so that your team can produce biological and computational manuscripts. References "],["promoting-diversity-equity-and-inclusion.html", "Chapter 5 Promoting diversity, equity, and inclusion 5.1 Diversity is beneficial 5.2 Underrepresentation in cancer informatics 5.3 Examples of contributions by individuals of underrepresented groups 5.4 Underrepresentation in clinical trials 5.5 Strategies to promote more equitable inclusion in clinical trials 5.6 What does it take for clinical research to be ethical?
5.7 Research practices to reduce cancer health disparities 5.8 Ways to better support a more diverse research team 5.9 Conclusion", " Chapter 5 Promoting diversity, equity, and inclusion 5.1 Diversity is beneficial Beyond the critical importance of giving everyone more fair opportunities (which cannot be overstated), there are many additional crucial reasons why diversity is particularly important for science and health research. The inclusion of diverse research team members may promote more inclusive research questions and practices to help the populations that need better health care the most. This is especially important as many of the historically marginalized racial and ethnic populations of the United States are growing. By 2045 it is expected that these groups will make up more than 50% of the population (Clark et al. 2019). “Racial/ethnic minority groups in the United States are at disproportionate risk of being uninsured, lacking access to care, and experiencing worse health outcomes from preventable and treatable conditions” (Jackson and Gracia 2014). “…Compared with the general population, racial/ethnic minority populations have poorer health outcomes from preventable and treatable diseases, such as cardiovascular disease, cancer, asthma, and human immunodeficiency virus/acquired immunodeficiency syndrome than those in the majority” (Jackson and Gracia 2014). “… the social environment in which people live, learn, work, and play contributes to disparities and is among the most important determinants of health throughout the course of life” (Jackson and Gracia 2014). More diverse research teams may be more aware of the cultural differences and social determinants that may influence the health of the people that the research could serve. Such consideration could further increase the impact of the research. The inclusion of diverse populations in scientific teams has also been shown to improve innovation (Hofstra et al. 2020). 
“Scholars from underrepresented groups have origins, concerns, and experiences that differ from groups traditionally represented, and their inclusion in academe diversifies scholarly perspectives. In fact, historically underrepresented groups often draw relations between ideas and concepts that have been traditionally missed or ignored” (Hofstra et al. 2020). 5.2 Underrepresentation in cancer informatics Despite the benefits of diversity in research teams, analyses of scientific article authorship indicate that women are underrepresented in computational biology (Bonham and Stefan 2017) and biomedical engineering (Aguilar et al. 2019). Furthermore, analyses of university faculty and students demonstrate that both women and historically marginalized populations (such as certain racial and ethnic groups, disabled people, and LGBTQ+ individuals) remain underrepresented in science, technology, engineering, and mathematics (STEM) fields in the US and in Europe (Hofstra et al. 2020; Chaudhary and Berhe 2020; Kricorian et al. 2020; Cech and Waidzunas 2021; Iporac 2020; Williams and Shipley 2018). Although Black people and people of Hispanic or Latin American origin or ancestry made up 27% of the workforce in the US in 2016, together they only made up 16% of the STEM workforce. People of tribes and nations that are often called Native American, Indigenous, or Indian Americans are also underrepresented in STEM (Williams and Shipley 2018). Although they made up 1.7% of the population in 2016 in the United States, they only represented 0.6% of individuals with bachelor’s degrees in science and engineering and only 0.2% of individuals with doctoral degrees in these fields (Williams and Shipley 2018). Furthermore, all of these groups are particularly underrepresented in academia at the faculty level (Vargas, Saetermoe, and Chavira 2021).
Importantly, although Asian Americans are generally considered well represented in STEM, this is only true for some fields and for those of ancestry or origin from certain Asian countries, such as Japan, China, Korea, and India, while those from other countries such as Cambodia, Laos, and others are often underrepresented. Indeed the term Asian American encompasses a very diverse group of people with diverse experiences and socioeconomic statuses (CARE, n.d.; nsf.gov n.d.; G. A. Chen and Buell 2018; Iporac 2020). Furthermore, there are issues related to fair support and treatment for promotion and advancement for many marginalized groups, as well as for Asian Americans that are considered well represented (Funk and Parker 2018). “This ‘other-ness’ exists intentionally or unintentionally between those of a minority and those of a majority from lacking of common cultural background. Relationships at work appear polite on surface but reluctant tendency in willing to share limited opportunities the same way, which I felt in a previous job where whites and males were overwhelmingly a majority.” – Asian woman, engineer, 56 (Funk and Parker 2018). “There are not many people of my race in [my] industry. It requires me to go the extra mile to fit in or be accepted because many of the employees don’t share my background or life experiences. I can do the job just fine, however, there are other factors of one’s life that are considered whenever they are in a critical and highly competitive environment.” – Black man, systems administrator, 30 (Funk and Parker 2018). The reason for this underrepresentation is that people have historically not been allowed to participate, been discouraged from participating due to discrimination, had a lack of visible role models of their background, and have not had the same access to education.
(Funk and Parker 2018) Part of the reason that role models were not as visible as they could be is that even when individuals of underrepresented groups overcame these adversities and made contributions to science, their contributions were often covered up or not fairly credited (Magazine and Dominus n.d.). 5.3 Examples of contributions by individuals of underrepresented groups Despite very often facing discrimination and less support, many individuals of these underrepresented groups have greatly contributed to scientific achievement, which demonstrates the potential innovation that could be achieved when research teams are more diverse. Many Black scientists are responsible for great human achievements that changed the world including: Katherine Johnson (1918–2020): Katherine Johnson was a mathematician who helped enable humans to venture into space. Katherine worked for NASA in the 1960s during the Space Race. A depiction of her work and some of the discrimination that she faced while working for NASA is featured in the film Hidden Figures. Vivien Thomas (1910–1985): Vivien was a pioneer in heart surgery at Johns Hopkins in the 1940s when it was considered taboo. He is credited with assisting in creating a life saving surgical operation that improved the oxygenation of children with congenital heart defects. This operation was pivotal in paving the way for other heart surgical procedures. Sadly, it took more than 25 years for him to be credited for this work. See this article for more information about Katherine Johnson and other pioneering and world-changing Black female scientists. Also check out this list of inspiring Black scientists today. Scientists and mathematicians of Latin American or Hispanic origin have also greatly contributed and continue to contribute to scientific innovation: Mario Molina (1943–2020): Mario was born in Mexico and came to the United States for graduate school.
He was a chemist and environmental scientist and was awarded the Nobel Prize in chemistry in 1995 for his work in discovering the environmental impact of chlorofluorocarbon (CFC) gases on the Earth’s ozone layer. This work was critical in changing policies (starting in the 1980s) to protect our environment from these chemicals which were widely used as aerosols, solvents and refrigerants. Through these policies, many countries around the world have banned or greatly reduced the manufacturing of these chemicals. NASA projections suggest that the ozone layer would have been largely depleted by 2060 if these chemicals were manufactured at previous historical rates. See here for more information on what would have likely happened to the world if CFCs were not banned. Also check out this podcast episode for more information. Ynes Mexia (1870–1938): Ynes began her career as a botanist in her fifties in the 1920s. She traveled widely (often by herself) collecting and characterizing plants. She unfortunately passed away at 68 cutting her botany career short to less than 20 years. However, in that time she collected over 145,000 specimens and made numerous discoveries. Her legacy in botany had such an impact that the Mexianthus genus is named after her. She is also highly regarded for her efforts in environmental conservation, particularly for helping to preserve the redwood forests in the United States. This adventurous woman has been quoted for saying: “I don’t think there’s any place in the world where a woman can’t venture.” See this video for more information about her inspiring life. Also check out this list for additional historic scientists and this list for some current inspiring scientists of Hispanic or Latin American ancestry or origin. Individuals from the various indigenous or native tribes of the United States have also contributed greatly to scientific understanding and innovation. 
See here for a list of some, in addition to: Fred Begay/Fred Young/Clever Fox (1932–2013): Fred was a nuclear physicist whose work was instrumental in innovating alternative energy sources at the Los Alamos National Laboratory. He was born on the Ute Mountain Reservation. His parents were from the Navajo or Diné and Ute tribes. His life and work are featured in a documentary called The Long Walk of Fred Young. Fred spent a great deal of time on outreach and education, particularly for youths of the Navajo Nation. He is also known for making connections between Navajo beliefs and science. He’s been quoted as saying: “I think the key point is that I learned to think abstractly and develop reasoning skills when I was growing up, learning about lasers and radiation in the Navajo language… That’s all embedded in our religion” (“Fred Begay PhysicsCentral” n.d.). “We strongly rely on natural phenomena. We believe we’re children of nature” (“Fred Begay PhysicsCentral” n.d.). “The Navajo has mysterious ideas about science which cannot be interpreted into English” (“Fred Begay PhysicsCentral” n.d.). See here to learn more about Fred. Floy Agnes Lee (called Aggie) (1922–2018): Floy was a biologist who grew up in New Mexico and her father was from the Santa Clara Pueblo or Kha’po Owingeh. Her work at the Los Alamos National Laboratory, Argonne National Laboratory, the University of Chicago, and Jet Propulsion Laboratory helped expand our understanding of the effects of radiation on living cells. You can see an interview of her at this link and see a written translation of this interview here. Edna Paisano (1948–2014): Edna was a demographer and statistician who was born on the Nez Perce Reservation in Idaho. Her family was from the Nez Perce tribe and the Laguna Pueblo.
While working at the United States Census Bureau in the 1980s, she discovered that many Native, indigenous, Indian or tribal communities were being undercounted and that this was reducing the amount of government resources and services potentially available to these communities. She was instrumental in improving the counting of individuals from various tribes in the United States Census by initiating a public information campaign. She also published several books throughout her career. See here for more information about Edna. Also, check out this list for information about current native or indigenous people in STEM. There are also numerous LGBTQ+ scientists who changed the world. See here and here for a review of several including: Alan L. Hart (1890–1962): Alan was a clinician and researcher whose work largely focused on tuberculosis. Alan was assigned female at birth and later transitioned to be male in adulthood. Despite facing major obstacles due to discrimination, Alan had a very successful career as an expert in radiology for tuberculosis and served as the director of hospitalization and rehabilitation at the Connecticut State Tuberculosis Commission. Alan also wrote several novels. See here to learn more about Alan’s life. In addition, this list includes many LGBTQ+ people currently working in STEM. These lists are neither complete nor fully inclusive, but shed some light on the amazing achievements of some individuals in science who were or are among groups that remain underrepresented in STEM. Their unique life experiences, perspectives, intellect, and talent helped shape their work which has greatly improved our world. Again, this demonstrates how potentially even greater innovation and scientific advancement could be achieved when including more diverse individuals in our research teams.
5.4 Underrepresentation in clinical trials Beyond underrepresentation of certain populations within research teams, there is also a lack of representation of various populations in clinical trials (in particular women and people of specific racial and ethnic groups) (Clark et al. 2019). This is particularly true for cancer clinical trial studies (Nazha et al. 2019). Furthermore there is also limited data collected about LGBTQ+ individuals (B. Chen et al. 2019). Importantly this underrepresentation is not due to certain populations being hard to reach, but instead to a lack of practices that enable and encourage individuals from more diverse groups to participate. According to an FDA study in 2018, Black people represented 13.4% of the population, yet only 5% of clinical trial participants, while Hispanics and people of Latin American ancestry made up 18.1% of the population at the time, but less than 1% of clinical trial participants (Coakley et al. 2012). Furthermore, when looking at oncology trials, whites (80% of participants) and males (59.8% of participants) continue to be overrepresented, according to a study in 2019 (Nazha et al. 2019). This lack of diversity in clinical trials is a problem because it results in less understanding about how particular patients will fare with a given treatment/vaccine/diagnosis etc. This lack of knowledge can result in furthering inequities in medical care (Clark et al. 2019). Demographic studies of the United States demonstrate that racial and ethnic populations that have been historically underrepresented in clinical trials and in STEM have grown and are predicted to make up 50% of the population by 2045 (Clark et al. 2019). Thus for medical care to best serve the current and future population of the United States, trials especially need to be more representative of the diverse groups that make up our country.
Research suggests the following reasons for the lack of diversity in clinical trials: 5.4.1 Historical injustice According to studies, marginalized groups such as Native Americans, people of Hispanic or Latin American ancestry, and Black Americans understandably distrust researchers at higher rates than white individuals due to historical injustices and inadequate medical care. (Griffith et al. 2020; Mastroianni, Faden, and Federman 1999) “African Americans’ suspicions and fears that many sectors of American society are not trustworthy were logical and accurate interpretations of their perceptions and experiences” (Griffith et al. 2020). Examples of historical injustices include: Tuskegee syphilis trial: A study in Tuskegee, Alabama about the outcomes of untreated syphilis in Black males (1932-1972) in which the patients were told they were being treated but were in fact not being treated (McVean 2019). “The Unfortunate experiment”: A study in New Zealand (1966-1987) in which women diagnosed with a precursor to cervical cancer were also similarly studied but patients were not offered standard treatments, they were not informed of their diagnosis, and they were not told that they were a part of a trial, nor given the opportunity to decide if they wanted to be a part of the trial (Evans 2018). Henrietta Lacks and HeLa Cells: In 1951, a patient named Henrietta Lacks was treated for cervical cancer at Johns Hopkins. Her cancerous cells turned out to be uniquely capable of surviving and reproducing and have been used widely in research for decades for many discoveries. Her family did not receive money from the companies that profited from her cells, and for decades her family was often not asked for consent as doctors and scientists revealed her name and medical records publicly (“Henrietta Lacks: Science Must Right a Historical Wrong” 2020). 
“I want scientists to acknowledge that HeLa cells came from an African American woman who was flesh and blood, who had a family and who had a story,” her granddaughter Jeri Lacks-Whye told Nature (“Henrietta Lacks: Science Must Right a Historical Wrong” 2020). Sexually Transmitted Disease (STD) experiments in prisons and Guatemala: This (1946-1948) study of STDs started by infecting prisoners in a US prison in Indiana but later moved to Guatemala. Overall this study involved infecting many vulnerable populations with STDs, including children, prostitutes, mentally ill patients, prisoners, indigenous people of Guatemala, and soldiers. “…health officials intentionally infected at least 1308 of these people with syphilis, gonorrhea, and chancroid and conducted serology tests on others” (Rodriguez and García 2013). Radioactive iodine thyroid studies of Alaskan natives: In the 1950s, a study was conducted to examine the thyroid gland in Alaskan natives, in which participants were given doses of radioactive iodine that exceeded current recommendations. Beyond the riskiness of the dose of radiation, this study was unethical due to inadequate translation of research methods and consent forms, thus participants were not able to be properly informed about the risks of the study and therefore not able to provide proper consent. Furthermore, this also led to the inadequate exclusion of participants for pregnancy, lactation, and other conditions (Hodge 2012). Importantly, this is only a small subset of examples. Many other individuals among a variety of populations such as those who were imprisoned (Rodriguez and García 2013), economically disadvantaged, those with mental disabilities, children, the elderly, those with mental illness (Millum 2012), and those among other groups have also been abused by unethical research practices (Park and Grayson 2008).
This pattern of abuse and neglect has in many cases directly impacted individuals and their families or communities and has unsurprisingly made many people wary of participating in clinical research (Griffith et al. 2020). “We, as African Americans were always the pilot test… I know because people have experienced it within my line, my family, my bloodline in [city name]. Yeah.” He continued by saying, “So I just think that we’ve been used and misused a lot of times within the African American community and in the lower parts of the devastation where the devastation lies” (Griffith et al. 2020). “I’m thinking about horror stories like the Tuskegee experiment and things like that. Like stuff that for me, were mentioned back in school and were mentioned by my family members…” (Griffith et al. 2020). 5.4.2 Health care inequity Beyond such examples of extremely unethical studies and practices, marginalized populations such as people of certain racial and ethnic ancestry have also historically received poorer access to health care and poorer quality health care, thus discouraging their engagement with the health care system (Griffith et al. 2020; Mastroianni, Faden, and Federman 1999; FitzGerald and Hurst 2017; Hodge 2012). Many racial and ethnic groups in the US were historically more likely to be uninsured compared to white people and although coverage rates have been better in more recent history, unequal rates remain to today (Damico 2021). [source] Even with similar access to care (which can be reduced by sociodemographic factors), people of marginalized racial and ethnic groups still have a history of receiving inferior health care (Griffith et al. 2020). These two issues among others have led to health disparities, including for cancer: “An analysis of 1 million patients with cancer in the United States showed that blacks or African Americans have a 28% higher cancer-specific mortality compared with whites. 
This survival gap is independent of sociodemographic factors, disease stage, and access to treatments. Indeed, disparities exist along the continuum of care, from screening, access to care, and referral to subspecialty centers, to enrollment in clinical trials that define new approaches to cancer treatment” (Nazha et al. 2019). 5.4.3 Poor recruitment and inclusion Due to concern about the time and effort required to find a more diverse set of participants and also due to implicit bias, many populations are not recruited at the same rate. Implicit bias is an unconscious negative evaluation of a person based on characteristics that are irrelevant (FitzGerald and Hurst 2017). “Based on the available evidence, physicians and nurses manifest implicit biases to a similar degree as the general population. The following characteristics are at issue: race/ethnicity, gender, socio-economic status (SES), age, mental illness, weight, having AIDS, brain injured patients perceived to have contributed to their injury, intravenous drug users, disability, and social circumstances” (FitzGerald and Hurst 2017). Sadly, this can result in researchers not including certain populations in their trials. Indeed healthcare professionals have been shown to withhold care, such as treatment based on their own bias about if a particular individual would adhere to the protocol. Thus marginalized groups often aren’t even recruited for clinical trials (Yates et al. 2020). Studies of attempts to regulate oncology trial participation to be more inclusive, show that still 48% did not meet target recruitment goals for recruiting underrepresented populations (Yates et al. 2020). 
5.4.4 Inadequate researcher cultural competency and diversity In a study of clinical trial participants of marginalized groups, participants stated that they were more likely to trust a researcher if they seemed scientifically knowledgeable but also had an understanding of the history and context of the study population, as well as experience working with that population (Griffith et al. 2020). “In addition, participants noted that the enthusiasm, commitment, or passion of the researcher to help the population of interest and to study the topic of interest influenced how trustworthy the researcher appeared to be” (Griffith et al. 2020). Participants also stated that they feel more trusting and more willing to participate in trials with a researcher of a similar background. Thus, improving the diversity of research teams and supporting underrepresented investigators may help to recruit more diverse participants for clinical trials (Yates et al. 2020). In addition, cultural competency training and Diversity, Equity, and Inclusion (DEI) training can help! See here for additional cultural competency training resources. 5.4.5 Barriers of access In a 2019 study, access to oncology trials was found to only be available to about 27% of US cancer patients (Nazha et al. 2019). There are many barriers to this access: Language: One major barrier is translation of recruitment materials to other languages. Scheduling: Another barrier can be the time at which participants are needed to be in person for a clinical trial. Some individuals need more accommodation if for example they can’t come to appointments during work hours due to having a job with an inflexible work schedule or due to caretaking responsibilities. Transportation & location: If it is more time consuming to get to appointments due to a need for public transportation or because an individual lives in a different area of town this can inhibit participation and retention of participants.
Health literacy: For individuals who don’t have the time, education, or other means necessary to learn about the importance of specific health interventions or preventative practices for example, they may not be recruited for trials as well as individuals that are regularly concerned with their health and spend leisure time investigating information about their health (Yates et al. 2020). 5.4.6 Lack of community engagement funding There has been historically a lack of funding to support equitable engagement of community stakeholders. However, community engagement can build trust, reduce stigma, increase access, recruitment and retention of more diverse participants for clinical trials. Recent projects in which participants are recruited through community engagement, such as Project Brotherhood (a community based project, that provides Black men in Chicago with health education within their communities) have been shown to help increase health screen participation. 5.4.7 Other aspects of funding Most clinical trials are funded by the pharmaceutical industry which often leads to pressure for a more homogeneous population to improve the opportunity to see effects in patients without confounding patient-specific factors. However, trials that are funded by the NIH include more diverse participants that are more representative of the true population of patients with cancer in the US (Yates et al. 2020). 5.5 Strategies to promote more equitable inclusion in clinical trials Provide adequate financial, logistical, and time support Investigators should consider how to support caregivers who have important obligations, such as tending to their family members or others. For example when possible, perhaps home visits would be feasible. Other strategies include providing flexible hours for trial visits and support for transportation, such as payment for a rideshare service or taxi (Yates et al. 2020). 
Researchers should also consider providing patients with a cell phone to improve communication (Clark et al. 2019). Finally, participants may also need extra assistance to cope with healthcare needs following participation in a study, particularly if this interferes with their typical obligations. More promotion of health care access in general Individuals have been shown to be more likely to participate when advised by a health care provider (Clark et al. 2019). Increasing health care system access in general could improve clinical trial participation, by increasing opportunities for health care providers to interact with individuals who could participate in a trial. More promotion of community engagement Community stakeholders can help to assess if recruitment information is culturally appropriate and can help with creating more equitable recruitment strategies. More inclusive research teams As stated earlier, individuals have been shown to be more trusting when research teams include members that have a similar background to themselves (Griffith et al. 2020). More inclusive practices When obtaining information about gender, researchers should collect more informative information such as the model proposed in B. Chen et al. (2019). This involves collecting more information about non-binary individuals, for example providing response options to questions about gender so that an individual who was assigned female at birth, but is now male can adequately provide this information. Similarly, more detailed information about ancestry (for example providing more specific Latin American ancestry options as opposed to just Latin American) could also lead to more informative findings about specific populations. If possible aim for funding for clinical trials from an institute that is supportive of the recruitment of more representative samples. 5.6 What does it take for clinical research to be ethical? 
Here you can see a table of requirements for ethical research trials (Emanuel, Wendler, and Grady 2000). Such consideration can help to ensure that participants are treated with respect and integrity and with their well-being as the top priority. 5.7 Research practices to reduce cancer health disparities According to a recent article (Zavala et al. 2021) about cancer research in the US, the following is suggested to reduce cancer health disparities: Further develop and sustain large diverse cohorts that collect multidimensional/multilevel data. Diversify germ-line and tumor genetics/genomics databases and clinical trials. Develop diverse cell lines and patient-derived xenograft models. Implement system changes in healthcare coverage to guarantee equity in access to high-quality screening and access to treatment. Improvement and system-wide implementation of patient navigation programs. Employ culturally tailored community awareness and education programs to increase cancer screening (including genetics) and modify risk behaviors. Implement legislation that supports behavioral interventions (e.g. limit the sales of tobacco products). 5.8 Ways to better support a more diverse research team In order to best support and encourage mentees and employees of underrepresented groups in cancer informatics, we suggest that lab leaders do the following: 5.8.1 Seek additional training about disparities in informatics and STEM careers Especially focus on hindrances to achievement such as attitudes, biases, and stereotypes. Also, become aware of stereotype threat (also called stereotyped inferiority) - “an internal feeling and concern about confirming a negative stereotype associated with a group (e.g., racial, ethnic, gender, and age) with which the individual identifies” (Stelter, Kupersmidt, and Stump 2021) and how they might influence your mentees. 
Here is a great video of Russell McClain at the University of Maryland that introduces how implicit bias and stereotype threat impact higher education: Note that you may not be aware of all the barriers of achievement that your mentees may face. For example, mentees from low socioeconomic backgrounds, mentees with disabilities, mentees who have immigrated, older mentees, mentees of traditionally underrepresented races and ethnic groups, and mentees with gender identities that are underrepresented face unique and sometimes overlapping challenges. This is not a complete list and it is also important to learn about how intersectionality (the idea that some individuals may represent more than one underrepresented group (ex. female and Black)) results in more nuanced challenges. For example: “When the intersection of race/ethnicity and gender is considered, women of color report even less access to mentorship and support from mentors than other groups” (Davis et al. 2021). Here is a great video of Kimberlé Crenshaw (at UCLA and Columbia) describing the theory of intersectionality, which she developed: Also, become aware of microaggressions - “subtle verbal and nonverbal slights, insults, or invalidating remarks directed at individuals due to their membership in a group (e.g., racial, ethnic, gender, sexual orientation, age, and physical disability), which are rooted in biases about individuals in that group” (Stelter, Kupersmidt, and Stump 2021). See below a list of examples: [Source] Importantly, “mentors for students with disabilities should receive training, as needed, on their mentee’s specific disability and should be made aware of the accommodations that students may need to succeed in activities and courses” (Stelter, Kupersmidt, and Stump 2021). 5.8.2 Acknowledge mentee’s differences Research shows that mentees of underrepresented racial groups would prefer their mentors to directly discuss how to best cultivate their mentee’s career success given their race. 
An attitude of “color-blindness” about race has been shown to hinder the success of mentees (Byars-Winston et al., n.d.; Holoien and Shelton 2012). Talk with your team members individually (be careful not to single out individual team members in front of the rest of the lab) about how they would like to discuss the potential influences of their background/identity on their career growth. “Racial/ethnic differences between mentees and mentors in interracial mentoring relationships can pose cultural barriers to effective mentoring of HU (Historically underrepresented) students and even affect students’ professional and psychosocial success, especially when complex racial/ethnic issues are not effectively handled or addressed…” (Byars-Winston et al., n.d.). “Two ideological perspectives – colorblindness and multiculturalism – have emerged to shed light on this question. Colorblindness downplays the salience and importance of race by focusing on the commonalities people share, such as one’s underlying humanity. In contrast, multiculturalism acknowledges and highlights racial differences” (Holoien and Shelton 2012). “Exposure to colorblind (vs. multicultural) messages predicts negative outcomes among Whites such as greater implicit and explicit racial bias (Richeson & Nussbaum, 2004)” (Holoien and Shelton 2012). “[Underrepresented groups] benefit when others around them endorse multiculturalism (Plaut et al., 2009)” (Holoien and Shelton 2012). 5.8.3 Work to create a safe environment Educate lab mentors about cultural sensitivity and microaggressions. Highlight the importance of collaboration and create a code of conduct for the lab to demonstrate that respect among lab members is expected and required. 5.8.4 Diverse role models Expose all mentees to a diverse range of role models through seminars, journal clubs, and participation in conferences. Computational biology papers with female authors are more likely to have a last author who is also female.
It is unclear if this is because women are more likely to hire other women or if women are more likely to choose a lab with a female adviser (Bonham and Stefan 2017). Indeed, research on women and other underrepresented groups in STEM, including students with disabilities and students of certain racial and ethnic groups, suggests that role models of underrepresented populations are particularly important for recruiting and keeping students interested in fields where they may feel like an outsider (Stelter, Kupersmidt, and Stump 2021) due to current underrepresentation. One strategy to encourage students of underrepresented populations is to provide students with exposure to such role models through regular seminars where scientists who represent these populations are prominent (Katz 2007). 5.8.5 Advocate for all mentees Introduce your mentee to other scientists and trainees, particularly those from a diverse range of underrepresented groups. Encourage the participation of your mentees in support programs and groups such as graduate student groups. Help mentees cultivate self-advocacy practices through open discussions and encouragement. 5.8.6 Support a healthy relationship with failure Be a good role model and openly discuss the role of failure in research. For example, you may describe failures in your own career or you may read some of the book Brilliant Blunders by Mario Livio or this article about the book with your mentees. This book describes how scientific advancement actually occurred due to mistakes of some of the most respected scientists. Educating mentees about the Growth Mindset described by Carol Dweck may also be helpful. The major theme of this mindset is an awareness that our abilities are not fixed, and that we can change our aptitudes with practice and work. [source] See here for more information. 5.8.7 Celebration and microaffirmations Be sure to celebrate all of your mentees’ small and large successes.
This has been shown to promote confidence and resilience (Stelter, Kupersmidt, and Stump 2021). Be generous in complimenting or pointing out small successes in discussions with your mentees and thank your mentees for performing tasks that assist you or your lab. For larger successes, consider sharing a meal or other social activity with your lab. For virtual or remote lab members this could be playing a game online. Again aim to do this with all your mentees/lab members. Be mindful about not singling out particular mentees. This could further make such lab members feel like they don’t belong. 5.8.8 Give feedback with cultural sensitivity It is important to be aware that your mentee may be struggling with feeling like they don’t belong when you provide feedback (Stelter, Kupersmidt, and Stump 2021; Lee, Dennis, and Campbell 2007). Thus, when given criticism, mentees who especially feel like they don’t belong because of their background differences may feel very discouraged. Try to still be encouraging when delivering criticism by acknowledging what is going well and what is progressing. Also keep in mind that it is important to provide feedback for all mentees, as this is needed for growth. Just be sure to provide the feedback in a way that shows that you want your mentee to grow as a scientist overall and that you want them to continue. This article has several good tips for delivering criticism that we will summarize here with our own thoughts: Allow for a discussion about what went wrong. You may learn that your mentee struggled for an entirely different reason than you expected. Having a discussion allows you to better determine how your mentee might be able to perform better in the future. Give criticism in a sandwich. Say something positive, deliver the criticism, then say something else positive. Focus on the actions instead of personality traits. Think about how they can improve.
For example, instead of saying “You seem to have time management issues”, you could say something like, “Navigating and prioritizing all these projects is a difficult task and I think we can do better as a team.” Be specific and suggest improvements. You want your mentees to know exactly what they should be aiming to improve and why. Vague criticism may reduce their confidence and will not help them improve as much as concrete specific suggestions and discussion. Deliver criticism in private. Especially if your mentee is feeling like they don’t belong, criticism in front of other lab members can really impact their confidence. It can also lead to more unhealthy competitive dynamics between lab members. Check your feedback. Where possible, double check the feedback you plan to give from the perspective of the mentee. Consider the following: How constructive and useful is the feedback? Can the student realistically make these changes? Is the feedback written in a kind and clear manner? How might the feedback be misconstrued? Don’t surprise your mentees with criticism. Build criticism into regular meetings with your mentee. Don’t create a meeting out of the blue to tell them they need to improve as this may cause excess stress. Secondly, criticism can be normalized if it is delivered gently and in the ways we just outlined. 5.8.9 Consider creating a document of mentor and mentee expectations. These documents help clarify what mentees can expect. This is helpful for your mentees to better perform according to expectations, as they are explicitly stated rather than intuited. [source] Masters and Kreeger (2017) have created a nice set of guidelines for such documents. Also see this table of examples of such documents from the Center for Improvement of Mentored Experiences in Research (CIMER). Keep in mind that such forms should be tailored for different career stages of your mentees and for mentees who are pursuing different expertise.
Informatics mentees should incorporate guidelines about data management practices. We will discuss a bit more about that in the next chapter. 5.9 Conclusion We hope that these guidelines will help you to create a safe and more comfortable environment for all of your lab members and to support your team to be more mindful about health inequity when conducting research. We believe that a happier and more inclusive lab has the potential to be more productive and innovative. In conclusion, here are some of the take-home messages: Increasing the diversity of our research teams can improve scientific innovation, can add additional perspectives about what research to focus on, can potentially promote more inclusive research practices, and can improve clinical trial recruitment of more diverse participants. Training about diversity, inclusion, and equity, as well as training about cultural competence can help research teams to communicate with diverse clinical trial participants, promote more inclusive lab culture, and promote more inclusive research. Clinical trials can recruit and retain more diverse participants by designing trials and recruitment strategies that accommodate more diverse participant lifestyles. Community engagement can be a particularly useful strategy for recruitment. Seeking training about biases, stereotypes, intersectionality, and microaggressions can be helpful for creating a more inclusive lab culture. Creating an expectation document for the lab can help clarify your expectations of students so that students do not need to intuit how they are expected to perform.
References "],["informatics-lab-management-tools.html", "Chapter 6 Informatics lab management tools 6.1 Slack 6.2 Git and GitHub 6.3 Docker 6.4 Figshare 6.5 RStudio and R Markdown 6.6 Jupyter 6.7 Note-taking apps 6.8 Conclusion", " Chapter 6 Informatics lab management tools There are several tools that can be especially useful for assisting with day-to-day management of projects involving informatics regardless of if you are simply collaborating with an informatics expert or you are leading an informatics research team. 6.1 Slack Slack is a communication tool that allows you to communicate with lab members much more efficiently than email. It is a bit like a combination of an instant message system, email, and Dropbox. You can do quite well with the free version of Slack. It may be all that your research group needs indefinitely. The major difference between the free version and the paid versions is that the free version does not save all of your message history. Currently, with the free version you can search through the history of the last 10,000 messages. From our experience using this with a department with about 250 users, it takes about a year to reach this threshold. If you choose to go with the free option and share any really important messages or files, make sure to save them just in case. Now we will guide you through a bit about how to use Slack. 6.1.1 Workspaces The main landing page for Slack is called a workspace, which looks like this: In the above image, this person has five workspaces which are indicated by the squares on the far left. Each workspace allows for multiple channels for communicating. These channels can include all members of the workspace or specific subsets of members. Team members can also have separate direct messages to have one-on-one discussions. It’s a good idea to check if your department or institute is already using Slack. If so, they may have a workspace that you can join. 
Otherwise you may want to think about recruiting your department or institute to start using Slack. In this case you could start a workspace where people outside your research group can communicate. This would still allow you to have group messages with your lab or specific groups within the lab. Otherwise, you can start a workspace just for your research group. 6.1.2 Channels Channels are the main way in which you can converse with your team on Slack. We recommend making a Slack channel for your entire research group. Everyone in your group will be able to discuss something by sending messages in real time. If someone is not available at that time, they will see the message when they next check Slack. We also recommend making project specific channels. For these channels you can add all of the team members working on a specific project, so that they can easily discuss and review discussions about the project. Importantly, you can make channels private or public. If a channel is public, anyone in the workspace can join. 6.1.3 Pins If someone sends a really important message, like a link to a relevant document, you can “pin” the message so that it is easier to find later. Hovering over a message you will get the following options: Clicking on the 3 dot button allows you to do several useful things for a message including pinning it to find it easily later. 6.1.4 Code One great feature about Slack, is that it is very convenient to message about code. You can also attach files directly to messages just like in the above message which has a screen shot image file. 6.1.5 Reminders If you want to be reminded about a message in 20 minutes or next week you can also do that using the same hovering and 3 dot button option. Thus if Jack gets a message from his advisor Charlie, but he is busy doing deep work on something else, he can ask Slack to remind him later. You may also notice in the image above that your messages can be edited! 
This is unlike an email. In addition, you can mark them as unread, which can also be useful for responding to messages asynchronously. 6.1.6 Polls One other nice feature for working with a team is that you can directly poll your team. This requires enabling this feature, but it can be super useful. Say Sally wants to schedule a meeting with the lab teammates for a specific project; this could even include collaborators who are outside of the lab. If all the users are on the same Slack channel, she can send out a poll like this one asking people to respond with times that they are available. If you want to learn more about what you can do with Slack, check out these other awesome integrations/Slack apps! 6.2 Git and GitHub Informatics work can especially benefit from keeping track of your steps and the code that you have used. In some cases your lab may use a tool like Galaxy which has built-in options for keeping track of the steps that your lab members are taking during their research. However, other tools do not have this option. Instead, we can use a tool called Git which allows for something called “version control”. Version control is the tracking of changes to a file or files over time. This is equivalent to saving different versions of a grant proposal over time. However, as you may have noticed, this is not an easy process to maintain. Tools like Git (Git is one of the most popular) help us to keep track of changes. If we save our changes often, we can easily revert our files to a recent version if necessary. This may be less useful for a grant proposal (although we would argue that it really can be!), but it can be absolutely critical for your informatics code. Why is this? Small changes in your code may result in your code breaking or generating completely different results.
To make matters worse, sometimes your code files may be lengthy; if you have 4,000 lines of code (or more!), it can be difficult to identify what is different between one version and another. Git really helps with this. So what is GitHub? GitHub is a free hosting site for code (or other files - including those grant proposals!). Therefore, all the different versions of your files can be saved and accessed online at GitHub. You can make these files private or public. According to Wikipedia: As of January 2020, GitHub reports having over 40 million users and more than 190 million repositories (including at least 28 million public repositories), making it the largest host of source code in the world. You do not have to use GitHub to use Git. If you have data that needs to be compliant with HIPAA, you could still use Git on a local server (more on this in other courses). Alternatively, you could use GitHub after you de-identify your data. See here for info about ways to use GitHub for data that needs to be HIPAA compliant. We recommend this tutorial if you are interested in using Git and GitHub with R. 6.3 Docker If you have multiple team members modifying code for a pipeline or some software, or if you ultimately want to share your code, it is recommended that you use a method to ensure that you (as well as you in the future!), your team, and anyone you want to share your code with all use the same versions of the same dependency software! There are a few ways to do this, but one of the simplest is to use what is called Docker. You might be familiar with something called a virtual machine. A virtual machine basically allows you to perform operations on your computer, but as if you are using a different computer! This is handy because you can ensure that you not only have similar software installed but you are also working with the same operating system as your teammates (even if your computer has a different operating system).
For example you might have a Mac, and your teammates might all have Windows machines. Pretty cool, right!? Docker is similar to this, except it uses what is called a container. This allows users to work with software that is preinstalled and an environment that is preconfigured; however, it uses part of your existing operating system. This is good because it means that it takes less time and resources than using a virtual machine, which includes a full copy of a virtual operating system. Here is the explanation about what containers are on the Docker blog: A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings. See here and here to read more about the differences between virtual machines and containers like Docker. Note that you can use Docker containers within a virtual machine. See here for explanations about host and guest operating systems. In short, the host operating system is the local machine’s operating system (the one with the hardware), and the guest operating system is the virtualized operating system. Finally, see here for a deeper explanation about how Docker images and containers work. Like GitHub, there is a Docker Hub, where people store different Docker images (which are what allow people to run a Docker container with all of your software dependencies and configurations on their machine). You can download other people’s Docker images, or you can host your own on Docker Hub for other people to use your Docker images. The important take-home message is that Docker fixes the issue of having your code work only on your machine but not on someone else’s machine.
6.4 Figshare Similar to GitHub, figshare is another option for hosting and sharing files on the internet. However, it is specifically designed for research related files. Users can host data, code, images, posters, papers, and other types of files to allow others to easily find resources related to research. In their words: Figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner. We make it as simple as possible to make research Findable, Accessible, Interoperable, and Reusable (FAIR). Publish research in any file format and assign an institutionally-branded DOI, document with customizable, discipline-specific metadata, [create] discoverable content across major search engines and academic frameworks. The difference with GitHub is that it is easier to make files citable (thus ensuring better that you get credit) and it also is a bit easier to store large data files. Figshare does not however have the same version control capabilities of Git and GitHub. Ultimately figshare is a great place to share the final versions of your research products to make them findable for others. Figshare also encourages researchers to publish negative findings that did not ultimately end up in publications to reduce redundancy in the research field, which we think is a great idea! 6.5 RStudio and R Markdown If your research teammates are using the R programming language often, we strongly suggest that you consider having these teammates use what is called R Markdown to create reports of their analyses. R Markdown is a flavor of a markup language called Markdown that works especially well with R. What do we mean by markup? A markup language is a language for formatting text, particularly to make text that can be distinguished from the rest of a document. In our case we want text to look different from our code and from the output of our code.
Ideally for the sake of reproducibility and transparency, we recommend that your informatics teammates write such R Markdown reports as they are performing their analyses - not afterwards. RStudio is an Integrated Development Environment or IDE for developing code that makes it easy to write such reports. R Markdown files make it easy to have a report that shows a bit of the data (or all the data if your data is very small), the code, commentary about what the code is doing, as well as the actual output of the code for a given informatics process. We also recommend describing (in one document) what data you used, who performed the analysis, when they performed it, and how they performed it. This really helps with troubleshooting in the future, as well as simplifying maintaining code over time. It also makes it easier to train new lab members or communicate to collaborators about how your code works. The really nice thing about these reports is that Markdown languages allow you to export them in a variety of formats like html websites, pdfs, word documents (or even slide presentations with just a bit of extra work) that can easily be shared with others. You aren’t limited to just writing about code in these reports. You can write about anything. In fact, what you are reading right now was originally written using R Markdown. Thus this is also a good option for writing up reports about wet bench experiments as well. 6.5.1 R Markdown guidelines There are a few simple syntax rules for R Markdown. To create headers you can specify them using hashtags. One hashtag # creates the largest header option, while two ## is a bit smaller, and three ### is a bit smaller than two ##, etc. Thus you could create a header like so: # This is a header To create bold text you can use two asterisks on each side of the text. **This is bold text** Which will look like this: This is bold text To create italicized text you can use a single asterisk on each side of the text.
*This is italicized text* Which will look like this: This is italicized text To create both bold and italicized text you can use three asterisks around the text. ***This is bold and italicized text*** Which will look like this: This is bold and italicized text To create a new line you include two spaces after the end of the line. To create a divider line you can use three asterisks without any text on a line. *** Which will look like the following divider line: You can also embed images or videos into your R markdown reports. There are several ways to do this with a package called knitr which allows you to style your reports and include different file types in your reports. However you can also include an image or video simply with the following syntax: ![caption text](URL_or_local_path_to_image_or_videofile) If you are including code (which can be R programming language code, Python, SQL, bash or others). You can specify it using three backticks like this: ```{r} some R code ``` Here is some actual R code that displayed in the html output from the original R Markdown file. There is a slightly darker background for code and for code output. You will see that the result of x is printed right after a break: # This is a code comment about some R code- here comes the code on the next line! x <- c(1, 2, 3, 4, 5) x ## [1] 1 2 3 4 5 Similarly this is some Python code and output: # Now we are going to show some python code x = [1,2,3,4,5] print(x) ## [1, 2, 3, 4, 5] For inline code (meaning you can show the output within some narrative text) you can use one backtick before and after the code starting with r to specify that you are using the R programming language like this: ` r x ` This will result in: Here is the output: 1, 2, 3, 4, 5 Another important thing to know is that you can utilize what are called child Rmd files in case your report is getting too large (something that often happens with analyses). 
In this case, you can separate out parts of your research process into different report documents and have an additional report document that demonstrates the entire process. See here for more information on how to do this. If you want a quick reference check out this guide. For a more extensive guide check out this article from R Studio. Also see Riederer (n.d.) for additional information about how to use and create R Markdown files. For advanced users check out R Notebook, which is an extension of R Markdown. 6.6 Jupyter Jupyter notebooks are very similar to R markdown reports, however they were designed with an emphasis on using Python rather than R, and such reports are created using a web-based editor rather than software on your local computer. The Markdown syntax used in Jupyter notebooks is nearly the same as what you just learned about for R Markdown. Here you can see a quick guide. See here for an extensive guide, where you might notice some differences in terms of how to include code. JupyterLab is also similar to RStudio. However it is a web-based environment for working with code and writing Jupyter notebooks. You can try some demos here. 6.7 Note-taking apps If you want to take tracking your projects to the next level, we recommend a note-taking app. This allows you to store and organize all of your files related to different projects, take notes, and more. One really useful feature is that many allow you to search across all notes, so if you can’t quite remember where something is you can find it easily. You can also share your notes with others. This is also a great place to jot down ideas, store tips for yourself and others, make timelines and more. Although it will take a bit of time to learn how to use these apps and some time to take notes etc., this will ultimately save you time in the long run and many of these apps have been designed to be especially user-friendly. You don’t need to use all of the available features. 
Just tracking all the information related to your projects in one place can already greatly improve your ability to manage your projects. This blog has an excellent review of various options, many of which are free or have slightly more limited but free versions. Evernote is a commonly used note-taking app, which as you can see from this video can really be helpful: Be careful if you intend on including any information that would require HIPAA compliance in a note-taking app! Microsoft OneNote offers options for encryption to allow for HIPAA compliance if you need that. See here for more information. 6.8 Conclusion Overall we think that these tools can be helpful to you and your informatics research team. There are however many other tools that can help with informatics analyses. We will discuss these in other courses about data reproducibility and management. In conclusion, here are some of the take-home messages: Slack can be a great option for maintaining communication with lab members who may be onsite or remote. Version control with Git and GitHub as well as standardization using Docker can ensure that your computational work is being maintained and shared smoothly. RStudio and R Markdown reports can improve your analyses that you perform in R. This is also compatible with performing aspects of your analyses using some other languages. Jupyter is very helpful for Python-related projects. Keeping reports of your work with annotations about the code and data used can be critical for your future self, other lab members, outside collaborators, and others to better understand your analyses. Using a note-taking app can be extremely useful for organizing reports, communications, ideas, notes and more for your various projects. Be careful about including any protected data or information that would require HIPAA compliance.
References "],["about-the-authors.html", "About the authors", " About the authors These credits are based on our course contributors table guidelines.     In memory of James Taylor, who was instrumental in initiating this project.   Credits Names Pedagogy Lead Content Instructor Carrie Wright Content Editors/Reviewers Candace Savonen, Sarah Wheelan, Jeff Leek Content Directors Jeff Leek, Sarah Wheelan Content Consultants (Promoting diversity equity and inclusion) Simone Sawyer, Karriem Watson Acknowledgments Andrei Kucharavy, Sarah Opitz, Florian Markowetz, Brody Foy, Michael Mullarkey, Anne Carpenter, Luis Pedro Coelho, Keri Martinowich Production Content Publisher Ira Gooding Content Publishing Reviewers Ira Gooding, Candace Savonen Technical Course Publishing Engineer Carrie Wright Template Publishing Engineers Candace Savonen, Carrie Wright Publishing Maintenance Engineer Candace Savonen Technical Publishing Stylists Carrie Wright, Candace Savonen Package Developers (ottrpal) John Muschelli, Candace Savonen, Carrie Wright Art and Design Illustrator Carrie Wright Funding Funder National Cancer Institute (NCI) UE5 CA254170 Funding Staff Emily Voeglein, Fallon Bachman   ## ─ Session info ─────────────────────────────────────────────────────────────── ## setting value ## version R version 4.3.2 (2023-10-31) ## os Ubuntu 22.04.4 LTS ## system x86_64, linux-gnu ## ui X11 ## language (EN) ## collate en_US.UTF-8 ## ctype en_US.UTF-8 ## tz Etc/UTC ## date 2024-12-20 ## pandoc 3.1.1 @ /usr/local/bin/ (via rmarkdown) ## ## ─ Packages ─────────────────────────────────────────────────────────────────── ## package * version date (UTC) lib source ## bookdown 0.41 2024-10-16 [1] CRAN (R 4.3.2) ## bslib 0.6.1 2023-11-28 [1] RSPM (R 4.3.0) ## cachem 1.0.8 2023-05-01 [1] RSPM (R 4.3.0) ## cli 3.6.2 2023-12-11 [1] RSPM (R 4.3.0) ## devtools 2.4.5 2022-10-11 [1] RSPM (R 4.3.0) ## digest 0.6.34 2024-01-11 [1] RSPM (R 4.3.0) ## ellipsis 0.3.2 2021-04-29 [1] RSPM (R 4.3.0) ## evaluate 
0.23 2023-11-01 [1] RSPM (R 4.3.0) ## fastmap 1.1.1 2023-02-24 [1] RSPM (R 4.3.0) ## fs 1.6.3 2023-07-20 [1] RSPM (R 4.3.0) ## glue 1.7.0 2024-01-09 [1] RSPM (R 4.3.0) ## htmltools 0.5.7 2023-11-03 [1] RSPM (R 4.3.0) ## htmlwidgets 1.6.4 2023-12-06 [1] RSPM (R 4.3.0) ## httpuv 1.6.14 2024-01-26 [1] RSPM (R 4.3.0) ## jquerylib 0.1.4 2021-04-26 [1] RSPM (R 4.3.0) ## jsonlite 1.8.8 2023-12-04 [1] RSPM (R 4.3.0) ## knitr 1.48 2024-07-07 [1] CRAN (R 4.3.2) ## later 1.3.2 2023-12-06 [1] RSPM (R 4.3.0) ## lifecycle 1.0.4 2023-11-07 [1] RSPM (R 4.3.0) ## magrittr 2.0.3 2022-03-30 [1] RSPM (R 4.3.0) ## memoise 2.0.1 2021-11-26 [1] RSPM (R 4.3.0) ## mime 0.12 2021-09-28 [1] RSPM (R 4.3.0) ## miniUI 0.1.1.1 2018-05-18 [1] RSPM (R 4.3.0) ## pkgbuild 1.4.3 2023-12-10 [1] RSPM (R 4.3.0) ## pkgload 1.3.4 2024-01-16 [1] RSPM (R 4.3.0) ## profvis 0.3.8 2023-05-02 [1] RSPM (R 4.3.0) ## promises 1.2.1 2023-08-10 [1] RSPM (R 4.3.0) ## purrr 1.0.2 2023-08-10 [1] RSPM (R 4.3.0) ## R6 2.5.1 2021-08-19 [1] RSPM (R 4.3.0) ## Rcpp 1.0.12 2024-01-09 [1] RSPM (R 4.3.0) ## remotes 2.4.2.1 2023-07-18 [1] RSPM (R 4.3.0) ## rlang 1.1.4 2024-06-04 [1] CRAN (R 4.3.2) ## rmarkdown 2.25 2023-09-18 [1] RSPM (R 4.3.0) ## sass 0.4.8 2023-12-06 [1] RSPM (R 4.3.0) ## sessioninfo 1.2.2 2021-12-06 [1] RSPM (R 4.3.0) ## shiny 1.8.0 2023-11-17 [1] RSPM (R 4.3.0) ## stringi 1.8.3 2023-12-11 [1] RSPM (R 4.3.0) ## stringr 1.5.1 2023-11-14 [1] RSPM (R 4.3.0) ## urlchecker 1.0.1 2021-11-30 [1] RSPM (R 4.3.0) ## usethis 2.2.3 2024-02-19 [1] RSPM (R 4.3.0) ## vctrs 0.6.5 2023-12-01 [1] RSPM (R 4.3.0) ## xfun 0.48 2024-10-03 [1] CRAN (R 4.3.2) ## xtable 1.8-4 2019-04-21 [1] RSPM (R 4.3.0) ## yaml 2.3.8 2023-12-11 [1] RSPM (R 4.3.0) ## ## [1] /usr/local/lib/R/site-library ## [2] /usr/local/lib/R/library ## ## ────────────────────────────────────────────────────────────────────────────── "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be 
found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] +[["index.html", "Leadership for Cancer Informatics Research About this course 0.1 Available course formats", " Leadership for Cancer Informatics Research 2024-12-20 About this course This course is part of a series of courses for the Informatics Technology for Cancer Research (ITCR) called the Informatics Technology for Cancer Research Education Resource. This material was created by the ITCR Training Network (ITN) which is a collaborative effort of researchers around the United States to support cancer informatics and data science training through resources, technology, and events. This initiative is funded by the following grant: National Cancer Institute (NCI) UE5 CA254170. Our courses feature tools developed by ITCR Investigators and make it easier for principal investigators, scientists, and analysts to integrate cancer informatics into their workflows. Please see our website at www.itcrtraining.org for more information. Except where otherwise indicated, the contents of this course are available for use under the Creative Commons Attribution 4.0 license. You are free to adapt and share the work, but you must give appropriate credit, provide a link to the license, and indicate if changes were made. Sample attribution: Leadership for Cancer Informatics Research by Johns Hopkins Data Science Lab (CC-BY 4.0). You can download the illustrations by clicking here. 0.1 Available course formats This course is available in multiple formats, which allows you to take it in the way that best suits your needs. You can take it for a certificate, which may be free or paid. The material for this course can be viewed without login requirement on this Bookdown website. This format might be most appropriate for you if you rely on screen-reader technology.
This course can be taken for free certification through Leanpub. This course can be taken on Coursera for certification here (but it is not available for free on Coursera). Our courses are open source, you can find the source material for this course on GitHub. "],["introduction.html", "Chapter 1 Introduction 1.1 Motivation 1.2 Target audience 1.3 Topics covered: 1.4 Curriculum 1.5 Informatics teams challenges 1.6 Meet the team!", " Chapter 1 Introduction 1.1 Motivation Informatics research often requires multidisciplinary teams. This requires more flexibility to communicate with team members with distinct backgrounds. Furthermore, team members often have different research and career goals. This can present unique challenges in making sure that everyone is on the same page and cohesively working together. 1.2 Target audience The course is intended for researchers who lead research teams or collaborate with others to perform multidisciplinary work. We have especially aimed the material for those with moderate to no computational experience who may lead or collaborate with informatics experts. However this material is also applicable to informatics experts working with others who have less computational experience. 1.3 Topics covered: 1.4 Curriculum We will provide you with an awareness for the specific challenges that your informatics collaborators, employees, and mentees might face, as well as ways to mitigate these challenges. By creating a better work environment for your informatics research team, you will ultimately improve the potential impact of your work. We will also discuss the major pitfalls of informatics research and discuss best practices for performing informatics research correctly and well, so that you can get the most out of your informatics projects. 1.5 Informatics teams challenges Informatics work presents unique challenges due to the fact that it requires multidisciplinary teams. 
According to a recent article about this subject: “Computational biology hinges on mutual respect between scientists from different disciplines, and key elements of respect are understanding a colleague’s particular expertise and motivation.” [way_field_2021] In this course we hope to give you a bit more of an understanding of the variety of perspectives that your colleagues might have. 1.6 Meet the team! In order to familiarize you with our guidelines for how to make the most out of your informatics projects we are going to introduce you to some characters of the type of people you may encounter on your journey. We are going to show these characters in situations that may be similar to what you might experience. We are doing this to make the lessons concrete and to try to make the experience more entertaining and experiential. First our fearless lab leaders who lead informatics research projects. We have Sally who is experienced with working with team members from many disciplines including informatics experts. She helps guide her lab through successful projects all the time. Next, we have Charlie. He is new to informatics research and could learn a bit about how to work with informatics experts more effectively. Now we have our informaticists. First is Jack, who is often forgotten and misunderstood by his lab leader. His lab leader does not really know what he does, how long his work takes, or how to support Jack and his career goals. This is unfortunately impeding Jack from achieving the career that he could have and from producing the good work that he is capable of. We also have Hilda, an example of a happy informaticist. She feels supported in all the ways that she needs, allowing her to be as productive and helpful as possible. Here is Francis the frustrated collaborator. She often feels misunderstood by her colleagues. 
They seem to think her work is easy and should be done faster and they often don’t discuss important aspects about the project until she needs to redo work. Thus, she is reasonably frustrated. Finally we have Harry, the helpful collaborator. He clearly communicates with his collaborators and is well organized! He teaches his collaborators about informatics and they teach him about their knowledge. He and his collaborators have very productive projects. We will now describe some guidelines for how to be an effective leader, collaborator, and mentor on informatics projects so that you can be more like Sally with mentees and employees like Hilda and collaborators like Harry. Keep in mind that while our cartoons often exaggerate situations to make them more comical, more subtle situations can still be very detrimental to your research teammates. "],["guidelines-for-multidisciplinary-informatics-teams.html", "Chapter 2 Guidelines for multidisciplinary informatics teams 2.1 Finding and creating informatics teams 2.2 Communication for informatics teams 2.3 Record keeping practices 2.4 Leadership best practices 2.5 Conclusion", " Chapter 2 Guidelines for multidisciplinary informatics teams In this lesson we will discuss general guidelines for how to create and maintain healthy relationships within a multidisciplinary informatics research team. 2.1 Finding and creating informatics teams The first step to performing a good research study is to find the right people for your research team. In this section we will provide a guide for finding good coworkers, whether they are mentees, collaborators, or employees, to work on informatics cancer projects particularly for multidisciplinary teams. This section is based on blog posts Peng (2011) from Roger Peng and Matsui (2013) from Elizabeth C. Matsui of the Simply Statistics blog which has many other useful discussions and resources.
2.1.1 Start early The key to a successful multidisciplinary project is to start looking for your research teammates early. If you plan to collaborate with an expert, we suggest that you find such collaborators and discuss your plans before you even finish designing your studies. If you are new to informatics and you plan to employ or mentor informatics experts in your lab, we also suggest that you seek guidance from a senior informatics expert before you start an informatics project. 2.1.2 Who to look for Especially for projects with multidisciplinary teams, good communication is vital. Look for people who are easy to talk to. In Roger’s words: “If you don’t feel comfortable asking (stupid) questions, pointing out problems, or making suggestions, then chances are the science will not be as good as it could be.” Ideally you want to be able to ask your collaborators basic questions, so that you can be sure that you understand the fundamentals correctly. If your collaborator doesn’t explain information clearly (without jargon that you don’t understand) or doesn’t make you feel safe to ask questions, then this will likely result in you missing out on critical information. This goes hand in hand with finding someone who is compassionate or polite. You never quite know what will happen with projects in Science, thus it is ideal to work with people who can handle difficult situations well and continue to treat others with respect. An ideal collaborator also has enthusiasm to ask you about your work, including again fundamental questions. Having their newer perspective can be especially helpful for you to think about your work differently and notice anything that you might take for granted. You also want collaborators that respect and appreciate your knowledge. If there is an imbalance regarding respect, this can result in situations where one collaborator feels like a subordinate to the other. This generally makes individuals feel less comfortable to bring up problems. 
Finally, you also want to look for individuals who seem to get things done. You can get a sense of someone’s productivity by looking at their CV and asking them about their projects. Motivation in collaboration can dwindle due to responsibility being spread across groups, which is why it is so important to find productive collaborators. 2.1.3 Adjust your expectations Be careful about assuming that your experimental collaborator can do any type of wet bench experiment or that your informatics collaborator can analyze any type of data. “Computational biology hinges on mutual respect between researchers from different disciplines, and a key element of respect is understanding a colleague’s particular expertise” (Way et al. 2021). “Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data” (Way et al. 2021). “Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, proteomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular idea does not fit a computational biologist’s research agenda, and computational scientists need to clearly communicate analysis considerations, approaches, and limitations” (Way et al. 2021). “Certain grant mechanisms can provide flexibility for computational biologists to develop new technologies, but the scope often focuses on method development, limiting the ability to collaborate on application-oriented projects. The current academic systems incentivize mechanism and translational discovery for biology but methodological or theoretical advances for computational sciences.
This explains a common disconnect when collaborating: Projects that require routine use of existing methodology typically provide little benefit to the computational person’s academic record no matter how unique a particular dataset.” (Way et al. 2021) 2.2 Communication for informatics teams Communication is vital for all work relationships, but this is especially true for multidisciplinary teams. Here are a few tips for keeping communication smooth. This section is based on Broman (2019) by Karl Broman and Wang (2019) by Jiangxia Wang. 2.2.1 Talk first OK, so we kinda covered this, but we can’t emphasize it enough. We suggest that you start talking to your collaborators, students, or employees before you even begin a project, so that you can plan the project in the optimal way. This is critical for forming the right informatically feasible and scientifically useful questions and for collecting the right data to address such questions. Collecting the right data can be vital to the success of a project. It may not always be obvious what is possible or impossible for the experimental biologists. Is 30 samples actually feasible? How about 300? Would that be performed in different batches? What is necessary or possible for an informatics project to test certain questions with statistical methods? How long would certain analyses take? These are discussions that should happen long before reagents are purchased, before IRB submissions, and before grant submissions if possible. 2.2.2 Take your time For employees and mentees, allow time to get comfortable talking to one another. As a leader, take the lead in openly expressing areas that are new to you and facilitate an environment where teammates communicate respect for one another’s unique knowledge and perspectives. In all cases (with collaborators, employees, and mentees), build in extra time for projects to allow for teaching time. You can teach them about your domain and they can teach you about theirs. 
This may feel like it is taking extra time but it will ultimately pay off, as you will be better prepared to work as a team and to form the most useful and testable hypotheses. 2.2.3 Come up with questions/hypotheses together Not knowing what may be feasible in terms of data collection and analysis can make it nearly impossible to form an appropriately testable hypothesis. Furthermore, it may be difficult to know how your questions fit into the context of a field and what is actually useful to advance treatment and prevention if you are not a domain expert of the cancer or disease that you are studying. By working together in multidisciplinary teams we can determine the best hypotheses to advance science. Domain experts can help to ensure that the question is feasible from a standpoint of data collection, that it leads to other important questions, that it is new, that it is useful, and that the plan to test it will actually lead to interpretations that are useful. Informatics experts can help to ensure that the question is feasible from a standpoint of data collection and data analysis, that a question is testable, and that it leads to the interpretations that the domain experts hope to gain. 2.2.4 Be specific Give and ask for specific feedback. If your collaborator/employee/student says something that you do not quite understand, ask them for more specific clarification. In addition, give feedback that is specific where possible without assuming knowledge that might be necessary and avoid jargon. One way to do this is to reiterate what you think you understood and to describe concepts at high level, and then follow this with detailed descriptions. Include as many details as possible. For example, if a collaborator simply states that the number of samples would be underpowered, this might not be enough information for you to help solve the issue.
Ask for clarification about why with response questions, such as “you believe that the sample size is too small to allow for this specific statistical test (specify the test) to be utilized to identify if there are differences between these specific groups (specify the groups)?”. Once you get a better understanding about why they might have said this, you can better understand how to solve the issue. In the above example, perhaps you need to consider a different test before considering getting additional samples. Or perhaps you aren’t even testing the correct question. Clarifying this first will help. Additional communication tips include (based on (Haden 2014) and (“Ten Tips for Asking Good Questions” n.d.)): Use neutral language. You want to allow your team members to have their own authentic reaction (especially when it comes to issues). Neutral language might allow them to better realize a solution before they feel threatened by a reaction of stress from you. You also don’t want to ask questions that lead your team members to a specific conclusion even if there isn’t an issue, so as to allow for more discussion. You can do so by initially avoiding including options or assumptions that you have come up with. However after these open questions you want to start including your own understandings to get more specific. Start with general high level concepts and follow this with more specific comments/questions. Try to focus your feedback and questions about one specific aspect at a time. Ask your team members to walk through a process with you step by step. Ask your teammates to think about your process as if all of you were entirely new to the process to consider what you might be missing or taking for granted. Plan meetings ahead of time so that you know exactly what you hope to communicate. Write your questions or feedback out so that you make sure you cover everything that you need to. Assess each question and feedback comment for its overall purpose or goal.
2.2.5 Be compassionate Consider the stage of the project and how your discussions may impact your coworker. For example, pointing out that there is not enough data or samples to do what your collaborator had hoped during later stages of the project can be very disappointing as it is often not possible to collect new data. Being polite and considerate when you make suggestions can make a major difference. Furthermore, suggesting an idea about how the project can still be productive can save your collaborator/coworkers/students stress and heartache. They may not be aware that there is public data available or additional data in your lab that can still save the project. 2.2.6 Keep contact Regular communication continues the momentum of a project and ensures that important details get discussed when necessary. It also relieves anxiety among coworkers by keeping everyone aware of the status of the project and helping to start discussions if someone needs help. 2.2.7 Schedule extra time As a project continues, new challenges will arise that will again require more time for teaching one another about the scientific process specific to your domain. Build in breathing room in the project schedule where possible, to allow for time for setbacks. Keep in mind that you may be unaware of the setbacks that you may encounter for work outside of your expertise. Creating a situation that is less stressful makes it easier for everyone to maintain positive relationships. We will discuss this more in the next chapter. In summary, we suggest that you follow these tips when communicating with your multidisciplinary team: 2.3 Record keeping practices Once you have your project rolling, it is important to keep good records of your work, your collaborators’ work, and your communication. Keeping good records takes time and discipline but it can save you more time and heartache in the end. Here are some suggestions for how to optimize your record keeping.
2.3.1 Keep organized records of work Record and communicate notes about your data collection and analyses. Be mindful of overwhelming your coworkers, but generally speaking provide extra information where possible. The more people aware of details about what samples were in what batch, the more likely important details are not missed or forgotten. For example, if you are sending data to a collaborator, send as much information as possible about how it was generated in the email in which you send it to them, even if you have already discussed the data. This can help ensure that no important details fall through the cracks. The best way we think you can do this in general is to use reports - one of our next suggestions. 2.3.2 Keep organized records of communication Besides recording your work, keep a record of your communications. At a minimum organize your emails for projects into separate folders with easily recognizable titles to save yourself hassle later when something comes into question. However, we highly recommend that in addition for even better record keeping, you copy and paste emails and dates to a note-taking system. This could be as simple as a shared Google doc, or you could consider an app like these that are designed for note-taking. With many of these you can also share your notes with research teammates and you can include report documents directly in your notes. Which brings us to our next point about using reports! 2.3.3 Use reports Instead of sending informal short emails (which are useful at some points in a workflow), we suggest intermittently sending lab reports with as much information about what was done and why as possible. For informatics related work in R or Python (or other supported languages) we highly suggest using a method like R markdown or Jupyter notebooks to track what informatics steps you have performed and why.
Beginning these reports with a short description of what raw data you used and when you received it can be critical for ensuring that you are using the correct data! We will describe more about how to use such reports in the final chapter of this course. It is also important that the experimental biologists make similar reports defining what reagents they used, when they performed the study, what samples were used, who performed the experiment, and any notes about unusual events, such as the electricity went out during the experiment, left the samples overnight but usually leave two hours, mouse #blank unexpectedly died so we lost this sample thus it is not included, or the dye seemed unusually faint in this gel. In summary, we recommend the following record keeping tips: 2.4 Leadership best practices In this section we will describe best practices for lab leaders leading multidisciplinary informatics teams to support their research team members. The section is based on a famous blog post (Watson 2013) called “The lonely bioinformatician” that describes the angst that informatics personnel often feel when they are the only person in their lab with their skill set. The blog post author, Professor Mick Watson at the University of Edinburgh, describes these individuals as “pet bioinformaticians” in his blog called opiniomics. He states: “It is possible they [the pet bioinformaticians] will become isolated and pick up bad practices as they don’t have a senior bioinformatician to guide them. It also concerns me that their career and professional development might suffer.” He also acknowledges the challenges of the opposite case: “Consider the opposite situation – how many bioinformatician PIs manage lab staff?
How could we possibly guide a young post doc on how to run gels, PCRs etc nevermind more complicated laboratory SOPs?” He has since then stated for the PIs of experts who do not share the same skill-sets: “Just look after them, and recognise you can’t give them everything that they need. You can give them a lot, just not everything.” “Secondly, there is nothing wrong with being a pet bioinformatician – it can be a really stimulating role, and opens your eyes to lab-based science. I am not criticizing the pets either, I just urge you to look after yourselves.” And ultimately provides a guide for the “pet bioinformaticians” that can be useful for both informatics expert employees/mentees and also for leaders of such individuals as well as for informatics lab leaders who employ lab-based scientists. Extending the major themes from his guide and from his post about clinical labs (Watson 2014) here are guidelines for multidisciplinary research lab leaders: 2.4.1 Recognize that different disciplines require deep expertise Informatics is truly its own scientific discipline that requires deep expertise. To truly optimize the multidisciplinary work that you may wish to perform, you need expert-level experience of all disciplines on your team. Although you may be able to learn how to use a particular tool to seemingly test a hypothesis, this may not be correct or optimal in every instance. This is why you need expert-level informatics team members to help you with your research. If you can’t hire such an individual, or even if you hire a more junior informatics mentee or employee, you need to discuss your research with a senior informatics expert. 2.4.2 Avoid employee isolation If possible, employ more than one domain expert or at least collaborate heavily with others - especially those with experience working with human data. Alternatively hire a more senior expert (with expertise studying in the domain you intend) with a higher salary.
In Mick Watson’s words: “I am aware of a few lone bioinformaticians working in clinical labs. I want to make this clear – this is a bad idea. In fact, it’s a terrible idea. Through no fault of their own, these guys will make mistakes. Those mistakes may have dire consequences if the data are then used to inform a treatment plan or diagnosis.” In any case, we highly encourage guideline #2 regardless of what option you choose. 2.4.3 Encourage relationships with others in their domain Enable and encourage your employee to cultivate relationships with others who have similar skill-sets at your institution or local community. Ideally, help your employees or mentees find a mentor within their domain. If there is no local group of such individuals, see if your employee would be interested in starting one - such as a seminar group or journal club. Also encourage them to join online forums and attend conferences and workshops. Examples include: R ladies for support for using Bioconductor or R programming. Many of the members are also very familiar with using a variety of genomics and imaging data. You do not need to be a woman to get support from this organization. Many universities have a statistical support group, check at your institution. National Cancer Institute (NCI) hub groups has a list of cancer specific groups such as large groups like the Informatics Technology for Cancer Research (ITCR). Consider location specific groups/collaborations such as the African Esophageal Cancer Consortium (AfrECC). When in doubt, ask around. Ask at your institution, ask your former colleagues at other institutions, or try on social media like Twitter to find connections. 2.4.4 Encourage growth outside their domain On the other hand, it is important that you also cultivate and encourage your employee’s growth in your domain by again suggesting and enabling their participation in conferences and journal clubs on topics relevant to your lab.
2.4.5 Value their perspective about science in general Encourage feedback and discussion from all of your employees in scientific discussions. Make their input feel welcomed regardless of the topic. A fresh perspective can sometimes lead to really important insights about things that are taken for granted by experts. 2.4.6 Discuss expectations and hypotheses If your employee is helping with work for a grant, provide the proposal to them. Have a discussion with your employee about your expectations and how feasible they are, as well as to make your informatics hypotheses specific. Avoid projects where the informatics goals are vague. Also remember that many informatics tasks may take more time than you anticipate and your employee may have a better sense of how long something will take (or vice versa if you are an informatics expert employing lab scientists). Be clear with your employee in these discussions that you are unclear about how long tasks will take, if that is indeed the case. Continue to have open dialogue about expectations and goals as the work proceeds. 2.4.7 Advocate authorship and idea generation for all Regardless of your employees’ or students’ backgrounds, make sure you advocate for authorship for each of them (particularly if they are interested in a career in research). According to a recent report about computational biology: “Despite the fact that dramatic advancements have been driven by computational biology, too often researchers choosing this path languish in career advancement, publication, and grant review” (carpenter_cultivating_2021?). It is often overlooked, but informatics experts will also need first author papers. However, keep in mind that in some fields authors are listed in different ways. Allow your employees to generate ideas for such publications and discuss this with them. Often the work to help with other projects may not be as interesting for your employee as an idea that they come up with themselves. 
Often you can create one technical paper and one biological paper from each project. For technical papers, allow your lab members that largely do informatics to play a prominent/leadership role. For biological papers, ask them to play a supporting role. For experimental lab members do the opposite. Allow these lab members to have a prominent role on biological papers and a supporting role on more technical papers. If nothing else, even if your employee is very busy on work for mid-level authorship, give them time to write a review or a software paper for a simple package, or a comparison of informatics methods. Mick Watson suggests making sure that your employees are authoring ~2 first author publications a year if possible. If necessary you can front-load collaboration work and then give your employee more time later to spend on their own work, but be careful about not protecting some of their time for their own career advancement. Also please see the Career Paths for Informatics Mentees section (coming up soon) and read it with your employees in mind, as well. 2.4.8 Check on them! Most importantly, make sure that your employee is getting help and feedback from other experts in their domain. It can be easy for your employee to get stuck or go in the wrong direction if left in isolation. How can you prevent this from happening? Keep tabs on what they are doing in general, if they are still working on the same issue for an extensive amount of time, suggest that they seek help. Also by encouraging your team members to cultivate relationships with experts you will provide them with the opportunity to ask others for their thoughts. 2.4.9 Get external review of work Particularly in informatics, we can especially track our steps. Make sure that your employees are keeping detailed records about their work and then get them to regularly ask for feedback from others. We all make mistakes, it’s good to get external feedback early and often to ensure that the work is correct. 
2.4.10 Support diverse teammate work schedules One other important thing to know is that informatics work is often best performed with long stretches of uninterrupted time to allow your informatics employees to perform “deep work”. Why is this? Some of the challenges that your informatics teammates will be working on will require a great deal of abstract thinking and troubleshooting. Such difficult work profits well from deep concentration. How can you accommodate this? Try to work with your informatics teammates to schedule lab meetings and be mindful of other time commitments they might have, such as classes, seminars, or other meetings. On the other hand, if you are an informatics expert mentoring experimental biologists, keep in mind that their experiments will dictate their schedule. Impromptu meetings may be difficult for them at times. Also be aware that some of their experiments may require that they stay late at night or come in very early. Thus on those days, it might be best to not overburden them with other tasks if you want them done well. 2.5 Conclusion We hope these leadership guidelines will help you to better support your lab team to be as successful as possible! In conclusion, here are some of the take-home messages: Look for collaborators early in the process, particularly those that explain clearly. Take your time and expect delays. Reduce stress by scheduling extra time where possible. Keep organized records of communication and analyses. Recognize that different disciplines require deep expertise. Thus help your mentees find mentors for their respective disciplines if it is different from your own. Advocate authorship for all of your mentees (including first authorship).
References "],["informatics-project-guidelines.html", "Chapter 3 Informatics project guidelines 3.1 Identifying good informatics questions 3.2 Informatics project pitfalls 3.3 Informatics project pitfall mitigation methods 3.4 Conclusion", " Chapter 3 Informatics project guidelines 3.1 Identifying good informatics questions Once you have identified your research team, your next step is to start thinking more deeply about the specific informatics questions you would like to evaluate. Be sure to include team members of each discipline in these discussions. There are many important considerations to keep in mind when asking an informatics question. We suggest the following steps to take a great scientific question and make it into a great informatically testable question. 3.1.1 Steps for forming questions Start with what you know and determine what is unknown. Clarify what is most important to learn about what is unknown. What key information would lead to more understanding? What would be most helpful to know to lead to a new treatment or prevention strategy? What would lead to more questions? Narrow down what is unknown into specific statements based on what you identified as important to know from step 2. Write the unknown statements into specific questions. (Look out for vague phrases!) Make the questions into actionable tests by thinking about what would be measured or observed and ultimately what your variables would be in a statistical test. Make a mock-up of what the data would look like. (Do you have necessary controls?) Evaluate if that actionable test can be assessed with statistical methods and if you have access or can collect the necessary data. Rework as necessary, possibly returning to a different question from step 5. Think about possible biases or confounders. Evaluate if the interpretation of the test would provide the insights that you are interested in. For example, say we were interested in identifying new diagnostic biomarkers for colorectal cancer.
Note: this is only an illustrative example. These suggestions are based on that of: Wang (2019). 3.1.2 STEP 1 - Identify what is known and unknown First we would begin by identifying what is known and unknown: Several potential blood-based biomarkers for colorectal cancer have been identified, however many are lacking evidence due to the previous studies having small sample sizes. [source] You might ask how useful are these biomarkers for diagnosing colorectal cancer? So now we think about what is unknown: You know the sizes of the previous samples that have assessed these biomarkers and you know the level of sensitivity reported by previous reports. However (assuming just this knowledge for illustration purposes), it is unknown: - How sensitive and specific some of these biomarkers are with sufficient sample sizes. - How collectively these biomarkers help to identify patients with cancer. - Which biomarkers are more important. - Which biomarkers or combinations are particularly useful for determining disease progression or what treatment options might be best. 3.1.3 STEP 2 - Prioritize unknowns Step 2 then involves determining which unknowns are the most important to you. This could be what is more translatable to aiding better diagnostics in a noninvasive way. This could be to better understand cancer progression and what these biomarkers tell us about patient prognosis. Determine what unknowns best fit your interest/expertise. Let’s say that we want to know what is most translatable to aiding diagnostic tests now. 3.1.4 STEP 3 - Write specific statements Step 3 then involves writing out specific statements for what is unknown related to making these biomarkers more useful for tests now. It is unknown how useful many of these biomarkers are individually for the diagnosis of colorectal cancer in larger samples. We do not know if combining these biomarkers together is useful in the diagnosis of colorectal cancer. 
Perhaps combining these blood-based screens with other screens is useful. You can probably imagine many more statements, but we will keep this example simple. 3.1.5 STEP 4 - Transform into specific questions Step 4 involves transforming these into questions: At what sensitivity rate do each of these biomarkers aid in the diagnosis of colorectal cancer? Does the use of a combination of these biomarkers for the diagnosis of colorectal cancer increase diagnosis rates better than any single biomarker? Does the use of a combination of any of these biomarkers with other non-blood-based screens improve diagnosis rates compared to either diagnostic method alone? Look for terms or phrases that are vague in your questions and make them more specific. For example, “How helpful”, “Is it better”. Think about in what way something might be helpful or in what way something might be better. For simplicity purposes we will stick with only the second question. 3.1.6 STEP 5 - Transform into actionable tests Step 5 is to transform questions into actionable tests. For a question to be testable it must meet several requirements. We need to have variables that can be measured or observed. We need to have a variable we can modify or control, and we need to figure out what we cannot control. Now what are our variables, what can we control or observe? We will be observing diagnostic rates of colorectal cancer and biomarker expression, and we can modify or control how many biomarkers we choose to focus on to compare samples. This leads to many questions: Should we compare one biomarker vs all of the biomarkers? Which single biomarker will we choose to compare to or will we look at all of them? Do we have the sample sizes to allow for the statistical power for so many tests? How will we look at the combination of biomarkers? A total score? Will it be additive or something more complicated? For example, we could prioritize some biomarkers over others. 
These are good questions to ask an informatics expert about. However we are getting to a more testable question. Now let’s really think about what data we would need and what it would look like. This brings us to step 6, where we create a mock-up of the data. 3.1.7 STEP 6 - Create a mock-up of your data Creating a mock-up of the data can make you ask yourself more questions about what you are asking and what you need to answer that question. Would it be that we have blood results for these biomarkers for patients where we know (based on surgical pathology) if they have cancer? What would these blood results look like? Would it be absolute expression levels of mRNA or protein? Do we have a threshold of elevated expression that we can use? Will we assign samples as yes or no in terms of meeting this threshold or will we use an absolute quantity or relative percentage over this threshold? Actually creating a mock-up of what the data might look like can reveal other important aspects that you may not yet have thought about. Thus here is the result of step 6. 3.1.8 STEP 7 - Think about statistical tests Step 7 is then to think about what statistical tests you might perform. Could we use a t-test to compare the scores among the patient groups? Would we want to account for other factors like the patient’s age or gender? Would another test be better? 3.1.9 STEP 8 - Think about interpretation Step 8 is then to think about what this would mean. What would it mean if our results showed a difference in score between the groups? What can we interpret? Do we want to be able to predict patient status? This may involve moving back a step or two. Remember that working with your research teammates can help you to come up with a better research plan before you start collecting data. By involving experts from different domains you can make the most out of your research efforts.
We would also suggest that you work with your informatics experts to come up with a biological research question (or set of questions) and a more technical question (or again set of questions) for each project. This can be a good strategy to ensure that everyone in your team gets authorship and that your team is as productive as possible. For this example, your informatics employees or students might write a paper using simulated data or publicly available data to look at methods for creating biomarker scores. Their studies could better inform you about how to think about testing the utility of colorectal biomarkers for diagnosis purposes. 3.2 Informatics project pitfalls One common misconception is that informatics research projects work out more often or are faster than wet bench experimental research projects. This is, however, not necessarily true: informatics projects are just as likely to fail and often take more time than one might expect. However, one advantage of having an informatics team member on a project is that there is ample free data available to add to, shift, or reframe a study if necessary. This is important to keep in mind when advising your mentees and guiding the planning of their projects. Common reasons why an informatics project might fail: The goals were too vague (see the previous section about identifying good informatics questions). Sadly this happens quite often and it can easily lead informatics employees and mentees down the wrong path. The data is not of high enough quality or lacks consistency. This may be due to a faulty method, methodological differences between lab personnel, expired reagents, temperature differences on data collection days, or aging of a machine over time etc. Some of these issues can be avoided or reduced, while others are unavoidable. Do not be quick to blame your experimental research team members if the data does not look like you expect. Some variation in data is just a part of life.
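To make steps 6 and 7 more concrete, here is a minimal Python sketch of what a mock-up and a first comparison could look like. Everything in it is a hypothetical assumption made for illustration only: the patient rows, the biomarker names (`marker_a`, `marker_b`, `marker_c`), the expression values, and the threshold of 1.5 are invented, and the additive score is just one of the scoring options raised in step 5. It computes only Welch's t statistic; a real analysis would also need a p-value, adjustment for factors such as age, and advice from your informatics teammates.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical mock-up: one row per patient. All values are invented;
# they are not real measurements.
THRESHOLD = 1.5  # assumed cut-off for 'elevated' relative expression
MARKERS = ['marker_a', 'marker_b', 'marker_c']  # hypothetical biomarker names

mock_patients = [
    {'id': 'P01', 'cancer': True, 'marker_a': 2.1, 'marker_b': 1.9, 'marker_c': 1.2},
    {'id': 'P02', 'cancer': True, 'marker_a': 1.8, 'marker_b': 2.4, 'marker_c': 1.7},
    {'id': 'P03', 'cancer': True, 'marker_a': 1.6, 'marker_b': 1.1, 'marker_c': 2.0},
    {'id': 'P04', 'cancer': False, 'marker_a': 0.9, 'marker_b': 1.6, 'marker_c': 1.0},
    {'id': 'P05', 'cancer': False, 'marker_a': 1.1, 'marker_b': 0.8, 'marker_c': 1.3},
    {'id': 'P06', 'cancer': False, 'marker_a': 1.7, 'marker_b': 1.0, 'marker_c': 0.7},
]


def additive_score(patient):
    # Count how many biomarkers exceed the threshold (a simple additive score).
    return sum(1 for marker in MARKERS if patient[marker] > THRESHOLD)


def welch_t(group_1, group_2):
    # Welch's t statistic for two independent groups (unequal variances allowed).
    standard_error = sqrt(
        variance(group_1) / len(group_1) + variance(group_2) / len(group_2)
    )
    return (mean(group_1) - mean(group_2)) / standard_error


cancer_scores = [additive_score(p) for p in mock_patients if p['cancer']]
control_scores = [additive_score(p) for p in mock_patients if not p['cancer']]
t_stat = welch_t(cancer_scores, control_scores)

print('cancer scores:', cancer_scores)
print('control scores:', control_scores)
print('Welch t statistic:', round(t_stat, 2))
```

Even a toy table like this raises the questions from step 6: whether a yes/no threshold discards useful information, whether the controls are appropriate, and whether a score with so few possible values is suitable for a t-test at all.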
There is not a strong enough signal in the data to detect the effect of interest with the current data/methods. This is also a very common problem if you are not sure what the strength of the effect you are looking for might be (which is often the case in Biology). In this case you need more data or perhaps methods with greater granularity. The method of data collection becomes obsolete. This may not make the project fail per se, but it can make publication difficult. Staying on top of what methods are currently being used can help to avoid this. The signal does not exist. Sometimes our hypotheses are just wrong. 3.3 Informatics project pitfall mitigation methods This section is in part based on a book (Robinson and Nolis (2020)) by Emily Robinson and Jacqueline Nolis. We can mitigate some of these project weak points. (You may notice how some of these have been discussed previously.) However some of these are a bit unavoidable and it is best to have realistic expectations and flexibility about backup project ideas. Ways to mitigate project failure: Discuss with experts Discuss with trusted experts across all necessary domains about your informatics hypotheses to make sure they are feasible with the data you have or will generate before you get too far down the research path. Ask for their help to make sure that your scientific questions are not too vague. Do this as early as possible. Diversify projects It is a good idea to diversify your mentees’ and employees’ projects to enable them to have exposure to different projects, as well as more opportunities to contribute to a project that will ultimately result in a product such as an academic paper or a new software package. Safe project planning Make sure mentees and employees have at least one very solid project. For example, assign a review article, a simple software package, or a project with very promising pilot data. Co-authorship Allow lab members (especially mentees) to work together on projects. 
Assign one mentee or employee as the main personnel, but allow other team members to contribute in small ways to allow them to at least get co-authorship, just in case their main projects fail. Plan for ample time Plan for projects to have adequate time to account for setbacks. For example, if possible plan on the possibility that additional data may need to be collected or perhaps more data will need to be added from a data resource. It will take additional time to analyze the new data. Unfortunately, simply plugging in new data to an existing script hardly ever works. Instead the following tasks are required: Check the quality of the new data Reformat/wrangle the new data to match that of the existing data Evaluate how different the new data is from the old data - are they similar enough to be included in a larger analysis or does this require two analyses? Perform the analysis on the new data Adjust and reframe When a project appears to fail because the data turns out to not be adequate for answering your original question, reframe the project to answer a question that the data actually can answer. For example, if the goal of a project was to look for differential gene expression of a single gene and no significant difference is found, consider evaluating the gene expression of a pathway or network of genes that are involved in the same biological process. It is best to be transparent about your scientific process in your publications. Get new data In the worst case that the data does not appear to work for your initial goal and reframing the question does not seem possible, look for new data. Now there are many data resources available online. We have curated a list of cancer research related data with the help of the National Cancer Institute (NCI) Informatics Technology for Cancer Research (ITCR) faculty. These are also good resources for finding cancer related data: - The cBioPortal - This article Keep in mind that using new data takes time. 
Using an existing script on new data rarely works because data can be formatted differently and have other intrinsic differences. This must first be evaluated to know how to proceed. The same tasks described above under “Plan for ample time” are required. Overall we will summarize our suggestions for avoiding project pitfalls. 3.4 Conclusion We hope that an awareness of how informatics projects can fail, and keeping these mitigation strategies in mind when you are planning your projects, will help you to be more successful with your informatics research endeavors! In conclusion, here are some of the take-home messages: Follow the outlined steps for forming good informatics questions. Especially remember to make a mock-up of what your data might look like for a project. Remember that there are several sources of project pitfalls, some of which are unavoidable at times; however, discussing your plan early with other experts, planning for extra time, and diversifying projects can help. References "],["informatics-relationships.html", "Chapter 4 Informatics relationships 4.1 Cultivating good multidisciplinary lab relationships 4.2 Collaborating with informatics experts 4.3 Employing informatics experts 4.4 Mentoring informatics students 4.5 Conclusion", " Chapter 4 Informatics relationships 4.1 Cultivating good multidisciplinary lab relationships Now that we know a bit more about general practices for maintaining successful multidisciplinary teams and projects, we are going to take a deeper look at how to best support the relationships that we might have in our team. We will also discuss the pros and cons of each type of relationship to better guide you about decisions regarding building your team. 4.2 Collaborating with informatics experts Studies investigating biology research labs over history indicate that collaboration has been on the rise since the 1950s (Vermeulen, Parker, and Penders 2013) and that the rate continues to increase (Sonnenwald 2007).
Indeed the size of biology research teams appears to have doubled from 1955 to 1990 (Vermeulen, Parker, and Penders 2013). But why? 4.2.1 The benefits of collaboration Shared cost Research often involves expensive technology, thus it is cost effective to share resources. Shared expertise Now that technology makes it possible, in some cases, to answer more complex or broader research questions, it is often more effective to employ multiple contributors with different knowledge, skills, and perspectives. Researchers have noted that their own concept of their field changed as a result of working with investigators from other disciplines. Thus this can lead to innovation (Mäkinen, Evans, and McFarland 2020). Shared burden Doing part of the work for a project using the knowledge and skills that you are most comfortable with and seeking help from others who are more knowledgeable on other research aspects can be a more efficient strategy. Shared reliability Including multiple team members who can each evaluate the research can improve the reliability of a project, as mistakes can be found by other members. Shared credibility Collaborations involving experts of multiple areas can improve the perceived credibility of the work by others. 4.2.2 Potential challenges There are always challenges when collaborating with others, but some of these are particularly enhanced in multi-disciplinary teams. Here are some challenges that you may encounter when a collaboration involves informatics experts. Bad collaboration: Communication Differences Extra care needs to be taken to ensure that communication across groups is effective. Typically researchers will not meet as often with a collaborator as they would with an internal team member. Therefore, poor communication in a collaboration can lead to more costly misdirection and thus wasted time and effort. Furthermore, as investigators often have different backgrounds, differences in jargon and language can make communication more challenging.
Having internal team members with some familiarity with informatics can be very beneficial for translating discussions with collaborators who are informatics experts. One solution to this is to have trainees work in both labs. This can be especially beneficial for the trainee who will become accustomed to two research styles and will learn a diverse set of skills. This allows the trainee to potentially have their own multi-disciplinary lab in the future (Mäkinen, Evans, and McFarland 2020). Another important method that can help resolve this issue is to have members provide educational seminars for participating members about the fundamentals of their work. Different research style and goals Beyond differences in language, differences in research style and goals can lead to conflict. “Scholars’ different styles of thought, standards, research traditions, techniques, and languages can be difficult to translate across disciplinary domains” (Mäkinen, Evans, and McFarland 2020). Making clear research standards and goals, as well as outlining clear specific tasks at the beginning of a project can help to avoid this issue. Furthermore, meeting consistently throughout the duration of a project can also help to make sure that standards are maintained. Additionally, these meetings should include discussions about intellectual property, authorship, leadership, and defining what success looks like to each of the various members. Defining these details early can avoid major conflict later. Furthermore, it is critical to keep in mind the diversity of career goals of research team members, as junior team members may have a challenging time persuading others of their independence and contributions when they work on largely collaborative projects. It is also necessary to ensure that junior members have time to devote to their own research programs (Sonnenwald 2007). Support should be provided for these junior collaborators by more senior collaborators.
Different capabilities Research on multi-disciplinary collaborations has revealed that when collaborating members are unclear about how their expertise and work contribute to the project, they are less motivated and feel less valued. Working with members of different backgrounds to determine how their expertise can contribute to the project, as opposed to simply assigning them a task, will not only help with morale, but it can also better define how a collaborator can further contribute to a project in ways that you may not already expect (Mäkinen, Evans, and McFarland 2020). Reduced sense of responsibility Another concern of collaboration is that team members may feel less responsibility or commitment to a project than for a project within their own lab. Defining tasks and expected due dates can help reduce this issue. Discussions to establish due dates should always include team members with expertise in each area of science, as tasks may not take the amount of time that another researcher would expect. It is a common misconception that informatics tasks take less time than they actually do. Research is dynamic Research always has an element of trial and error. Protocols may change and new scientific questions may emerge. Frequent meetings with all group members to understand the dynamics of the project are critical. Furthermore, flexibility and understanding are required. It should be expected that aspects of the project will change. Different levels of resources Particularly when collaborating with community members, community colleges, and institutions that are “Equity-oriented” and serving populations that have historically been marginalized or “minoritized” (Blake 2017), it is important to keep in mind that large differences in resources may exist between collaborating members.
Sharing and discussing budget information early and often can help research members to understand what expectations are reasonable and how collaboration partners may best assist one another. It is also important to recognize that: “There is a common misconception that the lack of physical experimentation and laboratory supplies makes computational work automated, quick, and inexpensive.” (Way et al. 2021). However: “In reality, even for well-established data types, analysis can often take as much or more time and effort to perform as generating the data at the bench. Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project” (Way et al. 2021). Don’t forget to provide some budget for your informatics collaborators, as their time ultimately does cost money and there may be computational costs that you may not be aware of. 4.3 Employing informatics experts In contrast to collaborating with informatics experts, in some cases it may be beneficial to directly employ them on your team. There are again pros and cons for this strategy. By directly employing informatics experts, rather than collaborating with an expert, research leaders will be able to meet with these experts more often. Research leaders may also have more sway in terms of guiding the direction of the experts’ work. Leaders can also potentially grow the informatics part of their research program more readily, leading to even more flexibility in the research questions that they may be able to assess. However, direct employment of informatics experts requires all of the typical responsibilities and costs of employing another lab member. It also requires the additional resource requirements for the informatics work of the particular expert.
In addition, it is useful to become familiar with best practices for ethics, reliability, and reproducibility in computational work. This requires some different tactics than those of experiment-based research (often called “wet lab” research). Although it is also useful for informatics experts to keep track of the work that they have performed in general, similar to maintaining notes about experimental research with a lab notebook, a much deeper level of detail can be tracked and maintained for computational work. What we mean by this is that the actual code and data used in their work can be saved over time. This can be invaluable for research reproducibility. Thus research leaders are advised to become familiar with best practices for data sharing and data management so that they can most effectively manage their informatics employees. This is also discussed in more detail later in the course. One other important thing to remember is that informatics work is often best performed with long stretches of uninterrupted time. This will be true for your informatics employees and mentees. Again we suggest that you work with your informatics teammates when you schedule lab meetings and be mindful of their other time commitments. Try to support them in scheduling several hours of uninterrupted time a day if possible. As a reminder, unless you employ a senior informatics expert (and even then), it is advisable that you encourage these employees to build supportive relationships with other informatics experts, particularly if they are working in a new domain. 4.4 Mentoring informatics students Mentorship is a particularly unique relational experience. While traditional mentorship has been defined by the hierarchical structure of a single mentor who teaches subordinate mentees, new styles have emerged that are not as constrained or limited as the traditional paradigm.
At its optimum, mentors and mentees should learn from each other and together and expand what each can do alone. Importantly the more traditional paradigm that does not value “reciprocal learning” as highly, has been shown to be less effective for a larger diversity of students (Mullen and Klimaitis 2021). For research groups that are newer to informatics, some of these less traditional paradigms may be especially useful; we will focus on a few here. 4.4.1 Co-mentoring/collaborative/team mentoring As we described earlier, co-mentoring or collaborative mentoring of students by multiple mentors with different backgrounds can be particularly beneficial to the student and also to the partnering labs. In the case of collaborative mentoring where a mentee is mentored by two research experts in two different labs, this provides an opportunity not only to strengthen a collaboration, but also for students to gain more diverse knowledge, and to in turn provide more of the expertise that they gain back to both labs. Co-mentoring could also occur within the same lab by a research leader and an informatics expert. This could also work well in a multilevel paradigm, where an informatics expert may guide informatics related aspects of research, while an overarching research adviser may guide the student’s overall research mentorship experience. 4.4.2 Peer mentoring Peer mentoring also provides great opportunities to expand students’ expertise and skills without as many time constraints for the research leaders of a lab, particularly for skills that may be new to lab leadership. Furthermore, such paradigms are helpful for improving students’ teaching skills, collaboration skills, self-reliance, and self-confidence. Teaching a peer is often useful for students to identify gaps in their own knowledge and assists in their quest to “learn how to learn” (Mullen and Klimaitis 2021).
Furthermore, such paradigms appear to be especially beneficial to students of historically marginalized populations (Mullen and Klimaitis 2021). However, there are challenges for research leaders from a management standpoint. Mentors should be mindful of any conflicts that may arise between students. These can often be avoided with clear and distinct goals and projects for students, to avoid making students feel like they are competing with one another. Additionally, we highly recommend establishing a code of conduct for the lab, so that students and staff members are clear about what behavior is expected. 4.4.3 Electronic mentoring With the COVID-19 pandemic, the transition to using electronic means of contact with students and staff for research has expanded on an unprecedented scale. It is unclear currently how much this will continue in the future. However, research prior to the pandemic has shown some surprising benefits of providing mentorship through electronic means. Importantly, it appears that this eases burdens for students who are balancing course work, as it often provides more scheduling flexibility. Additionally, such mentorship is particularly helpful for historically marginalized populations who may face more hostility by going to research institutes with face-to-face interaction with others or may have additional scheduling conflicts. Even as we may return to more on-site research labs, additional availability of mentors to mentees using electronic means of contact is likely to be beneficial. Technology such as Slack can be especially useful for allowing lab members to interact with one another. We will cover more about this soon. 4.4.4 Career goals The job landscape for scientists has changed in recent decades with more opportunities outside academia in industry and government. Furthermore career goals for informatics mentees can be very different than that of other research mentees.
By having informatics expertise, these trainees have additional career opportunities. Becoming aware of these opportunities yourself, as a research leader, is therefore critical for cultivating your mentees’ awareness of the diversity of opportunities available to them. This will ultimately allow your mentees to choose the career path that suits them best. 4.4.4.1 Career paths for informatics mentees Academia - Your informatics mentees may have career opportunities as principal investigators, scientists, or educators just like other cancer biology mentees. In addition to opportunities as educators for informatics and biology, they will also have opportunities for data science. Government - Your informatics mentees may have career opportunities as scientists or policy makers for research institutes just like other cancer biology mentees. However, additional agencies and institutes may have a need for their data science skills on topics outside of biology. For example your mentee may have the skills to work for a city police department. Industry - Beyond the potential career options in the pharmaceutical industry, biotech, and medicine, your informatics mentees will have data science skills that may qualify them for jobs in a variety of industries. For example your informatics mentees could find jobs at companies such as Stitch Fix or Ancestry which use methods in machine learning and bioinformatics for their products. Additionally, your mentee may also have opportunities to join a software company as a computer programmer or even as a programming educator at a company like RStudio. Nonprofit - Beyond research and management positions at nonprofits performing scientific or clinical research, informatics mentees may have opportunities at other nonprofits with other types of goals. For example, your mentee might find work at a nonprofit that advocates for civil rights and investigates social interactions on social media platforms. 
4.4.4.2 Career paths outside of academia If your mentee is more interested in a career path outside of academia we suggest you read up about industry perspectives on useful skills and knowledge, so that you are better prepared to guide your mentees to get exposure and experience to the data science domains or aspects that would be most helpful to them. According to Brandon Rohrer, a data scientists who formerly worked at Facebook and now works at iRobot, there are 4 major categories of knowledge and skills for data science: Data Analysis - domain knowledge, research skills, and interpretation skills Data Modeling - machine learning application skills and algorithm development skills Data Engineering - data management skills, skills to make code production-level ready (ex. automation), and software engineering Data Mechanics - data formatting and cleaning and data handling (filtering, subsetting) Based on these categories, he says that there are also 3 major archetypes (with subdivisions) for data scientists: Beginner data scientists These are individuals just starting out but who are familiar with each of the 4 above areas. This is ideally how your mentee will be after training (at a minimum) if their goal is to pursue a data science career. This will allow them to identify their strengths. Generalist These are individuals who are proficient in all areas. This is helpful for becoming a data science manager or executive. Focus on all 4 aspects of data science is also good for mentees who wish to stay in research! 
Specialists There are 3 major subtype specialties: Detective - strong skills in data analysis and mechanics and exposure in all 4 areas - may be especially useful for working at nonprofits Oracle - strong skills in modeling and mechanics - this is great for working at companies using machine learning Maker - strong skills in mechanics and engineering - this would make your mentee valuable for working in any of the nonacademic fields as well as academia Check out this video for more details: From another perspective, the major skill sets to focus on according to the “Build a Career in Data Science” book by Emily Robinson and Jacqueline Nolis (Robinson and Nolis 2020) are: Statistics Machine learning Programming (Python and R) Projects - hands-on experience Promoting your mentee’s exposure to each of these domains can only help them further pursue a career in data science. We also suggest that your mentees check out this blog post on surviving data science interviews, also by Brandon Rohrer. We think this could be helpful for mentees pursuing any data science career path; however, it is especially useful for those interested in industry. 4.4.4.3 Academia for informatics mentees So how do we best support informatics students that want to stay in academia? Recognize that academic promotion for informatics experts is currently not very accommodating. Some aspects of traditional academic promotion are simply not set up to accommodate informatics experts. For example, these individuals tend to publish many more secondary author publications and create software, which can reduce the time they have available for first author publications. However, first (or last) author papers are still most often used as the major guideline for academic achievement. How can you and your mentees overcome this? Be sure to especially advocate for your student to get first authorship papers if they intend to stay in academia. 
Wherever possible, also try to advocate for more nuanced academic promotion policies that account for multidisciplinary differences at your institution. Encourage your student to seek specialized technical skill sets For your students who wish to stay in academia, it may be less important that they become as generally familiar with a wide variety of data science skills and practices as students interested in a career outside of academia, if they are, for example, focusing on a specific statistical method. Just like experts in other academic fields, informatics experts will become experts in niche subject areas. Encourage these students to go to targeted conferences to build a network in their field of interest (although, if possible, we still encourage you to give your students well-rounded exposure to different types of conferences). Also especially encourage these students to learn about grantsmanship just as you would with your other academic mentees. However, this can also be useful for students interested in working for a nonprofit or for the government. See Way et al. (2021) and Waller (2018) for a more in-depth discussion and suggestions on how we can work to reform academic promotion practices to be more mindful of disciplinary differences for informatics experts. 4.4.4.4 Authorship considerations In addition to typical biological papers, there are also other companion types of papers that your informatics mentee can publish, including: Data resource papers - your mentee may publish an article introducing a data resource Software papers - an article where the functionality and development of a piece of software is discussed Method comparison papers - your mentee may compare existing methods New method or pipeline papers - your mentee could describe a new method that they created These other types of papers are especially good to keep in mind if your mentee will not be first author on a biological paper from your lab. 
When guiding your mentee through the publication process, it is a good idea to keep in mind their career goals as you prioritize different paper ideas. For example, a mentee that is interested in pursuing a data engineering career may benefit more from a software paper, while a mentee interested in staying in academia would benefit more from a new method paper or a biological paper if possible. Most importantly, remember that informatics mentees need first author publications too! 4.5 Conclusion We hope that these tips help you to mentor and lead your team in a more productive and effective way that benefits both your team members and your lab’s mission! In conclusion, here are some of the take-home messages: There are many challenges associated with collaboration; however, early and regular communication can help. It is also helpful to outline expectations early on. Becoming familiar with best practices for ethics and reproducibility is helpful for employing a computational biologist, especially if you are new to computational biology. Consider learning a bit of code to better understand what your computational employees are doing if you yourself are not already familiar. Remember that mentoring is a reciprocal process. There are many alternative strategies for mentoring that you may find helpful. Remember to discuss your mentees’ career goals with them. The way you advise them should be driven by their interests. Organize projects so that your team can produce biological and computational manuscripts. References "],["promoting-diversity-equity-and-inclusion.html", "Chapter 5 Promoting diversity, equity, and inclusion 5.1 Diversity is beneficial 5.2 Underrepresentation in cancer informatics 5.3 Examples of contributions by individuals of underrepresented groups 5.4 Underrepresentation in clinical trials 5.5 Strategies to promote more equitable inclusion in clinical trials 5.6 What does it take for clinical research to be ethical? 
5.7 Research practices to reduce cancer health disparities 5.8 Ways to better support a more diverse research team 5.9 Conclusion", " Chapter 5 Promoting diversity, equity, and inclusion 5.1 Diversity is beneficial Beyond the critical importance of giving everyone more fair opportunities (which cannot be overstated), there are many additional crucial reasons why diversity is particularly important for science and health research. The inclusion of diverse research team members may promote more inclusive research questions and practices to help the populations that need better health care the most. This is especially important as many of the historically marginalized racial and ethnic populations of the United States are growing. By 2045 it is expected that these groups will make up more than 50% of the population (Clark et al. 2019). “Racial/ethnic minority groups in the United States are at disproportionate risk of being uninsured, lacking access to care, and experiencing worse health outcomes from preventable and treatable conditions” (Jackson and Gracia 2014). “…Compared with the general population, racial/ethnic minority populations have poorer health outcomes from preventable and treatable diseases, such as cardiovascular disease, cancer, asthma, and human immunodeficiency virus/acquired immunodeficiency syndrome than those in the majority” (Jackson and Gracia 2014). “… the social environment in which people live, learn, work, and play contributes to disparities and is among the most important determinants of health throughout the course of life” (Jackson and Gracia 2014). More diverse research teams may be more aware of the cultural differences and social determinants that may influence the health of the people that the research could serve. Such consideration could further increase the impact of the research. The inclusion of diverse populations in scientific teams has also been shown to improve innovation (Hofstra et al. 2020). 
“Scholars from underrepresented groups have origins, concerns, and experiences that differ from groups traditionally represented, and their inclusion in academe diversifies scholarly perspectives. In fact, historically underrepresented groups often draw relations between ideas and concepts that have been traditionally missed or ignored” (Hofstra et al. 2020). 5.2 Underrepresentation in cancer informatics Despite the benefits of diversity in research teams, analyses of scientific article authorship indicate that women are underrepresented in computational biology (Bonham and Stefan 2017) and biomedical engineering (Aguilar et al. 2019). Furthermore, analyses of university faculty and students demonstrate that both women and historically marginalized populations (such as certain racial and ethnic groups, disabled people, and LGBTQ+ individuals) remain underrepresented in science, technology, engineering, and mathematics (STEM) fields in the US and in Europe (Hofstra et al. 2020; Chaudhary and Berhe 2020; Kricorian et al. 2020; Cech and Waidzunas 2021; Iporac 2020; Williams and Shipley 2018). Although Black people and people of Hispanic or Latin American origin or ancestry made up 27% of the workforce in the US in 2016, together they only made up 16% of the STEM workforce. People of tribes and nations that are often called Native American, Indigenous, or American Indian are also underrepresented in STEM (Williams and Shipley 2018). Although they made up 1.7% of the population in 2016 in the United States, they only represented 0.6% of individuals with bachelor’s degrees in science and engineering and only 0.2% of individuals with doctoral degrees in these fields (Williams and Shipley 2018). Furthermore, all of these groups are particularly underrepresented in academia at the faculty level (Vargas, Saetermoe, and Chavira 2021). 
Importantly, although Asian Americans are generally considered well represented in STEM, this is only true for some fields and for those of ancestry or origin from certain Asian countries, such as Japan, China, Korea, and India, while those from other countries such as Cambodia, Laos, and others are often underrepresented. Indeed, the term Asian American encompasses a very diverse group of people with diverse experiences and socioeconomic statuses (CARE, n.d.; nsf.gov n.d.; G. A. Chen and Buell 2018; Iporac 2020). Furthermore, there are issues related to fair support and treatment for promotion and advancement for many marginalized groups, as well as for Asian Americans who are considered well represented (Funk and Parker 2018). “This ‘other-ness’ exists intentionally or unintentionally between those of a minority and those of a majority from lacking of common cultural background. Relationships at work appear polite on surface but reluctant tendency in willing to share limited opportunities the same way, which I felt in a previous job where whites and males were overwhelmingly a majority.” – Asian woman, engineer, 56 (Funk and Parker 2018). “There are not many people of my race in [my] industry. It requires me to go the extra mile to fit in or be accepted because many of the employees don’t share my background or life experiences. I can do the job just fine, however, there are other factors of one’s life that are considered whenever they are in a critical and highly competitive environment.” – Black man, systems administrator, 30 (Funk and Parker 2018). The reason for this underrepresentation is that people have historically not been allowed to participate, been discouraged from participating due to discrimination, had a lack of visible role models of their background, and have not had the same access to education. 
(Funk and Parker 2018) Part of the reason that role models were not as visible as they could be is that even when individuals of underrepresented groups overcame these adversities and made contributions to science, their contributions were often covered up or not fairly credited (Magazine and Dominus n.d.). 5.3 Examples of contributions by individuals of underrepresented groups Despite very often facing discrimination and less support, many individuals of these underrepresented groups have greatly contributed to scientific achievement, which demonstrates the potential innovation that could be achieved when research teams are more diverse. Many Black scientists are responsible for great human achievements that changed the world including: Katherine Johnson (1918–2020): Katherine Johnson was a mathematician who helped enable humans to venture into space. Katherine worked for NASA in the 1960s during the Space Race. A depiction of her work and some of the discrimination that she faced while working for NASA is featured in the film Hidden Figures. Vivien Thomas (1910–1985): Vivien was a pioneer in heart surgery at Johns Hopkins in the 1940s when it was considered taboo. He is credited with assisting in creating a life-saving surgical operation that improved the oxygenation of children with congenital heart defects. This operation was pivotal in paving the way for other heart surgical procedures. Sadly, it took more than 25 years for him to be credited for this work. See this article for more information about Katherine Johnson and other pioneering and world-changing Black female scientists. Also check out this list of inspiring Black scientists today. Scientists and mathematicians of Latin American or Hispanic origin have also greatly contributed and continue to contribute to scientific innovation: Mario Molina (1943–2020): Mario was born in Mexico and came to the United States for graduate school. 
He was a chemist and environmental scientist and was awarded the Nobel Prize in chemistry in 1995 for his work in discovering the environmental impact of chlorofluorocarbon (CFC) gases on the Earth’s ozone layer. This work was critical in changing policies (starting in the 1980s) to protect our environment from these chemicals, which were widely used as aerosols, solvents, and refrigerants. Through these policies, many countries around the world have banned or greatly reduced the manufacturing of these chemicals. NASA projections suggest that the ozone layer would have been largely depleted by 2060 if these chemicals were manufactured at previous historical rates. See here for more information on what would have likely happened to the world if CFCs were not banned. Also check out this podcast episode for more information. Ynes Mexia (1870–1938): Ynes began her career as a botanist in her fifties in the 1920s. She traveled widely (often by herself) collecting and characterizing plants. She unfortunately passed away at 68, cutting her botany career short to less than 20 years. However, in that time she collected over 145,000 specimens and made numerous discoveries. Her legacy in botany had such an impact that the Mexianthus genus is named after her. She is also highly regarded for her efforts in environmental conservation, particularly for helping to preserve the redwood forests in the United States. This adventurous woman has been quoted as saying: “I don’t think there’s any place in the world where a woman can’t venture.” See this video for more information about her inspiring life. Also check out this list for additional historic scientists and this list for some current inspiring scientists of Hispanic or Latin American ancestry or origin. Individuals from the various indigenous or native tribes of the United States have also contributed greatly to scientific understanding and innovation. 
See here for a list of some, in addition to: Fred Begay/Fred Young/Clever Fox (1932–2013): Fred was a nuclear physicist whose work was instrumental in innovating alternative energy sources at the Los Alamos National Laboratory. He was born on the Ute Mountain Reservation. His parents were from the Navajo or Diné and Ute tribes. His life and work are featured in a documentary called The Long Walk of Fred Young. Fred spent a great deal of time on outreach and education, particularly for youths of the Navajo Nation. He is also known for making connections between Navajo beliefs and science. He’s been quoted as saying: “I think the key point is that I learned to think abstractly and develop reasoning skills when I was growing up, learning about lasers and radiation in the Navajo language… That’s all embedded in our religion” (“Fred Begay PhysicsCentral” n.d.). “We strongly rely on natural phenomena. We believe we’re children of nature” (“Fred Begay PhysicsCentral” n.d.). “The Navajo has mysterious ideas about science which cannot be interpreted into English” (“Fred Begay PhysicsCentral” n.d.). See here to learn more about Fred. Floy Agnes Lee (called Aggie) (1922–2018): Floy was a biologist who grew up in New Mexico, and her father was from the Santa Clara Pueblo or Kha’po Owingeh. Her work at the Los Alamos National Laboratory, Argonne National Laboratory, the University of Chicago, and Jet Propulsion Laboratory helped expand our understanding of the effects of radiation on living cells. You can see an interview of her at this link and see a written translation of this interview here. Edna Paisano (1948–2014): Edna was a demographer and statistician who was born on the Nez Perce Reservation in Idaho. Her family was from the Nez Perce tribe and the Laguna Pueblo. 
While working at the United States Census Bureau in the 1980s, she discovered that many Native, indigenous, Indian or tribal communities were being undercounted and that this was reducing the amount of government resources and services potentially available to these communities. She was instrumental in improving the counting of individuals from various tribes in the United States Census by initiating a public information campaign. She also published several books throughout her career. See here for more information about Edna. Also, check out this list for information about current native or indigenous people in STEM. There are also numerous LGBTQ+ scientists who changed the world. See here and here for a review of several including: Alan L. Hart (1890–1962): Alan was a clinician and researcher whose work largely focused on tuberculosis. Alan was assigned female at birth and later transitioned to be male in adulthood. Despite facing major obstacles due to discrimination, Alan had a very successful career as an expert in radiology for tuberculosis and served as the director of hospitalization and rehabilitation at the Connecticut State Tuberculosis Commission. Alan also wrote several novels. See here to learn more about Alan’s life. In addition, this list includes many LGBTQ+ people currently working in STEM. These lists are neither complete nor fully inclusive, but shed some light on the amazing achievements of some individuals in science who were or are among groups that remain underrepresented in STEM. Their unique life experiences, perspectives, intellect, and talent helped shape their work, which has greatly improved our world. Again, this demonstrates how potentially even greater innovation and scientific advancement could be achieved when including more diverse individuals in our research teams. 
5.4 Underrepresentation in clinical trials Beyond underrepresentation of certain populations within research teams, there is also a lack of representation of various populations in clinical trials (in particular women and people of specific racial and ethnic groups) (Clark et al. 2019). This is particularly true for cancer clinical trial studies (Nazha et al. 2019). Furthermore, there is also limited data collected about LGBTQ+ individuals (B. Chen et al. 2019). Importantly, this underrepresentation is not due to certain populations being hard to reach, but instead to a lack of practices that enable and encourage individuals from more diverse groups to participate. According to an FDA study in 2018, Black people represented 13.4% of the population, yet only 5% of clinical trial participants, while Hispanics and people of Latin American ancestry made up 18.1% of the population at the time, but less than 1% of clinical trial participants (Coakley et al. 2012). Furthermore, when looking at oncology trials, whites (80% of participants) and males (59.8% of participants) continue to be overrepresented, according to a study in 2019 (Nazha et al. 2019). This lack of diversity in clinical trials is a problem because it results in less understanding about how particular patients will fare with a given treatment/vaccine/diagnosis etc. This lack of knowledge can result in furthering inequities in medical care (Clark et al. 2019). Demographic studies of the United States demonstrate that racial and ethnic populations that have been historically underrepresented in clinical trials and in STEM have grown and are predicted to make up 50% of the population by 2045 (Clark et al. 2019). Thus, for medical care to best serve the current and future population of the United States, trials especially need to be more representative of the diverse groups that make up our country. 
Research suggests the following reasons for the lack of diversity in clinical trials: 5.4.1 Historical injustice According to studies, marginalized groups such as Native Americans, people of Hispanic or Latin American ancestry, and Black Americans understandably distrust researchers at higher rates than white individuals due to historical injustices and inadequate medical care (Griffith et al. 2020; Mastroianni, Faden, and Federman 1999). “African Americans’ suspicions and fears that many sectors of American society are not trustworthy were logical and accurate interpretations of their perceptions and experiences” (Griffith et al. 2020). Examples of historical injustices include: Tuskegee syphilis trial: A study in Tuskegee, Alabama about the outcomes of untreated syphilis in Black males (1932-1972) in which the patients were told they were being treated but were in fact not being treated (McVean 2019). “The Unfortunate experiment”: A study in New Zealand (1966-1987) in which women diagnosed with a precursor to cervical cancer were similarly studied without being offered standard treatments. They were not informed of their diagnosis, were not told that they were a part of a trial, and were not given the opportunity to decide if they wanted to be a part of the trial (Evans 2018). Henrietta Lacks and HeLa Cells: In 1951, a patient named Henrietta Lacks was treated for cervical cancer at Johns Hopkins. Her cancerous cells turned out to be uniquely capable of surviving and reproducing and have been used widely in research for decades for many discoveries. Her family did not receive money from the companies that profited from her cells, and for decades her family was often not asked for consent as doctors and scientists revealed her name and medical records publicly (“Henrietta Lacks: Science Must Right a Historical Wrong” 2020). 
“I want scientists to acknowledge that HeLa cells came from an African American woman who was flesh and blood, who had a family and who had a story,” her granddaughter Jeri Lacks-Whye told Nature (“Henrietta Lacks: Science Must Right a Historical Wrong” 2020). Sexually Transmitted Disease (STD) experiments in prisons and Guatemala: This (1946-1948) study of STDs started by infecting prisoners in a US prison in Indiana but later moved to Guatemala. Overall, this study involved infecting many vulnerable populations with STDs, including children, prostitutes, mentally ill patients, prisoners, indigenous people of Guatemala, and soldiers. “…health officials intentionally infected at least 1308 of these people with syphilis, gonorrhea, and chancroid and conducted serology tests on others” (Rodriguez and García 2013). Radioactive iodine thyroid studies of Alaskan natives: In the 1950s, a study was conducted to examine the thyroid gland in Alaskan natives, in which participants were given doses of radioactive iodine that exceeded current recommendations. Beyond the riskiness of the dose of radiation, this study was unethical due to inadequate translation of research methods and consent forms; thus, participants were not able to be properly informed about the risks of the study and therefore not able to provide proper consent. Furthermore, this also led to the inadequate exclusion of participants for pregnancy, lactation, and other conditions (Hodge 2012). Importantly, this is only a small subset of examples. Many other individuals among a variety of populations, such as those who were imprisoned (Rodriguez and García 2013), those who were economically disadvantaged, those with mental disabilities, children, the elderly, those with mental illness (Millum 2012), and those among other groups, have also been abused by unethical research practices (Park and Grayson 2008). 
This pattern of abuse and neglect has in many cases directly impacted individuals and their families or communities and has unsurprisingly made many people wary of participating in clinical research (Griffith et al. 2020). “We, as African Americans were always the pilot test… I know because people have experienced it within my line, my family, my bloodline in [city name]. Yeah.” He continued by saying, “So I just think that we’ve been used and misused a lot of times within the African American community and in the lower parts of the devastation where the devastation lies” (Griffith et al. 2020). “I’m thinking about horror stories like the Tuskegee experiment and things like that. Like stuff that for me, were mentioned back in school and were mentioned by my family members…” (Griffith et al. 2020). 5.4.2 Health care inequity Beyond such examples of extremely unethical studies and practices, marginalized populations such as people of certain racial and ethnic ancestry have also historically received poorer access to health care and poorer quality health care, thus discouraging their engagement with the health care system (Griffith et al. 2020; Mastroianni, Faden, and Federman 1999; FitzGerald and Hurst 2017; Hodge 2012). Many racial and ethnic groups in the US were historically more likely to be uninsured compared to white people, and although coverage rates have improved in more recent history, unequal rates remain today (Damico 2021). Even with similar access to care (which can be reduced by sociodemographic factors), people of marginalized racial and ethnic groups still have a history of receiving inferior health care (Griffith et al. 2020). These two issues among others have led to health disparities, including for cancer: “An analysis of 1 million patients with cancer in the United States showed that blacks or African Americans have a 28% higher cancer-specific mortality compared with whites. 
This survival gap is independent of sociodemographic factors, disease stage, and access to treatments. Indeed, disparities exist along the continuum of care, from screening, access to care, and referral to subspecialty centers, to enrollment in clinical trials that define new approaches to cancer treatment” (Nazha et al. 2019). 5.4.3 Poor recruitment and inclusion Due to concern about the time and effort required to find a more diverse set of participants and also due to implicit bias, many populations are not recruited at the same rate. Implicit bias is an unconscious negative evaluation of a person based on characteristics that are irrelevant (FitzGerald and Hurst 2017). “Based on the available evidence, physicians and nurses manifest implicit biases to a similar degree as the general population. The following characteristics are at issue: race/ethnicity, gender, socio-economic status (SES), age, mental illness, weight, having AIDS, brain injured patients perceived to have contributed to their injury, intravenous drug users, disability, and social circumstances” (FitzGerald and Hurst 2017). Sadly, this can result in researchers not including certain populations in their trials. Indeed, healthcare professionals have been shown to withhold care, such as treatment, based on their own bias about whether a particular individual would adhere to the protocol. Thus marginalized groups often aren’t even recruited for clinical trials (Yates et al. 2020). Studies of attempts to regulate oncology trial participation to be more inclusive show that 48% still did not meet target goals for recruiting underrepresented populations (Yates et al. 2020). 
5.4.4 Inadequate researcher cultural competency and diversity In a study of clinical trial participants of marginalized groups, participants stated that they were more likely to trust a researcher if they seemed scientifically knowledgeable but also had an understanding of the history and context of the study population, as well as experience working with that population (Griffith et al. 2020). “In addition, participants noted that the enthusiasm, commitment, or passion of the researcher to help the population of interest and to study the topic of interest influenced how trustworthy the researcher appeared to be” (Griffith et al. 2020). Participants also stated that they feel more trusting and more willing to participate in trials with a researcher of a similar background. Thus, improving the diversity of research teams and supporting underrepresented investigators may help to recruit more diverse participants for clinical trials (Yates et al. 2020). In addition, cultural competency training and Diversity, Equity, and Inclusion (DEI) training can help! See here for additional cultural competency training resources. 5.4.5 Barriers of access In a 2019 study, access to oncology trials was found to only be available to about 27% of US cancer patients (Nazha et al. 2019). There are many barriers to this access: Language: One major barrier is translation of recruitment materials to other languages. Scheduling: Another barrier can be the time at which participants are needed to be in person for a clinical trial. Some individuals need more accommodation if, for example, they can’t come to appointments during work hours due to having a job with an inflexible work schedule or due to caretaking responsibilities. Transportation & location: If it is more time consuming to get to appointments due to a need for public transportation or because an individual lives in a different area of town, this can inhibit participation and retention of participants. 
Health literacy: Individuals who don’t have the time, education, or other means necessary to learn about the importance of specific health interventions or preventative practices may not be recruited for trials as readily as individuals who are regularly concerned with their health and spend leisure time investigating information about their health (Yates et al. 2020). 5.4.6 Lack of community engagement funding There has historically been a lack of funding to support equitable engagement of community stakeholders. However, community engagement can build trust, reduce stigma, and increase access, recruitment, and retention of more diverse participants for clinical trials. Recent projects in which participants are recruited through community engagement, such as Project Brotherhood (a community-based project that provides Black men in Chicago with health education within their communities), have been shown to help increase health screening participation. 5.4.7 Other aspects of funding Most clinical trials are funded by the pharmaceutical industry, which often leads to pressure for a more homogeneous population to improve the opportunity to see effects in patients without confounding patient-specific factors. However, trials that are funded by the NIH include more diverse participants that are more representative of the true population of patients with cancer in the US (Yates et al. 2020). 5.5 Strategies to promote more equitable inclusion in clinical trials Provide adequate financial, logistical, and time support Investigators should consider how to support caregivers who have important obligations, such as tending to their family members or others. For example, when possible, perhaps home visits would be feasible. Other strategies include providing flexible hours for trial visits and support for transportation, such as payment for a rideshare service or taxi (Yates et al. 2020). 
Researchers should also consider providing patients with a cell phone to improve communication (Clark et al. 2019). Finally, participants may also need extra assistance to cope with healthcare needs following participation in a study, particularly if this interferes with their typical obligations. More promotion of health care access in general Individuals have been shown to be more likely to participate when advised by a health care provider (Clark et al. 2019). Increasing health care system access in general could improve clinical trial participation, by increasing opportunities for health care providers to interact with individuals who could participate in a trial. More promotion of community engagement Community stakeholders can help to assess if recruitment information is culturally appropriate and can help with creating more equitable recruitment strategies. More inclusive research teams As stated earlier, individuals have been shown to be more trusting when research teams include members that have a similar background to themselves (Griffith et al. 2020). More inclusive practices When obtaining information about gender, researchers should collect more informative information such as the model proposed in B. Chen et al. (2019). This involves collecting more information about non-binary individuals, for example providing response options to questions about gender so that an individual who was assigned female at birth, but is now male can adequately provide this information. Similarly, more detailed information about ancestry (for example providing more specific Latin American ancestry options as opposed to just Latin American) could also lead to more informative findings about specific populations. If possible aim for funding for clinical trials from an institute that is supportive of the recruitment of more representative samples. 5.6 What does it take for clinical research to be ethical? 
Here you can see a table of requirements for ethical research trials (Emanuel, Wendler, and Grady 2000). Such consideration can help to ensure that participants are treated with respect and integrity and with their well-being as the top priority. 5.7 Research practices to reduce cancer health disparities According to a recent article (Zavala et al. 2021) about cancer research in the US, the following is suggested to reduce cancer health disparities: Further develop and sustain large diverse cohorts that collect multidimensional/multilevel data. Diversify germ-line and tumor genetics/genomics databases and clinical trials. Develop diverse cell lines and patient-derived xenograft models. Implement system changes in healthcare coverage to guarantee equity in access to high-quality screening and access to treatment. Improvement and system-wide implementation of patient navigation programs. Employ culturally tailored community awareness and education programs to increase cancer screening (including genetics) and modify risk behaviors. Implement legislation that supports behavioral interventions (e.g. limit the sales of tobacco products). 5.8 Ways to better support a more diverse research team In order to best support and encourage mentees and employees of underrepresented groups in cancer informatics, we suggest that lab leaders do the following: 5.8.1 Seek additional training about disparities in informatics and STEM careers Especially focus on hindrances to achievement such as attitudes, biases, and stereotypes. Also, become aware of stereotype threat (also called stereotyped inferiority) - “an internal feeling and concern about confirming a negative stereotype associated with a group (e.g., racial, ethnic, gender, and age) with which the individual identifies” (Stelter, Kupersmidt, and Stump 2021) and how they might influence your mentees. 
Here is a great video of Russell McClain at the University of Maryland that introduces how implicit bias and stereotype threat impact higher education: Note that you may not be aware of all the barriers of achievement that your mentees may face. For example, mentees from low socioeconomic backgrounds, mentees with disabilities, mentees who have immigrated, older mentees, mentees of traditionally underrepresented races and ethnic groups, and mentees with gender identities that are underrepresented face unique and sometimes overlapping challenges. This is not a complete list and it is also important to learn about how intersectionality (the idea that some individuals may represent more than one underrepresented group (ex. female and Black)) results in more nuanced challenges. For example: “When the intersection of race/ethnicity and gender is considered, women of color report even less access to mentorship and support from mentors than other groups” (Davis et al. 2021). Here is a great video of Kimberlé Crenshaw (at UCLA and Columbia) describing the theory of intersectionality, which she developed: Also, become aware of microaggressions - “subtle verbal and nonverbal slights, insults, or invalidating remarks directed at individuals due to their membership in a group (e.g., racial, ethnic, gender, sexual orientation, age, and physical disability), which are rooted in biases about individuals in that group” (Stelter, Kupersmidt, and Stump 2021). See below a list of examples: [Source] Importantly, “mentors for students with disabilities should receive training, as needed, on their mentee’s specific disability and should be made aware of the accommodations that students may need to succeed in activities and courses” (Stelter, Kupersmidt, and Stump 2021). 5.8.2 Acknowledge mentee’s differences Research shows that mentees of underrepresented racial groups would prefer their mentors to directly discuss how to best cultivate their mentee’s career success given their race. 
An attitude of “color-blindness” about race has been shown to hinder the success of mentees (Byars-Winston et al., n.d.; Holoien and Shelton 2012). Talk with your team members individually (be careful not to single out individual team members in front of the rest of the lab) about how they would like to discuss the potential influences of their background/identity on their career growth. “Racial/ethnic differences between mentees and mentors in interracial mentoring relationships can pose cultural barriers to effective mentoring of HU (Historically underrepresented) students and even affect students’ professional and psychosocial success, especially when complex racial/ethnic issues are not effectively handled or addressed…” (Byars-Winston et al., n.d.). “Two ideological perspectives – colorblindness and multiculturalism – have emerged to shed light on this question. Colorblindness downplays the salience and importance of race by focusing on the commonalities people share, such as one’s underlying humanity. In contrast, multiculturalism acknowledges and highlights racial differences” (Holoien and Shelton 2012). “Exposure to colorblind (vs. multicultural) messages predicts negative outcomes among Whites such as greater implicit and explicit racial bias (Richeson & Nussbaum, 2004)” (Holoien and Shelton 2012). “[Underrepresented groups] benefit when others around them endorse multiculturalism (Plaut et al., 2009)” (Holoien and Shelton 2012). 5.8.3 Work to create a safe environment Educate lab mentors about cultural sensitivity and microaggressions. Highlight the importance of collaboration and create a code of conduct for the lab to demonstrate that respect among lab members is expected and required. 5.8.4 Diverse role models Expose all mentees to a diverse range of role models through seminars, journal clubs, and participation in conferences. Computational biology papers with female authors are more likely to have a last author who is also female.
It is unclear if this is because women are more likely to hire other women or if women are more likely to choose a lab with a female adviser (Bonham and Stefan 2017). Indeed, research on women and other underrepresented groups in STEM, including students with disabilities and of certain racial and ethnic groups, suggests that role models of underrepresented populations are particularly important for recruiting and keeping students interested in fields where they may feel like an outsider (Stelter, Kupersmidt, and Stump 2021) due to current underrepresentation. One strategy to encourage students of underrepresented populations is to provide students with exposure to such role models through regular seminars where scientists who represent these populations are prominent (Katz 2007). 5.8.5 Advocate for all mentees Introduce your mentee to other scientists and trainees, particularly those from a diverse range of underrepresented groups. Encourage the participation of your mentees in support programs and groups such as graduate student groups. Help mentees cultivate self-advocacy practices through open discussions and encouragement. 5.8.6 Support a healthy relationship with failure Be a good role model and openly discuss the role of failure in research. For example, you may describe failures in your own career or you may read some of the book Brilliant Blunders by Mario Livio or this article about the book with your mentees. This book describes how scientific advancement actually occurred due to mistakes of some of the most respected scientists. Educating mentees about the Growth Mindset described by Carol Dweck may also be helpful. The major theme of this mindset is an awareness that our abilities are not fixed and that we can change our aptitudes with practice and work. [source] See here for more information. 5.8.7 Celebration and microaffirmations Be sure to celebrate all of your mentees’ small and large successes.
This has been shown to promote confidence and resilience (Stelter, Kupersmidt, and Stump 2021). Be generous in complimenting or pointing out small successes in discussions with your mentees and thank your mentees for performing tasks that assist you or your lab. For larger successes, consider sharing a meal or other social activity with your lab. For virtual or remote lab members this could be playing a game online. Again, aim to do this with all your mentees/lab members. Be mindful about not singling out particular mentees, as this could make such lab members feel even more like they don’t belong. 5.8.8 Give feedback with cultural sensitivity It is important to be aware that your mentee may be struggling with feeling like they don’t belong when you provide feedback (Stelter, Kupersmidt, and Stump 2021; Lee, Dennis, and Campbell 2007). Thus, when given criticism, mentees who especially feel like they don’t belong because of their background differences may feel very discouraged. Try to still be encouraging when delivering criticism by acknowledging what is going well and what is progressing. Also keep in mind that it is important to provide feedback for all mentees, as this is needed for growth. Just be sure to provide the feedback in a way that shows that you want your mentee to grow as a scientist overall and that you want them to continue. This article has several good tips for delivering criticism that we will summarize here with our own thoughts: Allow for a discussion about what went wrong. You may learn that your mentee struggled for an entirely different reason than you expected. Having a discussion allows you to better determine how your mentee might be able to perform better in the future. Give criticism in a sandwich. Say something positive, deliver the criticism, then say something else positive. Focus on the actions instead of personality traits. Think about how they can improve.
For example, instead of saying “You seem to have time management issues”, you could say something like, “Navigating and prioritizing all these projects is a difficult task and I think we can do better as a team.” Be specific and suggest improvements. You want your mentees to know exactly what they should be aiming to improve and why. Vague criticism may reduce their confidence and will not help them improve as much as concrete specific suggestions and discussion. Deliver criticism in private. Especially if your mentee is feeling like they don’t belong, criticism in front of other lab members can really impact their confidence. It can also lead to more unhealthy competitive dynamics between lab members. Check your feedback. Where possible, double check the feedback you plan to give from the perspective of the mentee. Consider the following: How constructive and useful is the feedback? Can the student realistically make these changes? Is the feedback written in a kind and clear manner? How might the feedback be misconstrued? Don’t surprise your mentees with criticism. Build criticism into regular meetings with your mentee. Don’t create a meeting out of the blue to tell them they need to improve as this may cause excess stress. Criticism can also be normalized if it is delivered gently and in the ways we just outlined. 5.8.9 Consider creating a document of mentor and mentee expectations. These documents help clarify what mentees can expect. This is helpful for your mentees to better perform according to expectations, as they are explicitly stated rather than intuited. [source] Masters and Kreeger (2017) have created a nice set of guidelines for such documents. Also see this table of examples of such documents from the Center for Improvement of Mentored Experiences in Research (CIMER). Keep in mind that such forms should be tailored for different career stages of your mentees and for mentees who are pursuing different expertise.
Documents for informatics mentees should incorporate guidelines about data management practices. We will discuss a bit more about that in the next chapter. 5.9 Conclusion We hope that these guidelines will help you to create a safe and more comfortable environment for all of your lab members and to support your team to be more mindful about health inequity when conducting research. We believe that a happier and more inclusive lab has the potential to be more productive and innovative. In conclusion, here are some of the take-home messages: Increasing the diversity of our research teams can improve scientific innovation, can add additional perspectives about what research to focus on, can potentially promote more inclusive research practices, and can improve clinical trial recruitment of more diverse participants. Training about diversity, inclusion, and equity, as well as training about cultural competence can help research teams to communicate with diverse clinical trial participants, promote more inclusive lab culture, and promote more inclusive research. Clinical trials can recruit and retain more diverse participants by designing trials and recruitment strategies that accommodate more diverse participant lifestyles. Community engagement can be a particularly useful strategy for recruitment. Seeking training about biases, stereotypes, intersectionality, and microaggressions can be helpful for creating a more inclusive lab culture. Creating an expectation document for the lab can help clarify your expectations of students so that students do not need to intuit how they are expected to perform.
References "],["informatics-lab-management-tools.html", "Chapter 6 Informatics lab management tools 6.1 Slack 6.2 Git and GitHub 6.3 Docker 6.4 Figshare 6.5 RStudio and R Markdown 6.6 Jupyter 6.7 Note-taking apps 6.8 Conclusion", " Chapter 6 Informatics lab management tools There are several tools that can be especially useful for assisting with day-to-day management of projects involving informatics regardless of if you are simply collaborating with an informatics expert or you are leading an informatics research team. 6.1 Slack Slack is a communication tool that allows you to communicate with lab members much more efficiently than email. It is a bit like a combination of an instant message system, email, and Dropbox. You can do quite well with the free version of Slack. It may be all that your research group needs indefinitely. The major difference between the free version and the paid versions is that the free version does not save all of your message history. Currently, with the free version you can search through the history of the last 10,000 messages. From our experience using this with a department with about 250 users, it takes about a year to reach this threshold. If you choose to go with the free option and share any really important messages or files, make sure to save them just in case. Now we will guide you through a bit about how to use Slack. 6.1.1 Workspaces The main landing page for Slack is called a workspace, which looks like this: In the above image, this person has five workspaces which are indicated by the squares on the far left. Each workspace allows for multiple channels for communicating. These channels can include all members of the workspace or specific subsets of members. Team members can also have separate direct messages to have one-on-one discussions. It’s a good idea to check if your department or institute is already using Slack. If so, they may have a workspace that you can join. 
Otherwise you may want to think about recruiting your department or institute to start using Slack. In this case you could start a workspace where people outside your research group can communicate. This would still allow you to have group messages with your lab or specific groups within the lab. Otherwise, you can start a workspace just for your research group. 6.1.2 Channels Channels are the main way in which you can converse with your team on Slack. We recommend making a Slack channel for your entire research group. Everyone in your group will be able to discuss something by sending messages in real time. If someone is not available at that time, they will see the message when they next check Slack. We also recommend making project specific channels. For these channels you can add all of the team members working on a specific project, so that they can easily discuss and review discussions about the project. Importantly, you can make channels private or public. If a channel is public, anyone in the workspace can join. 6.1.3 Pins If someone sends a really important message, like a link to a relevant document, you can “pin” the message so that it is easier to find later. Hovering over a message you will get the following options: Clicking on the 3 dot button allows you to do several useful things for a message including pinning it to find it easily later. 6.1.4 Code One great feature about Slack, is that it is very convenient to message about code. You can also attach files directly to messages just like in the above message which has a screen shot image file. 6.1.5 Reminders If you want to be reminded about a message in 20 minutes or next week you can also do that using the same hovering and 3 dot button option. Thus if Jack gets a message from his advisor Charlie, but he is busy doing deep work on something else, he can ask Slack to remind him later. You may also notice in the image above that your messages can be edited! 
This is unlike email. In addition, you can mark messages as unread, which can also be useful for responding to messages asynchronously. 6.1.6 Polls One other nice feature for working with a team is that you can directly poll your team. This requires enabling this feature, but it can be super useful. Say Sally wants to schedule a meeting with the lab teammates for a specific project; this could even include collaborators who are outside of the lab. If all the users are on the same Slack channel, she can send out a poll like this one asking people to respond with times that they are available. If you want to learn more about what you can do with Slack, check out these other awesome integrations/Slack apps! 6.2 Git and GitHub Informatics work can especially benefit from keeping track of your steps and the code that you have used. In some cases your lab may use a tool like Galaxy which has built-in options for keeping track of the steps that your lab members are taking during their research. However, other tools do not have this option. Instead, we can use a tool called Git which allows for something called “version control”. Version control is the tracking of changes to a file or files over time. This is equivalent to saving different versions of a grant proposal over time. However, as you may have noticed, this is not an easy process to maintain. Tools like Git (Git is one of the most popular) help us to keep track of changes. If we save our changes often, we can easily modify our files back to a recent version if necessary. This may be less useful for a grant proposal (although we would argue that it really can be!), but it can be absolutely critical for your informatics code. Why is this? Small changes in your code may result in your code breaking or generating completely different results.
To make matters worse, sometimes your code files may be lengthy. If you have 4,000 lines of code (or more!), it can be difficult to identify what is different between one version and another. Git really helps with this. So what is GitHub? GitHub is a free hosting site for code (or other files - including those grant proposals!). Therefore, all the different versions of your files can be saved and accessed online at GitHub. You can make these files private or public. According to Wikipedia: As of January 2020, GitHub reports having over 40 million users and more than 190 million repositories (including at least 28 million public repositories), making it the largest host of source code in the world. You do not have to use GitHub to use Git. If you have data that needs to be compliant with HIPAA, you could still use Git on a local server (more on this in other courses). Alternatively, you could use GitHub after you de-identify your data. See here for info about ways to use GitHub for data that needs to be HIPAA compliant. We recommend this tutorial if you are interested in using Git and GitHub with R. 6.3 Docker If you have multiple team members modifying code for a pipeline or some software, or if you ultimately want to share your code, it is recommended that you use a method to ensure that you (as well as you in the future!), your team, and anyone you want to share your code with all use the same dependency software of the same versions! There are a few ways to do this, but one of the simplest is to use what is called Docker. You might be familiar with something called a virtual machine. A virtual machine basically allows you to perform operations on your computer, but as if you are using a different computer! This is handy because you can ensure that you not only have similar software installed but you are also working with the same operating system as your teammates (even if your computer has a different operating system).
For example you might have a Mac, and your teammates might all have Windows machines. Pretty cool, right!? Docker is similar to this, except it uses what is called a container. This allows users to work with software that is preinstalled and an environment that is preconfigured; however, it uses part of your existing operating system. This is good because it means that it takes less time and fewer resources than using a virtual machine, which includes a full copy of a virtual operating system. Here is the explanation about what containers are on the Docker blog: A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings. See here and here to read more about the differences between virtual machines and containers like Docker. Note that you can use Docker containers within a virtual machine. See here for explanations about host and guest operating systems. In short, the host operating system is the local machine’s operating system (the one with the hardware), and the guest operating system is the virtualized operating system. Finally, see here for a deeper explanation about how Docker images and containers work. Like GitHub, there is a Docker Hub, where people store different Docker images (which are what allow people to run a Docker container with all of your software dependencies and configurations on their machine). You can download other people’s Docker images, or you can host your own on Docker Hub for other people to use. The important take-home message is that Docker fixes the issue of having your code work only on your machine but not on someone else’s machine.
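To make the "same dependency software of the same versions" problem concrete, here is a minimal sketch in Python (not Docker itself) that compares the package versions a project expects with what is installed on the current machine. The package names and version numbers are hypothetical and chosen only for illustration; a Docker image avoids this kind of mismatch by shipping the dependencies together with the code.

```python
from importlib import metadata


def check_versions(expected):
    """Compare expected package versions with what is installed here.

    Returns a dict mapping package name -> (expected, installed) that
    contains only packages whose installed version differs or is missing.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed on this machine
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Hypothetical version pins, for illustration only (not taken from this course).
expected_versions = {"numpy": "1.26.4", "pandas": "2.2.0"}

# An empty dict means this machine matches the pins; anything else lists
# the packages that could make results differ between teammates.
print(check_versions(expected_versions))
```

If two teammates run this and get different results, their environments differ, which is the situation a shared Docker image is meant to prevent.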
6.4 Figshare Similar to GitHub, figshare is another option for hosting and sharing files on the internet. However, it is specifically designed for research related files. Users can host data, code, images, posters, papers, and other types of files to allow others to easily find resources related to research. In their words: Figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner. We make it as simple as possible to make research Findable, Accessible, Interoperable, and Reusable (FAIR). Publish research in any file format and assign an institutionally-branded DOI, document with customizable, discipline-specific metadata, [create] discoverable content across major search engines and academic frameworks. The difference from GitHub is that it is easier to make files citable (thus better ensuring that you get credit) and it also is a bit easier to store large data files. Figshare does not, however, have the same version control capabilities as Git and GitHub. Ultimately figshare is a great place to share the final versions of your research products to make them findable for others. Figshare also encourages researchers to publish negative findings that did not ultimately end up in publications to reduce redundancy in the research field, which we think is a great idea! 6.5 RStudio and R Markdown If your research teammates are using the R programming language often, we strongly suggest that you consider having these teammates use what is called R Markdown to create reports of their analyses. R Markdown is a flavor of a markup language called Markdown that works especially well with R. What do we mean by markup? A markup language is a language for formatting text, particularly to make text that can be distinguished from the rest of a document. In our case we want text to look different from our code and from the output of our code.
Ideally for the sake of reproducibility and transparency, we recommend that your informatics teammates write such R Markdown reports as they are performing their analyses - not afterwards. RStudio is an Integrated Development Environment or IDE for developing code that makes it easy to write such reports. R Markdown files make it easy to have a report that shows a bit of the data (or all the data if your data is very small), the code, commentary about what the code is doing, as well as the actual output of the code for a given informatics process. We also recommend describing (in one document) what data you used, who performed the analysis, when they performed it, and how they performed it. This really helps with troubleshooting in the future, as well as simplifying maintaining code over time. It also makes it easier to train new lab members or communicate to collaborators about how your code works. The really nice thing about these reports is that R Markdown allows you to export them in a variety of formats like html websites, pdfs, word documents (or even slide presentations with just a bit of extra work) that can easily be shared with others. You aren’t limited to just writing about code in these reports. You can write about anything. In fact, what you are reading right now was originally written using R Markdown. Thus this is also a good option for writing up reports about wet bench experiments as well. 6.5.1 R Markdown guidelines There are a few simple syntax rules for R Markdown. To create headers you can specify them using hashtags. One hashtag # creates the largest header option, while two ## is a bit smaller, and three ### is a bit smaller than two ##, etc. Thus you could create a header like so: # This is a header To create bold text you can use two asterisks on each side of the text. **This is bold text** Which will look like this: This is bold text To create italicized text you can use one asterisk on each side of the text.
*This is italicized text* Which will look like this: This is italicized text To create both bold and italicized text you can use three asterisks on each side of the text. ***This is bold and italicized text*** Which will look like this: This is bold and italicized text To create a new line you include two spaces at the end of the line. To create a divider line you can use three asterisks without any text on a line. *** Which will look like the following divider line: You can also embed images or videos into your R Markdown reports. There are several ways to do this with a package called knitr which allows you to style your reports and include different file types in your reports. However you can also include an image or video simply with the following syntax: ![caption text](URL_or_local_path_to_image_or_videofile) If you are including code (which can be R programming language code, Python, SQL, bash or others), you can specify it using three backticks like this: ```{r} some R code ``` Here is some actual R code that is displayed in the html output from the original R Markdown file. There is a slightly darker background for code and for code output. You will see that the result of x is printed right after a break: # This is a code comment about some R code- here comes the code on the next line! x <- c(1, 2, 3, 4, 5) x ## [1] 1 2 3 4 5 Similarly this is some Python code and output: # Now we are going to show some python code x = [1,2,3,4,5] print(x) ## [1, 2, 3, 4, 5] For inline code (meaning you can show the output within some narrative text) you can use one backtick before and after the code starting with r to specify that you are using the R programming language like this: ` r x ` This will result in: Here is the output: 1, 2, 3, 4, 5 Another important thing to know is that you can utilize what are called child Rmd files in case your report is getting too large (something that often happens with analyses).
In this case, you can separate out parts of your research process into different report documents and have an additional report document that demonstrates the entire process. See here for more information on how to do this. If you want a quick reference check out this guide. For a more extensive guide check out this article from R Studio. Also see Riederer (n.d.) for additional information about how to use and create R Markdown files. For advanced users check out R Notebook, which is an extension of R Markdown. 6.6 Jupyter Jupyter notebooks are very similar to R markdown reports, however they were designed with an emphasis on using Python rather than R, and such reports are created using a web-based editor rather than software on your local computer. The Markdown syntax used in Jupyter notebooks is nearly the same as what you just learned about for R Markdown. Here you can see a quick guide. See here for an extensive guide, where you might notice some differences in terms of how to include code. JupyterLab is also similar to RStudio. However it is a web-based environment for working with code and writing Jupyter notebooks. You can try some demos here. 6.7 Note-taking apps If you want to take tracking your projects to the next level, we recommend a note-taking app. This allows you to store and organize all of your files related to different projects, take notes, and more. One really useful feature is that many allow you to search across all notes, so if you can’t quite remember where something is you can find it easily. You can also share your notes with others. This is also a great place to jot down ideas, store tips for yourself and others, make timelines and more. Although it will take a bit of time to learn how to use these apps and some time to take notes etc., this will ultimately save you time in the long run and many of these apps have been designed to be especially user-friendly. You don’t need to use all of the available features. 
Just tracking all the information related to your projects in one place can already greatly improve your ability to manage your projects. This blog has an excellent review of various options, many of which are free or have slightly more limited but free versions. Evernote is a commonly used note-taking app, which, as you can see from this video, can really be helpful: Be careful if you intend to include any information that would require HIPAA compliance in a note-taking app! Microsoft OneNote offers options for encryption to allow for HIPAA compliance if you need that. See here for more information. 6.8 Conclusion Overall we think that these tools can be helpful to you and your informatics research team. There are, however, many other tools that can help with informatics analyses. We will discuss these in other courses about data reproducibility and management. In conclusion, here are some of the take-home messages: Slack can be a great option for maintaining communication with lab members who may be onsite or remote. Version control with Git and GitHub as well as standardization using Docker can ensure that your computational work is being maintained and shared smoothly. RStudio and R Markdown reports can improve the analyses that you perform in R. This is also compatible with performing aspects of your analyses using some other languages. Jupyter is very helpful for Python-related projects. Keeping reports of your work with annotations about the code and data used can be critical for your future self, other lab members, outside collaborators, and others to better understand your analyses. Using a note-taking app can be extremely useful for organizing reports, communications, ideas, notes and more for your various projects. Be careful about including any protected data or information that would require HIPAA compliance.
References "],["about-the-authors.html", "About the authors", " About the authors These credits are based on our course contributors table guidelines.     In memory of James Taylor, who was instrumental in initiating this project.   Credits Names Pedagogy Lead Content Instructor Carrie Wright Content Editors/Reviewers Candace Savonen, Sarah Wheelan, Jeff Leek Content Directors Jeff Leek, Sarah Wheelan Content Consultants (Promoting diversity equity and inclusion) Simone Sawyer, Karriem Watson Acknowledgments Andrei Kucharavy, Sarah Opitz, Florian Markowetz, Brody Foy, Michael Mullarkey, Anne Carpenter, Luis Pedro Coelho, Keri Martinowich Production Content Publisher Ira Gooding Content Publishing Reviewers Ira Gooding, Candace Savonen Technical Course Publishing Engineer Carrie Wright Template Publishing Engineers Candace Savonen, Carrie Wright Publishing Maintenance Engineer Candace Savonen Technical Publishing Stylists Carrie Wright, Candace Savonen Package Developers (ottrpal) John Muschelli, Candace Savonen, Carrie Wright Art and Design Illustrator Carrie Wright Funding Funder National Cancer Institute (NCI) UE5 CA254170 Funding Staff Emily Voeglein, Fallon Bachman   ## ─ Session info ─────────────────────────────────────────────────────────────── ## setting value ## version R version 4.3.2 (2023-10-31) ## os Ubuntu 22.04.4 LTS ## system x86_64, linux-gnu ## ui X11 ## language (EN) ## collate en_US.UTF-8 ## ctype en_US.UTF-8 ## tz Etc/UTC ## date 2024-12-20 ## pandoc 3.1.1 @ /usr/local/bin/ (via rmarkdown) ## ## ─ Packages ─────────────────────────────────────────────────────────────────── ## package * version date (UTC) lib source ## bookdown 0.41 2024-10-16 [1] CRAN (R 4.3.2) ## bslib 0.6.1 2023-11-28 [1] RSPM (R 4.3.0) ## cachem 1.0.8 2023-05-01 [1] RSPM (R 4.3.0) ## cli 3.6.2 2023-12-11 [1] RSPM (R 4.3.0) ## devtools 2.4.5 2022-10-11 [1] RSPM (R 4.3.0) ## digest 0.6.34 2024-01-11 [1] RSPM (R 4.3.0) ## ellipsis 0.3.2 2021-04-29 [1] RSPM (R 4.3.0) ## evaluate 
0.23 2023-11-01 [1] RSPM (R 4.3.0) ## fastmap 1.1.1 2023-02-24 [1] RSPM (R 4.3.0) ## fs 1.6.3 2023-07-20 [1] RSPM (R 4.3.0) ## glue 1.7.0 2024-01-09 [1] RSPM (R 4.3.0) ## htmltools 0.5.7 2023-11-03 [1] RSPM (R 4.3.0) ## htmlwidgets 1.6.4 2023-12-06 [1] RSPM (R 4.3.0) ## httpuv 1.6.14 2024-01-26 [1] RSPM (R 4.3.0) ## jquerylib 0.1.4 2021-04-26 [1] RSPM (R 4.3.0) ## jsonlite 1.8.8 2023-12-04 [1] RSPM (R 4.3.0) ## knitr 1.48 2024-07-07 [1] CRAN (R 4.3.2) ## later 1.3.2 2023-12-06 [1] RSPM (R 4.3.0) ## lifecycle 1.0.4 2023-11-07 [1] RSPM (R 4.3.0) ## magrittr 2.0.3 2022-03-30 [1] RSPM (R 4.3.0) ## memoise 2.0.1 2021-11-26 [1] RSPM (R 4.3.0) ## mime 0.12 2021-09-28 [1] RSPM (R 4.3.0) ## miniUI 0.1.1.1 2018-05-18 [1] RSPM (R 4.3.0) ## pkgbuild 1.4.3 2023-12-10 [1] RSPM (R 4.3.0) ## pkgload 1.3.4 2024-01-16 [1] RSPM (R 4.3.0) ## profvis 0.3.8 2023-05-02 [1] RSPM (R 4.3.0) ## promises 1.2.1 2023-08-10 [1] RSPM (R 4.3.0) ## purrr 1.0.2 2023-08-10 [1] RSPM (R 4.3.0) ## R6 2.5.1 2021-08-19 [1] RSPM (R 4.3.0) ## Rcpp 1.0.12 2024-01-09 [1] RSPM (R 4.3.0) ## remotes 2.4.2.1 2023-07-18 [1] RSPM (R 4.3.0) ## rlang 1.1.4 2024-06-04 [1] CRAN (R 4.3.2) ## rmarkdown 2.25 2023-09-18 [1] RSPM (R 4.3.0) ## sass 0.4.8 2023-12-06 [1] RSPM (R 4.3.0) ## sessioninfo 1.2.2 2021-12-06 [1] RSPM (R 4.3.0) ## shiny 1.8.0 2023-11-17 [1] RSPM (R 4.3.0) ## stringi 1.8.3 2023-12-11 [1] RSPM (R 4.3.0) ## stringr 1.5.1 2023-11-14 [1] RSPM (R 4.3.0) ## urlchecker 1.0.1 2021-11-30 [1] RSPM (R 4.3.0) ## usethis 2.2.3 2024-02-19 [1] RSPM (R 4.3.0) ## vctrs 0.6.5 2023-12-01 [1] RSPM (R 4.3.0) ## xfun 0.48 2024-10-03 [1] CRAN (R 4.3.2) ## xtable 1.8-4 2019-04-21 [1] RSPM (R 4.3.0) ## yaml 2.3.8 2023-12-11 [1] RSPM (R 4.3.0) ## ## [1] /usr/local/lib/R/site-library ## [2] /usr/local/lib/R/library ## ## ────────────────────────────────────────────────────────────────────────────── "],["references.html", "References", " References "],["404.html", "Page not found", " Page not found The page you requested cannot be 
found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]