by Qian Huang, Qiyun Wang https://arxiv.org/pdf/2409.17426
Background:
- ChatGPT has been used in education for learning, teaching, and research
- Potential to conduct Systematic Literature Reviews (SLRs) but limited empirical studies on how to use it effectively for this purpose
Research Objectives:
- To what extent can ChatGPT conduct an SLR?
- What strategies can human researchers use to structure prompts for ChatGPT so that they enhance the reliability and validity of an SLR?
Methods:
- Conducted an SLR using ChatGPT on the same 33 papers analyzed in a previously published review, which served as the benchmark
- Design-based approach: compared the results of ChatGPT's review with those of the published review
Findings:
- ChatGPT could conduct an SLR, but it requires detailed and accurate prompts to analyze the literature effectively
- Guiding principles summarized for researchers conducting SLRs using ChatGPT.
Systematic Literature Review (SLR)
- A research methodology to collect, identify, and critically analyze studies through a systematic procedure
- Advances evidence-based knowledge by synthesizing common themes and trends over time
- Identifies research gaps for future studies
- Highly labour-intensive and time-consuming, including the work of framing clear research questions and developing analysis strategies
ChatGPT
- Developed by OpenAI as an advancement in natural language processing and artificial intelligence
- Originated from the GPT (Generative Pre-trained Transformer) architecture, with GPT-3 being a notable predecessor
- Designed to improve interaction with AI and offer conversational dialogue, question answering, and content generation
- Utilizes deep learning techniques to understand and generate human-like text based on input
- Significant enhancements in understanding, context awareness, and generative abilities from GPT-3 to ChatGPT 4.0
- Promising potential for educational applications such as designing quizzes, promoting active learning, strengthening collaborative learning, providing immediate feedback, serving as a writing assistant, and assisting with data analysis in research.
Benefits of using ChatGPT in Education:
- Design quizzes (Eysenbach, 2023; Jahic et al., 2023)
- Promote active learning (Shoufan, 2023)
- Strengthen collaborative learning (Cotton et al., 2023)
- Provide immediate feedback and personalized learning (Kuhail et al., 2022)
- Serve as writing assistant (Jahic et al., 2023)
- Help with data analysis in research (Hwang & Chen, 2023; Kooli, 2023)
Challenges of using ChatGPT in Education:
- Lack of ability to generate original ideas and critical thinking (Arif et al., 2023; Choi et al., 2023; Kooli, 2023)
- Learning process changes (Qi et al., 2023)
- Misinterpretation in research (Kooli, 2023)
- Ethical issues: authorship in academic writing and plagiarism (da Silva, 2023; Zhang & Zhen, 2023)
Suggestions for using ChatGPT in Education:
- Correct prompts to generate answers (Thakur et al., 2023)
- Develop learners' new digital skills and competencies (Kasneci et al., 2023).
ChatGPT-4.0 Study on Blended Synchronous Learning (BSL)
Methodology:
- Exploratory study to investigate ChatGPT's ability to conduct a Systematic Literature Review (SLR) equivalent to an original literature review (OLR)
- The OLR involved systematic selection of 33 papers, followed by analysis and synthesis guided by two research questions:
- What challenges cause low engagement among online learners in BSL?
- What strategies address these challenges and increase online learner engagement?
Process:
- Initial prompt: Researcher uploaded 1st paper, instructing ChatGPT on the two research questions
- Generation of results: ChatGPT summarized strategies, indicating potential for SLR
- Analysis of multiple papers: Researcher asked ChatGPT to generate common challenges and strategies after analyzing 5, then 12 papers
- Results analysis: Generated results had some inaccuracies; researchers identified the need for a more structured approach
Iterative Process:
- Round 1: Followed the same process as the OLR; ChatGPT initially generated accurate but broad results (challenges and strategies) drawn from the entire document, not just the Results section
- Round 2: Researchers requested specific framework and detailed information for common challenges and strategies
Results and Analysis:
- Generated results were not accurate, as they included information from the "Literature" section instead of only the "Results" section
- ChatGPT captured broad information but did not report findings according to the specific analytical framework used in the OLR process
- Researchers identified the need for a more structured approach to address these issues and conduct a more accurate SLR in subsequent rounds
Issue 1:
- Prompts adjusted by narrowing focus to specific sections (e.g., "Finding/Results" section) using page ranges
- ChatGPT was then able to address the challenges and strategies related to online learning engagement
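The section-and-page-range narrowing described under Issue 1 can be sketched as a small prompt builder. This is a minimal illustration, not the prompt used in the study: the `PaperScope` structure, the file name, the page range, and the prompt wording are all hypothetical, and the research questions are paraphrased from the notes above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PaperScope:
    """Which part of one uploaded paper ChatGPT should read (hypothetical structure)."""
    filename: str
    section: str     # e.g. "Findings/Results"
    first_page: int  # first page of that section
    last_page: int   # last page of that section


# Paraphrased from the two research questions in the notes, not quoted from the paper.
RESEARCH_QUESTIONS = [
    "What challenges cause low engagement among online learners in BSL?",
    "What strategies address these challenges and increase online learner engagement?",
]


def build_scoped_prompt(scope: PaperScope, questions: list[str]) -> str:
    """Assemble a prompt that limits the analysis to one section and page range."""
    if scope.first_page > scope.last_page:
        raise ValueError("first_page must not exceed last_page")
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return (
        f'Read only the "{scope.section}" section of {scope.filename} '
        f"(pages {scope.first_page}-{scope.last_page}). "
        "Ignore the literature review and all other sections.\n"
        "Answer the following research questions using only that section, "
        "and give the page number for each point:\n"
        f"{numbered}"
    )


if __name__ == "__main__":
    # Illustrative values only; the real file names and page ranges are not in the notes.
    scope = PaperScope("paper_01.pdf", "Findings/Results", 8, 14)
    print(build_scoped_prompt(scope, RESEARCH_QUESTIONS))
```

Keeping the scope in a separate structure makes it easy to record, per paper, exactly which pages ChatGPT was asked to read.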
Issue 2:
- Common challenges and strategies based on interaction framework categorized as learner-instructor, learner-learner, learner-content
- ChatGPT had difficulty identifying the correct page numbers for all papers; it reported page positions counted from the start of the PDF file instead
- Generated limited common challenges and strategies in each theme (e.g., increased cognitive load for instructors)
- Added checkpoints to see if ChatGPT could generate richer results
Issue 3:
- ChatGPT's overview of educational contexts, research methods, and design frameworks was compared with the OLR results:
- Some items highly consistent (e.g., graduate level)
- Big differences in other areas (e.g., K-12 education, DBR, Framework)
- Inaccurate identification of research methods and theoretical frameworks used in studies.
Revised Process: Round 3+
- Two additional checkpoints added between the 12th and 33rd papers to generate richer challenges and strategies
- Revised prompt for ChatGPT to provide actual page numbers from the journal, not PDF files
- Common challenges and strategies based on learner-learner, learner-content, learner-instructor interaction framework
- Examples of challenges and strategies provided for each theme (e.g., Technological Limitations and Interference)
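The checkpoint schedule can be sketched as a batching helper: papers are fed in order, and a synthesis is requested each time a checkpoint is reached. This is a sketch under assumptions, not the study's procedure. It assumes the earlier checkpoints at 5 and 12 papers were retained alongside the added ones at 18 and 25, and the function name and file names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

# Checkpoints mentioned in these notes: after the 5th, 12th, 18th, 25th, and 33rd paper.
# Assumption: all five were used together in the revised process.
CHECKPOINTS = (5, 12, 18, 25, 33)


def batches_by_checkpoint(
    papers: Sequence[str], checkpoints: Sequence[int]
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (checkpoint, papers read since the previous checkpoint)."""
    if list(checkpoints) != sorted(set(checkpoints)):
        raise ValueError("checkpoints must be strictly increasing")
    if checkpoints and (checkpoints[0] < 1 or checkpoints[-1] > len(papers)):
        raise ValueError("checkpoints must lie between 1 and len(papers)")
    start = 0
    for checkpoint in checkpoints:
        yield checkpoint, list(papers[start:checkpoint])
        start = checkpoint


if __name__ == "__main__":
    # Hypothetical file names standing in for the 33 reviewed papers.
    papers = [f"paper_{i:02d}.pdf" for i in range(1, 34)]
    for checkpoint, batch in batches_by_checkpoint(papers, CHECKPOINTS):
        print(f"After paper {checkpoint}: request common challenges and strategies "
              f"({len(batch)} new papers since the last checkpoint)")
```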
Results and Analysis:
- More common challenges and strategies generated after analyzing the 18th and 25th papers
- Final review results after analyzing 33 papers highly consistent with previous OLR results
- ChatGPT provided concrete examples to illustrate summarized themes (Figure 12)
Issue:
- Page numbers in ChatGPT's responses were still incorrect, often referring to page positions within the PDF file instead of journal page numbers. For example, page 6 of "Conklina.pdf" is actually page 22 in the journal.
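The manual page check can be supported by a simple offset calculation. The sketch below assumes that PDF page 1 is the article's first page and that journal pages are numbered contiguously; under that assumption the example above (PDF page 6 is journal page 22) implies a first journal page of 17, which is inferred here rather than stated in the source. The function names are hypothetical.

```python
def pdf_to_journal_page(pdf_page: int, first_journal_page: int) -> int:
    """Convert a 1-based PDF page position to the printed journal page number.

    Assumes PDF page 1 carries `first_journal_page` and numbering is contiguous.
    """
    if pdf_page < 1:
        raise ValueError("pdf_page is 1-based and must be at least 1")
    return first_journal_page + pdf_page - 1


def journal_to_pdf_page(journal_page: int, first_journal_page: int) -> int:
    """Inverse mapping: locate a journal page inside the PDF file."""
    if journal_page < first_journal_page:
        raise ValueError("journal_page precedes the article's first page")
    return journal_page - first_journal_page + 1


if __name__ == "__main__":
    # 17 is inferred from the "Conklina.pdf" example, not taken from the paper itself.
    print(pdf_to_journal_page(6, first_journal_page=17))
    print(journal_to_pdf_page(22, first_journal_page=17))
```

A PDF with a cover sheet or front matter would break the contiguity assumption, which is one reason the manual check remains necessary.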
Study Discussion:
- Exploratory study on using ChatGPT for systematic literature reviews (SLR)
- Confirms ChatGPT can assist in SLR: analyzing papers, extracting info, summarizing findings
- Limitations: researchers must still screen the papers themselves, and ChatGPT has no access to academic databases and cannot identify page numbers accurately
- Guiding principles: screen papers using PRISMA protocol, provide detailed instructions, define research questions and frameworks, check page numbers manually, document process, let ChatGPT read multiple times for accuracy.
Key Findings:
- ChatGPT can help conduct a tentative literature review but has limitations
- Importance of human oversight in the screening process and manual checks for accuracy
Design Principles:
- Screen reviewed papers first, using PRISMA protocol or other methods
- Provide detailed instructions with clear steps
- Define research questions and theoretical frameworks
- Specify page number ranges or sections of papers for analysis
- Check actual page numbers manually
- Document the research process to maintain records and ensure accuracy
- Let ChatGPT read the papers repeatedly and triangulate its results with human reviewers to improve efficiency and reliability.
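One way to follow the documentation principle is to keep an append-only log of every prompt and response, together with a flag recording whether the page numbers were checked by hand. The sketch below is an assumption about how such a record could look, not something described in the paper; the field names, function names, and log file name are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def make_log_record(
    round_no: int,
    paper: str,
    prompt: str,
    response: str,
    pages_verified: bool,
    timestamp: Optional[str] = None,
) -> str:
    """Serialize one prompt/response exchange as a single JSON line."""
    entry = {
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "round": round_no,
        "paper": paper,
        "prompt": prompt,
        "response": response,
        "pages_verified": pages_verified,  # set to True only after a manual check
    }
    return json.dumps(entry, ensure_ascii=False)


def append_record(log_path: Path, record: str) -> None:
    """Append one JSON line to the log file, creating the file if needed."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(record + "\n")


if __name__ == "__main__":
    # Illustrative values only.
    record = make_log_record(
        round_no=3,
        paper="paper_01.pdf",
        prompt="Read only the Results section ...",
        response="(ChatGPT output pasted here)",
        pages_verified=False,
    )
    append_record(Path("slr_log.jsonl"), record)
```

Because each line is an independent JSON object, the log can later be filtered for entries whose page numbers still await manual verification.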