YouTube Transcript Fetcher

A Python program that retrieves the text transcript of a YouTube video. It uses the youtube_transcript_api library to fetch the available transcripts, both manually provided and auto-generated, for a given YouTube video URL.

Prerequisites

1. Python Installation

Ensure you have Python 3.6 or later installed on your system. You can download Python from the official website.

2. Install Required Python Packages

Open your terminal or command prompt and install the necessary packages using pip:

pip install youtube_transcript_api pytube

3. (Optional) Handling Non-English Transcripts

If you need a transcript in a language other than English, first make sure the video has a transcript available in that language. You can then request it by passing its language code with the -l option described under How to Use the Program.

How to Use the Program

Clone the repository

git clone https://github.com/ChristianE00/Youtube-Transcript-Fetcher.git

Install Dependencies

pip install youtube_transcript_api pytube

Run the Script

Open your terminal or command prompt, navigate to the directory containing youtube_transcript_fetcher.py, and execute the script using the following syntax:

python youtube_transcript_fetcher.py "YOUTUBE_VIDEO_URL" -o "output_filename.txt" -l "language_code"
  • "YOUTUBE_VIDEO_URL": Replace with the actual URL of the YouTube video.
  • "output_filename.txt": (Optional) Replace with your desired output filename. Defaults to transcript.txt if not specified.
  • "language_code": (Optional) Replace with the desired language code (e.g., 'en' for English, 'es' for Spanish). Defaults to 'en' if not specified.
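The syntax above could be parsed along the following lines. This is a minimal sketch using the standard argparse module, not the script's actual source; the function name build_parser and the attribute names url, output, and language are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the command-line syntax described above (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="youtube_transcript_fetcher.py",
        description="Fetch the transcript of a YouTube video.",
    )
    # Required positional argument: the video URL.
    parser.add_argument("url", help="URL of the YouTube video")
    # Optional output filename; defaults to transcript.txt.
    parser.add_argument("-o", dest="output", default="transcript.txt",
                        help="output filename (default: transcript.txt)")
    # Optional language code; defaults to English.
    parser.add_argument("-l", dest="language", default="en",
                        help="language code such as 'en' or 'es' (default: en)")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(
    ["https://www.youtube.com/watch?v=dQw4w9WgXcQ", "-l", "es"]
)
print(args.url, args.output, args.language)
```

Because -o was omitted in the example call, args.output falls back to transcript.txt, matching the default described above.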

Examples

Fetch English Transcript (Default)

python youtube_transcript_fetcher.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

Description: This command fetches the English transcript of the provided video and saves it to transcript.txt.

Specify Output Filename and Language

python youtube_transcript_fetcher.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o "rickroll_transcript.txt" -l "en"

Description: This fetches the English transcript and saves it to rickroll_transcript.txt.

Fetch a Non-English Transcript

python youtube_transcript_fetcher.py "YOUTUBE_VIDEO_URL" -o "transcript_es.txt" -l "es"

Description: This command fetches the Spanish transcript of the specified video and saves it to transcript_es.txt.

Handling Errors and Exceptions

No Transcript Available

  • Scenario: If no transcript is available in the specified language, the script falls back to the English transcript. If that is not available either, it tells you that no transcript was found.
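One way to express this fallback is as an ordered list of language codes to try. The sketch below is an illustration only: the helper name language_priority is hypothetical, and whether the script builds such a list or retries in a separate step is an assumption. The youtube_transcript_api library does accept a list of language codes in priority order through its languages argument, so a list like this could be handed to it directly.

```python
from typing import List


def language_priority(requested: str, fallback: str = "en") -> List[str]:
    """Return the language codes in the order they should be tried (hypothetical helper)."""
    requested = requested.strip()
    # Avoid listing the same code twice when the user already asked for the fallback.
    if requested == fallback:
        return [fallback]
    return [requested, fallback]


print(language_priority("es"))  # Spanish first, then English
print(language_priority("en"))  # English only
```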

Transcripts Disabled

  • Scenario: If transcripts are disabled for the video, the script will inform you that transcripts are disabled.

Invalid URL

  • Scenario: If the provided URL is invalid or the video ID cannot be extracted, the script will display an error message.
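For reference, extracting a video ID from a URL can be sketched with the standard library alone. This is not necessarily how the script does it; the function name extract_video_id, the set of accepted hosts and path prefixes, and the assumption that IDs are 11 characters long are all illustrative choices.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from letters, digits, "-" and "_".
_VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID found in a YouTube URL, or None if none can be extracted."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # Paths such as /embed/<id> or /shorts/<id>.
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
                candidate = parts[1]
    if candidate and _VIDEO_ID.match(candidate):
        return candidate
    return None


video_id = extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
if video_id is None:
    print("Error: could not extract a video ID from the URL.")
else:
    print(video_id)
```

Returning None rather than raising keeps the error message in one place, at the point where the caller decides how to report an invalid URL.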

Additional Notes

Multiple Transcripts

  • Description: Some YouTube videos have multiple transcripts in different languages or both auto-generated and manually provided captions. The script lists all available transcripts before attempting to fetch the desired one. This helps you choose the correct language code.

Auto-Generated vs. Manually Provided Transcripts

  • Description: Auto-generated transcripts may be less accurate than manually provided ones. The transcript listing output shows whether each transcript is auto-generated or manually provided.
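A listing like the one described in the two notes above could be formatted as follows. This is a sketch, not the script's real output format. It relies on the language, language_code, and is_generated attributes that youtube_transcript_api exposes on its transcript objects; the SimpleNamespace objects are stand-ins so the example runs without network access, and the helper name describe_transcript is hypothetical.

```python
from types import SimpleNamespace


def describe_transcript(transcript) -> str:
    """Format one transcript entry as 'code: Language (kind)' (hypothetical helper)."""
    kind = "auto-generated" if transcript.is_generated else "manually provided"
    return f"{transcript.language_code}: {transcript.language} ({kind})"


# Stand-in objects mimicking the attributes of the library's transcript objects.
samples = [
    SimpleNamespace(language="English", language_code="en", is_generated=False),
    SimpleNamespace(language="Spanish", language_code="es", is_generated=True),
]
lines = [describe_transcript(t) for t in samples]
for line in lines:
    print(line)
```

The code printed at the start of each line is the value you would pass to -l.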

Handling Long Videos

  • Description: The youtube_transcript_api library handles relatively long transcripts without special treatment. If you run into problems with exceptionally long videos, consider modifying the script to process the transcript in segments or to use a different retrieval method.
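Processing a transcript in segments might look like the sketch below, which joins entries in fixed-size batches instead of building one large string. It assumes each entry is a dict with a "text" key, the shape returned by older releases of youtube_transcript_api; newer releases return snippet objects, so the entry access would need adjusting. The function names batched_text and write_transcript and the batch size are illustrative.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator


def batched_text(entries: Iterable[Dict], size: int = 500) -> Iterator[str]:
    """Yield the transcript text in blocks of at most `size` entries."""
    iterator = iter(entries)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        # Assumption: every entry is a dict with a "text" key.
        yield " ".join(entry["text"].strip() for entry in batch)


def write_transcript(entries: Iterable[Dict], path: str, size: int = 500) -> int:
    """Write the transcript block by block and return the number of blocks written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for block in batched_text(entries, size):
            handle.write(block + "\n")
            count += 1
    return count


sample = [{"text": "Never gonna"}, {"text": "give you up"}, {"text": "never gonna"}]
blocks = list(batched_text(sample, size=2))
print(blocks)
```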

Privacy and Respecting YouTube's Terms of Service

  • Description: Ensure that you have the right to access and use the transcripts, especially for copyrighted content. Always respect YouTube's Terms of Service when accessing and using their data.

Contributions

Feel free to contribute! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.