The Gmail Email Processor reads Gmail mbox files, extracts the email content, and saves the processed emails as text files. It decodes MIME words, normalizes text, and produces a clean output format.
- Decodes MIME words in email headers.
- Normalizes text to ensure a maximum of two consecutive line breaks.
- Cleans email bodies to remove unwanted characters.
- Sorts emails by date and writes them to text files based on the sender's domain.
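
The first two features can be illustrated with Python's standard library. This is a minimal sketch, not the code in emailembed.py: the function names `decode_mime_words` and `normalize_text` are hypothetical, and the choice to fall back to UTF-8 when a charset is missing or unknown is an assumption.

```python
import re
from email.header import decode_header


def decode_mime_words(value: str) -> str:
    """Decode RFC 2047 encoded words such as '=?UTF-8?B?...?=' into plain text."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            try:
                # Assumption: use UTF-8 when the header declares no charset.
                parts.append(chunk.decode(charset or "utf-8", errors="replace"))
            except LookupError:
                # Unknown charset name: fall back to UTF-8 as well.
                parts.append(chunk.decode("utf-8", errors="replace"))
        else:
            # Headers without encoded words come back as str already.
            parts.append(chunk)
    return "".join(parts)


def normalize_text(text: str) -> str:
    """Unify line endings and collapse runs of three or more line breaks to two."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

Here `decode_mime_words` would be applied to headers such as Subject and From, and `normalize_text` to the message body before it is written out.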
- Miniconda
- Clone the repository:
  git clone https://github.com/WEMAKE-CX/gmail-email-processor.git
  cd gmail-email-processor
- Run the setup script:
  ./start.sh
- Place your mbox files in the source/Gmail directory.
- Run the processing script:
  python emailembed.py
- emailembed.py: Handles the processing of mbox files and extraction of email content.
- start.sh: Sets up the environment using Miniconda and installs required packages.
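
To give a rough idea of what processing an mbox file involves, the sketch below reads one with the standard-library `mailbox` module and pulls out the headers and the first plain-text body part. It is an illustration under stated assumptions rather than the contents of emailembed.py: `read_mbox`, `extract_body`, and the record layout (`from`, `subject`, `date`, `body`) are hypothetical names.

```python
import mailbox
from email.message import Message
from typing import Dict, List


def extract_body(message: Message) -> str:
    """Return the first text/plain part of a message, decoded to str."""
    parts = message.walk() if message.is_multipart() else [message]
    for part in parts:
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: assume UTF-8.
            return payload.decode("utf-8", errors="replace")
    return ""


def read_mbox(path: str) -> List[Dict[str, str]]:
    """Load every message in an mbox file into a list of plain dicts."""
    records = []
    # create=False raises an error instead of silently creating an empty file.
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            records.append({
                "from": str(message.get("From", "")),
                "subject": str(message.get("Subject", "")),
                "date": str(message.get("Date", "")),
                "body": extract_body(message),
            })
    finally:
        box.close()
    return records
```

A call such as `read_mbox("source/Gmail/example.mbox")` would return one dict per message; the file name here is only a placeholder.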
Processed emails are saved in the output directory, with filenames based on the sender's domain.
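
The grouping and ordering described above could look roughly like the following. This is a sketch under assumptions, not the project's actual output code: the `<domain>.txt` naming pattern, the record keys, the per-message text layout, and the helper names `sender_domain`, `sort_key`, and `write_by_domain` are all hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone
from email.utils import parseaddr, parsedate_to_datetime
from pathlib import Path


def sender_domain(from_header: str) -> str:
    """Return a filename-safe, lowercased sender domain, or 'unknown'."""
    address = parseaddr(from_header)[1]
    if "@" not in address:
        return "unknown"
    domain = address.rpartition("@")[2].lower()
    return re.sub(r"[^a-z0-9.-]", "_", domain) or "unknown"


def sort_key(date_header: str) -> datetime:
    """Parse a Date header; unparseable dates sort first (assumption)."""
    try:
        parsed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Treat naive datetimes as UTC so that all keys are comparable.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def write_by_domain(records, output_dir="output"):
    """Write one text file per sender domain, with emails in date order."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    grouped = defaultdict(list)
    for record in records:
        grouped[sender_domain(record["from"])].append(record)
    written = []
    for domain, items in grouped.items():
        items.sort(key=lambda r: sort_key(r["date"]))
        target = out / f"{domain}.txt"  # hypothetical naming pattern
        with target.open("w", encoding="utf-8") as fh:
            for r in items:
                fh.write(
                    f"From: {r['from']}\nDate: {r['date']}\n"
                    f"Subject: {r['subject']}\n\n{r['body']}\n\n"
                )
        written.append(target)
    return written
```

Each record is expected to be a dict with `from`, `date`, `subject`, and `body` keys; messages without a usable sender address end up in `unknown.txt`.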
This project is licensed under the MIT License.