Relax Teacher

Motive


My inspiration came from the aftermath of the 2020 lockdown, when every school, college, and university was closed. Education moved to a format that was new to us at the time, the "online class," which has both positive and negative aspects. What disappointed me was that many students did not respect their senior teachers simply because those teachers were unfamiliar with computers and technology and, as a result, made a few mistakes. Students mocked and trolled them, which was wrong. So I created this simple website with AI/ML to assist our esteemed teachers. It is automated: a teacher only needs to upload a recorded class, and the website saves the video in a folder named after its subject and also saves the class in text format, together with a summary of that class.

Methodology

Data Collection

My input is a recorded class in video format, so I decided to collect data from `YouTube`, which has free tutorial videos.
I want to express my gratitude to the YouTube channels CrashCourse, Art of Wei, Khan Academy, Professor Dave Explains, thenewboston, and The School of Life. Their tutorials make education easy for everyone, and I used their videos as my data.
I collected videos on different subjects: History, Art, Physics, Chemistry, Biology, Astrology, Literature, Philosophy, Politics, Economics, Psychology, and Sociology. Code here

I downloaded 700 videos, so the dataset contains 700 samples.

Data Preprocessing

This step was very important for this project. I needed to convert video to audio, then audio to text, and finally text to a summary. I used this summary to classify the subject that the uploaded video teaches.

- For video-to-audio conversion, I used the MoviePy Python library.

- To convert the audio to text, I used pretrained HuggingFace models. I tried several, namely openai/whisper-base.en, whisper-large, and whisper-tiny, and found openai/whisper-base.en to be the more accurate one. Code

- Lastly, for text-to-summary, I again compared several HuggingFace models and kept the best one. The models were "facebook/bart-large-cnn", "philschmid/bart-large-cnn-samsum", "google/pegasus-cnn_dailymail", and "sshleifer/distilbart-cnn-12-6". I found "sshleifer/distilbart-cnn-12-6" to be very fast and settled on it. Code

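
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the function names are hypothetical, the MoviePy import path depends on the installed version, and `chunk_length_s=30` and `truncation=True` are assumed settings that the original text does not specify.

```python
from pathlib import Path


def audio_path_for(video_path: str) -> str:
    """Return the .wav path that sits next to the given video file."""
    return str(Path(video_path).with_suffix(".wav"))


def video_to_audio(video_path: str) -> str:
    """Extract the audio track with MoviePy and save it as a WAV file."""
    try:
        from moviepy.editor import VideoFileClip  # MoviePy 1.x layout
    except ImportError:
        from moviepy import VideoFileClip  # MoviePy 2.x layout

    out_path = audio_path_for(video_path)
    clip = VideoFileClip(video_path)
    try:
        clip.audio.write_audiofile(out_path)
    finally:
        clip.close()
    return out_path


def audio_to_text(audio_path: str, model_name: str = "openai/whisper-base.en") -> str:
    """Transcribe an audio file with a HuggingFace speech recognition pipeline."""
    from transformers import pipeline

    # chunk_length_s is an assumed setting; it lets the pipeline handle long audio.
    asr = pipeline("automatic-speech-recognition", model=model_name, chunk_length_s=30)
    return asr(audio_path)["text"]


def text_to_summary(text: str, model_name: str = "sshleifer/distilbart-cnn-12-6") -> str:
    """Summarize one piece of text; input beyond the model limit is truncated."""
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name)
    return summarizer(text, truncation=True)[0]["summary_text"]
```

A call such as `text_to_summary(audio_to_text(video_to_audio("lesson.mp4")))` would chain the steps; `lesson.mp4` is only a placeholder file name.
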
Model Training

Using this dataset, I trained a model and evaluated it using Blurr.

Blurr: Blurr is a library for training a model on a dataset. It combines HuggingFace and Fastai, which makes it faster to train a model and get impressive accuracy.

For training, I used two pretrained models from HuggingFace:
1. distilroberta-base
2. j-hartmann/emotion-english-distilroberta-base

I tried different batch sizes: 2, 4, 16, and 32.

Final Model Selection

Here are all the training results:

| Model | Batch Size | Accuracy |
| --- | --- | --- |
| emotion-english-distilroberta-base | 2 | 21% |
| emotion-english-distilroberta-base | 4 | 21.2% |
| emotion-english-distilroberta-base | 16 | 41% |
| emotion-english-distilroberta-base | 32 | 21% |
| distilroberta-base | 2 | 11% |
| distilroberta-base | 4 | 23% |
| distilroberta-base | 16 | 51% |
| distilroberta-base | 32 | 98% |

The distilroberta-base model with batch size 32 reached the highest accuracy (98%), so I kept it for further use. Code

Model Size Compression

Model size compression is one of the important tasks in ML, as we all want to build something that uses little memory and is also fast and accurate. ONNX is the best option, in my opinion.

Before using ONNX, my final model size was 314 MB. After using ONNX, the model was reduced to 78.7 MB, which is about 4 times smaller than the original model.
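
The text above does not say how the ONNX model was made smaller. A reduction of roughly 4x is what one would typically expect when float32 weights are stored as int8, so the sketch below assumes dynamic quantization with ONNX Runtime. That is an assumption, not a confirmed detail of this project, and the function names and file paths are hypothetical.

```python
def compression_ratio(before_mb: float, after_mb: float) -> float:
    """How many times smaller the compressed model is."""
    return before_mb / after_mb


def quantize_onnx_model(src_path: str, dst_path: str) -> None:
    """Write an int8-weight copy of an exported ONNX model (assumed approach)."""
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(src_path, dst_path, weight_type=QuantType.QInt8)


# Sizes reported in this README.
print(f"{compression_ratio(314, 78.7):.2f}x smaller")  # prints "3.99x smaller"
```

For example, `quantize_onnx_model("model.onnx", "model-int8.onnx")` would write the smaller file next to the original; both file names are placeholders.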

Deployment

This project is deployed on the HuggingFace platform, which is like heaven for a data scientist or ML engineer. You can use the deployed app from here

Here is how the deployed app looks:

Interface

I wanted to make this freely available through a website. To build the website, I used the Flask framework. The UI is very simple, with nothing complex in it. All of my work using Flask and the HuggingFace API is here

You can use my website.

Here are demo images of my website:

"Subject: Astrology": here, Astrology is generated by the ML model, which classified this video's summary as Astrology.

The text in "Context" is automatically generated from the video's audio.

And lastly, the text in "Summary" is automatically generated from "Context".
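
To illustrate how an upload could end up in a folder named after its subject, here is a minimal Flask sketch. It is not the code of this website: the `/upload` route, the `video` form field, the `uploads` folder, and the `classify` callable (assumed to return a dict with `subject`, `context`, and `summary` keys) are all hypothetical.

```python
from pathlib import Path


def destination_for(root: str, subject: str, filename: str) -> Path:
    """Build <root>/<subject>/<filename>, dropping any directory parts of the inputs."""
    return Path(root) / Path(subject).name / Path(filename).name


def create_app(classify, upload_root: str = "uploads"):
    """Create a small Flask app; `classify` maps a video path to a result dict."""
    from flask import Flask, jsonify, request
    from werkzeug.utils import secure_filename

    app = Flask(__name__)

    @app.route("/upload", methods=["POST"])
    def upload():
        video = request.files.get("video")
        name = secure_filename(video.filename) if video and video.filename else ""
        if not name:
            return jsonify(error="no video uploaded"), 400

        # Save to a temporary location first, because the subject is not known yet.
        incoming = Path(upload_root) / "_incoming" / name
        incoming.parent.mkdir(parents=True, exist_ok=True)
        video.save(str(incoming))

        result = classify(str(incoming))  # assumed keys: subject, context, summary
        target = destination_for(upload_root, result["subject"], name)
        target.parent.mkdir(parents=True, exist_ok=True)
        incoming.replace(target)  # move the video into its subject folder
        return jsonify(result)

    return app
```

Passing the classifier in as an argument keeps the web layer independent of the models, so the route can be tried with a stub before the real pipeline is wired in.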

Problem That I've Faced and Solution

While I was working on this project, I faced several problems.

1. Converting video to an audio file. At first, I used subprocess to convert mp4 to wav, but this process was slow. Then I used the MoviePy library, which was far faster.
2. Converting audio to text. I first used the HuggingFace model directly, which gave me the text for only 30 seconds of the video. Finally, I used the pipeline from transformers, which gave me the full text of a video, and faster.
3. Converting the full text into a summary. Feeding the whole text to the model gave me errors multiple times. After a while, I found a resource on YouTube explaining that I have to split the whole text into chunks and then feed those to the model for summarization. A chunk is a sub-text of the whole text with 512 words.
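
The chunking described in item 3 can be sketched like this. It is a minimal version under stated assumptions: words are split on whitespace, `summarize` is a hypothetical callable that turns one chunk into its summary, and the chunk summaries are simply joined with spaces.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int = 512) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long_text(text: str, summarize: Callable[[str], str], max_words: int = 512) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    return " ".join(summarize(chunk) for chunk in chunk_words(text, max_words))


print(chunk_words("one two three four five", max_words=2))
# prints ['one two', 'three four', 'five']
```

Note that a word count is not the same as a model's token count, so a chunk of 512 words can still be longer than a given model accepts; truncation may be needed on top of chunking.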

Conclusion

This project is made for our honorable senior teachers, who are the nation builders. Please show some respect towards them.

If you think this project needs some changes or you want to contribute, please open a pull request.

Thanks.
