Intelligent Helper Repository is my attempt to show how artificial intelligence can help improve access to education.
This is the project I created as my final project for the Data Analytics bootcamp at Ironhack. The idea was to improve the bootcamp experience, both by attracting new students and by helping the existing ones. So I decided to transcribe all the videos of my bootcamp, translate the texts (into Spanish at least, and some into Portuguese, Italian, German, Chinese, and Hindi, the languages spoken by my fantastic cohort), caption them, and generate a summary and keywords, all automatically (at the moment not every step is fully automated, but all the functions for it already exist).
The project also includes an app to search for videos about a topic (or any word you want, in any of the translated languages). Once you decide which video to explore, you can search inside it and jump to the exact moment where that topic is mentioned (I implemented a FULLTEXT index in MySQL with the natural language search option).
To play the videos with subtitles and jump to the exact position, I used the VLC media player.
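For context, launching VLC at a given moment with an external subtitle file can be done from Python roughly like this; the paths and the offset are placeholders, and the exact invocation in the app may differ:

```python
import subprocess

# Hypothetical example: open a video in VLC with an external subtitle file,
# starting playback at the moment the searched phrase occurs (in seconds).
video_path = "lessons/week3_sql.mp4"        # placeholder path
subtitle_path = "lessons/week3_sql.es.srt"  # placeholder path
start_seconds = 754                         # e.g. the timestamp returned by the search

subprocess.Popen([
    "vlc",
    video_path,
    f"--sub-file={subtitle_path}",    # load external subtitles
    f"--start-time={start_seconds}",  # jump to the matched moment
])
```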
I would have liked to train my own models, but my hardware and the available time made me change my mind. However, this project is not closed, so maybe in the future I will come back to improve it.
Process
- Transcription
For this task, I decided to use Whisper. After testing different options, it was the one that best fitted my project. I used the "base.en" model because of my schedule (I had to build the project after work) and my hardware limitations, but larger or even multilingual models can be used for better results.
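A minimal sketch of the transcription step, assuming the openai-whisper package (the file name is a placeholder; switching to a larger or multilingual model only changes the model string):

```python
import whisper

# Load the small English-only model; "medium" or "large-v2" would give
# better quality at the cost of time and VRAM.
model = whisper.load_model("base.en")

# Whisper extracts the audio track itself (via ffmpeg) and returns the
# full text plus timestamped segments.
result = model.transcribe("lessons/week3_sql.mp4")  # placeholder path
print(result["text"])         # full transcript
print(result["segments"][0])  # first segment with start/end timestamps
```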
- Subtitles
I had many problems managing the subtitles. I tried splitting the videos, but re-synchronizing the subtitles was impossible in the time I had, so in the end I decided to use Stable Whisper without splitting the video.
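A sketch of how the subtitle file can be produced, assuming a recent version of the stable-ts package behind Stable Whisper (the paths are placeholders):

```python
import stable_whisper

model = stable_whisper.load_model("base.en")
result = model.transcribe("lessons/week3_sql.mp4")  # placeholder path

# Stable Whisper refines the timestamps, so the .srt stays in sync
# without having to split the video first.
result.to_srt_vtt("lessons/week3_sql.en.srt", word_level=False)
```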
- Summary
For this task, I decided to use Hugging Face transformers after trying other options. I also tried many different models and settled on mrm8488/flan-t5-large-finetuned-openai-summarize_from_feedback, which gave me the best results.
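The summarization step can be reproduced with the transformers pipeline; chunking the transcript as shown below is my own simplification, since long transcripts exceed the model's input window:

```python
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="mrm8488/flan-t5-large-finetuned-openai-summarize_from_feedback",
)

# Long transcripts exceed the model's input length, so summarize in chunks
# and join the partial summaries (simplified here to a single chunk).
transcript = open("lessons/week3_sql.en.txt").read()  # placeholder path
chunk = transcript[:3000]
summary = summarizer(chunk, max_length=150, min_length=40, do_sample=False)
print(summary[0]["summary_text"])
```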
- Keywords
KeyBERT was chosen among several alternatives, and the embedding model I used was universal-sentence-encoder.
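A sketch of the keyword step, assuming KeyBERT with the Universal Sentence Encoder loaded through tensorflow_hub (the n-gram range and top_n are illustrative choices, not necessarily the project's settings):

```python
import tensorflow_hub as hub
from keybert import KeyBERT

# Universal Sentence Encoder as the embedding backend for KeyBERT.
use_model = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
kw_model = KeyBERT(model=use_model)

transcript = open("lessons/week3_sql.en.txt").read()  # placeholder path
keywords = kw_model.extract_keywords(
    transcript,
    keyphrase_ngram_range=(1, 2),
    stop_words="english",
    top_n=10,
)
print(keywords)  # list of (keyword, score) pairs
```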
- Translation
This was the only part where I didn't try other options, because I had already used this approach in my Lyrics project and it worked fine for me. It was still not a smooth path: beware that translating subtitles can give you some surprises (broken line breaks...), so I decided to split the subtitles into lines, translate them one by one, and join them back together at the end.
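A sketch of that line-by-line idea; deep-translator is only a stand-in for the translation backend (not necessarily the one used in this project), and the paths are placeholders. Only the text lines of the .srt are translated, while indices and timestamps are left untouched:

```python
from deep_translator import GoogleTranslator  # stand-in backend

translator = GoogleTranslator(source="en", target="es")

def is_text_line(line: str) -> bool:
    """True for subtitle text lines (not indices, timestamps, or blanks)."""
    stripped = line.strip()
    return bool(stripped) and not stripped.isdigit() and "-->" not in stripped

with open("lessons/week3_sql.en.srt", encoding="utf-8") as f:  # placeholder path
    lines = f.read().splitlines()

# Translate line by line so the translator never sees (and breaks) the
# subtitle line breaks, then join everything back together.
translated = [
    translator.translate(line) if is_text_line(line) else line
    for line in lines
]

with open("lessons/week3_sql.es.srt", "w", encoding="utf-8") as f:
    f.write("\n".join(translated))
```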
- Database
This process stores in a MySQL database all the texts I obtained (originals and translations), linked to their videos. I created a FULLTEXT index in the database so I could use the natural language option to search the videos.
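The search itself boils down to a MATCH ... AGAINST query in natural language mode; the table and column names below are illustrative, not the real schema:

```python
import mysql.connector

# Placeholder credentials and database name.
conn = mysql.connector.connect(
    host="localhost", user="user", password="secret", database="videos_db"
)
cursor = conn.cursor()

# Hypothetical schema: a `transcripts` table with a FULLTEXT index on `content`
# (e.g. ALTER TABLE transcripts ADD FULLTEXT(content)).
# Natural language mode ranks rows by relevance to the free-text query.
query = """
    SELECT video_id, start_second,
           MATCH(content) AGAINST (%s IN NATURAL LANGUAGE MODE) AS score
    FROM transcripts
    WHERE MATCH(content) AGAINST (%s IN NATURAL LANGUAGE MODE)
    ORDER BY score DESC
    LIMIT 10
"""
search_term = "window functions"
cursor.execute(query, (search_term, search_term))
for video_id, start_second, score in cursor.fetchall():
    print(video_id, start_second, round(score, 3))
```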
- APP
Since I had already tried Flask in my Lyrics project, for this project I wanted to try other options. After researching Streamlit and Django, I decided to take the complicated path and used Django (with Bootstrap 5).
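Just to give an idea of the shape of the app, a search view in Django could look roughly like this (the Transcript model, table, and template names are hypothetical, not taken from the actual code):

```python
# views.py -- hypothetical names, only to illustrate the shape of the search view
from django.shortcuts import render
from .models import Transcript  # hypothetical model backed by the MySQL table

def search(request):
    term = request.GET.get("q", "")
    results = []
    if term:
        # Raw query so we can use MySQL's FULLTEXT natural language search.
        results = Transcript.objects.raw(
            "SELECT * FROM transcripts "
            "WHERE MATCH(content) AGAINST (%s IN NATURAL LANGUAGE MODE)",
            [term],
        )
    return render(request, "search.html", {"term": term, "results": results})
```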
As I said before, the limited time didn't allow me to finish everything I had in mind (launching all the logic from Django, updating videos, and an option to transcribe online videos; at the moment only local files can be used). But I think this project already contains a lot of information about the complicated path I chose.
With this project, my cohort gave me the chance to participate in Ironhack's Remote Hackshow in March 2023.
To sum up, here are the links for my project:
- Source code (including the Django app): My Github
- Presentation video: My Youtube
- Hackshow video: My Youtube