Presentify | Devpost


Inspiration

Presentations are often a grueling task, requiring significant time and effort to convey information effectively. Our inspiration for Presentify came from the desire to simplify this process, making presentations more accessible, engaging, and inclusive. We aim to enhance communication by integrating speech-to-text technology, which allows for real-time subtitle generation, thereby supporting individuals with diverse needs.

What it does

Presentify empowers users to create dynamic presentations that incorporate:

Real-time subtitles generated from spoken input, making presentations more accessible to deaf or hard-of-hearing individuals.
Visual presentations that include images and bullet points, enhancing audience engagement and comprehension. This also makes presentation building more accessible to those who are not familiar with technology or who have chronic pain that impacts their ability to type and use technology.
Audio effects that can be triggered by specific keywords detected during the speech, adding a layer of interactivity to a presentation sequence.
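
A minimal sketch of how keyword-triggered audio effects could work, assuming a simple keyword-to-sound table. The `EFFECTS` table, the file names, and the `findTriggeredEffects` function are hypothetical and only illustrate the idea; they are not taken from the project's code.

```javascript
// Hypothetical keyword-to-sound mapping; the file names are placeholders.
const EFFECTS = {
  applause: "applause.mp3",
  drumroll: "drumroll.mp3",
};

// Returns the sound files whose keyword appears as a whole word in the
// transcript and has not fired yet. `fired` is a Set that is updated in
// place, so each keyword triggers at most once per presentation.
function findTriggeredEffects(transcript, effects, fired) {
  const words = new Set(transcript.toLowerCase().match(/[a-z']+/g) ?? []);
  const triggered = [];
  for (const [keyword, file] of Object.entries(effects)) {
    if (words.has(keyword) && !fired.has(keyword)) {
      fired.add(keyword);
      triggered.push(file);
    }
  }
  return triggered;
}

const fired = new Set();
console.log(findTriggeredEffects("Please give a round of applause", EFFECTS, fired));
console.log(findTriggeredEffects("More applause, then a drumroll", EFFECTS, fired));
```

In a browser, each returned file could then be played with `new Audio(file).play()`. Tracking fired keywords is one possible way to keep a repeated word from replaying the same sound over and over.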

How we built it

We built Presentify using a combination of the following technologies:

React for building a responsive user interface, allowing for real-time updates and smooth user interactions.
Web Speech API for converting speech to text, enabling automatic subtitle generation.
GPT-4o mini to summarize speech content into bullet points and identify keywords for image pairing.
Pexels API for stock image fetching, allowing us to pull relevant imagery.
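
To illustrate the image-pairing step, here is a hedged sketch of how a keyword could be turned into a Pexels photo search. The endpoint and the `Authorization` header follow the public Pexels API; the function names, the `per_page` value, and the choice of the `medium` image size are assumptions for illustration, not the project's actual code.

```javascript
// Builds the URL and headers for a Pexels photo search.
// Function name and default perPage are assumptions for this sketch.
function buildPexelsRequest(keyword, apiKey, perPage = 1) {
  const url = new URL("https://api.pexels.com/v1/search");
  url.searchParams.set("query", keyword);
  url.searchParams.set("per_page", String(perPage));
  return { url: url.toString(), headers: { Authorization: apiKey } };
}

// Fetches the first matching photo URL, or null if nothing matches.
// Running this on the server keeps the API key out of the browser.
async function fetchImageUrl(keyword, apiKey) {
  const { url, headers } = buildPexelsRequest(keyword, apiKey);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Pexels request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.photos?.[0]?.src?.medium ?? null;
}

// "YOUR_API_KEY" is a placeholder, not a real key.
console.log(buildPexelsRequest("solar panels", "YOUR_API_KEY").url);
```

The keyword passed in would come from the summarization step; falling back to `null` lets the slide render without an image when the search returns nothing.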

Challenges we ran into

During development, we encountered several challenges:

Speech Recognition Accuracy: Ensuring high accuracy in recognizing speech across different accents and speech patterns was a significant hurdle.
Real-time Processing: Managing the performance of real-time subtitle generation without lag was essential to maintaining a smooth user experience.
Audio Management: Synchronizing audio effects with keywords while ensuring they do not distract from the presentation’s content required careful design and testing.
Accurate Generated Content: Making sure the generated bullet points and images corresponded closely enough to the speech content was often difficult due to the dynamic nature of public speaking.
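
One common way to approach the real-time subtitle challenge with the Web Speech API is to display interim results immediately while treating only finalized results as stable text. The sketch below shows that pattern under stated assumptions: `splitTranscript`, `toSubtitle`, and the 12-word window are hypothetical names and values, and the browser wiring is only a rough outline rather than the project's implementation.

```javascript
// Splits a SpeechRecognition result list into finalized text and the
// still-changing interim text. `results` is array-like; each entry has
// `isFinal` and a first alternative with `transcript`.
function splitTranscript(results) {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < results.length; i++) {
    const text = results[i][0].transcript;
    if (results[i].isFinal) finalText += text;
    else interimText += text;
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}

// Keeps only the last `maxWords` words so the subtitle stays short.
function toSubtitle(text, maxWords = 12) {
  const words = text.split(/\s+/).filter(Boolean);
  return words.slice(-maxWords).join(" ");
}

// Browser-only wiring; skipped when the API is unavailable (for example in Node).
const Recognition =
  typeof window !== "undefined" &&
  (window.SpeechRecognition || window.webkitSpeechRecognition);
if (Recognition) {
  const recognition = new Recognition();
  recognition.continuous = true; // keep listening across pauses
  recognition.interimResults = true; // deliver partial guesses early
  recognition.onresult = (event) => {
    const { finalText, interimText } = splitTranscript(event.results);
    // Interim words appear right away; only finalText is stable enough
    // to hand to a later step such as summarization.
    console.log(toSubtitle(`${finalText} ${interimText}`));
  };
  recognition.start();
}
```

Showing interim text reduces perceived lag, at the cost of words that may change once the recognizer finalizes them.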

Accomplishments that we're proud of

We're proud of everything! This was a great experience for everyone on the team to work with technology that they were unfamiliar with or wanted to become more familiar with. From the ideation to the tech stack to the user experience and applications, we're extremely proud of ourselves.

What we learned

Real-time processing is difficult. It is hard to pick up on the punctuation and nuances that come with speech patterns. We had to put a lot of additional thought into our approach, and adjust it, in order to deal with latency and interpretation issues.

What's next for Presentify

Looking ahead, we plan to:

Enhance the speech recognition capabilities by incorporating machine learning for improved accuracy and adaptability.
Expand the audio effects library to include more diverse options, catering to a wider range of presentation styles and preferences.
Offer more flexibility with the visual formatting for the presentation, potentially enabling dynamic formatting on a slide-to-slide basis.

Built With

chatgpt
css
express.js
html
javascript
pexelsapi
react
webspeechapi

Submitted to

StormHacks 2024: Surge x Enactus x MSESS

Created by

UI/UX designer, front-end (audio queues + landing/ending) <3

Michelle Wan
4th year bucs @ ubc
Back-end (Open AI API, Pexels API), prompt engineering, server-client communication (Express.js)

Lucas Gingera
ubc business + cs
Jason Kuo
ubc cs + business '26
Marcus Kam
4th year bucs

Updates

Michelle Wan started this project — Oct 06, 2024 02:28 PM EDT
