What it does
Third Eye combines voice input with a portable camera to help blind and severely visually impaired users understand and navigate their surroundings. It is also useful for sighted users: it can be configured to expand the user's visual field, or serve as a security option that doesn't require visual attention.
How we built it
We created Third Eye by combining libraries, frameworks, and tools such as React Speech Recognition, GPT-4, and Web Text to Speech within a carefully designed application. We researched each one, interfaced them with each other, and connected them to our own laptops and phones. We tested on multiple devices to ensure functionality and to lay the groundwork for a future compact physical version of Third Eye.
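The core loop pairs a transcribed voice question with a camera frame and sends both to GPT-4. A minimal sketch of how such a request payload could be assembled is below; the function name, system prompt, and exact model string are our assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical helper: pair the user's transcribed question with a
// base64-encoded camera frame in a GPT-4 chat-completion payload.
interface ChatMessage {
  role: "system" | "user";
  content:
    | string
    | Array<
        | { type: "text"; text: string }
        | { type: "image_url"; image_url: { url: string } }
      >;
}

function buildVisionRequest(transcript: string, frameBase64: string) {
  const messages: ChatMessage[] = [
    {
      role: "system",
      // Assumed prompt: concise scene descriptions for a blind user.
      content:
        "Describe the user's surroundings from the camera frame, briefly and concretely.",
    },
    {
      role: "user",
      content: [
        { type: "text", text: transcript },
        {
          type: "image_url",
          image_url: { url: `data:image/jpeg;base64,${frameBase64}` },
        },
      ],
    },
  ];
  // Model name is an assumption; any vision-capable GPT-4 variant fits.
  return { model: "gpt-4", messages, max_tokens: 200 };
}
```

The payload would then be POSTed to the chat-completions endpoint, and the reply read aloud via text-to-speech.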
Challenges we ran into
We struggled with several bugs in Third Eye's behaviour while it was under construction; early on we overcomplicated both our back end and our library usage. By thinking critically about our design, we cut out the unnecessary components. We also initially had unreliable speech detection; determined, we tried different API options until we found one suited to Third Eye's needs.
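One common way to tame unreliable speech detection is to filter out noise blips and low-confidence results before they trigger a backend call. The sketch below is purely illustrative (the names, thresholds, and the confidence field are our assumptions), not the project's actual fix:

```typescript
// Illustrative guard: drop transcripts that are empty, too short,
// or below a confidence threshold before sending them to GPT-4.
interface RecognitionResult {
  transcript: string;
  confidence: number; // 0..1, as reported by the recognizer (assumed)
}

function usableTranscript(
  result: RecognitionResult,
  minConfidence = 0.6, // assumed threshold
): string | null {
  const text = result.transcript.trim();
  if (text.length < 3) return null; // ignore noise blips
  if (result.confidence < minConfidence) return null;
  return text;
}
```

A guard like this keeps spurious recognizer output from wasting API calls and from speaking nonsense back to the user.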
Built With
- css
- next.js
- react
- typescript
- web-speech
- whisper


