TrackBack continuously records the environment using a camera and uses YOLOv4 object detection to identify and log objects, recording each object's bounding box, location, label, and most prominent colour. When users forget where an item was placed, they can type a query like "Where is my brown mug?" into our web app. An LLM (powered by Voiceflow) then uses an image of the object, along with the object details stored by the YOLO model, to give instructions on where the object is.
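As an illustration of the "most prominent colour" field, here is a minimal sketch of one way it could be computed for the pixels inside a bounding box. The write-up does not say how TrackBack does this, so the palette, the function names, and the nearest-colour approach are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical palette of named colours as (R, G, B); not taken from the project.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (200, 30, 30),
    "green": (30, 160, 60),
    "blue": (30, 60, 200),
    "brown": (120, 72, 40),
}


def nearest_colour(pixel):
    """Return the palette name closest to an (R, G, B) pixel by squared distance."""
    return min(
        PALETTE,
        key=lambda name: sum((p - c) ** 2 for p, c in zip(pixel, PALETTE[name])),
    )


def dominant_colour(pixels):
    """Name the palette colour that the largest number of pixels map to."""
    counts = Counter(nearest_colour(p) for p in pixels)
    return counts.most_common(1)[0][0]


# Toy crop: five brownish pixels and two near-white pixels.
crop = [(118, 70, 42)] * 5 + [(250, 250, 250)] * 2
colour = dominant_colour(crop)
```

A named colour like this is what lets a query such as "brown mug" be matched against the logged objects.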
One of our key features is the chat function. We understand that seniors tend to have a hard time with technology, so having a chat feature that mimics real-life conversations could help them use this software.
We used YOLOv4 for object detection. Objects detected in the real-time feed are labelled, and their associated metadata (bounding boxes, labels, locations, colours) is transformed into a structured format accessible by Voiceflow. The data is stored in a tabular format so that it is compatible with Voiceflow's LLM, which handles user queries and retrieves the relevant object information.
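The tabular step described above can be sketched as follows. This is only an illustration: the column names, the shape of each detection record, and the `detections_to_csv` helper are assumptions, since the write-up names the kinds of metadata but not the exact schema.

```python
import csv
import io

# Hypothetical column layout; the project only names the kinds of metadata stored.
FIELDNAMES = ["label", "x", "y", "width", "height", "location", "colour"]


def detections_to_csv(detections):
    """Flatten a list of detection dicts into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for det in detections:
        # Assumed box format: (x, y, width, height) in pixels.
        x, y, width, height = det["box"]
        writer.writerow({
            "label": det["label"],
            "x": x,
            "y": y,
            "width": width,
            "height": height,
            "location": det["location"],
            "colour": det["colour"],
        })
    return buffer.getvalue()


# One made-up detection, shaped like a JSON record from the detector.
sample = [
    {"label": "mug", "box": (120, 80, 60, 75),
     "location": "kitchen counter", "colour": "brown"},
]
csv_text = detections_to_csv(sample)
```

Each detection becomes one flat row, so nested fields such as the bounding box are split into separate columns rather than kept as a nested JSON object.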
Our front-end was first designed in Figma and then implemented in React.js. The information displayed comes from calling the Voiceflow API.
- Understanding how Voiceflow flows
- Working out how to store and format the data from the YOLOv4 model so that Voiceflow could access it, which led us to pivot from submitting a JSON file to a CSV.
- Figuring out how to stream video from our phones to S3.
- Finding a plausible solution to a problem we're all passionate about
- Developing a functional minimum viable product: a YOLOv4 model that integrates seamlessly with Voiceflow through an intuitive user interface
- Building an easy-to-navigate and ✨aesthetic✨ user interface
- A CSV file in a table format worked best for storing the data produced by YOLOv4, because it is compatible with the LLM, unlike passing the data directly as a JSON file.
- How to take in camera footage and stream the frames in 15-second segments to S3
- The importance of designing a simple user interface that senior citizens living with dementia can navigate easily
- Integrating voice commands to enable voice-based queries for users who find typing difficult or impossible
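The 15-second segmenting mentioned in the list above can be sketched as follows. This is a rough illustration, not the project's pipeline: the `(timestamp, frame)` input shape, the helper names, and the object key layout are all assumptions, and the upload itself is left out.

```python
SEGMENT_SECONDS = 15  # segment length stated in the write-up


def segment_index(timestamp_s):
    """Map a capture time in seconds to the index of its 15-second segment."""
    return int(timestamp_s // SEGMENT_SECONDS)


def group_frames(frames):
    """Group (timestamp_seconds, frame) pairs by segment index, keeping order."""
    segments = {}
    for timestamp_s, frame in frames:
        segments.setdefault(segment_index(timestamp_s), []).append(frame)
    return segments


def segment_key(camera_id, index):
    """Build a hypothetical S3 object key for one segment."""
    return f"{camera_id}/segment-{index:06d}.mp4"


# Frames are stand-in strings here; real frames would be image data.
frames = [(0.0, "a"), (14.9, "b"), (15.0, "c"), (31.2, "d")]
segments = group_frames(frames)
keys = [segment_key("phone-1", index) for index in sorted(segments)]
```

Each group would then be encoded as a clip and uploaded under its key, for example with an S3 client library, once the segment's 15 seconds have elapsed.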
- Adapting the current tool into an easy-to-navigate, user-friendly mobile application for improved access across devices
- Adding a map feature that gives the user directions to items misplaced outside the home
- Creating a collaborative memory base where multiple users can share camera footage to help find a misplaced object with improved efficiency