Inspiration

Robotics and artificial intelligence are two of the most powerful technologies today, but building systems that combine them is often extremely difficult. Developers frequently spend more time integrating sensors, hardware, and software pipelines than actually building intelligent behaviors.

In Canada, this challenge is amplified by the country's vast geography, remote communities, and growing need for automation in industries like healthcare, logistics, agriculture, and accessibility technology. Many of these applications require intelligent machines that can operate autonomously or assist people where human resources are limited.

However, developing these systems typically requires deep expertise in machine learning, robotics, and embedded systems, which creates a barrier for students, researchers, and small teams who want to build solutions for real-world problems.

We were inspired by the idea that if we could dramatically lower the barrier to building AI-powered robotics, more people could develop solutions for challenges facing Canada — from assistive technologies to remote automation.

This led us to build AutoniMake, a platform that allows users to train computer vision models and connect them to hardware actions without needing to write complex machine learning code.

That inspiration crystallized into a single question:

What if building an autonomous robot was as easy as training an image classifier?

Our goal was to lower that barrier so that anyone, not just machine learning engineers, could teach machines new behaviors.

What it does

AutoniMake is a code-free AI robotics platform that allows users to train computer vision models and instantly connect those models to hardware actions.

Using our interface, a user can:

Capture training examples from a camera

Train a computer vision model in seconds

Map predictions to hardware actions

Control real-world devices like robots, displays, and sensors

For example, a user could train the system to recognize hand gestures:

👍 → move robot forward

✋ → stop robot

✌️ → turn robot left

Once trained, the system performs real-time inference and sends commands to modular hardware peripherals.

This allows anyone to prototype autonomous systems without building the entire robotics pipeline from scratch.
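The gesture-to-action mapping above can be expressed as a simple lookup table. The sketch below is illustrative only; the label and action names are assumptions, not AutoniMake's actual identifiers.

```python
# Hypothetical label-to-action table; names are illustrative.
GESTURE_ACTIONS = {
    "thumbs_up": "move_forward",
    "open_palm": "stop",
    "peace":     "turn_left",
}

def action_for(label: str) -> str:
    """Look up the hardware action for a predicted gesture label."""
    # Unknown labels fall back to "stop" as a safe default.
    return GESTURE_ACTIONS.get(label, "stop")
```

Keeping this mapping in plain data (rather than code) is what makes the workflow code-free: the interface only has to edit a table.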

How we built it

AutoniMake combines computer vision, machine learning, and modular hardware control into a single framework.

Computer Vision Pipeline

We built a vision pipeline using OpenCV for image preprocessing and data capture. Camera frames are processed and passed to a lightweight convolutional neural network (CNN) for classification.

A CNN works by learning spatial patterns within images through convolutional filters. Each layer extracts increasingly complex features:

Feature Extraction → Pattern Recognition → Classification

This allows the system to recognize gestures, objects, or signals from camera input.

AI Training System

Users can collect datasets directly from the camera feed. These images are used to train a custom classification model that predicts labels and outputs confidence scores.

Example prediction output:

Gesture: thumbs_up
Confidence: 94%

The system performs real-time inference, allowing visual input to instantly trigger hardware responses.
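Gating hardware responses on the model's confidence score keeps a shaky prediction from moving a physical robot. A minimal sketch, assuming a prediction dictionary like the example output above and a 0.8 threshold (the threshold value is an assumption):

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed value for illustration

def should_act(prediction: dict) -> bool:
    """Only trigger a hardware response when the model is confident enough."""
    return prediction["confidence"] >= CONFIDENCE_THRESHOLD

prediction = {"gesture": "thumbs_up", "confidence": 0.94}
if should_act(prediction):
    print(f"Gesture: {prediction['gesture']}  Confidence: {prediction['confidence']:.0%}")
```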

Hardware Architecture

AutoniMake uses a hub-and-peripheral architecture.

A Raspberry Pi acts as the central hub

The hub runs the AI pipeline and web interface

Commands are sent to hardware modules via serial communication

Peripheral devices are powered by ESP32 microcontrollers, which control hardware components like:

Displays

Motors

Sensors

Commands follow a simple protocol:

DISPLAY:Hello
ROBOT:F
ROBOT:STOP

This modular design allows new hardware components to be added easily without changing the AI system.
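Because every command is just a device name and a payload separated by a colon, parsing it is a one-liner on either end of the serial link. A sketch of the hub-side logic (the error handling is an assumption; on the hub, the encoded command would then be written to the ESP32 over serial, e.g. with pyserial's `Serial.write`):

```python
def parse_command(raw: str) -> tuple[str, str]:
    """Split a 'DEVICE:PAYLOAD' command like 'DISPLAY:Hello' or 'ROBOT:F'."""
    device, _, payload = raw.partition(":")
    if not payload:
        # No colon (or empty payload): reject rather than guess.
        raise ValueError(f"malformed command: {raw!r}")
    return device, payload
```

A new peripheral only needs to recognize its own device prefix, which is what lets modules be added without touching the AI side.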

Interface

We built a simple web-based interface that allows users to:

capture training images

train AI models

map predictions to hardware actions

monitor predictions in real time

This makes the platform accessible even for users with no machine learning experience.

Challenges we ran into

One of the biggest challenges was integrating multiple systems in real time.

Our project involved:

computer vision

machine learning

web interfaces

microcontroller communication

robotics hardware

Ensuring all these components worked together with low latency required careful system design.

Another challenge was building a reliable communication pipeline between the Raspberry Pi hub and ESP32 peripherals. We had to design a simple command protocol and ensure commands were parsed correctly across devices.

Finally, training models quickly enough for a live demo required optimizing our dataset size and inference pipeline.

Accomplishments that we're proud of

Building a complete end-to-end AI robotics system in a short time. We successfully integrated computer vision, machine learning, web software, and embedded hardware into a single platform that allows AI predictions to directly control real-world devices.

Creating a code-free workflow for training AI models. Instead of requiring users to write machine learning code, AutoniMake allows users to capture training data, train a vision model, and deploy it to hardware through a simple interface.

Developing a modular hardware architecture. Our Raspberry Pi hub communicates with ESP32 peripherals using a custom command protocol, allowing multiple hardware modules like displays or robots to be connected and controlled by AI.

Achieving real-time AI inference controlling physical devices. Our system processes camera input, classifies it using a CNN-based model, and sends commands to hardware modules with minimal latency.

Designing an accessible platform for robotics experimentation. By simplifying both the AI and hardware setup, we created a system that lowers the barrier for students, makers, and researchers to build autonomous systems.

Successfully demonstrating live training and deployment. One of our biggest accomplishments was being able to train a model and immediately see its predictions trigger hardware responses in real time.

What we learned

Through this project we learned how to integrate AI systems with real-world hardware, which requires far more than just training models.

We gained experience in:

real-time computer vision pipelines

CNN-based image classification

embedded systems communication

modular robotics architecture

designing user-friendly interfaces for complex technologies

Perhaps most importantly, we learned how powerful AI becomes when it can directly interact with the physical world.

What's next for AutoniMake

Our vision for AutoniMake is to expand it into a full AI robotics development platform.

Future improvements could include:

support for more sensors and hardware modules

more advanced AI models

a drag-and-drop behavior builder

cloud-based model training

expanded robotics applications

Ultimately, we want to make building intelligent machines accessible to anyone, not just robotics experts.
