Hexapod Self-Leveling with Q-Learning

Skills: Python, Reinforcement Learning, Q-Learning, Sensor Fusion, Hexapod Robotics

Project Overview

This project, completed for CS/ME 301, applied Q-learning for dynamic self-leveling of a HiWonder hexapod robot on an inclined surface. The robot used IMU data to evaluate its inclination and learned effective leg adjustments through trial-and-error, guided by a custom reward function. All learning ran on real hardware, demonstrating the practical challenges and successes of reinforcement learning in physical robotics.
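
As a rough illustration of how IMU readings could drive a reward signal, here is a minimal sketch; the goal tolerance, bonus value, and function name are assumptions for illustration, not the project's actual design.

```python
import math

# Assumed shaping: reward is highest when the body is level and decreases
# with total tilt. GOAL_TOLERANCE_DEG and the +10 bonus are placeholders.
GOAL_TOLERANCE_DEG = 2.0

def reward_from_imu(roll_deg: float, pitch_deg: float) -> float:
    """Compute reward from IMU roll/pitch: penalize tilt, bonus when level."""
    tilt = math.hypot(roll_deg, pitch_deg)  # combined inclination magnitude
    if tilt < GOAL_TOLERANCE_DEG:
        return 10.0   # goal state reached: robot is approximately level
    return -tilt      # otherwise, negative reward proportional to tilt
```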

Reinforcement Learning Framework

Hexapod Self-Leveling Markov Decision Process
Left: Hexapod leg DOFs used for leveling. Right: RL Markov Decision Process flow.
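
A plausible encoding of this MDP, consistent with the figure, might discretize the IMU angles into a finite state and treat small per-leg nudges as the action set; the bin width, step size, and exact action layout below are assumptions.

```python
# Assumed MDP encoding: state = binned (roll, pitch) from the IMU;
# action = nudge one leg joint up or down by a fixed step.
BIN_DEG = 5.0    # state bin width (illustrative)
STEP_DEG = 4.0   # per-action servo adjustment (illustrative)
NUM_LEGS = 6

def encode_state(roll_deg: float, pitch_deg: float) -> tuple[int, int]:
    """Discretize continuous IMU angles into a finite state."""
    return (round(roll_deg / BIN_DEG), round(pitch_deg / BIN_DEG))

# One "raise" and one "lower" action per leg -> 12 discrete actions.
ACTIONS = [(leg, direction) for leg in range(NUM_LEGS) for direction in (+1, -1)]

def apply_action(leg_angles: list[float], action: tuple[int, int]) -> list[float]:
    """Return new servo targets after nudging one leg by STEP_DEG."""
    leg, direction = action
    adjusted = list(leg_angles)
    adjusted[leg] += direction * STEP_DEG
    return adjusted
```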

Training Process

Each state-action value was updated after every move using the standard Q-learning rule:

\[Q(S, A) \leftarrow Q(S, A) + \alpha \left[ R + \gamma \max_{a'} Q(S', a') - Q(S, A) \right]\]
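
A minimal tabular implementation of this update, paired with an epsilon-greedy policy, might look like the sketch below; the hyperparameter values and action set are placeholders, not the values used on the robot.

```python
import random
from collections import defaultdict

# Placeholder hyperparameters: learning rate, discount, exploration rate.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
# Assumed action set: raise/lower each of six legs (12 discrete actions).
ACTIONS = [(leg, d) for leg in range(6) for d in (+1, -1)]

Q = defaultdict(float)  # Q[(state, action)] -> value, defaults to 0.0

def choose_action(state):
    """Epsilon-greedy: explore randomly, otherwise exploit the Q-table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, r, next_state):
    """Apply Q(S,A) += alpha * [R + gamma * max_a' Q(S',a') - Q(S,A)]."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
```

On hardware, each epoch would loop through this cycle: read the IMU, encode the state, choose an action, command the servos, then call `q_update` with the observed reward and next state.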

Results & Analysis

Accelerometer & Reward data
Reward trends (raw, running average, and standard deviation) over time show that the system became less level as the epoch continued, and the growing standard deviation indicates that the robot was not able to learn enough to reduce its variance (a sketch of the rolling-statistics computation appears after these captions).
Goal State Timestamps
Timestamps of epochs in which the goal state was reached show that the rate of successful epochs accelerated (the large gap corresponds to a night away from the lab).
Servo Motor Position Histogram
The servo position histogram shows that little to no bias was learned in the motor positions.
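
For reference, the rolling statistics behind plots like the reward-trend figure could be computed along these lines; the window size is arbitrary and `rewards` is a placeholder array, not the project's logged data.

```python
import numpy as np

def rolling_stats(rewards: np.ndarray, window: int = 20):
    """Return (rolling mean, rolling std) of per-epoch rewards over a trailing window."""
    means, stds = [], []
    for i in range(len(rewards)):
        chunk = rewards[max(0, i - window + 1): i + 1]  # trailing window
        means.append(chunk.mean())
        stds.append(chunk.std())
    return np.array(means), np.array(stds)
```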

Limitations & Reflections