Gesture perception is an expanding area of research that focuses on enabling more intuitive human-machine interaction. Rather than relying solely on traditional input devices such as keyboards or controllers, gesture perception allows humans to communicate with machines using natural body and hand movements. This enhances the user experience by making interactions more seamless and accessible, particularly in environments where physical interaction with a device is impractical. By interpreting gestures, machines can better understand human intentions, improving efficiency and responsiveness across a wide range of applications, from industrial robotics to other domains.

The gesture perception system is composed of two main components: the Gesture Detection Module and the Camera Interpreter in ROS2. The Gesture Detection Module uses machine learning algorithms to recognize specific hand movements and interpret them as commands, enabling smooth human-machine interaction. Meanwhile, the Camera Interpreter in ROS2 processes visual data captured by the camera, converting it into actionable input for the system. Together, these components ensure real-time gesture recognition and accurate interpretation, allowing machines to respond quickly and appropriately to human movements.
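As an illustration of how the Camera Interpreter could be wired into ROS2, the following minimal sketch shows an rclpy node that subscribes to a camera image topic and republishes recognized gestures as string commands. The topic names (/camera/image_raw, /gesture_command) and the detect_gesture hook standing in for the Gesture Detection Module are assumptions made for illustration, not the project's actual interfaces.

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from std_msgs.msg import String


class CameraInterpreterNode(Node):
    """Minimal sketch of a ROS2 node bridging camera frames to gesture commands."""

    def __init__(self):
        super().__init__('camera_interpreter')
        # Assumed topic names, chosen only for illustration.
        self.subscription = self.create_subscription(
            Image, '/camera/image_raw', self.on_image, 10)
        self.publisher = self.create_publisher(String, '/gesture_command', 10)

    def on_image(self, msg: Image):
        # Placeholder hook standing in for the Gesture Detection Module,
        # which would classify the frame and return a gesture label or None.
        gesture = self.detect_gesture(msg)
        if gesture is not None:
            command = String()
            command.data = gesture
            self.publisher.publish(command)

    def detect_gesture(self, image_msg: Image):
        # Hypothetical: invoke the trained gesture recognition model here.
        return None


def main():
    rclpy.init()
    node = CameraInterpreterNode()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()

Publishing the recognized gesture as a simple string command keeps the interpreter decoupled from any specific robot model, so the same node could, in principle, feed different controllers in each deployment scenario.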
Within the SMARTHANDLE project context, the hand gesture recognition system is being developed for deployment in two different scenarios, in which different kinds of robot models will be controlled by the system. To this end, it is necessary to enhance the gesture recognition model by training it with a larger dataset, incorporating a greater variety of hands and multiple ways of performing the same gesture. This approach increases the system’s robustness, ensuring higher accuracy across diverse users and environments. Additionally, gestures should be refined to make them more intuitive and effortless to perform, ultimately improving the user experience and making human-machine interaction more natural and efficient.
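One way to obtain part of the variety described above without recording every sample manually is to augment existing samples synthetically. The sketch below assumes each training sample is a set of 21 normalized 2D hand keypoints (a common hand-landmark layout) and applies small random rotations, scalings, and per-point jitter; the sample format and augmentation ranges are assumptions for illustration, not the project's actual training pipeline.

import numpy as np


def augment_landmarks(landmarks, rng):
    """Apply a random rotation, scaling, and jitter to one hand-landmark sample.

    landmarks: (21, 2) array of normalized x, y keypoints (assumed layout).
    """
    angle = rng.uniform(-15, 15) * np.pi / 180.0           # small in-plane rotation
    scale = rng.uniform(0.9, 1.1)                           # mild hand-size variation
    jitter = rng.normal(0.0, 0.01, size=landmarks.shape)    # per-keypoint noise

    centre = landmarks.mean(axis=0)
    rotation = np.array([[np.cos(angle), -np.sin(angle)],
                         [np.sin(angle),  np.cos(angle)]])
    # Rotate and scale around the hand centre, then add jitter.
    return (landmarks - centre) @ rotation.T * scale + centre + jitter


rng = np.random.default_rng(0)
sample = rng.random((21, 2))          # stand-in for one recorded landmark sample
augmented_copies = [augment_landmarks(sample, rng) for _ in range(10)]

Augmented copies of each recording would complement, rather than replace, genuinely new recordings from additional users performing the gestures in their own way.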