Robot Environment

A comprehensive Python framework for robotic pick-and-place operations with vision-based object detection and manipulation capabilities



Overview

Robot Environment is a modular Python framework that automates the control of robotic arms for precise pick-and-place tasks. By integrating AI-based object detection and flexible workspace management, robots such as the Niryo Ned2 or WidowX can autonomously identify and manipulate objects. A key feature is the decoupled architecture, in which image processing and robot control communicate efficiently via Redis. The system also supports real-time data streaming, thread-safe concurrent operation, and natural interaction through text-to-speech. With support for both simulation and physical hardware, the framework provides a flexible basis for complex robotic applications.
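
In condensed form, a typical session looks like the sketch below. It only combines calls that appear in the Quick Start and Advanced Features sections, with placeholder arguments:

from robot_environment.environment import Environment
from robot_workspace import Location
import time

# Create the environment (placeholder API key; simulation mode for a dry run)
env = Environment(el_api_key="your_elevenlabs_key", use_simulation=True,
                  robot_id="niryo", verbose=True, start_camera_thread=True)

# Observe the home workspace, then query the detected objects
env.robot_move2observation_pose(env.get_workspace_home_id())
time.sleep(2)  # allow the vision pipeline to publish detections
objects = env.get_detected_objects_from_memory()

# Pick the first detected object and place it at a fixed coordinate
if len(objects) > 0:
    obj = objects[0]
    env.robot().pick_place_object(
        object_name=obj.label(),
        pick_coordinate=[obj.x_com(), obj.y_com()],
        place_coordinate=[0.1, 0.1],
        location=Location.NONE,
    )

env.cleanup()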

Key Features

  • 🤖 Multi-Robot Support - Modular architecture supporting Niryo Ned2 and WidowX robotic arms
  • 👁️ Vision-Based Object Detection - Integration with multiple detection models, using vision_detect_segment
  • 🗺️ Workspace Management - Flexible workspace definition with camera-to-world coordinate transformation, using robot_workspace
  • 📡 Redis Communication - Efficient image streaming and object data sharing via Redis, using redis_robot_comm
  • 🔊 Text-to-Speech - Natural language feedback using text2speech
  • 🧵 Thread-Safe Operations - Concurrent camera updates and robot control with proper locking
  • 🎮 Simulation Support - Compatible with both real robots and Gazebo simulation
  • 💾 Object Memory Management - Intelligent tracking of detected objects with workspace-aware updates

Architecture

System Overview

┌──────────────────────────────────────────────────────────────┐
│                      Environment Layer                       │
│     (Central orchestrator coordinating all subsystems)       │
└──────────────────────────────────────────────────────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        │                   │                   │
┌───────▼────────┐   ┌──────▼──────┐   ┌────────▼─────────┐
│  Robot Control │   │   Vision    │   │    Workspace     │
│     Layer      │   │    Layer    │   │      Layer       │
└────────────────┘   └─────────────┘   └──────────────────┘
        │                   │                   │
┌───────▼────────┐   ┌──────▼──────┐   ┌────────▼─────────┐
│ RobotController│   │FrameGrabber │   │    Workspace     │
│   (Abstract)   │   │  (Abstract) │   │   (Abstract)     │
└────────────────┘   └─────────────┘   └──────────────────┘
        │                   │                   │
┌───────▼────────┐   ┌──────▼──────┐   ┌────────▼─────────┐
│ NiryoRobot     │   │ NiryoFrame  │   │ NiryoWorkspace   │
│  Controller    │   │   Grabber   │   │                  │
└────────────────┘   └─────────────┘   └──────────────────┘
                            │
                    ┌───────▼───────┐
                    │ Redis Streams │
                    │  (Images +    │
                    │   Objects)    │
                    └───────────────┘

Core Components

Environment Layer
  • Environment - Central orchestrator managing all subsystems
  • Coordinates camera updates and robot control
  • Manages object memory with workspace-aware tracking
  • Handles thread-safe operations with proper locking

Robot Control Layer
  • Robot - High-level robot API implementing pick-and-place operations
  • RobotController - Abstract base class for hardware control
  • NiryoRobotController - Niryo Ned2 implementation with pyniryo
  • WidowXRobotController - WidowX implementation with InterbotixManipulatorXS

Vision Layer
  • FrameGrabber - Abstract camera interface with Redis streaming
  • NiryoFrameGrabber - Niryo-mounted camera with undistortion
  • WidowXFrameGrabber - Intel RealSense integration (stub)

Workspace Layer
  • Workspace - Abstract workspace with coordinate transformation
  • NiryoWorkspace - Niryo-specific workspace implementation
  • Workspaces - Collection managing multiple workspaces

Communication Layer
  • RedisImageStreamer - Variable-size image streaming (from redis_robot_comm)
  • RedisMessageBroker - Object detection results publishing
  • RedisLabelManager - Dynamic object label configuration
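
All of these layers are reached through a single Environment instance. The accessors below (each used elsewhere in this documentation) show where the components live; the sketch assumes an env object created as in the Quick Start:

robot = env.robot()                          # Robot Control Layer: high-level Robot API
controller = env.get_robot_controller()      # hardware-level RobotController implementation
workspaces = env.workspaces()                # Workspace Layer: collection of workspaces
home_ws = env.get_workspace(0)               # a single Workspace with coordinate transforms
objects = env.get_detected_objects_from_memory()  # detections published via the Redis layer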

For detailed architecture documentation, see architecture.md


Installation

Prerequisites

  • Python ≥ 3.9
  • Redis Server ≥ 5.0
  • Robot-specific drivers:
      • Niryo: pyniryo or pyniryo2
      • WidowX: interbotix-xs-modules
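
To sanity-check these prerequisites, a short script such as the following can be used (a sketch that assumes a Redis server on the default localhost:6379):

import sys
import redis

# Verify the Python version required by the framework
assert sys.version_info >= (3, 9), "Python >= 3.9 is required"

# Verify that a Redis server is reachable on the default port
try:
    redis.Redis(host="localhost", port=6379).ping()
    print("Redis server reachable")
except redis.ConnectionError:
    print("Redis server not reachable - see the 'Redis Server' section below")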

Basic Installation

git clone https://github.com/dgaida/robot_environment.git
cd robot_environment
pip install -e .

Dependencies

Core dependencies are installed automatically with the package. To install them manually:

pip install numpy opencv-python redis torch torchaudio
pip install vision-detect-segment redis-robot-comm robot-workspace text2speech

Robot-specific dependencies:

# For Niryo Ned2
pip install pyniryo

# For WidowX
pip install interbotix-xs-modules

Redis Server

# Using Docker (recommended)
docker run -p 6379:6379 redis:alpine

# Or install locally
# Ubuntu/Debian:
sudo apt-get install redis-server

# macOS:
brew install redis

Quick Start

Basic Pick and Place

from robot_environment.environment import Environment
from robot_workspace import Location
import threading
import time

# Initialize environment
env = Environment(
    el_api_key="your_elevenlabs_key",  # For text-to-speech
    use_simulation=False,               # Set True for Gazebo
    robot_id="niryo",                   # or "widowx"
    verbose=True,
    start_camera_thread=True            # Auto-start camera updates
)

# Alternative: Manual camera thread control
def start_camera_updates(environment, visualize=False):
    def loop():
        for img in environment.update_camera_and_objects(visualize=visualize):
            pass
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t

# Move to observation pose
env.robot_move2observation_pose(env.get_workspace_home_id())

# Wait for object detection
time.sleep(2)

# Get detected objects
detected_objects = env.get_detected_objects_from_memory()
print(f"Detected {len(detected_objects)} objects:")
for obj in detected_objects:
    print(f"  - {obj.label()} at [{obj.x_com():.2f}, {obj.y_com():.2f}]")

# Pick and place an object
robot = env.robot()
success = robot.pick_place_object(
    object_name="pencil",
    pick_coordinate=[-0.1, 0.01],
    place_coordinate=[0.1, 0.11],
    location=Location.RIGHT_NEXT_TO
)

if success:
    print("✓ Object successfully picked and placed")
else:
    print("✗ Pick and place operation failed")

# Cleanup
env.cleanup()

Multi-Workspace Operations

from robot_environment.environment import Environment
from robot_workspace import Location
import time

env = Environment("key", False, "niryo", verbose=True)
robot = env.robot()

# Get workspace IDs
left_ws_id = env.workspaces().get_workspace_left_id()
right_ws_id = env.workspaces().get_workspace_right_id()

# Observe left workspace
env.robot_move2observation_pose(left_ws_id)
env.set_current_workspace(left_ws_id)
time.sleep(2)

# Get objects from left workspace
left_objects = env.get_detected_objects_from_workspace(left_ws_id)
print(f"Left workspace: {len(left_objects)} objects")

# Transfer object to right workspace
if len(left_objects) > 0:
    obj = left_objects[0]
    robot.pick_place_object_across_workspaces(
        object_name=obj.label(),
        pick_workspace_id=left_ws_id,
        pick_coordinate=[obj.x_com(), obj.y_com()],
        place_workspace_id=right_ws_id,
        place_coordinate=[0.25, -0.05],
        location=Location.RIGHT_NEXT_TO
    )

For complete multi-workspace examples, see examples/multi_workspace_example.py


Advanced Features

Object Detection and Filtering

from robot_workspace import Location

# Get objects from memory (persists during robot motion)
detected_objects = env.get_detected_objects_from_memory()

# Spatial filtering
objects_left = detected_objects.get_detected_objects(
    location=Location.LEFT_NEXT_TO,
    coordinate=[0.2, 0.0],
    label="cube"
)

# Find nearest object
nearest, distance = detected_objects.get_nearest_detected_object(
    coordinate=[0.25, 0.05],
    label="pencil"
)

# Size-based queries
largest, size = detected_objects.get_largest_detected_object()
smallest, size = detected_objects.get_smallest_detected_object()

# Sort by size
sorted_objects = detected_objects.get_detected_objects_sorted(ascending=True)

Workspace Coordinate System

# Get workspace corners
workspace = env.get_workspace(0)
upper_left = workspace.xy_ul_wc()
lower_right = workspace.xy_lr_wc()
center = workspace.xy_center_wc()

# Transform camera coordinates to world coordinates
pose = workspace.transform_camera2world_coords(
    workspace_id="niryo_ws",
    u_rel=0.5,  # Center of image (normalized [0,1])
    v_rel=0.5,
    yaw=0.0
)

# Get workspace dimensions
width = workspace.width_m()
height = workspace.height_m()
print(f"Workspace: {width:.3f}m ร— {height:.3f}m")

Object Memory Management

# Memory is automatically updated when at observation pose
# Manual memory operations:

# Clear all memory
env.clear_memory()

# Remove specific object after manipulation
env.remove_object_from_memory("pencil", [0.25, 0.05])

# Update object position after placement
env.update_object_in_memory(
    object_label="cube",
    old_coordinate=[0.2, 0.0],
    new_pose=new_pose_object
)

# Get memory contents
memory_objects = env.get_detected_objects_from_memory()

Finding Free Space

# Find largest free area in workspace
largest_area_m2, center_x, center_y = env.get_largest_free_space_with_center()

print(f"Free space: {largest_area_m2*10000:.2f} cmยฒ")
print(f"Center: [{center_x:.2f}, {center_y:.2f}]")

# Place object at center of free space
robot.pick_place_object(
    object_name="box",
    pick_coordinate=[0.2, 0.0],
    place_coordinate=[center_x, center_y],
    location=Location.NONE
)

Pushing Objects

# For objects too large to grip
success = robot.push_object(
    object_name="large_box",
    push_coordinate=[0.3, 0.1],
    direction="left",    # "up", "down", "left", "right"
    distance=50.0        # millimeters
)

Custom Object Labels

# Add new detectable object
message = env.add_object_name2object_labels("custom_tool")
print(message)  # "Added custom_tool to recognizable objects"

# Get current labels
labels = env.get_object_labels_as_string()
print(labels)  # "I can recognize these objects: pencil, pen, custom_tool, ..."

Text-to-Speech Feedback

# Asynchronous speech (non-blocking)
thread = env.oralcom_call_text2speech_async(
    "I have detected a pencil at position 0.25, 0.05"
)
# Continue with other operations
robot.pick_object("pencil", [0.25, 0.05])
thread.join()  # Wait for speech to complete

Configuration

Robot Selection

# Niryo Ned2 (real robot)
env = Environment(
    el_api_key="key",
    use_simulation=False,
    robot_id="niryo"
)

# Niryo in Gazebo simulation
env = Environment(
    el_api_key="key",
    use_simulation=True,
    robot_id="niryo"
)

# WidowX robot
env = Environment(
    el_api_key="key",
    use_simulation=False,
    robot_id="widowx"
)

Adding Custom Workspaces

Edit niryo_workspace.py:

def _set_observation_pose(self) -> None:
    if self._id == "my_custom_workspace":
        self._observation_pose = PoseObjectPNP(
            x=0.20, y=0.0, z=0.35,
            roll=0.0, pitch=math.pi/2, yaw=0.0
        )
    # ... existing workspaces

Vision Configuration

The vision system uses vision_detect_segment with configurable models:

# Models are configured in environment.py
# Default: OWL-V2 for open-vocabulary detection
# Available: "owlv2", "yolo-world", "yoloe-11l", "grounding_dino"

# To change model, modify in environment.py:
self._visual_cortex = VisualCortex(
    objdetect_model_id="yoloe-11l",  # Fast with built-in segmentation
    device="auto",
    verbose=verbose,
    config=config
)

API Reference

See api.md.


Performance Considerations

Detection Speed

Model            Detection    Segmentation         Total FPS     Best For
YOLOE-11L        6-10ms       Built-in             100-160 FPS   Real-time unified tasks
YOLO-World       20-50ms      50-100ms (FastSAM)   10-25 FPS     Speed-critical
OWL-V2           100-200ms    200-500ms (SAM2)     1-3 FPS       Custom classes
Grounding-DINO   200-400ms    200-500ms (SAM2)     1-2 FPS       Complex queries
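
Actual throughput depends on hardware and configuration. For a rough end-to-end number on your own setup, you can time a single iteration of the camera/detection generator; this is only a sketch and assumes update_camera_and_objects yields once per grab-detect-publish cycle and that no background camera thread is already consuming it:

import time

gen = env.update_camera_and_objects(visualize=False)
t0 = time.perf_counter()
next(gen)  # one full image grab + detection + Redis publish
print(f"One update cycle: {(time.perf_counter() - t0) * 1000:.1f} ms")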

Optimization Tips

# 1. Use faster detection model
config = get_default_config("yoloe-11s")  # Fast variant

# 2. Reduce object labels
config.set_object_labels(["cube", "cylinder"])  # Only what you need

# 3. Disable segmentation if not needed
config.enable_segmentation = False

# 4. Adjust camera update rate
time.sleep(0.5)  # Between camera updates

# 5. Use GPU acceleration
cortex = VisualCortex("yoloe-11l", device="cuda")

Memory Management

  • Object memory stores detection history during robot motion
  • Memory automatically updated when at observation pose
  • Old detections removed when workspace visibility changes
  • Manual updates from pick/place operations persist briefly
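
In practice, object memory should be kept in sync after each manipulation. The sketch below combines calls from the Object Memory Management section above into a typical pattern:

# Pick an object, then remove its stale entry so later queries do not return
# the pre-pick position; the next visit to the observation pose refreshes memory.
success = robot.pick_place_object(
    object_name="cube",
    pick_coordinate=[0.2, 0.0],
    place_coordinate=[0.25, -0.05],
    location=Location.NONE,
)
if success:
    env.remove_object_from_memory("cube", [0.2, 0.0])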

Testing

See TESTING.md


Troubleshooting

Common Issues

No Objects Detected

# Check Redis connection
from redis_robot_comm import RedisMessageBroker
broker = RedisMessageBroker()
if broker.test_connection():
    print("โœ“ Redis connected")

Objects at Wrong Positions

# Check workspace calibration
workspace = env.get_workspace_by_id("niryo_ws")
print(f"Corners: UL={workspace.xy_ul_wc()}, LR={workspace.xy_lr_wc()}")

# Ensure workspace is level and stable
# Verify camera is properly mounted

# Get fresh detection before picking
env.robot_move2observation_pose(workspace_id)
time.sleep(2)  # Wait for detection
objects = env.get_detected_objects_from_memory()

Robot Won't Move

# Check connection
robot_ctrl = env.get_robot_controller()
pose = robot_ctrl.get_pose()
print(f"Current pose: {pose}")

# Verify calibration (Niryo)
robot_ctrl.calibrate()

# Check coordinates are reachable
workspace = env.get_workspace(0)
print(f"Valid range: X=[{workspace.xy_lr_wc().x}, {workspace.xy_ul_wc().x}]")
print(f"             Y=[{workspace.xy_lr_wc().y}, {workspace.xy_ul_wc().y}]")

Memory Issues

# Clear stale memory
env.clear_memory()

# Force fresh detection
env.robot_move2observation_pose(workspace_id)
time.sleep(2)

# Check memory contents
memory = env.get_detected_objects_from_memory()
print(f"Objects in memory: {len(memory)}")

For comprehensive troubleshooting, see troubleshooting.md.


Examples

Complete Examples

Run Examples

# Start Redis server
docker run -p 6379:6379 redis:alpine

# Run basic example
python main.py

# Run multi-workspace examples
cd examples
python multi_workspace_example.py

Documentation


Development

Code Quality

# Install development dependencies
pip install -r requirements-dev.txt

# Linting with Ruff
ruff check . --fix

# Formatting with Black
black .

# Type checking with mypy
mypy robot_environment --ignore-missing-imports

# Security scanning with Bandit
bandit -r robot_environment/ -ll

Pre-Commit Hooks

pip install pre-commit
pre-commit install

CI/CD

The project includes comprehensive GitHub Actions workflows:

  • Tests - Multi-platform testing (Ubuntu, Windows, macOS) across Python 3.9-3.11
  • Code Quality - Ruff, Black, mypy checks
  • Security - CodeQL and Bandit security scanning
  • Dependency Review - Automated security audits
  • Release - Automated package building on tags

Contributing

See contributing.md for details.


This package integrates with several companion projects:

  • vision_detect_segment - object detection and segmentation models
  • redis_robot_comm - Redis-based image streaming and object data sharing
  • robot_workspace - workspace definition and coordinate transformation
  • text2speech - natural language speech feedback


License

This project is licensed under the MIT License. See LICENSE for details.


Citation

If you use this package in your research, please cite:

@software{robot_environment,
  author = {Gaida, Daniel},
  title = {robot_environment: Vision-Based Robotic Manipulation Framework},
  year = {2025},
  url = {https://github.com/dgaida/robot_environment}
}

Acknowledgments

This package builds upon:


Support


Author

Daniel Gaida
Email: daniel.gaida@th-koeln.de
GitHub: @dgaida

Project Link: https://github.com/dgaida/robot_environment


Roadmap

Planned Features

  • [ ] Additional robot support (UR5, Franka Emika)
  • [ ] Improved collision detection and avoidance
  • [ ] Force/torque sensor integration
  • [ ] Advanced grasp planning
  • [ ] Multi-robot coordination
  • [ ] Web-based control interface
  • [ ] ROS2 integration
  • [ ] Improved simulation support

Recent Additions

  • ✅ Multi-workspace support
  • ✅ YOLOE model support with built-in segmentation
  • ✅ Enhanced object memory management
  • ✅ Workspace visibility tracking

Last Updated: December 2025