Posts by Collection

portfolio

Development of autonomous intelligent systems

I am working on a framework that enables robots to follow natural-language instructions by combining large language models with real-time perception, spatial reasoning, and robot control through a modular tool-based system.

Verbundvorhaben: Intelligenter Virtueller Betriebsassistent zur Optimierung landwirtschaftlicher Biogasanlagen durch ein Large Multimodal Model

In this project we are developing an intelligent virtual operations assistant, based on a multimodal AI model, that automatically optimizes the substrate feed of biogas plants and enables flexible, efficient electricity feed-in. Model predictive control, AI-based state estimation, and a language-based user interface simplify plant operation and make it more sustainable; the approach is validated on real biogas plants.

publications

Automatisierung einer Pilot-Biogasanlage mit LabVIEW

Betriebsoptimierung von Abfallvergärungsanlagen mittels Online-Messtechnik und Datenanalyse

Development of an easy-to-use Membrane WWTP Simulation Software

Dynamische Echtzeit-Optimierung der Substratzufuhr von landwirtschaftlichen Biogasanlagen

Gummersbach Environmental Computing Center

Messung organischer Säurekonzentration über UV/vis Spektroskopie

Modellbasierte Optimierung und Regelung zur nachhaltigen Energieerzeugung aus Biogas

Regelung der Substratzufuhr von Biogasanlagen–Ein Review

SUBSTRATE FEED OPTIMIZATION AND CONTROL OF AGRICULTURAL BIOGAS PLANTS

Dynamic real-time substrate feed optimization of anaerobic co-digestion plants

Published in Water Science and Technology, 2005

Fusion of visual and inertial measurements for pose estimation

Published in dynamics, 2008

A comparison of performance of advanced pattern recognition methods for organic acid prediction in biogas plants using UV/vis spectroscopic online-measurements

Published in Proceedings of the 2010 International Conference on Life System Modeling and Simulation (LSMS), Wuxi, China, 2010

Development of a Simulation Model for hollow-fiber and flat sheet membrane wastewater treatment Plants

Published in Crossing Borders within the ABC: Automation, Biomedical Engineering and Computer Science, 2010

Organic acid prediction in biogas plants using UV/vis spectroscopic online-measurements

Published in International Conference on Intelligent Computing for Sustainable Energy and Environment, 2010

Matlab toolbox for biogas plant modelling and optimization

Published in Progress in Biogas II-Biogas production from agricultural biomass and organic residues, 2011

Optimal control of biogas plants using nonlinear model predictive control

Published in 2011

An advanced simulation model for membrane bioreactors: development, calibration and validation

Published in Water Science and Technology, 2012

Nonlinear model predictive substrate feed control of biogas plants

Published in 2012 20th Mediterranean Conference on Control & Automation (MED), 2012

Predicting Organic Acids in Biogas Plants using UV/vis Spectrometry Online-Measurements

Published in no. June, 2012

State estimation for anaerobic digesters using the ADM1

Published in Water Science and Technology, 2012

Predicting organic acid concentration from UV/vis spectrometry measurements–a comparison of machine learning techniques

Published in Transactions of the Institute of Measurement and Control, 2013

Steuerungs-und Regelungskonzepte für landwirtschaftliche Biogasanlagen

Published in Biogas in der Landwirtschaft--Stand und Perspektiven, Proceedings FNR/KTBL-Kongress, 2013

COD and NH4-N estimation in the inflow of Wastewater Treatment Plants using Machine Learning Techniques

Published in 2014 IEEE International Conference on Automation Science and Engineering (CASE), 2014

Intelligent automation and IT for the optimization of renewable energy and wastewater treatment processes

Published in Energy, Sustainability and Society, 2014

Multi-objective nonlinear model predictive substrate feed control of a biogas plant

Published in Kompendium der Forschungsgemeinschaft: metabolon 2012-2014, 2014

Online-measurement systems for agricultural and industrial AD plants–A review and practice test

Published in Kompendium der Forschungsgemeinschaft: metabolon 2012-2014, 2014

Online monitoring of AD processes using a fully automated, low maintenance middle-infrared (MIR) measurement system

Published in Conf Proc Int Sci Conf Biogas Sci, 2014

Dynamische Echtzeit-Optimierung der Substratzufuhr für anaerobe Co-Vergärungsanlagen

Published in at-Automatisierungstechnik, 2015

Expected hypervolume improvement algorithm for PID controller tuning and the multiobjective dynamical control of a biogas plant

Published in 2015 IEEE Congress on Evolutionary Computation (CEC), 2015

Feed Control of Anaerobic Digestion Processes for Sustainable Renewable Energy Production: A Review

Published in Dubrovnik, 2015

Instrumentation and control of anaerobic digestion processes: a review and some research challenges

Published in Reviews in Environmental Science and Bio/Technology, 2015

Optimizing anaerobic co-digestion plants: MIR online instrumentation and dynamic real-time substrate feed optimization.

Published in Automatisierungstechnik, 2015

A bias compensated cross-relation approach to thermocouple characterisation

Published in IFAC-PapersOnLine, 2016

Multi-fidelity modeling and optimization of biogas plants

Published in Applied Soft Computing, 2016

Noise resilient discrete-time cross-relation sensor characterisation

Published in 2016 UKACC 11th International Conference on Control (CONTROL), 2016

Process optimization, advanced control, and computational fluid dynamics modelling of wastewater disinfection using peracids

Published in 3rd IWA Specialized International Conference Ecotechnologies for Wastewater Treatment (EcoSTP 2016), 2016

Feed control of anaerobic digestion processes for renewable energy production: A review

Published in Renewable and Sustainable Energy Reviews, 2017

This is How We Do It–Pilot Validation of a Dose (CT)-Based Control for Wastewater Disinfection with peracetic acid

Published in Proceedings of WEFTEC 2017-Water Environment Federation Technical Exhibition and Conference 2017, 2017

Tightly coupled fusion of direct stereo visual odometry and inertial sensor measurements using an iterated information filter

Published in 2017 DGON Inertial Sensors and Systems (ISS), 2017

Wastewater disinfection by peracids: advanced dose control technology validation with pilot and modeling studies

Published in 12th IWA Specialist Conference on Instrumentation, Control and Automation (ICA 2017), 2017

student_projects

Lokalisierung eines Fahrzeugs mit einem Unscented Kalman Filter

T. Y. N., 2023

Vehicle localization for autonomous driving is achieved by fusing GNSS, IMU, and LiDAR data with an Unscented Kalman Filter (UKF). The UKF’s performance is evaluated in the CARLA driving simulator and compared to an Error State Extended Kalman Filter (ESEKF), with higher accuracy in position and orientation estimation observed under sensor noise and signal loss. It is demonstrated that the UKF provides a more reliable solution for non‑linear vehicle state estimation.
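
The core of a UKF is the unscented transform, which pushes a small set of sigma points through the non-linear model instead of linearising it. The following is a minimal scalar sketch of that idea only; the function name, the choice `kappa=2.0`, and the quadratic test function are illustrative assumptions and are not taken from the thesis.

```python
import math


def unscented_transform(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using three sigma points."""
    n = 1  # state dimension of this scalar sketch
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    # Classic sigma-point weights; they sum to one.
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    images = [f(p) for p in points]
    new_mean = sum(w * y for w, y in zip(weights, images))
    new_var = sum(w * (y - new_mean) ** 2 for w, y in zip(weights, images))
    return new_mean, new_var


# Hypothetical non-linear measurement model: y = x^2
m, v = unscented_transform(1.0, 0.04, lambda x: x * x)
```

A full vehicle filter would apply the same transform to a multi-dimensional state with matrix square roots, in both the prediction and the measurement update.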

Training eines Neuronalen Netzes (Yolov8) zur Echtzeiterkennung von Autos in Airsim

S. S., 2023

A YOLOv8 neural network was trained to detect cars in real time within the AirSim Unreal Engine simulation environment. Successive expansions of the training set, image augmentations, and validation on both public and simulated images were applied, raising the mean average precision from approximately 0.31 to 0.66. The resulting model delivers reliable car detection while remaining lightweight enough for use on modest hardware, supporting further research on autonomous‑vehicle technologies.

Entwicklung einer KI-gestützten Objekterkennung zur Navigation der autonomen Plattform Turtlebot 4

A. P., 2023

A YOLOv8 model was trained to detect fire extinguishers and integrated into the TurtleBot 4 platform, enabling real‑time object recognition. Two deployment strategies were evaluated: processing on the onboard Raspberry Pi, which proved too slow, and leveraging the OAK‑D‑PRO camera’s on‑chip inference, which delivered a substantial performance boost. Additionally, a laser‑scan based navigation algorithm was implemented, allowing the robot to autonomously explore unknown environments.

Entwicklung eines Kolorierungsassistenten auf Basis einer generativen KI

V. P., 2024

A coloring assistant based on a generative diffusion model is developed to automatically color manga line drawings. Its performance is evaluated against existing models using PSNR, MS‑SSIM and FID, showing improved fidelity in eye and hair colors and closer adherence to original sketches, while noting some residual artifacts that can be reduced by scaling the model. The results demonstrate that diffusion‑based approaches can deliver high‑quality coloring on consumer‑grade hardware.
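
Of the three metrics named above, PSNR is the simplest to state. The sketch below computes it for two flat pixel sequences; the function name and the 8-bit `max_value` default are assumptions for illustration, and the thesis may have used a library implementation instead.

```python
import math


def psnr(reference, result, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, result)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


identical = psnr([0, 128, 255], [0, 128, 255])
worst_case = psnr([0, 0], [255, 255])
```

Higher values indicate a colored image closer to the reference; MS-SSIM and FID capture structural and distributional similarity that PSNR does not.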

Entwicklung eines Reinforcement Learning basierten Ansatzes zur Strategieoptimierung in einem Multi Agenten Roboter Szenario

A. M., 2024

A reinforcement-learning approach based on Deep Q-Learning is developed to optimise strategies for competing robots in a multi-agent Microservice Dungeon scenario. Through a series of experiments, the agent's performance is shown to improve with tailored reward functions, hyper-parameter tuning, and observation adjustments, while the simplified environment and limited action space constrain its capabilities. The findings suggest that more advanced policy-based methods and graph neural networks are needed to overcome these limitations and further improve strategic decision-making.

Entwicklung eines KI-gesteuerten Browser-Plugins zur Generierung kontext-abhängiger Alternativtexte für verbesserte Web-Barrierefreiheit

P. F., 2024

An AI‑powered browser plugin was developed to automatically generate context‑aware alternative text for images on webpages, thereby improving web accessibility for users with visual impairments. In evaluations, the AI was found capable of deciding when an alt‑text is needed and of producing more accurate descriptions by incorporating surrounding page text, although final verification by a human remains necessary.

Vergleich von Algorithmen zur Bewegungsplanung für einen Robotergreifarm mit 6 Freiheitsgraden in einer Simulation

C. V., 2024

A comparative study of three motion‑planning algorithms—PRM, CHOMP, and the convex‑set‑based GCS—was conducted for a six‑degree‑of‑freedom robotic manipulator in simulated environments. The algorithms were integrated into the ROS 2/MoveIt 2 framework and benchmarked in scenarios such as a table, a narrow passage, and a bookshelf, revealing that GCS can achieve rapid planning when correctly configured, while PRM is asymptotically optimal but computationally heavy and CHOMP provides fast local refinement. The results highlight trade‑offs between optimality, computation time, and robustness for robotic arm trajectory planning.

Naturkatastrophen Analyse mittels Deep Learning

N. B., 2024

The study investigates how deep‑learning techniques can automatically classify social‑media texts about natural disasters, using the HUMAID tweet dataset. LSTM, BERT and logistic‑regression models were trained and compared with respect to accuracy, hyper‑parameter effects, and computational cost, revealing that the fine‑tuned BERT model achieves the highest precision while requiring greater processing time. The results demonstrate the potential of advanced NLP models to support rapid, reliable information extraction for emergency response and disaster management.

Entwicklung einer Retrieval Augmented Generation-Anwendung zur Unterstützung bei der Anerkennung von Prüfungsleistungen

N. V. und M. Z., 2024

A Retrieval-Augmented Generation (RAG) system is presented to automate the recognition of university credits when students transfer between institutions. The solution combines a Maximal Marginal Relevance (MMR) retriever with large language models (Llama-3.1-8B-Instruct and GPT-4o) and employs query expansion to improve the relevance of retrieved document chunks, with GPT-4o delivering notably more accurate answers. The application provides a stable, user-friendly web interface for uploading module handbooks and querying them, and its modular architecture enables future extensions such as a broader knowledge base and full automation of the credit-recognition workflow.
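
The retrieval strategy mentioned here is commonly known as Maximal Marginal Relevance (MMR): it picks chunks that are relevant to the query but not redundant with chunks already selected. The sketch below illustrates the selection rule on toy two-dimensional embeddings; the function names, the vectors, and the `lam` values are assumptions, not the project's code.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k, lam=0.5):
    """Return indices of k candidates chosen by Maximal Marginal Relevance."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
chunks = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]  # chunk 1 nearly duplicates chunk 0
diverse = mmr(query, chunks, k=2, lam=0.3)
relevance_only = mmr(query, chunks, k=2, lam=1.0)
```

With a low `lam` the near-duplicate chunk is skipped in favour of a different one, whereas `lam=1.0` reduces MMR to plain similarity ranking.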

Visuelle Navigation und Lokalisation in einem Raum mithilfe von Neural Radiance Fields und Gaussian Splatting

P. T., 2024

The suitability of Neural Radiance Fields and Gaussian Splatting for visual‑only localisation and navigation of indoor mobile robots is investigated. A Monte‑Carlo particle filter that compares camera images with synthetically rendered views is implemented on a TurtleBot and evaluated at a minimum update rate of 1 Hz, demonstrating that accurate global localisation can be achieved without prior pose information, though occasional localisation errors remain. The results suggest that a fully visual, low‑cost navigation system based on these radiance‑field models is feasible for real‑time robot operation.
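
One update of a Monte Carlo particle filter can be sketched as follows. This is a deliberately simplified one-dimensional illustration: the image-versus-rendering comparison from the project is replaced by a scalar measurement with a Gaussian likelihood, and all names and numbers are assumptions.

```python
import math
import random


def particle_filter_step(particles, measurement, sigma, rng):
    """Weight particles by measurement likelihood, then resample in proportion to weight."""
    weights = [math.exp(-0.5 * ((p - measurement) / sigma) ** 2) for p in particles]
    return rng.choices(particles, weights=weights, k=len(particles))


rng = random.Random(42)
# Global localisation: no prior pose, so particles start uniformly over the corridor.
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
particles = particle_filter_step(particles, measurement=3.0, sigma=0.5, rng=rng)
estimate = sum(particles) / len(particles)
```

In the visual setting, the likelihood of each particle would instead come from how well a view rendered at that pose matches the current camera image.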

Weiterentwicklung und Evaluation eines adversarial robusten Network Intrusion Detection Systems

L. T., 2025

The thesis advances the adversarially robust network intrusion detection system Apollon by integrating additional machine-learning classifiers and by replacing its original multi-armed-bandit selector with alternative heuristics such as epsilon-greedy and Thompson sampling. The enhanced system is evaluated on the CIC-IDS-2017 dataset, where the expanded model pool markedly improves detection accuracy and resilience against black-box adversarial attacks, and the new heuristics provide occasional robustness gains.
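
The two selection heuristics named above can be sketched generically. In this illustration each "arm" stands for one classifier in the pool, and the success and failure counts are hypothetical; how Apollon actually defines rewards is not stated here, so this is an assumption-laden sketch rather than the system's implementation.

```python
import random


def epsilon_greedy(successes, failures, epsilon, rng):
    """Pick a random arm with probability epsilon, else the best empirical arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(successes))
    rates = [s / (s + f) if s + f else 0.0 for s, f in zip(successes, failures)]
    return max(range(len(rates)), key=rates.__getitem__)


def thompson(successes, failures, rng):
    """Sample each arm's Beta posterior and pick the arm with the largest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


rng = random.Random(0)
successes = [90, 10, 50]  # hypothetical per-classifier success counts
failures = [10, 90, 50]
greedy_choice = epsilon_greedy(successes, failures, epsilon=0.0, rng=rng)
thompson_counts = [0, 0, 0]
for _ in range(200):
    thompson_counts[thompson(successes, failures, rng)] += 1
```

Because the choice is randomised, an attacker probing the system cannot easily predict which classifier will answer a given query.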

Navigation in Fahrzeugsimulation

F. E., 2025

A navigation system for the AirSim vehicle simulator has been developed to compute the shortest route between two points while staying on road surfaces. Road areas are identified by a YOLOv8 neural network trained on a custom dataset of aerial images, and the resulting segmentation is converted into a weighted graph of street centerlines. Dijkstra’s algorithm is then applied to this graph to determine and visualize the optimal path.
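
The final routing step can be sketched with a standard heap-based Dijkstra search. The graph below is a hypothetical four-node centerline graph with edge weights in metres; node names and weights are assumptions for illustration and do not come from the project.

```python
import heapq


def shortest_path(graph, start, goal):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if unreachable."""
    queue = [(0.0, start)]
    best = {start: 0.0}
    parent = {}
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf"), []


# Hypothetical centerline graph: node -> {neighbor: segment length in metres}
roads = {
    "A": {"B": 40.0, "C": 100.0},
    "B": {"A": 40.0, "C": 30.0, "D": 90.0},
    "C": {"A": 100.0, "B": 30.0, "D": 20.0},
    "D": {"B": 90.0, "C": 20.0},
}
cost, path = shortest_path(roads, "A", "D")
```

Here the detour via B and C (90 m) beats both the direct A to C segment and the long B to D segment.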

Entwicklung eines Chatbots, um Kran-spezifisches Wissen bereitzustellen

R. J., 2025

A chatbot was built using Retrieval‑Augmented Generation to answer employee queries by searching internal company PDFs. Text from the PDFs is extracted, embedded and indexed with FAISS, then re‑ranked and answered by a German QA model, ensuring fact‑based, context‑relevant responses. The system streamlines access to technical documentation, reducing the time needed to locate information.

Implementierung eines spezifizierten Frage-Antwort-Large Language Models (LLM) für das Robot Operating System (ROS) mit Nutzung von Retrieval Augmented Generation (RAG)

S. S., 2025

A Retrieval‑Augmented Generation (RAG) system was built to extend a large language model with external ROS‑specific knowledge, using Python, LangChain, a Chroma vector database, Ollama for local model execution, and a Streamlit web interface. The system’s answers were compared with those of the plain language model, and a scoring‑based evaluation showed that RAG can produce noticeably better responses for certain question types and when sufficient domain data are available.

Wissensintegration aus Webdaten in eine Graphdatenbank zur Nutzung in RAG-Systemen

N. V., 2025

A system was developed to automatically extract, model, and store university‑related information from the Technical University of Cologne’s website in a domain‑specific knowledge graph hosted in Neo4j. Large Language Models are used to translate natural‑language questions into structured Cypher queries, enabling a conversational question‑answer interface that reliably retrieves details about study programmes, faculties, and thematic focuses. Evaluation with a reference test set showed high accuracy for structured queries, while answers to personal‑data requests were limited by incomplete or inconsistent source information.

Materialeffizienz und Statik vereint: Ein hybrider Algorithmus zur Optimierung der Kastenträger-Konfiguration

M. T., 2025

A hybrid algorithm has been created that first determines the minimal number of metal plates needed and then uses an evolutionary optimization with specially designed mutations to satisfy static constraints such as the j‑measure. By enforcing these constraints during the mutation process, a significant reduction in material waste and the number of weld seams for crane box‑girders has been achieved, surpassing earlier heuristic lookup methods. Further improvements may be obtained by integrating neural‑network‑based artificial intelligence.

Navigation in AirSim

R. P. H., 2025

A graph of the road network in the AirSimNH simulation is generated by extracting vehicle pose data and storing nodes and weighted edges in a custom Python graph class. Dijkstra’s algorithm is then applied to compute the shortest route between two points, and the resulting path is converted into a CSV format that drives the simulated vehicle autonomously while a Kalman filter continuously estimates its state to compensate for sensor noise.
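
The state-estimation part can be illustrated with a scalar Kalman filter. This is a minimal sketch under a constant-state assumption; the project's filter presumably tracks a multi-dimensional vehicle state, and the noise values `q` and `r` as well as the measurements are invented for the example.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter with a constant-state model."""
    # Predict: the state is assumed constant, so only the uncertainty grows by q.
    p_pred = p + q
    # Update: blend prediction and measurement z using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x + k * (z - x)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


x, p = 0.0, 1.0  # initial estimate and its variance
for z in [1.2, 0.9, 1.1, 1.0]:  # hypothetical noisy readings of a value near 1.0
    x, p = kalman_step(x, p, z, q=0.01, r=0.25)
```

After a few noisy readings the estimate settles near the underlying value while its variance shrinks.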

Automatische Analyse von Text und Bildern in Sozialen Medien zu Naturkatastrophen

R. H., 2025

An automated system has been developed to classify images and social‑media text about natural disasters using TensorFlow‑based neural networks. Convolutional Neural Networks were trained on a curated dataset of more than 11 000 images and on disaster‑related posts, and the performance of YOLOv8 and EfficientNetV2 models was compared to evaluate accuracy and speed.

Entwicklung eines Chatbots für die öffentlich zugängliche Website der TH Köln

F. E., 2025

A chatbot was built for the publicly accessible website of the Technical University of Cologne, employing a Retrieval‑Augmented Generation approach that combines semantic vector search on crawled web content with a large language model (LLaMA 3) to generate natural‑language answers and cite sources. The system was prototyped with a Gradio web interface, hosted on Groq, and evaluated by comparing several LLMs on answer quality and clarity. The results demonstrate that a data‑driven assistant can meaningfully improve information access for university users.

Vorhersage von Lachgasemissionen bei der Abwasserbehandlung mit Maschinelles Lernen/Deep Learning

A. U., 2025

The project investigates how machine‑learning and deep‑learning techniques can forecast nitrous‑oxide (N₂O) emissions from wastewater‑treatment plants, comparing mechanistic and data‑driven models on real‑world process data. Advanced feature‑engineering, temporal cross‑validation, and model‑interpretability methods (e.g., SHAP, permutation importance) are applied to evaluate the predictive performance and robustness of algorithms such as XGBoost, Random Forest, k‑NN, and neural networks. The results show that selected ML models can reliably predict N₂O emissions, offering a practical basis for emission‑monitoring soft sensors in treatment facilities.

Entwicklung einer mobilen Anwendung zur interaktiven visuellen Szenenanalyse mithilfe von Segment-Anything-Modellen und Large Language Modellen

J. F. K. R., 2025

A modular Android application was created that combines on‑device image segmentation (YOLO‑seg) with a quantized large language model (Gemma) via a flexible JSON interface, enabling offline multimodal scene analysis for visually impaired users. The system was evaluated against a hybrid server‑assisted version, confirming that fully local processing is technically feasible on a standard smartphone, while the limited context window of the on‑device LLM was identified as the primary performance bottleneck, requiring intelligent prompt‑reduction strategies.

Autonomous Object Detection and Manipulation Using a Mobile Cobot

T. Y. N., 2026

Autonomous exploration, open-vocabulary object recognition, and safe grasping have been combined on a compact mobile robot, enabling it to independently map unknown rooms, locate user-specified items, and pick them up. Field tests showed the frontier-guided strategy finds and grasps objects faster and more reliably than simpler straight-line searches, even in cluttered indoor spaces.

Entwicklung und Test neuronaler Netze mithilfe von Edge Impulse auf einem Mikrocontroller zur Detektion und Klassifikation von flüchtigen organischen Verbindungen (VOCs)

J. D., 2026

A neural‑network model was trained with the low‑code platform Edge Impulse to recognise volatile organic compounds (VOCs) from data gathered by a cheap metal‑oxide gas sensor. The model was then deployed on an Arduino Nicla Sense ME microcontroller, enabling on‑device detection and classification of VOC sources such as cigarette smoke, 3D‑printer emissions and cooking fumes without any network connection. The results demonstrate that inexpensive embedded hardware can reliably identify common indoor VOCs, while also highlighting the limits of sensor sensitivity and the need for careful measurement conditions.

Modifikation des Selbststudiums im digitalen Lernalltag: Ein KI-Dokumentenassistent für Studierende zur effizienten Suche und Aufbereitung von Lernmaterialien

N. V., 2026

A privacy-preserving AI document assistant was built to let students find and understand their scattered digital course materials. After systematically testing 56 local RAG-pipeline setups, the best combination of smart parsers, dense retrieval and reranking rivaled cloud services without sending data off campus.

Entwicklung einer App zur sprachbasierten Interaktion mit Discord über einen Discord MCP Server

A. M., 2026

A voice-controlled web tool has been created that lets users manage Discord events and channels through everyday speech, automatically converting phrases like “tomorrow 3 pm” into the exact timestamps Discord needs. The system links large language models to Discord’s servers via the open Model Context Protocol, hides the technical details behind a simple browser interface, and can be used from any device on the network.
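
To make the time conversion concrete, the following sketch resolves a small set of phrases relative to a reference time. It is a rule-based stand-in covering only "today" and "tomorrow" with a 12-hour clock; in the described tool the conversion is handled with the help of language models, and the function name and supported grammar here are assumptions.

```python
import re
from datetime import datetime, timedelta


def resolve_phrase(phrase, now):
    """Resolve phrases like 'tomorrow 3 pm' or 'today 10:30 am' relative to now."""
    match = re.fullmatch(
        r"(today|tomorrow)\s+(\d{1,2})(?::(\d{2}))?\s*(am|pm)",
        phrase.strip().lower(),
    )
    if match is None:
        raise ValueError(f"unsupported phrase: {phrase!r}")
    day, hour, minute, meridiem = match.groups()
    # 12 am maps to hour 0 and 12 pm to hour 12.
    hour = int(hour) % 12 + (12 if meridiem == "pm" else 0)
    date = now.date() + timedelta(days=1 if day == "tomorrow" else 0)
    return datetime(date.year, date.month, date.day, hour, int(minute or 0), tzinfo=now.tzinfo)


now = datetime(2026, 1, 15, 9, 0)  # hypothetical reference time
when = resolve_phrase("tomorrow 3 pm", now)
iso_timestamp = when.isoformat()
```

The resulting ISO 8601 string is one plausible exchange format for such a timestamp.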

Prototypische Umsetzung webbasierter Anwendungen zur Unterstützung von Restaurantabläufen

T. M. N., 2026

A set of web apps has been built to digitalize restaurant processes. Guests can scan a QR code to order and pay at the table, reserve seats and preorder meals, while staff track all tables in real time.

AlphaZero für Blokus-Duo: Analyse und Vergleich mit Minimax- und MCTS-Algorithmen

L. S., 2026

An AlphaZero-style AI for the two-player board game Blokus Duo was built and pitted against hand-crafted Minimax and MCTS opponents. Although ten rounds of self-training did not yet surpass the classical agents, the neural network noticeably boosted MCTS move quality, indicating clear room for further improvement.

Vergleichende Analyse von Vektor-Datenbanken für Chatbot-Anwendungen

C. Ö., 2026

Four vector databases (FAISS, Chroma, Qdrant, Weaviate) and four embedding models were benchmarked on one million text snippets for speed, accuracy and memory use.

Vergleich von lokalen LLMs zur Programmierfähigkeit in Python

M. P. G., 2026

Local open-source language models running on an Nvidia Jetson board were timed and graded on how well they wrote Python tasks for a robot arm. The 1.5-billion-parameter Qwen2.5-Coder delivered the fastest, most reliable code and was further tested in larger and quantized versions.

Entwicklung eines Chatbots für Prüfungsrecht

T. M. N., 2026

A chatbot was built to make university exam rules easy to query by voice or text. It uses Retrieval-Augmented Generation, instantly quoting the official TH Köln exam regulations and North Rhine-Westphalia higher-education law.

Erstellung einer diskreten, stochastischen und adversarialen 2D-Umgebung für KI-Vorlesung

M. B., 2026

A miniature 4×5 turn-based grid world has been created where an AI robot must reach goal squares while a chasing ghost tries to catch it. Random “slip” moves and walls add uncertainty, making the Gymnasium-based environment a compact test-bed for reinforcement-learning algorithms.
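
The stochastic part of such an environment can be sketched in a few lines. The class below is a dependency-free illustration and not the Gymnasium environment itself: the ghost is omitted, and the wall position, goal position, slip probability, and reward values are all assumptions.

```python
import random

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class SlipGrid:
    """4x5 grid in which the chosen move is replaced by a random one with probability slip."""

    def __init__(self, rows=4, cols=5, walls=(), goal=(3, 4), slip=0.2, seed=None):
        self.rows, self.cols = rows, cols
        self.walls, self.goal, self.slip = set(walls), goal, slip
        self.rng = random.Random(seed)
        self.pos = (0, 0)

    def reset(self):
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        if self.rng.random() < self.slip:
            action = self.rng.choice(list(MOVES))  # slip: a random direction is taken
        dr, dc = MOVES[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        inside = 0 <= r < self.rows and 0 <= c < self.cols
        if inside and (r, c) not in self.walls:
            self.pos = (r, c)  # blocked moves leave the robot in place
        done = self.pos == self.goal
        reward = 1.0 if done else -0.01
        return self.pos, reward, done


env = SlipGrid(walls={(1, 1)}, slip=0.0, seed=0)
env.reset()
for action in ["down", "right"]:
    state, reward, done = env.step(action)
```

With `slip=0.0` the run is deterministic: the second move is blocked by the wall, so the robot stays at row 1, column 0.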

Entwicklung eines KI gestützten Chatbots mit Retrieval-Augmented Generation (RAG) für die öffentlich zugängliche Webseite der TH Köln

W. T., 2026

An AI chatbot using Retrieval-Augmented Generation was built to turn TH Köln’s sprawling website into easy, conversation-style answers. Tests show it cuts search time—especially for tricky topics like exam rules—while keeping facts trustworthy and hallucinations low.

Steuerung der Turtlebot 4 Plattform per Text - TextToTurtlebot

N. S. & M. K. & M. F., 2026

Natural-language commands are converted into autonomous TurtleBot 4 actions through a language-model interpreter linked to YOLO-based vision, LiDAR, and depth sensing; behavior-tree control replaced an early state-machine to let the robot explore, avoid obstacles, and resume tasks smoothly. The integrated system was shown to simplify human–robot interaction without requiring programming skills.

Entwicklung eines Powerpoint zu LATEX-Beamer-Konverters auf Basis von Large Language Models

I. K., 2026

A new tool has been developed that automatically converts PowerPoint slides into LaTeX Beamer code by combining layout analysis with local large language models. Texts, images, tables and exact positions are extracted and reliably translated into compilable LaTeX, eliminating the need for manual reworking.

Vergleich von 3D-Gaussian-Splatting-Algorithmen zur Rekonstruktion einer komplexen Indoor-Szene

N. S., 2026

Seven recent 3D-Gaussian-splatting algorithms were compared on a cluttered garage scene to see how well they reconstruct tricky indoor spaces. No single method won outright: Mip-Splatting delivered the sharpest images, Brush the most natural look with least memory, and FastGS the fastest training, so the best choice depends on whether quality, memory, or speed matters most.

Active Learning Framework for a Meta-Model to Predict Concentrations of Pesticides in Groundwater

B. K. M., 2026

A fast machine-learning “meta-model” was trained with active-learning tricks to imitate Europe’s slow groundwater-exposure model for pesticides. The resulting CatBoost committee predicts concentrations in real time with under 0.7 % error, enabling instant, low-cost screening of crop-protection products.
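
One common way to combine a model committee with active learning is query-by-committee: the next simulation is run at the input where the committee members disagree most. The sketch below shows that selection rule with three toy functions standing in for trained regressors; whether the project used exactly this criterion is an assumption.

```python
from statistics import pvariance


def most_uncertain(candidates, committee):
    """Return the candidate on which the committee's predictions disagree most."""
    def disagreement(x):
        return pvariance([model(x) for model in committee])
    return max(candidates, key=disagreement)


# Hypothetical committee: three simple functions standing in for trained regressors.
committee = [lambda x: 2.0 * x, lambda x: 2.0 * x + 0.1, lambda x: x * x]
pool = [0.0, 1.0, 2.0, 5.0]  # unlabeled candidate inputs
next_query = most_uncertain(pool, committee)
```

The selected input would then be evaluated with the slow reference model and added to the training set before the committee is retrained.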

Materialeffizienz und Statik vereint: Ein hybrider Algorithmus zur Optimierung multipler Kastenträger-Konfigurationen

M. T., 2026

A hybrid evolutionary algorithm is presented that optimizes the cutting of metal plates for crane frame structures, simultaneously minimizing material waste and the number of joint connections that affect static integrity. The method operates in two stages—first determining the minimal number of plates required and then mutating plate configurations under static constraints—and is designed to be applied across multiple orders, enabling additional material savings and adaptable, data‑driven planning.

Entwicklung eines RAG-basierten Chatbots zur Beantwortung prüfungsrechtlicher Fragen

N. B., 2026

A privacy-focused chatbot was built to answer university exam-regulation questions using a hybrid search through 47 official rulebooks. Surprisingly, a local 14-billion-parameter model matched the accuracy of a 70-billion-parameter cloud service, proving that smaller, on-campus AI can deliver reliable, regulation-grade answers without sending data elsewhere.

LLM Unterstützte High Level Aufgabenplanung zur kollisionsfreien Bewegungsplanung eines Roboterarms mit MoveIt in einer Simulationsumgebung

L. A., 2026

A large language model is employed as a high‑level interface that translates natural‑language commands into well‑defined robot actions, which are then executed collision‑free by the ROS 2‑based MoveIt framework in a simulated environment. The approach is demonstrated with a Niryo Ned2 arm performing object detection, pick‑and‑place, and stacking tasks, and its feasibility, limitations, and challenges are evaluated.

talks

Synergizing Language Models and Biogas Plant Control: A GPT-4 Approach

Published: June 04, 2024

This study delves into the utilization of the large language model, GPT-4, as a controller to optimize substrate feed in an agricultural anaerobic co-digestion plant. Assigned with specific objectives, including targeted methane production, GPT-4 harnesses knowledge encompassing plant parameters, substrate characteristics, and real-time process data. The model formulates recommendations for substrate feed, offering transparent rationales for its decisions. To evaluate its effectiveness, a simulation model of an agricultural anaerobic co-digestion plant based on the Anaerobic Digestion Model no. 1 is employed. Initial findings suggest that GPT-4 effectively regulates substrate feed, maintaining methane production rates near predefined targets. Crucially, the explanations provided by GPT-4 are comprehensible. The accompanying code will be made accessible for further investigation and exploration.

Entwicklung intelligenter autonomer Assistenten

Published: June 28, 2025

The tutorial provides an overview of the development of intelligent autonomous assistants, focusing on how multimodal AI technologies—such as large language models, computer vision, and robotics—enable systems to understand language, perceive their environment, plan actions, and manipulate objects. It discusses practical applications in Industry 5.0, logistics, laboratories, smart homes, and agriculture, and contrasts modular system designs with end-to-end vision-language-action models. Finally, it highlights current research challenges, including data scarcity, simulation-to-reality transfer, system speed, and robustness in complex real-world environments.

Natural-Language Robot Manipulation via MCP: An Integrated Framework for Vision-Guided Pick-and-Place Automation

Published: March 20, 2026

This work presents a unified software framework for controlling robotic manipulators through unconstrained natural language, built around the Model Context Protocol (MCP). Users can issue commands via text, voice, or a web GUI, which are interpreted by a large language model that decomposes instructions into structured tool calls executed by a FastMCP server. The system integrates real-time open-vocabulary object detection, instance segmentation, and workspace coordinate transforms to ground language in the physical scene, enabling complex pick-and-place operations. A Redis-based pub/sub architecture decouples perception, reasoning, and control into independent processes, while a platform-agnostic hardware abstraction supports both the Niryo Ned2 and WidowX-250 robotic arms. Evaluation across eight diverse manipulation tasks — including multi-step operations, spatial reasoning, and multilingual instructions — achieved an 83% success rate on a Niryo Ned2, with the primary bottleneck being open-vocabulary perception rather than LLM reasoning or motion control. All components are publicly available, providing a reproducible foundation for natural-language interfaces to cyber-physical systems.
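
The idea of decomposing an instruction into structured tool calls can be sketched without any framework. The example below does not use FastMCP or Redis; the tool names `pick` and `place`, their parameters, and the JSON layout are hypothetical and serve only to show how a list of tool calls might be dispatched in order.

```python
import json


def pick(object_name):
    """Hypothetical tool: grasp the named object."""
    return f"picked {object_name}"


def place(x, y):
    """Hypothetical tool: put the held object down at workspace coordinates."""
    return f"placed at ({x}, {y})"


TOOLS = {"pick": pick, "place": place}


def execute(tool_calls_json):
    """Run each structured tool call in order and collect the results."""
    results = []
    for call in json.loads(tool_calls_json):
        tool = TOOLS.get(call["name"])
        if tool is None:
            raise KeyError(f"unknown tool: {call['name']}")
        results.append(tool(**call.get("arguments", {})))
    return results


# Example output a language model might produce for "put the red cube on the left".
plan = (
    '[{"name": "pick", "arguments": {"object_name": "red cube"}},'
    ' {"name": "place", "arguments": {"x": 0.2, "y": 0.1}}]'
)
results = execute(plan)
```

Keeping the tools behind a name-to-function table is what lets the same plan format drive different robot arms.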

Daniel Gaida
