First version of whatisee node #219
Conversation
Draft - still not tested and missing similarity comparison (for freshness).
I've noticed that the CI failed. Please merge #220 and rebase.
Testing - to run the ros2 node with the sim ros2 bag, see the commands in the Testing section below.
Thanks.
The code is overall good, but I would like to discuss the following things:
1. I'm not sure about naming the node "whatisee". This seems to be just a small feature of the "what I see" agent/pipeline, so it might be better to name the node differently to reflect more accurately that it is only a part of the larger system.
2. I believe this code uses a mutex to protect shared resources, but I'm not entirely sure it's necessary given the default single-threaded executor in ROS 2 C++. Out of curiosity, could you explain the rationale behind including the mutex in this implementation? Is it primarily for future-proofing, or are there other considerations I might be missing? (See the sketch after this list.)
3. The failing jazzy pipeline. I believe this is caused by a library name mismatch, so a small fix should suffice.
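For context, here is a minimal rclpy sketch of the scenario I have in mind (illustrative only; the actual node is C++ and all names here are made up). With the default single-threaded executor the two callbacks below can never run concurrently, so the lock is redundant; it only starts to matter once a multi-threaded executor and a reentrant callback group are used:

import threading

import rclpy
from rclpy.node import Node
from rclpy.executors import MultiThreadedExecutor
from rclpy.callback_groups import ReentrantCallbackGroup
from sensor_msgs.msg import Image
from std_srvs.srv import Trigger


class LatestImageNode(Node):
    """Keeps the most recent camera frame and reports whether one arrived."""

    def __init__(self):
        super().__init__('latest_image_node')
        self._lock = threading.Lock()     # protects _last_image
        self._last_image = None
        group = ReentrantCallbackGroup()  # lets the two callbacks overlap
        self.create_subscription(
            Image, 'camera/image_raw', self._on_image, 10, callback_group=group)
        self.create_service(
            Trigger, 'anything_new', self._on_anything_new, callback_group=group)

    def _on_image(self, msg):
        with self._lock:                  # writer side
            self._last_image = msg

    def _on_anything_new(self, request, response):
        with self._lock:                  # reader side
            response.success = self._last_image is not None
        return response


rclpy.init()
node = LatestImageNode()
# With rclpy.spin(node) (single-threaded) the lock never contends;
# with a MultiThreadedExecutor the callbacks may run in parallel and it matters.
executor = MultiThreadedExecutor()
executor.add_node(node)
executor.spin()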
1-2. The naming and the mutex reflect the node's purpose beyond this first version. I am open to suggestions for the name; the intention was to make the role easy for users to understand, much like with the whoami node.
Issue #230 was raised for multi-threading.
@adamdbrw
I created a simple node to test the /rai/whatisee/get service. I have 2 questions:
1. observations is always "nothing to add" even though the camera image changes - is that expected?
2. pose is not changing even though the robot moved.
Please check the video below:
whatisee-2024-09-18_13.39.19.mp4
Simple python client:
import rclpy
import cv2
from rclpy.node import Node
from rai_interfaces.srv import WhatISee
from cv_bridge import CvBridge


class Client(Node):
    def __init__(self):
        super().__init__('whatisee_client')
        self.client = self.create_client(WhatISee, '/rai/whatisee/get')
        while not self.client.wait_for_service(timeout_sec=1.0):
            self.get_logger().info('service not available, waiting again...')

    def send_request(self):
        req = WhatISee.Request()
        future = self.client.call_async(req)
        rclpy.spin_until_future_complete(self, future)
        return future.result()


rclpy.init()
node = Client()
response = node.send_request()
print(f'{response.observations=}')
print(f'{response.perception_source=}')
print(f'{response.pose}')

# Convert the returned sensor_msgs/Image to an OpenCV image and display it
msg = response.image
bridge = CvBridge()
cv_image = bridge.imgmsg_to_cv2(msg, desired_encoding="passthrough")
if cv_image.shape[-1] == 4:
    cv_image = cv2.cvtColor(cv_image, cv2.COLOR_BGRA2RGB)
else:
    cv_image = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB)
cv2.imshow("image", cv_image)
cv2.waitKey(0)

node.destroy_node()
rclpy.shutdown()
Both are expected; this is a first version that doesn't resolve the issue yet.
LGTM
Sorry for the late reply and thank you for the answer! PR approved.
Purpose
Combining state-of-the-art robotic algorithms with generative AI poses a number of challenges. Vision Language Models (VLMs), for instance, cannot keep up with the frequencies of robot sensors (cameras), and running them constantly would currently be costly. There is a need for a solution that picks the most interesting image over a period of time and handles cases where there is no important change to see. This PR brings in the first version of a node to handle this.
Proposed Changes
A new rclcpp node that exposes two services: one to check whether anything has changed, and one to get the most recent image along with a description.
The first version is very simple.
See the linked issue for a more detailed description.
Issues
#213
Testing
I conducted manual testing using a ros2 bag recorded from the rosbot xl sim.
To run the ros2 node with the sim ros2 bag:
ros2 run rai_whatisee rai_whatisee_node --ros-args -p camera_color_topic:=/camera/camera/color/image_raw
Play the bag (in the bag directory):
ros2 bag play . --loop
Call services:
ros2 service call /rai/whatisee/anything_new std_srvs/srv/Trigger
ros2 service call /rai/whatisee/get rai_interfaces/srv/WhatISee
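For reference, a rough Python sketch of how the two services above could be combined (a sketch only: the client structure and the assumption that the Trigger response's success flag signals a change are mine, not part of this PR), so that the full WhatISee call is only made when something new is reported:

import rclpy
from rclpy.node import Node
from std_srvs.srv import Trigger
from rai_interfaces.srv import WhatISee


class WhatISeePoller(Node):
    """Sketch: query /rai/whatisee/get only when anything_new reports a change."""

    def __init__(self):
        super().__init__('whatisee_poller')
        self.anything_new = self.create_client(Trigger, '/rai/whatisee/anything_new')
        self.get_view = self.create_client(WhatISee, '/rai/whatisee/get')
        for client in (self.anything_new, self.get_view):
            while not client.wait_for_service(timeout_sec=1.0):
                self.get_logger().info('service not available, waiting again...')

    def call(self, client, request):
        future = client.call_async(request)
        rclpy.spin_until_future_complete(self, future)
        return future.result()


rclpy.init()
node = WhatISeePoller()
# Assumption: the Trigger response's success flag means "there is something new";
# the actual semantics of /rai/whatisee/anything_new may differ.
if node.call(node.anything_new, Trigger.Request()).success:
    response = node.call(node.get_view, WhatISee.Request())
    print(response.observations)
node.destroy_node()
rclpy.shutdown()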