First version of whatisee node #219

Merged
merged 4 commits into from
Sep 18, 2024

Conversation

Member

@adamdbrw adamdbrw commented Sep 12, 2024

Purpose

Combining state-of-the-art robotic algorithms with generative AI poses a number of challenges. Vision Language Models (VLMs), for instance, cannot keep up with the frame rates of robot sensors such as cameras, and running them constantly would currently be costly. There is a need for a solution that picks the most interesting image over a period of time and handles cases where there is no important change to see. This PR brings in the first version of a node to handle this.

Proposed Changes

A new rclcpp node that exposes two services: one to check whether anything changed, and one to get the most recent image along with its description.
The first version is very simple.
See the linked issue for more detail.
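
The two-service behaviour described above can be sketched without ROS. This is a minimal Python illustration, not the rclcpp implementation in this PR: the names `LatestImageCache` and `Observation`, and the rule that "anything new" means "a frame arrived since the last `get`", are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Hypothetical stand-in for the fields returned by the get service."""
    image: bytes
    observations: str = "nothing to add"


class LatestImageCache:
    """Keeps only the most recent camera frame (hypothetical helper)."""

    def __init__(self) -> None:
        self._latest: Optional[bytes] = None
        self._fresh = False  # True when a frame arrived after the last get()

    def on_image(self, image: bytes) -> None:
        # Camera callback: overwrite the stored frame, however fast frames arrive.
        self._latest = image
        self._fresh = True

    def anything_new(self) -> bool:
        # Analogue of the Trigger-style service: a cheap check, no image returned.
        return self._fresh

    def get(self) -> Optional[Observation]:
        # Analogue of the get service: return the latest frame and mark it as seen.
        if self._latest is None:
            return None
        self._fresh = False
        return Observation(image=self._latest)


cache = LatestImageCache()
cache.on_image(b"frame-1")
cache.on_image(b"frame-2")  # replaces frame-1; only the latest frame is kept
print(cache.anything_new())
print(cache.get().image)
print(cache.anything_new())
```

The point of the split is that a caller can poll the cheap check often and only request the image (and pay for a VLM call) when something has arrived.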

Issues

#213

Testing

I tested manually using a ros2 bag recorded from the rosbot xl sim.

Testing - to run the ros2 node with sim ros2 bag:
ros2 run rai_whatisee rai_whatisee_node --ros-args -p camera_color_topic:=/camera/camera/color/image_raw
Play the bag (in bag directory)
ros2 bag play . --loop
Call services:
ros2 service call /rai/whatisee/anything_new std_srvs/srv/Trigger
ros2 service call /rai/whatisee/get rai_interfaces/srv/WhatISee

@adamdbrw adamdbrw marked this pull request as draft September 12, 2024 16:22
@adamdbrw
Member Author

Draft - still not tested and missing similarity comparison (for freshness).
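
For illustration only, one simple form a similarity comparison for freshness could take is a mean absolute pixel difference against the last frame that was handed out. The function names and the threshold value below are hypothetical, and the PR does not state which metric is planned.

```python
from typing import Optional


def mean_abs_diff(a: bytes, b: bytes) -> float:
    """Average absolute difference between two equally sized raw frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def is_fresh(previous: Optional[bytes], current: bytes, threshold: float = 10.0) -> bool:
    """Treat the current frame as new if it differs enough from the previous one.

    The threshold is an arbitrary illustrative value, not one taken from the PR.
    """
    if previous is None:
        return True  # nothing to compare against yet
    return mean_abs_diff(previous, current) > threshold


dark = bytes([0] * 16)
bright = bytes([200] * 16)
print(is_fresh(None, dark))
print(is_fresh(dark, dark))
print(is_fresh(dark, bright))
```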

@maciejmajek
Member

I've noticed that the CI failed. Please merge #220 and rebase.

@adamdbrw
Member Author

Testing - to run the ros2 node with sim ros2 bag:
ros2 run rai_whatisee rai_whatisee_node --ros-args -p camera_color_topic:=/camera/camera/color/image_raw
Play the bag (in bag directory)
ros2 bag play . --loop
Call services:
ros2 service call /rai/whatisee/anything_new std_srvs/srv/Trigger
ros2 service call /rai/whatisee/get rai_interfaces/srv/WhatISee

@adamdbrw adamdbrw marked this pull request as ready for review September 13, 2024 15:16
Member

@maciejmajek maciejmajek left a comment

Thanks.
Overall the code is good, but I would like to discuss the following things:

  1. I'm not sure about naming the node as "whatisee". This seems to be just a small feature of the "what I see" agent/pipeline. It might be better to name the node differently to more accurately reflect that this is just a part of the larger system.

  2. I believe this code uses a mutex to protect shared resources, but I'm not entirely sure if it's necessary given the default single-threaded executor in ROS 2 C++. Out of curiosity, could you please explain the rationale behind including the mutex in this implementation? Is it primarily for future-proofing or are there other considerations I might be missing?

  3. The failing jazzy pipeline. I believe this is caused by a library name mismatch, so a small fix should suffice.

@adamdbrw
Member Author

1-2. The naming and the mutex both reflect the node's purpose beyond the first version. I am open to suggestions for the name. The intention was to make the role easy for users to understand, much like with the whoami node.
3. I merged in your PR with the solution to support different distros, thank you!
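
To illustrate the mutex point: with a single-threaded executor, callbacks run one at a time, so a lock is not strictly needed. It starts to matter once the image callback and the service callbacks can run on different threads. The sketch below shows that pattern in plain Python with `threading.Lock`. It is not the node's C++ code, and `SharedFrame` and its methods are hypothetical names.

```python
import threading
from typing import Tuple


class SharedFrame:
    """State shared between a camera callback and service callbacks (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._image = b""
        self._sequence = 0

    def update(self, image: bytes) -> None:
        # Writer side: both fields change together under the lock.
        with self._lock:
            self._image = image
            self._sequence += 1

    def snapshot(self) -> Tuple[int, bytes]:
        # Reader side: the lock ensures the pair is read from a single update.
        with self._lock:
            return self._sequence, self._image


shared = SharedFrame()


def camera_callback(count: int) -> None:
    # Simulates a burst of image callbacks running on one worker thread.
    for i in range(count):
        shared.update(f"frame-{i}".encode())


threads = [threading.Thread(target=camera_callback, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

sequence, image = shared.snapshot()
print(sequence, image)
```

Without the lock, the increment of `_sequence` and the paired write of `_image` could interleave between threads, so a reader might see a counter that does not match the stored frame.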

@adamdbrw
Member Author

#230 raised an issue for multi-threading.

Member

@boczekbartek boczekbartek left a comment

@adamdbrw
I created a simple node to test the /rai/whatisee/get service. I have 2 questions:

  • observations is always "nothing to add" even though camera image changes - is it expected?
  • pose is not changing even though robot moved.
    Please check the video below:
whatisee-2024-09-18_13.39.19.mp4
Simple python client:
import rclpy
import cv2

from rclpy.node import Node
from rai_interfaces.srv import WhatISee
from cv_bridge import CvBridge

class Client(Node):
    def __init__(self):
        super().__init__('whatisee_client')
        self.client = self.create_client(WhatISee, '/rai/whatisee/get')
        while not self.client.wait_for_service(timeout_sec=1.0):
            self.get_logger().info('service not available, waiting again...')

    def send_request(self):
        req = WhatISee.Request()
        future = self.client.call_async(req)
        rclpy.spin_until_future_complete(self, future)
        return future.result()

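rclpy.init()
node = Client()
response = node.send_request()
print(f'{response.observations=}')
print(f'{response.perception_source=}')
print(f'{response.pose=}')

# Convert the returned image message into an OpenCV array without changing its encoding.
bridge = CvBridge()
cv_image = bridge.imgmsg_to_cv2(response.image, desired_encoding="passthrough")

# Swap the red and blue channels (dropping alpha if present) before display.
if cv_image.shape[-1] == 4:
    cv_image = cv2.cvtColor(cv_image, cv2.COLOR_BGRA2RGB)
else:
    cv_image = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB)

cv2.imshow("image", cv_image)
cv2.waitKey(0)

# Release the window and ROS resources once a key is pressed.
cv2.destroyAllWindows()
node.destroy_node()
rclpy.shutdown()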

@adamdbrw
Member Author

Both are expected: this is a first version that doesn't resolve the issue.

adamdbrw and others added 4 commits September 18, 2024 16:37
Signed-off-by: Adam Dąbrowski <[email protected]>
Signed-off-by: Adam Dąbrowski <[email protected]>
Signed-off-by: Adam Dąbrowski <[email protected]>
Member

@maciejmajek maciejmajek left a comment

LGTM

@adamdbrw adamdbrw merged commit da5ac2e into development Sep 18, 2024
4 checks passed
@adamdbrw adamdbrw deleted the rai_vision_node branch September 18, 2024 14:50
maciejmajek added a commit that referenced this pull request Sep 18, 2024
Signed-off-by: Adam Dąbrowski <[email protected]>
Co-authored-by: Maciej Majek <[email protected]>
@boczekbartek
Member

Sorry for the late reply, and thank you for the answer! PR approved.
