
Improve the tool calling #242

Open

shiv810 opened this issue Jan 27, 2025 · 30 comments · May be fixed by ubiquity-os/plugin-sdk#68, ubiquity-os-marketplace/command-query#38 or #245

Comments

@shiv810

shiv810 commented Jan 27, 2025

Improve the tool calling approach by:

  1. Switching to the gpt-4o model for better performance on complex queries.
  2. Using few-shot examples (dynamic or static) that are semantically similar to the user's query, improving accuracy by giving the model relevant context. Dynamic examples can adapt to the user input for more precise responses: the plugin-call examples would be selected for semantic similarity to the query sent to the command router (see the sketch after this list).
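
A minimal sketch of how dynamic example selection could work, assuming the kernel keeps a pool of plugin-call examples and ranks them against the incoming query with the OpenAI embeddings API; the ToolExample shape and helper names are illustrative, not existing kernel code:

import OpenAI from "openai";

// Hypothetical shape of a stored plugin-call example.
interface ToolExample {
  command: string;                      // natural-language user request
  plugin: string;                       // plugin the request should route to
  parameters: Record<string, unknown>;  // arguments the tool call should receive
}

const client = new OpenAI();

function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, value) => sum + value * value, 0));
  const normB = Math.sqrt(b.reduce((sum, value) => sum + value * value, 0));
  return dot / (normA * normB);
}

// Embed the query and the example pool together, then keep the k most
// semantically similar examples for the few-shot section of the prompt.
async function selectFewShotExamples(query: string, pool: ToolExample[], k = 3): Promise<ToolExample[]> {
  const { data } = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: [query, ...pool.map((example) => example.command)],
  });
  const queryEmbedding = data[0].embedding;
  return pool
    .map((example, i) => ({ example, score: cosineSimilarity(queryEmbedding, data[i + 1].embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.example);
}

Static examples could skip the embedding step entirely and always prepend the same handful of calls per plugin.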
@0x4007
Member

0x4007 commented Jan 27, 2025

@shiv810
Author

shiv810 commented Jan 27, 2025

/start


Beneficiary 0xDAba6e01D15Db560b88C8F426b016801f79e1F69

Tip

  • Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
  • Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
  • Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.


Important

  • Be sure to link a pull-request before the first reminder to avoid disqualification.
  • Reminders will be sent every 2 days and 8 hours if there is no activity.
  • Assignees will be disqualified after 4 days and 16 hours of inactivity.

@shiv810
Author

shiv810 commented Jan 27, 2025

@0x4007 Should the examples be predefined and stored, or should they be retrieved from the manifest along with the command?

@0x4007
Member

0x4007 commented Jan 27, 2025

should they be retrieved from the manifest along with the command?

This is the only viable option. Plugins will continue to be built.

@shiv810
Author

shiv810 commented Jan 30, 2025

This is the only viable option. Plugins will continue to be built.

I think the first step would be updating the manifest to ensure it includes the examples for the plugin, and then including them in the prompt along with the command.

@0x4007
Member

0x4007 commented Jan 31, 2025

This is the only viable option. Plugins will continue to be built.

I think the first step would be updating the manifest to ensure it includes the examples for the plugin, and then including them in the prompt along with the command.

The manifest is supposed to automatically update with CI when plugin source code is updated. I am pretty sure the examples are also automatically included in there so I'm not sure what the problem is but I don't think you need to change anything.

@gentlementlegen rfc

@shiv810
Author

shiv810 commented Feb 2, 2025

The manifest is supposed to automatically update with CI when plugin source code is updated. I am pretty sure the examples are also automatically included in there so I'm not sure what the problem is but I don't think you need to change anything.

I couldn't find anything directly related to that in the plugin-sdk repo. The closest reference I found was ubiquity:example. But the manifests will need to be updated for the plugins regardless, since in most cases this field only holds values like /ask, so they would have to be updated with richer examples.

@gentlementlegen
Member

/help


Available Commands

| Command | Description | Example |
|---------|-------------|---------|
| /help | List all available commands. | /help |
| /ask | Ask any question about the repository, issue or pull request. | /ask |
| /query | Returns the user's wallet, access, and multiplier information. | /query @UbiquityOS |
| /start | Assign yourself and/or others to the issue/task. | /start |
| /stop | Unassign yourself from the issue/task. | /stop |
| /wallet | Register your wallet address for payments. Use '/wallet unset' to unlink your wallet. | /wallet ubq.eth |

@gentlementlegen
Member

^ currently these are retrieved from the manifest.
Command ask: https://github.com/ubiquity-os-marketplace/command-ask/blob/development/manifest.json#L8

These are meant to be manually populated by the developer, allowing you to describe the commands as you please.

@shiv810
Author

shiv810 commented Feb 5, 2025

^ currently these are retrieved from the manifest. Command ask: https://github.com/ubiquity-os-marketplace/command-ask/blob/development/manifest.json#L8

These are meant to be manually populated by the developer, allowing you to describe the commands as you please.

Is there any other field that could be used to add these tool-call examples to the manifest? Otherwise, the manifest type would have to be modified for this.

An alternative would be to begin with dynamic few-shot examples stored directly in the kernel for now. We can move them to the manifest if this works out well.

@gentlementlegen
Member

What are you trying to achieve exactly? From what I understand you want to add more examples to the manifest?

@shiv810
Author

shiv810 commented Feb 5, 2025

What are you trying to achieve exactly? From what I understand you want to add more examples to the manifest?

I wanted to add some examples of tool calls for each plugin. When the prompt is built at the kernel, it should retrieve these along with the command. These examples would include a few scenarios where the tool was used and what the output was.

@gentlementlegen
Member

If it is meant only to be used for the LLM, we can consider adding a field in the manifest. Otherwise we can make the example section an array so we can include multiple examples.

@shiv810
Author

shiv810 commented Feb 5, 2025

I think, to begin with, for the core plugins at least, let's start by storing the examples directly in the kernel. If that works well, we can update the manifest and potentially expand this to other plugins.

@0x4007 rfc

@0x4007
Member

0x4007 commented Feb 5, 2025

I think, to begin with, for the core plugins at least, let's start by storing the examples directly in the kernel. If that works well, we can update the manifest and potentially expand this to other plugins.

@0x4007 rfc

I don't have experience with this sort of problem but when it comes to LLM related matters I generally would go with shiv810's judgement. I hope that it won't clutter the codebase too much because I understand that gentlementlegen is likely trying to keep the kernel codebase clean as it should be.

@gentlementlegen
Member

I am not sure how you can store examples in the kernel since it is dynamically generated by fetching manifests for plugins, or maybe you meant just feeding them all regardless of what plugin is running?

@whilefoo rfc

@Keyrxng
Contributor

Keyrxng commented Feb 5, 2025

In my opinion, the manifest approach is best from the beginning as we only have core plugins right now and ultimately that's the ideal option for scalability.

The other option might look like a hardcoded mapping of examples against plugin names taken from the fetched manifest, so it makes sense to embed it in the manifest from the jump I think.

@shiv810
Author

shiv810 commented Feb 5, 2025

I am not sure how you can store examples in the kernel since it is dynamically generated by fetching manifests for plugins, or maybe you meant just feeding them all regardless of what plugin is running?

The existing commands in the manifest would be modified to support this:

{
  "commands": {
    "query": {
      "ubiquity:example": "/query @UbiquityOS",
      "description": "Returns the user's wallet, access, and multiplier information.",
      "parameters": {
        "type": "object",
        "properties": {
          "username": {
            "description": "Github username",
            "type": "string"
          }
        }
      }
    },
    "example": [
      {
        "command": "Give me the details about @xxx wallet",
        "parameters": {
          "username": "xxx"
        },
        "response": "| Property | Value |\n|----------|-------|\n| Wallet | <WALLET_ADDRESS> |"
      }
    ]
  }
}

This would be retrieved from the manifest.json by the kernel, and then it would be added to the prompt.

As an alternative, just to make sure this properly works with the system, we can test this out by adding the examples in the kernel, and if that works fine we can make the change in the plugin-sdk and other places as required.
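
For illustration, a rough sketch of how the kernel could fold such manifest examples into the command-router prompt, assuming the field names from the JSON above; buildCommandPrompt is a hypothetical helper, not existing kernel code:

// Shape of the proposed manifest entry (mirrors the JSON above).
interface CommandExample {
  command: string;
  parameters: Record<string, unknown>;
  response: string;
}

interface ManifestCommand {
  "ubiquity:example": string;
  description: string;
  parameters?: Record<string, unknown>;
  example?: CommandExample[];
}

// Build the per-command section of the router prompt, appending any examples
// that the plugin's manifest declares.
function buildCommandPrompt(name: string, command: ManifestCommand): string {
  const lines = [`Command: /${name}`, `Description: ${command.description}`];
  for (const example of command.example ?? []) {
    lines.push(
      `Example input: ${example.command}`,
      `Example parameters: ${JSON.stringify(example.parameters)}`,
      `Example response: ${example.response}`
    );
  }
  return lines.join("\n");
}

Since the kernel already fetches each plugin's manifest, this would only extend the existing prompt-building step rather than add a new data source.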

@gentlementlegen
Member

Why not directly add the example to the parameters? They are already used by the LLM for the commands.

{
  "commands": {
    "query": {
      "ubiquity:example": "/query @UbiquityOS",
      "description": "Returns the user's wallet, access, and multiplier information.",
      "command": {
        "example": [
          {
            "command": "Give me the details about @xxx wallet",
            "parameters": {
              "username": "xxx"
            },
            "response": "| Property | Value |\n|----------|-------|\n| Wallet | <WALLET_ADDRESS> |"
          }
        ],
        "parameters": {
          "type": "object",
          "properties": {
            "username": {
              "description": "Github username",
              "type": "string"
            }
          }
        }
      }
    }
  }
}

@0x4007
Member

0x4007 commented Feb 5, 2025

/ask design the manifest to solve the problem we are discussing


ubiquity-os-beta bot commented Feb 5, 2025

To address the issue of integrating examples for each plugin command within the manifest, we can enhance the manifest.json structure to include an examples array for each command. This approach ensures that the kernel can retrieve these examples dynamically and incorporate them into the prompts, thereby improving the LLM's performance with relevant context.

Updated manifest.json

Here's a proposed structure for the manifest.json incorporating the examples field:

{
  "commands": {
    "query": {
      "ubiquity:example": "/query @UbiquityOS",
      "description": "Returns the user's wallet, access, and multiplier information.",
      "parameters": {
        "type": "object",
        "properties": {
          "username": {
            "description": "GitHub username",
            "type": "string"
          }
        },
        "required": ["username"]
      },
      "examples": [
        {
          "command": "Give me the details about @johnDoe wallet",
          "parameters": {
            "username": "johnDoe"
          },
          "response": "| Property | Value |\n|----------|-------|\n| Wallet | 0xABC123... |"
        },
        {
          "command": "Show @janeSmith's access and multiplier",
          "parameters": {
            "username": "janeSmith"
          },
          "response": "| Property | Value |\n|----------|-------|\n| Access | Full |\n| Multiplier | 2x |"
        }
      ]
    },
    "help": {
      "description": "Provides assistance and available commands.",
      "parameters": {},
      "examples": [
        {
          "command": "/help",
          "parameters": {},
          "response": "Available commands:\n- /query @username: Retrieve wallet and access information.\n- /status: Get the current system status."
        }
      ]
    }
    // Add more commands as needed
  }
}

Explanation

  1. examples Array: Each command now includes an examples array containing objects that illustrate how to use the command. Each example comprises:

    • command: A sample command invocation.
    • parameters: The parameters used in the command.
    • response: The expected response from the system.
  2. Dynamic Retrieval: The kernel can fetch these examples from the manifest.json and incorporate them into the prompt sent to the LLM. This ensures that the model has concrete instances to reference, enhancing its ability to understand and execute commands accurately.

  3. Scalability: By embedding examples directly within each command, adding new commands or updating existing ones becomes straightforward. Developers can simply append new examples to the relevant command's examples array.

TypeBox Schema Update

To ensure TypeScript schema validation accommodates the new examples field, update the TypeBox schema as follows:

import { Type, Static } from '@sinclair/typebox';

const Example = Type.Object({
  command: Type.String(),
  parameters: Type.Record(Type.String(), Type.Any()),
  response: Type.String()
});

const Command = Type.Object({
  "ubiquity:example": Type.String(),
  description: Type.String(),
  parameters: Type.Object({
    type: Type.String(),
    properties: Type.Record(Type.String(), Type.Object({
      description: Type.String(),
      type: Type.String()
    })),
    required: Type.Optional(Type.Array(Type.String()))
  }),
  examples: Type.Optional(Type.Array(Example))
});

const Manifest = Type.Object({
  commands: Type.Record(Type.String(), Command)
});

type ManifestType = Static<typeof Manifest>;
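
A minimal validation sketch, assuming the schema above and the Value helpers shipped with @sinclair/typebox (the manifest URL would be whatever the kernel already fetches):

import { Value } from '@sinclair/typebox/value';

// Validate a fetched manifest against the extended schema before using its examples.
async function loadManifest(url: string): Promise<ManifestType | null> {
  const manifest: unknown = await fetch(url).then((response) => response.json());
  if (!Value.Check(Manifest, manifest)) {
    console.error([...Value.Errors(Manifest, manifest)]);
    return null;
  }
  // `manifest` is now typed as ManifestType; `examples` (if present) is safe to read.
  return manifest;
}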

Implementation Steps

  1. Update manifest.json: Incorporate the examples array for each relevant command as shown above.

  2. Modify Kernel Retrieval Logic:

    • Ensure that the kernel fetches the examples from the manifest alongside the command details.
    • Integrate these examples into the prompt construction for the LLM.
  3. Validate Schema:

    • Update the TypeBox schema to include the examples field.
    • Run existing tests and add new Jest tests to validate the presence and structure of examples in the manifest.
  4. Deploy Changes:

    • Use Wrangler to deploy the updated application to Cloudflare Workers.
    • Monitor the integration to ensure that examples are correctly utilized by the LLM.

Example Usage

With the updated manifest, when a user invokes the /query command, the system can utilize the provided examples to guide the LLM in generating accurate responses. For instance:

User Input:

/query @janeSmith

LLM Prompt Incorporation:

Examples:
1. Command: "Give me the details about @johnDoe wallet"
   Response: "| Property | Value |\n|----------|-------|\n| Wallet | 0xABC123... |"

2. Command: "Show @janeSmith's access and multiplier"
   Response: "| Property | Value |\n|----------|-------|\n| Access | Full |\n| Multiplier | 2x |"

User Command: "/query @janeSmith"

Generated Response:

| Property | Value |
|----------|-------|
| Access | Full |
| Multiplier | 2x |

This structured approach ensures consistency, scalability, and improved accuracy in the system's responses by leveraging well-defined examples within the manifest.

@whilefoo
Member

whilefoo commented Feb 5, 2025

I'm concerned that putting examples in the manifest will clutter it: for example, if every command has 3 examples and each example includes the response, that can get quite big for some commands. It would also be a burden on plugin developers to supply all those examples and keep them updated as the code changes. Of course, if there's no better option we can try this.

@shiv810 Did you test it just by switching to gpt4o? Maybe that will improve accuracy enough so that we don't need detailed examples.

Why not directly add the example to the parameters? They are already used by the LLM for the commands.

{
  "commands": {
    "query": {
      "ubiquity:example": "/query @UbiquityOS",
      "description": "Returns the user's wallet, access, and multiplier information.",
      "command": {
        "example": [
          {
            "command": "Give me the details about @xxx wallet",
            "parameters": {
              "username": "xxx"
            },
            "response": "| Property | Value |\n|----------|-------|\n| Wallet | <WALLET_ADDRESS> |"
          }
        ],
        "parameters": {
          "type": "object",
          "properties": {
            "username": {
              "description": "Github username",
              "type": "string"
            }
          }
        }
      }
    }
  }
}

Currently we are using strict mode, so I think it won't allow arbitrary properties in the schema. We could disable strict mode, but that means the function call sometimes won't adhere to the schema. Also, I'm not sure whether OpenAI does any processing of the tool definitions and removes unknown properties, or just feeds them to the prompt.
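
One way around this, sketched below under the assumption that strict mode must see a closed schema, is to keep the examples field out of the tool definition entirely and surface the examples through the system prompt instead; the queryCommand object is illustrative, not the real manifest type:

import OpenAI from "openai";

const client = new OpenAI();

// Hypothetical manifest command carrying an extra `examples` field.
const queryCommand = {
  description: "Returns the user's wallet, access, and multiplier information.",
  parameters: {
    type: "object",
    properties: { username: { type: "string", description: "Github username" } },
    required: ["username"],
    additionalProperties: false,
  },
  examples: [{ command: "Give me the details about @xxx wallet", parameters: { username: "xxx" } }],
};

async function routeQuery(userInput: string) {
  // Strip `examples` before building the tool definition so strict mode only
  // sees the JSON Schema keywords it allows; feed the examples via the prompt.
  const { examples, ...toolDefinition } = queryCommand;
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: `Example calls:\n${JSON.stringify(examples, null, 2)}` },
      { role: "user", content: userInput },
    ],
    tools: [{ type: "function", function: { name: "query", strict: true, ...toolDefinition } }],
  });
  return completion.choices[0].message.tool_calls;
}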

@0x4007
Member

0x4007 commented Feb 5, 2025

Perhaps we can assume that this LLM calling feature of plug-ins should only work with our official plug-ins.

Perhaps they should not be required properties for normal plug-ins.

@shiv810
Author

shiv810 commented Feb 6, 2025

@shiv810 Did you test it just by switching to gpt4o? Maybe that will improve accuracy enough so that we don't need detailed examples.

It does increase the accuracy. I think it would be better if we include some examples, at least for core plugins like query and start/stop, as they are used most frequently.

Perhaps they should not be required properties for normal plug-ins.

It could be made an optional property, similar to how command is handled: if examples are provided we can use them in the prompt, otherwise the event is forwarded as-is.

@0x4007
Member

0x4007 commented Feb 6, 2025

Then let's proceed and make this an optional manifest property. Let's also use the highest ranking LLM on the LLM leaderboards that supports tool calling.

I figure that cost optimization can be handled much later.
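
For reference, OpenRouter exposes an OpenAI-compatible endpoint, so switching models should mostly be a configuration change; a rough sketch (the model slug is just an example of a tool-calling-capable model):

import OpenAI from "openai";

// Same SDK, different base URL and API key; model slugs are picked per request.
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "anthropic/claude-3.5-sonnet", // or whichever leaderboard model supports tool calling
  messages: [{ role: "user", content: "/query @UbiquityOS" }],
  // tool definitions built from the plugin manifests would be passed via `tools` here
});

console.log(completion.choices[0].message.content);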

@shiv810
Author

shiv810 commented Feb 10, 2025

Updated the manifest with an optional property. ubiquity-os/plugin-sdk#68

@0x4007
Member

0x4007 commented Feb 14, 2025

Improve the tool calling approach by:

  1. Switching to the gpt-4o model for better performance on complex queries.

Make sure to use OpenRouter so we can pick the latest and best model easily. I imagine that OpenAI has newer models with tool calling, like o3. Also, I'm pretty sure Claude supports tool calling too. 4o seems kind of dated already, right? Also, I'm not concerned about cost optimization for a while; mostly focused on top performance.
