[Request] Feature requests for the future new knowledge base #6054

Open
BryceWG opened this issue Feb 12, 2025 · 24 comments
Labels
🌠 Feature Request (new feature or request), files (file upload / knowledge base)

Comments

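BryceWG commented Feb 12, 2025

🥰 Requirement description

A knowledge base is generally understood to rely mainly on RAG, as opposed to passing a file directly as context.
My proposal is to add a feature to the knowledge base: when attaching files in a conversation, you could choose to pass a 'file' already stored in the knowledge base directly as context, while still keeping the option of using the 'knowledge base' itself as context. In effect this gives the knowledge base a cloud-drive function, adding a quick way to call up the files in it.

🧐 Solution

When attaching files in a conversation, allow choosing a 'file' already in the knowledge base to be used directly as context.

📝 Additional information

No response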


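@arvinxx: I'd like to use this issue to collect everyone's requests. If there is anything about the current knowledge base that you are not happy with, please bring it up. Work on the knowledge base 2.0 overhaul starts in March, and I will read every request.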

BryceWG added the 🌠 Feature Request label Feb 12, 2025


dosubot bot added the files label Feb 12, 2025
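arvinxx (Contributor) commented Feb 12, 2025

The current interaction actually already supports this. However, because the existing approach is RAG-based and cannot inject the full text, the results are not ideal. Version 2.0 will add full-text injection, which should greatly improve results in scenarios that need to reference the full text.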
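
To make the contrast between the two modes concrete, here is a minimal TypeScript sketch. It is an illustration only, not LobeChat's implementation: the names `KnowledgeFile`, `chunk`, `score`, and `buildContext` are hypothetical, and the keyword-overlap score merely stands in for the embedding similarity a real RAG pipeline would use.

```typescript
// Hypothetical types for illustration; not LobeChat's actual data model.
type ContextMode = 'full' | 'rag';

interface KnowledgeFile {
  name: string;
  text: string; // already-parsed plain text of the file
}

// Split text into paragraph chunks (a deliberately simple chunker).
function chunk(text: string): string[] {
  return text
    .split(/\n\s*\n/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Stand-in for embedding similarity: count the query words found in the chunk.
function score(chunkText: string, query: string): number {
  const haystack = chunkText.toLowerCase();
  const words = query.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  return words.filter((w) => haystack.indexOf(w) !== -1).length;
}

// 'full' injects the whole document; 'rag' injects only the best-matching chunks.
function buildContext(file: KnowledgeFile, mode: ContextMode, query: string, topK = 2): string {
  if (mode === 'full') {
    return file.text;
  }
  return chunk(file.text)
    .map((c) => ({ c, s: score(c, query) }))
    .filter((x) => x.s > 0)
    .sort((a, b) => b.s - a.s)
    .slice(0, topK)
    .map((x) => x.c)
    .join('\n\n');
}
```

In 'rag' mode the model only sees the chunks that match the query, which is why questions that depend on the whole document tend to suffer; 'full' mode trades a larger prompt for complete coverage.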

arvinxx added this to the Knowledgebase 2.0 milestone Feb 12, 2025

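BryceWG (Author) commented Feb 12, 2025

Speaking of referencing, would it be possible to quote content from the conversation? Something like this Doubao feature, where you select a passage of text in the conversation and use it as context: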

Image


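@Alencryenfo

Quoted from BryceWG:
Speaking of referencing, would it be possible to quote content from the conversation? Something like this Doubao feature, where you select a passage of text in the conversation and use it as context:

Image

same request
This feature seems really useful.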


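git268 commented Feb 13, 2025

One more point: at present, after you upload a file to LobeChat you still have to wait for vectorization before it is sent to the model. Some PPT or PDF files that contain fairly complex charts fail with a vectorization error. Other similar platforms such as Cherry Studio, however, appear to send the whole file to the model, which is faster and reads the content more accurately.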


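BryceWG (Author) commented Feb 13, 2025

Quoted from git268:
One more point: at present, after you upload a file to LobeChat you still have to wait for vectorization before it is sent to the model. Some PPT or PDF files that contain fairly complex charts fail with a vectorization error. Other similar platforms such as Cherry Studio, however, appear to send the whole file to the model, which is faster and reads the content more accurately.

I have used Cherry too. Only a few providers' APIs accept files directly. According to its author, for APIs without dedicated adaptation the file content is actually parsed locally and then sent.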
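
The fallback described here can be sketched roughly as follows. This is a hedged illustration, not Cherry Studio's or LobeChat's code: `ProviderCapabilities`, `LocalFile`, `Attachment`, and `prepareAttachment` are hypothetical names, and the local parser is passed in as a parameter because real parsing (PDF, PPT, and so on) is out of scope.

```typescript
// Hypothetical shapes; real provider SDKs differ and are not modeled here.
interface ProviderCapabilities {
  name: string;
  supportsFileUpload: boolean; // assumption: a flag we maintain per provider
}

interface LocalFile {
  name: string;
  bytes: Uint8Array;
}

type Attachment =
  | { kind: 'file-ref'; fileName: string } // the provider ingests the file itself
  | { kind: 'inline-text'; fileName: string; text: string }; // parsed locally first

// Use the provider's native file support when it exists;
// otherwise parse the file locally and send the extracted text.
function prepareAttachment(
  provider: ProviderCapabilities,
  file: LocalFile,
  parseLocally: (file: LocalFile) => string,
): Attachment {
  if (provider.supportsFileUpload) {
    return { kind: 'file-ref', fileName: file.name };
  }
  return { kind: 'inline-text', fileName: file.name, text: parseLocally(file) };
}
```

Keeping the parser as a parameter means the same decision logic works whether the extracted text comes from a local library or from an external parsing service.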


arvinxx pinned this issue Feb 13, 2025
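@Sun-drenched

1. Integrate a top-tier document parsing API such as Doc2X to improve parsing accuracy for knowledge base documents.
2. Since passing documents directly to the API is not universally applicable, provide an index of the full (parsed) original text. It could be shown in its own section, like the current deep-thinking section, collapsed by default.
3. Allow creating and conveniently sharing assistants that come bundled with both the vectorized and the original-text knowledge base (for example through a team space, a public or private assistant market, or by exporting an archive or other file that can be imported directly).
4. Give the mobile client the same knowledge base management features as the desktop client.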


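yagev5 commented Feb 15, 2025

I hope a future version adds an entry in the knowledge base settings for choosing the vectorization model, so that users can build the knowledge base with whichever online or local model they prefer. I tried switching models by changing environment variables as described in the help docs, but it was too cumbersome, and some models threw errors. I also hope the knowledge base will later allow adding a web page URL directly and automatically vectorizing the page's data into the knowledge base.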


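SAnBlog commented Feb 17, 2025

I would like to be able to create documents online and edit their content in Markdown, with the saved data vectorized manually. During a conversation you could then select a specific document or a document's directory, which would amount to AI notes plus chat.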


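memset0 commented Feb 18, 2025

Quoted from arvinxx:
The current interaction actually already supports this. However, because the existing approach is RAG-based and cannot inject the full text, the results are not ideal. Version 2.0 will add full-text injection, which should greatly improve results in scenarios that need to reference the full text.

Providers such as Gemini offer a file upload API. I hope that full-text injection can optionally call this kind of API directly for better performance.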


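memset0 commented Feb 18, 2025

Personally, I would suggest adding a panel to the sidebar of the chat view that lists the files and knowledge bases of the current conversation, each with a checkbox. For each message you could then choose to send only some of the files, or run RAG retrieval over only some of them (perhaps an interaction similar to NotebookLM).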
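
One way to picture the state behind such a panel is sketched below. It assumes, purely for illustration, that each listed file carries one of three modes; `FileMode`, `PanelItem`, `MessagePlan`, and `planMessage` are hypothetical names, not part of any existing LobeChat API.

```typescript
// Hypothetical per-file state for a sidebar panel; names are illustrative only.
type FileMode = 'off' | 'full' | 'rag';

interface PanelItem {
  fileId: string;
  mode: FileMode;
}

interface MessagePlan {
  sendInFull: string[]; // files whose whole text is attached to the message
  searchWithRag: string[]; // files that RAG retrieval is restricted to
}

// Turn the panel's checkbox state into a plan for the next message.
function planMessage(items: PanelItem[]): MessagePlan {
  const plan: MessagePlan = { sendInFull: [], searchWithRag: [] };
  for (const item of items) {
    if (item.mode === 'full') {
      plan.sendInFull.push(item.fileId);
    } else if (item.mode === 'rag') {
      plan.searchWithRag.push(item.fileId);
    }
    // 'off' files are skipped for this message.
  }
  return plan;
}
```

Because the plan is recomputed per message, the selection can change from one turn to the next without touching the conversation's file list.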


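rollby commented Feb 19, 2025

I suggest adding these features to the knowledge base:
1. Support creating folders, to make it easier to organize content within the knowledge base
2. Support creating files or segments online, to make it easier to add content
3. Support file archiving or version updates. For example, I have a contact list that is updated constantly, and I would like to either overwrite the old version of the file directly or archive the old version
4. Support team management and team-shared knowledge bases
5. Support tagging knowledge base files, for quick lookup or easier retrieval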
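
Items 1, 3, and 5 could be modeled along these lines. This is only a sketch of one possible data shape under assumed names (`KbFile`, `FileVersion`, `uploadNewVersion`, `filterByTag`); it says nothing about how the project actually stores files.

```typescript
// Hypothetical data model for folders, versions, and tags.
interface FileVersion {
  version: number;
  content: string;
  archived: boolean;
}

interface KbFile {
  name: string;
  folder: string;
  tags: string[];
  versions: FileVersion[]; // oldest first; the last entry is the current one
}

// Add a new version. With archiveOld = true the earlier versions are kept and
// marked archived; with archiveOld = false the current version is overwritten.
// The input file is not mutated.
function uploadNewVersion(file: KbFile, content: string, archiveOld: boolean): KbFile {
  const latest = file.versions[file.versions.length - 1];
  const nextNumber = latest ? latest.version + 1 : 1;
  const kept = archiveOld
    ? file.versions.map((v) => ({ ...v, archived: true }))
    : file.versions.slice(0, -1);
  return { ...file, versions: [...kept, { version: nextNumber, content, archived: false }] };
}

// Quick lookup of files by tag.
function filterByTag(files: KbFile[], tag: string): KbFile[] {
  return files.filter((f) => f.tags.indexOf(tag) !== -1);
}
```

For the contact-list example, calling `uploadNewVersion(file, newContent, true)` keeps the history, while passing `false` replaces the current copy.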



10 participants