Commit
Deployed c20d03f with MkDocs version: 1.4.3
ruoxining committed Aug 5, 2024
1 parent cd5bc9e commit aa63bab
Showing 3 changed files with 6 additions and 6 deletions.
10 changes: 5 additions & 5 deletions DL/PaperNotes/llama3_405b/index.html
@@ -1655,14 +1655,14 @@ <h3 id="benchmarks">Benchmarks overview</h3>
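<li>Tool use<ul>
<li><code>Nexus</code> (Srinivasan et al., 2023) Oddly, the only search result for this dataset is a Hugging Face entry, and opening it returns a 404.</li>
<li><code>API-Bank</code> (Li et al., 2023b) consists of 73 API tools. Based on the API calls in the dialogues the model generates, it tests whether the model can call, retrieve + call, and plan + retrieve + call APIs.</li>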
-<li><code>API-Bench</code> (Patil et al., 2023) </li>
-<li><code>BFCL</code> (Yan et al., 2024) </li>
+<li><code>API-Bench</code> (Patil et al., 2023) has two subtasks: query-based API (output an API given a requirement described in natural language) and code-based API (fill in code from which the API call has been removed). It comes in Python and Java versions.</li>
+<li><code>BFCL</code> (Yan et al., 2024) Berkeley Function Calling Leaderboard; covers multiple programming languages and includes function-call type data.</li>
</ul>
</li>
<li>Long context<ul>
-<li><code>ZeroSCROLLS</code> (Shaham et al., 2023) </li>
-<li><code>Needle-in-a-Haystack</code> (Kamradt, 2023) </li>
-<li><code>InfiniteBench</code> (Zhang et al., 2024) </li>
+<li><code>ZeroSCROLLS</code> (Shaham et al., 2023) zero-shot benchmark for long text understanding; a comprehensive multi-task long-text dataset.</li>
+<li><code>Needle-in-a-Haystack</code> (Kamradt, 2023) the needle-in-a-haystack experiment.</li>
+<li><code>InfiniteBench</code> (Zhang et al., 2024) a multi-task, 100K-token long-text dataset.</li>
</ul>
</li>
</ul>
2 changes: 1 addition & 1 deletion search/search_index.json

Large diffs are not rendered by default.

Binary file modified sitemap.xml.gz
Binary file not shown.