Commit
Sebastian Gehrmann committed Feb 19, 2024
1 parent 7022663 commit 98b73de
Showing 174 changed files with 189 additions and 193 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
<!DOCTYPE html><html><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><title>404: This page could not be found</title><meta name="next-head-count" content="3"/><link rel="preload" href="/_next/static/css/86a77084a15a5546.css" as="style"/><link rel="stylesheet" href="/_next/static/css/86a77084a15a5546.css" data-n-g=""/><noscript data-n-css=""></noscript><script defer="" nomodule="" src="/_next/static/chunks/polyfills-78c92fac7aa8fdd8.js"></script><script src="/_next/static/chunks/webpack-a73844ba913878ac.js" defer=""></script><script src="/_next/static/chunks/framework-7a7e500878b44665.js" defer=""></script><script src="/_next/static/chunks/main-a56c17dda72126ba.js" defer=""></script><script src="/_next/static/chunks/pages/_app-da8862f0ec3a97c1.js" defer=""></script><script src="/_next/static/chunks/pages/_error-4afcb85b7c260fd3.js" defer=""></script><script src="/_next/static/_6PTD149dab44BiFUecbS/_buildManifest.js" defer=""></script><script src="/_next/static/_6PTD149dab44BiFUecbS/_ssgManifest.js" defer=""></script></head><body><div id="__next"><div style="font-family:system-ui,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji";height:100vh;text-align:center;display:flex;flex-direction:column;align-items:center;justify-content:center"><div style="line-height:48px"><style>body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}</style><h1 class="next-error-h1" style="display:inline-block;margin:0 20px 0 0;padding-right:23px;font-size:24px;font-weight:500;vertical-align:top">404</h1><div style="display:inline-block"><h2 style="font-size:14px;font-weight:400;line-height:28px">This page could not be found<!-- -->.</h2></div></div></div></div><script id="__NEXT_DATA__" 
type="application/json">{"props":{"pageProps":{"statusCode":404}},"page":"/_error","query":{},"buildId":"_6PTD149dab44BiFUecbS","nextExport":true,"isFallback":false,"gip":true,"scriptLoader":[]}</script></body></html> | ||
<!DOCTYPE html><html><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><title>404: This page could not be found</title><meta name="next-head-count" content="3"/><link rel="preload" href="/_next/static/css/86a77084a15a5546.css" as="style"/><link rel="stylesheet" href="/_next/static/css/86a77084a15a5546.css" data-n-g=""/><noscript data-n-css=""></noscript><script defer="" nomodule="" src="/_next/static/chunks/polyfills-78c92fac7aa8fdd8.js"></script><script src="/_next/static/chunks/webpack-a73844ba913878ac.js" defer=""></script><script src="/_next/static/chunks/framework-7a7e500878b44665.js" defer=""></script><script src="/_next/static/chunks/main-a56c17dda72126ba.js" defer=""></script><script src="/_next/static/chunks/pages/_app-da8862f0ec3a97c1.js" defer=""></script><script src="/_next/static/chunks/pages/_error-4afcb85b7c260fd3.js" defer=""></script><script src="/_next/static/Qgo86BhcVH2uzjK6jWdVw/_buildManifest.js" defer=""></script><script src="/_next/static/Qgo86BhcVH2uzjK6jWdVw/_ssgManifest.js" defer=""></script></head><body><div id="__next"><div style="font-family:system-ui,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji";height:100vh;text-align:center;display:flex;flex-direction:column;align-items:center;justify-content:center"><div style="line-height:48px"><style>body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}</style><h1 class="next-error-h1" style="display:inline-block;margin:0 20px 0 0;padding-right:23px;font-size:24px;font-weight:500;vertical-align:top">404</h1><div style="display:inline-block"><h2 style="font-size:14px;font-weight:400;line-height:28px">This page could not be found<!-- -->.</h2></div></div></div></div><script id="__NEXT_DATA__" 
type="application/json">{"props":{"pageProps":{"statusCode":404}},"page":"/_error","query":{},"buildId":"Qgo86BhcVH2uzjK6jWdVw","nextExport":true,"isFallback":false,"gip":true,"scriptLoader":[]}</script></body></html> |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"pageProps":{"sharedTaskData":{"contentHtml":"<h1 id=\"user-content-shared-task---gem-2024\">Shared Task - GEM 2024</h1>\n<h2 id=\"user-content-general-information\">General information</h2>\n<p>Our <a href=\"https://forms.gle/vbTZDMCuqzok8tTA9\">pre-registration form</a> is now available. Although this step is needed in order to receive the test data, submission will NOT be mandatory, so don't hesitate to fill in the form and play with the data! And if you do, please make sure you record the details of your experiments since you will be asked to write one model card per submission.</p>\n<p>This year, the GEM shared task features two main tasks: <strong>Data-to-text generation</strong> and <strong>Summarization</strong>, with a special emphasis on multilinguality; furthermore, no training data is provided, and the test data includes previously unpublished test sets; data illustrations are provided in <a href=\"https://docs.google.com/document/d/1xaGRNl-f6aOH7GWZCOwb745rGvBu-Mz7FtTyvOSmqBM/edit?usp=sharing\">this online document</a>.</p>\n<h2 id=\"user-content-data-to-text-task\">Data-to-text Task</h2>\n<p>The data-to-text task consists in generating texts from input triple sets in the WebNLG fashion, where each triple is made of Subject | Property | Object. There are two subtasks:</p>\n<ul>\n<li><strong>Subtask 1: WebNLG-based (D2T-1)</strong>: we use the official WebNLG test set (1,779 inputs) for the “seen” subtask; even though the WebNLG test set contains properties and entities not seen in the training/dev data, we consider the whole WebNLG dataset as seen since all splits (training/dev/test) have been available online for 3 years. The dataset contains 220 different properties; the original dataset specifications can be found on the <a href=\"https://synalp.gitlabpages.inria.fr/webnlg-challenge/challenge_2020/\">WebNLG website</a>.</li>\n<li><strong>Subtask 2: Wikidata-based (D2T-2)</strong>: we use new triple sets compiled from Wikidata (i.e. 
new properties and entities, 1,800 inputs) for the “unseen” subtask. The dataset contains 74 different properties, none of which were in WebNLG; more information about the Wikidata-based inputs can be found in <a href=\"https://aclanthology.org/2023.mmnlg-1.5.pdf\">this paper</a></li>\n</ul>\n<p>For each subtask, there are 3 parallel datasets (see examples <a href=\"https://docs.google.com/document/d/1xaGRNl-f6aOH7GWZCOwb745rGvBu-Mz7FtTyvOSmqBM/edit?usp=sharing\">here</a>):</p>\n<ul>\n<li><strong>Dataset 1: Factual (FA)</strong>: we use the triples as found in the WebNLG data and on Wikidata.</li>\n<li><strong>Dataset 2: Counterfactual (CFA)</strong>: entities in the factual dataset are switched based on their class (e.g. a person entity is replaced by another person entity, a date by another date, etc.).</li>\n<li><strong>Dataset 3: Fictional (FI)</strong>: entities in the factual datasets are replaced by made up entities (obtained via LLM prompting).</li>\n</ul>\n<p><strong>Data and languages:</strong> No training or development data is provided, only test data (2 sets of 3 test files); to get access to the test data, please pre-register using the link at the top of the page. We accept submissions of outputs in the following languages: English (en), Chinese (zh), German (de), Russian (ru), Spanish (es), Korean (ko), Hindi (hi), Swahili (sw), Arabic (ar). For both subtasks, a subset of the data/languages will be selected for human evaluation based on the number of submissions we receive for each language.</p>\n<p><strong>DISCLAIMER:</strong> This dataset contains counterfactual and fictional data, so it is possible that in some (rare) cases, the resulting data could be judged offensive. In the counterfactual dataset for instance, real person names, roles, dates, locations etc. are switched, which can result in some unfortunate combinations; e.g. 
a work or a person can end up being associated with Adolf Hitler as author, employee, spouse… In the fictional dataset, entity names are made up by a language model and in theory cannot have the same form as existing known entities, but we cannot ensure that no entity will have a label that one could consider offensive.</p>\n<p><strong>Submissions:</strong> Please submit your model outputs <a href=\"TOUPDATE\">here (to be communicated soon)</a>. Each team is expected to submit outputs for the 3 datasets of the subtask(s) they participate in, in three different files. As for the WebNLG shared tasks, each submission file must be a .txt file (UTF-8 encoding) where each text is true-cased and detokenized; see an <a href=\"https://synalp.gitlabpages.inria.fr/webnlg-challenge/files/submission-example-2020-nlg.txt\">example</a> for English on the WebNLG page. In the submission files, each line should correspond to the verbalisation of one triple set: Line 1 should represent the verbalisation of the triple set with the ID=1, line 2 — the triple set with the ID=2, etc. If no output is produced, an empty line is expected, so all output files are expected to contain as many lines as there are inputs. 
Each submission file should be named with the (i) system name, (ii) subtask and dataset, and (iii) ISO 639-1 standard language id (see Data and languages above), separated by underscores: SystemX_[subtask]-[dataset]_[lang id].txt; for instance for a submission for the WebNLG subtask, Factual dataset in English: <strong>SystemX_D2T-1-FA_en.txt</strong>.</p>\n<p><strong>Evaluation:</strong> Only human evaluation will be carried out, via 4 quality criteria: Grammaticality, Fluency, No-omissions, No-Additions; see definitions <a href=\"https://docs.google.com/document/d/1xaGRNl-f6aOH7GWZCOwb745rGvBu-Mz7FtTyvOSmqBM/edit?usp=sharing\">here.</a></p>\n<h2 id=\"user-content-summarization-task\">Summarization Task</h2>\n<p>The summarization task generates a concise summary based on the input text document. To make this task challenging, we focus on several different aspects of the task: underrepresented language (Swahili), cross-lingual, and long-context input.</p>\n<p>There are three subtasks corresponding to the above aspects (see examples <a href=\"https://docs.google.com/document/d/1xaGRNl-f6aOH7GWZCOwb745rGvBu-Mz7FtTyvOSmqBM/edit?usp=sharing\">here</a>):</p>\n<ul>\n<li><strong>Subtask 1: Underrepresented Language Summarization (Swahili).</strong> Both the input text document and the generated summary in this subtask are Swahili, which is usually not covered sufficiently in large language models (LLMs) as an underrepresented language. The dataset we use is the <a href=\"https://zenodo.org/records/4300294\">Swahili news dataset</a>, which contains Swahili news articles collected from different websites. It was originally created for a classification task but we’ll use the input text (news article) as our input article for summarization [23,268 articles in train.csv].</li>\n<li><strong>Subtask 2: Cross-lingual Summarization.</strong> This subtask checks the cross-lingual summarization setting between English and another language for both directions. 
The data is scraped from news websites similar to <a href=\"https://huggingface.co/datasets/csebuetnlp/xlsum\">XLSum</a> <a href=\"https://aclanthology.org/2021.findings-acl.413/\">(Hasan et al., 2021)</a>. For example, the input document is English and the generated summary is Chinese (en document → zh summary); the input document is Chinese and the generated summary is English (zh document → en summary). The languages covered can be found in the “Data and Languages” section below.</li>\n<li><strong>Subtask 3: English Book Chapter Summarization.</strong> We test the English summarization with long-context input (e.g. book chapters) in this subtask. A recent work ( <a href=\"https://arxiv.org/abs/2105.08209\">Kryściński et al., 2021</a>) has introduced an available dataset for long-context input along with human-written summaries. We’ll also leverage the undergraduate student reading group at NYU to acquire human references.</li>\n</ul>\n<p><strong>Data and languages:</strong> No training or development data is provided, only test data (1 testset per subtask); to get access to the test data, please pre-register using the link at the top of the page. For subtask 2, we require submissions of outputs in the following languages for the cross-lingual task: English (en), Chinese (zh), German (de), Russian (ru), Spanish (es), Korean (ko), Hindi (hi), Swahili (sw), Arabic (ar). For all tasks, a subset of the data/languages will be selected for human evaluation based on the number of submissions we receive for each language and on the available annotators.</p>\n<p><strong>Submissions:</strong> Please submit your model outputs <a href=\"TOUPDATE\">here (to be communicated soon)</a>. Each team is expected to submit outputs for the subtask(s) they participate in. Each submission file should be named with the (i) system name, (ii) subtask, and (iii) ISO 639-1 standard language id.</p>\n<ul>\n<li>For subtask 1, we expect one output file in Swahili. 
The expected filename is <strong>SystemX_Summ-1_sw.jsonl</strong>.</li>\n<li>For subtask 2, we expect at least one cross-lingual file and up to 8 files of the different languages from the input language. The file name for each language output should be in the format of SystemX_Summ-2_[lang id].jsonl (e.g., for Chinese: <strong>SystemX_Summ-2_zh.jsonl</strong>).</li>\n<li>For subtask 3, we expect one output file in English that only includes the information in the book chapters. The expected filename is <strong>SystemX_Summ-3_en.jsonl</strong>.</li>\n</ul>\n<p>Each submission file must be a jsonl file (UTF-8 encoding) where each text is true-cased and detokenized; see an <a href=\"https://drive.google.com/file/d/1oeYfxX05BP_099AboVy499HVvgWBmcmY/view?usp=sharing\">example</a> for English.</p>\n<p><strong>Evaluation:</strong> Only human evaluation will be carried out, via 5 quality criteria: Understandability, Compactness, Grammaticality, Coherence, Faithfulness, Saliency; see definitions <a href=\"https://docs.google.com/document/d/1xaGRNl-f6aOH7GWZCOwb745rGvBu-Mz7FtTyvOSmqBM/edit?usp=sharing\">here.</a></p>\n<h2 id=\"user-content-important-dates\">Important Dates</h2>\n<p><code>February 20</code> GEM shared task launched, pre-registration open.</p>\n<p><code>March 8</code> Deadline for pre-registering systems.</p>\n<p><code>April 5</code> Deadline for output submission (all tasks).</p>\n<p><code>April 6</code> Human evaluation starts.</p>\n<p><code>Before summer</code> Human evaluation results.</p>\n<p><strong>System Descriptions and Analyses</strong></p>\n<p><code>TBD</code> System Descriptions and Analyses due</p>\n<p><code>TBD</code> Notification of Acceptance</p>\n<p><code>TBD</code> Camera-ready due</p>\n<p>To stay up-to-date on announcements, please join our <a href=\"https://groups.google.com/g/gem-benchmark\">Google Group</a>. The same group may be used for questions and discussions.</p>\n"}},"__N_SSG":true} |
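The shared-task page carried in the JSON payload above prescribes submission filenames of the form `SystemX_[subtask]-[dataset]_[lang id].txt` for data-to-text and `SystemX_Summ-N_[lang id].jsonl` for summarization. The following is a minimal sketch of a filename checker for those two patterns, not an official validation tool. The function name `parse_submission_name` is hypothetical, and the rule that a system name contains no underscore is an assumption made here so the fields can be split unambiguously; the page does not state it. The sketch also does not enforce the per-subtask language constraints (Swahili for Summ-1, English for Summ-3).

```python
import re

# Language ids listed under "Data and languages" on the shared-task page.
LANGS = {"en", "zh", "de", "ru", "es", "ko", "hi", "sw", "ar"}

# Data-to-text, e.g. SystemX_D2T-1-FA_en.txt
# (subtask 1 or 2; dataset FA, CFA or FI).
# Assumption: the system name contains no underscore.
D2T_RE = re.compile(
    r"^(?P<system>[^_]+)_D2T-(?P<subtask>[12])-(?P<dataset>FA|CFA|FI)"
    r"_(?P<lang>[a-z]{2})\.txt$"
)

# Summarization, e.g. SystemX_Summ-2_zh.jsonl (subtask 1, 2 or 3).
SUMM_RE = re.compile(
    r"^(?P<system>[^_]+)_Summ-(?P<subtask>[123])_(?P<lang>[a-z]{2})\.jsonl$"
)


def parse_submission_name(filename):
    """Return the parsed fields as a dict, or None if the name does not match."""
    for task, pattern in (("D2T", D2T_RE), ("Summ", SUMM_RE)):
        match = pattern.match(filename)
        if match and match.group("lang") in LANGS:
            fields = match.groupdict()
            fields["task"] = task
            return fields
    return None


if __name__ == "__main__":
    for name in ("SystemX_D2T-1-FA_en.txt", "SystemX_Summ-2_zh.jsonl", "bad.txt"):
        print(name, "->", parse_submission_name(name))
```

A name that fails either pattern, or that uses a language id outside the listed nine, yields `None`, so a caller can reject the file before upload.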
File renamed without changes.