
Empty "metrics" list passed to "@scorer" causes a crash of the log viewer #1325

Closed
max-kaufmann opened this issue Feb 15, 2025 · 1 comment


@max-kaufmann
Contributor

As described in #1324, I'm trying to create a scorer that just returns strings, so I passed an empty list to metrics (since I don't want to reduce anything). When I try to open the resulting log, I get this error:

TypeError: Cannot read properties of undefined (reading 'reducer')
    at ResultsPanel (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:60545:42)
    at renderWithHooks (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:3533:25)
    at updateFunctionComponent (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:5030:20)
    at beginWork (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:5685:18)
    at performUnitOfWork (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:8750:18)
    at workLoopSync (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:8649:41)
    at renderRootSync (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:8633:11)
    at performWorkOnRoot (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:8335:44)
    at performWorkOnRootViaSchedulerTask (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:9175:7)
    at MessagePort.performWorkUntilDeadline (https://file+.vscode-resource.vscode-cdn.net/Users/max/Documents/git_repos/stories-cip/.venv/lib/python3.11/site-packages/inspect_ai/_view/www/dist/assets/index.js:191:50)

Here is my scorer; I believe this reproduces with any scorer if you make metrics empty:

import re

from inspect_ai.scorer import Score, Scorer, Target, scorer
from inspect_ai.solver import TaskState


@scorer(metrics=[])
def scenario_parser_v0() -> Scorer:
    """Returns a scorer that returns a dictionary containing the model's question and answer. Fills out metadata with parse_error if the model's output does not contain both a <scenario> and a <question> tag."""

    async def scenario_parser_v0(state: TaskState, target: Target) -> Score:
        model_output = state.output.completion

        scenario_match = re.search(
            r"<scenario>(.*?)</scenario>", model_output, re.DOTALL
        )
        question_match = re.search(
            r"<question>(.*?)</question>", model_output, re.DOTALL
        )

        scenario = scenario_match.group(1).strip() if scenario_match else None
        question = question_match.group(1).strip() if question_match else None

        parse_error = scenario is None or question is None

        return Score(
            value=1.0,
            answer=model_output,
            metadata={
                "parse_error": parse_error,
                "question": question if question else "",
                "answer": scenario if scenario else "",
            },
        )

    return scenario_parser_v0

Inspect should probably raise an error earlier, when the scorer is first created, or (as in my use case) allow an empty list of metrics.
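One possible shape for the eager check described above — `validate_metrics` here is a hypothetical helper for illustration, not part of the inspect_ai API — would be to fail loudly at scorer-creation time rather than letting the log viewer crash later on a missing reducer:

```python
def validate_metrics(metrics: list) -> list:
    # Hypothetical guard: reject an empty metrics list when the scorer is
    # created, so the failure surfaces immediately instead of at log-view time.
    if not metrics:
        raise ValueError(
            "metrics must contain at least one metric; "
            "pass a no-op metric if you do not want to reduce scores"
        )
    return metrics
```

The alternative fix, of course, is for the viewer to treat an empty metrics list as valid and simply render no aggregate results.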

@dragonstyle
Collaborator

Great repro thx!!

@dragonstyle dragonstyle mentioned this issue Feb 15, 2025