Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for the pitch number notation #14

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kichatos
Copy link

@kichatos kichatos commented May 7, 2021

Added support for the pitch number notation as an alternative to the binary pitch notation to simplify the pitch accent input.

Copy link
Owner

@IllDepence IllDepence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

thanks a lot for the PR!

I think the function is useful but have a few suggestions regarding the code.

  • Naming
    • If pitch_svg now acccepts low-high patterns or numbers, the parameter patt should be renamed. I guess a generic pitch_annot would do.
    • Accodingly, the docstring of pitch_svg should then mention that pitch_annot can bei either a low-high pattern or a downstep position.
    • I feel pitch and patt as they stand right now are a bit confusing or at least not very clear. I'm not sure if there are official names for the differnt types of annotating pitch accent (only found this), but since the number notation marks the downstep, I'd suggest renaming pitch_to_pattern to downstep_to_pattern and its parameters to something like downstep_pos and num_mora.
  • Instead of trying to convert patt to an int, I'd suggest using isdigit() to test if the input is numeric or not, then testing for the length (existing check + checking if the number is too high case of number notation (e.g. input はな and 5))
  • The pitch_to_pattern (or downstep_to_pattern) function should have a minimal docstring explaining what it does.
  • Currently pitch_svg not only accepts l, h, L, and H for marking low and high but also 0 and 1 (and, looking at the code, 2—must've also been used in some annotations). This leads to a problem as an input of '10' could either be a downstep at mora position 10 or a high then low pattern. Accordingly 0, 1and 2 should be removed from the lines if accent in ['H', 'h', '1', '2'] and elif accent in ['L', 'l', '0'].
  • In your pitch_to_pattern function, I suggest using if, elif, else instead of two subsequent ifs and followed by the "neither of the above" case just below.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants