New user_pitchdb.csv #19
Hey, thanks for converting the Kanjium pitch data. I would like to keep the
…cture being similar to Wadoku's DB. Leaving only first entries in the 3rd column for now
Thank you for the reply.
Well, I've spent some time rewriting the algorithm, testing the data and putting up the gist. The current summary is:
Also one question about "keeping the Gist. If there are any problems other than those listed, please let me know. Also, since you're not planning on merging, I will be committing to a separate repo, if that's OK. There's just some stuff not affiliated with this one.
Thanks for the extensive info, and excuse the late reply. I have now added a remark on the add-on page pointing to the gist.
Well, my point was that the algorithm you use to work around the custom db structure doesn't seem to support multiple entries with the same key. I may be wrong, but either way the addon only generates the last entry. As I said,
It also doesn't look like it supports multiple values in a field, so I can't think of anything I can do to fix this. In the previous comment I wanted to let you know that you should probably revise the custom db structure you use rather than post data that works incorrectly.
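To illustrate the "only the last entry" behaviour described above, here is a minimal sketch. It assumes the custom DB is loaded into a dict keyed by expression alone; the loader name, the row layout (expression, reading, pitch) and the pitch values are all hypothetical placeholders, not the add-on's actual code or data.

```python
# Hypothetical rows: (expression, reading, pitch).
# The pitch values are placeholders, not real accent data.
ROWS = [
    ("見物", "けんぶつ", "pitch-A"),
    ("見物", "みもの", "pitch-B"),
]


def load_by_expression(rows):
    """Build a lookup keyed by expression only (assumed behaviour)."""
    db = {}
    for expression, reading, pitch in rows:
        # A second row with the same expression silently replaces the first.
        db[expression] = (reading, pitch)
    return db


db = load_by_expression(ROWS)
print(db["見物"])  # ('みもの', 'pitch-B'): only the last row survives
```

If the add-on loads its data in a similar way, every duplicate key in the CSV would be lost except the final one.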
Sorry for not properly addressing your comment. I wasn't able to find any time for the add-on recently, and since your comment had been sitting there for a while without any reaction from me, I did a rush job on it. If you want me to remove the link to your Gist on the add-on page for now, let me know. As for the points you raise: The add-on integrates the
By processing the
Reading your description, I guess the Kanjium data lists a Japanese word multiple times, starting with the most common reading/pitch followed by less and less common readings/pitch accent patterns. In that case the pragmatic way to make the best use of the Kanjium data before #20 is addressed would be to only keep the first entry for every word in the Kanjium data. /edit:
To quickly elaborate on this: the
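The "keep only the first entry for every word" workaround suggested above could be sketched roughly like this. It assumes each row starts with the expression and that the rows are already ordered from most to least common; the function name and the sample rows are hypothetical.

```python
def keep_first_per_expression(rows):
    """Return rows with only the first occurrence of each expression kept."""
    seen = set()
    kept = []
    for row in rows:
        expression = row[0]  # assumption: the expression is the first column
        if expression in seen:
            continue  # drop later (assumed less common) entries
        seen.add(expression)
        kept.append(row)
    return kept


# Placeholder rows: (expression, reading, pitch); pitch values are made up.
sample = [
    ("word1", "reading1a", "pitch-A"),
    ("word1", "reading1b", "pitch-B"),
    ("word2", "reading2", "pitch-C"),
]
filtered = keep_first_per_expression(sample)
```

Note that this discards every alternative reading of a word, not just alternative pitch patterns for the same reading.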
Thank you for your reply. I'd appreciate it if you removed the link for now. The way to fix the data you suggest may be pragmatic, but functionality is going to suffer greatly. It's not just about removing extra entries. For some words it would mean removing an entire reading type (e.g. 見物, which can be read in both systems, would be left with either けんぶつ (on) or みもの (kun)). I believe the current situation is that the "expression" field acts as the only key to search through the custom DB. But the thing is, there is also a "reading" field. You select both while bulk adding pitches in the app, but is the latter even used in the search?
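As a rough sketch of what using the "reading" field in the search could look like: key the lookup on the (expression, reading) pair and keep a list of values per key. This is only an illustration under assumed names and an assumed row layout, not a description of how the add-on works; the pitch values are placeholders.

```python
from collections import defaultdict


def load_by_expression_and_reading(rows):
    """Build a lookup keyed by (expression, reading), keeping all pitch values."""
    db = defaultdict(list)
    for expression, reading, pitch in rows:
        db[(expression, reading)].append(pitch)
    return db


# Placeholder pitch values; only the readings come from the example above.
rows = [
    ("見物", "けんぶつ", "pitch-A"),
    ("見物", "みもの", "pitch-B"),
]
pair_db = load_by_expression_and_reading(rows)
on_reading = pair_db[("見物", "けんぶつ")]
kun_reading = pair_db[("見物", "みもの")]
```

With a composite key like this, both readings of 見物 stay available, and a list per key leaves room for several pitch patterns on one reading.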
Gotcha. Removed the link. Regarding multiple readings:
Because of this only the last entry for a word in
into the
I agree that the current structure makes more sense.
Converted you-know-what (wink-wink) to .csv so the addon can use it as a custom DB.
Works fine, as far as I can tell.