Mii Structs #400
@larsenv allowed the previous Mii pull request to age, so I decided to make my own.
This commit addresses some issues raised in the PR larsenv created. WIP: Documentation for the neglected formats - Changed the naming scheme - Moved the files to their own folder
I know there are a few issues with the AAMP V2 KSY, besides it being incomplete by my own standards. Unfortunately, I don't think I can legally share an example file, but if anyone happens to have a copy of a Nintendo game that uses the format dumped to their computer and would like to check my work, that would be much appreciated.
    doc: Whether the Mii was downloaded from the Check Mii Out channel.
  - id: hair_type
    type: b7
    doc: Hair type. Ranges from 0 to 71. Not ordered the same as visible in editor.
Should it be an enum?
I would make it an enum, if I had a copy of the original lookup table, but it's missing from the repo I extracted these from.
This thread (#355) brings up issues about the idea that I don't want to deal with though. 😬
  region_lock:
    0: no_lock
    1: jpn
    2: usa
We have "bit-sized types" that may be useful for more straightforward specifying of this.
This is counter to the advice given in the previous thread. Are you sure that's better?
> We have "bit-sized types" that may be useful for more straightforward specifying of this.
Um, you're talking about the `instances` a few lines below, not this enum `region_lock`, right?
Sure, after seeing the `instances` heavily relying on manual bit handling, I also consider it necessary to rewrite it using the bit-sized integers.
Yes, it seems it was attached to the wrong line.
> Unfortunately, I don't think I can legally share an example file
I doubt it. Who created it?
If you create a file yourself (using any system or program), you own the copyright and nobody can take it from you. See https://www.gnu.org/licenses/gpl-faq.en.html#GPLOutput:
Is there some way that I can GPL the output people get from use of my program? For example, if my program is used to develop hardware designs, can I require that these designs must be free? (#GPLOutput)
In general this is legally impossible; copyright law does not give you any say in the use of the output people make from their data using your program. If the user uses your program to enter or convert her own data, the copyright on the output belongs to her, not you. More generally, when a program translates its input into some other form, the copyright status of the output inherits that of the input it was generated from.
So even if anybody writes "output files from our program cannot be redistributed", such a statement has no legal weight (unless the program copies substantial parts of itself into the output, but that does not apply to any binary file format I've ever seen).
But I've actually found a bunch of public sample files for the `gen1_mii_wii.ksy` spec: https://miicontest.wii.rc24.xyz/popular.html
game/nintendo_mii/gen1_wii_mii.ksy
Outdated
  - id: mii_id
    type: u1
    repeat: expr
    repeat-expr: 4
    doc: Unique Mii identifier. Also governs color of Mii's pants
If it's only an identifier, it should be either a 4-byte array (`size: 4`) or a `u4` integer. There is no reason to parse 4 single-byte unsigned integers if no individual integer has any meaning per se.
More info needed, but I think one of these bytes determines whether a Mii is "special" and has gold pants.
According to this, it looks like it's the first part.
https://github.com/HEYimHeroic/mii2studio/blob/f9f67f9/mii2studio.py#L136-L154
Is this reasoning satisfactory enough to keep my solution?
> Is this reasoning satisfactory enough to keep my solution?
Hm, I'm still not very happy about that. What about splitting `mii_id` into the "golden pants" and "unique ID" parts?
  - id: mii_type
    type: u1
  - id: mii_id
    size: 3
And BTW, shouldn't `mii_type` be associated with an enum?
> And BTW, shouldn't mii_type be associated with an enum?
I tried that. Enums don't really like it when you have multiple values with the same name.
https://github.com/HEYimHeroic/mii2studio/blob/master/mii2studio.py#L136-L143
Multiple values correlate to a normal Mii with black pants.
As for splitting it, that's not right. The type is still part of the ID.
It is properly represented by being derived from the id, not a separate value from it.
> As for splitting it, that's not right. The type is still part of the ID.
OK, I get it. I'd go for a byte array then:
seq:
  # ...
  - id: mii_id
    size: 4
    doc: Unique Mii identifier. Also governs color of Mii's pants
You can extract `mii_type` the same way as you do with the `u1[]` array (which you use now):
instances:
  mii_type:
    value: mii_id[0]
> > And BTW, shouldn't mii_type be associated with an enum?
> I tried that. Enums don't really like it when you have multiple values with the same name.
Yeah, you can't do that. But you can create an internal enum with the unique values that occur and whatever numeric keys you want (but it's reasonable to simply enumerate like 1, 2, 3, ...), and then create a value instance with nested ternary operators that will resolve the appropriate enum value.
Here's how you do it (just not in `switch-on`, but a value instance):
kaitai_struct_formats/macos/resource_compression/dcmp_0.ksy
Lines 66 to 72 in 7f3ecee
        switch-on: |
          tag >= 0x00 and tag <= 0x1f ? tag_kind::literal
          : tag >= 0x20 and tag <= 0x4a ? tag_kind::backreference
          : tag >= 0x4b and tag <= 0xfd ? tag_kind::table_lookup
          : tag == 0xfe ? tag_kind::extended
          : tag == 0xff ? tag_kind::end
          : tag_kind::invalid
kaitai_struct_formats/macos/resource_compression/dcmp_0.ksy
Lines 85 to 94 in 7f3ecee
enums:
  # Internal enum, only for use in the type switch above.
  # This is a workaround for kaitai-io/kaitai_struct#489.
  tag_kind:
    -1: invalid
    0: literal
    1: backreference
    2: table_lookup
    3: extended
    4: end
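For readers less familiar with Kaitai's expression language, the nested ternary chain above can be sketched in Python. This is only an illustration of the range-to-enum technique using the `dcmp_0.ksy` ranges quoted above; the names `TagKind` and `classify_tag` are made up for this sketch, and the actual byte ranges for `mii_type` are not reproduced here.

```python
from enum import Enum


class TagKind(Enum):
    # Internal enum mirroring tag_kind above; the numeric keys are arbitrary.
    INVALID = -1
    LITERAL = 0
    BACKREFERENCE = 1
    TABLE_LOOKUP = 2
    EXTENDED = 3
    END = 4


def classify_tag(tag: int) -> TagKind:
    """Resolve a raw byte to exactly one enum member, like the ternary chain."""
    if 0x00 <= tag <= 0x1F:
        return TagKind.LITERAL
    if 0x20 <= tag <= 0x4A:
        return TagKind.BACKREFERENCE
    if 0x4B <= tag <= 0xFD:
        return TagKind.TABLE_LOOKUP
    if tag == 0xFE:
        return TagKind.EXTENDED
    if tag == 0xFF:
        return TagKind.END
    # Anything outside 0x00..0xFF falls through, like the final ternary branch.
    return TagKind.INVALID
```

The same shape would apply to `mii_type`: several raw values can map to one member (for example, every value that means "normal Mii"), which is what a plain enum on the raw byte cannot express.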
    - bactcapt
  xref:
    zeldamods: AAMP
  endian: le
All `.ksy` specs are required to specify the `/meta/license` key in order to be included in the format gallery. (If they don't, they're under exclusive copyright and nobody but the sole author can legally use, copy, distribute or modify them. That means you can't either, because you're not the only author, I suppose.)
If you are not the sole copyright owner, you should contact all other authors and agree together on a license. We recommend `CC0-1.0` or `MIT`, but any license from this list can be chosen.
IMHO it's ideal to use GitHub issues (or PRs) to coordinate the consent gathering, for example see kaitai-io/kaitai_struct_doc#30.
I guess that closes this whole PR, because Nintendo cannot be reasoned with.
Wait, I realized this makes me look like an idiot (which I am.) but now I realize you were probably talking about the file itself, not the format.
For licensing those, I would need both @larsenv and @HEYimHeroic to weigh in.
The repository these files were taken from was licensed under AGPL-3.0.
So that's what I propose to license these files under.
I fixed these before being told that all struct files must have an explicit license. Which is obviously ridiculous to expect, when Nintendo cannot be reasoned with. Honestly, they should have led with that.
The format in question was Usually these files have to be extracted from Nintendo games. The purpose for reverse-engineering them is to modify the games by re-injecting them.
There is a two byte checksum at the end of Is there a way for me to check the This works with both for me. Is it agreeable?
- id: checksum
  type: u2
  repeat: eos
Took @generalmimon's suggestion. Co-authored-by: Petr Pučil <[email protected]>
All info added may or may not be incorrect. This is only a starting point for me, to copy over the docs from gen 1
It isn't relevant what program is used to create the files, but rather where the original input (in other words, the user input data that is reflected in the output files) comes from. I assume that you give some input to the program and it gives you (or saves somewhere on your computer) some output file, right? If so, it's clear: the copyright status of the output inherits that of the input it was generated from. So if the input you fill into the program is yours, the output files that whatever program generates from it are yours as well, and you can do whatever you want with them: copy, redistribute, modify, ... It doesn't matter what layout the output files have (file formats per se, as particular layouts for storing data, usually cannot be copyrighted, so nobody can really claim them).
As I said in #400 (review), a potential problem would only occur if the program generating the files injected considerable amounts of itself into the output files. For example, if you take a screenshot from a 3D game, there are usually lots of textures of objects, terrain and so on, and these original images are parts of the game and are copyrighted by the company which created the game. So your screenshot would be a derivative work and it "is subject to whatever rights your license to use the game gives you and fair use/fair dealing" (see https://law.stackexchange.com/a/8547). But there is really no reason to worry about such simple binary files generated just from your data: that's still 100% your original work, only stored in a different layout, expressed in another medium.
So I put a s. I put a s at the end of it. When? Always.
The only files in the format I have on hand are not my original work whatsoever. They are not user-generated at all. They are extracted from Nintendo games. Therefore, they still belong to Nintendo. Created by them, for their games. I cannot share them. Sorry.
If it's not expected to be present more than once, use
- id: checksum
  type: u2
  if: not _io.eof
See https://doc.kaitai.io/user_guide.html#_streams for reference of the
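To make the difference between the two attributes concrete, here is a rough Python sketch of what each one does at the end of a stream: `repeat: eos` collects a list of values until the stream runs out, while `if: not _io.eof` reads at most one value. The helper names are hypothetical, and little-endian byte order for the `u2` is an assumption, since the thread does not state it for this file.

```python
import io
import struct
from typing import List, Optional


def at_eof(stream: io.BytesIO) -> bool:
    """Return True when no unread bytes remain (rough analogue of _io.eof)."""
    pos = stream.tell()
    end = stream.seek(0, io.SEEK_END)
    stream.seek(pos)
    return pos >= end


def read_u2(stream: io.BytesIO) -> int:
    """Read one unsigned 16-bit value; little-endian is assumed here."""
    chunk = stream.read(2)
    if len(chunk) != 2:
        raise EOFError(f"requested 2 bytes, got {len(chunk)}")
    return struct.unpack("<H", chunk)[0]


def read_repeat_eos(stream: io.BytesIO) -> List[int]:
    """Like `repeat: eos`: keep reading u2 values until the stream ends."""
    values = []
    while not at_eof(stream):
        values.append(read_u2(stream))
    return values


def read_if_not_eof(stream: io.BytesIO) -> Optional[int]:
    """Like `if: not _io.eof`: read a single u2 only if bytes remain."""
    return None if at_eof(stream) else read_u2(stream)
```

With this model, a file that ends without a checksum yields an empty list in the first case and `None` in the second, so a consumer of the second form never has to index into a list to get a single value.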
Makes sense, thanks for the explanation.
What about https://github.com/zeldamods/oead? Edit: there also seem to be a lot of public sample files in https://github.com/zeldamods/oead/tree/b2d9e2f/test/aamp/files and https://github.com/zeldamods/aamp/tree/ed99df8/test_data.
That's only really a library, not an editor, per se.
Cool. Good to know.
I found a bug that shows up in the online IDE but does not show up in VS Code (cc @fudgepop01), which prevents the Studio KSY from compiling. I'm working on it, but the IDE doesn't make it very easy to find the cause of an error. Edit: It turned out to be a missing type declaration on
Because that was still causing issues with c# code.
That enum was pretty much useless anyway, as far as I know
All `.ksy` specs should contain some references to external documentation, specifications or other implementations via the keys `/meta/xref` or `/doc-ref` (whenever possible). The only spec which satisfies that is `game/aamp_v2.ksy`, which at least specifies `/meta/xref/zeldamods: AAMP`.
Can you please add a few sentences describing each format in the `/doc` key, i.e. some general information about the format? What is it used for? What kind of data does it store? What is its main field of use?
The point is for someone who visits the format page in our format gallery to get a basic idea of what kind of format it is, without having to search for it on Wikipedia or other external websites.
game/aamp_v2.ksy
Outdated
      - id: data
        type: u4
    instances:
      data_offset:
        value: data & 24
        doc: Offset to data, divided by 4 and relative to parameter start.
      parameter_type:
        value: data >> 24
        enum: parameter_type
        doc-ref: https://zeldamods.org/wiki/AAMP#ParameterType
Suggested change:
-      - id: data
-        type: u4
-    instances:
-      data_offset:
-        value: data & 24
-        doc: Offset to data, divided by 4 and relative to parameter start.
-      parameter_type:
-        value: data >> 24
-        enum: parameter_type
-        doc-ref: https://zeldamods.org/wiki/AAMP#ParameterType
+      - id: parameter_type
+        type: b8
+        enum: parameter_type
+      - id: data_offset
+        type: b24
This seems like a fine solution, but doesn't work in practice. Any idea why?
(Sorry if it's obvious, I'm still waking up.)
Current code:
parameter:
  seq:
    - id: name_crc32
      type: u4
      doc: |
        CRC32 checksum of the name of this parameter.
        Can be compared against this list to get the name:
      doc-ref: [SNIPPED]
    - id: data_offset
      type: b24
      doc: Offset to data, divided by 4 and relative to parameter start.
    - id: parameter_type
      type: b8
      enum: parameter_type
      doc-ref: https://zeldamods.org/wiki/AAMP#ParameterType
> This seems like a fine solution, but doesn't work in practice. Any idea why?
Well, let's see how the bytes are laid out. `data` has the little-endian byte order, so extracting the members `data_offset` and `parameter_type` works with the `d[3] d[2] d[1] d[0]` array:
d[3]                            | d[2]                            | d[1]                            | d[0]
7   6   5   4   3   2   1   0   | 7   6   5   4   3   2   1   0   | 7   6   5   4   3   2   1   0   | 7   6   5   4   3   2   1   0
p7  p6  p5  p4  p3  p2  p1  p0  | f23 f22 f21 f20 f19 f18 f17 f16 | f15 f14 f13 f12 f11 f10 f9  f8  | f7  f6  f5  f4  f3  f2  f1  f0
`data_offset` would be obtained as `0b{f23}{f22}{f21}...{f1}{f0}`, analogously `parameter_type` is `0b{p7}{p6}...{p1}{p0}`.
But note that the bytes are presented in little-endian order in the above diagram, so let's swap them back to the original order as present in the stream:
d[0]                            | d[1]                            | d[2]                            | d[3]
7   6   5   4   3   2   1   0   | 7   6   5   4   3   2   1   0   | 7   6   5   4   3   2   1   0   | 7   6   5   4   3   2   1   0
f7  f6  f5  f4  f3  f2  f1  f0  | f15 f14 f13 f12 f11 f10 f9  f8  | f23 f22 f21 f20 f19 f18 f17 f16 | p7  p6  p5  p4  p3  p2  p1  p0
Diagram A ↑
So yeah, true, it becomes apparent that `data_offset` cannot be parsed using the big-endian bit-sized integer parsing method described at https://doc.kaitai.io/user_guide.html#bit-ints-be, because it uses the following parsing direction:
d[0] d[1]
7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 ...
f7 f6 f5 f4 f3 f2 f1 f0 f15 f14 f13 f12 f11 f10 f9 f8 f7 ...
┃ ───────────────────────────> │ │ ───────────────────────────> │ │ ── ... ──>
parsing direction ╷ ↑ ╷ ↑
└┄┄┘ └┄┄┘
Which is pretty useless for reading the `data_offset`, because by applying this direction to Diagram A you will get `0b{f7}{f6}{f5}...{f0}{f15}{f14}...{f8}{f23}{f22}...{f16}`, and that's certainly not what you want.
So you need to use the little-endian direction (indicated by `bit-endian: le` or `type: bXle`), which looks like this:
d[0] d[1]
7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 ...
f7 f6 f5 f4 f3 f2 f1 f0 f15 f14 f13 f12 f11 f10 f9 f8 ...
│ <─────────────────────────── ┃ │ <─────────────────────────── │ │
╷ parsing direction ↑
└┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈>┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┘
Using this method on Diagram A will finally give you `0b{f23}{f22}{f21}...{f2}{f1}{f0}` (you can imagine it like this: the first parsed bit is the least significant bit `{f0}` from `d[0].0`, the second is `{f1}` from `d[0].1`, ..., the 8th is `{f7}` from `d[0].7`, the 9th is `{f8}` from `d[1].0` and so on).
In short, this should work:
meta:
  bit-endian: le
  # ...
types:
  parameter:
    seq:
      # ...
      - id: data_offset
        type: b24
        doc: Offset to data, divided by 4 and relative to parameter start.
      - id: parameter_type
        type: b8
        enum: parameter_type
        doc-ref: https://zeldamods.org/wiki/AAMP#ParameterType
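The argument above can be checked with a small Python model of the two bit-reading directions. This is a simplified sketch of the behaviour described in the Kaitai user guide, not the runtime's actual implementation; the sample bytes and the function names are made up for illustration. It compares both readers against the original approach of reading a little-endian `u4` and splitting it with a mask and a shift.

```python
import struct
from typing import List, Sequence


def read_bits_be(data: bytes, widths: Sequence[int]) -> List[int]:
    """Read consecutive fields MSB-first (the default big-endian bit order)."""
    total_bits = len(data) * 8
    value = int.from_bytes(data, "big")
    fields = []
    consumed = 0
    for width in widths:
        shift = total_bits - consumed - width
        fields.append((value >> shift) & ((1 << width) - 1))
        consumed += width
    return fields


def read_bits_le(data: bytes, widths: Sequence[int]) -> List[int]:
    """Read consecutive fields LSB-first (what `bit-endian: le` selects)."""
    value = int.from_bytes(data, "little")
    fields = []
    consumed = 0
    for width in widths:
        fields.append((value >> consumed) & ((1 << width) - 1))
        consumed += width
    return fields


# Hypothetical 4-byte `data` field exactly as it would appear in the stream.
raw = bytes([0x56, 0x34, 0x12, 0x07])

# Reference result: read a little-endian u4, then split it by hand.
(data,) = struct.unpack("<I", raw)
expected_offset = data & 0xFFFFFF  # mask for the low 24 bits (0xFFFFFF, not 24)
expected_type = data >> 24         # high 8 bits

le_offset, le_type = read_bits_le(raw, [24, 8])
be_offset, be_type = read_bits_be(raw, [24, 8])
```

Under these assumptions the little-endian reader reproduces the hand-split values, while the big-endian reader returns the three offset bytes in stream order, which is the scrambled result described for Diagram A.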
  favorite:
    value: data_1 >> 14 & 1
    doc: Whether the Mii is a favorite or not.
  favorite_color:
    value: data_1 >> 10 & 15
    enum: favorite_colors
    doc: Favorite color. Ranges from 0 to 11.
  birth_day:
    value: data_1 >> 5 & 31
    doc: Mii birthday day, Ranges from 0 to 30
  birth_month:
    value: data_1 >> 1 & 15
    enum: months
    doc: Mii birthday month, Ranges from 0 to 11
  gender:
    value: data_1 & 1
    enum: genders
    doc: Mii gender. 0 = male, 1 = female.
Consider rewriting all such manual bit operations using the little-endian bit integers. For the `data_1` bitfield, it would mean replacing this
  - id: data_1
    type: u2
with this:
meta:
  bit-endian: le
seq:
  # ...
  - id: gender
    type: b1
    enum: genders
    doc: Mii gender
  - id: birth_month
    type: b4
    enum: months
    doc: Mii birthday month, Ranges from 0 to 11
  - id: birth_day
    type: b5
    doc: Mii birthday day, Ranges from 0 to 30
  - id: favorite_color
    type: b4
    enum: favorite_colors
    doc: Favorite color. Ranges from 0 to 11.
  - id: is_favorite
    type: b1
    doc: Whether the Mii is a favorite or not.
  - id: unused # I assume this bit is unused; if you consider it rather unknown, sure, omit the `id`
    type: b1
Read https://doc.kaitai.io/user_guide.html#bit-ints-le for more info.
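As a sanity check on the field order in that suggestion, the following Python sketch derives each field's shift from the running sum of the widths, which is how a least-significant-bit-first layout lines up with the manual `data_1 >> n & mask` expressions. It assumes `data_1` is stored as a little-endian 16-bit value; `DATA_1_FIELDS`, `decode_data_1` and the sample values are invented for this example.

```python
import struct

# Field names and widths in the order of the suggested seq, LSB first.
DATA_1_FIELDS = [
    ("gender", 1),
    ("birth_month", 4),
    ("birth_day", 5),
    ("favorite_color", 4),
    ("is_favorite", 1),
    ("unused", 1),
]


def decode_data_1(raw: bytes) -> dict:
    """Decode the 2-byte bitfield, assuming little-endian storage."""
    (data_1,) = struct.unpack("<H", raw)
    fields = {}
    shift = 0
    for name, width in DATA_1_FIELDS:
        # Each field starts where the previous one ended.
        fields[name] = (data_1 >> shift) & ((1 << width) - 1)
        shift += width
    return fields


# Made-up sample: gender 1, month 3, day 17, color 9, favorite flag set.
sample = 1 | (3 << 1) | (17 << 5) | (9 << 10) | (1 << 14)
decoded = decode_data_1(struct.pack("<H", sample))
```

The running shifts come out as 0, 1, 5, 10, 14 and 15, matching the shift amounts used by the hand-written instances.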
@HEYimHeroic, would this interfere with "Mii Studio Codes" at all?
I already did that in the Kaitai that I made. See my latest comment.
Getting cross references for these is very difficult, seeing as larsenv and HEYimHeroic were the only users I know of to be interested in documenting most of the formats before. I have been doing my best to add
"Out of scope for this PR" But still necessary. :/
You have my permission to use the Kaitais that I made for this PR. This could be better than the PR that I made for #356. I am fine with the AGPL v3 license.
Also applications for gen 1 mii data
Am I crazy or is this the opposite order to the way it's documented on 3dbrew?
OK, I haven't read the thread, but I agree using bitwise operators for the Kaitai is bad. I was the one who made the Kaitai, and I did that because bit-endian didn't exist at the time. I fixed the Kaitai though after 0.9 came out, and made PR #356 for it, which never got merged. See https://github.com/larsenv/kaitai_struct_formats/blob/d098a15570421aa29b2770929235bb37ecf7f843/game/WiiU3DSMiitomo_miidatafile.ksy Maybe it would be worth starting over and using what I made. I tested it and it works better.
I've done a lot of work on documentation and rewriting the Gen 2 data today (yes, I copied over the data from larsen's newer version.), but it's not complete, and probably not even working yet. I really should be going to sleep earlier, so I'm not going to commit my changes right now. I'll continue tomorrow.
@halotroop2288 Where did this PR get stuck? Does the commit message of your last revision (267bcfb):
mean that you're stuck somewhere and need help?
@generalmimon Yes and no. I have a hard time committing to tasks for a long time. I was having issues and also just getting bored of code, so I took a break. And usually when I take a break from a project, I never return. But I always hope to return to old projects. In this case, I'd rather pass it off to someone if it means this PR won't go stale, otherwise, we'll see how long it takes for me to want to come back to this.
#355 contains review comments that are also relevant for changes in this PR (but haven't been pointed out or addressed here).
So close, yet so far. It doesn't look like this will ever be finished. I didn't dedicate any of my recent "programming mood" time to this PR, and I believe @HEYimHeroic will not be commenting on the license. Maintainers may close this at their leisure, otherwise it may become stale. It is also open to edits by maintainers if they so choose. I am turning my focus onto STFS format for my next PR.
The authors are never going to respond.
@larsenv allowed #356 to age for a couple months, and I've been chatting with @HEYimHeroic about Miis and working on my own Mii editor, so I decided to take a crack at this and make my own pull request.
This PR includes updated versions of the specs from both #355 and #356, making them unnecessary unless they are accepted before this one.
This addresses a couple of the issues brought up before, but maybe not all of them, and may have added more of its own.
I started adding documentation, but haven't finished it as of making this pull request. Consider it a preview.
I'm very new to this, so be gentle, please. Any pointers you can give would be appreciated.
(This is the same as #399 but aimed at the correct branch of my repo.)