Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrong iterator handling for user types in repeat-until. #134

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ams-tschoening
Copy link
Contributor

I have the following KSY which broke in Ruby and Java because the special iterator _ has not been processed properly. In Ruby it wasn't declared early enough and is therefore not known during user type invocation and in Java it wasn't initialized properly, leading to a compile time error. This PR fixes both.

I did the change in RubyTranslator because I wasn't sure if translating _ is missing there by accident or purpose and came across it. Feel free to change that if it was wrong.

KSY:

  - id:       vdbs
    type:     fmt_oms_apl_full_vdb(vdbs.size - 1, _)
    repeat:       until
    repeat-until: _.next_isnt_vdb
    if:       not   next_isnt_vdb

Wrong Ruby:

  if !(next_isnt_vdb)
    (@_debug['vdbs'] ||= {})[:start] = @_io.pos
    @vdbs = []
    i = 0
    begin
      (@_debug['vdbs'][:arr] ||= [])[@vdbs.size] = {:start => @_io.pos}
      _t_vdbs = FmtOmsAplFullVdb.new(@_io, self, @_root, (vdbs.length - 1), _)
      _t_vdbs._read
      _ = _t_vdbs
      @vdbs << _
      @_debug['vdbs'][:arr][@vdbs.size - 1][:end] = @_io.pos
      i += 1
    end until _.next_isnt_vdb
    (@_debug['vdbs'] ||= {})[:end] = @_io.pos
  end

Wrong Java:

FmtOmsAplFullVdb _it;
int i = 0;
do {
	_it = new FmtOmsAplFullVdb(this._io, this, _root, (vdbs().size() - 1), _it);
	this.vdbs.add(_it);
	i++;
} while (!(_it.nextIsntVdb()));

Working Ruby:

  if !(next_isnt_vdb)
    (@_debug['vdbs'] ||= {})[:start] = @_io.pos
    @vdbs = []
    _it = nil
    i = 0
    begin
      (@_debug['vdbs'][:arr] ||= [])[@vdbs.size] = {:start => @_io.pos}
      _t_vdbs = FmtOmsAplFullVdb.new(@_io, self, @_root, (vdbs.length - 1), _it)
      _t_vdbs._read
      _it = _t_vdbs
      @vdbs << _it
      @_debug['vdbs'][:arr][@vdbs.size - 1][:end] = @_io.pos
      i += 1
    end until _it.next_isnt_vdb
    (@_debug['vdbs'] ||= {})[:end] = @_io.pos
  end

Working Java:

FmtOmsAplFullVdb _it = null;
int i = 0;
do {
    _it = new FmtOmsAplFullVdb(this._io, this, _root, (vdbs().size() - 1), _it);
    this.vdbs.add(_it);
    i++;
} while (!(_it.nextIsntVdb()));

@GreyCat
Copy link
Member

GreyCat commented Mar 29, 2018

Thanks! I'll try to add the relevant test first and check if this problem existing with other languages.

Also, why are you using vdbs.size and not _index?

@ams-tschoening
Copy link
Contributor Author

Also, why are you using vdbs.size and not _index?

Simply didn't think of that, thanks for the catch!

@GreyCat
Copy link
Member

GreyCat commented Mar 31, 2018

After reviewing it again, I'm kind of unsure how to treat it. _ is supposed to be undefined during the loop invocation, and it's not very clear what to do with it if we'll be using it in:

  • size
  • if
  • process arguments

Even with type parameters, it's not very clear what one should expect.

Right now you're kind of using the fact that _ still holds collection item from the previous iteration of the loop, and somehow that helps it. It's not guaranteed for all implementations and all languages. If we'll ban _ everywhere except for repeat-until, probably it won't hurt that much. Using _index (passing it as a parameter into the invoked type, if necessary, as in your example), one can access i-1-th, i-2-th, or basically any other items of the collection.

Setting _ to null is actually not a very good solution also because some languages (C++ and Java included) do not allow nullable primitive types, i.e. you can't assign null to an int, and it's possible to have repeat-until loop with type: u4 or something like that. This PR actually breaks such cases by trying to assigning null to such type.

ams-tschoening pushed a commit to ams-ts-kaitai/compiler that referenced this pull request Apr 1, 2018
…d for KS and its support for other languages and use-cases than I expected. I was able to rewrite my KS for now because of the usage of FMPP, so don't rely in this fix anymore. this way I'm more back in line with master again.
@ams-tschoening
Copy link
Contributor Author

Using _index (passing it as a parameter into the invoked type, if necessary, as in your example), one can access i-1-th, i-2-th, or basically any other items of the collection.

I think not in my use case and that's why I used _: The problem I initially had was multiple different types which ultimately used one and the same type which needed to implement some special workaround. That ultimate type needed to know if to apply the workaround or not and couldn't without looking at "parent" types. Because those parents where different, they have been resolved to KataiStruct and I couldn't properly check things anymore.

It's something around the following:

A -> B[1..n of C] -> C -> V
X -> Y[1..n of Z] -> Z -> V

V is the common type implementing a workaround for some bug, B and Y are collections implemented using repeat-until and the individual items are C and Z. Those two are ultimately using V and therefore are differently typed parents for V. From my understanding, I don't easily know in which context V is used in this scenario, so thought of checking what I need to know for V in C and Z already. That check depends on the former elements in the collections and I therefore simply forwarded the last element to my check.

The workaround I'm implementing is VERY specific to my data and use case, but as it was still doable in KS, I'm doing it there to in theory benefit in all supported target languages. So you might perfectly come to the conclusion that it's not worth support in KS at all.

OTOH, in my opinion it was obvious to assume that _ should support my use case to contain the last element, because it was available in repeat-until already. So a more general discussion about what exactly to support when and why, like you started in kaitai-io/kaitai_struct#404, seems like a good idea.

Additionally, I had another look at my implementation and am pretty sure that I can rework my current solution and that special use case to not rely on _ being a last element of the iteration anymore. So I might be able to close this PR in favour of the more general discussion.

@ams-tschoening
Copy link
Contributor Author

I've reworked my types to not rely on the fixes of this PR anymore. @GreyCat, feel free to close it if you think this needs a more thorough discussion on how to support things how/when. If you close it, have a second look at the following line, because using Identifier.ITERATOR instead of _ seems to be a useful change.

5da3ba9#diff-fc2acab10340febf58569b1517b3d4b5L392

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants