Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Construct: use lambda this: for ternary expressions #246

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Mimickal
Copy link
Contributor

Construct does not support using conditionals in this expressions. Currently, the compiler outputs ternary expressions like this: ((this.len + 1) if this.len > 0 else 0). That expression does not evaluate the way you might expect. I expect that when this.len is 0, the value supplied to Construct is also 0. The value that actually gets supplied currently is 1.

I've cooked up a practical demo of the problem.

I'm using this ksy:

meta:
  id: ternary
seq:
  - id: len
    type: u1
  - id: data
    type: str
    size: 'len > 0 ? len + 1 : 0'
    encoding: ASCII

This test script contains the Construct struct the compiler outputs before and after this change. It feeds several values through the old and new struct to demonstrate the behavior I'm correcting

# Demonstrates that Construct requires a lambda to access `this` in conditional contexts
from construct import *
from construct.lib import *

# Output from ksc before this change
ternary_old = Struct(
	'len' / Int8ub,
	'data' / FixedSized(((this.len + 1) if this.len > 0 else 0), GreedyString(encoding='ASCII')),
)

# Output from ksc after this change
ternary_new = Struct(
	'len' / Int8ub,
	'data' / FixedSized((lambda this: (this.len + 1) if this.len > 0 else 0), GreedyString(encoding='ASCII')),
)

for val in [
    b'\x00',  # Old fails because it still expects 1 char even though it
              # should expect 0. New handles this properly.
    b'\x01a', # Both fail, since we expect 2 chars but only get 1.
    b'\x01aa' # Both succeed, since we expect 2 chars and get 2 chars.
]:
    try:
        print('old', val, ternary_old.parse(val), '\n')
    except Exception as e:
        print('old', val, str(e), '\n')

    try:
        print('new', val, ternary_new.parse(val), '\n')
    except Exception as e:
        print('new', val, str(e), '\n')

This script produces the following output:

old b'\x00' Error in path (parsing) -> data
stream read less than specified amount, expected 1, found 0 

new b'\x00' Container: 
    len = 0
    data = u'' (total 0) 

old b'\x01a' Error in path (parsing) -> data
stream read less than specified amount, expected 2, found 1 

new b'\x01a' Error in path (parsing) -> data
stream read less than specified amount, expected 2, found 1 

old b'\x01aa' Container: 
    len = 1
    data = u'aa' (total 2) 

new b'\x01aa' Container: 
    len = 1
    data = u'aa' (total 2) 

@@ -6,6 +6,9 @@ import io.kaitai.struct.format.Identifier
import io.kaitai.struct.ConstructClassCompiler

class ConstructTranslator(provider: TypeProvider, importList: ImportList) extends PythonTranslator(provider, importList) {
override def doIfExp(condition: Ast.expr, ifTrue: Ast.expr, ifFalse: Ast.expr): String =
s"(lambda this: ${translate(ifTrue)} if ${translate(condition)} else ${translate(ifFalse)})"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this is fundamentally wrong - it only works if the ternary is the top level operation in the expression (which is a nonsensical assumption, because it certainly does not hold in practice, maybe just coincidentally). Even something as simple as

   - id: data
     type: str
-    size: 'len > 0 ? len + 1 : 0'
+    size: '(len > 0 ? len + 1 : 0) + 1'
     encoding: ASCII

or nested ternaries (please leave aside that len was unsigned and can't be negative):

   - id: data
     type: str
-    size: 'len > 0 ? len + 1 : 0'
+    size: 'len > 0 ? len + 1 : (len < 0 ? 1 : 0)'
     encoding: ASCII

will not work. So I don't see much point in accepting this PR as it stands.

Besides, as you describe the issue, it is not specific to the ternary operation, is it? It seems likely to me that there are other expression features that require the same type of handling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All fair points. As far as I can tell, this is an issue anywhere this is used in combination with logic when building a Construct struct. I'm not sure what other things would trigger this behavior other than ternaries. I do see how this would break with nested ternaries though, so I'll figure out how to avoid that scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants