Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xdrill functions for ledger entry changes #3

Open
wants to merge 4 commits into
base: 5551/xdrill-operations
Choose a base branch
from

Conversation

chowbao
Copy link
Owner

@chowbao chowbao commented Jan 29, 2025

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've reviewed the changes in this PR and if I consider them worthwhile for being mentioned on release notes then I have updated the relevant CHANGELOG.md within the component folder structure. For example, if I changed horizon, then I updated (services/horizon/CHANGELOG.md. I add a new line item describing the change and reference to this PR. If I don't update a CHANGELOG, I acknowledge this PR's change may not be mentioned in future release notes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

stellar#5552

xdrill functions for ingest.Changes

Why

Create low level helper functions for ledger entry changes parsing

Known limitations

  • Refactor of the processor/transforms to be done in separate ticket/pr

ingest/change.go Outdated
return ledgerKey.MarshalBinaryBase64()
}

func (c Change) EntryDetails(passphrase string) (map[string]interface{}, error) {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restructured the ledger entry changes helper function to behave like the LedgerOperations helper function where the output from a given entry type will populate a details[] map.

How does this look? Are there any other preferred alternatives?

I'll write tests when we agree on a format

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be an opportunity to have these be strictly typed instead of an opaque interface{} to interpret. Maybe not worth it? It'd be a looooot of work, so probably not if people are used to consuming/seeing this variant.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would like to make it strictly typed but it definitely isn't worth it at this point. Could be a good new eng task though 🤔

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm actually how would strict typing work in this case?

I'm assuming I would need to create a struct for each entryDetail or would it be like a giant struct for all entryDetails?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 691bfd8

So instead of the generic interface{} each ledgerentry will now return a relevant struct. This seemed like the best way to strictly type the results but I'm open to other options if there are any

Copy link

@Shaptic Shaptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just drivin' by 🏎️

ingest/change.go Outdated Show resolved Hide resolved
Comment on lines +341 to +342
default:
return xdr.LedgerEntry{}, changeType, false, fmt.Errorf("unable to extract ledger entry type from change")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also LedgerEntryChangeTypeLedgerEntryState which should probably be handled here instead of bailing out, no? Then no error case at all.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's good to know. I think stellar-etl code has been skipping LedgerEntryChangeTypeLedgerEntryState on purpose (or accidentally 😨)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the bool for?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ok bools are usually used to signify if the returned value is actually valid or not. And it being not valid is not an error.

For example think of soroban fees. If you want to get soroban fees for a classic transaction you'd return 0 for the fee value but also false for the bool because there is no real value for the fee

Copy link

@amishas157 amishas157 Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Ideally it should be null but also go weirdly handles null values. So this is the way of handling it. 👍

case xdr.LedgerEntryChangeTypeLedgerEntryCreated, xdr.LedgerEntryChangeTypeLedgerEntryUpdated:
return *c.Post, changeType, false, nil

What's the reason of setting false bool in this case? Isn't it returning valid value?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry in this case the bool is to signify if the entry was deleted or not

ingest/change.go Outdated
return ledgerKey.MarshalBinaryBase64()
}

func (c Change) EntryDetails(passphrase string) (map[string]interface{}, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be an opportunity to have these be strictly typed instead of an opaque interface{} to interpret. Maybe not worth it? It'd be a looooot of work, so probably not if people are used to consuming/seeing this variant.

Comment on lines +425 to +427
if err != nil {
return details, err
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can skip all of these checks by just returning details, err at the end since err will be set accordingly

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense. Is that good coding practice though? I thought it's generally good to return/err when it happens rather than at the end of a function. BUT these are also really small functions anyways so it's probably not a big deal

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends who you ask 😆 imo omitting it can show "all this function does is switch/case + parse + return"

I also hate Go's pedantry for error verbosity so I like to shorten it when I can

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah just golang things 😢

ingest/change.go Outdated
Sponsor string
}

func AccountDetails(accountEntry *xdr.AccountEntry) (map[string]interface{}, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These parsers feel like they belong in a separate file

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought about doing that. Example for account

In the example I split it out into separate files (also gave each data element it's own function). It does seem to clutter the directory though. Maybe that's not a big deal?

Also I wonder is it worth doing the same split for each operation as well? 🤔 Both are pretty large files

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's too cluttery you could have a parsers/ or equivalent subdirectory. At the very least I'd dump everything below this line into a parsedetails.go (or similar) file 🤷

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'll split it into separate files. Nicer on the eyes

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 691bfd8

Comment on lines 18 to 21
// MarshalBinaryBase64 marshals XDR into a binary form and then encodes it
// using base64.
func (c ClaimableBalanceId) MarshalBinaryBase64() (string, error) {
b, err := c.MarshalBinary()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant given you can do:

var id xdr.ClaimableBalanceId
str, err := xdr.MarshalBase64(&id)

At the very least this method should use that instead of repeating a xdr -> bytes -> base64 code path.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't realize there was an xdr.MarshalBase64() function. I split it out into xdr.* specific marshal base64 because that's what I saw existing for some other xdr structs (probably old and was just never refactored). I'll update

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah those are outdated I bet. They're convenient because of intellisense suggesting it on the object itself, but even then, they'd belong in xdrgen rather than handwritten.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 08a42b7

@chowbao
Copy link
Owner Author

chowbao commented Jan 30, 2025

just drivin' by 🏎️

Why does this make me think of Fast and Furious lol

@chowbao chowbao mentioned this pull request Feb 3, 2025
7 tasks
Copy link

@sydneynotthecity sydneynotthecity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR contains two of the material differences in schema between Horizon and Hubble:

  1. Offers containing a buying and selling asset (vs base and counter),
  2. Liquidity pools flattening the Reserves map into AssetA and AssetB

It may be worthwhile to ask some opinions of the platform team during team meeting to make sure there is consensus on this schema choice.

if err != nil {
return details, err
}
default:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing ConfigSetting ledger entry type, but dunno if you're intentionally leaving off

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did leave that off intentionally. But you're right I should just add it and make it do nothing instead of leaving it completely out

AssetCode string `json:"asset_code"`
AssetIssuer string `json:"asset_issuer"`
AssetType string `json:"asset_type"`
AssetID int64 `json:"asset_id"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing where you parse AssetId in ClaimableBalanceDetails. Can we just drop?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup this is a mistake. I will drop AssetID

AssetIssuer string `json:"asset_issuer"`
AssetType string `json:"asset_type"`
BalanceHolder string `json:"balance_holder"`
Balance string `json:"balance"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dumb question: why is this balance a string and not any other balances? Are we not worried about int overflow for the other amounts?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a string because the Balance is a big.Int (not sure why just this balance is big.Int though might just be because how we defined smart contracts)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, ok. TIL that only soroban contract balance is a big.Int

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps one day we'll materialize this in hubble

Comment on lines +13 to +20
AssetAType string `json:"asset_a_type"`
AssetACode string `json:"asset_a_code"`
AssetAIssuer string `json:"asset_a_issuer"`
AssetAReserve int64 `json:"asset_a_amount"`
AssetBType string `json:"asset_b_type"`
AssetBCode string `json:"asset_b_code"`
AssetBIssuer string `json:"asset_b_issuer"`
AssetBReserve int64 `json:"asset_b_amount"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if @Shaptic or others have thoughts on this proposed schema as it's a departure on how Horizon parses this ledger entry. Like? Dislike?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine, but amounts need to be strings and I feel like the structure could be a little less flat - both the JSON and the struct are a little gross to read. It'd be nice if assets had a generic structure we re-use everywhere, e.g. { type, code, issuer }

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amounts need to be strings

Ah cause they are all big ints (or whatever the larger int is called)?

both the JSON and the struct are a little gross to read.

Oh from the OLAP realm I actually prefer all flattened lol. I can see JSOn being more flexible for other use cases though. We can discuss with the team if they have opinions

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because JS sucks and can't represent 64 bit ints as, well, ints.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WAT

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh from the OLAP realm I actually prefer all flattened

+1 we would still unpack in BQ just because flattened data structure is nicer for normies. But i do think @Shaptic is right that others prefer a consistent non-flattened object for asset details

import "github.com/stellar/go/xdr"

type Ttl struct {
LiveUntilLedgerSeq uint32

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing the key hash, no?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was intentionally left off (same with contract_data and contract_code). This is because the ledger_key_hash actually comes from the Change instead of the LedgerEntry and hence you can get LedgerKeyHash for anything (not only smart contract related entries) with LedgerKeyHash()

BUT if this seems clunky I don't mind adding the LedgerKeyHash() to these structs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave it as it is, the struct just seemed weirdly empty

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants