Review Mandatory information for GlyGen glycomics CSV files #78

Open
ReneRanzinger opened this issue Aug 4, 2024 · 3 comments

@ReneRanzinger
Member

@Mindyporterfield and @senaarpinar are finalizing a tool for generating GlyGen CSV files for glycomics data. Please review the following ticket (especially the second half with the mandatory columns and the assumptions) and let us know if you agree or have alternative suggestions.

This is based on the CSV file definition you provided a while ago:

https://gwu0.sharepoint.com/:w:/r/sites/GlyGenTeam-GRP/_layouts/15/Doc.aspx?sourcedoc=%7B27A34C21-ADB9-4D85-A96D-4B7EB40A103A%7D&file=Mindy%20paper%20curation.docx&action=default&mobileredirect=true

@ReneRanzinger
Member Author

@jeet-vora will share the example file with GW and decide if and what changes are needed.

@jeet-vora

The protein table is under the heading `Table structure for glycoprotein information` in this Word document.

@ReneRanzinger @katewarner @ubhuiyan @Mindyporterfield Please review. We also need to clean up and arrange the document once it is final.

@ReneRanzinger
Member Author

The expression part is undefined and will be ignored for now. There is also an issue with the relationship between the two expressions. Each row represents a glycosylation event, so the protein expression has nothing to do with an individual row. It is rather the "sum" of all rows with the protein plus the abundance of the non-glycosylated version.

For the glycan expression, it is not clear what this means. If it is the abundance of a certain glycan, it again has nothing to do with an individual row but is rather the sum of all occurrences of this glycan on all sites. If it is something like "20% of the glycans on this site are this glycan", we have two problems. First, I am not sure how the user can enter this without lots of overhead. Second, we often have this kind of information only in the form of "20% high mannose and 80% complex N-glycans", and these would not have a GlyTouCan ID.

The last point raises the question of how to deal with this type of data. It's even "less" than a composition; it's just a glycan type.
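
To make the aggregation argument concrete, here is a minimal sketch of the three readings of "expression" discussed above. Everything in it is an assumption for illustration: the column names (`protein`, `site`, `glycan`, `abundance`), the accessions, and the numbers are hypothetical and not part of the agreed CSV definition. The protein total follows the loose "sum" reading from the comment, not a settled quantification method.

```python
from collections import defaultdict

# Hypothetical rows, one per glycosylation event (illustrative values only).
rows = [
    {"protein": "P1", "site": 10, "glycan": "G_A", "abundance": 2.0},
    {"protein": "P1", "site": 10, "glycan": "G_B", "abundance": 6.0},
    {"protein": "P1", "site": 25, "glycan": "G_A", "abundance": 3.0},
    {"protein": "P2", "site": 7, "glycan": "G_A", "abundance": 1.0},
]

# Hypothetical abundance of the non-glycosylated form of each protein.
unglycosylated = {"P1": 4.0, "P2": 0.5}


def protein_expression(rows, unglycosylated):
    """Per protein: all of its rows plus the non-glycosylated abundance."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["protein"]] += row["abundance"]
    for protein, abundance in unglycosylated.items():
        totals[protein] += abundance
    return dict(totals)


def glycan_abundance(rows):
    """Per glycan: all occurrences across every protein and site."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["glycan"]] += row["abundance"]
    return dict(totals)


def site_fractions(rows):
    """Per (protein, site): the share of each glycan at that site."""
    site_totals = defaultdict(float)
    for row in rows:
        site_totals[(row["protein"], row["site"])] += row["abundance"]
    fractions = defaultdict(dict)
    for row in rows:
        key = (row["protein"], row["site"])
        fractions[key][row["glycan"]] = row["abundance"] / site_totals[key]
    return dict(fractions)
```

None of the three results belongs to a single row: the first two are aggregates over many rows, and the third is a property of a site. A glycan-type statement such as "20% high mannose" would fit only the third shape, with a type label in place of a glycan identifier.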
