Error in the assign_grnas() step #162

Open
Hyunsoo7268 opened this issue Dec 19, 2024 · 6 comments

Comments

@Hyunsoo7268

Hyunsoo7268 commented Dec 19, 2024

Dear awesome sceptre team,

First of all, thank you so much for developing this amazing tool.

Second, I've recently been learning and testing sceptre to analyze my single-cell CRISPR screen data from the 10x Genomics platform.
I read and followed the sceptre tutorial, which went very well with the sample data provided by sceptredata.
After that, I tried the sceptre workflow on my 10x Genomics data, but I got stuck at the assign_grnas() step, which is exactly the step I need!
My goal of using sceptre is to only categorize my cells into cells with targeting sgRNAs (perturbed cells) and cells with non-targeting sgRNAs (NT cells).
With the information, I can proceed to Seurat and do whatever analysis I need.

So, let me share what I did and what error I faced below.

1. Import data

grna_target_df <- read.csv("grna_target_df.csv")
sceptre_object <- import_data_from_cellranger(
  directories = "filtered_feature_bc_matrix",
  moi = "high",
  grna_target_data_frame = grna_target_df
)

sceptre_object

discovery_pairs <- construct_trans_pairs(
  sceptre_object = sceptre_object,
  pairs_to_exclude = "none"
)
head(discovery_pairs)
dim(discovery_pairs) # 77212 2

side <- "both"

sceptre_object <- set_analysis_parameters(
  sceptre_object = sceptre_object,
  discovery_pairs = discovery_pairs,
  side = side
)

2. Assign gRNAs to cells

a <- plot_grna_count_distributions(sceptre_object)
jpeg("1_gRNA_CountDistributions.jpg", width = 12, height = 6, units = "in", res=200)
a
dev.off()

sceptre_object <- assign_grnas(sceptre_object = sceptre_object, parallel = TRUE)
print(sceptre_object)

a <- plot(sceptre_object)
jpeg("2_gRNA_assignment.jpg", width = 12, height = 6, units = "in", res=200)
a
dev.off()

The error occurs at the a <- plot(sceptre_object) line.
R returned
"Error in sample.int(length(x), size, replace, prob) :
cannot take a sample larger than the population when 'replace = FALSE'".

I don't really understand what it means and what process I did wrong, so I wish to ask you for some advice on this.

For your information, let me also share what my data look like.
[Two screenshots of the data]

Really really wish I could have some help from you~~~

Thanks!

@ekatsevi
Member

It appears that the gRNA assignment to cells worked, and that that error occurred only in trying to visualize the results. Therefore, if what you want are just the gRNA assignments, you can export those via get_grna_assignments() and proceed with your analysis.

To help us better understand what happened with the visualization function, could you please run the following code on your sceptre_object (after running assign_grnas()), and let me know the output?

  init_assignments <- sceptre_object@initial_grna_assignment_list
  grna_matrix <- get_grna_matrix(sceptre_object) |> sceptre:::set_matrix_accessibility(make_row_accessible = TRUE)
  grna_ids <- names(init_assignments)
  assigned <- vapply(init_assignments, length, FUN.VALUE = integer(1)) >= 1
  grna_ids <- grna_ids[assigned]
  nrow(grna_matrix)
  grna_ids
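
To make the intent of that diagnostic concrete, here is a minimal sketch in base R on toy data. It assumes the initial assignment list is a named list mapping each gRNA ID to the indices of the cells it was assigned to; the ID "NT-1" and the cell indices are hypothetical and not taken from the real object.

```r
# Toy stand-in for sceptre_object@initial_grna_assignment_list (hypothetical data):
# a named list mapping each gRNA ID to the indices of its assigned cells.
init_assignments <- list(
  "RAB1A-2" = c(3L, 17L, 42L),
  "NT-1"    = integer(0)  # hypothetical gRNA assigned to no cells
)

# Number of cells assigned to each gRNA.
n_cells_per_grna <- vapply(init_assignments, length, FUN.VALUE = integer(1))
print(n_cells_per_grna)

# Split the gRNA IDs into those with at least one assigned cell and those with none.
assigned <- n_cells_per_grna >= 1
assigned_grna_ids <- names(init_assignments)[assigned]
unassigned_grna_ids <- names(init_assignments)[!assigned]
print(unassigned_grna_ids)
```

A gRNA that shows up in unassigned_grna_ids is one that was not assigned to any cell, which is worth checking before plotting.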

@Hyunsoo7268
Author

Hyunsoo7268 commented Jan 4, 2025

Thank you so much for taking a look at my situation, ekatsevi!
And I'm really sorry for my delayed response!!

I suspect the gRNA assignment did not work well, because init_assignments shows only 20 cell indices (out of 11686 cells) under the targeting gRNA (RAB1A-2).
As you advised, let me share what I got after running the code you suggested!

init_assignments <- sceptre_object@initial_grna_assignment_list
[screenshot of output]

grna_matrix <- get_grna_matrix(sceptre_object) |> sceptre:::set_matrix_accessibility(make_row_accessible = TRUE)
[screenshot of output]

grna_ids <- names(init_assignments)
[screenshot of output]

assigned <- vapply(init_assignments, length, FUN.VALUE = integer(1)) >= 1
[screenshot of output]

grna_ids <- grna_ids[assigned]
[screenshot of output]

nrow(grna_matrix)
[screenshot of output]

grna_ids
[screenshot of output]

The code you suggested ran without any errors, but I'm still not sure whether the gRNA assignment worked correctly.
If you need any other information to find the problem, please let me know anytime!

Thanks again!

Hyunsoo

@ekatsevi
Member

ekatsevi commented Jan 5, 2025

Hi Hyunsoo,

Thanks for following up. I now know what caused the error in the plotting code, and this is something we can fix fairly easily. However, as you suggest, there seem to be some other strange things going on with your data and/or our software: your targeting gRNA was detected in only 20 cells, and your non-targeting gRNA was detected in no cells. This is quite unusual. (Incidentally, the fact that you have a gRNA not detected in any cells is what caused the error in our plotting code.) At this stage, to determine whether this is caused by your data or our software, I would need to look at your data directly. Would you mind sharing your data with me, either via email ([email protected]) if it is under 20MB or so, or via a service like Google Drive if it is larger than that? I would keep your data confidential and delete it on my end once I have finished diagnosing your issue.

Best,
Gene
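
For readers who hit the same message, the base R behavior behind it can be reproduced without sceptre. The sketch below is only an illustration of sample() on an empty vector; it is not sceptre's plotting code, and the helper draw_up_to() is a hypothetical guard, not part of any package.

```r
# Indices of cells assigned to a gRNA that was detected in no cells (toy data).
cells_with_grna <- integer(0)

# Requesting more draws than there are elements fails when replace = FALSE
# (the default). tryCatch() captures the error message instead of stopping.
result <- tryCatch(
  sample(cells_with_grna, size = 5),
  error = function(e) conditionMessage(e)
)
print(result)

# Hypothetical guard: return at most `size` elements, and nothing for empty input.
draw_up_to <- function(x, size) {
  if (length(x) == 0) {
    return(x)
  }
  x[sample.int(length(x), min(size, length(x)))]
}

print(length(draw_up_to(cells_with_grna, 5)))
print(length(draw_up_to(1:3, 5)))
```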

@Hyunsoo7268
Author

Hi Gene,

Thanks again for your efforts to help me!
Because the data size is over 20MB, let me share the data through Google Drive.
You can download it using the link below:
https://drive.google.com/drive/folders/1MvjTxxatxHAc2BaSVr0MOyi1Jh-1BA7H?usp=sharing

For your information, the data I'm using to learn about sceptre is from 10x Genomics.
If needed, please refer to a link below that describes the data briefly!
https://www.10xgenomics.com/datasets/10-k-a-375-cells-transduced-with-1-non-target-and-1-target-sg-rna-dual-indexed-3-1-standard-4-0-0

(For your information again, the estimated cell count of my data is 11686 whereas that by 10x Genomics is 11791. This is because I downloaded fastq files from 10x Genomics and ran cellranger by myself!)

If you need any other information, please feel free to ask me anytime!

Thanks!

Hyunsoo

@ekatsevi
Member

ekatsevi commented Jan 8, 2025

Hi Hyunsoo,

I've looked at the data, and I've reproduced what you are seeing: very few cells have gRNAs assigned to them. My current theory is that something about the gRNA assignment method (the default, which is the mixture method) is going wrong for your data, because you have so few gRNAs. We will dig into this more when we have a chance. In the meantime, I recommend you use an alternative gRNA assignment method, the thresholding method. You can eyeball the gRNA count histograms produced by plot_grna_count_distributions() to see that a gRNA UMI count of 10 seems like a good threshold for calling a cell as containing a gRNA. I recommend you try this out via

sceptre_object <- assign_grnas(sceptre_object = sceptre_object, method = "thresholding", threshold = 10)

With this method, nearly all cells have one or more assigned gRNAs. You can then proceed with your analysis. If and when we figure out why the mixture gRNA assignment method failed in this case, we will get back to you.

Best,
Gene
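
As a rough illustration of the thresholding idea and of the perturbed-versus-NT labeling mentioned at the top of this thread, here is a sketch in base R on a toy count matrix. It assumes a gRNA is assigned to a cell when its UMI count is greater than or equal to the threshold; the counts, the ID "NT-1", and the cell names are hypothetical, and this is not sceptre's implementation.

```r
# Toy gRNA-by-cell UMI count matrix (hypothetical values).
grna_counts <- matrix(
  c(25L,  0L, 3L, 40L,
     0L, 31L, 2L, 12L),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("RAB1A-2", "NT-1"), paste0("cell_", 1:4))
)

# Assumed rule: a gRNA is assigned to a cell if its UMI count >= threshold.
threshold <- 10L
assignment <- grna_counts >= threshold

# Which cells carry a targeting gRNA, and which carry a non-targeting gRNA.
targeting_ids <- "RAB1A-2"
nt_ids <- "NT-1"
has_targeting <- colSums(assignment[targeting_ids, , drop = FALSE]) > 0
has_nt <- colSums(assignment[nt_ids, , drop = FALSE]) > 0

# Label each cell; cells with both kinds of gRNA get their own label.
cell_label <- ifelse(has_targeting & has_nt, "both",
              ifelse(has_targeting, "perturbed",
              ifelse(has_nt, "NT", "unassigned")))
names(cell_label) <- colnames(grna_counts)
print(cell_label)
```

A vector like cell_label could then be attached to the cell metadata of a downstream object, keyed by cell barcode.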

@Hyunsoo7268
Author

Hi Gene,

Thank you so much for your super quick check and advice!

I haven't tried the thresholding method yet but will do so soon.
I believe it will work, since you checked my data yourself.

As you suggested, I'll use this method while waiting for you and your development team to resolve the initial issue.

Again, thank you very much for your professional help!

Hyunsoo
