Hello,
I'm currently running your pipeline on a dataset (~15K cells), and everything ran perfectly until I tried to run the prune2df function. I've checked the other issues related to this function but haven't been able to solve the problem.
There are 19812 genes in the expression matrix.
Here are the variables and the call to the function:
dbs
[FeatherRankingDatabase(name="hg38__refseq-r80__10kb_up_and_down_tss.mc9nr"), FeatherRankingDatabase(name="hg38__refseq-r80__500bp_up_and_100bp_down_tss.mc9nr")]
len(modules)
8432
MOTIF_ANNOTATIONS_FNAME
'/home/jpromero/Data/PyScenic/Resources/motifs-v9-nr.hgnc-m0.001-o0.0.tbl'
df = prune2df(dbs, modules, MOTIF_ANNOTATIONS_FNAME, num_workers=6)
A lot of warnings appear while the script runs, for example:
2020-02-11 13:27:32,745 - pyscenic.transform - WARNING - Less than 80% of the genes in Regulon for RELB could be mapped to hg38__refseq-r80__10kb_up_and_down_tss.mc9nr. Skipping this module.
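For context, the warning above says that fewer than 80% of a module's genes were found in the ranking database. Below is a minimal sketch of that kind of coverage check, not pySCENIC's actual code. The function name `mapped_fraction` and the gene lists are hypothetical; only the 80% threshold comes from the warning text.

```python
def mapped_fraction(module_genes, db_genes):
    """Return the fraction of module genes that are present in the database."""
    module_genes = set(module_genes)
    if not module_genes:
        return 0.0
    return len(module_genes & set(db_genes)) / len(module_genes)


# Hypothetical gene symbols, for illustration only.
db_genes = {"RELB", "NFKB1", "NFKB2", "TRAF1", "BIRC3"}
module_genes = ["RELB", "NFKB2", "TRAF1", "LOC123", "AC0001.1"]

fraction = mapped_fraction(module_genes, db_genes)

# Threshold taken from the warning text ("Less than 80% of the genes ...").
skip_module = fraction < 0.80
print(f"mapped fraction: {fraction:.2f}, skip module: {skip_module}")
```

A check like this, run against the gene names of each database, could show whether the skipped modules use gene symbols that differ from the ones in the database.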
At some point it then throws the following error:
Traceback (most recent call last):
File "", line 1, in
File "/home/jpromero/PythonLib/pyscenic/prune.py", line 351, in prune2df
num_workers, module_chunksize)
File "/home/jpromero/PythonLib/pyscenic/prune.py", line 300, in _distributed_calc
return create_graph().compute(scheduler='processes', num_workers=num_workers if num_workers else cpu_count())
File "/home/jpromero/PythonLib/dask/base.py", line 165, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/home/jpromero/PythonLib/dask/base.py", line 436, in compute
results = schedule(dsk, keys, **kwargs)
File "/home/jpromero/PythonLib/dask/multiprocessing.py", line 222, in get
**kwargs
File "/home/jpromero/PythonLib/dask/local.py", line 486, in get_async
raise_exception(exc, tb)
File "/home/jpromero/PythonLib/dask/local.py", line 316, in reraise
raise exc
File "/home/jpromero/PythonLib/dask/local.py", line 222, in execute_task
result = _execute_task(task, data)
File "/home/jpromero/PythonLib/dask/core.py", line 119, in _execute_task
return func(*args2)
File "/home/jpromero/PythonLib/dask/dataframe/utils.py", line 657, in check_meta
check_matching_columns(meta, x)
File "/home/jpromero/PythonLib/dask/dataframe/utils.py", line 682, in check_matching_columns
" Missing: %s" % (extra, missing)
ValueError: The columns in the computed data do not match the columns in the provided metadata
Extra: []
Missing: []
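The empty `Extra` and `Missing` lists look contradictory, but they are possible if the two column sets are identical while the columns still differ in some other way, such as their order. The sketch below illustrates this under the assumption that the check first compares the column sequences element-wise and only then reports set differences. That is an assumption about the dask version in use, not something the traceback confirms. The function `compare_columns` and the column names are hypothetical.

```python
def compare_columns(meta_cols, actual_cols):
    """Compare two column sequences the way a metadata check might.

    Returns (matches, extra, missing), where `extra` and `missing`
    are set differences and `matches` is an order-sensitive comparison.
    """
    meta_cols, actual_cols = list(meta_cols), list(actual_cols)
    extra = sorted(set(actual_cols) - set(meta_cols))
    missing = sorted(set(meta_cols) - set(actual_cols))
    matches = meta_cols == actual_cols  # order matters here
    return matches, extra, missing


# Hypothetical column names: same set, different order.
meta = ["NES", "AUC", "Context"]
actual = ["AUC", "NES", "Context"]

matches, extra, missing = compare_columns(meta, actual)
print(matches, extra, missing)
```

Here the comparison fails even though nothing is extra or missing, which would produce a message like the one above. If that is what is happening, comparing the installed dask and pandas versions with the ones pySCENIC was tested against may be a useful next step.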
After that, it continues for a while and then just suddenly stops. I've tried increasing the memory, but that doesn't seem to fix the problem. Is there anything I am missing or not seeing?
Thanks in advance!
jp