Pull requests: NVIDIA/NeMo-Curator
Clean up Pandas, cuDF, Dask, and Dask-cuDF DocumentDataset type logic
gpuci
Run GPU CI/CD on PR
#494
opened Jan 23, 2025 by
sarahyurick
Standardize text_field and id_field terminology
gpuci
Run GPU CI/CD on PR
#485
opened Jan 17, 2025 by
sarahyurick
Minor CrossFit improvements
gpuci
Run GPU CI/CD on PR
#483
opened Jan 16, 2025 by
sarahyurick
Add nemo-toolkit dependency to gpuCI
gpuci
Run GPU CI/CD on PR
#480
opened Jan 10, 2025 by
sarahyurick
Enable ADD ID to work with CPU/GPU both
gpuci
Run GPU CI/CD on PR
#479
opened Jan 10, 2025 by
VibhuJawa
Support dask_expr migration into dask.dataframe
#477
opened Jan 9, 2025 by
rjzamora
3 tasks
Update get_all_files_paths_under examples to include keep_extensions
#450
opened Dec 20, 2024 by
sarahyurick
[WIP] Add RAPIDS Nightly to GPU CI
gpuci
Run GPU CI/CD on PR
#436
opened Dec 17, 2024 by
praateekmahajan
•
Draft
3 tasks
Bump nltk from 3.8.1 to 3.9 in /tutorials/dapt-curation/code
dependencies
Pull requests that update a dependency file
#429
opened Dec 13, 2024 by
dependabot
bot
Create notebook tutorials for distributed data classifiers
documentation
Improvements or additions to documentation
#415
opened Dec 6, 2024 by
sarahyurick
3 tasks done
[WIP] Efficient Exact Duplicate Removal Code
#404
opened Dec 2, 2024 by
praateekmahajan
•
Draft
3 tasks
Fix GPU error messages for fuzzy deduplication
gpuci
Run GPU CI/CD on PR
#387
opened Nov 22, 2024 by
sarahyurick
2 tasks done
Fuzzy Dedup: Make skipping the False positive check the default
enhancement
New feature or request
gpuci
Run GPU CI/CD on PR
#386
opened Nov 21, 2024 by
ayushdg
2 of 3 tasks
Remove max_text_bytes_per_part
gpuci
Run GPU CI/CD on PR
#385
opened Nov 20, 2024 by
sarahyurick
Create Cache class for exact, fuzzy, and semantic deduplication
gpuci
Run GPU CI/CD on PR
#384
opened Nov 19, 2024 by
sarahyurick
•
Draft
2 of 4 tasks
Convert translation_example.py into a Jupyter Notebook tutorial
#336
opened Oct 29, 2024 by
sarahyurick
•
Draft