Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support except file comand line argument #9

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

sangshuduo
Copy link
Owner

No description provided.

Copy link
Contributor

@mentatbot mentatbot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the changes for the random pairs of s3file tool. Overall, the implementation looks good with proper error handling and documentation. Here are a few suggestions for improvement:

Other than these minor suggestions, the implementation looks solid with:

  • ✅ Proper error handling for S3 operations
  • ✅ Efficient use of HashSet for exclusions
  • ✅ Clear command line argument documentation
  • ✅ Maintainable and well-structured code

The feature will be useful for users who need to exclude certain files from the random pairing process.

Thanks for using MentatBot. Give comments a 👍 or 👎 to help me improve!
You can change my review settings on my settings page.

@@ -50,6 +57,14 @@ async fn main() -> Result<(), Box<dyn Error>> {
let directory_prefix = &args.directory;
let url_prefix = &args.url_prefix;

// Read excluded keys from file if provided
let excluded_keys: HashSet<String> = if let Some(exclude_file_path) = args.exclude_file {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding error handling for individual lines in the exclude file. Currently, invalid lines are silently skipped with map_while(Result::ok). It might be better to either:

  1. Log warnings for invalid lines
  2. Return an error if any line is invalid
  3. Add documentation explaining that invalid lines are skipped

@@ -77,6 +92,7 @@ async fn main() -> Result<(), Box<dyn Error>> {
let all_keys: Vec<String> = objects
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a debug log or print statement showing how many keys were excluded. This would help users verify that their exclude file is working as expected.


```shell
Usage: random_pairs_of_s3file [OPTIONS] --num-pairs <NUM> --bucket <BUCKET> --directory <DIR> --url-prefix <PREFIX>
Usage: random_pairs_of_s3file [OPTIONS] --num-pairs <NUM_PAIRS> --bucket <BUCKET> --directory <DIRECTORY> --url-prefix <URL_PREFIX>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README documentation is good, but it would be helpful to add:

  1. An example of the exclude file format
  2. Whether the exclude file should contain full S3 keys or just filenames
  3. A complete example command with the exclude file option

@sangshuduo sangshuduo force-pushed the feat/sangshuduo/random-pairs-with-s3 branch from 7740d0a to 536f4fb Compare January 14, 2025 04:09
@sangshuduo sangshuduo merged commit 6effea5 into main Jan 14, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant