
Ability to prompt for json output ? #37

Open
snimavat opened this issue Sep 13, 2024 · 3 comments · May be fixed by #40

Comments

@snimavat
snimavat commented Sep 13, 2024

The base seems to be there. It would be great if there were flexibility to provide/extend the prompt and expect back a JSON response.
I would like to send a PDF that has structure and have GPT convert it to a specific JSON format; it does this very well.
So instead of coding it myself, I would rather use/extend zerox if it offered flexibility in the output format and the ability to customize the prompt.

@pradhyumna85
Contributor

@snimavat The Python API has an option to override the system prompt; refer to the readme.

That said, zerox still applies some post-processing/cleanup to the output from the vision model.

So in your custom system prompt, I would ask the model to output your required JSON but still encapsulate it in a ```json markdown block (so that it doesn't result in unexpected behaviour). You can then trim the fence out yourself and convert the JSON text to, say, a Python dict using the json.loads function.
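The trimming step described above can be sketched as follows (a minimal example; the exact shape of the model's output depends on your prompt, and `extract_json` is just an illustrative helper name):

```python
import json
import re

def extract_json(model_output: str) -> dict:
    """Pull the payload out of a ```json fenced block and parse it.

    Falls back to parsing the whole string if no fence is present.
    """
    match = re.search(r"```json\s*(.*?)\s*```", model_output, re.DOTALL)
    payload = match.group(1) if match else model_output.strip()
    return json.loads(payload)

# Example of the kind of text a vision model might return:
output = 'Here is the result:\n```json\n{"invoice": "123", "total": 42.5}\n```'
data = extract_json(output)
print(data["invoice"])  # → 123
```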

@tylermaran, this issue can be closed.

@snimavat
Author

snimavat commented Sep 14, 2024

@pradhyumna85 thanks for the response, but I don't think the issue should be closed.

Not keeping zerox tied to markdown only would increase the utility of this helpful project and broaden its usage.

There's no reason why zerox should remain tied to markdown. It could be a little OCR framework that makes it easy to run OCR through an LLM and get the output in the format of the user's choice. It could support markdown/JSON out of the box, but also expose the raw output text so the user can post-process it however they like.

Zerox as a general-purpose LLM-based OCR library would be more attractive than zerox as an OCR-to-markdown library.

@pradhyumna85
Contributor

pradhyumna85 commented Sep 14, 2024

@snimavat Makes sense. Adding a flag to the zerox API to skip markdown-based post-processing (the format_markdown function) could be done in case the user is interested in the raw output, in conjunction with a custom system prompt. I can raise a PR for this.
@tylermaran, @annapo23 any thoughts on how to proceed?

Edit:
@tylermaran, @annapo23, I have raised PR #40 for your review.

pradhyumna85 pushed a commit to pradhyumna85/zerox that referenced this issue Sep 15, 2024
- added post_process_function param to override/skip Zerox's default format_markdown post-processing on the model's text output.
- removed output_dir param and added output_file_path, which is more flexible for arbitrary file extensions.
- added page_separator param (used when writing the consolidated output to output_file_path).
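As a rough illustration of the post_process_function idea from the PR description above: a custom post-processor could replace the default markdown cleanup with a step that strips the ```json fence and keeps only the raw JSON text. Note that the str → str signature and the keyword arguments shown in the usage comment are assumptions based on the PR description, not a confirmed API.

```python
import re

def json_post_process(model_text: str) -> str:
    """Hypothetical post-processor for PR #40's post_process_function param.

    Strips a ```json fence and returns the raw JSON text; the str -> str
    signature is an assumption based on the PR description.
    """
    match = re.search(r"```json\s*(.*?)\s*```", model_text, re.DOTALL)
    return match.group(1) if match else model_text.strip()

# Hypothetical usage (parameter names are assumptions from the PR description):
# result = await zerox(file_path="invoice.pdf",
#                      post_process_function=json_post_process,
#                      output_file_path="invoice.json")
```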