AWS Textract supports Markdown/HTML output through the Textractor Python library; see https://aws-samples.github.io/amazon-textract-textractor/notebooks/document_linearization_to_markdown_or_html.html
Also, given the table-heavy nature of the dataset, it would make sense to use the AWS Textract Tables feature ($15 per 1,000 pages). Note that this will change the price comparison.
Amazon Bedrock Data Automation also offers a PDF-to-Markdown service ($10 per 1,000 pages).
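
To make the effect on the price comparison concrete, here is a rough sketch that turns the two per-1,000-page rates quoted above into a cost estimate. The service keys, the `estimate_cost` helper, and the 25,000-page workload are made up for illustration, and the calculation assumes flat pricing (no volume tiers, regional differences, or extra features billed on top).

```python
# Per-1,000-page prices in USD, as quoted in this issue.
# Assumption: flat rates; real bills may differ by region, volume tier,
# or additional features enabled on the same request.
PRICE_PER_1000_PAGES = {
    "textract_tables": 15.0,
    "bedrock_data_automation": 10.0,
}


def estimate_cost(service: str, pages: int) -> float:
    """Return the estimated cost in USD for processing `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_1000_PAGES[service] * pages / 1000


if __name__ == "__main__":
    pages = 25_000  # hypothetical dataset size, not taken from the benchmark
    for service in PRICE_PER_1000_PAGES:
        print(f"{service}: ${estimate_cost(service, pages):,.2f}")
```

Swapping in the benchmark's real page count and the rates for the other providers would give the updated comparison.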