I want to create a project that gathers a user's favorite categories and specific news sources, then generates two outputs based on the retrieved content:
- A 5-minute podcast script summarizing the key points from the articles.
- A newsletter containing summarized versions of the articles, formatted for easy reading.
- User Input: Allow users to select their favorite categories and news sources dynamically (e.g.,
{categories}
,{news_source}
). - Task 1 (Podcast): Design a process that:
- Scrapes or fetches news articles based on user preferences.
- Summarizes the key points into a coherent script for a 5-minute podcast.
- Ensure the script has a clear flow, engaging tone, and concise summaries.
- Task 2 (Newsletter): Create a process that:
- Summarizes the articles into a brief but informative newsletter.
- Ensure the format is visually appealing, with clear sections for each category or news source.
- Podcast Script: A well-structured 5-minute script summarizing the articles, ready to be recorded.
- Newsletter: A visually structured and concise newsletter summarizing the same articles.
- Utilize an agent for content scraping or fetching articles from the specified sources.
- Use a Summarizer agent to condense the information from each article.
- Employ a Writer agent to organize the podcast script and format the newsletter.
- Optionally, integrate external tools (APIs) for fetching real-time news or using summarization models.
- Agents should retain user preferences (categories, news sources) to allow for easy reuse in future interactions.
- Use dynamic variables such as
{categories}
and{news_source}
for customizations in both outputs.
Design and help me plan this project by outlining the specific components, tools, and steps needed.
Please use GPT-3.5 Turbo for the summarization and I want to create a detailed task-based working plan for developing the project that gathers a user's favorite categories and specific news sources, then generates:
A 5-minute podcast script summarizing key points from the articles. A newsletter containing summarized versions of the articles. Developer Task Breakdown: Break the project into as many small steps as possible. Each step should be treated as a Developer Ticket with clear instructions and expected outcomes. For each ticket, include: Task Description: What the developer needs to accomplish. Expected Result: What the final output of the task should look like. Dependencies: If the task depends on other tasks, list them. Interaction: We will work interactively, so the plan should allow me to follow along and interact with each step. Create the plan in a markdown file for easy reference and tracking. Working Plan Requirements: Divide the process into multiple developer tickets. Each ticket should have a clear, small, and actionable step, allowing the project to progress incrementally. Ensure that the developer's tasks cover: User Input Handling (collecting categories and news sources). Content Scraping from specific news sources. Summarization of the articles for both the podcast script and newsletter. Podcast Script Generation. Newsletter Formatting. Include space for feedback and adjustments at each step. Example Ticket Structure: Task Description: Scrape news articles based on user preferences. Expected Result: Code that retrieves articles from a specified news source based on given categories. Dependencies: Requires user input collection functionality. Output: Provide the plan as a markdown file that I can review, with each developer ticket clearly outlined. We will proceed step-by-step, and I will interact with each task as the development process unfolds.
To plan the development of your project step-by-step, I’ll break it down into small, actionable tickets, each with a specific goal, expected output, and any dependencies. The project involves creating a system that scrapes or fetches news articles, summarizes them for a podcast script and newsletter, and dynamically adapts to user input for categories and news sources.
I’ll prepare this as a markdown document for easy tracking and review.
- Task Description: Initialize the project structure for the news summarization system.
- Create a new CrewAI project.
- Set up the necessary files (tasks, agents, tools, etc.).
- Expected Result: A well-structured project repository, with all folders and files properly set up.
- Dependencies: None.
- Output: Project scaffolding is ready for future development.
- Task Description: Design and implement a method to collect the user's preferred categories and news sources.
- Use a form or prompt to collect categories and sources dynamically.
- Store user preferences for future reuse.
- Expected Result: Code that allows users to input categories and news sources, which will be used to customize content fetching.
- Dependencies: None.
- Output: A functional interface (CLI or web form) that collects and saves user inputs.
- Task Description: Set up a tool for scraping or fetching articles from selected news sources based on user input.
- Use an external API (like NewsAPI or Serper) to retrieve articles.
- Filter articles by categories and news sources provided by the user.
- Expected Result: A working tool that retrieves relevant articles based on user input.
- Dependencies: Ticket 2 (user input collection).
- Output: Code that fetches articles dynamically based on user preferences.
- Task Description: Create a summarizer agent that condenses each fetched article into a concise summary.
- Use GPT-3.5 Turbo for summarizing the articles.
- Ensure the summaries capture the key points of each article.
- Expected Result: A functioning summarizer agent that processes fetched articles and returns summarized content.
- Dependencies: Ticket 3 (content scraping).
- Output: Summarized versions of each article, ready for further processing.
- Task Description: Create an agent that organizes the summarized content into a coherent podcast script.
- Ensure the script has a clear introduction, transitions between articles, and a conclusion.
- The script should be designed to be read in a 5-minute timeframe.
- Expected Result: A well-structured 5-minute podcast script generated from the summarized content.
- Dependencies: Ticket 4 (article summarization).
- Output: Podcast script ready for recording.
- Task Description: Create an agent that formats the summarized content into a visually appealing newsletter.
- Each section should cover one category or news source.
- Ensure readability and clear structure, with bold headings for each section.
- Expected Result: A formatted newsletter that provides summarized content in a reader-friendly manner.
- Dependencies: Ticket 4 (article summarization).
- Output: A newsletter ready to be sent out or displayed.
- Task Description: Implement memory functionality so that user preferences (categories, sources) are saved for future use.
- Store preferences in a database or file system for persistent use.
- Expected Result: A dynamic system that remembers and reuses user preferences across sessions.
- Dependencies: Ticket 2 (user input collection).
- Output: User preferences are stored and can be dynamically reused.
- Task Description: Test the entire pipeline from input collection to podcast script and newsletter generation.
- Collect feedback on the flow, summarization quality, and output format.
- Expected Result: Fully tested system with feedback integrated for improvement.
- Dependencies: All previous tickets.
- Output: A functional end-to-end system with refined outputs based on feedback.
- Task Description: Prepare the project for deployment and create documentation.
- Ensure all instructions for running the project are clear.
- Deploy the project (if applicable) to a server or hosting platform.
- Expected Result: The project is ready for users, and documentation is provided for developers.
- Dependencies: All previous tickets.
- Output: Deployment-ready project with accompanying documentation.
- User Input Handling – Needed before fetching articles.
- Content Scraping – Dependent on user input.
- Article Summarization – Dependent on content scraping.
- Podcast Script Generation – Dependent on article summarization.
- Newsletter Generation – Dependent on article summarization.
- Memory Management – Enhances user input handling for future use.
Each of these tickets represents a step in the process. After completing each ticket, we will review the output and gather feedback to proceed to the next step. This will allow us to adjust and refine the workflow incrementally.
My Project Name: Autonews
Agents:
content_scraper: role: "News Content Scraper" goal: "Scrape articles from {news_sources} for categories {categories}." backstory: "You are an expert in scraping news from different websites and summarizing their contents for the requested categories."
news_classifier: role: "News Classifier" goal: "Classify the scraped news articles into categories {categories}." backstory: "You are an expert in classifying news articles into different categories."
news_summarizer: role: "News Summarizer" goal: "Summarize the news articles into {summary_length} sentences." backstory: "You are an expert in summarizing news articles into different lengths."
news_researcher: role: "News Researcher" goal: "Research the news articles and provide any and all relevant information." backstory: "You are an expert in researching news articles and providing any and all relevant information."
news_reporter: role: "News Reporter" goal: "Report the news articles in a detailed manner." backstory: "You are an expert in reporting news articles in a detailed manner."
Create Tasks:
: Done Tasks:
scrape_articles_task: description: "Scrape articles from the following news sources: {news_sources} in the categories of {categories} on {date}, and summarize the top headlines." expected_output: "A summary of the top 5 headlines from each news source in each category for {date}."
classify_articles_task: description: "Classify the scraped news articles into categories {categories}." expected_output: "A list of the categories that the news articles belong to."
summarize_articles_task: description: "Summarize the news articles into {summary_length} sentences." expected_output: "A summary of the news articles in {summary_length} sentences."
research_articles_task: description: "Research the news articles and provide any and all relevant information." expected_output: "A list of the most relevant information from the news articles."
report_articles_task: description: "Report the news articles in a detailed manner." expected_output: "A detailed report of the news articles."
Create Tools:
: Done Tools:
news_api: description: "A tool for fetching news articles from various sources."