Integrate Wikipedia Data #6

Open
TheOnlyWayUp opened this issue Aug 17, 2024 · 7 comments
Assignees
Labels
help wanted Extra attention is needed stale Must work on it later on.

Comments

@TheOnlyWayUp

Hey Tanmay, cool project!

Atm the data is from GPT-4o,

prompt: `
You are a geopolitical analyst. Given the two countries ${country1} and ${country2}, analyze their relationship based on the following six key factors using the most recent and latest data available from the internet:
1. Diplomatic Relations: Assess the presence of embassies, consulates, and other diplomatic missions, the frequency of high-level diplomatic visits, and any significant treaties or agreements. Provide the number and locations of embassies in both countries, and include any notable diplomatic engagements. Pull the latest diplomatic events and treaties from current news sources.
- Score: Provide a score out of 100, where 0 indicates very weak diplomatic relations and 100 indicates very strong relations.
- Explanation: Briefly explain why this score was assigned, citing specific data such as the number of embassies, their locations, and the nature of diplomatic interactions. Ensure that all data is the most up-to-date by incorporating the latest diplomatic news from the internet.
2. Economic Ties: Evaluate the trade volume, investment flows, and any economic agreements or sanctions in place. Mention key export-import statistics, major investment projects, and the overall economic interdependence between the two countries. Include the most recent trade agreements and economic developments as reported in current news articles.
- Score: Provide a score out of 100, where 0 indicates minimal economic ties and 100 indicates very strong economic cooperation.
- Explanation: Briefly explain why this score was assigned, including specific data like trade volumes, key industries involved, and major economic agreements. Use the latest trade data available and include recent economic news.
3. Military Relations: Examine the level of defense cooperation, including military alliances, joint exercises, arms deals, and any military bases or deployments. Provide details on recent joint exercises or significant defense agreements from the latest news sources.
- Score: Provide a score out of 100, where 0 indicates minimal military cooperation and 100 indicates very strong military relations.
- Explanation: Briefly explain why this score was assigned, including details of joint military exercises, defense pacts, or arms sales, using the most recent developments from the internet.
4. Political Alignments: Consider their alignment on global political issues, voting patterns in international organizations, and public statements by leaders. Mention any significant differences or alignments in their foreign policies. Include the most recent political statements and alignments as reported in the news.
- Score: Provide a score out of 100, where 0 indicates divergent political stances and 100 indicates close political alignment.
- Explanation: Briefly explain why this score was assigned, with references to specific voting patterns or public statements. Ensure the analysis reflects the latest political interactions from current news.
5. Cultural and Social Ties: Assess people-to-people connections, cultural exchanges, educational partnerships, and public perception. Mention significant cultural events, student exchanges, and public opinion data. Include the latest cultural exchanges and public sentiment as captured by current surveys and news reports.
- Score: Provide a score out of 100, where 0 indicates weak cultural ties and 100 indicates strong cultural connections.
- Explanation: Briefly explain why this score was assigned, including details on cultural programs, educational exchanges, or public sentiment. Use recent cultural exchanges, surveys, and news articles.
6. Historical Context: Analyze the historical relationship between the two countries, including past conflicts, cooperation, and any unresolved historical issues. Mention key historical events that continue to influence the relationship, and include any recent historical discussions or commemorations.
- Score: Provide a score out of 100, where 0 indicates a problematic historical relationship and 100 indicates a strong, cooperative history.
- Explanation: Briefly explain why this score was assigned, citing specific historical events or legacies.
Overall Geopolitical Relationship Score: Calculate the average score across all six factors. Ensure that the overall score is consistent with the individual factor scores and reflects the general relationship between the two countries.
- Score: Provide a final score out of 100, where 0 indicates a very weak overall relationship and 100 indicates a very strong relationship.
- Explanation: Briefly summarize why this score was assigned, ensuring that it is an average and well-balanced reflection of the individual scores provided above.
Scoring must be objective and based solely on the analysis and data provided, without being influenced by personal opinions or external factors.
`,
That's cool, but it's not dynamic.

Maybe you could integrate data from the Wikipedia pages (which change as events occur, geopolitical and otherwise), and feed it into GPT for better information (something like RAG?).

Example: France-India Relations | wikipedia.org
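
A minimal sketch of what that could look like, assuming the MediaWiki TextExtracts endpoint on en.wikipedia.org and an article title such as `France-India relations`. The helper names (`buildExtractUrl`, `fetchWikipediaExtract`, `withWikipediaContext`) are hypothetical and not taken from the project:

```typescript
// Hypothetical helpers; none of these names come from the project.
const WIKI_API = "https://en.wikipedia.org/w/api.php";

// Build a MediaWiki query URL that returns the article as plain text.
function buildExtractUrl(title: string): string {
  const params = new URLSearchParams({
    action: "query",
    prop: "extracts",
    explaintext: "1",
    redirects: "1",
    format: "json",
    origin: "*", // permits anonymous cross-origin requests from a browser
    titles: title,
  });
  return `${WIKI_API}?${params.toString()}`;
}

// Fetch the extract; resolves to an empty string when the page has no text.
async function fetchWikipediaExtract(title: string): Promise<string> {
  const res = await fetch(buildExtractUrl(title));
  if (!res.ok) throw new Error(`Wikipedia request failed: ${res.status}`);
  const data = await res.json();
  const pages = data?.query?.pages ?? {};
  const first = Object.values(pages)[0] as { extract?: string } | undefined;
  return first?.extract ?? "";
}

// Prepend the article text to the existing analyst prompt.
function withWikipediaContext(prompt: string, extract: string): string {
  if (extract.trim() === "") return prompt;
  return `Reference material from Wikipedia:\n"""\n${extract}\n"""\n\n${prompt}`;
}
```

The existing prompt would stay as it is; only the reference block in front of it changes as the article is edited.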

@sarkartanmay393
Owner

@TheOnlyWayUp Yes, definitely, I can do that. Your idea is on my list now. If you're free sometime, we can chat about it.

sarkartanmay393 added the "help wanted" label Aug 17, 2024
@Nil2000
Contributor

Nil2000 commented Aug 22, 2024

For Future ref: China_India_relation
HTML format

sarkartanmay393 self-assigned this Aug 24, 2024
@sarkartanmay393
Owner

sarkartanmay393 commented Aug 24, 2024

@Nil2000 Do test the initial Wikipedia attachment. We now pass the whole text extracted from the HTML to ChatGPT, capped at 15k tokens.
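
A rough sketch of how such a cap could be enforced, assuming about 4 characters per token for English text (a heuristic, not a real tokenizer). `estimateTokens` and `truncateToTokenBudget` are hypothetical names, and the project's actual truncation may work differently:

```typescript
// Rough heuristic, not a real tokenizer: about 4 characters per token.
const CHARS_PER_TOKEN = 4;

// Estimate the token count of a string from its length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Trim text to roughly maxTokens, cutting at the last paragraph break that fits.
function truncateToTokenBudget(text: string, maxTokens = 15000): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;
  const slice = text.slice(0, maxChars);
  const lastBreak = slice.lastIndexOf("\n\n");
  // Fall back to a hard cut when there is no paragraph break inside the budget.
  return lastBreak > 0 ? slice.slice(0, lastBreak) : slice;
}
```

Cutting at a paragraph break avoids handing the model a sentence that stops halfway; an exact limit would need the tokenizer that matches the model.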

@TheOnlyWayUp
Author

TheOnlyWayUp commented Aug 26, 2024

Hmm, I took a look at Wikipedia's simple pages (simple.wikipedia.org), they have very few stubs for geopolitical relations (valid, they're hard to boil down in most cases).

I see a few options, ranked by how well they'd do (imo)

  1. Summarizing each heading and its content, bringing down the token usage (but requires more requests).
  2. Embedding each heading (and its content) and comparing them to each of your 6 parameters. The top k results would be provided to the model instead of the entire article. (This is a RAG approach.)
  3. Truncating everything past the first paragraph for each heading. (Worth a try, LLMs can perform well with limited information.)

Tias!
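
Options 2 and 3 could be sketched roughly like this. It assumes the article arrives as plain text with `== Heading ==` lines and one paragraph per line, and that embeddings for the sections and for each parameter come from whichever embedding model the project picks. All names here are hypothetical:

```typescript
interface Section {
  heading: string;
  body: string;
}

// Split a plain-text extract on "== Heading ==" lines (assumed format).
function splitSections(extract: string): Section[] {
  const sections: Section[] = [{ heading: "Introduction", body: "" }];
  for (const line of extract.split("\n")) {
    const match = /^=+\s*(.+?)\s*=+$/.exec(line.trim());
    if (match) {
      sections.push({ heading: match[1], body: "" });
    } else {
      sections[sections.length - 1].body += line + "\n";
    }
  }
  // Drop sections that ended up with no text.
  return sections
    .map((s) => ({ heading: s.heading, body: s.body.trim() }))
    .filter((s) => s.body !== "");
}

// Option 3: keep only the first paragraph (first line) of each section.
function firstParagraphOnly(sections: Section[]): Section[] {
  return sections.map((s) => ({ heading: s.heading, body: s.body.split("\n")[0] }));
}

// Cosine similarity between two vectors; 0 when either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Option 2: return the k sections whose embeddings are closest to the query.
// embeddings[i] must be the precomputed vector for sections[i].
function topKSections(
  query: number[],
  sections: Section[],
  embeddings: number[][],
  k: number
): Section[] {
  return sections
    .map((section, i) => ({ section, score: cosineSimilarity(query, embeddings[i]) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((r) => r.section);
}
```

For option 2, `topKSections` would run once per parameter (diplomatic, economic, and so on), with the parameter description embedded as the query.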

@ghost

ghost commented Sep 3, 2024

You're right, the RAG approach is the way to go: it's more cost-effective and more accurate.

Also, for more current results, we could fetch the top 20-30 news articles about the two selected countries and use those. That would be much better, but it would cost more.

Ideal: use Wikipedia + GPT for overall knowledge, plus the latest news for recent events.

@sarkartanmay393
Owner

Added quick support for this, but it needs further improvement.

sarkartanmay393 added the "stale" label Sep 10, 2024
3 participants