left_join(x, y2) %>% data.frame
#> Joining with `by = join_by(ccode)`
-#> Warning in left_join(x, y2): Each row in `x` is expected to match at most 1 row in `y`.
-#> ℹ Row 11 of `x` matches multiple rows.
-#> ℹ If multiple matches are expected, set `multiple = "all"` to silence this
-#> warning.
+#> Warning in left_join(x, y2): Detected an unexpected many-to-many relationship between `x` and `y`.
+#> ℹ Row 11 of `x` matches multiple rows in `y`.
+#> ℹ Row 1 of `y` matches multiple rows in `x`.
+#> ℹ If a many-to-many relationship is expected, set `relationship =
+#> "many-to-many"` to silence this warning.
#>   ccode year ranking
#> 1     2 2016     low
#> 2     2 2017     low
diff --git a/docs/authors.html b/docs/authors.html
index 4d258fb..dff5d79 100644
--- a/docs/authors.html
+++ b/docs/authors.html
@@ -77,13 +77,20 @@
Miller, Steven V. Forthcoming. {peacesciencer}: An R Package for Quantitative Peace Science Research Conflict Management and Peace Science http://svmiller.com/peacesciencer/
-@Article{peacesciencer-package,
+Miller S (2022).
+“peacesciencer: An R Package for Quantitative Peace Science Research.”
+Conflict Management and Peace Science, 39(6), 755–779.
+doi: 10.1177/07388942221077926.
+
+@Article{,
title = {{peacesciencer}: An R Package for Quantitative Peace Science Research},
- author = {{Steven V. Miller}},
+ author = {Steven V. Miller},
journal = {Conflict Management and Peace Science},
year = {2022},
- url = {http://svmiller.com/peacesciencer/},
+ volume = {39},
+ number = {6},
+ pages = {755--779},
+ doi = {10.1177/07388942221077926},
}
-peacesciencer is an R package including various functions and data sets to allow easier analyses in the field of quantitative peace science. The goal is to provide an R package that reasonably approximates what made EUGene so attractive to scholars working in the field of quantitative peace science in the early 2000s. EUGene shined because it encouraged replications of conflict models while having the user also generate data from scratch. Likewise, this R package will offer tools to approximate what EUGene did within the R environment (i.e. not requiring Windows for installation).
+peacesciencer is an R package including various functions and data sets to allow easier analyses in the field of quantitative peace science. The goal is to provide an R package that reasonably approximates what made EUGene so attractive to scholars working in the field of quantitative peace science in the early 2000s. EUGene shined because it encouraged replications of conflict models while having the user also generate data from scratch. Likewise, this R package will offer tools to approximate what EUGene did within the R environment (i.e. not requiring Windows for installation).
+
Installation
@@ -185,7 +186,7 @@
How to Use {peacesciencer}
#> 10 I(gmlmidspell^2)  0.00247   0.000135     18.4 2.74e- 75
#> 11 I(gmlmidspell^3) -0.0000116 0.000000891 -13.0 1.16e- 38
toc()
-#> 7.559 sec elapsed
+#> 7.35 sec elapsed
Here is how you might do a standard civil conflict analysis using Gleditsch-Ward states and UCDP conflict data.
@@ -256,7 +257,7 @@
How to Use {peacesciencer}
#> 11 I(war_ucdpspell^3) -0.0000499 0.0000302 -1.65 0.0982
toc()
-#> 2.444 sec elapsed
+#> 2.315 sec elapsed
Citing What You Do in {peacesciencer}
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml
index 247e79c..e543266 100644
--- a/docs/pkgdown.yml
+++ b/docs/pkgdown.yml
@@ -9,7 +9,7 @@ articles:
parlor-tricks: parlor-tricks.html
state-systems: state-systems.html
versions: versions.html
-last_built: 2023-03-22T13:35Z
+last_built: 2023-03-24T11:16Z
urls:
reference: http://svmiller.com/reference
article: http://svmiller.com/articles
diff --git a/docs/search.json b/docs/search.json
index 4df40fd..9f26e9d 100644
--- a/docs/search.json
+++ b/docs/search.json
@@ -1 +1 @@
-[{"path":"http://svmiller.com/articles/coerce-dispute-year-dyad-year.html","id":"converting-cow-mid-dyadic-dispute-year-data-into-dyad-year-data","dir":"Articles","previous_headings":"","what":"Converting CoW-MID Dyadic Dispute-Year Data into Dyad-Year Data","title":"How `{peacesciencer}` Coerces Dispute-Year Data into Dyad-Year Data","text":"First, let’s identify dyad-year duplicates data. absolute data United Kingdom-Soviet Union dyad, six conflicts ongoing /initiated 1920. Next tie United States-Soviet Union dyad 1958, Egypt-Israel dyad (1959, 1960), Syria-Israel dyad (1955). told, 498 dyad-years duplicate dyadic dispute-year data. need whittle one dyad-year data.","code":"cow_mid_dirdisps %>% # make it non-directed for ease of presentation filter(ccode2 > ccode1) %>% group_by(ccode1, ccode2, year) %>% summarize(n = n(), mids = paste0(dispnum, collapse = \", \")) %>% arrange(-n) %>% filter(n > 1) %>% ungroup() #> `summarise()` has grouped output by 'ccode1', 'ccode2'. You can override using #> the `.groups` argument. #> # A tibble: 498 × 5 #> ccode1 ccode2 year n mids #> #> 1 200 365 1920 6 186, 197, 1133, 2363, 2364, 2604 #> 2 2 365 1958 5 125, 173, 608, 2215, 2216 #> 3 651 666 1959 5 3375, 3405, 3419, 3421, 3430 #> 4 651 666 1960 5 3375, 3405, 3419, 3422, 3430 #> 5 652 666 1955 5 3404, 3405, 3416, 3417, 3418 #> 6 2 365 1962 4 61, 1353, 2219, 3361 #> 7 2 365 1967 4 345, 2930, 2931, 2934 #> 8 200 365 1919 4 197, 2363, 2604, 2605 #> 9 651 666 1958 4 3375, 3405, 3419, 3420 #> 10 652 666 1954 4 3403, 3404, 3415, 3417 #> # … with 488 more rows"},{"path":"http://svmiller.com/articles/coerce-dispute-year-dyad-year.html","id":"first-select-unique-onsets","dir":"Articles","previous_headings":"Converting CoW-MID Dyadic Dispute-Year Data into Dyad-Year Data","what":"First: Select Unique Onsets","title":"How `{peacesciencer}` Coerces Dispute-Year Data into Dyad-Year Data","text":"primary aim preserve unique onsets. 
case United States-United Kingdom dyad 1903 illustrate ’s stake . , United States United Kingdom three MIDs ongoing 1903. Two (MID#0002 MID#0254) began 1902. third, MID#3301, new onset. case, want remove observation MID#0002 MID#0254 keep observation MID3301. United States-United Kingdom Dyadic Dispute-Years 1903 ’s peacesciencer first cut. Grouping dyad-year (.e. group_by(ccode1, ccode2, year)), creates new variable equals 1 number rows dyad-year 1. Maintaining grouped structure, calculates standard deviation disponset variable. Cases standard deviation calculate cases dyad-year duplicate assigned 0. Next, creates simple removeme column equals 1 1) ’s duplicated dyad-year, 2) ’s unique onset, 3) standard deviation greater 0 (.e. least one onset dyad-year). removes cases removeme == 1. Observe fixed USA-United Kingdom observation 1903. fix Italy-France problem 1860, ’s three dispute-years onsets year. France-Italy Dyadic Dispute-Years 1903 just tells us ’re done, knew wouldn’t . need exclusion rules whittle data.","code":"cow_mid_dirdisps %>% filter(ccode1 == 2 & ccode2 == 200 & year == 1903) %>% select(dispnum:disponset) %>% kbl(., caption = \"United States-United Kingdom Dyadic Dispute-Years in 1903\", booktabs = TRUE, longtable = TRUE) %>% kable_styling(position = \"center\", full_width = F, bootstrap_options = \"striped\") cow_mid_dirdisps %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% # Remove anything that's not a unique MID onset mutate(sd = sd(disponset), sd = ifelse(is.na(sd), 0, sd)) %>% mutate(removeme = ifelse(duplicated == 1 & disponset == 0 & sd > 0, 1, 0)) %>% filter(removeme != 1) %>% # remove detritus select(-removeme, -sd) %>% # practice safe group_by() ungroup() -> hold_this # ^ The `hold_this` naming convention is my favorite for intermediate objects. # It's also a bad idea to overwrite data objects that come in this package. 
hold_this %>% filter(ccode1 == 2 & ccode2 == 200 & year == 1903) %>% select(dispnum:disponset) #> # A tibble: 1 × 6 #> dispnum ccode1 ccode2 year dispongoing disponset #> #> 1 3301 2 200 1903 1 1 hold_this %>% filter(ccode1 == 220 & ccode2 == 325 & year == 1860) %>% select(dispnum:disponset) %>% kbl(., caption = \"France-Italy Dyadic Dispute-Years in 1903\", booktabs = TRUE, longtable = TRUE) %>% kable_styling(position = \"center\", full_width = F, bootstrap_options = \"striped\")"},{"path":"http://svmiller.com/articles/coerce-dispute-year-dyad-year.html","id":"second-keep-the-highest-dispute-level-fatality","dir":"Articles","previous_headings":"Converting CoW-MID Dyadic Dispute-Year Data into Dyad-Year Data","what":"Second: Keep the Highest Dispute-Level Fatality","title":"How `{peacesciencer}` Coerces Dispute-Year Data into Dyad-Year Data","text":"presented opportunity keep one dispute drop another two appear year, researchers likely prefer “serious” one rather one might simple threat use show force. Consider Russia-Ottoman Empire (Turkey) dyad-year 1853. two unique onsets two year. One (MID#0057) became Crimean War, important conflict! (MID#0126) apparent show force fatalities. conditions, ’s easy call keep one fatalities. Russia-Ottoman Empire Dyadic Dispute-Years 1853 one limitation CoW-MID data toward end. obviously know CoW-MID assigns fatalities end dispute participants, ’d way knowing priori many fatalities Russia-Turkey dyad 1853. situation like Belgium-Germany 1939-1940. case, highest action Belgium engaged Germany 1939 mobilization war momentarily eliminated Belgium international system happened next year. also don’t know extent Turkey responsible Russia’s fatalities. Crimean War multilateral war pitting Russians United Kingdom, Austria-Hungary, Italy, Turkey, France. Thus, follows crude, still useful. ’ll use dispute-level fatality information stand-keep duplicate dyad-year observation highest fatality score. 
’ll also need take inventory handle cases fatality == -9. forthcoming data release, find cases missing fatalities CoW-MID data mean fatalities half cases. even wars! However, ’d way knowing CoW-MID. ’ll safe recode -9 .5, indicating 0 fatalities “less” fatality level 1 (1-25 deaths) CoW-MID can least confidently say latter happened. fix Russia-Turkey-1853 problem. won’t fix cases multiple disputes initiated year dyad, one died. lot . , ’ll need case exclusion rules.","code":"hold_this %>% filter(ccode1 == 365 & ccode2 == 640 & year == 1853) %>% select(dispnum:disponset, fatality1:fatality2, hiact1, hiact2) %>% kbl(., caption = \"Russia-Ottoman Empire Dyadic Dispute-Years in 1853\", booktabs = TRUE, longtable = TRUE) %>% kable_styling(position = \"center\", full_width = F, bootstrap_options = \"striped\") hold_this %>% left_join(., cow_mid_disps %>% select(dispnum, fatality)) %>% mutate(fatality = ifelse(fatality == -9, .5, fatality)) %>% arrange(ccode1, ccode2, year) %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% group_by(ccode1, ccode2, year, duplicated) %>% # Keep the highest fatality filter(fatality == max(fatality)) %>% mutate(fatality = ifelse(fatality == .5, -9, fatality)) %>% arrange(ccode1, ccode2, year) %>% # practice safe group_by() ungroup() -> hold_this #> Joining with `by = join_by(dispnum)` hold_this %>% filter(ccode1 == 365 & ccode2 == 640 & year == 1853) #> # A tibble: 1 × 20 #> dispnum ccode1 ccode2 year dispongoing dispon…¹ sidea1 sidea2 fatal…² fatal…³ #> #> 1 57 365 640 1853 1 1 1 0 6 6 #> # … with 10 more variables: fatalpre1 , fatalpre2 , hiact1 , #> # hiact2 , hostlev1 , hostlev2 , orig1 , orig2 , #> # duplicated , fatality , and abbreviated variable names #> # ¹disponset, ²fatality1, ³fatality2"},{"path":"http://svmiller.com/articles/coerce-dispute-year-dyad-year.html","id":"third-keep-the-highest-dispute-level-hostility","dir":"Articles","previous_headings":"Converting CoW-MID Dyadic Dispute-Year Data 
into Dyad-Year Data","what":"Third: Keep the Highest Dispute-Level Hostility","title":"How `{peacesciencer}` Coerces Dispute-Year Data into Dyad-Year Data","text":"next case exclusion rule want continue isolating serious MIDs MIDs lesser severity. Consider case India Pakistan 1963. India-Pakistan Dyadic Dispute-Years 1963 two unique MID onsets 1963 neither fatal, meaning duplicate dyad-year still . However, MID#2630 just threat use force whereas MID#1317 occupation territory (Pakistan India). former threat. latter use. MID#2630 higher hostility level MID ’ll want keep. caveat applies, fatalities, ’ll use dispute-level hostility variable plug-. least fix India-Pakistan observation 1963, others like .","code":"hold_this %>% filter(ccode1 == 750 & ccode2 == 770 & year == 1963) %>% select(dispnum:year, disponset, fatality1, fatality2, hiact1, hiact2) %>% kbl(., caption = \"India-Pakistan Dyadic Dispute-Years in 1963\", booktabs = TRUE, longtable = TRUE) %>% kable_styling(position = \"center\", full_width = F, bootstrap_options = \"striped\") hold_this %>% left_join(., cow_mid_disps %>% select(dispnum, hostlev)) %>% arrange(ccode1, ccode2, year) %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% group_by(ccode1, ccode2, year, duplicated) %>% # Keep the highest hostlev filter(hostlev == max(hostlev)) %>% arrange(ccode1, ccode2, year) %>% # practice safe group_by() ungroup() -> hold_this #> Joining with `by = join_by(dispnum)` hold_this %>% filter(ccode1 == 750 & ccode2 == 770 & year == 1963) %>% select(dispnum:year, disponset, fatality1, fatality2, hiact1, hiact2) #> # A tibble: 1 × 9 #> dispnum ccode1 ccode2 year disponset fatality1 fatality2 hiact1 hiact2 #> #> 1 1317 750 770 1963 1 0 0 0 14"},{"path":"http://svmiller.com/articles/coerce-dispute-year-dyad-year.html","id":"fourth-keep-the-highest-dispute-level-minimum-then-maximum-duration","dir":"Articles","previous_headings":"Converting CoW-MID Dyadic Dispute-Year Data into Dyad-Year 
Data","what":"Fourth: Keep the Highest Dispute-Level (Minimum, Then Maximum) Duration","title":"How `{peacesciencer}` Coerces Dispute-Year Data into Dyad-Year Data","text":"point, still duplicate dyad-years remaining data, ’ve selected cases fairly similar (least given dispute- participant-level data available). duplicates remain unique onsets fatality levels hostility levels. next available measure approximates dispute severity duration. Consider duplicate observation Colombia-Peru 1852 corresponding MIDs (MID#1506 MID#1523). MIDs look fairly similar. started year. level fatalities (none). hostility level (show force). tough read tea leaves argue alert (hiact: 8) “greater” show force (hiact: 7) even 8 > 7 (.e. CoW-MID action codes never truly ordinal). , ’re multilateral MIDs. MID#1506 pit Venezuela Colombia Chile Peru whereas MID#1523 pit Chile Colombia Peru. even unhelpfully unknown duration . -9s start days . However, MID#1523 highest minimum duration. lasted least 110 days (many 140) whereas MID#1506 minimum duration 63 days (maximum duration 122 days). conditions, keep one minimum duration , duplicates still remain, keep one highest maximum duration. 
fix Colombia-Peru problem 1852.","code":"haven::read_dta(\"~/Dropbox/data/cow/mid/5/MIDB 5.0.dta\") %>% filter(dispnum %in% c(1506, 1523)) %>% select(dispnum:sidea, fatality, hiact, hostlev) #> # A tibble: 7 × 13 #> dispnum stabb ccode stday stmon styear endday endmon endyear sidea fatality #> #> 1 1506 VEN 101 -9 10 1852 -9 11 1852 1 0 #> 2 1506 CHL 155 14 9 1852 -9 11 1852 0 0 #> 3 1506 PER 135 -9 8 1852 -9 11 1852 0 0 #> 4 1506 COL 100 -9 8 1852 -9 11 1852 1 0 #> 5 1523 PER 135 -9 3 1852 18 7 1852 0 0 #> 6 1523 CHL 155 2 6 1852 2 6 1852 1 0 #> 7 1523 COL 100 -9 3 1852 18 7 1852 1 0 #> # … with 2 more variables: hiact , hostlev hold_this %>% left_join(., cow_mid_disps %>% select(dispnum, mindur, maxdur)) %>% arrange(ccode1, ccode2, year) %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% group_by(ccode1, ccode2, year, duplicated) %>% # Keep the highest mindur filter(mindur == max(mindur)) %>% arrange(ccode1, ccode2, year) %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% group_by(ccode1, ccode2, year, duplicated) %>% # Keep the highest maxdur filter(maxdur == max(maxdur)) %>% # practice safe group_by() ungroup() -> hold_this #> Joining with `by = join_by(dispnum)` hold_this %>% filter(ccode1 == 135 & ccode2 == 100 & year == 1852) %>% select(dispnum:year, disponset, fatality1, fatality2, hiact1, hiact2) #> # A tibble: 1 × 9 #> dispnum ccode1 ccode2 year disponset fatality1 fatality2 hiact1 hiact2 #> #> 1 1523 135 100 1852 1 0 0 0 8"},{"path":"http://svmiller.com/articles/coerce-dispute-year-dyad-year.html","id":"final-case-exclusions-for-the-cow-mid-data","dir":"Articles","previous_headings":"Converting CoW-MID Dyadic Dispute-Year Data into Dyad-Year Data","what":"Final Case Exclusions for the CoW-MID Data","title":"How `{peacesciencer}` Coerces Dispute-Year Data into Dyad-Year Data","text":"started 498 duplicate directed dyad-years dyadic dispute-year data. 
’re now just 24 directed (12 non-directed) dyad-years. glance remaining observations suggest substance similar. example, MID#4428 MID#4430 one-day border fortifications Kyrgyzstan Uzbekistan 2005. MID#2171 MID#2172 one-day threats use force Cyprus Turkey 1965. Duplicate Non-Directed Dyad-Years Still Remaining final case exclusion rules round us home. First, duplicate dyad-years feature case one dispute reciprocated . example, MID#4428 mutual border fortification MID#4430 just one border fortification directed Kyrgyzstan Uzbekistan. Thus, keep one involved least two codable incidents rather MID just one codable incident. reader may object reciprocation feature higher proverbial chain, given prominence audience cost literature. caution . Gibler Miller (also Little) driven home reciprocation variable information-poor variable. minimally tells Side B MID initiated militarized incident involved attack clear initiator. review conflict data, find attacks ambushes initiated Side countered happen half time. , inferences made reciprocation variable among sensitive errors report CoW-MID data. reason, discourage researchers using variable analyses , application, ’s peacesciencer uses dispute-level reciprocation variable near bottom rung case exclusions. Still, ’s . ’re just three duplicate dyad-years now. reason MID#4428 MID#4430 still CoW-MID MID#4428 unreciprocated dispute-level also militarized incident Side B dispute. CoW-MID issue peacesciencer issue. Duplicate Non-Directed Dyad-Years Still Remaining three effectively identical MIDs. start year. fatality-level, hostility-level, duration, either reciprocated -reciprocated (MID#4428/MID#4430 issue notwithstanding). Thus, select one lowest start month. 
enough eliminate duplicate dyad-years.","code":"hold_this %>% group_by(ccode1, ccode2, year) %>% filter(n() > 1) %>% filter(ccode2 > ccode1) %>% select(dispnum:disponset, hiact1:hiact2, fatality:maxdur) %>% kbl(., caption = \"Duplicate Non-Directed Dyad-Years Still Remaining\", booktabs = TRUE, longtable = TRUE) %>% kable_styling(position = \"center\", full_width = F, bootstrap_options = \"striped\") hold_this %>% left_join(., cow_mid_disps %>% select(dispnum, recip)) %>% arrange(ccode1, ccode2, year) %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% group_by(ccode1, ccode2, year, duplicated) %>% # Keep the reciprocated ones, where non-reciprocated ones exist filter(recip == max(recip)) %>% arrange(ccode1, ccode2, year) %>% # practice safe group_by() ungroup() -> hold_this #> Joining with `by = join_by(dispnum)` hold_this %>% group_by(ccode1, ccode2, year) %>% filter(n() > 1) %>% filter(ccode2 > ccode1) %>% select(dispnum:disponset, hiact1:hiact2, fatality:maxdur) %>% kbl(., caption = \"Duplicate Non-Directed Dyad-Years Still Remaining\", booktabs = TRUE, longtable = TRUE) %>% kable_styling(position = \"center\", full_width = F, bootstrap_options = \"striped\") hold_this %>% left_join(., cow_mid_disps %>% select(dispnum, stmon)) %>% arrange(ccode1, ccode2, year) %>% group_by(ccode1, ccode2, year) %>% mutate(duplicated = ifelse(n() > 1, 1, 0)) %>% group_by(ccode1, ccode2, year, duplicated) %>% # Keep the reciprocated ones, where non-reciprocated ones exist filter(stmon == min(stmon)) %>% arrange(ccode1, ccode2, year) %>% # practice safe group_by() ungroup() -> hold_this #> Joining with `by = join_by(dispnum)` # And we're done hold_this %>% group_by(ccode1, ccode2, year) %>% filter(n() > 1) #> # A tibble: 0 × 25 #> # Groups: ccode1, ccode2, year [0] #> # … with 25 variables: dispnum , ccode1 , ccode2 , year , #> # dispongoing , disponset , sidea1 , sidea2 , #> # fatality1 , fatality2 , fatalpre1 , fatalpre2 , #> # hiact1 , hiact2 , 
hostlev1 , hostlev2 , orig1 , #> # orig2 , duplicated , fatality , hostlev , mindur , #> # maxdur , recip , stmon "},{"path":"http://svmiller.com/articles/different-data-types.html","id":"state-year-data","dir":"Articles","previous_headings":"","what":"State-Year Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"basic form data peacesciencer creates state-year, way create_stateyears(). create_stateyears() two arguments: system mry. system takes either “cow” “gw”, depending whether user wants Correlates War state years Gleditsch-Ward state-years. defaults “cow” absence user-specified override given prominence Correlates War data peace science ecosystem. mry takes logical (TRUE FALSE), depending whether user wants function extend recently concluded calendar year (2022). Correlates War state system data extend end 2016 Gleditsch-Ward state system extend end 2017. argument allow researcher extend data years, (reasonable) assumption fundamental composition changes state system since data sets last updated. mry defaults TRUE absence user-specified override. create Correlates War state-year data 1816 2022. 
create Gleditsch-Ward state-year data 1816 2017.","code":"create_stateyears() #> # A tibble: 17,121 × 3 #> ccode statenme year #> #> 1 2 United States of America 1816 #> 2 2 United States of America 1817 #> 3 2 United States of America 1818 #> 4 2 United States of America 1819 #> 5 2 United States of America 1820 #> 6 2 United States of America 1821 #> 7 2 United States of America 1822 #> 8 2 United States of America 1823 #> 9 2 United States of America 1824 #> 10 2 United States of America 1825 #> # … with 17,111 more rows create_stateyears(system = \"gw\", mry = FALSE) #> # A tibble: 17,767 × 3 #> gwcode statename year #> #> 1 2 United States of America 1816 #> 2 2 United States of America 1817 #> 3 2 United States of America 1818 #> 4 2 United States of America 1819 #> 5 2 United States of America 1820 #> 6 2 United States of America 1821 #> 7 2 United States of America 1822 #> 8 2 United States of America 1823 #> 9 2 United States of America 1824 #> 10 2 United States of America 1825 #> # … with 17,757 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"dyad-year-data","dir":"Articles","previous_headings":"","what":"Dyad-Year Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"create_dyadyears() one useful functions peacesciencer, transforming raw Correlates War state system data (cow_states peacesciencer) Gleditsch-Ward state system data (gw_states) possible dyad-years. three arguments. system mry operate create_stateyears(). additional argument—directed—also takes logical (TRUE FALSE). default TRUE, returning directed dyad-year data (useful dyadic conflict analyses initiator/target distinction matters). FALSE returns non-directed dyad-year data, useful cases initiator/target distinction matter researcher cares presence absence conflict. convention non-directed dyad-year data ccode2 > ccode1 underlying code create_dyadyears() simply takes directed dyad-year data chops half rule. 
Correlates War dyad-years 1816 2022. Gleditsch-Ward dyad-years temporal domain.","code":"create_dyadyears() #> Joining with `by = join_by(ccode1, ccode2, year)` #> # A tibble: 2,139,270 × 3 #> ccode1 ccode2 year #> #> 1 2 20 1920 #> 2 2 20 1921 #> 3 2 20 1922 #> 4 2 20 1923 #> 5 2 20 1924 #> 6 2 20 1925 #> 7 2 20 1926 #> 8 2 20 1927 #> 9 2 20 1928 #> 10 2 20 1929 #> # … with 2,139,260 more rows create_dyadyears(system = \"gw\") #> Joining with `by = join_by(gwcode1, gwcode2, year)` #> # A tibble: 2,089,826 × 3 #> gwcode1 gwcode2 year #> #> 1 2 20 1867 #> 2 2 20 1868 #> 3 2 20 1869 #> 4 2 20 1870 #> 5 2 20 1871 #> 6 2 20 1872 #> 7 2 20 1873 #> 8 2 20 1874 #> 9 2 20 1875 #> 10 2 20 1876 #> # … with 2,089,816 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"major-vs--major-dyad-years","dir":"Articles","previous_headings":"Dyad-Year Data","what":"Major vs. Major Dyad-Years","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"Consider section vignette comparison kind dyad-year data EUGene create user, apparently request. EUGene apparently create types dyad-years specific dyad-year types whereas peacesciencer treats case exclusions can fact given functionality package. example, just major vs. major dyads. simplicity’s sake, directed dyad-years core (captured cow_ddy package shortcut).","code":"cow_ddy %>% add_cow_majors() %>% filter(cowmaj1 == 1 & cowmaj2 == 1) #> # A tibble: 6,140 × 5 #> ccode1 ccode2 year cowmaj1 cowmaj2 #> #> 1 2 200 1898 1 1 #> 2 2 200 1899 1 1 #> 3 2 200 1900 1 1 #> 4 2 200 1901 1 1 #> 5 2 200 1902 1 1 #> 6 2 200 1903 1 1 #> 7 2 200 1904 1 1 #> 8 2 200 1905 1 1 #> 9 2 200 1906 1 1 #> 10 2 200 1907 1 1 #> # … with 6,130 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"major-vs--any-state-dyad-years","dir":"Articles","previous_headings":"Dyad-Year Data","what":"Major vs. 
Any State Dyad-Years","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"dyad-years state major power.","code":"cow_ddy %>% add_cow_majors() %>% filter(cowmaj1 == 1 | cowmaj2 == 1) #> # A tibble: 183,722 × 5 #> ccode1 ccode2 year cowmaj1 cowmaj2 #> #> 1 2 20 1920 1 0 #> 2 2 20 1921 1 0 #> 3 2 20 1922 1 0 #> 4 2 20 1923 1 0 #> 5 2 20 1924 1 0 #> 6 2 20 1925 1 0 #> 7 2 20 1926 1 0 #> 8 2 20 1927 1 0 #> 9 2 20 1928 1 0 #> 10 2 20 1929 1 0 #> # … with 183,712 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"all-contiguous-dyad-years","dir":"Articles","previous_headings":"Dyad-Year Data","what":"All Contiguous Dyad-Years","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"dyad-years separated 400 miles water fewer, though documentation add_contiguity() cautions users least little critical contiguity data.","code":"cow_ddy %>% add_contiguity() %>% filter(conttype %in% c(1:5)) #> Joining with `by = join_by(ccode1, ccode2, year)` #> # A tibble: 82,440 × 4 #> ccode1 ccode2 year conttype #> #> 1 2 20 1920 1 #> 2 2 20 1921 1 #> 3 2 20 1922 1 #> 4 2 20 1923 1 #> 5 2 20 1924 1 #> 6 2 20 1925 1 #> 7 2 20 1926 1 #> 8 2 20 1927 1 #> 9 2 20 1928 1 #> 10 2 20 1929 1 #> # … with 82,430 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"all-dyad-years-within-a-set-distance","dir":"Articles","previous_headings":"Dyad-Year Data","what":"All Dyad-Years Within a Set Distance","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"dyad-years minimum distance user-specified threshold (kilometers). function lean add_minimum_distance(), side effect truncating left bound temporal domain —right now—1886. 
Correlates War dyad-years 1886 2019 separated 1,000 kilometers fewer.","code":"cow_ddy %>% add_minimum_distance() %>% filter(mindist <= 1000) #> Joining with `by = join_by(ccode1, ccode2, year)` #> # A tibble: 167,532 × 4 #> ccode1 ccode2 year mindist #> #> 1 2 20 1921 0 #> 2 2 20 1922 0 #> 3 2 20 1923 0 #> 4 2 20 1924 0 #> 5 2 20 1925 0 #> 6 2 20 1926 0 #> 7 2 20 1927 0 #> 8 2 20 1928 0 #> 9 2 20 1929 0 #> 10 2 20 1930 0 #> # … with 167,522 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"dyadic-dispute-year-data","dir":"Articles","previous_headings":"","what":"Dyadic Dispute-Year Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"Dyadic dispute-year data come pre-processed peacesciencer. Another vignette show transformed true dyad-year data, also available analysis. example, (directed) dyadic dispute-year Gibler-Miller-Little (GML) MID data available gml_dirdisp. , can add information dyadic dispute-years identify contiguity relationships Correlates War major status. Users interested Correlates War MID data available use cow_mid_dirdisps. 
Future updates may change object names better standardization, now.","code":"gml_dirdisp %>% add_contiguity() %>% add_cow_majors() #> Joining with `by = join_by(ccode1, ccode2, year)` #> # A tibble: 10,276 × 42 #> dispnum ccode1 ccode2 year midongoing midonset sidea1 sidea2 revst…¹ revst…² #> #> 1 2 2 200 1902 1 1 1 0 1 1 #> 2 2 200 2 1902 1 1 0 1 1 1 #> 3 3 300 345 1913 1 1 1 0 1 0 #> 4 3 345 300 1913 1 1 0 1 0 1 #> 5 4 200 339 1946 1 1 0 1 0 0 #> 6 4 339 200 1946 1 1 1 0 0 0 #> 7 7 200 651 1951 1 1 1 0 0 1 #> 8 7 200 651 1952 1 0 1 0 0 1 #> 9 7 651 200 1951 1 1 0 1 1 0 #> 10 7 651 200 1952 1 0 0 1 1 0 #> # … with 10,266 more rows, 32 more variables: revtype11 , revtype12 , #> # revtype21 , revtype22 , fatality1 , fatality2 , #> # fatalpre1 , fatalpre2 , hiact1 , hiact2 , #> # hostlev1 , hostlev2 , orig1 , orig2 , hiact , #> # hostlev , mindur , maxdur , outcome , settle , #> # fatality , fatalpre , stmon , endmon , recip , #> # numa , numb , ongo2010 , version , conttype , …"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"state-day-data","dir":"Articles","previous_headings":"","what":"State-Day Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"peacesciencer comes create_statedays() function. admittedly proof concept really difficult conjure many daily data sets peace science, certainly coverage 19th century. matter, create_statedays() create data. system mry arguments (defaults) create_stateyears(). Correlates War state-days 1816 2022. Gleditsch-Ward state-days temporal domain. can conjure application user may want think daily conflict episodes within Gleditsch-Ward domain. UCDP armed conflict data precise dates , say, Correlates War MID data, making analysis possible. However, conflict data 1946 reflect peacesciencer something like . 
require lubridate.","code":"create_statedays() #> # A tibble: 6,203,441 × 3 #> ccode statenme date #> #> 1 2 United States of America 1816-01-01 #> 2 2 United States of America 1816-01-02 #> 3 2 United States of America 1816-01-03 #> 4 2 United States of America 1816-01-04 #> 5 2 United States of America 1816-01-05 #> 6 2 United States of America 1816-01-06 #> 7 2 United States of America 1816-01-07 #> 8 2 United States of America 1816-01-08 #> 9 2 United States of America 1816-01-09 #> 10 2 United States of America 1816-01-10 #> # … with 6,203,431 more rows create_statedays(system = \"gw\") #> # A tibble: 6,765,801 × 3 #> gwcode statename date #> #> 1 2 United States of America 1816-01-01 #> 2 2 United States of America 1816-01-02 #> 3 2 United States of America 1816-01-03 #> 4 2 United States of America 1816-01-04 #> 5 2 United States of America 1816-01-05 #> 6 2 United States of America 1816-01-06 #> 7 2 United States of America 1816-01-07 #> 8 2 United States of America 1816-01-08 #> 9 2 United States of America 1816-01-09 #> 10 2 United States of America 1816-01-10 #> # … with 6,765,791 more rows create_statedays(system = \"gw\") %>% filter(year(date) >= 1946) #> # A tibble: 3,998,000 × 3 #> gwcode statename date #> #> 1 2 United States of America 1946-01-01 #> 2 2 United States of America 1946-01-02 #> 3 2 United States of America 1946-01-03 #> 4 2 United States of America 1946-01-04 #> 5 2 United States of America 1946-01-05 #> 6 2 United States of America 1946-01-06 #> 7 2 United States of America 1946-01-07 #> 8 2 United States of America 1946-01-08 #> 9 2 United States of America 1946-01-09 #> 10 2 United States of America 1946-01-10 #> # … with 3,997,990 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"state-month-data","dir":"Articles","previous_headings":"","what":"State-Month Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"State-months simple aggregations state-days. 
can accomplish extra commands create_statedays().","code":"create_statedays(system = \"gw\") %>% mutate(year = year(date), month = month(date)) %>% distinct(gwcode, statename, year, month) #> # A tibble: 222,370 × 4 #> gwcode statename year month #> #> 1 2 United States of America 1816 1 #> 2 2 United States of America 1816 2 #> 3 2 United States of America 1816 3 #> 4 2 United States of America 1816 4 #> 5 2 United States of America 1816 5 #> 6 2 United States of America 1816 6 #> 7 2 United States of America 1816 7 #> 8 2 United States of America 1816 8 #> 9 2 United States of America 1816 9 #> 10 2 United States of America 1816 10 #> # … with 222,360 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"state-quarter-data","dir":"Articles","previous_headings":"","what":"State-Quarter Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"assumption worth belaboring “quarter” look like general context, might look something like . , aggregation create_statedays().","code":"create_statedays(system = \"gw\") %>% mutate(year = year(date), month = month(date)) %>% filter(month %in% c(1, 4, 7, 10)) %>% mutate(quarter = case_when( month == 1 ~ \"Q1\", month == 4 ~ \"Q2\", month == 7 ~ \"Q3\", month == 10 ~ \"Q4\" )) %>% distinct(gwcode, statename, year, quarter) #> # A tibble: 74,079 × 4 #> gwcode statename year quarter #> #> 1 2 United States of America 1816 Q1 #> 2 2 United States of America 1816 Q2 #> 3 2 United States of America 1816 Q3 #> 4 2 United States of America 1816 Q4 #> 5 2 United States of America 1817 Q1 #> 6 2 United States of America 1817 Q2 #> 7 2 United States of America 1817 Q3 #> 8 2 United States of America 1817 Q4 #> 9 2 United States of America 1818 Q1 #> 10 2 United States of America 1818 Q2 #> # … with 74,069 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"leader-day-leader-month-leader-year-data","dir":"Articles","previous_headings":"","what":"Leader-Day 
(Leader-Month, Leader-Year) Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"peacesciencer leader-level units analysis well, can easily created modified Archigos (archigos) data peacesciencer. data version 4.1. create_leaderdays() create leader-day data archigos. want note one thing leader-level functions package. Whereas Correlates War state system membership often default system lot functions (prominently create_stateyears() create_dyadyears()), Gleditsch-Ward system default system state system around Archigos project created leader data. Moreover, leader data aren’t exactly tethered Gleditsch-Ward state system dates either (e.g. leader entries Gleditsch-Ward states aren’t system yet). case like , can standardize leader data either Correlates War system Gleditsch-Ward system standardize argument. default, option “none” (.e. return available leader days recorded Archigos data). “cow” “gw” standardizes leader data Correlates War state system membership Gleditsch-Ward state system membership, respectively. user may want think additional post-processing top , enough get started. , process creates state-months can create something like leader-months. leader-years, pre-packaged peacesciencer function. package also adds information leader gender, approximation leader’s age year (.e. 
year - yrborn), running count (starting 1) leader’s tenure (years).","code":"archigos #> # A tibble: 3,409 × 11 #> obsid gwcode leadid leader yrborn gender startdate enddate entry exit #> #> 1 USA-1869 2 81dcc… Grant 1822 M 1869-03-04 1877-03-04 Regu… Regu… #> 2 USA-1877 2 81dcc… Hayes 1822 M 1877-03-04 1881-03-04 Regu… Regu… #> 3 USA-188… 2 81dcf… Garfi… 1831 M 1881-03-04 1881-09-19 Regu… Irre… #> 4 USA-188… 2 81dcf… Arthur 1829 M 1881-09-19 1885-03-04 Regu… Regu… #> 5 USA-1885 2 34fb1… Cleve… 1837 M 1885-03-04 1889-03-04 Regu… Regu… #> 6 USA-1889 2 81dcf… Harri… 1833 M 1889-03-04 1893-03-04 Regu… Regu… #> 7 USA-1893 2 34fb1… Cleve… 1837 M 1893-03-04 1897-03-04 Regu… Regu… #> 8 USA-1897 2 81dcf… McKin… 1843 M 1897-03-04 1901-09-14 Regu… Irre… #> 9 USA-1901 2 81dd2… Roose… 1858 M 1901-09-14 1909-03-04 Regu… Regu… #> 10 USA-1909 2 81dd2… Taft 1857 M 1909-03-04 1913-03-04 Regu… Regu… #> # … with 3,399 more rows, and 1 more variable: exitcode create_leaderdays() #> # A tibble: 5,298,380 × 5 #> obsid gwcode leader date yrinoffice #> #> 1 USA-1869 2 Grant 1869-03-04 1 #> 2 USA-1869 2 Grant 1869-03-05 1 #> 3 USA-1869 2 Grant 1869-03-06 1 #> 4 USA-1869 2 Grant 1869-03-07 1 #> 5 USA-1869 2 Grant 1869-03-08 1 #> 6 USA-1869 2 Grant 1869-03-09 1 #> 7 USA-1869 2 Grant 1869-03-10 1 #> 8 USA-1869 2 Grant 1869-03-11 1 #> 9 USA-1869 2 Grant 1869-03-12 1 #> 10 USA-1869 2 Grant 1869-03-13 1 #> # … with 5,298,370 more rows create_leaderdays(standardize = \"cow\") #> Joining with `by = join_by(gwcode, year)` #> Joining with `by = join_by(ccode, date)` #> # A tibble: 4,824,967 × 5 #> obsid ccode leader date yrinoffice #> #> 1 USA-1869 2 Grant 1869-03-04 1 #> 2 USA-1869 2 Grant 1869-03-05 1 #> 3 USA-1869 2 Grant 1869-03-06 1 #> 4 USA-1869 2 Grant 1869-03-07 1 #> 5 USA-1869 2 Grant 1869-03-08 1 #> 6 USA-1869 2 Grant 1869-03-09 1 #> 7 USA-1869 2 Grant 1869-03-10 1 #> 8 USA-1869 2 Grant 1869-03-11 1 #> 9 USA-1869 2 Grant 1869-03-12 1 #> 10 USA-1869 2 Grant 1869-03-13 1 #> # … with 
4,824,957 more rows create_leaderdays() %>% mutate(year = year(date), month = month(date)) %>% group_by(gwcode, obsid, year, month) %>% slice(1) #> # A tibble: 177,128 × 7 #> # Groups: gwcode, obsid, year, month [177,128] #> obsid gwcode leader date yrinoffice year month #> #> 1 USA-1869 2 Grant 1869-03-04 1 1869 3 #> 2 USA-1869 2 Grant 1869-04-01 1 1869 4 #> 3 USA-1869 2 Grant 1869-05-01 1 1869 5 #> 4 USA-1869 2 Grant 1869-06-01 1 1869 6 #> 5 USA-1869 2 Grant 1869-07-01 1 1869 7 #> 6 USA-1869 2 Grant 1869-08-01 1 1869 8 #> 7 USA-1869 2 Grant 1869-09-01 1 1869 9 #> 8 USA-1869 2 Grant 1869-10-01 1 1869 10 #> 9 USA-1869 2 Grant 1869-11-01 1 1869 11 #> 10 USA-1869 2 Grant 1869-12-01 1 1869 12 #> # … with 177,118 more rows create_leaderyears() #> # A tibble: 17,686 × 7 #> obsid leader gwcode gender leaderage year yrinoffice #> #> 1 USA-1869 Grant 2 M 47 1869 1 #> 2 USA-1869 Grant 2 M 48 1870 2 #> 3 USA-1869 Grant 2 M 49 1871 3 #> 4 USA-1869 Grant 2 M 50 1872 4 #> 5 USA-1869 Grant 2 M 51 1873 5 #> 6 USA-1869 Grant 2 M 52 1874 6 #> 7 USA-1869 Grant 2 M 53 1875 7 #> 8 USA-1869 Grant 2 M 54 1876 8 #> 9 USA-1869 Grant 2 M 55 1877 9 #> 10 USA-1877 Hayes 2 M 55 1877 1 #> # … with 17,676 more rows"},{"path":"http://svmiller.com/articles/different-data-types.html","id":"leader-dyad-year-data","dir":"Articles","previous_headings":"","what":"Leader Dyad-Year Data","title":"Create Different Kinds of Data in `{peacesciencer}`","text":"peacesciencer can also create leader dyad-year data way create_leaderdyadyears(). can see underlying code creating data. ’s lot code, take lot time run scratch, ensuing output large store R data object package CRAN hard-caps package size 5 MB. Instead, users want data first run download_extdata() first install update package. Therein, can run create_leaderdyadyears() create full universe leader dyad-year data.","code":"# create_leaderdyadyears() is effectively doing this. # Let's do the G-W leader dyad-year data for illustration's sake. 
# `download_extdata()` will download these data into the package directory. # Thus, it is *not* downloading the data fresh each time. the_url <- \"http://svmiller.com/R/peacesciencer/gw_dir_leader_dyad_years.rds\" readRDS(url(the_url)) %>% declare_attributes(data_type = \"leader_dyad_year\", system = \"gw\") #> # A tibble: 2,336,990 × 11 #> year obsid1 obsid2 gwcode1 gwcode2 gender1 gender2 leade…¹ leade…² yrino…³ #> #> 1 1870 AFG-1868 AUH-1… 700 300 M M 45 40 3 #> 2 1870 AFG-1868 BAV-1… 700 245 M M 45 39 3 #> 3 1870 AFG-1868 BRA-1… 700 140 M M 45 45 3 #> 4 1870 AFG-1868 CHN-1… 700 710 M M 45 35 3 #> 5 1870 AFG-1868 COS-1… 700 94 M M 45 39 3 #> 6 1870 AFG-1868 ECU-1… 700 130 M M 45 49 3 #> 7 1870 AFG-1868 GMY-1… 700 255 M M 45 73 3 #> 8 1870 AFG-1868 GRC-1… 700 350 M M 45 25 3 #> 9 1870 AFG-1868 IRN-1… 700 630 M M 45 39 3 #> 10 1870 AFG-1868 JPN-1… 700 740 M M 45 18 3 #> # … with 2,336,980 more rows, 1 more variable: yrinoffice2 , and #> # abbreviated variable names ¹leaderage1, ²leaderage2, ³yrinoffice1 # ^ compare with: # download_extdata() # create_leaderdyadyears()"},{"path":"http://svmiller.com/articles/joins.html","id":"left-outer-join","dir":"Articles","previous_headings":"","what":"Left (Outer) Join","title":"A Discussion of Various Joins in `{peacesciencer}`","text":"first type join important type join function peacesciencer. Indeed, almost every function package deals adding variables type data created peacesciencer includes . “left join” (left_join() dplyr), alternatively known “outer join” “left outer join” SQL context, type “mutating join” tidyverse context. plain English, left_join() assumes two data objects—“left” object (x) “right” object (y)—returns rows left object (x) matching information right object (y) set common matching keys (columns x y). simple example works peacesciencer context. Assume simple state-year data set United States (ccode: 2), Canada (ccode: 20), United Kingdom (ccode: 200) years 2016 2020. 
Recreating simple kind data problem R serve “left object” (x) simple example. Let’s assume ’re building toward kind state-year analysis describe manuscript accompanying package. example, canonical civil conflict analysis Fearon Laitin (2003) outcome varies year, several independent variables time-invariant serve variables making state--state comparisons model civil war onset (e.g. ethnic fractionalization, religious fractionalization, terrain ruggedness). similar manner, basic ranking United States, Canada, United Kingdom case. Minimally, United States scores “low”, Canada scores “medium”, United Kingdom scores “high” metric. variation time simple example. “right object” (y) want add “left object” serves main data frame. Notice x variable ranking information want. , however, matching observations state identifiers corresponding Correlates War state codes U.S., Canada, United Kingdom. left join (left_join()) merges y x, returning rows x matching information y based columns share common (: ccode). obviously simple example, scales well even additional complexity. example, let’s assume added simple five-year panel Australia (ccode: 900) “left object” (x). However, corresponding information Australia “right object” (y). left join produce circumstances. ranking Australia simple example, left join returns NAs (.e. missing values) Australia. original number rows x conditions unaffected. happen observation y corresponding match x? example, let’s assume y data also included ranking Denmark (ccode: 390), though Denmark appear x. happen circumstances. Notice output left join identical output . Australia x, y. Thus, rows Australia returned absence ranking information Australia y means variable NA Australia merge. Denmark y, x. left join returns rows x matching information y, absence observations Denmark x means nowhere ranking information go merge. 
Thus, Denmark’s ranking ignored.","code":"tibble(ccode = c(2, 20, 200)) %>% # rowwise() is a great trick for nesting sequences in tibbles # This parlor trick, for example, generates state-year data out of raw state # data in create_stateyears() rowwise() %>% # create a sequence as a nested column mutate(year = list(seq(2016, 2020))) %>% # unnest the column unnest(year) -> x x #> # A tibble: 15 × 2 #> ccode year #> #> 1 2 2016 #> 2 2 2017 #> 3 2 2018 #> 4 2 2019 #> 5 2 2020 #> 6 20 2016 #> 7 20 2017 #> 8 20 2018 #> 9 20 2019 #> 10 20 2020 #> 11 200 2016 #> 12 200 2017 #> 13 200 2018 #> 14 200 2019 #> 15 200 2020 tibble(ccode = c(2, 20, 200), ranking = c(\"low\", \"medium\", \"high\")) -> y y #> # A tibble: 3 × 2 #> ccode ranking #> #> 1 2 low #> 2 20 medium #> 3 200 high # alternatively, as I tend to do it: x %>% left_join(., y) left_join(x, y) #> Joining with `by = join_by(ccode)` #> # A tibble: 15 × 3 #> ccode year ranking #> #> 1 2 2016 low #> 2 2 2017 low #> 3 2 2018 low #> 4 2 2019 low #> 5 2 2020 low #> 6 20 2016 medium #> 7 20 2017 medium #> 8 20 2018 medium #> 9 20 2019 medium #> 10 20 2020 medium #> 11 200 2016 high #> 12 200 2017 high #> 13 200 2018 high #> 14 200 2019 high #> 15 200 2020 high tibble(ccode = 900, year = c(2016:2020)) %>% bind_rows(x, .) 
-> x x #> # A tibble: 20 × 2 #> ccode year #> #> 1 2 2016 #> 2 2 2017 #> 3 2 2018 #> 4 2 2019 #> 5 2 2020 #> 6 20 2016 #> 7 20 2017 #> 8 20 2018 #> 9 20 2019 #> 10 20 2020 #> 11 200 2016 #> 12 200 2017 #> 13 200 2018 #> 14 200 2019 #> 15 200 2020 #> 16 900 2016 #> 17 900 2017 #> 18 900 2018 #> 19 900 2019 #> 20 900 2020 left_join(x, y) #> Joining with `by = join_by(ccode)` #> # A tibble: 20 × 3 #> ccode year ranking #> #> 1 2 2016 low #> 2 2 2017 low #> 3 2 2018 low #> 4 2 2019 low #> 5 2 2020 low #> 6 20 2016 medium #> 7 20 2017 medium #> 8 20 2018 medium #> 9 20 2019 medium #> 10 20 2020 medium #> 11 200 2016 high #> 12 200 2017 high #> 13 200 2018 high #> 14 200 2019 high #> 15 200 2020 high #> 16 900 2016 NA #> 17 900 2017 NA #> 18 900 2018 NA #> 19 900 2019 NA #> 20 900 2020 NA tibble(ccode = 390, ranking = \"high\") %>% bind_rows(y, .) -> y y #> # A tibble: 4 × 2 #> ccode ranking #> #> 1 2 low #> 2 20 medium #> 3 200 high #> 4 390 high left_join(x, y) #> Joining with `by = join_by(ccode)` #> # A tibble: 20 × 3 #> ccode year ranking #> #> 1 2 2016 low #> 2 2 2017 low #> 3 2 2018 low #> 4 2 2019 low #> 5 2 2020 low #> 6 20 2016 medium #> 7 20 2017 medium #> 8 20 2018 medium #> 9 20 2019 medium #> 10 20 2020 medium #> 11 200 2016 high #> 12 200 2017 high #> 13 200 2018 high #> 14 200 2019 high #> 15 200 2020 high #> 16 900 2016 NA #> 17 900 2017 NA #> 18 900 2018 NA #> 19 900 2019 NA #> 20 900 2020 NA"},{"path":"http://svmiller.com/articles/joins.html","id":"why-the-left-join-in-particular","dir":"Articles","previous_headings":"Left (Outer) Join","what":"Why the Left Join, in Particular?","title":"A Discussion of Various Joins in `{peacesciencer}`","text":"interested user may ask ’s special kind join appears everywhere peacesciencer. One reply use left_join() part matter taste. 
just well vignette reference “right join”, mirror join “left join.” right join dplyr’s right_join(x,y) returns records y matching rows x common columns, though equivalency depend reversing order x y (.e. left_join(x, y) produces information right_join(y, x)). arrangement columns differ left_join() right_join() simple application even underlying information . Ultimately, tend think “left-handed” comes data management instruct students introduce data transformation R. like intuition, especially pipe-based workflow, start master data object top pipe keep “left” add information . benefit keeping units analysis (e.g. state-years simple setup) first columns user sees well. preferred approach data transformation left_join() recurs peacesciencer result. Beyond matter taste, left join everywhere peacesciencer project endeavors hard recreate appropriate universe cases interest user allow user add stuff see fit. create_stateyears() create entire universe state-years 1816 present state-year analysis. create_dyadyears() create entire universe dyad-years 1816 present dyad-year analysis. logic, implemented peacesciencer’s multiple functions, type data user wants create created . user want expand data , though user may want something like reduce full universe 1816-2020 state-years just 1946-2010. However, universe partially discarded, universe augmented expanded. mind, every function’s use left join assumes data object receives represents full universe cases interest researcher. left join just adding information , based matching information one many data sets. done carefully, left join dutiful way adding information data set without changing number rows original data set. 
number columns obviously expand, number rows unaffected.","code":""},{"path":"http://svmiller.com/articles/joins.html","id":"potential-problems-of-the-left-join","dir":"Articles","previous_headings":"Left (Outer) Join","what":"Potential Problems of the Left Join","title":"A Discussion of Various Joins in `{peacesciencer}`","text":"“done carefully” heavy-lifting last sentence. , let explain situations left join produce problems researcher (even join supposed operational standpoint). first less problem, least implemented peacesciencer, caution. example, panel consists just U.S., Canada, United Kingdom, Australia. happen ranking Denmark, Denmark wasn’t panel (effectively, exclusively) Anglophone states. Therefore, row created Denmark. important left join create rows Denmark, first place (.e. panel Denmark x merge). case, left join behaving . Denmark panel trying match information . peacesciencer circumvents issue creating universal data (e.g. state-years, dyad-years, available leader-years) user free subset see fit. Users run one “create” functions (e.g. create_stateyears(), create_dyadyears()) top script adding information left join, implemented everywhere package, building assumption universe cases interest user represented “left object” left outer join. Basically, expect left join create new rows x situation state represented y x. . type join assumes universe cases interest researcher already appear “left object.” second situation bigger problem. Sometimes, often bouncing information denominated Correlates War states Gleditsch-Ward states, unwanted duplicate observation data frame merged primary data interest user. Let’s go back simple example x y . Everything performs nicely, though Australia (x) ranking Denmark (y) panel state-years wasn’t part original universe cases interest us. Let’s assume, however, mistakenly entered United Kingdom twice y. know data supposed simple state-level rankings. state supposed just . United Kingdom appears twice. 
left join y2 x, get unwelcome result. United Kingdom duplicated yearly observations. doesn’t matter duplicate ranking y2 UK . messier, sure, ranking different duplicate observation, matters duplicated. panel like , user careful effect overweighting observations duplicate. simple example like , subsetting just complete cases (.e. Australia ranking), UK 50% observations despite fact just third observations. ’s ideal researcher. peacesciencer goes beyond make sure doesn’t happen data creates. Functions aggressively tested make sure nothing duplicates, various parlor tricks (prominently group-slices) used internally cull duplicate observations. release function makes prominent use left join done assurance doesn’t create duplicate. matter, biggest peril left join researcher may want duplicate peacesciencer . Always inspect data merge, output.","code":"x #> # A tibble: 20 × 2 #> ccode year #> #> 1 2 2016 #> 2 2 2017 #> 3 2 2018 #> 4 2 2019 #> 5 2 2020 #> 6 20 2016 #> 7 20 2017 #> 8 20 2018 #> 9 20 2019 #> 10 20 2020 #> 11 200 2016 #> 12 200 2017 #> 13 200 2018 #> 14 200 2019 #> 15 200 2020 #> 16 900 2016 #> 17 900 2017 #> 18 900 2018 #> 19 900 2019 #> 20 900 2020 y #> # A tibble: 4 × 2 #> ccode ranking #> #> 1 2 low #> 2 20 medium #> 3 200 high #> 4 390 high tibble(ccode = 200, ranking = \"high\") %>% bind_rows(y, .) -> y2 left_join(x, y2) %>% data.frame #> Joining with `by = join_by(ccode)` #> Warning in left_join(x, y2): Each row in `x` is expected to match at most 1 row in `y`. #> ℹ Row 11 of `x` matches multiple rows. #> ℹ If multiple matches are expected, set `multiple = \"all\"` to silence this #> warning. 
#> ccode year ranking #> 1 2 2016 low #> 2 2 2017 low #> 3 2 2018 low #> 4 2 2019 low #> 5 2 2020 low #> 6 20 2016 medium #> 7 20 2017 medium #> 8 20 2018 medium #> 9 20 2019 medium #> 10 20 2020 medium #> 11 200 2016 high #> 12 200 2016 high #> 13 200 2017 high #> 14 200 2017 high #> 15 200 2018 high #> 16 200 2018 high #> 17 200 2019 high #> 18 200 2019 high #> 19 200 2020 high #> 20 200 2020 high #> 21 900 2016 <NA> #> 22 900 2017 <NA> #> 23 900 2018 <NA> #> 24 900 2019 <NA> #> 25 900 2020 <NA>"},{"path":"http://svmiller.com/articles/joins.html","id":"semi-join","dir":"Articles","previous_headings":"","what":"Semi-Join","title":"A Discussion of Various Joins in `{peacesciencer}`","text":"“semi-join” (semi_join() dplyr) returns rows left object (x) matching values right object (y). type “filtering join”, affects observations variables. appears just twice peacesciencer, serving final join create_leaderdays() create_leaderyears(). cases, serves means standardizing leader data (denominated Gleditsch-Ward system, necessarily Gleditsch-Ward system dates) Correlates War Gleditsch-Ward system. basic example semi-join context, illustration kind difficulties manifest standardizing Archigos’ leader data Correlates War state system. Assume simple state system just two states (“Lincoln” “Morrill”) two-week period start 1975 (Jan. 1, 1975 Jan. 14, 1975). simple system, “Lincoln” state full two week period (Jan. 1-Jan. 14) whereas “Morrill” state just first seven days (Jan. 1-Jan. 7) , let’s say, “Lincoln” occupied “Morrill” ended statehood. also happened leader data two states. two week period, leader data suggests “Lincoln” just one continuous leader (“Archie”) whereas “Morrill” three. “Brian” leader “Morrill” retired office replaced “Cornelius.” However, deposed “Lincoln” invaded “Morrill” replaced puppet head state, “Pete.” data look like . can use basic rowwise() transformation recast data daily, resulting state-day data leader-day data. 
wanted standardize leader-day data state system data, semi-join leader-day data (left object) state-day object (right object), returning just leader-day data valid days state system data. Notice Pete drops data , simple example, Pete puppet head state installed Archie “Lincoln” invaded occupied “Morrill”. semi-join simply standardizing leader data state system data, effectively ’s happening semi-joins create_leaderdays() (aggregation function: create_leaderyears()).","code":"tibble(code = c(\"Lincoln\", \"Morrill\"), stdate = make_date(1975, 01, 01), enddate = c(make_date(1975, 01, 14), make_date(1975, 01, 07))) -> state_system state_system #> # A tibble: 2 × 3 #> code stdate enddate #> #> 1 Lincoln 1975-01-01 1975-01-14 #> 2 Morrill 1975-01-01 1975-01-07 tibble(code = c(\"Lincoln\", \"Morrill\", \"Morrill\", \"Morrill\"), leader = c(\"Archie\", \"Brian\", \"Cornelius\", \"Pete\"), stdate = c(make_date(1975, 01, 01), make_date(1975, 01, 01), make_date(1975, 01, 04), make_date(1975, 01, 08)), enddate = c(make_date(1975, 01, 14), make_date(1975, 01, 04), make_date(1975, 01, 08), make_date(1975, 01, 14))) -> leaders leaders #> # A tibble: 4 × 4 #> code leader stdate enddate #> #> 1 Lincoln Archie 1975-01-01 1975-01-14 #> 2 Morrill Brian 1975-01-01 1975-01-04 #> 3 Morrill Cornelius 1975-01-04 1975-01-08 #> 4 Morrill Pete 1975-01-08 1975-01-14 state_system %>% rowwise() %>% mutate(date = list(seq(stdate, enddate, by = '1 day'))) %>% unnest(date) %>% select(code, date) -> state_days state_days %>% data.frame #> code date #> 1 Lincoln 1975-01-01 #> 2 Lincoln 1975-01-02 #> 3 Lincoln 1975-01-03 #> 4 Lincoln 1975-01-04 #> 5 Lincoln 1975-01-05 #> 6 Lincoln 1975-01-06 #> 7 Lincoln 1975-01-07 #> 8 Lincoln 1975-01-08 #> 9 Lincoln 1975-01-09 #> 10 Lincoln 1975-01-10 #> 11 Lincoln 1975-01-11 #> 12 Lincoln 1975-01-12 #> 13 Lincoln 1975-01-13 #> 14 Lincoln 1975-01-14 #> 15 Morrill 1975-01-01 #> 16 Morrill 1975-01-02 #> 17 Morrill 1975-01-03 #> 18 Morrill 1975-01-04 #> 19 Morrill 
1975-01-05 #> 20 Morrill 1975-01-06 #> 21 Morrill 1975-01-07 leaders %>% rowwise() %>% mutate(date = list(seq(stdate, enddate, by = '1 day'))) %>% unnest(date) %>% select(code, leader, date) -> leader_days leader_days %>% data.frame #> code leader date #> 1 Lincoln Archie 1975-01-01 #> 2 Lincoln Archie 1975-01-02 #> 3 Lincoln Archie 1975-01-03 #> 4 Lincoln Archie 1975-01-04 #> 5 Lincoln Archie 1975-01-05 #> 6 Lincoln Archie 1975-01-06 #> 7 Lincoln Archie 1975-01-07 #> 8 Lincoln Archie 1975-01-08 #> 9 Lincoln Archie 1975-01-09 #> 10 Lincoln Archie 1975-01-10 #> 11 Lincoln Archie 1975-01-11 #> 12 Lincoln Archie 1975-01-12 #> 13 Lincoln Archie 1975-01-13 #> 14 Lincoln Archie 1975-01-14 #> 15 Morrill Brian 1975-01-01 #> 16 Morrill Brian 1975-01-02 #> 17 Morrill Brian 1975-01-03 #> 18 Morrill Brian 1975-01-04 #> 19 Morrill Cornelius 1975-01-04 #> 20 Morrill Cornelius 1975-01-05 #> 21 Morrill Cornelius 1975-01-06 #> 22 Morrill Cornelius 1975-01-07 #> 23 Morrill Cornelius 1975-01-08 #> 24 Morrill Pete 1975-01-08 #> 25 Morrill Pete 1975-01-09 #> 26 Morrill Pete 1975-01-10 #> 27 Morrill Pete 1975-01-11 #> 28 Morrill Pete 1975-01-12 #> 29 Morrill Pete 1975-01-13 #> 30 Morrill Pete 1975-01-14 leader_days %>% semi_join(., state_days) %>% data.frame #> Joining with `by = join_by(code, date)` #> code leader date #> 1 Lincoln Archie 1975-01-01 #> 2 Lincoln Archie 1975-01-02 #> 3 Lincoln Archie 1975-01-03 #> 4 Lincoln Archie 1975-01-04 #> 5 Lincoln Archie 1975-01-05 #> 6 Lincoln Archie 1975-01-06 #> 7 Lincoln Archie 1975-01-07 #> 8 Lincoln Archie 1975-01-08 #> 9 Lincoln Archie 1975-01-09 #> 10 Lincoln Archie 1975-01-10 #> 11 Lincoln Archie 1975-01-11 #> 12 Lincoln Archie 1975-01-12 #> 13 Lincoln Archie 1975-01-13 #> 14 Lincoln Archie 1975-01-14 #> 15 Morrill Brian 1975-01-01 #> 16 Morrill Brian 1975-01-02 #> 17 Morrill Brian 1975-01-03 #> 18 Morrill Brian 1975-01-04 #> 19 Morrill Cornelius 1975-01-04 #> 20 Morrill Cornelius 1975-01-05 #> 21 Morrill Cornelius 1975-01-06 #> 22 
Morrill Cornelius 1975-01-07"},{"path":"http://svmiller.com/articles/joins.html","id":"anti-join","dir":"Articles","previous_headings":"","what":"Anti-Join","title":"A Discussion of Various Joins in `{peacesciencer}`","text":"anti-join another type filtering join, returning rows left object (x) without match right object (y). type join appears just peacesciencer. Prominently, peacesciencer prepares presents two data sets package—false_cow_dyads false_gw_dyads—represent directed dyad-years Correlates War Gleditsch-Ward systems active year, never time year. dyads context. created two scripts , year respective state system data, creates every possible daily dyadic pairing truncates dyads just least one day overlap. computationally demanding procedure compared peacesciencer (creates every possible dyadic pair given year, given state system data supplied ). However, creates possibility false dyads given year showed overlap. Consider case Suriname (115) Republic Vietnam (817) 1975 illustrative . Notice Suriname Republic Vietnam active 1975. Suriname appears Nov. 25, 1975 whereas Republic Vietnam exits April 30, 1975. However, daily overlap two exist point day 1975. false dyads. anti_join() used create_dyadyears() function remove observations presenting user. simple example anti-join examples mind.","code":"false_cow_dyads #> # A tibble: 60 × 4 #> ccode1 ccode2 year in_ps #>