Commit

added functionality for creating master index CSV
tylerjthomas9 committed Jun 7, 2021
1 parent 5276987 commit 50ac150
Showing 1 changed file with 34 additions and 0 deletions.
34 changes: 34 additions & 0 deletions src/main_index.jl
@@ -0,0 +1,34 @@
using CSV
using DataFrames


"""
Create main index TSV file by combining all metadata files
Parameters
----------
metadata_folder::String
- Folder where metadata TSVs are stored
master_file::String
- TSV file name for combined metadata
Returns
----------
nothing
"""
function create_master_index(metadata_folder::String="../metadata/",
master_file::String="../metadata/master_idx.tsv")

# Read every pipe-delimited metadata file (skipping the combined index itself) into one DataFrame
metadata_files = [i for i in readdir(metadata_folder; join=true) if i!=master_file]
df = reduce(vcat, [DataFrame(CSV.File(i, delim="|")) for i in metadata_files])

# remove duplicates (nonunique flags the repeated rows, so indexing with it would keep only the duplicates)
df = unique(df)

# export df
CSV.write(master_file, df, delim="|")

return nothing

end
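
The duplicate-removal step depends on how DataFrames.jl flags repeated rows. The following is a minimal sketch, on a made-up three-row table that is not part of this commit, of what `nonunique` and `unique` return. The variable names (`dup_mask`, `only_duplicates`, `deduplicated`) are illustrative only.

```julia
using DataFrames

# Toy table with made-up values: the second row repeats the first.
df = DataFrame(id = [1, 1, 2], name = ["a", "a", "b"])

# `nonunique` marks a row `true` when an identical row appears earlier.
dup_mask = nonunique(df)

# Indexing with the mask selects only the repeated rows.
only_duplicates = df[dup_mask, :]

# `unique` keeps the first occurrence of every distinct row.
deduplicated = unique(df)
```

In this sketch `only_duplicates` holds the single repeated row, while `deduplicated` holds the two distinct rows, which is the behavior a "remove duplicates" step needs.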
