Files used to compile monthly peacekeeping contribution statistics from PDFs, manipulate them into a MySQL database, and analyze them in R.
Step 1 - Run scraper.py through the ScraperWiki data store and export the extracted data as a SQLite db into the folder.
Step 2 - Run sqlite_csv.py to format the data and generate gender.csv.
Step 3.1 - Remove the commas from "Tanzania,", "Moldova," and "Macedonia,".
Step 3.2 - Replace blank entries with 0's.
Step 3.3 - Fix tccIso3Num entries with leading 0's.
Step 3.4 - Insert "," to assist in changing the file to a SQL statement.
Step 3.5 - Reformat the csv file as a SQL load statement and load it into the gender table in the main peacekeeping db:
    INSERT INTO gender (date, dateString, tcc, tccIso3Alpha, tccIso3Num, mission, ip_M, ip_F, ip_T, fpu_M, fpu_F, fpu_T, eom_M, eom_F, eom_T, troops_M, troops_F, troops_T) VALUES (...), (...)
    Note - strings are enclosed by '' and lines are separated by ,\n
Step 4 - Run the statements in sql_script.txt for each date extracted and export to csv's on the desktop, with the minimum date extracted in the @extracted_date field.
Step 5 - Convert contribution ints where value=0 to NA.
Step 6 - Run R script.R.
Step 7 - Upload the contents of tcc_files to the web server via ftp into the documents folder (archive the previous month's files into the archived folder).
Step 8 - Format to the JSON schema by hand in TextWrangler so that data.json.csv conforms to the schema contained in tcc_schema... Insert date objects at the end of tcc.json and upload into the appropriate directory on the web server.
Step 9 -
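
Steps 3.1 to 3.5 above are manual edits, and they could probably be scripted. The following is only a rough Python sketch of one way to do that, not the procedure actually used here. It rests on several assumptions that are not confirmed by this README: gender.csv has no header row, its columns are in the same order as the INSERT statement, the three country names contain an embedded comma (for example "Tanzania, United Republic of"), and "fixing" tccIso3Num means zero-padding it to three digits. The function names (clean_line, clean_row, sql_value, build_insert) are hypothetical.

```python
import csv

# Column order copied from the INSERT statement in step 3.5.
COLUMNS = [
    "date", "dateString", "tcc", "tccIso3Alpha", "tccIso3Num", "mission",
    "ip_M", "ip_F", "ip_T", "fpu_M", "fpu_F", "fpu_T",
    "eom_M", "eom_F", "eom_T", "troops_M", "troops_F", "troops_T",
]

# Assumption: these names carry an embedded comma in the raw export.
COMMA_NAMES = ("Tanzania,", "Moldova,", "Macedonia,")


def clean_line(line):
    """Step 3.1: drop the embedded comma from the three country names."""
    for name in COMMA_NAMES:
        line = line.replace(name, name[:-1])
    return line


def clean_row(row):
    """Steps 3.2 and 3.3 for one parsed row (a list of strings)."""
    # Step 3.2: blank entries become "0".
    row = [field.strip() or "0" for field in row]
    # Step 3.3 (assumed meaning): pad the numeric code to three digits.
    iso_idx = COLUMNS.index("tccIso3Num")
    row[iso_idx] = row[iso_idx].zfill(3)
    return row


def sql_value(field):
    """Quote a value as a SQL string literal, doubling any single quotes."""
    return "'" + field.replace("'", "''") + "'"


def build_insert(csv_text, table="gender"):
    """Steps 3.4 and 3.5: turn raw CSV text into one INSERT statement."""
    lines = [clean_line(ln) for ln in csv_text.splitlines() if ln.strip()]
    rows = [clean_row(r) for r in csv.reader(lines)]
    for r in rows:
        if len(r) != len(COLUMNS):
            raise ValueError("expected %d fields, got %d: %r"
                             % (len(COLUMNS), len(r), r))
    # One parenthesised tuple per row, separated by a comma and a newline.
    tuples = ["(" + ", ".join(sql_value(f) for f in r) + ")" for r in rows]
    return "INSERT INTO {} ({}) VALUES\n{};".format(
        table, ", ".join(COLUMNS), ",\n".join(tuples))
```

Calling build_insert(open("gender.csv").read()) would return the statement as a string to paste into a MySQL client. Every value is quoted as a string in this sketch, which relies on MySQL converting quoted numbers for numeric columns; if tccIso3Num is stored as an integer, the zero-padding has no effect.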