diff --git a/02_activities/assignments/DC_Cohort/Assignment1.md b/02_activities/assignments/DC_Cohort/Assignment1.md
index f78778f5b..eb1fec78c 100644
--- a/02_activities/assignments/DC_Cohort/Assignment1.md
+++ b/02_activities/assignments/DC_Cohort/Assignment1.md
@@ -205,5 +205,9 @@ Consider, for example, concepts of fariness, inequality, social structures, marg
 
 ```
-Your thoughts...
+The article was published in 'Ideas' in November 2021 and authored by Rida Qadri, then a PhD candidate in MIT's program of urban information systems. In the article, Qadri shares the story of Riz, a Pakistani person who was excluded from the National Database and Registration Authority (NADRA) because their parents' marriages did not meet the rigid criteria of the new digitalized identification system. Being unidentifiable by NADRA left Riz unable to access social and welfare services and significantly constrained their freedom of movement. To my understanding, the main claim of the article is that social hierarchies, social perspectives, and political power relations are embedded in data systems. This claim echoes scholars in Science and Technology Studies (e.g., Bruno Latour, Donna Haraway) and the Philosophy of Technology (e.g., Andrew Feenberg), who argued decades ago that technologies, and specifically information and communication technologies, are sociotechnical objects.
+
+While I agree with the main argument, in this comment I will problematize some of the assumptions made in the article. I will then add some insights from my practice and research as a social worker/community organizer, focusing on the interplay between welfare/community practice and information technology. First, I would like to challenge the contrast between NADRA's lineage/genealogical system design and systems that identify individuals by organizing and processing biometric data (i.e., 'unique' physical, behavioural, and biological characteristics). While biometric data is seen as a more objective and reliable basis for identification, it is imperative to remember that, as sociotechnical objects, data systems (like any other artifact) can be used, and abused, for purposes that deviate from the original one. Thus, while biometric systems automate, streamline, and individualize the identification process, they have also been used to profile and surveil marginalized groups, including poor and racialized individuals and communities. This brings me to my second argument. Like Qadri, I support the call for a more reflexive (critical) approach to the politics and ethics embedded in data system design. However, it is crucial to remember that the way we structure and construct data (through schemas, permitted data types, and even the practice of surveys/questionnaires in social science) is only one side of the story. The other side is how we interpret the data produced by, and/or mined or retrieved from, these systems and databases. In her book, Virginia Eubanks (2018) traces how these two practices, structuring data systems and interpreting their outputs, are increasingly used to profile and punish the poor in the U.S. (e.g., via decision algorithms in the child welfare system and in determining eligibility for allowances and social support).
+
+The freedom to interpret, however, can also work in another direction and prepare the ground for other social demands. In this regard, the example of Pakistan's Khawaja Sira community is essential. It shows that databases and information technologies can be a site for political demands and social change (even if imperfect). To motivate this change, it is essential to remember that we (still) have spaces (courts, streets, universities, community centers) that we should foster and take care of together to make these struggles for data justice, ownership, and sovereignty possible and effective.
 ```
diff --git a/02_activities/assignments/DC_Cohort/assignment1.sql b/02_activities/assignments/DC_Cohort/assignment1.sql
index c992e3205..2e70adb17 100644
--- a/02_activities/assignments/DC_Cohort/assignment1.sql
+++ b/02_activities/assignments/DC_Cohort/assignment1.sql
@@ -5,49 +5,126 @@
 --SELECT
 /* 1. Write a query that returns everything in the customer table. */
-
+SELECT *
+FROM customer;
 
 /* 2. Write a query that displays all of the columns and 10 rows from the customer table,
 sorted by customer_last_name, then customer_first_name. */
+SELECT *
+FROM customer
+ORDER BY customer_last_name, customer_first_name
+LIMIT 10;
 
 --WHERE
 /* 1. Write a query that returns all customer purchases of product IDs 4 and 9. */
+SELECT *
+FROM customer_purchases
+WHERE product_id = 4
+OR product_id = 9;
 
-
-/*2. Write a query that returns all customer purchases and a new calculated column 'price' (quantity * cost_to_customer_per_qty), 
+/*2. Write a query that returns all customer purchases and a new calculated column 'price' (quantity * cost_to_customer_per_qty),
 filtered by customer IDs between 8 and 10 (inclusive) using either:
 	1. two conditions using AND
 	2. one condition using BETWEEN
 */
 -- option 1
+SELECT
+quantity,
+cost_to_customer_per_qty,
+quantity * cost_to_customer_per_qty as price,
+product_id,
+market_date,
+vendor_id,
+customer_id,
+transaction_time
+
+FROM customer_purchases
+
+WHERE customer_id >= 8
+AND customer_id <= 10;
 
 -- option 2
+SELECT
+quantity,
+cost_to_customer_per_qty,
+quantity * cost_to_customer_per_qty as price,
+product_id,
+market_date,
+vendor_id,
+customer_id,
+transaction_time
+
+FROM customer_purchases
+
+WHERE customer_id BETWEEN 8 AND 10;
 
 --CASE
 /* 1. Products can be sold by the individual unit or by bulk measures like lbs. or oz.
 Using the product table, write a query that outputs the product_id and product_name
 columns and add a column called prod_qty_type_condensed that displays the word “unit”
 if the product_qty_type is “unit,” and otherwise displays the word “bulk.” */
+SELECT
+product_id,
+product_name,
+product_qty_type
+,CASE
+    WHEN product_qty_type = 'unit' THEN 'unit'
+    ELSE 'bulk'
+END as prod_qty_type_condensed
+FROM product;
 
 /* 2. We want to flag all of the different types of pepper products that are sold at the market.
 add a column to the previous query called pepper_flag that outputs a 1 if the product_name
 contains the word “pepper” (regardless of capitalization), and otherwise outputs 0. */
+SELECT
+product_id,
+product_name,
+product_qty_type
+
+,CASE
+    WHEN product_qty_type = 'unit' THEN 'unit'
+    ELSE 'bulk'
+END as prod_qty_type_condensed
+-- SQLite's LIKE is case-insensitive for ASCII letters by default
+,CASE
+    WHEN product_name LIKE '%pepper%' THEN 1
+    ELSE 0
+END as pepper_flag
+
+FROM product;
 
 --JOIN
 /* 1. Write a query that INNER JOINs the vendor table to the vendor_booth_assignments table on the
 vendor_id field they both have in common, and sorts the result by vendor_name, then market_date. */
+SELECT *
+
+FROM vendor as v
+INNER JOIN vendor_booth_assignments as vba
+    ON v.vendor_id = vba.vendor_id
+
+ORDER BY vendor_name, market_date;
 
 /* SECTION 3 */
@@ -56,14 +133,30 @@ vendor_id field they both have in common, and sorts the result by vendor_name, t
 /* 1. Write a query that determines how many times each vendor has rented a booth
 at the farmer’s market by counting the vendor booth assignments per vendor_id.
 */
-
+SELECT vendor_id
+,COUNT(market_date) as number_booth_renting
+FROM vendor_booth_assignments
+GROUP BY vendor_id;
 
 /* 2. The Farmer’s Market Customer Appreciation Committee wants to give a bumper
 sticker to everyone who has ever spent more than $2000 at the market. Write a query that generates a list
 of customers for them to give stickers to, sorted by last name, then first name.
 HINT: This query requires you to join two tables, use an aggregate function, and use the HAVING keyword. */
+SELECT
+cp.customer_id,
+customer_first_name,
+customer_last_name,
+SUM(quantity * cost_to_customer_per_qty) as total_spend
+
+FROM customer_purchases as cp
+INNER JOIN customer as c
+    ON c.customer_id = cp.customer_id
+
+GROUP BY c.customer_id
+HAVING total_spend > 2000
+
+ORDER BY customer_last_name, customer_first_name;
 
 --Temp Table
@@ -78,9 +171,23 @@ When inserting the new vendor, you need to appropriately align the columns to be
 VALUES(col1,col2,col3,col4,col5)
 */
+DROP TABLE IF EXISTS temp.new_vendor;
+
+CREATE TABLE temp.new_vendor AS
+
+SELECT *
+
+FROM vendor;
+
+INSERT INTO temp.new_vendor (vendor_id, vendor_name, vendor_type, vendor_owner_first_name, vendor_owner_last_name)
+VALUES(10, 'Thomass Superfood Store', 'Fresh Focused store', 'Thomas', 'Rosenthal');
+
+
+
+
 
--- Date
+-- Date [no need for this assignment]
 /*1. Get the customer_id, month, and year (in separate columns) of every purchase in the customer_purchases table.
 HINT: you might need to search for strftime modifiers sqlite on the web to know what the modifiers for month
diff --git a/02_activities/assignments/DC_Cohort/images/Assigment 1 logical diagram.jpg b/02_activities/assignments/DC_Cohort/images/Assigment 1 logical diagram.jpg
new file mode 100644
index 000000000..990230560
Binary files /dev/null and b/02_activities/assignments/DC_Cohort/images/Assigment 1 logical diagram.jpg differ