Data ethics (#4)
* typo fixed and detail added to model.matrix comparison

* added a new Rmd with content for data ethics chapter
schafert authored Aug 28, 2024
1 parent 233b9e0 commit 261300c
Showing 2 changed files with 124 additions and 1 deletion.
8 changes: 7 additions & 1 deletion 05_from-scratch-model.Rmd
@@ -62,6 +62,12 @@ hist(df$LogFare)
unique(df$Pclass) |> sort()
unique(df$Embarked) |> sort()
df |>
select(Sex, Pclass, Embarked) |>
mutate(Pclass = as.character(Pclass)) |>
model.matrix(object = ~.-1) |>
head()
df <- df |>
fastDummies::dummy_cols(select_columns = c("Sex", "Pclass", "Embarked"))
@@ -118,7 +124,7 @@ loss <- abs(preds - t_dep) |>
loss
```

- save useful functions for repition
- save useful functions for repetition

```{r}
117 changes: 117 additions & 0 deletions 26_data-ethics.Rmd
@@ -0,0 +1,117 @@
# Data Ethics

**Learning objectives:**

- Define ethics
- Provide examples of major themes in ML breaches of ethics
- Discuss mitigation strategies

## Ethics {-}

- "study of right and wrong"
+ How do we define those terms?
+ How do we recognize those actions?
+ How do the consequences of those actions show up?
- In the (philosophical) field, there is no consensus
- Best accomplished in a diverse team

## Prompts Going Forward {-}

- What could you have done in the situation?
- What kind of obstructions might have prevented you from getting that done?
- How would you deal with the obstructions?
- What would you look out for?

## Recourse and Accountability {-}

- We need mechanisms for audits and error correction
- We need to take responsibility for learning how our work will be implemented

Examples:

- Healthcare algorithm implemented in Arkansas
+ People received benefit cuts with no explanation
+ especially those impacted by diabetes and cerebral palsy
+ Court case revealed software was buggy

- Babies in gang members database

- US credit report system

## Feedback Loops {-}

- Model controls future data collection design
+ reinforcement learning
- Predictions can reinforce actions taken in the real world
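
A minimal sketch of how such a loop can amplify a tiny initial difference. Everything here is hypothetical and not from the chapter: two items with identical assumed appeal, a "model" that always recommends the current click leader, and new clicks recorded only for the item that was shown.

```r
# Hypothetical starting click counts: item "a" leads by a single click
clicks <- c(a = 11, b = 10)
# Assumed true appeal is identical for both items
appeal <- c(a = 0.5, b = 0.5)

for (round in 1:20) {
  # The "model" recommends whichever item currently has the most clicks
  shown <- names(which.max(clicks))
  # Only the shown item can collect new data (100 impressions per round)
  clicks[shown] <- clicks[shown] + 100 * appeal[[shown]]
}

# Share of all recorded clicks per item after 20 rounds
share <- clicks / sum(clicks)
round(share, 3)
```

Although both items are equally appealing by construction, the item that started one click ahead ends up with almost all of the recorded clicks, because the model's own recommendations decided which data was collected.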

Examples:

- YouTube's recommendation algorithm led to a rise in conspiracy theory content
- YouTube's recommendation algorithm led to curated pedophile playlists
- Russia Today gaming the YouTube algorithm
- Positive example: Meetup doesn't use gender in its recommendation algorithm
- Facebook also recommends that members of one radical group join more such groups

## Bias {-}

- Types of bias:
+ historical bias
+ measurement bias
+ aggregation bias
+ representation bias

Examples:

- Google search: "historically Black names received advertisements suggesting that the person had a criminal record, whereas, white names had more neutral advertisements"

## Historical bias {-}

- people, processes, and society are biased
- Lots of examples of racial bias
- bias in society can lead to systematic bias in datasets (e.g., we don't measure people we are biased against)
- fixing ML problems caused by flawed input data is **hard**
- bias in the workforce can reinforce these biases

## Other biases {-}

Measurement bias: stroke prediction - data collected on people who use medical care

Aggregation bias: models aggregate in a way that doesn't incorporate all of the appropriate factors, interaction terms, or nonlinearities (Simpson's paradox?)
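
The Simpson's paradox connection can be illustrated with a small sketch. The data frame `toy` and its columns are made up for illustration and do not come from the chapter: within each group `y` falls as `x` rises, yet a model that pools the groups estimates a positive slope.

```r
# Hypothetical data: two groups with different baselines
toy <- data.frame(
  group = rep(c("A", "B"), each = 4),
  x = c(1, 2, 3, 4, 6, 7, 8, 9),
  y = c(5, 4, 3, 2, 10, 9, 8, 7)
)

# Slope from a single model that ignores group membership
pooled_slope <- coef(lm(y ~ x, data = toy))[["x"]]

# Slope from a separate model fitted within each group
by_group_slope <- sapply(
  split(toy, toy$group),
  function(d) coef(lm(y ~ x, data = d))[["x"]]
)

pooled_slope    # positive
by_group_slope  # negative in both groups
```

Under these assumptions, the aggregated model reports the opposite relationship from the one that holds inside every group, which is one way a missing factor can mislead.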

Representation bias: model amplifies a simple relationship (e.g., occupation and gender)

- More data isn't a panacea
- Better data descriptions, contexts, and decisions

## Why does this matter? {-}

- Extreme case: IBM and Nazi Germany
    + IBM provided data tabulation products necessary to track people on a massive scale in camps
+ Had a category for method of murder
+ CEO Watson was meeting with Hitler, but lower level employees building the products were not necessarily aware

- How would you feel? Would you want to know?
- Ask questions; if not satisfied with the answers, say "no"
- Algorithms and humans are not interchangeable

## Identifying and Addressing Ethical Issues {-}

A few steps we can take:

- Analyze a project you are working on
- Implement processes at your company to find and address ethical risks
- Support good policy
- Increase diversity

## Meeting Videos {-}

### Cohort 1 {-}

`r knitr::include_url("https://www.youtube.com/embed/URL")`

<details>
<summary> Meeting chat log </summary>

```
LOG
```
</details>
