Blog 3: Exceptions to Data-Driven Rules

Published on:

The Problem with Data-Driven Rules

News Article:
The Right to Be an Exception to a Data-Driven Rule

This article discusses how companies use data-driven rules to speed up processes like reviewing applications. If someone’s data doesn’t match the rule, they are denied some end result, such as having a loan approved or getting an in-person interview. The author argues that people have a right to be an exception to these rules, and that it can be dangerous for someone to be unfairly denied many different things because of data-driven rules.

A data-driven rule is a method of sorting that looks at a collection of data with different variables and decides which group a case belongs to, such as interview or no interview. A data-driven exception is a case that gets sorted into a group it does not belong in. An exception is essentially an error, because it shows that the rule is not perfect and has flaws in its criteria. I would personally not want to be an exception, because it is unfair to be denied something you deserve. This has big implications for discrimination, and makes me wonder how these rules could be changed to better serve the people they review. One option is to use the rule to filter out cases that don’t fit the criteria, but then have a human actually review the denied cases to make sure each one was rightfully denied.
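To make this concrete, here is a minimal sketch of what a data-driven rule and an exception to it might look like. The thresholds, applicants, and field names are all hypothetical and are not from the article; the point is just that a fixed rule can deny a deserving case because the variables it checks don’t capture that person’s situation.

```python
# A minimal sketch (hypothetical thresholds and applicants) of a data-driven
# rule screening loan applications, plus a case that becomes an "exception".

def data_driven_rule(applicant):
    """Approve only if income and credit score clear fixed cutoffs."""
    return applicant["income"] >= 50_000 and applicant["credit_score"] >= 680

applicants = [
    {"name": "A", "income": 72_000, "credit_score": 710, "deserving": True},
    {"name": "B", "income": 31_000, "credit_score": 590, "deserving": False},
    # An "exception": reliable in ways the rule's variables never capture,
    # but with a reported income just under the cutoff.
    {"name": "C", "income": 42_000, "credit_score": 705, "deserving": True},
]

for a in applicants:
    decision = "approve" if data_driven_rule(a) else "deny"
    if decision == "deny" and a["deserving"]:
        decision += "  <- exception: the rule sorts this case into the wrong group"
    print(a["name"], decision)
```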

So what differentiates these data-driven decisions from the ones a human would make? As the article states, humans go through the process slowly and make mistakes in a diverse way. If 10 humans review someone’s resume, chances are at least one of them will accept it, but if 10 computers apply the same data-driven rule, the resume will most likely be rejected, or accepted, all 10 times.
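A rough simulation of that point, using made-up numbers: ten copies of the same rule always agree on a borderline resume, while ten human reviewers, each with a slightly different personal threshold, usually disagree at least once.

```python
# Ten identical rules vs. ten noisy human reviewers (all numbers hypothetical).
import random

random.seed(0)
borderline_score = 64   # a resume score just below the cutoff
rule_cutoff = 65        # the data-driven rule's fixed threshold

# Ten copies of the same rule: the outcome is identical every time.
machine_votes = [borderline_score >= rule_cutoff for _ in range(10)]

# Ten humans: each applies a slightly different personal threshold.
human_votes = [borderline_score >= rule_cutoff + random.gauss(0, 3)
               for _ in range(10)]

print("machines accept:", sum(machine_votes), "of 10")  # always 0 of 10
print("humans accept:  ", sum(human_votes), "of 10")    # usually at least 1
```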

One way to improve the effectiveness of data-driven rules is to add individualization, which means tailoring a rule to specific cases. Individualization adds more nuance to the data-driven rule, in hopes of creating a rule that is more suitable for everyone. This can improve performance by making the sorting more accurate. However, as the article states, you can end up overfitting: “performing extremely well in training but poorly when deployed.”
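Here is a small sketch of that tradeoff, assuming scikit-learn and a synthetic dataset (none of this comes from the article): a coarse rule versus a fully individualized decision tree, where the deeper, more tailored tree looks better on its training data but tends to do worse on unseen cases.

```python
# Coarse rule vs. heavily individualized rule on synthetic data (sketch only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (2, None):  # depth 2 = coarse rule; None = fully grown tree
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    label = "coarse rule   " if depth else "individualized"
    print(label,
          "train acc:", round(tree.score(X_train, y_train), 2),
          "test acc:",  round(tree.score(X_test, y_test), 2))
```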

Data-driven rules are obviously not going to be 100% effective, which creates uncertainty. This is important to remember when the risk of an outcome is high. If a data-driven rule is being used for things like insurance or court sentences, the risk is high, and no amount of individualization will get the accuracy high enough. I don’t believe any metric such as accuracy can justify the uncertainty that comes with a data-driven rule. If the risk is low, some uncertainty is fine, but as risk grows, uncertainty becomes increasingly dangerous.
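One back-of-the-envelope way to think about this (all numbers below are hypothetical) is that the expected harm of a rule scales with both how often it is wrong and how much a wrong decision costs, so the same error rate that is tolerable for a low-stakes decision becomes alarming for a high-stakes one.

```python
# Expected harm = error rate * cost of a wrong decision (illustrative numbers).
error_rate = 0.05  # the rule is wrong 5% of the time, whatever the domain

scenarios = {
    "movie recommendation": 1,        # low cost if the rule is wrong
    "loan denial":          1_000,    # real financial harm
    "court sentencing":     100_000,  # life-altering harm
}

for name, cost_of_error in scenarios.items():
    expected_harm = error_rate * cost_of_error
    print(f"{name:22s} expected harm per decision: {expected_harm:,.2f}")
```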

After looking at this case study, I wonder: for whom do we check biases, and how should those who make data-driven rules check for them? How do we make sure we have checked for enough biases?

Some groups of people are more obvious than others to account for, but sometimes marginalized groups are harmed in ways that are less noticeable. This makes it hard to ensure that you’ve accounted for everything in the rule. If we could somehow quantify or measure how much a group is affected, we could weigh the costs and benefits of changing a model to account for more biases.
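One concrete way to start measuring this (the groups and decisions below are invented for illustration) is to compare error rates across groups, for example the rate at which deserving applicants from each group are wrongly denied, rather than relying on a single overall accuracy number.

```python
# Per-group false-denial rates as a simple bias check (hypothetical records).
from collections import defaultdict

# Each record: (group label, was denied by the rule, deserved approval).
records = [
    ("group_x", True,  True), ("group_x", False, True), ("group_x", False, False),
    ("group_x", False, True), ("group_y", True,  True), ("group_y", True,  True),
    ("group_y", False, False), ("group_y", True, False), ("group_y", True,  True),
]

false_denials = defaultdict(int)
deserving = defaultdict(int)
for group, denied, deserved in records:
    if deserved:
        deserving[group] += 1
        if denied:
            false_denials[group] += 1

for group in sorted(deserving):
    rate = false_denials[group] / deserving[group]
    print(f"{group}: wrongly denied {rate:.0%} of deserving applicants")
```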

Looking back at this case study, it gave me more insight into how models are trained, and gave me perspective on how certain characteristics can be targeted by data-driven rules. It was interesting to see that humans are also biased, but in a more random way, so they don’t cause consistent denial to particular groups. It’s also important to think about whether data-driven rules fit a scenario, because they can easily be too risky when the outcome for a given input is more consequential.