# Algorithm Groups People More Fairly to Reduce AI Bias

## Fair clustering algorithms aim to make artificial intelligence more impartial

Illustration: iStockphoto

Say you work at a big company and you’re hiring for a new position. You receive thousands of resumes. To make a first pass, you may turn to artificial intelligence. A common AI task is called clustering, in which an algorithm sorts through a set of items (or people) and groups them into similar clusters.

In the hiring scenario, you might create clusters based on skills and experience and then hire from the top crop. But algorithms can be unfair. Even if you instruct them to ignore factors like gender and ethnicity, these attributes often correlate with factors you do count, leading to clusters that don’t represent the larger pool’s demographics. As a result, you could end up hiring only white men.

In recent years, computer scientists have constructed fair clustering algorithms to counteract such biases, and a new one offers several advantages over those that came before. It could improve fair clustering, whether the clusters contain job candidates, customers, sick patients, or potential criminals.

Most fair clustering algorithms try to balance only one “sensitive” attribute (such as gender if you want gender equality) at a time. If they can handle more, each attribute must be binary—male or female, say. The new method, described in a paper that will be presented at the Extending Database Technology conference in Copenhagen in April, handles multiple factors simultaneously, and those factors can be binary, numeric, or categorical.

This approach is based on a simple and common method called k-means clustering. In k-means clustering, you first convert each attribute to a number so that a data point representing, say, a person can be placed into a multi-dimensional space. Then if you want to form, for example, three clusters, you’d first pick three data points at random. Assign all the points nearest to each of those points to be in a cluster with it. For each cluster, create a new temporary point that is an average of all the points in the cluster. Once again, assign whatever points are nearest to a center point to be in a cluster with it. Repeat several times.

The new system, called FairKM, makes a critical change. When assigning a point to this or that cluster, it considers not only which center point it’s closest to, but also which cluster it will skew the least in terms of sensitive attributes. If the overall dataset is half female and has an average age of 40, and gender and age are sensitive attributes, the algorithm tries to make each cluster half female and give it an average age of 40. This is called demographic parity and is one of many (often mutually incompatible) forms of fairness.

The researchers, from the Indian Institute of Technology Madras and Queen’s University Belfast applied FairKM to two datasets. The first contained census data from about 15,000 individuals. The algorithm clustered people based on nine attributes including education and occupation, while balancing five sensitive ones including marital status and native country. The second dataset contained physics word problems, which the system clustered based on theme, while balancing the difficulty level.

[shortcode ieee-pullquote quote=""One challenge that we sidestep in this research is the question of fairness in data collection itself."" float="left" expand=1]

The researchers compared their system on both cluster quality and fairness with two benchmarks: k-means clustering that simply ignores the sensitive attributes, and a previously published fair clustering algorithm that balances only one sensitive attribute at a time (that algorithm and FairKM were both tested on one sensitive attribute at a time, even though FairKM balanced them all).

On the two datasets, FairKM could not match the blind k-means on quality, but unfairness was reduced by 30 to 90 percent (the exact number depends on the metric used). Compared with the algorithm that balanced a single sensitive attribute, the quality of results produced by the new approach was at least as high on most metrics, sometimes much higher, and unfairness was reduced by 50 to 85 percent.

Savitha Abraham, a computer scientist at the Indian Institute of Technology Madras and the paper’s primary author, says they’re working to improve the algorithm to handle data that’s even more skewed. They also want to speed it up. Deepak Padmanabhan, a computer scientist at Queen’s University Belfast and co-author of the paper, said, “One challenge that we sidestep in this research is the question of fairness in data collection itself,” but they’re working on technical fixes for that, too.

The method in the paper is “a nice direction to explore,” said Suman Bera, a computer scientist at the University of California, Santa Cruz who was not involved in the work, “and they show it works well in practice as well.” His own work on fair clustering takes a different approach, balancing clusters after k-means is finished forming them. He says his approach is more amenable to theoretical guarantees of fairness, but the two are hard to compare directly. “This is a very evolving field,” he says. “It’s an exciting time to work in this area.”

The Conversation (0)

## Why Functional Programming Should Be the Future of Software Development

### It’s hard to learn, but your code will produce fewer nasty surprises

Vertical
Shira Inbar
DarkBlue1

You’d expectthe longest and most costly phase in the lifecycle of a software product to be the initial development of the system, when all those great features are first imagined and then created. In fact, the hardest part comes later, during the maintenance phase. That’s when programmers pay the price for the shortcuts they took during development.

So why did they take shortcuts? Maybe they didn’t realize that they were cutting any corners. Only when their code was deployed and exercised by a lot of users did its hidden flaws come to light. And maybe the developers were rushed. Time-to-market pressures would almost guarantee that their software will contain more bugs than it would otherwise.

The struggle that most companies have maintaining code causes a second problem: fragility. Every new feature that gets added to the code increases its complexity, which then increases the chance that something will break. It’s common for software to grow so complex that the developers avoid changing it more than is absolutely necessary for fear of breaking something. In many companies, whole teams of developers are employed not to develop anything new but just to keep existing systems going. You might say that they run a software version of the Red Queen’s race, running as fast as they can just to stay in the same place.

It’s a sorry situation. Yet the current trajectory of the software industry is toward increasing complexity, longer product-development times, and greater fragility of production systems. To address such issues, companies usually just throw more people at the problem: more developers, more testers, and more technicians who intervene when systems fail.

Surely there must be a better way. I’m part of a growing group of developers who think the answer could be functional programming. Here I describe what functional programming is, why using it helps, and why I’m so enthusiastic about it.

## With functional programming, less is more

A good way to understand the rationale for functional programming is by considering something that happened more than a half century ago. In the late 1960s, a programming paradigm emerged that aimed to improve the quality of code while reducing the development time needed. It was called structured programming.

Various languages emerged to foster structured programming, and some existing languages were modified to better support it. One of the most notable features of these structured-programming languages was not a feature at all: It was the absence of something that had been around a long time— the GOTO statement.

The GOTO statement is used to redirect program execution. Instead of carrying out the next statement in sequence, the flow of the program is redirected to some other statement, the one specified in the GOTO line, typically when some condition is met.

The elimination of the GOTO was based on what programmers had learned from using it—that it made the program very hard to understand. Programs with GOTOs were often referred to as spaghetti code because the sequence of instructions that got executed could be as hard to follow as a single strand in a bowl of spaghetti.

Shira Inbar

The inability of these developers to understand how their code worked, or why it sometimes didn’t work, was a complexity problem. Software experts of that era believed that those GOTO statements were creating unnecessary complexity and that the GOTO had to, well, go.

Back then, this was a radical idea, and many programmers resisted the loss of a statement that they had grown to rely on. The debate went on for more than a decade, but in the end, the GOTO went extinct, and no one today would argue for its return. That’s because its elimination from higher-level programming languages greatly reduced complexity and boosted the reliability of the software being produced. It did this by limiting what programmers could do, which ended up making it easier for them to reason about the code they were writing.

Although the software industry has eliminated GOTO from modern higher-level languages, software nevertheless continues to grow in complexity and fragility. Looking for how else such programming languages could be modified to avoid some common pitfalls, software designers can find inspiration, curiously enough, from their counterparts on the hardware side.

## Nullifying problems with null references

In designing hardware for a computer, you can’t have a resistor shared by, say, both the keyboard and the monitor’s circuitry. But programmers do this kind of sharing all the time in their software. It’s called shared global state: Variables are owned by no one process but can be changed by any number of processes, even simultaneously.

Now, imagine that every time you ran your microwave, your dishwasher’s settings changed from Normal Cycle to Pots and Pans. That, of course, doesn’t happen in the real world, but in software, this kind of thing goes on all the time. Programmers write code that calls a function, expecting it to perform a single task. But many functions have side effects that change the shared global state, giving rise to unexpected consequences.

In hardware, that doesn’t happen because the laws of physics curtail what’s possible. Of course, hardware engineers can mess up, but not like you can with software, where just too many things are possible, for better or worse.

Another complexity monster lurking in the software quagmire is called a null reference, meaning that a reference to a place in memory points to nothing at all. If you try to use this reference, an error ensues. So programmers have to remember to check whether something is null before trying to read or change what it references.

Nearly every popular language today has this flaw. The pioneering computer scientist Tony Hoare introduced null references in the ALGOL language back in 1965, and it was later incorporated into numerous other languages. Hoare explained that he did this “simply because it was so easy to implement,” but today he considers it to be a “billion-dollar mistake.” That’s because it has caused countless bugs when a reference that the programmer expects to be valid is really a null reference.

Software developers need to be extremely disciplined to avoid such pitfalls, and sometimes they don’t take adequate precautions. The architects of structured programming knew this to be true for GOTO statements and left developers no escape hatch. To guarantee the improvements in clarity that GOTO-free code promised, they knew that they’d have to eliminate it entirely from their structured-programming languages.

History is proof that removing a dangerous feature can greatly improve the quality of code. Today, we have a slew of dangerous practices that compromise the robustness and maintainability of software. Nearly all modern programming languages have some form of null references, shared global state, and functions with side effects—things that are far worse than the GOTO ever was.

How can those flaws be eliminated? It turns out that the answer has been around for decades: purely functional programming languages.

Of the top dozen functional-programming languages, Haskell is by far the most popular, judging by the number of GitHub repositories that use these languages.

The first purely functional language to become popular, called Haskell, was created in 1990. So by the mid-1990s, the world of software development really had the solution to the vexing problems it still faces. Sadly, the hardware of the time often wasn’t powerful enough to make use of the solution. But today’s processors can easily manage the demands of Haskell and other purely functional languages.

Indeed, software based on pure functions is particularly well suited to modern multicore CPUs. That’s because pure functions operate only on their input parameters, making it impossible to have any interactions between different functions. This allows the compiler to be optimized to produce code that runs on multiple cores efficiently and easily.

As the name suggests, with purely functional programming, the developer can write only pure functions, which, by definition, cannot have side effects. With this one restriction, you increase stability, open the door to compiler optimizations, and end up with code that’s far easier to reason about.

But what if a function needs to know or needs to manipulate the state of the system? In that case, the state is passed through a long chain of what are called composed functions—functions that pass their outputs to the inputs of the next function in the chain. By passing the state from function to function, each function has access to it and there’s no chance of another concurrent programming thread modifying that state—another common and costly fragility found in far too many programs.

### Avoiding Null-Reference Surprises

A comparison of Javascript and Purescript shows how the latter can help programmers avoid bugs.

Functional programming also has a solution to Hoare’s “billion-dollar mistake,” null references. It addresses that problem by disallowing nulls. Instead, there is a construct usually called Maybe (or Option in some languages). A Maybe can be Nothing or Just some value. Working with Maybes forces developers to always consider both cases. They have no choice in the matter. They must handle the Nothing case every single time they encounter a Maybe. Doing so eliminates the many bugs that null references can spawn.

Functional programming also requires that data be immutable, meaning that once you set a variable to some value, it is forever that value. Variables are more like variables in math. For example, to compute a formula, y = x2 + 2x – 11, you pick a value for x and at no time during the computation of y does x take on a different value. So, the same value for x is used when computing x2 as is used when computing 2x. In most programming languages, there is no such restriction. You can compute x2 with one value, then change the value of x before computing 2x. By disallowing developers from changing (mutating) values, they can use the same reasoning they did in middle-school algebra class.

Unlike most languages, functional programming languages are deeply rooted in mathematics. It’s this lineage in the highly disciplined field of mathematics that gives functional languages their biggest advantages.

Why is that? It’s because people have been working on mathematics for thousands of years. It’s pretty solid. Most programming paradigms, such as object-oriented programming, have at most half a dozen decades of work behind them. They are crude and immature by comparison.

Imagine if every time you ran your microwave, your dishwasher’s settings changed from Normal Cycle to Pots and Pans. In software, this kind of thing goes on all the time.

Let me share an example of how programming is sloppy compared with mathematics. We typically teach new programmers to forget what they learned in math class when they first encounter the statement x = x + 1. In math, this equation has zero solutions. But in most of today’s programming languages, x = x + 1 is not an equation. It is a statement that commands the computer to take the value of x, add one to it, and put it back into a variable called x.

In functional programming, there are no statements, only expressions. Mathematical thinking that we learned in middle school can now be employed when writing code in a functional language.

Thanks to functional purity, you can reason about code using algebraic substitution to help reduce code complexity in the same way you reduced the complexity of equations back in algebra class. In non-functional languages (imperative languages), there is no equivalent mechanism for reasoning about how the code works.

## Functional programming has a steep learning curve

Pure functional programming solves many of our industry’s biggest problems by removing dangerous features from the language, making it harder for developers to shoot themselves in the foot. At first, these limitations may seem drastic, as I’m sure the 1960s developers felt regarding the removal of GOTO. But the fact of the matter is that it’s both liberating and empowering to work in these languages—so much so that nearly all of today’s most popular languages have incorporated functional features, although they remain fundamentally imperative languages.

The biggest problem with this hybrid approach is that it still allows developers to ignore the functional aspects of the language. Had we left GOTO as an option 50 years ago, we might still be struggling with spaghetti code today.

To reap the full benefits of pure functional programming languages, you can’t compromise. You need to use languages that were designed with these principles from the start. Only by adopting them will you get the many benefits that I’ve outlined here.

But functional programming isn’t a bed of roses. It comes at a cost. Learning to program according to this functional paradigm is almost like learning to program again from the beginning. In many cases, developers must familiarize themselves with math that they didn’t learn in school. The required math isn’t difficult—it’s just new and, to the math phobic, scary.

More important, developers need to learn a new way of thinking. At first this will be a burden, because they are not used to it. But with time, this new way of thinking becomes second nature and ends up reducing cognitive overhead compared with the old ways of thinking. The result is a massive gain in efficiency.

But making the transition to functional programming can be difficult. My own journey doing so a few years back is illustrative.

I decided to learn Haskell—and needed to do that on a business timeline. This was the most difficult learning experience of my 40-year career, in large part because there was no definitive source for helping developers make the transition to functional programming. Indeed, no one had written anything very comprehensive about functional programming in the prior three decades.

To reap the full benefits of pure functional programming languages, you can’t compromise. You need to use languages that were designed with these principles from the start.

I was left to pick up bits and pieces from here, there, and everywhere. And I can attest to the gross inefficiencies of that process. It took me three months of days, nights, and weekends living and breathing Haskell. But finally, I got to the point that I could write better code with it than with anything else.

When I decided that our company should switch to using functional languages, I didn’t want to put my developers through the same nightmare. So, I started building a curriculum for them to use, which became the basis for a book intended to help developers transition into functional programmers. In my book, I provide guidance for obtaining proficiency in a functional language called PureScript, which stole all the great aspects of Haskell and improved on many of its shortcomings. In addition, it’s able to operate in both the browser and in a back-end server, making it a great solution for many of today’s software demands.

While such learning resources can only help, for this transition to take place broadly, software-based businesses must invest more in their biggest asset: their developers. At my company, Panoramic Software, where I’m the chief technical officer, we’ve made this investment, and all new work is being done in either PureScript or Haskell.

We started down the road of adopting functional languages three years ago, beginning with another pure functional language called Elm because it is a simpler language. (Little did we know we would eventually outgrow it.) It took us about a year to start reaping the benefits. But since we got over the hump, it’s been wonderful. We have had no production runtime bugs, which were so common in what we were formerly using, JavaScript on the front end and Java on the back. This improvement allowed the team to spend far more time adding new features to the system. Now, we spend almost no time debugging production issues.

But there are still challenges when working with a language that relatively few others use—in particular, the lack of online help, documentation, and example code. And it’s hard to hire developers with experience in these languages. Because of that, my company uses recruiters who specialize in finding functional programmers. And when we hire someone with no background in functional programming, we put them through a training process for the first few months to bring them up to speed.

## Functional programming’s future

My company is small. It delivers software to governmental agencies to enable them to help veterans receive benefits from the U.S. Department of Veteran’s Affairs. It’s extremely rewarding work, but it’s not a lucrative field. With razor-slim margins, we must use every tool available to us to do more with fewer developers. And for that, functional programming is just the ticket.

It’s very common for unglamorous businesses like ours to have difficulty attracting developers. But we are now able to hire top-tier people because they want to work on a functional codebase. Being ahead of the curve on this trend, we can get talent that most companies our size could only dream of.

I anticipate that the adoption of pure functional languages will improve the quality and robustness of the whole software industry while greatly reducing time wasted on bugs that are simply impossible to generate with functional programming. It’s not magic, but sometimes it feels like that, and I’m reminded of how good I have it every time I’m forced to work with a non-functional codebase.

One sign that the software industry is preparing for a paradigm shift is that functional features are showing up in more and more mainstream languages. It will take much more work for the industry to make the transition fully, but the benefits of doing so are clear, and that is no doubt where things are headed.

This article appears in the December 2022 print issue as “A New Way to Squash Bugs.”