Data is never neutral

News Mar 3, 2021

That's why sometimes you have to gather your own

The consequence of believing in the myth that data (or AI) is neutral is that it discourages people from asking what agenda that data serves, or what biases it may have. This kind of critical thinking is essential in a world of automated decision making, in which that data is used to judge us and govern our working lives.

The so-called gig economy uses automated systems to manage workers (or contractors, depending upon the jurisdiction). However, there’s growing concern that these systems manipulate, exploit, and sometimes cheat the humans doing the labour.

The problem in attempting to prove this hypothesis is that the platform is always biased towards the platform owner. The data that is collected benefits the company, not the users, or in this case, the workers.

What can be done? Well, workers could collect their own data, so that their grievances or demands for justice are backed by evidence.

Armin Samii had been biking for UberEats for a few weeks last July when he accepted a delivery he estimated would take 20 minutes, tops. But the app led him up one of the steepest hills in Pittsburgh, a 4-mile one-way trip that clocked in at an hour. Then he noticed that Uber had only paid him for 1 mile—the distance between his origin and destination as the crow flies, but not as the man bikes.

“I’d only done 20 deliveries, and this already happened to me,” Samii says. “For people who do this full-time, are they going to look this deeply into this statement? Are they ever going to find this issue?”

Samii is a software engineer. So he created a Google Chrome extension, UberCheats, that helps workers spot pay discrepancies. The extension automatically extracts the start and end points of each trip and calculates the shortest travel distance between the two. If that distance doesn’t match up with what Uber paid for, the extension marks it for closer examination.
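The core comparison is simple enough to sketch. What follows is a hypothetical TypeScript illustration, not the extension’s actual code: routedMiles stands in for the shortest travel distance the real extension obtains from a routing service, and the tolerance is invented.

```typescript
// A minimal sketch of the kind of check UberCheats performs, not the
// extension's actual code. "routedMiles" stands in for the shortest travel
// distance a routing service would return; that call is assumed here.

interface Trip {
  id: string;
  start: { lat: number; lon: number };
  end: { lat: number; lon: number };
  paidMiles: number;   // the distance Uber actually paid for
  routedMiles: number; // shortest travel distance between start and end
}

// Great-circle ("as the crow flies") distance in miles: a hard lower bound
// on any real route, useful as a sanity check on routedMiles.
function haversineMiles(
  a: { lat: number; lon: number },
  b: { lat: number; lon: number }
): number {
  const R = 3958.8; // mean Earth radius in miles
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Flag trips where the paid distance falls short of the shortest travel
// distance (in Samii's case: paid for 1 mile, routed 4).
function flagUnderpaid(trips: Trip[], toleranceMiles = 0.1): Trip[] {
  return trips.filter(
    (t) =>
      t.routedMiles >= haversineMiles(t.start, t.end) && // sanity check
      t.routedMiles - t.paidMiles > toleranceMiles
  );
}
```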

So far, only a few hundred workers have installed UberCheats on their browsers. Samii doesn’t have access to how couriers are using the extension, but he says some have told him they’ve used it to find pay inconsistencies. Google briefly removed the extension last week when Uber flagged it as a trademark violation, but reversed its decision after Samii appealed.

What’s fascinating about this story is both the example of automated injustice resulting from the algorithm’s calculation error, and the company’s attempt to prevent workers from using this kind of data collection tool.

In this case the “organizer” was a software developer who had the literacy to create a tool that collects data to argue their case against the company. Unfortunately, most people do not have that literacy.

Perhaps a Union should?

At the very least to pool the value and power of that data?

But some workers have been drawn to homegrown tools built by other gig workers—and the idea that they might themselves profit off the information that companies collect about them. Driver’s Seat Cooperative launched in 2019 to help workers collect and analyze their own data from ride-hail and delivery apps like Uber, Lyft, DoorDash, and Instacart. More than 600 gig workers in 40 cities have pooled their information through the cooperative, which helps them decide when and where to sign on to each app to make the most money, and how much they are making, after expenses. In turn, the company hopes to sell the data to transportation agencies interested in learning more about gig work, and pass on the profits to cooperative members. Only one city agency, in San Francisco, has paid for the data thus far, for a local mobility study that sent $45,700 to Driver’s Seat.

Owen Christofferson has been driving for Uber and Lyft in Portland for six years and signed up to use Driver’s Seat when it launched. He hopes these sort of projects help all drivers keep better track of what they’re spending to earn. “It’s actually really, really complicated to figure out what you, as a gig worker using your own vehicle, are putting into this,” he says. “It's a kind of active disempowerment on the part of the companies, because many drivers might lack the resources or skills to understand their true hourly wage.”
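That arithmetic can be sketched in a few lines. The following is an illustrative TypeScript calculation of a net hourly wage; the per-mile vehicle cost is an assumption (roughly the 2021 IRS standard mileage rate), not a figure from Driver’s Seat.

```typescript
// Illustrative sketch of the "true hourly wage" arithmetic such tools
// help with; the per-mile cost is an assumed proxy for fuel, maintenance,
// and depreciation, not a figure from the cooperative.

interface Shift {
  grossPay: number;    // fares + tips, in dollars
  hoursWorked: number;
  milesDriven: number; // including unpaid miles between rides
}

function netHourlyWage(shift: Shift, costPerMile = 0.56): number {
  const expenses = shift.milesDriven * costPerMile;
  return (shift.grossPay - expenses) / shift.hoursWorked;
}

// Example: $120 gross over 6 hours and 90 miles looks like $20/hour,
// but nets out to roughly $11.60/hour once vehicle costs are counted.
console.log(netHourlyWage({ grossPay: 120, hoursWorked: 6, milesDriven: 90 }));
```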

In this case, however, I’m not entirely sure of the value that can be derived from drivers’ data (fractions of a penny per user, perhaps?). Similarly, awareness of how little they’re being paid means little without organized labour and a willingness to strike, both as a counter to the power of the company and as a means of enabling a stronger negotiating position.

An open source approach makes more sense if you really want to achieve scale across the gig economy.

An open source project called WeClock, launched by the UNI Global Union, seeks to help workers collect and then visualize data on their wages and working conditions, tapping into smartphone sensors to determine how often they’re sitting, standing, or traveling, and how they feel when they’re on the job. Once it’s collected, workers control their own information and how it's used, says Christina Colclough, who helped build the app and now runs an organizing consultancy called the Why Not Lab. “We don’t want to further the surveillance that workers are already subjected to,” she says. “We don’t want to be Big Tech with a conscience.”

Colclough hopes that, eventually, workers might use WeClock to show they’re working longer hours than agreed. For now, the app is being used by 15 freelance UK TV production workers, who say that production companies don’t always pay fair wages for all the work they do. The participants in the pilot use Apple Watches to track their movements while on set.
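Reducing that tracking to evidence of overtime is, computationally, a small step. Here is a minimal sketch, assuming the app can export timestamped work sessions; WeClock’s actual data model may differ.

```typescript
// A minimal sketch of the overtime case such tracking could support,
// under the assumption that the app yields timestamped work sessions.

interface WorkSession {
  start: Date;
  end: Date;
}

// Total tracked time across all sessions, in hours.
function hoursWorked(sessions: WorkSession[]): number {
  return sessions.reduce(
    (total, s) => total + (s.end.getTime() - s.start.getTime()) / 3_600_000,
    0
  );
}

// Evidence for a grievance: hours tracked beyond what was agreed.
function unpaidOvertime(sessions: WorkSession[], agreedHours: number): number {
  return Math.max(0, hoursWorked(sessions) - agreedHours);
}
```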

Overall these are great initiatives; however, on their own they pale in comparison to the data and power these companies and platforms possess.

There needs to be scale, whether cooperative, federated, or affiliated, that enables alternate data sets, with alternate agendas, to be available.

At the very least these efforts illustrate the inherent bias and agenda of data, and by extension of any AI application or algorithm.

The question isn’t whether there is bias, but what the bias is, and why. Or, alternatively, what the bias should be.

For designers of AI, this is an opportunity: to design systems with deliberate and specific biases that increase the performance and/or enjoyment of the system.

We’ll elaborate on this concept as part of our exploration of participatory AI. In the meantime, here’s more on the subject of unionizing gig workers.
