It’s a Big Data Wonderland for Ethics

Disney’s Alice in Wonderland

The world of big data feels more like Alice in Wonderland than a journey to my favorite website on the world wide web. As a user of multiple digital platforms, I am feeling increasingly uncomfortable with the amount of data I am handing over to organisations like Facebook and Google. Each time I accept terms and conditions, I wonder what I am giving up and what control I have over what’s collected. Is this organisation trustworthy? What will they do with what they know about me?

Facebook users share 30 billion pieces of content every month. An estimated 3.5 billion searches are conducted on Google every day. And if those numbers aren’t scary for the user, they certainly are for the white rabbit (the data analysts, to continue my analogy).

Disney’s White Rabbit

Forever running to catch up, data analysts have a seemingly insurmountable task, one that requires sophisticated AI and mathematical tools to interpret, analyse and discover value in big data. Digital ecosystems are creating new ethical challenges for data generation, data collection and data users. While privacy and protection policies abound, this user questions where the transparency, shared benefit and fairness reside. What about the ongoing generation, use and reuse of my data?

Is my browsing history as important as the Facebook posts I liked? Is the number of steps I completed or the hours I slept as important as the geotag on my phone when I commute? Should I place different values on each point of data I generate or create? And does my consent extend to understanding how my data is used across big data organisations? Do I really understand how my information is gathered, from my Instagram engagement to my Facebook news feed? Is the trust I have in Big Tech digital ecosystems misplaced? While I love the service provided, I wonder how comfortable I really am and whether my consent is really informed. What am I giving up? How will I know?

Words like “Privacy” and “Protection” are commonplace and founded in existing legislation and regulation governing data collection.

(Privacy Policy — Privacy & Terms — Google, 2011)

In the above statement from Google, it could be argued that Google has articulated and answered my concerns as a user. A brief scan of their privacy site reveals plain English statements about data collection, what they do with my data, and how to use privacy tools to control what I share with Google.

Disney’s Cheshire cat

Reviewed against the Data Standard Principles in development by the Australian Competition and Consumer Commission, it could also be argued that Google has “ticked” the boxes for a comprehensive approach to the proposed data principles. So why do I still feel like I am being watched by the Cheshire cat?

Data vs Information

The shift in regulatory frameworks from a focus on the gathering of “personal information” to “data” brings about a broader conversation on how to tackle the ethical dilemmas big data now poses. Floridi & Taddeo argue that there are levels of abstraction between the ethics of data, the ethics of algorithms, and the ethics of practices. (Floridi & Taddeo, 2016)

The intersection of ethics between those who generate data, acquire data and those who use data.

This is useful to understand as we measure how organisations deliver on their data practices within user expectations. Misuse of machine algorithms, for example, has limited job opportunities for individuals with mental health issues and favored higher rankings of popular posts over honest-to-goodness news. (O’Neil, 2016)

A recent example of the intersection of data ethics, algorithms and practices is the proposed News Media and Digital Platforms Mandatory Bargaining Code legislation. The amendment to the Competition and Consumer Act 2010 has Google executive Mel Silva championing what Google has built since 1998.

“Search engines (and the internet as a whole) are built on the ability to link someone to a website for free. You know how it goes. You search for a topic, and the results show up as a series of links and brief snippets of text, giving you an idea of your options before you decide whether to click through and spend your time (and potentially money) with that website or business.” Mel Silva, Managing Director, Google Australia in her open letter to Australia.

This might constitute a deep dive into the ethics of algorithms, but it also highlights that Google utilises the content produced by others (in this case Australian news outlets) and converts this to profit for advertisers, but even more so for its own bottom line. Users might initially have been convinced of the social benefits of speeding the growth of the digital economy, but the volume of big data and the size of the organisations involved again bring into question how user data and algorithms are impacting other industries.

Freedom of the internet may not be the best argument to make by Google when misinformation fueling “fake news” has the government questioning if they should have been keeping a closer eye on the self-regulation argument currently cited by Tech Giants. Especially when some digital platform users (especially those in the political landscapes) are bent on promotion of stories designed to engage our fears rather than inform the populace. (“ACCC vows to pursue Google’s ad dominance, as tech giant threatens to remove its search engine from Australia — ABC News,” 2021)

The increasing concern being voiced by individuals, businesses, academics and governments shows a distinct shift from what was socially preferable to what is now becoming socially acceptable. And here is where I question what Google would have to do to allay my fear that I am the guinea pig in a larger social experiment they are running to figure out how to make more money. Feeding this concern are multiple instances of research experiments conducted without users’ knowledge by organisations holding vast amounts of big data. (Angus, 2021)

The “tumbling down the rabbit hole” moment occurs when we examine how algorithms and their use are impacting what we view on a digital platform. (Levin, 2019) The ethics of this are now regularly called into question. (Naughton, 2019)

Tumbling down the rabbit hole

Google recently fired a lead data researcher after an unfavorable report on AI natural language technology. (Walker, 2020)

AI natural language models are taught using large volumes of web data. This creates risks when there is no context provided as part of the code. I think we can all agree that an AI machine that has not been taught to recognise racist language, or that adversely discriminates against individuals, is not worthy of a digitally literate society. (Hao, 2020)

How do we hold an organisation responsible for the impacts on individuals when there is a proprietary software argument to be made for not disclosing how machine processing is conducted? (Leprince-Ringuet, 2020)

So what could Google do to restore my faith in its search engine?

No one wants governments to act like the Queen of Hearts, but I am concerned that Google has forgotten what sustains their business — me the user and 3.5 billion others daily googling.

Does Google require regulators to force them to play fair?

What is fair to a user when my data is held by Google Search, YouTube, Google Home, Google Maps, Play Store and Google Photos? Or if I use Facebook, which also owns and gains data from Instagram, Messenger and WhatsApp?

The counter-argument offered by Google in Australia is that Google provides high value in exchange for the information it receives from users, and its privacy statement takes a “trust us we do not harm” beneficence approach. Can they back this up?

Google recently conducted experiments on users to measure the impacts of news businesses and Google Search on each other; again, informed consent was taken for granted. (Angus, 2021) This raises the question of whether my trust in my favorite search engine might be misplaced. The European Union seems to share my view, as antitrust fines totaled 8 billion across 2018 and 2019. (Nitasha Tiku, 2019)

The General Data Protection Regulation (GDPR), which took effect in May 2018, was the European Union’s move to toughen data protection for its citizens. The regulation redefines consent, privacy rights and data protection by design and default. Personal data definitions include direct and indirect data linkages. Data processing definitions include any action performed on data. But Google’s lack of transparency, information and consent resulted in a 50 million euro fine (The CNIL’s restricted committee imposes a financial penalty of 50 Million euros against GOOGLE LLC | CNIL, 2019) and another 600,000 euro fine in 2019 for not complying with the GDPR’s “right to be forgotten” principle. (Abnett, 2020)

Are we at the bottom yet?

In my travels as Alice, I found myself nodding in agreement to ethical guidelines that extend beyond privacy, governance and protection:

  • Fairness: respectful of the person behind the data being gathered.
  • Shared benefit: the concept that data is owned by the people who produce it and joint control is the next step in creating “fairness.”
  • Transparency: openness about how data is used and reused, with greater informed consent sought at different points in time as big data and digital platforms evolve.

Professor James Arvanitakis produced the following comparison table identifying the gaps in five ethical principles of big tech companies. (Arvanitakis, 2018)

In the Wonderland of Big Data, I think I will find my way home when shared benefit and fairness in data practice are applied, not just in company policies or government regulation, but in ways that are evident to me, the user on digital platforms. But for now I would settle for a little more transparency... and a little less “just trust us.”