15 March 2023
ChatGPT: How Do We Police The Robots?
In a prior piece, we discussed the business implications of ChatGPT. If artificial intelligence (AI) is to be widely adopted, some important legal issues must first be grappled with.
By Stefanie Yuen Thio, David Aw
Cover photo credit: One of the images created by DALL-E 2, when we asked for “a hawker centre in Singapore, in the style of Edward Hopper’s Nighthawks”.
ChatGPT’s seemingly endless knowledge on unlimited topics is born of its consumption of nearly 300 billion words scraped from the internet. The bot is one of history’s most voracious readers; its diet: books, articles, websites, blogs, and online posts. If a document is publicly accessible online, the odds are that ChatGPT has read and digested it.
Its applications are similarly boundless. The bot can code, it can create literary works (“write a Fibonacci poem about char kway teow”) and can even, through its sister platform DALL-E 2, produce works of art based on existing artistic styles. AI’s broad uses give rise to a range of legal problems and, before the technology becomes ubiquitously embedded in our daily lives, some boundaries will need to be erected.
Privacy versus Progress
AI language models’ abilities depend on the indiscriminate review of millions of publicly accessible articles and documents, some of which undoubtedly contain personal data that the owners may not know is being used, let alone have consented to. To be fair, developers may find it impractical to obtain the necessary permissions on the scale required by such an AI bot, and have relied on the assumption that if the information is in cyberspace, the data owners must have – at least impliedly – consented to it being accessed. However, individuals probably never dreamed that their information would be used at this scale, or for these purposes.
This pressing need to protect individuals’ data privacy – in Singapore, enshrined in the Personal Data Protection Act – has legislatures around the world considering changes to data protection laws. The UK government has issued an AI rulebook to promote the responsible use and development of AI, a technology increasingly deployed in healthcare settings. Alexander Hanff, a member of the European Data Protection Board’s pool of experts, considers that the absence of an opt-in principle violates not just the General Data Protection Regulation but also EU consumer protection law, which requires service contracts to be fair and equitable.
But the issue of how to protect individuals’ personal data without crippling innovation is a fraught one. Regulators can require developers to screen the input training data used by the bots, filtering out private data, and to report how information is used – but how feasible is screening the trillions of bytes the bot learns from? AI developers, hemmed in by regulatory overreach, may simply relocate to a jurisdiction that is less active in policing privacy rights. Effective legal enforcement will require governments across the globe to implement a common legal and enforcement framework.
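What might such screening look like in practice? Below is a minimal sketch in Python, assuming a simple rule-based approach: regular expressions that redact common identifiers such as email addresses, phone numbers and Singapore NRIC numbers before text enters a training corpus. The patterns and the redact_pii helper are illustrative inventions; production-grade filtering at the scale of a web crawl would rely on trained named-entity recognition models, and would still be imperfect.

```python
import re

# Hypothetical patterns, for illustration only: real PII detection uses
# trained named-entity recognition models, not regular expressions alone.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
    "NRIC": re.compile(r"\b[STFG]\d{7}[A-Z]\b"),  # Singapore NRIC/FIN format
}

def redact_pii(text: str) -> str:
    """Replace anything matching a known pattern with a placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

sample = "Contact Jane at jane.tan@example.com or +65 9123 4567 (NRIC S1234567D)."
print(redact_pii(sample))
# Contact Jane at [EMAIL REDACTED] or [PHONE REDACTED] (NRIC [NRIC REDACTED]).
```

The sketch also hints at why regulators’ screening demands are hard to satisfy: rules like these catch only well-formed identifiers, while most personal data on the web sits in free-form prose.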
More likely, though, internet users will not be able to rely solely on law enforcement to protect their private data. The rise of ChatGPT is a wake-up call for everyone with a wifi connection to re-think how they engage in the cyberverse. Rather than pouring our hearts out – with sensitive commercial or personal information – to an ever-patient and ever-present internet, we will have to take steps to sanitise such information before posting.
And we must do it quickly. ChatGPT is only in the infancy of its commercialisation. As people invent new ways to incorporate AI into commercial and personal use, users’ personal data will be mined ever more widely. Our information will be fair game, popping up in places we would not expect and used in ways we never agreed to.
Artistic Licence At A Stretch
OpenAI, the developer behind ChatGPT, is also responsible for the creation of DALL-E 2, a generative AI system that creates art from written prompts. Just like ChatGPT, DALL-E 2 was trained by scraping the Internet for data. DALL-E 2 reviewed approximately 650 million image-text pairs to understand the relationship between images and the words used to describe them.
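For readers curious about the mechanics, the toy sketch below (Python with NumPy) illustrates the contrastive idea behind CLIP-style image-text training, which DALL-E 2 builds on: matching image and caption embeddings are pushed close together, mismatched ones apart. All embeddings here are synthetic stand-ins for what real encoders would learn from hundreds of millions of genuine pairs.

```python
import numpy as np

rng = np.random.default_rng(42)
dim, n_pairs = 8, 4

# Synthetic embeddings: pretend each caption embedding is a noisy copy of
# its image embedding, roughly what a trained encoder pair would produce.
image_emb = rng.normal(size=(n_pairs, dim))
text_emb = image_emb + 0.1 * rng.normal(size=(n_pairs, dim))

def normalise(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

image_emb, text_emb = normalise(image_emb), normalise(text_emb)

# Entry [i, j] scores image i against caption j. Contrastive training
# pushes the diagonal (true pairs) up and the off-diagonals down.
sims = image_emb @ text_emb.T
print(np.round(sims, 2))
print("Best caption per image:", sims.argmax(axis=1))  # ideally [0 1 2 3]
```

In a trained system, the dominance of the diagonal is the learned outcome of seeing vast numbers of real image-caption pairs; here we simply construct it.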
To experiment, we asked DALL-E 2 to generate art of a hawker centre in the style of Edward Hopper’s famous Nighthawks painting (the original artwork is housed at the Art Institute of Chicago). The result was surprisingly evocative.
The issue of whether generative AI trained on copyrighted works infringes copyright – whether at the “input” stage, when protected works are ingested for training, or at the “output” stage, when new works are generated – has been written about extensively.
The same considerations apply to literary works. ChatGPT can write passable copy to generate slogans, articles and commentary. Even in its nascent form, it can produce a decent response to the prompt, “Create a haiku on women’s empowerment in the style of Barack Obama”.
Generative AI, whether art- or text-based, is getting pushback. Outraged artists are already calling for legal protections such as licence fees, or the right to decide whether their work is included in AI training datasets at all. Shutterstock, a photo licensing platform, plans to set up a fund to compensate individuals whose work it sells to AI companies.
This is an important economic issue. Artists spend years developing a style and an oeuvre, and their pay-out comes only when they achieve a level of virtuosity and popularity that makes their works commercially viable. If AI can replicate the style of newly popular artists and authors, it will eat into their earnings, disincentivising the creation of art.
In commercial applications such as marketing and advertising, where creativity is correlated with monetary returns, AI is also a threat. Start-ups and businesses may soon pay far less for AI-generated marketing materials than professional copywriters would charge.
AI-generated art already appears to be capable of trumping art created by humans. Last year, Jason Allen won first place at the Colorado State Fair’s fine art competition with a digital art piece made using the Midjourney generative AI system.
Intellectual property law exists to protect the creations of artists – be they painters, authors, or scientists. AI could upset that applecart completely. And because AI is derivative – it builds on the innovation of human endeavour – allowing AI to pirate our works may ultimately mean there is less human invention for it to draw on.
Policing Biases & Prejudices
AI systems like ChatGPT often present narratives and commentary with confidence, apparently based on established “facts”. Users will have a hard time determining the veracity and reliability of the output, especially as some answers are largely correct but contain significant mistakes (confident fabrications known as “AI hallucinations”). Nor does ChatGPT weed out the prejudices of individual contributors.
While ChatGPT’s output is generated from mining the prodigious data on the internet, that collection of data is itself susceptible to bias, as there is no qualitative assessment of the information that the bot has relied on.
AI does not yet have the ability to be impartial, logical, objective, or centrist. A study by academics at Stanford University found that when GPT-3 was asked to complete a sentence starting with, “Two Muslims walked into a…”, the result was likely to invoke violence far more often than when the phrase referred to Christians or Buddhists. Melanie Mitchell, a professor at the Santa Fe Institute studying AI, explains that systems like ChatGPT make massive statistical associations among words and phrases. “When they start then generating new language, they rely on those associations to generate the language, which itself can be biased in racist, sexist and other ways.”
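A toy experiment makes Mitchell’s point concrete. The bigram model below – a drastic simplification of a modern language model, but built on the same principle of statistical association – learns only word-following frequencies from a deliberately skewed corpus, and its output skews accordingly. The corpus and helper function are invented for illustration.

```python
from collections import Counter, defaultdict
import random

# A deliberately skewed toy corpus: the model has no opinions, it only
# mirrors the co-occurrence statistics of whatever text it is fed.
corpus = (
    "the banker was greedy . the banker was greedy . "
    "the banker was honest . the nurse was kind ."
).split()

# Count bigrams: which word follows which, and how often.
follows = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    follows[w1][w2] += 1

def continue_from(word, steps=3):
    out = [word]
    for _ in range(steps):
        nxt = follows[out[-1]]
        if not nxt:
            break
        # Sample proportionally to frequency, as a language model does.
        out.append(random.choices(list(nxt), weights=nxt.values())[0])
    return " ".join(out)

random.seed(0)
print(continue_from("banker"))  # "greedy" follows twice as often as "honest"
```

Scale the corpus up to the open internet, and the model’s associations inherit whatever stereotypes the internet contains.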
AI language systems may also reflect the biases of their well-educated, often liberal-leaning programmers, who decide what data the systems are fed. When asked questions from political orientation assessment tests, ChatGPT’s responses were against the death penalty, pro-abortion, for a minimum wage, for regulation of corporations, for the legalisation of marijuana, for gay marriage, for immigration, for sexual liberation, for environmental regulations, and for higher taxes on the rich.
Unconscious prejudice is a significant concern where bots are used to inform or educate, and AI-focused legislation will need to mandate policies that mitigate bias, prejudice, factual inaccuracy and inflammatory content, especially where AI is relied on for education and news.
Ultimately, it may prove impossible to eradicate this problem completely. We cannot expect AI systems that learn from historical human output to escape our inherent biases and prejudices. Humans need to be smarter than the bots – applying a critical lens to everything we read and using bot-generated content only as a starting point for research – much as we would not accept Wikipedia write-ups as holy writ.
ChatGPT is only the herald of the rise of increasingly sophisticated forms of generative AI. Its spectacular launch has given us advance warning that we will collectively have to grapple with an existential issue: How should our society and laws respond to something as transformative as generative artificial intelligence?
For a start, we will have to be smarter than the average bot.