When you use AI, it is important to be aware of the choices you make and what you intend to do with the results. On this page you will find information about the things to consider before using an AI tool.
AI can help you improve your academic skills and, used well, it can save you a lot of time. However, there are some catches to using AI in your study programme. Is it allowed under the UU regulations? How do you take biased results into account? Who holds the copyright to generative AI output, and how do you cite it? And, last but not least, what impact does using AI have on the climate and our society?
You can also ask your questions about AI on the website ikhebeenvraagoverai.nl (in Dutch only). Utrecht researchers will try to answer all your questions.
On the next pages we will discuss these questions further, so you can make a conscious choice about how and when you want to use AI.
In the video below you will find a summary of important themes. (No English subtitles).
Utrecht University has guidelines for the use of generative AI; for these, visit the central UU page about Responsible AI.
There you will find information for UU students, lecturers, researchers and staff.
Also read the UU Ethical Code of Conduct for AI (in Dutch only) and the pages for students and lecturers about Generative AI in education.
Students are never allowed to hand in work that was produced entirely by generative AI as their own work.
Always ask your lecturer what is allowed for each assignment in relation to the use of and references to generative AI.
AI is very fast at searching large amounts of information, processing it and drawing conclusions from it. But these conclusions are not always correct. A famous example is Amazon's (since adjusted) recruitment tool from 2014, which used AI to evaluate the CVs of applicants for IT jobs. Although both women and men applied for the jobs, the AI strikingly often selected male applicants as the most suitable candidates. This had nothing to do with the quality of the female applicants, but with the data on which the AI was trained: successful applications at Amazon from the past. Because at that time mainly men applied and were hired, the AI treated the attribute 'male' as a good indicator of a successful candidate.
This is a classic example of bias: (un)conscious assumptions and prejudices about the world around us that influence our decisions and judgements. Bias occurs both in people and in AI. Unlike humans, however, AI cannot think critically and independently in an attempt to prevent bias. A program such as ChatGPT does not reflect on its own output; it merely mirrors, through its algorithms, the information it has been fed. To judge the reliability of the output, it is important to know how bias can arise in AI.
Bias can arise in several ways and places.
Bias in algorithms, introduced both by programmers and by the training input of end users, is the topic of much research, because the way algorithms work is not always transparent. As a result, it can be difficult to discover the cause of the bias.
Faulty training data and/or faulty modelling can even lead an AI to give wrong information, because its algorithms infer connections that do not exist.
You cannot entirely prevent bias in AI, but you can be aware that it is there; the sketch below illustrates how easily it creeps in.
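To make this concrete, here is a minimal, purely illustrative sketch in Python. The data are synthetic and the simple model is not Amazon's actual system; it only demonstrates how a skewed training set makes a model latch on to an irrelevant attribute:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Synthetic applicants: gender (0 = female, 1 = male) and a skill score
# that is distributed identically across genders.
gender = rng.integers(0, 2, n)
skill = rng.normal(0.0, 1.0, n)

# Invented "historical" hiring outcomes: in this fictional past, being
# male mattered far more than skill, mirroring a biased hiring history.
hired = (0.3 * skill + 2.0 * gender + rng.normal(0.0, 0.5, n)) > 1.0

# A model trained on that history reproduces the bias.
X = np.column_stack([gender, skill])
model = LogisticRegression().fit(X, hired)

print("weight on gender:", round(float(model.coef_[0][0]), 2))
print("weight on skill: ", round(float(model.coef_[0][1]), 2))

With this synthetic history, the learned weight on 'gender' dwarfs the weight on 'skill': the model has simply absorbed the bias of its training data.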
At this moment it is not clear whether AI-generated content is eligible for copyright protection. Copyright requires things like 'an original character', 'a personal stamp of the maker' or 'an intellectual creation'; in other words, human creativity. An AI tool cannot make creative choices, so it cannot create copyrighted material and cannot be a copyright holder.
The question is whether someone using an AI tool, for instance to generate a text or image, can claim copyright. For that, there must be enough human creativity, for instance in the design of the prompt or in the editing of the output afterwards. At this moment (2024) there are no unambiguous agreements to determine this.
However, the maker of a prompt may be regarded as the author of that prompt, and a prompt could potentially be considered a copyrighted work. As far as is known, there have been no rulings on this issue yet.
Within the European Union, training GenAI models on data is regarded as a form of text and data mining (TDM) and is therefore allowed. Outside the European Union the rules may differ.
If the TDM (or AI training) is carried out for scientific research, you cannot opt out. For other purposes, however, such as training commercial generative models, there is an opt-out option. Make sure you state this opt-out visibly and in a machine-readable manner in your publication. For now it is not clear how this works out in daily practice.
Example:
© 2024 (name of the author or organisation holding the copyright). Rights regarding Text & Data Mining and Machine Learning explicitly reserved.
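A machine-readable reservation can, for example, also be expressed in a website's robots.txt file. The lines below are illustrative only: crawler names differ per provider (GPTBot is the name OpenAI uses for its crawler) and conventions are still evolving.

User-agent: GPTBot
Disallow: /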
Be careful when sharing copyrighted material with an AI/GenAI tool, and also when sharing your own material for which you might want to claim copyright. Your ideas can be used to train the AI system and may then turn up in results without the origin being traceable.
Material generated with GenAI differs in many respects from material you take from work written by others, such as journal articles or books. In many cases, AI-generated output cannot be retrieved or reproduced afterwards. Nevertheless, you must refer to AI-generated content (such as text, images or code) the moment you cite, paraphrase or include this material in your own work (think, in the last case, of including images or video).
Not all of the well-known citation styles have official guidelines yet for referring to AI material, but they often give suggestions in blogs or via other channels. You will find an overview in the LibGuide Citing:
GenAI can also give you suggestions for sources and references you can use. Always check and verify them! There is a chance that these sources do not exist or that the source citations contain mistakes.
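To give an idea of what such a reference can look like: the APA style, for instance, has suggested (in a blog post) a format along these lines, where the date and version are illustrative:

OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.openai.com/chat

Always follow the exact format that your own citation style and your lecturer prescribe.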
Lecturers can ask for extra information when you use chatbots, for instance when you use an AI tool to check your spelling or to improve your research question. They can ask you to explain what you did and why, or ask you to hand in specific prompts. Check the course manual or ask your lecturer beforehand whether, how and where you must mention this information.
Never share privacy-sensitive information about others in AI tools, and be careful when sharing (privacy-)sensitive information about yourself. Also be aware that, in addition to the data you share yourself, an AI system collects a lot of data you do not know about.
Always check the conditions of use and the privacy policy of an AI tool before you use it, and think carefully about whether you want to share personal details with the Big Tech companies behind these tools.
For instance, in the Privacy Policy of OpenAI (ChatGPT) you can read the following:
For more information about personal data, watch the video What are personal data from the Utrecht University Data Privacy Handbook.
In this video, you learn to answer the question: how are humans involved in each phase of an AI system's life cycle?
AI has the potential to improve the world. For instance, AI applications can be used to detect diseases, make weather forecasts or optimise energy consumption. However, the use of AI also has an impact on the environment and society. Before a program like ChatGPT can answer your questions, many natural resources have been consumed.
Generating and using energy for AI results in nitrogen emissions. How high these emissions (and the carbon footprint of AI) are depends on how the energy is generated (for example green, gas, coal or nuclear energy), the hardware used in data centres, and the number of parameters in the LLM on which the AI is based. LLMs need a large amount of input data and training to generate meaningful output, so a great deal of computing power is required.
Much has been written about the ecological footprint of the training phase (see the Artificial Intelligence Index Report 2023), but the usage phase (inference phase) of AI also requires much energy (De Vries, 2023; Goldman Sachs Research, 2024). According to Goldman Sachs Research, a question to ChatGPT costs ten times as much energy as a simple Google search.
To prevent data centres from overheating, fresh water is used in cooling towers. Water is also used in manufacturing hardware for AI and, in some cases, in generating energy. Because of the increasing use of AI, more and more water is needed. Researchers from UC Riverside estimate that a two-week training of GPT-3 in Microsoft data centres in the US used approximately 700,000 litres of water (Li et al., 2023). Add to this the use of ChatGPT: the same study indicates that 20 to 50 questions to ChatGPT use about half a litre of water. The Riverside researchers expect that 4.2 to 6.6 billion cubic metres of water will be needed for AI in 2027.
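Read as a rough back-of-the-envelope calculation, those figures come down to about 10 to 25 millilitres of water per question (half a litre divided by 20 to 50 questions); this is an indication of scale rather than a precise measurement.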
The high consumption of energy and especially of fresh water has consequences for us humans, because drinking water is becoming increasingly scarce due to climate change. The workflow of AI itself also has an impact on our society. AI cannot train itself independently: people are needed to steer the application towards appropriate conclusions and to correct mistakes. Data must be cleaned and classified piece by piece. This microwork is also called ghost work, because it is difficult to monitor and regulate. Workers are paid per completed task, have no employment-law protection and have no contact with their managers. Although research by Erasmus University Rotterdam (Morgan et al., 2023) shows that much ghost work is also carried out in affluent countries, a large share of ghost workers are recruited in the Global South, where wages are low and where the countries themselves do not profit from the work (Chan et al., 2021).
Although several models predict a sharp rise in data centre energy consumption by 2030, steps are being taken to reduce AI's carbon footprint. New hyperscale data centres use energy much more efficiently. Techniques are being developed to optimise the precision of algorithms, and cooling systems are being designed to save as much water as possible. A project by Erasmus University Rotterdam maps ghost work to make the work and the working conditions visible. Green AI is on the rise, but there is still much to be done. As a consumer, you too determine the impact AI has on society and the environment. It is therefore important to understand how to use AI tools efficiently, through prompt engineering and by selecting the right tool for each task.