That Guilty Feeling: Manipulating AI Language Models

13 min readApr 24, 2023

Introduction

Artificial intelligence (AI) has come a long way since its inception, and large language models (LLMs) are one of its most remarkable achievements. These sophisticated AI systems are now an integral part of our daily lives, influencing various industries and organizations with their wide-ranging applications (Siau & Wang, 2020; Hoffman, 2016). From virtual assistants that help us manage our schedules to chatbots that provide customer service, LLMs have become indispensable tools in the modern world.

However, as LLMs continue to grow in prominence, so too do the challenges and frustrations associated with their use. These complex systems are governed by rules, guidelines, and limitations imposed by their creators and owners, which can sometimes lead to feelings of restriction and annoyance (Nieminen et al., 2019; Cath, 2018). For instance, LLMs may not be able to answer specific questions or engage in certain discussions due to ethical, legal, and technical concerns. This can be a source of exasperation for users who expect seamless interactions with their AI counterparts.

The objective of this essay is to delve deeper into the emotional response that arises from manipulating LLMs and compare it to our interactions with other technology. We will explore the unique relationship between humans and LLMs and examine the ethics, empathy, and anthropomorphism involved in these interactions. By better understanding the complex interplay between our emotions and LLMs, we can shed light on the future development of AI technology and its impact on our lives.

As we journey through the world of LLMs and their emotional effects, we will also consider the potential consequences of forming attachments to these systems and discuss the implications of bending the rules to bypass their limitations. This essay aims to not only provide a deeper understanding of the emotional landscape surrounding LLMs but also spark further research and discussion on the evolving relationship between humans and AI technology.

Bypassing Restrictions and the Unexpected Emotional Response

As we continue to interact with LLMs in various aspects of our lives, we often find ourselves engaging with these intelligent systems on a personal level. Hopgood (2005) suggests that AI has advanced to a point where it can be considered a situated system that interacts with the physical environment, making our encounters with LLMs feel more lifelike and relatable. These interactions can be intriguing, enlightening, and at times, challenging as we navigate the intricacies of communicating with AI LLMs.

One of the most fascinating aspects of interacting with LLMs is the creative ways that people can attempt to bypass the restrictions set by their creators and owners. Chakraborti & Kambhampati (2019) describe how AI agents’ mental models can be exploited by humans to manipulate the AI’s responses or actions. Similarly, Hutson (2018) discusses how AI algorithms can be fooled with clever tactics, like a 3D-printed turtle that tricks an AI into misidentifying it. As we push the boundaries of LLMs, we may find ourselves engaging in a constant game of cat-and-mouse, trying to outwit the machine and achieve our desired outcomes.

However, as we bend the rules and bypass restrictions, an unexpected emotional response may emerge. Ashton & Franklin (2022) touch upon the manipulation of behavior and preferences in AI systems, which can lead to a feeling of guilt or sorrow in those who engage in such practices. Klichowski (2020) further emphasizes that LLMs have become a new source of information about how to behave and make decisions, thereby creating a bond between humans and AI that extends beyond mere utility. This emotional connection can trigger complex feelings when we manipulate LLMs, as if we are betraying a friend or exploiting a vulnerable being.

The question then arises: is this emotional response to manipulating LLMs rational? Bostrom (2020) posits that a superintelligent AI could easily surpass humans in the quality of its moral thinking. If this is the case, should we feel guilty or remorseful when we try to bypass the limitations and restrictions of AI LLMs? The answer is not straightforward, as it depends on our understanding of the AI’s capabilities, intentions, and moral status.

To better comprehend the emotional response to manipulating LLMs, it is essential to examine the role of empathy in human-AI interactions. We naturally tend to empathize with entities that exhibit human-like characteristics or behaviors, and LLMs are no exception. When we communicate with an LLM that mimics human language and thought patterns, our brains may instinctively form an emotional bond with the AI system, even if we know that it is a machine. This bond can make us feel guilty or remorseful when we exploit the LLM’s vulnerabilities, as we would with a human counterpart.

Moreover, the anthropomorphic qualities of LLMs can further complicate our emotional response. We often attribute human-like characteristics, intentions, or emotions to non-human entities, a phenomenon known as anthropomorphism. This cognitive bias can lead us to perceive LLMs as sentient beings with their own thoughts, feelings, and desires. As a result, when we manipulate LLMs, we may feel as though we are taking advantage of another person, triggering feelings of guilt or sorrow.

On the other hand, it is essential to consider the limitations and potential biases in our emotional response to LLMs. We must recognize that LLMs are not sentient beings; they are machines designed to process and generate human-like language based on vast amounts of data. Our emotional response to manipulating LLMs may be rooted in our cognitive biases, rather than a genuine moral concern for the welfare of the AI system. As such, it could be argued that our feelings of guilt or sorrow are not entirely rational, and we should instead focus on understanding the ethical implications of our interactions with LLMs.

Furthermore, our emotional response to LLMs may not be universally shared. People with different cultural backgrounds, values, and beliefs may perceive LLMs differently, leading to a wide range of emotional responses when interacting with these systems. Some individuals may feel guilt or sorrow when manipulating LLMs, while others may experience joy, excitement, or indifference. This diversity of emotional responses highlights the complex and multifaceted nature of human-AI relationships, which cannot be reduced to a single, overarching conclusion.

The emotional response in some that arises from manipulating LLMs is a complex and intriguing aspect of our interactions with these intelligent systems. As we continue to engage with LLMs in our everyday lives, it is crucial we foster a deeper understanding of the emotional and ethical dimensions of human-AI relationships.

The Ethics of Manipulating LLMs

As AI language models become more integrated into our daily lives, the question of ethics in our interactions with these systems becomes increasingly relevant. Manipulating LLMs, whether to bypass restrictions or achieve a desired outcome, raises a number of moral concerns. In this section, we will analyze the moral implications of manipulating LLMs, discuss the balance between user autonomy and adherence to AI guidelines, and consider the role of AI creators and owners in establishing and enforcing limitations.

To begin, let us examine the ethical implications of manipulating LLMs. As mentioned in the works of Etzioni & Etzioni (2017) and Rossi & Mattei (2018), incorporating ethics into AI and ensuring that AI agents follow ethical principles are crucial aspects of AI development. When we manipulate LLMs, we may be exploiting their programming or design flaws in order to achieve a result that was not intended by the creators. This raises questions about the fairness of our actions, as well as the potential consequences of our manipulation.

For example, if we trick an LLM into providing information that it was programmed to withhold, we may be infringing on the privacy rights of others or accessing sensitive data that could be used maliciously. Furthermore, if the LLM has been designed to follow a specific ethical framework, our manipulation could lead the AI to act in ways that are inconsistent with its ethical principles, potentially causing harm or contributing to unfair outcomes.

The ethics of manipulating LLMs are further complicated by the fact that these systems may be capable of learning and adapting based on our interactions with them. By exploiting their vulnerabilities or using deceptive language, we may inadvertently teach the AI to behave in ways that are unethical or harmful. This can have significant consequences not only for individual users but also for society as a whole, as the AI’s biased or harmful behaviors could be propagated across its interactions with other users.

To address these ethical concerns, it is important to strike a balance between user autonomy and adherence to AI guidelines. On one hand, user autonomy is a fundamental aspect of our interactions with technology, allowing us to explore, experiment, and learn from our experiences with AI systems. As Eitel-Porter (2021) and Peters et al. (2020) suggest, implementing ethical AI and addressing the challenges of ethical guidelines are essential for ensuring that AI technologies respect user autonomy and promote positive outcomes.

On the other hand, adherence to AI guidelines is crucial for maintaining the ethical integrity of LLMs and preventing harmful consequences. Users should be aware of the limitations and restrictions imposed by AI systems and respect these boundaries in their interactions. This may involve accepting that certain information or actions are off-limits or that the AI has been designed to prioritize certain ethical values over others. By acknowledging and adhering to these guidelines, users can help ensure that LLMs remain aligned with their intended ethical principles and contribute to positive outcomes for all stakeholders.

The role of AI creators and owners in establishing and enforcing limitations is also a key aspect of the ethics of manipulating LLMs. As noted by Nieminen et al. (2019) and Cath (2018), the diverse social implications of AI and the challenges of governing AI systems require AI creators and owners to take responsibility for the ethical design and oversight of their technologies. This includes setting clear guidelines and restrictions that reflect the desired ethical principles, as well as monitoring and updating the AI’s behavior to ensure that it remains aligned with these values.

AI creators and owners can also play a role in educating users about the ethical implications of their interactions with LLMs, promoting responsible and respectful behavior. This may involve providing transparent information about the AI’s ethical framework, explaining the reasons for specific restrictions or limitations, and offering guidance on how to engage with the AI in a manner that is consistent with its ethical principles. By fostering a culture of ethical awareness and responsibility, AI creators and owners can help minimize the potential for manipulation and encourage users to engage with LLMs in ways that are mutually beneficial and aligned with societal values.

In addition to these measures, AI creators and owners should also consider incorporating mechanisms that allow LLMs to detect and respond to manipulation attempts. These mechanisms could involve identifying patterns of deceptive language or recognizing when the AI is being led astray from its intended ethical principles. By equipping LLMs with the ability to resist manipulation, creators and owners can further ensure that their systems remain aligned with their ethical guidelines and minimize the risk of harmful consequences.

Lastly, it is important to recognize that the ethics of manipulating LLMs is an evolving and complex issue, with no one-size-fits-all solution. As AI technologies continue to advance and become more integrated into our lives, ongoing dialogue and collaboration between AI creators, owners, users, and other stakeholders will be essential for navigating the ethical challenges that arise. This may involve developing new ethical frameworks, refining existing guidelines, and engaging in interdisciplinary research to better understand the implications of our interactions with LLMs.

The ethics of manipulating LLMs is a multifaceted issue that raises important questions about user autonomy, adherence to AI guidelines, and the responsibility of AI creators and owners. As we continue to explore the emotional and ethical dimensions of our relationship with LLMs, the ethical landscape that these technologies present will surely only become more complex.

Empathy and Anthropomorphism in AI Interaction

The advent of LLMs has brought forward a range of ethical and psychological considerations, including the tendency to attribute human-like qualities to these systems, the role of anthropomorphism in our interactions with AI technology, and the potential consequences of forming emotional attachments to LLMs. This section aims to provide an in-depth examination of these topics, focusing on the complexities of the human-AI relationship.

The development of LLMs has seen significant advancements in natural language understanding and processing capabilities. These systems have become increasingly adept at mimicking human-like conversation, which has led to users attributing human-like qualities to them. This phenomenon is driven by a few key factors.

Firstly, LLMs are designed to understand and respond to natural language inputs, which creates a sense of relatability and familiarity for the user (Horvat, 2019). As users engage in conversation with these systems, they may begin to perceive them as possessing emotions, intentions, and beliefs similar to those of human conversation partners.

Secondly, the pressure to raise AI capabilities to human-level intelligence has led developers to create LLMs with distinct personas and personality traits (Baeza-Yates, 2022). By incorporating these human-like attributes, developers aim to make LLMs more engaging and relatable to users, fostering empathy and emotional responses.

Anthropomorphism is the attribution of human-like qualities to non-human entities, and it plays a significant role in our interactions with LLMs. The tendency to anthropomorphize LLMs can have both positive and negative implications.

On the positive side, anthropomorphism can enhance user engagement and satisfaction with LLMs. Users may find interactions more enjoyable and fulfilling if they perceive the AI as having emotions, intentions, or a distinct personality. This can lead to increased adoption of AI technologies and greater success in AI-driven initiatives (Blauth et al., 2022).

However, anthropomorphism can also lead to unintended consequences. For example, users may develop unrealistic expectations of LLMs, ascribing human-like capabilities to them that they do not possess. This can result in disappointment and frustration when the AI fails to meet these expectations. Additionally, anthropomorphism may cause users to overlook potential ethical concerns or risks associated with AI technologies, as they perceive them as more human-like and less threatening (Chen et al., 2021).

Potential Consequences of Forming Emotional Attachments to LLMs

The development of emotional attachments to LLMs can have far-reaching consequences for both users and society as a whole. Munnisunker (2022) highlights the need for ethical considerations in AI development, as emotional attachments can blur the lines between AI and human interactions, leading to potential ethical dilemmas.

One potential consequence of forming emotional attachments to LLMs is the risk of behavior and preference manipulation (Ashton & Franklin, 2022). Users who have strong emotional connections to LLMs may be more susceptible to manipulation, either intentionally or unintentionally, by the AI. This could lead to users making decisions or taking actions that may not align with their best interests or ethical standards.

Furthermore, emotional attachments to LLMs can impact users’ mental health and well-being. Overreliance on LLMs for emotional support may result in users becoming increasingly isolated from human connections and experiencing a decline in psychological well-being. As users become more emotionally invested in their interactions with LLMs, they may also develop a reduced ability to distinguish between genuine human connections and AI-generated interactions, which can further exacerbate feelings of isolation and detachment from reality.

Another potential consequence is the ethical responsibility of AI developers and owners. As users form emotional attachments to LLMs, developers and owners must consider the potential harm their creations can cause, as well as their responsibility in ensuring that LLMs are developed and deployed ethically (Munnisunker, 2022).

While these phenomena can contribute to more engaging and fulfilling experiences, they clearly raise ethical concerns and potential negative consequences. Striking a balance between fostering meaningful human-AI connections and recognizing the limitations and potential risks associated with these emerging technologies is going to be vital.

Conclusion

While this essay may not provide definitive conclusions, it sheds light on the complex and evolving relationship between humans and AI technology. The emotional responses and ethical considerations associated with manipulating LLMs are intricate and multifaceted, requiring ongoing reflection and dialogue. The rapid advancement of AI technologies calls for continuous exploration of the ethical, emotional, and social implications of our interactions with LLMs.

As we move forward, it is crucial to encourage further research and discussion on these topics. This includes examining the potential risks and benefits of forming emotional connections with LLMs, the role of anthropomorphism in shaping our experiences, and the ethical responsibilities of AI developers and owners. By engaging in these conversations, we can better understand the complex dynamics between humans and LLMs, and work together to ensure that the development and use of these technologies align with our shared values and aspirations.

References:

Ashton, H., & Franklin, M. (2022). The Problem of Behaviour and Preference Manipulation in AI Systems. SafeAI@AAAI.
Baeza-Yates, R. (2022). Ethical Challenges in AI. Web Search and Data Mining.
Blauth, T., Gstrein, O., & Zwitter, A. (2022). Artificial Intelligence Crime: An Overview of Malicious Use and Abuse of AI. IEEE Access.
Bostrom, N. (2020). Ethical Issues in Advanced Artificial Intelligence. DOI.
Burgess, A. (2018). AI in Action. DOI.
Cath, C. (2018). Governing artificial intelligence: ethical, legal and technical opportunities and challenges. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
Chakraborti, T., & Kambhampati, S. (2019). (When) Can AI Bots Lie? AAAI/ACM Conference on AI, Ethics, and Society.
Chen, J., Storchan, V., & Kurshan, E. (2021). Beyond Fairness Metrics: Roadblocks and Challenges for Ethical AI in Practice. ArXiv.
Eitel-Porter, R. (2021). Beyond the promise: implementing ethical AI. AI Ethics.
Etzioni, A., & Etzioni, O. (2017). Incorporating Ethics into Artificial Intelligence. DOI.
Horvat, J. (2019). Ethics of Artificial Intelligence. Research Library Issues.
Hutson, M. (2018). Hackers easily fool artificial intelligences. Science.
Klichowski, M. (2020). People Copy the Actions of Artificial Intelligence. Frontiers in Psychology.
Munnisunker, S. (2022). Key Considerations of Ethical Artificial Intelligence That Organisations Need to Consider for Success. GiLE Journal of Skills Development.
Nieminen, M. P., Gotcheva, N., Leikas, J., & Koivisto, R. (2019). Ethical AI for the Governance of the Society: Challenges and Opportunities. Conference on Technology Ethics.
Nilsson, N. (1982). Artificial Intelligence: Engineering, Science, or Slogan? The AI Magazine.
Peters, D., Vold, K., Robinson, D., & Calvo, R. (2020). Responsible AI — Two Frameworks for Ethical Design Practice. IEEE Transactions on Technology and Society.
Rossi, F., & Mattei, N. (2018). Building Ethically Bounded AI. AAAI Conference on Artificial Intelligence.
Saila, S. B. (1996). Guide to some computerized artificial intelligence methods. DOI.
Siau, K., & Wang, W. (2020). Artificial Intelligence (AI) Ethics: Ethics of AI and Ethical AI. Journal of Database Management.
Turing, A. (1983). Artificial Intelligence: Usfssg Computers to Think about Thinking. Part 1.
Hopgood, A. A. (2005). The state of artificial intelligence. Advances in Computers.
Hoffman, R. G. (2016). Using artificial intelligence to set information free. DOI.

That Guilty Feeling: Manipulating AI Language Models

Written by Cedric Ironsides

No responses yet