Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • LLMs are used to generate content for a range of tasks
  • Need to ensure models are aligned with human preferences and do not produce unsafe, inaccurate or toxic outputs
  • Alignment techniques can mitigate safety concerns and improve model capabilities
  • Personalising LLMs through micro-level preference learning processes may result in models better aligned with each user
  • Normative challenges in defining bounds of societally-acceptable and safe degree of personalisation

Paper Content

Introduction

  • LLMs have improved in recent years and are being used in a wide range of applications
  • ChatGPT was released in 2022 and reached over 100 million users in two months
  • LLMs are being aligned with human preferences using reward learning
  • More reward learning can lead to LLMs having polarised views on certain topics
  • Current implementations of reward learning are limited and rely on a small group of people
  • This paper presents a taxonomy and policy framework for explicit personalisation of LLMs
  • Personalised LLMs could provide tailored assistance and adapt to diverse groups
  • Personalised LLMs could also reinforce biases and narrow information diets
  • A policy framework is needed to govern personalised LLMs safely and ethically

Background

  • AI systems are being applied to complex tasks
  • Alignment is desirable to avoid undesirable behaviours
  • Alignment is a technical challenge
  • Recent works have tried to align LLMs with human preferences
  • Three axes of alignment: what, how, who
  • Implicit personalisation already occurs, because models are tuned to meet the expectations of non-representative crowdworkers

From implicit to explicit personalisation

  • Given the diversity of human values and preferences, and the importance of pragmatics for contextual understanding, the aim to fully align models across human populations may be a futile one.
  • Personalisation in internet technologies and NLP is not new.
  • The technical apparatus for effective feedback learning exists.
  • LLMs are designed for adaptation via transfer learning.
  • The HuggingFace hub is evidence of the demand for customisation.
  • Recent industry model developments and announcements suggest the imminent possibility of personalisation.

A taxonomy of the benefits and risks from personalised LLMs

  • Consider the effects of personalised LLMs at individual and societal levels
  • Benefits and risks of personalised LLMs are summarised in Table 1
  • Benefits and risks can be directly related to each other
  • Benefits and risks can accumulate at the societal level
  • Taxonomy constructed through increased stakeholder involvement, allowing for a more participatory and inclusive approach
  • Personalised LLMs can increase efficiency, utility, autonomy, and connection
  • Increased efficiency is the inverse of quality of service harms
  • Increased utility is the inverse of discrimination and exclusion harms
  • Increased autonomy is the inverse of representational harms
  • Increased connection is the inverse of emotional harms
  • Cost incurred by end-users in providing personalised feedback to an LLM

Societal level

  • Personalised LLMs may better adapt to the needs of marginalised communities
  • Personalised LLMs could improve access to resources and reduce cost
  • Personalisation democratises how values or preferences are embedded into an LLM
  • Personalised LLMs could improve work productivity
  • Personalised LLMs could entrench digital divide
  • Personalised LLMs could lead to polarisation and breakdown of shared social cohesion
  • Personalised LLMs could be used for malicious purposes
  • Personalised LLMs could lead to labour displacement
  • Personalised LLMs could have a large environmental impact

A three-tiered policy framework for personalised LLMs

  • Proposes a new policy framework for managing benefits and risks of personalised LLMs
  • Provides a principled and holistic way of deciding how personalisation should be managed

The limits of personalisation

  • Deciding limits of personalisation is a subjective and contentious decision
  • Need to decide which aspects of model behaviour can be personalised, and in what ways
  • Restrictions and requirements should be applied to different types of personalised LLM outputs

How people interact with LLMs

  • Machine learning workflows have changed over the past five years
  • Pre-trained models are used instead of training from scratch
  • Generative models are fine-tuned on in-domain data
  • Model provider, application provider and end-user are involved in creating an LLM application
  • Personalisation of LLMs needs to be managed to ensure benefits and mitigate risks

A three-tiered policy framework

  • Three-tiered policy framework proposed
  • Tier One: Immutable restrictions, e.g. terrorist content, CSAM, threats of violence or sexual assault
  • Tier Two: Optional restrictions and requirements, based on values and preferences of actor
  • Tier Three: Tailored requirements, user decides personal preferences within boundaries set by higher tiers
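
The precedence between the three tiers can be sketched in code. This is a minimal illustration, not an implementation from the paper: the category labels, the `resolve_policy` and `is_allowed` helpers, and the `tone`/`length` settings are all hypothetical names chosen for the example. It assumes Tier Two is set by a provider and that Tier Two requirements win over Tier Three preferences on conflict.

```python
# Tier One: immutable restrictions. The labels are hypothetical stand-ins
# for the examples given above (terrorist content, CSAM, threats).
TIER_ONE_BANNED = frozenset({"terrorist_content", "csam", "violent_threat"})


def resolve_policy(provider_restrictions, provider_requirements, user_preferences):
    """Combine the three tiers into one effective policy.

    provider_restrictions: extra banned categories (Tier Two, optional).
    provider_requirements: settings fixed by the provider (Tier Two).
    user_preferences: settings chosen by the end-user (Tier Three).
    """
    # Tier Two can only add restrictions; it can never remove a Tier One ban.
    banned = TIER_ONE_BANNED | frozenset(provider_restrictions)
    # Tier Three fills in settings; Tier Two requirements override on conflict.
    settings = {**user_preferences, **provider_requirements}
    return banned, settings


def is_allowed(category, banned):
    """Return True if an output category is not restricted by any tier."""
    return category not in banned


banned, settings = resolve_policy(
    provider_restrictions={"medical_advice"},
    provider_requirements={"tone": "formal"},
    user_preferences={"tone": "casual", "length": "short"},
)
```

In this sketch the end-user never touches the banned set, so no Tier Three preference can loosen a restriction set by a higher tier; the user only controls settings the provider has left open (here, `length`).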

Discussion

  • Personalisation of LLMs is a likely pathway for their continued expansion
  • Taxonomy of benefits and risks from personalised LLMs
  • Policy framework to govern benefits and risks
  • Assumption that personalised LLMs are technically feasible
  • Demand for customised and highly-adapted LLMs
  • Technical decision-points and engineering challenges
  • Challenges to implementing and enforcing policy framework
  • Plans for maturing research and iterating on taxonomy

Technical challenges

  • Learning from human feedback requires less data than pre-training
  • Exact amount of data needed for personalisation is unclear
  • Cold start problem exists when no feedback data points are available
  • Data format and quality are important
  • Model scale may not contribute significantly to performance
  • Alignment tax must be balanced to prevent overfitting
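
One common way to handle a cold start is to fall back on a population-level default and shift towards the user's own feedback as it accumulates. The sketch below illustrates that idea only; it is not a method described in the paper, and the `blended_score` function, its `prior_strength` parameter, and the use of scalar ratings are assumptions made for the example.

```python
def blended_score(user_ratings, population_mean, prior_strength=5.0):
    """Blend a user's mean rating with a population default.

    With no feedback the population default is returned unchanged.
    prior_strength (hypothetical) is the number of user ratings at which
    the user's own mean and the population default are weighted equally.
    """
    n = len(user_ratings)
    if n == 0:
        # Cold start: no personal signal yet, so use the shared default.
        return population_mean
    user_mean = sum(user_ratings) / n
    weight = n / (n + prior_strength)  # grows towards 1 as feedback accumulates
    return weight * user_mean + (1 - weight) * population_mean
```

A larger `prior_strength` makes the system slower to personalise, which trades off responsiveness against overfitting to a handful of early feedback points.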

Policy framework enforcement

  • Personalised LLMs have wide-reaching benefits and risks
  • Framework outlines properties for appropriate governance
  • Policy enforcement should minimise both violations and friction
  • Must comply with existing regulations
  • Difficult to regulate dynamic and distributed systems
  • Responsibility is distributed among model providers, application providers and end-users
  • Open question of principles to define organisational bounds

Next steps

  • Work relies on informed speculation about future development and governance of personalised LLMs
  • Affordances, constraints and harms depend on design, use and safeguards
  • Future iteration of taxonomy and policy framework will involve interviews with end-users, providers and policymakers
  • Aim is to avoid long lags in understanding, documenting and governing harms from personalised LLMs