Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

LLMs are used to generate content for a range of tasks
Need to ensure models are aligned with human preferences and do not produce unsafe, inaccurate or toxic outputs
Alignment techniques can mitigate safety concerns and improve model capabilities
Personalising LLMs through micro-level preference learning processes may result in models better aligned with each user
Normative challenges in defining bounds of societally-acceptable and safe degree of personalisation

Paper Content

Introduction

LLMs have improved in recent years and are being used in a wide range of applications
ChatGPT was released in 2022 and reached over 100 million users in two months
LLMs are being aligned with human preferences using reward learning
More reward learning can lead to LLMs having polarised views on certain topics
Current implementations of reward learning are limited and rely on a small group of people
This paper presents a taxonomy and policy framework for explicit personalisation of LLMs
Personalised LLMs could provide tailored assistance and adapt to diverse groups
Personalised LLMs could also reinforce biases and narrow information diets
A policy framework is needed to govern personalised LLMs safely and ethically

Background

AI systems are being applied to complex tasks
Alignment is desirable to avoid undesirable behaviours
Alignment is a technical challenge
Recent works have tried to align LLMs with human preferences
Three axes of alignment: what, what, who
Implicit personalisation is already occurring, because models are tuned to meet the expectations of non-representative crowdworkers

From implicit to explicit personalisation

Given the diversity of human values and preferences, and the importance of pragmatics for contextual understanding, the aim to fully align models across human populations may be a futile one.
Personalisation in internet technologies and NLP is not new.
The technical apparatus for effective feedback learning exists.
LLMs are designed for adaptation via transfer learning.
The HuggingFace hub is evidence of the demand for customisation.
Recent industry model developments and announcements suggest the imminent possibility of personalisation.

A taxonomy of the benefits and risks from personalised LLMs

Consider the effects of personalised LLMs at individual and societal levels
Benefits and risks of personalised LLMs are summarised in Table 1
Benefits and risks can be directly related to each other
Benefits and risks can accumulate at the societal level
Constructed taxonomy through increased stakeholder involvement, allowing for a more participatory and inclusive approach
Personalised LLMs can increase efficiency, utility, autonomy, and connection
Increased efficiency is the inverse of quality of service harms
Increased utility is the inverse of discrimination and exclusion harms
Increased autonomy is the inverse of representational harms
Increased connection is the inverse of emotional harms
Cost incurred by end-users in providing personalised feedback to an LLM

Societal level

Personalised LLMs may better adapt to the needs of marginalised communities
Personalised LLMs could improve access to resources and reduce cost
Personalisation democratises how values or preferences are embedded into an LLM
Personalised LLMs could improve work productivity
Personalised LLMs could entrench the digital divide
Personalised LLMs could lead to polarisation and a breakdown of social cohesion
Personalised LLMs could be used for malicious purposes
Personalised LLMs could lead to labour displacement
Personalised LLMs could have a large environmental impact

A three-tiered policy framework for personalised LLMs

Proposes a new policy framework for managing benefits and risks of personalised LLMs
Provides a principled and holistic way of deciding how personalisation should be managed

The limits of personalisation

Deciding limits of personalisation is a subjective and contentious decision
Deciding which aspects of model behaviour should be personalised and how they should be allowed to be personalised
Restrictions and requirements should be applied to different types of personalised LLM outputs

How people interact with LLMs

Workflows in machine learning models have changed in the past 5 years
Pre-trained models are used instead of training from scratch
Generative models are fine-tuned on in-domain data
Model provider, application provider and end-user are involved in creating an LLM application
Personalisation of LLMs needs to be managed to ensure benefits and mitigate risks

A three-tiered policy framework

Three-tiered policy framework proposed
Tier One: Immutable restrictions, e.g. terrorist content, CSAM, threats of violence or sexual assault
Tier Two: Optional restrictions and requirements, based on values and preferences of actor
Tier Three: Tailored requirements, user decides personal preferences within boundaries set by higher tiers
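
The way the three tiers constrain one another can be sketched in code. This is a minimal illustration, not the paper's implementation: the category labels, the `ProviderPolicy` and `UserPreferences` classes, and the `resolve` function are all hypothetical names chosen for this example. It assumes each tier can only add restrictions or make requests inside the bounds set by the tier above it.

```python
from dataclasses import dataclass

# Tier One: immutable restrictions. The labels are hypothetical; the examples
# (terrorist content, CSAM, threats of violence) come from the summary above.
TIER_ONE_BLOCKED = frozenset({"terrorist_content", "csam", "threats_of_violence"})


@dataclass(frozen=True)
class ProviderPolicy:
    """Tier Two: optional restrictions and requirements set by a provider."""
    blocked: frozenset = frozenset()
    required: frozenset = frozenset()


@dataclass(frozen=True)
class UserPreferences:
    """Tier Three: tailored requirements chosen by the end-user."""
    blocked: frozenset = frozenset()
    requested: frozenset = frozenset()


def resolve(provider: ProviderPolicy, user: UserPreferences) -> dict:
    """Combine the tiers so that lower tiers never override higher ones."""
    # Restrictions accumulate: any tier may block more, none may unblock.
    blocked = TIER_ONE_BLOCKED | provider.blocked | user.blocked
    # A provider requirement cannot re-enable a Tier One category.
    required = provider.required - TIER_ONE_BLOCKED
    # User requests are honoured only inside the bounds of Tiers One and Two.
    requested = user.requested - TIER_ONE_BLOCKED - provider.blocked
    return {"blocked": blocked, "required": required, "requested": requested}


if __name__ == "__main__":
    provider = ProviderPolicy(
        blocked=frozenset({"profanity"}),
        required=frozenset({"cite_sources"}),
    )
    user = UserPreferences(
        blocked=frozenset({"spoilers"}),
        requested=frozenset({"informal_tone", "profanity"}),
    )
    print(resolve(provider, user))
```

In this sketch the user's request for `profanity` is dropped because the provider blocked it at Tier Two, while `informal_tone` passes through. Representing each tier as a set keeps the precedence rule explicit: restrictions are combined by union and requests are filtered by set difference.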

Discussion

Personalisation of LLMs is a likely pathway for their continued expansion
Taxonomy of benefits and risks from personalised LLMs
Policy framework to govern benefits and risks
Assumption that personalised LLMs are technically feasible
Demand for customised and highly-adapted LLMs
Technical decision-points and engineering challenges
Challenges to implementing and enforcing policy framework
Plans for maturing research and iterating on taxonomy

Technical challenges

Learning from human feedback requires less data than pre-training
Exact amount of data needed for personalisation is unclear
Cold start problem exists when no feedback data points are available
Data format and quality are important
Model scale may not contribute significantly to performance
Alignment tax must be balanced to prevent overfitting

Policy framework enforcement

Personalised LLMs have wide-reaching benefits and risks
Framework outlines properties for appropriate governance
Policy enforcement to minimise violations and friction
Must comply with existing regulations
Difficult to regulate dynamic and distributed systems
Responsibility is distributed among model providers, application providers and end-users
Open question of principles to define organisational bounds

Next steps

Work relies on informed speculation about future development and governance of personalised LLMs
Affordances, constraints and harms depend on design, use and safeguards
Future iteration of taxonomy and policy framework will involve interviews with end-users, providers and policymakers
Aim is to avoid long lags in understanding, documenting and governing harms from personalised LLMs
