Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

GPT models and related technologies could have implications for the US labor market.
A new rubric was used to assess occupations based on their correspondence with GPT capabilities.
80% of the US workforce could have at least 10% of their work tasks affected by GPTs.
19% of workers may see at least 50% of their tasks impacted.
Impact is not limited to industries with higher recent productivity growth.
GPTs exhibit characteristics of general-purpose technologies.

Paper Content

Introduction

Recent progress in generative AI and large language models (LLMs)
LLMs can process and produce various forms of sequential data
A new rubric proposed to measure the overall exposure of tasks to GPTs
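About 19% of jobs have at least 50% of their tasks exposed when both current model capabilities and anticipated LLM-powered software are considered
With existing language and code capabilities alone (no additional software or modalities), about 3% of U.S. workers have over half of their tasks exposed
Accounting for other generative models and complementary technologies, up to 49% of workers could have half or more of their tasks exposed to LLMs
Occupations with higher wages generally show higher exposure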
Science and critical thinking skills negatively correlated with exposure
Programming and writing skills positively associated with LLM exposure
Jobs with higher barriers to entry tend to have more exposure to LLMs
Information processing industries exhibit high exposure
Weak connection between productivity growth and overall GPT exposure
GPTs are general-purpose technologies
Contributions: measurements of LLM impact potential, and a demonstration of using LLMs to develop such measurements efficiently and at scale

Literature review

The advancement of large language models

LLMs have risen to prominence in AI research
LLMs can tackle complex language-based tasks
Factors such as increased model parameter count, greater training data volume, and enhanced training configurations have fueled progress
LLMs excel in diverse applications like translation, classification, creative writing, and code generation
Methods like fine-tuning and reinforcement learning with human feedback have improved the steerability, reliability, and utility of LLMs
LLMs have potential to program and control other digital tools
LLMs are becoming increasingly integrated into specialized applications
LLMs may be unreliable for various tasks due to issues such as factual inaccuracies, inherent biases, privacy concerns, and disinformation risks
LLMs can become valuable assets in machine learning model development
LLMs can contribute to economic decision-making at the task level
LLMs continue to improve in performance, but bring a variety of serious risks

The economic impacts of automation technologies

Research has shown that technological progress raises the demand for skilled workers over unskilled workers
Studies have explored the effects of technological change and automation on workers within a task-based framework
Workers involved in routine and repetitive tasks are at a higher risk of technology-driven displacement
New technology increases the need for a wider array of labor-intensive tasks
Automation technologies have resulted in wage inequality in the US
Various approaches have been used to estimate the overlap between AI capabilities and the tasks and activities workers undertake in different occupations
Realization of general purpose technologies’ full potential requires extensive co-invention
Many studies of machine learning technologies focus on systems-level adoption
Task-level information is used to assess whether LLMs fulfill GPT criteria
Findings are aggregated to occupations and industries to capture the overall potential exposure in the contemporary U.S. labor market

Methods and data collection

Data on activities and tasks performed by occupation in the US

Used O*NET 27.2 database
Contains information on 1,016 occupations
Detailed Work Activities (DWAs) and tasks
Sample of tasks and DWAs in Table 1
19,265 tasks with task description and occupation
Most tasks associated with one or more DWAs
2,087 DWAs with most connected to one or more tasks

Data on wages, employment, and demographics

Obtained employment and wage data from 2020 and 2021 Occupational Employment series
Dataset includes occupational titles, number of workers, and employment projections for 2031
Typical education and on-the-job training required for entry in an occupation
Used the BLS-recommended crosswalk to O*NET to link the O*NET task and DWA dataset to the BLS data
BLS Labor Force Demographics derived from Current Population Survey
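
The crosswalk step above amounts to a keyed join between two datasets. A minimal sketch, assuming the crosswalk is available as a simple mapping from O*NET occupation codes to BLS occupation codes; the function name, the placeholder codes, and the wage figure are all hypothetical and not taken from the paper:

```python
def link_tasks_to_wages(tasks_by_onet, crosswalk, wages_by_bls):
    """Join O*NET task lists to BLS wage records through a code crosswalk.

    Occupations without a crosswalk entry or without wage data are skipped.
    """
    linked = {}
    for onet_code, tasks in tasks_by_onet.items():
        bls_code = crosswalk.get(onet_code)
        if bls_code is None or bls_code not in wages_by_bls:
            continue
        linked[onet_code] = {
            "bls_code": bls_code,
            "tasks": tasks,
            "wage": wages_by_bls[bls_code],
        }
    return linked


# Placeholder codes and values, for illustration only.
tasks_by_onet = {
    "00-0001.00": ["Draft reports", "Review documents"],
    "00-0002.00": ["Operate equipment"],
}
crosswalk = {"00-0001.00": "00-0001"}  # the second occupation has no match
wages_by_bls = {"00-0001": 50000}

linked = link_tasks_to_wages(tasks_by_onet, crosswalk, wages_by_bls)
```

Skipping unmatched occupations is one possible design choice; a real pipeline might instead log them so that coverage of the join can be checked.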

Exposure

Human ratings and GPT-4 ratings were collected using an exposure rubric
Rubric defines exposure as a measure of whether access to a GPT or GPT-powered system would reduce the time required for a human to perform a specific DWA or complete a task by at least 50 percent
Three primary measures of exposure were constructed from the rubric labels: α, β, and ζ
α corresponds to E1, the lower bound of the proportion of exposed tasks within an occupation
β is the sum of E1 and 0.5*E2; the 0.5 weight on E2 accounts for the additional investment needed to deploy the technology through complementary tools
ζ is the sum of E1 and E2, an upper bound of exposure
β is used for the remainder of the analysis, so tasks directly exposed are considered twice as exposed as tasks requiring complementary innovation
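
The arithmetic behind the three measures can be sketched for a single occupation as follows. This is a minimal illustration, assuming each task carries exactly one rubric label and that all tasks are weighted equally; the function name, the result keys ("lower", "weighted", "upper"), and the sample labels are assumptions made for this example:

```python
from collections import Counter


def exposure_measures(task_labels):
    """Return the three occupation-level exposure shares for a list of task labels.

    "lower"    = share of tasks labeled E1
    "weighted" = E1 share + 0.5 * E2 share
    "upper"    = E1 share + E2 share
    Tasks with any other label (for example E0) contribute zero.
    """
    if not task_labels:
        raise ValueError("an occupation needs at least one labeled task")
    counts = Counter(task_labels)
    n = len(task_labels)
    e1 = counts["E1"] / n
    e2 = counts["E2"] / n
    return {"lower": e1, "weighted": e1 + 0.5 * e2, "upper": e1 + e2}


# Hypothetical occupation with four tasks.
labels = ["E1", "E2", "E0", "E2"]
measures = exposure_measures(labels)
```

With these sample labels, one of four tasks is E1 and two are E2, so the lower bound is 0.25, the weighted measure is 0.5, and the upper bound is 0.75.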

Limitations of our methodology

Validity of task-based framework
Relative vs. absolute measures
Lack of expertise and task interpretation
Forward-looking and subject to change
Sources of disagreement between humans and GPT-4
Human raters and GPT-4 ratings show high degree of agreement
Biased judgments due to lack of occupational diversity
High-quality labels require workers engaged in occupations
GPT-4 sensitive to wording, order, composition, detail, and definitions
Iterating on prompt can enhance agreement between model and rubric
GPT-4 capable of applying intricate taxonomies
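
One way to quantify how closely two sets of ratings match is to compute the share of identical labels alongside a Pearson correlation. The sketch below assumes ratings have already been converted to numeric scores (0, 0.5, or 1); that encoding, the helper names, and the sample data are illustrative assumptions rather than the paper's procedure:

```python
from math import sqrt


def agreement_rate(a, b):
    """Share of items on which the two raters gave exactly the same score."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


# Made-up scores for five tasks.
human_scores = [0, 0.5, 1, 0.5, 0]
model_scores = [0, 0.5, 1, 1, 0]

rate = agreement_rate(human_scores, model_scores)
corr = pearson(human_scores, model_scores)
```

Exact agreement penalizes near misses as heavily as large disagreements, which is why a correlation is a useful complement.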

Results

General-purpose technologies are rare and have pervasive, long-term impacts.
GPTs can influence labor, productivity, and capital input.
GPTs have the potential to affect a diverse range of occupations and wage structures.

Summary statistics

Average occupation-level values suggest 15% of tasks are exposed to GPTs
Over 30% of tasks are exposed to GPTs for some occupations, and over 50% for others
80% of workers belong to an occupation with at least 10% of its tasks exposed to GPTs
19% of workers are in an occupation where over half the tasks are labeled as exposed
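
Worker-level statistics of this kind are employment-weighted: each occupation counts in proportion to the number of people working in it. A minimal sketch, assuming each occupation is summarized by an exposure share and an employment count; the function name and the numbers are made up for illustration and do not reproduce the paper's data:

```python
def share_of_workers_at_or_above(occupations, threshold):
    """Employment-weighted share of workers whose occupation meets the threshold.

    `occupations` is a list of (exposure_share, employment) pairs.
    """
    total = sum(employment for _, employment in occupations)
    if total == 0:
        raise ValueError("total employment must be positive")
    exposed = sum(
        employment for exposure, employment in occupations if exposure >= threshold
    )
    return exposed / total


# Hypothetical occupations: (share of tasks exposed, number of workers).
occupations = [(0.05, 200), (0.20, 500), (0.60, 300)]

share_10 = share_of_workers_at_or_above(occupations, 0.10)
share_50 = share_of_workers_at_or_above(occupations, 0.50)
```

The comparison here is inclusive (`>=`); a statistic phrased as "over half" would use a strict inequality instead.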

Wages and employment

Exposure intensity across the economy is displayed in Figure 3
Exposure is measured in terms of total workers and total occupations
Worker concentration in occupations is not highly correlated with occupational exposure to GPTs or GPT-powered software
Human and GPT-4 annotations exhibit qualitative similarities and tend to correlate
Higher wages are associated with increased exposure to GPTs
Potential exposure to GPTs has little correlation with current employment levels

Skill importance

Skill importance and exposure measures are related
Science and critical thinking skills are negatively associated with exposure
Programming and writing skills are positively associated with exposure

Barriers to entry

Job Zone is a proxy for barriers to entry
Median income increases with Job Zone
Exposure to GPTs increases from Job Zone 1 to 4, then decreases or remains similar at Job Zone 5
Higher wage occupations tend to be more exposed to GPTs
Occupations requiring a bachelor’s, master’s, or professional degree are more exposed to GPTs than those requiring no formal education
Jobs with the least exposure require the longest training and potentially offer a lower payoff
Jobs that require no on-the-job training, or only an internship/residency, yield higher income but are more exposed to GPTs
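
Comparing exposure across Job Zones comes down to a grouped mean. The sketch below assumes each occupation is reduced to a (job zone, exposure) pair and takes an unweighted mean over occupations; the function name, the unweighted choice, and the sample values are assumptions for illustration only:

```python
from collections import defaultdict


def mean_exposure_by_zone(rows):
    """Average exposure per Job Zone from (job_zone, exposure) pairs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for zone, exposure in rows:
        sums[zone] += exposure
        counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sorted(sums)}


# Made-up occupations: (Job Zone, exposure share).
rows = [(1, 0.0), (1, 0.25), (4, 0.5), (4, 0.75), (5, 0.5)]
by_zone = mean_exposure_by_zone(rows)
```

An employment-weighted mean would be a reasonable alternative if the goal is to describe workers rather than occupations.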

Validation of measures

Comparison to earlier efforts

Aim to build on previous studies of occupational exposure to AI and automation
Previous studies used a variety of methods
Mapping text descriptions of tasks to descriptions of technological advances in patents
Linking capabilities of AI systems to occupational abilities
Mapping results of AI task benchmark evaluations to worker tasks
Expert labeling of automation potential for certain occupations
Developing a rubric for evaluating the “suitability for machine learning”
Summary statistics on many of these prior efforts
Methodology builds upon the SML approach
Results of OLS regressions of new LLM exposure measurements on occupation-level exposure measures
Four separate output variables representing new scores
Generally positive and statistically significant correlations between LLM exposure measures and previous measurements
Encouragingly, SML exposure scores show significant and positive associations
Webb software and AI patent-based measures, SML, and normalized routine cognitive and manual scores all exhibit positive associations
Low correlations with Felten et al. and Frey and Osborne
28-40% of the variance in the new exposure measures is left unexplained by the other measurements
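
The link between an OLS fit and "unexplained variance" is that the unexplained share equals 1 minus R². As a simplified sketch with a single regressor (the comparison described above uses several earlier measures at once), assuming one earlier score and one new exposure score per occupation; the function name and the data are hypothetical:

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Made-up scores for four occupations.
prior_score = [0.0, 1.0, 2.0, 3.0]
new_exposure = [0.1, 0.4, 0.3, 0.8]

slope, intercept, r_squared = simple_ols(prior_score, new_exposure)
unexplained_share = 1 - r_squared
```

A positive slope with a sizeable unexplained share would indicate that the new measure agrees with the earlier one in direction while still capturing something different.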

Discussion

GPTs as a general-purpose technology

GPTs could be classified as a general-purpose technology if they meet three criteria
GPTs are improving in capabilities over time
GPTs can have pervasive impacts across the economy
Complementary innovations enabled by GPTs can have widespread application to economic activity
Adoption and use of LLMs is becoming increasingly widespread
Adoption of LLMs will vary across different economic sectors due to various factors

Implications for US public policy

Automation technologies, including LLMs, have been linked to economic disparity and labor disruption.
Results from the US suggest the need for policy preparedness for the potential economic disruption posed by LLMs.
Prior work has suggested policy directions related to education, worker training, and safety net programs.

Limitations and future work

Study has limitations that need further investigation
Focus on US restricts generalizability
Need to extend scope and share methods
Need to explore GPT adoption patterns and actual capabilities/limitations of state-of-the-art models
Need to consider vision capabilities in ratings

Conclusion

Generative Pre-trained Transformers (GPTs) generate profound transformations
19% of jobs have at least 50% of their tasks exposed to GPTs
GPTs can have pervasive impacts across a wide swath of occupations in the US
GPTs can augment or displace human labor
GPTs can impact job quality, inequality, and skill development
New rubric for understanding LLM capabilities and their potential effects on jobs
Direct exposure (E1) - Writing and transforming text and code according to complex instructions
Exposure by LLM-powered applications (E2) - Summarizing documents longer than 2000 words and answering questions about those documents
Exposure given image capabilities (E3) - Reading text from PDFs, scanning images, or creating or editing digital images according to instructions
No exposure (E0) - Tasks requiring a high degree of human interaction, precise measurements, reviewing visuals in detail, use of a hand or walking, making decisions that might impact human livelihood, existing technology not powered by an LLM
Developed capacities that facilitate learning or the more rapid acquisition of knowledge
Background structures needed to work with and acquire more specific skills in a variety of different domains
Procedures that contribute to the more rapid acquisition of knowledge and skill across a variety of domains
Programming - Writing computer programs for various purposes
Impact potential is present across nearly all industries, with wide heterogeneity
Productivity growth since 2012 and exposure to LLM technologies appear unrelated
Occupations with the highest exposure according to each measurement
Regression of occupation-level, human-annotated exposure to GPTs on skill importance
Mean exposure to GPTs by job zone
