Only Time Can Defeat Your ChatGPT-loving Office Employees

by Futuristic Lawyer, November 29th, 2023

Too Long; Didn't Read

A closer look at the role of automation and AI assistance in the modern workplace, based on a leading research paper from Harvard Business School.

GPT-4 is a joker in the corporate card game. It can boost productivity substantially and lead to higher- or lower-quality work, depending on the task at hand and how it’s used. Broadly speaking, we can take an optimistic or a pessimistic stance on GPT-4’s rapid implementation in office environments around the globe.


The optimistic stance is that AI assistance will lead to a quality and productivity boost for workers. More work will be done faster and better. AI assistance will help with routine tasks, provide vital support in nonroutine tasks, and free up time and resources for workers to focus on business-critical stuff that “moves the needle.”


The pessimistic stance is that the gift of AI assistance is a trojan horse. Automation will infiltrate companies and slowly eat human knowledge work bite by bite, to the benefit of a super-rich tech elite and at the expense of disempowered wage earners. ChatGPT-loving office employees are suffering from a kind of Stockholm syndrome, flirting with their own replacers.


My personal opinion leans more toward the pessimistic stance. I recognize that GPT-4 is useful as an information retrieval tool - essentially a smarter, more personalized version of Google Search. But if I depended on assistance from a chatbot to, say, write a draft for an article or brainstorm new ideas for a post, why do the work in the first place? You might as well outsource it to AI completely, or preferably not do the work at all.


Today, we will take a look at a Harvard Business School paper that sheds light on the impact of GPT-4 assistance on knowledge work: “Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality”, published in September 2023.

Harvard Business School Paper – Experiment & Results

A group of social scientists carried out an experiment to test the skills of 758 consultants from Boston Consulting Group (BCG) on different tasks with and without access to GPT-4.


Approximately half of the participating consultants (385) performed 18 tasks related to creative product development, while the other half (373) engaged in a business problem-solving task that relied on external data and other sources. We will look closer at the specific tasks in the next section.


All participants did an initial test without AI assistance so the researchers could benchmark the individual consultant’s unaided performance against their performance with GPT-4. The participants were also assigned to one of three sub-groups within the two experiments: one control group without access to GPT-4, a second group with access to GPT-4, and a third group with access to GPT-4 and learning material on how to prompt GPT-4 effectively.


The headline result was that consultants with access to AI performed markedly better overall on the creative product development tasks. Here, the consultants completed 12.2% more tasks on average, finished tasks 25.1% more quickly, and produced work of 40% higher quality, according to human evaluators who graded the tests blindly.


Test Results

The second group of consultants, who worked on the business problem-solving task, was 19% less likely to produce correct outcomes with access to GPT-4. On average, GPT-4 still helped the consultants finish the task a few minutes faster: six minutes faster for “GPT only” and eleven minutes faster for “GPT + Overview.”


Based on the experimental results, the research team imagines a “jagged frontier.”


Inside “the jagged frontier,” AI assistance boosts the quality and productivity of human performance. Outside of the frontier, AI assistance constrains it. The frontier is “jagged” because it’s hard to predict which tasks fall inside or outside of it, and the divide can seem illogical. For example, GPT-4 can ace most college exams, yet it struggles with basic math problems.


Jagged Frontier of AI Capabilities

Navigating the Jagged Frontier

The concept of “the jagged frontier” is apt. However, in my opinion, the paper significantly oversells GPT-4’s capabilities. Most importantly, this is because of the strict time constraints the BCG consultants had to work under in the experiment.


In the creative product development part of the experiment - where GPT-4 assistance was shown to significantly boost productivity and quality - the consultants had to complete 18 tasks in just 90 minutes. Here are a few examples of tasks the consultants had to answer within that 90-minute limit:


  • “Generate ideas for a new shoe aimed at a specific market or sport that is underserved. Be creative, and give at least 10 ideas.”


  • “Come up with a list of steps needed to launch the product. Be concise but comprehensive.”


  • “Come up with a name for the product: consider at least 4 names, write them down, and explain the one you picked.”


  • “Write a 500-word memo to your boss explaining your findings.”


  • “Write marketing copy for a press release.”


  • “Please, synthesize the insights you have gained from the previous questions and create an outline for a Harvard Business Review-style article of approximately 2,500 words.”


Just one of these tasks could individually take days, even weeks, to complete. Not even the most elite consultant in the world could be expected to do all of them at a satisfying level of quality and accuracy in 90 minutes. It’s humanly impossible.


In the experiment with tasks “outside of the frontier,” participants had to analyze a hypothetical company’s brand performance based on insights from interviews and financial data, then prepare a 500-750-word note to a fictional CEO. The time constraint in this part of the experiment was 60 minutes, which, again, does not come close to the time consultants would actually spend on a task like this in real life.


My hypothesis: the more time you give humans to carry out a task, the less significant AI assistance becomes. If, for example, the BCG consultants were given weeks or months to carry out the same 18 creative product development tasks - which would better reflect how consultants actually work - improvements from using GPT-4 would be minuscule at best. The final output would also be of significantly higher quality than what a human can produce with GPT-4 in 90 minutes.


In my view, navigating the jagged frontier is not about understanding what kind of tasks GPT-4 can effectively help you with but rather exploring what skills you can offer that automation cannot easily replace.

Centaurs and Cyborgs in the Workplace

The authors analyzed different approaches participants took to working with AI and identified two predominant models, “centaur behavior” and “cyborg behavior”:


“Understanding the characteristics and behaviors of these participants may prove important as organizations think about ways to identify and develop talent for effective collaboration with AI tools.


We identified two predominant models that encapsulate their approach.


The first is Centaur behavior. Named after the mythical creature that is half-human and half-horse, this approach involves a similar strategic division of labor between humans and machines closely fused together. Users with this strategy switch between AI and human tasks, allocating responsibilities based on the strengths and capabilities of each entity. They discern which tasks are best suited for human intervention and which can be efficiently managed by AI.


The second model we observed is Cyborg behavior.  Named after hybrid human-machine beings as envisioned in science fiction literature, this approach is about intricate integration. Cyborg users don’t just delegate tasks; they intertwine their efforts with AI at the very frontier of capabilities. This strategy might manifest as alternating responsibilities at the subtask level, such as initiating a sentence for the AI to complete or working in tandem with the AI.”


I am not a big fan of framing GPT-4 as a collaboration partner. Primarily for two reasons:


  1. In human-AI partnerships, the AI may do most of the work, yet the full responsibility for the work lies with the human.


I would much rather spend my time creating new work from scratch than reviewing and editing AI-generated output for errors, inaccuracies, and bias. First of all, reviewing automatically generated content is not much fun. Secondly, no matter what, I remain accountable for any mistakes there may be. If I relied too heavily on input from GPT-4 to complete a task, I would be unable to explain why I made the mistakes I did, and I couldn’t really learn or grow from them either. To what extent can we say that a work is still the result of a uniquely human creative effort when the worker is “collaborating” with generative AI?


“A lawyer’s professional judgment cannot be delegated to generative AI and remains the lawyer’s responsibility at all times.”

- “Practical Guidance for the Use of Generative Artificial Intelligence in the Practice of Law”, The State Bar of California Standing Committee on Professional Responsibility and Conduct (November 2023).


  2. The benefits of “centaurs” and “cyborgs” are only temporary

As I wrote about in my latest post, there was a brief window of time after Deep Blue’s 1997 victory over reigning chess champion Garry Kasparov when it seemed like humans collaborating with AI could defeat even the strongest chess engines. Kasparov popularized the term “centaurs” to describe these mixed human-AI teams.


As it stands today, however, humans cannot contribute much expertise in games played between leading chess programs. In fact, all humans can contribute is an increased error rate. It turns out that playing chess is simply something AIs are much better at than humans are.


I believe we can draw an important lesson from AI’s evolution on the chess board: “centaurs” and “cyborgs” are eventually defeated by more automation. By analogy, office workers who heavily rely on AI assistance should start to think deeply about what unique skills they can offer that an AI model can’t. Chances are that centaurs’ and cyborgs’ main function is to feed their replacers with more training material, especially for tasks that can easily be automated and do not involve a lot of social interaction, adaptation, flexibility, and communication.

Wrapping up

How can centaurs and cyborgs in the modern workplace be defeated? Simply put, you defeat them with more time. If humans are given more time to carry out complicated and creative work, help from GPT-4 becomes superfluous.


On the other hand, certain tasks that humans do “in collaboration” with generative AI today will be fully automated in the foreseeable future. Suppose a human worker cannot produce a substantially better outcome over a long time frame than future generations of GPT-4 can spit out in a few seconds. In that case, there really is no reason to keep outsourcing those sorts of tasks to human workers.


In my interpretation, navigating the jagged frontier is really about asking: what unique skill can I offer that cannot be replaced by automation in a few years?


Sign up to my free newsletter The Gap: www.futuristiclawyer.com


Also published here.