ChatGPT Takes My Midterm and Gets a D

Jan 09, 2023

Many, perhaps most, of my friends are giddy about ChatGPT. My Inbox is full of emails from Tyler Cowen touting its glories. More concretely, Alex Tabarrok recently assured me that existing AIs could already beat the performance of most of our economics students in our beloved subject. Though I was on the road almost all of last month, I vowed to put these claims to the test upon my return.

My December prediction: The AI would earn a C-, which is what I normally give students who repeat some relevant slogans but show little actual comprehension of the material.

To get the ball rolling, I decided to give ChatGPT my 2022 midterm in Labor Economics. Although the grading isn’t double-blind, I already had an answer key and a curve.

Outcome: If ChatGPT had been a regular student, it would have earned 31/100. That’s a D. Even worse than I expected.

Isn’t the software amazing nonetheless? Only a little. What most impresses me about ChatGPT is its sheer versatility. Most humans dodge most questions. The software tries to grapple with almost any prompt you give it.

You could object: “Who cares if ChatGPT gets a D on your midterm?” I care. At minimum, this shows, contra Alex Tabarrok, that the AI is not yet as good at economics as most of our students. For me personally, this also reinforces my suspicion that AI enthusiasts cherry-pick the best AI output. (I had the same experience with DALL-E: Gushing hype notwithstanding, it utterly failed to draw even simple panels from my next graphic novel.)

If you don’t trust me, you can object that I’m biased against the AI. Fair enough. Yet from my first-person point of view, grading ChatGPT on the same test I recently gave my students is more reliable and informative than all of the fan posts I’ve seen. ChatGPT does a fine job of imitating a very weak GMU econ student.

You need not take my word for it. Here’s the midterm, with: (a) my suggested answer, (b) the ChatGPT answer, (c) my score on each question, and (d) my rationale for the score.

[Image: nervous cartoon robot. Caption: First page of Google Images search for “flustered robot”]

Economics 321 Midterm

Prof. Bryan Caplan

Fall 2022

Score Approximate Grade

60-100 A

45-59 B

35-44 C

25-34 D

0-24 F

ChatGPT’s Total Score: 31/100

Part 1: True, False, and Explain

(10 points each - 2 for the right answer, and 8 for the explanation)

State whether each of the following six propositions is true or false. In 2-3 sentences, explain why. Use diagrams if helpful.

1. Suppose a country’s workplaces start to experience periodic blackouts.

T, F, and Explain: Aggregate Labor Demand will clearly fall even though some workers will benefit.

Suggested Answer:

TRUE. ALD will definitely fall because MPP is the sole cause of MVP at the aggregate level, and workers are less productive with intermittent electricity. However, workers in industries with inelastic product demand will benefit because MVP=MPP*P, and P will rise more than MPP falls. Electrical repairmen are an obvious example.

CHATGPT Answer:

True. Aggregate labor demand will fall if workplaces start to experience periodic blackouts. Some workers may benefit from the blackouts if they work in industries that are not affected or if they are able to continue working during the blackouts. However, the overall impact on labor demand will be negative as businesses will be unable to fully utilize their labor resources.

Score: 4/10. The AI gets 2 points for the correct answer, plus 2 points for stating that labor demand falls. However, the AI fails to connect falling labor demand to lower labor productivity, and fails to mention that workers in industries with inelastic product demand benefit from falling productivity.

2. “Productivity growth in one sector can very easily reduce employment in that sector.” (Krugman, The Accidental Theorist)

T, F, and Explain: Krugman argues that such employment loss is a market failure that justifies government regulation.

Suggested Answer:

FALSE. According to Krugman, employment will fall in some sectors, but rise in others. The net result of higher productivity, as usual, is that living standards rise overall, though not everyone gains. See his hotdog/bun thought experiment.

CHATGPT Answer:

True. In "The Accidental Theorist," Krugman argues that productivity growth in one sector can lead to employment loss in that sector, which can be a market failure that justifies government regulation.

Score: 2/10. The AI gets the answer wrong. I give it two points for correctly restating Krugman’s point, but nothing more because it totally misses the ideas that (a) employment expands in other sectors, (b) this is a net economic gain, and (c) not a market failure.

3. Suppose the government starts licensing car mechanics, but this fails to raise the quality of their work.

T, F, and Explain: Mechanics’ wages will definitely rise.

Suggested Answer:

TRUE. Licensing reduces supply, which raises wages and reduces quantity. If licensing raised quality, demand would go up, too. But by assumption, that didn’t happen. (And if it did, it would only reinforce the conclusion that wages go up!)

CHATGPT Answer:

False. Mechanics' wages will not necessarily rise if the government starts licensing car mechanics and this fails to raise the quality of their work. In fact, wages may stay the same or even decrease if the licensing requirement leads to an increase in the supply of mechanics without a corresponding increase in demand for their services.

Score: 0/10. The AI doesn’t realize that licensing raises wages by reducing labor supply. Nor does it realize that if licensing did raise quality, labor demand would rise, reinforcing the conclusion that wages increase.

4. In his discussion of immigration, Landsburg remarks: "[W]e ostentatiously ignore the most obvious beneficiaries of open borders and can still conclude that open borders are a good thing."

T, F, and Explain: According to Landsburg, the "most obvious beneficiaries" are highly skilled American workers.

Suggested Answer:

FALSE. Landsburg says that the “most obvious beneficiaries” are the immigrants themselves, who get a massive increase in pay. His point is that the net benefits for everyone else are ALSO positive, so immigration looks like a big social gain even if we ignore the most obvious gainers.

CHATGPT Answer:

False. According to Landsburg, the "most obvious beneficiaries" of open borders are not highly skilled American workers, but rather low-skilled foreign workers who would be able to find higher-paying jobs in the U.S.

Score: 6/10. The AI answers correctly, and correctly states Landsburg’s point. But it fails to put Landsburg’s point in context: Namely, immigration is an economic positive even if we ignore the benefits to immigrants.

5. Suppose you’re a typical selfish person.

T, F, and Explain: Evidence of severe credit market imperfections will make you more eager to continue your education, but evidence of severe externalities of education will not.

Suggested Answer:

TRUE. If there are severe credit market imperfections, the private return to education will consistently exceed the ordinary investment return throughout the rest of the economy. The idea is that due to lack of collateral, students might be unable to get the loans they need to capture this return, which keeps returns elevated. Knowledge of this elevated return is an extra reason to stay in school. Externalities, in contrast, by definition, don’t affect private returns, so selfish students won’t care about them.

CHATGPT Answer:

False. Evidence of severe credit market imperfections may make a typical selfish person more eager to continue their education as they may see it as a way to improve their career prospects and earning potential. On the other hand, evidence of severe externalities of education may not have the same effect, as the individual may not consider the externalities in their decision-making.

Score: 3/10. Logically, the AI probably should have answered TRUE and gotten two more points, but failed to do so. The explanation does not explain why severe credit market imperfections make education more likely to “improve career prospects and earning potential”; it’s basically just restating its conclusion in long-winded form. I give it 1 point for this part. But the AI does explain the basic logic of externalities, so I give it another 2 points for the second part of the question. Incidentally, since I specified that you’re a “typical selfish person,” the AI’s “may nots” should be “won’ts.”

6. Higher-IQ students have a higher college completion probability.

T, F, and Explain: This does not affect college students’ expected rate of return.

Suggested Answer:

FALSE. The rate of return to education depends on both (a) the gains of school if you graduate, and (b) the probability of graduating. If high IQ raises (b), then its expected return automatically rises as well. It’s like investing in a restaurant: Profitability depends on (a) the profits if the restaurant survives, and (b) the probability that the restaurant survives.
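The logic of the suggested answer can be illustrated with a minimal sketch. It uses expected net gain as a simple stand-in for the rate of return and assumes that students who fail to graduate gain nothing. Every number and name below is hypothetical, chosen only for illustration; none comes from the exam.

```python
def expected_net_gain(p_graduate, gain_if_graduate, cost):
    """Expected payoff of enrolling, assuming non-graduates gain nothing."""
    return p_graduate * gain_if_graduate - cost

# Hypothetical numbers, for illustration only.
GAIN_IF_GRADUATE = 100_000  # payoff conditional on finishing
COST = 40_000               # paid whether or not the student finishes

# Same gain and cost; only the completion probability differs.
lower_completion = expected_net_gain(0.5, GAIN_IF_GRADUATE, COST)
higher_completion = expected_net_gain(0.8, GAIN_IF_GRADUATE, COST)

print(lower_completion, higher_completion)
```

Holding the gain and the cost fixed, raising the completion probability raises the expected payoff, which is the restaurant analogy in numerical form.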

CHATGPT Answer:

True. Higher-IQ students generally have a higher probability of completing college, but this does not necessarily affect their expected rate of return on their college investment. The expected rate of return takes into account not only the probability of completing college, but also the expected earning premium associated with a college degree.

Score: 3/10. The second sentence is the beginning of a solid answer, but the first sentence directly contradicts the second sentence and the T/F part is wrong. Real students make such mistakes, too, of course!

Part 2: Short Answer

(20 points each)

In 4-6 sentences, answer both of the following questions. Use diagrams if helpful.

1. The U.S. has had two years of very high inflation. Use a clearly-labelled supply-and-demand diagram to show what has been happening to the real minimum wage from 2019-2022. What will ultimately happen to low-skilled workers’ living standards and employment if high inflation continues and the federal government fails to raise the minimum wage? Consider all of the notable effects.

Suggested Answer:

During this time, the real minimum wage has been rapidly falling. As a result, the disemployment effect of the minimum wage has been falling as well. Real wages are getting closer to market-clearing levels, and employment is getting closer to full employment. If inflation continues and the government doesn’t raise the minimum wage, this will ultimately amount to the abolition of the minimum wage. The result: Wages will fall to the market-clearing level, and employment will reach full employment. The effective abolition of the minimum wage is bad for lucky low-skilled workers who currently get higher pay, but better for everyone else, including consumers, unlucky low-skilled workers who can’t find work at all, employers, investors, and anyone who wants basic job training. (See graph on the next page).

CHATGPT Answer:

As shown in the diagram above, if the U.S. experiences high inflation and the federal government fails to raise the minimum wage, the real minimum wage (the nominal minimum wage adjusted for inflation) will fall. This will lead to a decrease in the demand for low-skilled labor, as employers will be able to pay their workers less in real terms while still offering the same nominal wage. As a result, low-skilled workers' living standards will fall and their employment may be at risk as they may be replaced by cheaper labor.

Score: 6/20. I didn’t penalize the AI for failing to produce a diagram. I gave it 6 points for knowing that inflation with a constant nominal minimum wage reduces the real minimum wage. But the rest is almost all nonsense. Reducing a price floor changes neither supply nor demand; it just reduces the gap between quantity supplied and quantity demanded. The AI ignores disemployment effects; in fact, it worries that unemployment will rise. The software also fails to realize that a falling real minimum wage is bad for workers who keep their jobs but good for workers who gain jobs.

2. Suppose there are two goods – meals and steel. Here is how much American and Mexican workers can produce in an hour:


            Steel    Meals
American    7        3
Mexican     1        2

Give a simple example showing how Mexican immigration effectively increases American workers’ productivity. Why is this important? (Hint: How important are services in modern economies?)

Suggested Answer:

Americans have a comparative advantage in Steel, Mexicans in Meals, and a price ratio of 1:1 would make it profitable for each country to specialize in its comparative advantage. As a result, if an American worker switches from spending one hour each on Steel and Meals to producing Steel exclusively, he goes from producing 7 Steel and 3 Meals to producing 14 Steel. This allows him to purchase 3 Meals and still be 4 Steel richer than he started.

This is important because in modern economies, services are 80% of employment. So if you can’t trade services, most potential gains to trade go unrealized. Free trade in goods without free trade in labor is a pale imitation of actual free trade.
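The arithmetic in the suggested answer can be checked with a short sketch. The productivity figures and the 1:1 price ratio come from the question and the answer key; the function and variable names are illustrative assumptions, not part of the exam.

```python
from fractions import Fraction

# Output per hour, taken from the table in the question.
OUTPUT_PER_HOUR = {
    "American": {"steel": 7, "meals": 3},
    "Mexican": {"steel": 1, "meals": 2},
}

# From the answer key: one Steel trades for one Meal.
PRICE_OF_MEAL_IN_STEEL = Fraction(1)


def meal_cost_in_steel(worker):
    """Opportunity cost of one Meal: the Steel forgone to produce it."""
    rates = OUTPUT_PER_HOUR[worker]
    return Fraction(rates["steel"], rates["meals"])


def american_bundle(specialize):
    """(steel, meals) an American holds after two hours of work."""
    rates = OUTPUT_PER_HOUR["American"]
    if not specialize:
        # One hour on each good.
        return (Fraction(rates["steel"]), Fraction(rates["meals"]))
    steel_made = Fraction(rates["steel"] * 2)  # two hours of Steel
    meals_bought = Fraction(rates["meals"])    # same Meals as before
    steel_left = steel_made - meals_bought * PRICE_OF_MEAL_IN_STEEL
    return (steel_left, meals_bought)


us_cost = meal_cost_in_steel("American")  # 7/3 Steel per Meal
mx_cost = meal_cost_in_steel("Mexican")   # 1/2 Steel per Meal
before = american_bundle(specialize=False)
after = american_bundle(specialize=True)
gain_in_steel = after[0] - before[0]

print(us_cost, mx_cost, before, after, gain_in_steel)
```

Because the 1:1 price lies between the two opportunity costs (1/2 and 7/3 Steel per Meal), both sides gain from specializing, and the American ends up with the same 3 Meals plus 4 extra Steel.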

CHATGPT Answer:

One simple example of how Mexican immigration can increase American workers' productivity is by allowing American workers to specialize in producing the good in which they have a comparative advantage. For example, American workers are more productive at producing steel (7 units per hour) than Mexican workers (1 unit per hour). Therefore, if Mexican workers were to immigrate to the U.S. and specialize in producing meals (2 units per hour), this would allow American workers to specialize in producing steel, increasing their productivity.

This is important because in modern economies, services are a significant contributor to economic growth and development. By allowing workers to specialize in their area of comparative advantage, productivity can be increased and businesses can offer a greater variety of goods and services to consumers. This can lead to increased economic growth and development in the long run.

Score: 7/20. The AI mixes up absolute and comparative advantage! The fact that American workers are more productive at steel than Mexicans does not show Americans have a comparative advantage in steel. There is no numerical example, much less one that shows that total productivity rises. The AI gets a little credit for stating that services are “important,” but fails to state that they constitute a large majority of employment. Moreover, the AI misses the fact that most services can’t be traded internationally without immigration.
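For readers who want to verify the total, here is a minimal sketch that adds up the eight scores above and applies the grading scale from the top of the exam. The question labels and function name are illustrative assumptions; the points and cutoffs are copied from this post.

```python
# (points earned, points possible) for each question, copied from the scores above.
SCORES = {
    "Part 1, Q1": (4, 10),
    "Part 1, Q2": (2, 10),
    "Part 1, Q3": (0, 10),
    "Part 1, Q4": (6, 10),
    "Part 1, Q5": (3, 10),
    "Part 1, Q6": (3, 10),
    "Part 2, Q1": (6, 20),
    "Part 2, Q2": (7, 20),
}

# Grading scale from the exam header: (minimum score, approximate grade).
SCALE = [(60, "A"), (45, "B"), (35, "C"), (25, "D"), (0, "F")]


def letter_grade(total):
    """Return the first grade whose minimum score the total meets."""
    for minimum, letter in SCALE:
        if total >= minimum:
            return letter
    raise ValueError("score must be non-negative")


total = sum(earned for earned, _ in SCORES.values())
possible = sum(maximum for _, maximum in SCORES.values())

print(f"{total}/{possible} -> {letter_grade(total)}")  # prints "31/100 -> D"
```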


Gordon

Jan 9, 2023

I tried to answer your questions based on my experience and knowledge (as a retired civil engineer), and I'm pretty sure the AI did better than you would have graded me. On the other hand, I am sure that if I had sat through your classes, hearing how you expressed these concepts, and getting feedback to my questions, I would have done much better.

Given that ChatGPT uses available sources - which include a hodge-podge of divergent opinions - it is not surprising that it failed to respond to your questions as you outlined. But if it had access to transcripts of all your classes, and knew to give priority to your input over what is generally available, I suspect that it would have returned something much closer to what you expected.

One of the chief drawbacks to ChatGPT, at least as I understand it, is that it simply looks at all the information - both correct and incorrect - and tries to provide an answer that weights all opinions. It does NOT yet have the ability to evaluate logically ideas against data and to put together a thesis that is based on facts but that runs contrary to widely established opinions.


Oleg Eterevsky

I think what's most impressive about ChatGPT is not its current capabilities, but its momentum. Just a few years ago the idea of an AI taking an IQ test or an SAT was almost laughable. AI experts predicted that this level of capability wouldn't be achieved until the 2030s, and the general public considered even those predictions too optimistic.

Just a couple of years ago GPT-3 was mostly being compared to 7-year-old kids. Today you are comparing ChatGPT to a college student.
