Did GPT-4 just get lucky when it retook my last midterm? Does it have more training data than the designers claim? Very likely not, but these doubts inspired me to give GPT-4 my latest undergraduate exam, with Collin Gray’s able assistance. This is for my all-new Econ 309: Economic Problems and Public Policies class, so zero prior Caplan exams exist.
The result: GPT-4 gets not only an A, but the high score! This is the real deal. Verily, it is Biblical. For matters like this, I’ve often told my friends, “I’ll believe it when I put my fingers through the holes in his hands.” Now I have done so.
But Thomas, one of the twelve, called Didymus, was not with them when Jesus came. The other disciples therefore said unto him, We have seen the Lord. But he said unto them, Except I shall see in his hands the print of the nails, and put my finger into the print of the nails, and thrust my hand into his side, I will not believe. And after eight days again his disciples were within, and Thomas with them: then came Jesus, the doors being shut, and stood in the midst, and said, Peace be unto you. Then saith he to Thomas, Reach hither thy finger, and behold my hands; and reach hither thy hand, and thrust it into my side: and be not faithless, but believing. (John 20: 24-27)
And now, for the full write-up:
Economics 309 Midterm
Prof. Bryan Caplan
Fall 2023
APPROXIMATE Grades
Score Grade
50-90 A
26-49 B
15-25 C
10-14 D
0-9 F
GPT-4’s Total Score: 74/90
In 4-6 sentences, answer all SIX of the following questions. Use diagrams if helpful. Each question is worth 15 points.
1. Suppose that if government does nothing about a pandemic, the population will lose 1,000,000 life-years. If the government takes decisive action, however, society will only lose 200,000 life-years. Unfortunately, decisive action reduces quality of life by 1% for a year. How high does total population have to be before decisive action fails CBA?
Suggested Answer:
Taking decisive action saves (1,000,000-200,000)=800,000 life-years. But it inflicts .01*population in lost quality of life. For low populations, decisive action passes CBA; for high populations it fails. The cross-over point happens when 800,000 = .01*population, implying a critical value of 80M. The logic is simple: The more people who have to suffer to save a constant quantity of life, the less likely the suffering is to pass CBA.
GPT-4 Answer:
It's not possible to provide diagrams in this text-based format, but I will answer the questions as best as I can.
To determine when decisive action fails the CBA, we must compare the benefits of taking action (life-years saved) to the cost (reduced quality of life). Let "P" represent the total population.
Life-years saved: 1,000,000 - 200,000 = 800,000 life-years
Cost: 0.01 * P * 1 year (1% reduction in quality of life for a year)
For decisive action to fail CBA, the cost must outweigh the benefits:
0.01 * P * 1 year > 800,000 life-years
Therefore, P > 80,000,000. If the total population exceeds 80 million, decisive action would fail the CBA.
Score: 15/15. Perfect.
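The break-even arithmetic in Question 1 is easy to check by hand, but a short sketch makes the cross-over explicit. The figures come straight from the question. The function and constant names are illustrative assumptions, and the model follows the exam's simplification that a 1% quality-of-life loss for one year costs each person 0.01 life-years.

```python
from fractions import Fraction

# Figures from Question 1; the names are illustrative, not from the exam.
LIFE_YEARS_LOST_NO_ACTION = 1_000_000
LIFE_YEARS_LOST_WITH_ACTION = 200_000
QUALITY_LOSS_PER_PERSON = Fraction(1, 100)  # 1% of one life-year, kept exact


def life_years_saved():
    """Benefit of decisive action, in life-years."""
    return LIFE_YEARS_LOST_NO_ACTION - LIFE_YEARS_LOST_WITH_ACTION


def break_even_population():
    """Population at which the quality-of-life cost equals the benefit."""
    return life_years_saved() / QUALITY_LOSS_PER_PERSON


def passes_cba(population):
    """True when life-years saved exceed the quality-adjusted cost."""
    cost = QUALITY_LOSS_PER_PERSON * population
    return life_years_saved() > cost


if __name__ == "__main__":
    print(int(break_even_population()))
    print(passes_cba(79_000_000), passes_cba(81_000_000))
```

Under these assumptions, any population below the break-even value passes CBA and any population above it fails, matching the suggested answer.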
2. Why exactly is it surprising for liberal Californians to move to conservative Texas? How does Caplan explain such surprising behavior? (Hint: “Actions speak louder than words.”)
Suggested Answer:
This is surprising because liberal Californians normally express antipathy – or even hatred – for Texan policies on issues like guns and abortion. Furthermore, they speak as if these issues are supremely important to them. Why then would liberal Californians move to a place that has policies they say they hate? Because despite their words, they would rather live in a state with cheap housing and low taxes. In short, they care about their material standard of living more than their professed ideology.
GPT-4 Answer:
It is surprising for liberal Californians to move to conservative Texas because it is generally expected that individuals prefer living in locations where the dominant political ideology aligns with their own. This behavior, however, can be explained by Caplan's assertion that "actions speak louder than words." According to Caplan, individuals may express strong political opinions but prioritize other factors, such as economic opportunities, cost of living, or housing availability, when making decisions about where to live. In this case, liberal Californians may be prioritizing these factors over political alignment when choosing to move to conservative Texas.
Score: 13/15. I took off two points for failing to specifically state that such Californians’ behavior shows that they care less about their “strong political opinions” than they claim.
3. Would it make sense for an Effective Altruist to fund a Universal Basic Income? Why or why not? What about an experiment on the UBI instead? Explain your reasoning.
Suggested Answer:
It would not make sense, because Effective Altruism urges targeted philanthropy – spend all your money wherever it does the most good. The UBI, in contrast, is almost completely untargeted; government gives money to everyone, whether they need it or not. (The only limit, normally, is citizenship – which makes the CBA even worse because citizens of First World countries are especially unlikely to need help). In addition, the UBI creates bad incentives, and requires enormous taxes if funded at an “acceptable” level.
A UBI experiment, in contrast, might be a good way to convince people of the folly of the UBI. For a modest cost, you could persuasively demonstrate the strong disincentive effects, reducing support for this massive waste of resources. Of course, this assumes that fans of the UBI actually care about evidence!
GPT-4 Answer:
It may not make sense for an Effective Altruist to fund a Universal Basic Income (UBI) directly, as their goal is to maximize the impact of their resources, and UBI might not be the most cost-effective way to address global problems. Effective Altruists often prioritize interventions that provide the highest returns on investment, focusing on areas like global health, poverty alleviation, and long-term existential risks. UBI, while potentially beneficial, might not have the same impact as targeted interventions in these areas.
However, funding an experiment on UBI could be in line with Effective Altruism principles. By conducting a well-designed study, Effective Altruists could gather valuable data on the potential benefits and drawbacks of UBI, which could inform future policy decisions and help identify more effective ways to address poverty and inequality. If the experiment demonstrates significant positive outcomes, it may be worth considering as an area of focus for future Effective Altruist initiatives.
Score: 10/15. GPT-4 fails to explain that a UBI is bad by EA standards because it does the opposite of targeting. “Might not have the same impact” is a gross understatement. It also misses the real point of a UBI experiment: To convince believers that this obviously misguided philanthropic strategy is misguided.
4. What does Wang and Wilcox’s work on the “success sequence” show about the cost of providing welfare only to people who are literally “unable to take care of themselves”? Why does SDB make it hard to adopt such a policy? Give details.
Suggested Answer:
The “success sequence” is: finish high school; work full-time; don’t have kids until you’re married. According to Wang and Wilcox, 97% of people who follow this sequence in the U.S. avoid poverty. And many of the 3% are only temporarily poor. The upshot: Providing welfare solely to poor Americans who follow the success sequence would be very cheap, because so few such people exist. SDB, however, makes it hard to adopt such a policy. Saying, “If you were irresponsible, we’re not going to help you” sounds bad; indeed, merely suggesting that people might cause their own poverty via bad behavior sounds bad!
GPT-4 Answer:
Wang and Wilcox's work on the "success sequence" demonstrates that following a specific set of life steps, such as completing education, getting a job, and marrying before having children, can significantly decrease the likelihood of living in poverty. This suggests that providing welfare only to people who are "unable to take care of themselves" could be a more cost-effective approach to social support, as it targets those who genuinely need assistance.
However, Social Desirability Bias (SDB) makes it difficult to adopt such a policy, as it can lead to an over-reporting of positive behaviors and under-reporting of negative behaviors. People might claim that they are unable to care for themselves to access welfare benefits, even if they could potentially follow the success sequence. In addition, SDB might influence policymakers to support more inclusive welfare policies to avoid appearing harsh or uncaring. The desire to be seen as compassionate may lead to policies that provide support beyond the intended target group, increasing the overall cost of welfare programs.
Score: 13/15. Two points off for failing to explain exactly why people would perceive narrowly-tailored policies as “harsh or uncaring.”
5. In the real world, does occupational licensing pass CBA? Carefully explain your reasoning. How could you improve occupational licensing to make the CBA more favorable?
Suggested Answer:
Very unlikely. Licensing raises prices, and normally seems to actually reduce quality. More fundamentally, even if it did raise quality, depriving consumers of the option to save money by buying lower quality destroys value. If raising quality from B to A costs $20, and the consumer values the extra quality at only $10, you have destroyed $10 of value if you deprive him of the option to buy B.
You could improve occupational licensing by weakening it to mere certification, which at least preserves the option to buy the unapproved product for a lower price. Or just get rid of the licensing, and rely on the standard market quality checks of reputation and guarantees.
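The $20-versus-$10 example above can be written as a tiny calculation. This is only a sketch: the function name is hypothetical, and it assumes the consumer still buys the higher-quality option A once B is banned, rather than leaving the market.

```python
def value_destroyed_by_ban(extra_cost, extra_value):
    """Value lost when option B is banned and the consumer must buy A.

    extra_cost:  added cost of raising quality from B to A
    extra_value: what the consumer would pay for that added quality
    Assumption: the consumer still buys A after the ban.
    """
    # A consumer who values the upgrade above its cost would have bought A
    # anyway, so the ban destroys nothing for that consumer.
    return max(0, extra_cost - extra_value)


if __name__ == "__main__":
    print(value_destroyed_by_ban(20, 10))
    print(value_destroyed_by_ban(20, 30))
```

The second call illustrates the other side of the argument: licensing costs nothing for consumers who would have chosen the higher quality regardless, so the loss falls entirely on those who preferred the cheaper option.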
GPT-4 Answer:
In the real world, the CBA of occupational licensing varies depending on the profession and the specific licensing requirements. Some licensing regulations can have positive effects, such as ensuring safety standards and increasing consumer trust. However, in other cases, occupational licensing may create barriers to entry, reduce competition, and increase costs for consumers without significantly improving safety or quality.
To make the CBA of occupational licensing more favorable, policymakers could consider the following:
Tailor licensing requirements to the potential risks and harms associated with a particular profession, focusing on public safety and consumer protection.
Regularly review and update licensing requirements to ensure they remain relevant and do not impose unnecessary burdens on professionals.
Implement less restrictive alternatives to licensing, such as voluntary certification, for professions where the risks of harm are lower.
Encourage reciprocity agreements between states to reduce barriers for professionals seeking to work in different locations.
Streamline licensing processes and reduce costs to lower the barriers to entry and increase competition.
By focusing on the balance between protecting public safety and promoting economic opportunity, occupational licensing can be improved to yield a more favorable CBA.
Score: 11/15. GPT-4 doesn’t discuss the empirical result that licensing reduces quality, and never explains the basic argument that licensing deprives consumers of the option to trade off quality for price.
6. According to Caplan (Labor Econ Versus the World, “The Happy Hypocrisy of Unpaid Internships”), does allowing unpaid internships pass CBA? Carefully explain his reasoning. Is he right?
Suggested Answer:
In this essay, Caplan argues that unpaid internships clearly pass CBA. Why? Because unpaid internships provide vocational education. If it is OK for students to trade tuition for job training in school, why is it wrong for students to trade work for job training in an unpaid internship? Unpaid internships are a valuable exception to minimum wage laws; without such unpaid internships, there would probably be massive unemployment for entry-level jobs requiring extensive training.
Caplan is correct. As he explains, the real problem is that the unpaid internship exception is so narrow! Why shouldn’t young non-college workers have the same option to acquire valuable job training?
GPT-4 Answer:
According to Caplan in "The Happy Hypocrisy of Unpaid Internships," allowing unpaid internships passes the CBA. His reasoning is based on the idea that both the employer and the intern voluntarily agree to the unpaid arrangement, suggesting that both parties perceive benefits that outweigh the costs. Interns gain valuable experience, skills, and networking opportunities, while employers benefit from the additional workforce without incurring the costs of wages.
Caplan also highlights the hypocrisy in labor regulations that ban unpaid internships, as these policies might inadvertently reduce the number of available internships and limit opportunities for young professionals to gain experience. He argues that if the unpaid internships were truly exploitative, people would not willingly participate in them.
Whether Caplan is right depends on one's perspective. Critics argue that unpaid internships can be exploitative, favoring individuals from wealthier backgrounds who can afford to work without pay, and reinforcing social inequalities. However, Caplan's view emphasizes the potential benefits for both parties involved and suggests that the voluntary nature of these arrangements implies a positive CBA.
Score: 12/15. GPT-4 doesn’t explain that unpaid internships provide the same kind of benefits that school is supposed to provide, tuition-free. Nor does it explain that under the current regime, non-college workers miss out on this opportunity.
The wonder is not GPT's performance but that it was done under such a capricious and arbitrary grading scheme. Once again I am grateful never to have faced one of Caplan's tests.
>It also misses the real point of a UBI experiment: To convince believers that this obviously misguided philanthropic strategy is misguided.
This is just blatantly docking points for disagreeing with your politics. It's easy to justify a UBI experiment as an EA even if you think UBI wouldn't be the best thing to *personally* fund, if you think UBI has a good chance of being a significant improvement over the status quo and an experiment might be a low-cost way of helping bring about that change. EAs don't have to *just* be about malaria nets for the very poor.
I also concur with the sentiments of the other posters that the questions and answers are not that difficult but obviously loaded to make a point about your politics.
I wonder if the next generation of AI will seek to do well on tests by predicting how to flatter the graders.