On Plagiarism and Related Issues

Jan 24, 2024

If the plagiarism war has indeed begun, as Ian Bogost has argued, it is likely to be over in short order. The tools for detecting copied language are already widespread and inexpensive, and it won’t be long before the entire corpus of material indexed in Google Scholar has been scrutinized. The incentives—both offensive and defensive—are certainly in place for doing so.

It is worth thinking about the kinds of transgressions that might be revealed in the process. I can think of three distinct types of offense, varying by intentionality and severity.

One category involves the accidental use of the phrasing and arguments of others, as quotation marks and references are dropped through carelessness in the editing process. To take a completely innocuous example, consider the new proposed constitution for the University of Pennsylvania that I discussed in an earlier post. The proposal is clearly influenced by the Kalven Report, from which it borrows language and ideas without quotation or citation. For example, consider this from the proposal:

The agents of dissent and critical discourse within the university are the individual community members of the University of Pennsylvania. The university serves as the hosting entity for these critics, but it does not act as the critic itself.

And this from the 1967 Kalven Report:

The instrument of dissent and criticism is the individual faculty member or the individual student. The university is the home and sponsor of critics; it is not itself the critic.

It seems clear to me that the former passage started life as the latter, perhaps with quotation marks and a reference in place. At some point in the editing process, which involved multiple authors, the quotation marks and the citation were dropped. Further editing then brought the language into better alignment with the rest of the document. There was no intent to deceive, and no real harm done. But strictly speaking, according to guidelines provided to students at Harvard, this qualifies as mosaic plagiarism.
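To make the comparison concrete, here is a minimal sketch of the kind of lexical check a detection tool might run on the two passages quoted above. It is an illustration under stated assumptions, not a description of how any actual plagiarism detector works: the helper names (`words`, `ngrams`, `jaccard`) are hypothetical, and the measures chosen (shared vocabulary and shared three-word runs) are just simple stand-ins for text overlap.

```python
import re

# The two passages quoted above, copied verbatim.
proposal = (
    "The agents of dissent and critical discourse within the university are "
    "the individual community members of the University of Pennsylvania. "
    "The university serves as the hosting entity for these critics, but it "
    "does not act as the critic itself."
)
kalven = (
    "The instrument of dissent and criticism is the individual faculty member "
    "or the individual student. The university is the home and sponsor of "
    "critics; it is not itself the critic."
)


def words(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return the set of all runs of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


proposal_words = words(proposal)
kalven_words = words(kalven)

# Hypothetical measure 1: how much vocabulary the passages share.
vocab_similarity = jaccard(proposal_words, kalven_words)

# Hypothetical measure 2: which three-word runs appear in both passages.
shared_trigrams = ngrams(proposal_words, 3) & ngrams(kalven_words, 3)

print(f"Vocabulary similarity (Jaccard): {vocab_similarity:.2f}")
print("Shared three-word runs:", sorted(shared_trigrams))
```

Under these assumptions, the passages share a good part of their vocabulary but only a single three-word run ("of dissent and"). Light editing is enough to erase most verbatim matches, so a check that counted only exact word sequences would barely register a borrowing that a human reader spots at once.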

I suspect that this kind of thing is extremely common, and can arise even when there is a single author. It can also be very serious. Doris Kearns Goodwin claimed that she had “confused verbatim notes with her own words” while writing one of her books, resulting in a copyright claim and a monetary settlement for an undisclosed sum that the recipient has characterized as substantial.

A second category involves the borrowing of language from technical definitions, descriptions of methodologies, and literature reviews without proper attribution. These things look peculiar in quotation marks, but proper paraphrasing requires time and effort on a relatively mundane task. Ruth Marcus and John McWhorter have both described this as boilerplate plagiarism. It is wrong, and unseemly, but does not warrant retraction of an academic paper and does not compromise the novelty or significance of the ideas contained therein.

The most serious category of plagiarism involves the theft of novel ideas and creations. But this type of activity is the least likely to be detected by the weapons deployed in the plagiarism war.

Consider, for instance, the claim by Arnold Kling that when he was first on the academic job market, he was interviewed by a young economist who “listened to me explain my dissertation, took the idea, and then published it under his own name.” I have no way of knowing whether or not this transpired, and the party accused here may well contest the claim. But suppose that something along these lines did, in fact, occur. It does not seem likely to me that plagiarism detection software, even powered by generative AI, would reliably identify ideas in the dissertation that later made an appearance in the published articles. There would be very little overlap of language, and each document would be written in the author’s own style. Of course, the parties involved could extract and present evidence to bolster their respective claims, but just feeding a large corpus of documents to an algorithm tasked with identifying such theft would result in false positives at an intolerably high rate.

So as the plagiarism war unfolds, we are likely to see countless accusations of relatively minor transgressions with little exposure of really egregious intellectual misconduct. The conflagration will be intense but short-lived, as it quickly burns through the source material. Reputations will be tarnished, corrections made, and quotation marks retroactively inserted. And then perhaps we will start using different words for different categories of offense.

In closing, while we’re on the subject of higher education, there are a couple of things I’d like to get off my chest.

First, I’ll repeat what I have said elsewhere—the treatment of Claudine Gay over the past couple of months has been unspeakably cruel and contemptuous. By 2007 she had five single-authored papers in what I understand to be the top three journals in political science. When she was recruited with tenure by Harvard in 2006, four of these were already in print with the fifth complete and likely in press. Political scientists can and should debate the validity and significance of her findings, but based on journal rankings and single-authorship alone this is a spectacular record for a young researcher to have.

Second, while our major research universities can be legitimately criticized on multiple grounds, they are one of the few sectors in which we continue to lead the world. Higher education, the entertainment industries (including music, movies, and theater), professional sports, software, finance, and consulting are all major export engines that help contain the growth of our trade deficits, and facilitate the large-scale import of consumer electronics and other goods. Perhaps we will see a revival of manufacturing at some point in the future, but until then we will continue to rely on the significant foreign demand for the products of these much-maligned industries. I think that amid all the criticism, some of it well-deserved, it is worth keeping this simple fact in mind.

Imperfect Information