Especially in student papers, plagiarism is an issue that, it seems, just won't go away. However, instructors cannot just give up and permit plagiarism without giving up most of their pedagogical goals and ideals. As tempting a behavior as this may be (at least to some students, if not to all), it is our duty to smack it down.
Is there any effective way to deliver a preemptive smackdown to student plagiarists? That's the question posed by a piece of research, "Is There an Effective Approach to Deterring Students from Plagiarizing?" by Lidija Bilic-Zulle, Josip Azman, Vedran Frkovic, and Mladen Petrovecki, published in 2008 in Science and Engineering Ethics.
To introduce their research, the authors write:
Academic plagiarism is a complex issue, which arises from ignorance, opportunity, technology, ethical values, competition, and lack of clear rules and consequences. ... The cultural characteristics of academic setting strongly influence students' behavior. In societies where plagiarism is implicitly or even explicitly tolerated (e.g. authoritarian regimes and post-communist countries), a high rate of plagiarism and other forms of academic dishonesty and scientific misconduct may be expected. However, even in societies that officially disapprove of such behavior (e.g. western democracies), its prevalence is disturbing. (140)
Here, there is some suggestion of potentially relevant cultural factors that may make plagiarism attractive -- and not the cultural factors I tend to hear about here in California, on the Pacific Rim. But maybe we can extend Tolstoy's observation about how each unhappy family is unhappy in its own way to recognize the variety of cultural contexts that spawn dishonest students.
And this is not just a matter of the interactions between students and teachers. Bilic-Zulle et al. point to plagiarism in school as something like a gateway drug for unethical behavior in one's professional life -- so potentially, reducing academic dishonesty could have important consequences beyond saving professors headaches.
In any case, the big question the researchers take on is how to reduce the prevalence. Is it effective to emphasize the importance of academic integrity, or to threaten harsh penalties if plagiarism is detected?
This study focused on medical students at the Rijeka University School of Medicine in Croatia. The medical program there included an assignment required of all second-year medical students in which they were asked to write an original essay based on one of four articles selected by their professor. The researchers examined how the instructions given to the students affected the rate of plagiarism in the essays the students submitted. As well, the researchers focused on word-for-word plagiarism, since that's the least ambiguous variety of plagiarism.
The researchers had done previous studies drawing on the second-year cohort in the 2001/2002 and 2002/2003 academic years. For the essay assignment, the second-year medical students in 2001/2002 were asked to write an original essay about one of the four designated articles. In contrast, the second-year medical students in 2002/2003 were asked to write an original essay about one of the four designated articles and were also given an explanation of what plagiarism is and warned not to commit it in their essays.
The essays each group submitted were examined with the plagiarism detection software WCopyfind. The average plagiarism rate of the two second-year cohorts was 19% -- with no significant difference between the group educated about and warned against plagiarism and the group who had received no explicit instructions to avoid plagiarism.
Explaining what plagiarism is and telling students not to do it, in other words, seemed not to make a difference in what students actually did.
Bilic-Zulle et al. then extended that research to look at another cohort (the second-year medical students in the 2004/2005 academic year) to see whether a different warning to students might make a difference.
Here are some details:
During the mandatory course in Medical Informatics, the students were required to write an essay containing 250-1500 words based on a published scientific article. They could choose among the four articles written in Croatian, of which two were available only in hardcopy format, and two were available in electronic format and posted at the Medical School's website. The topics of one article in hardcopy format and one article in electronic format were less complex, whereas the topics of the other two articles were considered more complex. Complexity of the topics was estimated by the instructor, who also explained the content of each article during the Medical Informatics course. Students were allowed to use of additional literature sources for their essay. (141)
I'm going to pause here and note that, with the type size I normally use, 250-1500 words comes to one to five double-spaced pages. These are not tremendously long essays -- not 50-page tomes where students might be expected to be desperate to find extra words.
The number of students in each cohort in the study ranged from 87 to 111 -- so these are not huge samples, but they're not tiny either. All three cohorts of students were offered the same selection of four articles on which to write their essays. Like the 2002/2003 cohort, the 2004/2005 cohort was explicitly warned not to commit plagiarism (and given the explanation of what counts as plagiarism). Unlike the 2002/2003 cohort, the 2004/2005 cohort was told that their essays would be checked for plagiarism by the plagiarism detection software. The promised penalty for being caught plagiarizing (which I assume was promised to all three cohorts) was that the instructor would not verify their regular attendance of the Medical Informatics course until they submitted a properly written essay that was not plagiarized. (Such instructor verification was necessary for the students to take the final exam, which was itself necessary for the students to pass this required course.)
Worth noting here is that the plagiarism detection software was being used to determine whether the students were committing word-for-word plagiarism from the source article on which they were writing their essay. In other words, they were not checking for instances of plagiarism from other sources (e.g., published papers in the literature that might have responded to a source article, or contributed something in the same general area). Nor were they checking for students plagiarizing the work of other students, something that might plausibly happen in an assignment in a required course that used the same set of four source articles year to year.
I am not sure that the admonition to the students that software would be used to check for plagiarism was clear in conveying the limited scope of this check. (We'll get back to this issue in a moment.) But I have to say, if your instructor provides source articles, explains plagiarism, and then tells you not to plagiarize, plagiarizing from the provided source articles strikes me as pretty dumb. Isn't there a good chance that the instructor who provided those source articles is pretty familiar with what's in them?
Anyway, the students submitted electronic versions of their essays, and the researchers did the analysis for plagiarism:
[T]he body text of each essay was compared to the appropriate source article by using WCopyfind plagiarism detection software. The comparison rules were set in accordance with both the software's author recommendations and available published data respecting "the six words rule". Portions of the text consisting of six consecutive words that matched exactly six consecutive words in the source article were considered to be plagiarized. Proportion of the text copied from the source article was calculated from the number of copied words (in strings of six or more consecutive words) and total number of words. Figures and Tables were excluded from analysis due to software's inability to compare content other than text. All essays were manually (visually) controlled by the investigators to ensure that properly quoted text was not counted as plagiarized text. However, no properly referenced direct quote was found in any of student essays. (142)
If the students in the sample were including direct quotes in their essays, that last observation makes me sad.
For the purposes of their analysis, Bilic-Zulle et al. counted the papers with 10% or less of their word count found by the software to be plagiarized from the source article as "not plagiarized", and those with more than 10% of their word count flagged by the software as "plagiarized".
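For readers who like to see a rule spelled out, here is a rough Python sketch of the "six words rule" and the 10% cutoff as described above. It is not WCopyfind's actual algorithm, and the function names, the simple lowercase tokenization, and the sample sentences are all my own assumptions for illustration.

```python
import re

def words(text):
    # Assumption: lowercase and split on non-word characters.
    # Real detection tools may tokenize and normalize differently.
    return re.findall(r"\w+", text.lower())

def copied_word_flags(essay_words, source_words, n=6):
    """Flag each essay word lying inside a run of n consecutive words found in the source."""
    source_ngrams = {
        tuple(source_words[i:i + n])
        for i in range(len(source_words) - n + 1)
    }
    flags = [False] * len(essay_words)
    for i in range(len(essay_words) - n + 1):
        if tuple(essay_words[i:i + n]) in source_ngrams:
            # Overlapping matches merge, so longer copied runs are fully flagged.
            for j in range(i, i + n):
                flags[j] = True
    return flags

def copied_proportion(essay, source, n=6):
    """Copied words (in runs of n or more) divided by total words in the essay."""
    essay_words = words(essay)
    if not essay_words:
        return 0.0
    flags = copied_word_flags(essay_words, words(source), n)
    return sum(flags) / len(essay_words)

def classify(proportion, threshold=0.10):
    # More than 10% copied counts as "plagiarized"; 10% or less does not.
    return "plagiarized" if proportion > threshold else "not plagiarized"

if __name__ == "__main__":
    # Hypothetical texts, invented purely for illustration.
    source = "the quick brown fox jumps over the lazy dog near the river bank"
    essay = "I noticed that the quick brown fox jumps over the lazy dog yesterday"
    share = copied_proportion(essay, source)
    print(f"{share:.0%} copied: {classify(share)}")
```

In the made-up example, nine of the essay's thirteen words sit inside a copied run, so the essay lands well over the 10% line. A five-word overlap, by contrast, would not count at all under this rule.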
Here's the table from the paper with the results:
The authors note that while there wasn't a significant difference in the median proportion of plagiarized text between the 2001/2002 cohort (given no special warning about plagiarism) and the 2002/2003 cohort (given an explanation of plagiarism and a warning not to do it), there was a significant drop in the median proportion of plagiarized text in the essays of the 2004/2005 cohort that got the warnings that their essays would be run through the plagiarism detection software (down to 2% compared to 17% or 21%).
They also note that the essays of the 2004/2005 cohort were significantly shorter. One wonders whether giving up on expressing complex ideas in one's own words (and instead copying those words from someplace else) leads to verbosity.
Another trend over the three cohorts studied was that each successive cohort chose a higher proportion of source articles that were available in electronic form rather than just hardcopy, and that each successive cohort had a higher number of students electing to write their essays on more complex topics rather than simpler ones. It's not clear what, if anything, these trends have to do with the different warnings the three cohorts got about the originality of their essays. However, as the authors note, these trends tend to weigh against claims that the availability of electronic documents (and of keyboard shortcuts to cut and paste text) is to blame for plagiarism.
In the 2001/2002 cohort, a full 66% (73 students) turned in essays that crossed the researchers' "plagiarized" threshold (more than 10% of the words in the essay copied verbatim, in strings of 6 or more words). The 2002/2003 cohort that received the stern warning not to plagiarize still had 66% (57 students) turning in essays that crossed this threshold. However, in the 2004/2005 cohort that received a warning not to plagiarize and the information that software would be used to scan their papers for plagiarism, only 11% (10 students) turned in papers that met the researchers' definition of plagiarized.
The students, in other words, seemed largely to take seriously the power of the plagiarism detection software.
In case you're curious about the proportion of papers in each cohort with only "a little plagiarism" (i.e., 1-10% of the words in the essay copied verbatim, in strings of 6 or more words), those were 25% for 2001/2002, 26% for 2002/2003, and 51% for 2004/2005. This means that the totally clean essays accounted for only 9% of the 2001/2002 total, 8% of the 2002/2003 total, and 35% of the "scared straight" 2004/2005 total.
This is, as the authors note, a high prevalence of plagiarism.
As mentioned above, the software the researchers used was only checking the student essays for instances of plagiarism that involved verbatim copying from the source article. Other software tools like Turnitin compare the essays they test against a much larger pool of sources, including other articles in the literature and other student papers that have been submitted to Turnitin. However, the researchers noted that tools like Turnitin tend to be set up to check against sources in English. For detecting Croatian-language plagiarism, they would not actually be so powerful.
I'm inclined to think, though, that with a sufficiently vague warning ("We will be using software tools to check your essays for plagiarism") and without firsthand knowledge of the limitations of such software, the students might have assumed more thorough plagiarism detection than the software could actually deliver. Think of it as the same principle upon which polygraph operators depend: a polygraph can detect lies because the subjects of the polygraph test believe that it can detect lies. Maybe part of how the announcement that plagiarism-detecting software will be used actually works to discourage plagiarism is that the students imagine that the software will detect more plagiarism than it actually can.
The authors of this study do not argue that automated plagiarism-detection is the answer to the underlying problem of academic dishonesty. They write:
Although suffering consequences for plagiarism may deter students from doing so, we favor promoting academic integrity and honesty and teaching students how to avoid plagiarism over sole enforcement of strict rules and penalties. Clear rules, code of ethics at universities, and awareness of responsibility among students may significantly contribute to the reduction of academic misconduct. As the values adopted at university will likely be carried into future professional life, it is very important that faculty continue educating students on inappropriateness of plagiarism and create an environment where academic dishonesty will not be tolerated. (146)
I'm definitely on board with the spirit of this -- understanding how academic dishonesty harms the features of the educational experience and the learning community on which you depend is probably a more compelling reason not to cheat than fear of detection and punishment (especially given that some people seem not to think they'll get caught, or they enjoy the risk-taking that goes with the cheating). And practically, it strikes me that this approach is necessary, unless we are willing to impose automated screenings for cheating in the professional sphere as well.
Indeed, we probably need to recognize that as powerful as plagiarism-detection software becomes, there are competing products designed to help students outwit the software that is scanning their papers. (Sadly, it doesn't seem like they do much in the way of training students to give proper citations to the sources of their words and ideas.) It's a technological arms race, which means that no computer program can fully replace the careful eye of a human being reading student papers.
But maybe if students believe the humans reading their essays have sufficiently powerful technology at their disposal, they'll decide bending the rules is too big a risk to take.
Postscript: I've just noticed an article in the Chronicle of Higher Education that raises related issues. I don't know if I'll discuss it in any detail here, but while I'm making up my mind you may be interested in reading it.
Bilic-Zulle, L., Azman, J., Frkovic, V., & Petrovecki, M. (2008). Is There an Effective Approach to Deterring Students from Plagiarizing? Science and Engineering Ethics, 14(1), 139-147. DOI: 10.1007/s11948-007-9037-2