Jach's personal blog

(Largely containing a mind-dump to myselves: past, present, and future)
Current favorite quote: "Supposedly smart people are weirdly ignorant of Bayes' Rule." William B Vogt, 2010

Plagiarism in Code

Note I'm not talking about copyright infringement, which is a legal matter and can even be criminal, but mere academic plagiarism, which is an institutional matter whose probable worst-case outcome is expulsion and reputation damage. Nevertheless it falls inside the sphere of the often- (and ill-) used concept of intellectual property.

I've always had a problem with plagiarism, both as a concept and as something to avoid. In English class, here's some teacher asking me to analyze a literary work. Okay. The literary work is simple, so what I turn in is nearly identical in information to many other students' work, but with my own phrasing. If I happen to match the phrasing of another student in many instances, or, now with modern tools, anyone on the entire internet, then the teacher suspects plagiarism. How can I avoid something that could happen due to chance alone? (By lowering the chance, but that has to pollute the production in some way. I might normally say "Candy tastes good." To avoid plagiarism accusations, I might say "The bumps on the muscle that resides in my head, in the mouth, normally called the 'tongue', detect the chemicals in this sugary substance about to enter my digestive system and tell my brain it's okay to continue." You can continue that line by asking "why?" of every statement and adding a "because" for each of them (potentially adding "because"s to each "because"). I could have gone into evolutionary biology with the above.)

My eventual conclusion is that policing plagiarism is not meant to measure uniqueness of thought, nor is it an effort to check whether a student understands something; it is an attempt to determine whether work was done according to some specification. Even plagiarizers have to do some work, sometimes more work, depending on the circumstances, but it's not the kind of work desired. So plagiarism rules are very much an effort to regulate the way of doing something, even if the outcome is the same.

Now let's talk a bit about code. The way professors want their students to do the coding assignments is to sit down and code it, potentially getting help from X sources but never copy-pasting code. This is because (apart from a very small number of geniuses) you can't learn how to program just by looking at code. You have to program stuff. So professors want to make sure you're programming so that you learn, with the threat of expulsion if you don't.

Yet just "programming stuff" isn't sufficient for learning how to program either. I used to think "programming it shows you can understand it." This isn't really the case; programming it shows you can program it. Whether you understand what you're programming, or whether you will understand it in the future, has little to do with whether you can program it at this instant. To any professors doubting my claim, I ask you to cross-reference students' homework assignments with their exams. An exam question might ask for exactly what the student had to do to program the assignment, yet students still get it wrong even if they did fine on the assignment.

So what good was the assignment anyway? If those who cheat and those who don't cheat can both miss a concept on the final that was in the assignments, has learning taken place in either of them? So why get mad at the cheater?

The plagiarism attitude in programming classes in particular is bad for the profession. While it may cultivate a mentality of always reinventing the wheel, that's not usually productive or beneficial except in a few specific cases. In a C class you teach the student from day 1 to "#include ", which is, guess what, code they didn't write. Plagiarism!

"No, it's GPL'd/free for education."

Oh, so then for my Linked List assignment I can use this MIT licensed version I found on GitHub?

Of course not, since we've already established that the point of plagiarism rules is to make you do work a particular way. If the work was "go on GitHub and find a linked list class to use and send me a link", and my friend finds a link and sends it, he's fine. But if I just copy his email and send it too, I'm plagiarizing. Even if someone else found the same link, they did the actual work of "finding". For a more serious example, I might copy someone's list sorting function exactly, but change the variables and coding style so no one can tell I plagiarized it. Even though it compiles to the same assembly, it's "different" and I can get away with it. This is ridiculous. Why do we bother doing these things?
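To make the "copied but disguised" sorting function concrete, here is a minimal sketch. It is a Python analogue rather than C, so it compares CPython bytecode instead of assembly, and it assumes the copy keeps the same statements and only changes names and spacing. The names `insertion_sort`, `arrange`, and `same_bytecode` are made up for illustration.

```python
def insertion_sort(items):
    # The "original" submission.
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def arrange(xs):
    # The "disguised" copy: same statements, new names and spacing.
    out = list( xs )
    for a in range( 1, len( out ) ):
        cur = out[ a ]
        b = a - 1
        while b >= 0 and out[ b ] > cur:
            out[ b + 1 ] = out[ b ]
            b -= 1
        out[ b + 1 ] = cur
    return out


def same_bytecode(f, g):
    # Compare the raw compiled instructions (assumes CPython);
    # local variable names are stored separately, not in these bytes.
    return f.__code__.co_code == g.__code__.co_code


data = [5, 2, 9, 1, 5]
print(insertion_sort(data) == arrange(data))
print(same_bytecode(insertion_sort, arrange))
```

A reader comparing the two sources line by line sees different text, while a comparison of what the interpreter actually runs does not care about the renaming.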

Well, the school wants to be able to say with some authority that the person to whom a degree was granted "earned" it in some way. The whole purpose of grades, testing, and catching cheaters is so a school can determine whether a certain person deserves the credentials it gives them. Ideally we wouldn't need them, but you tell me a better way apart from projects (which have their own problems if they're in groups) to measure a student's proficiency in something while remembering they're taking 5 other classes at the same time with professors who may do the same thing. I think tests are the way to go, with optional assignments; then you employ all your cheating filters at test-time. (Along with devoting more time to creating tests that are harder to game.)

Morally, of course, it seems that passing off someone else's work as your own should be frowned upon. I think a Massive Reputation System would take care of that problem, though, rather than calling it a crime. And anyway, plagiarism freaks are inconsistent about what constitutes someone else's work, and you end up with stupid things like "A single-line for loop counter you write yourself is yours; one you copy-paste from somewhere is someone else's, and if we allow you to use it at all it must be cited." Here's an idea: stop giving programming assignments that are so easy to plagiarize with a single Google search. In other subjects where plagiarism is common, stop giving research paper assignments without a project component to apply the research.

Posted on 2011-07-25 by Jach

Tags: intellectual property, school

