Proofreading Tips

(Afterword: this entry was written around 2004, when I wasn't terribly fluent writing scientific texts in English - things have gotten much easier since, and new writeups now need only a little editing. Nevertheless these comments may still be useful for (foreign?) people who haven't gone through that rite of passage.)

It has often happened, when I ask Eduardo to look at a write-up, that he comes back with criticisms about the form of the text that I could have spotted myself if I had just looked at the draft. This sounds contradictory, since just before giving it to him I had looked at it very thoroughly. It seems that, for some reason, looking at a writeup with a fresh head lets you find mistakes, typos, etc. that you would not be able to find all at once, no matter how hard you try! Looking into this has led me into the (now I know) venerable art of proofreading. Having done this the hard way, I write down here my comments on the subject, hoping that I can shorten someone's toil by making her or him aware of the traps. (As a disclaimer, I apologize for the typos in this text itself! I will try to get rid of them; then again, this is not a peer-reviewed journal submission.)

A First Approach

A first-approach strategy for the (sometimes very annoying) task of writeup debugging is to write something down and correct it until it looks completely right, then let it rest for some time, and then read it again with a fresh head and get rid of whatever remaining errors you find. Then do this again later. And repeat if necessary (which it will be), until all problems have been addressed. (It is important to mention here that we are only talking about formal errors, as opposed to errors in the line of argument.) If this 'maturation' process seems somewhat obsessive, that's because it is. But it is a way to spot virtually all the problems, which is necessary to make the draft kosher. A plus is that it allows one to be freer and more careless when writing the draft the first time, which is not a good moment to worry about too many details, let alone style issues and typos.

After spending a shamelessly high percentage of my working time looking for formal errors in a submission last year (we are talking about hundreds of iterations of the above procedure, no exaggeration), and after spending a few weekends trying in vain to fix the same 30-page writeup and seemingly never getting it done, I read a few websites on the subject and figured that a more systematic approach could be useful. As mentioned above, it is important to wait some time, so that one doesn't systematically ignore the same mistakes every time. One should also choose an editing environment that one feels comfortable with (editing in Acrobat Professional has worked best for me), since making the necessary changes in the file itself afterwards is quite easy. One can potentially trick the brain into thinking that it has a new text in front of it by changing the layout of the text: different font, different format, etc.

Another important point: stop often, and come back often. If you lose concentration, some parts of the text will remain essentially unchecked - but since you don't know which parts, all the text will need to be checked again. The solution: take lots of short pauses. Also, separate the rounds by topic, instead of trying to do everything at once. The first-approach technique will never work completely, and it tends to make one feel very stupid, staring vaguely at the computer for weekends at a time. As to what the topics are, that will depend on which errors you make most commonly. The beauty of the separation is that one can start to learn tricks that are specific to each error type, and become still more proficient with time. The secret to making this work seems to be to really stick to the error type at hand; otherwise the individual rounds will be too long.

The following is how I have ended up doing my own proofreading (this description is a bit specific to the subject at hand, mathematics, and to the typesetting program I use, LaTeX). I start by making a 'proofread mode' version of the draft, in which every sentence is in its own paragraph and ends with a clear word like STOP. This can be done by find-replacing '. ' by '. STOP //' (note the spacing!). I use two columns, which makes the lines shorter overall and allows a really large zoom that otherwise wouldn't be possible. Once the proofread version is ready, I don't bother editing in tex. Instead, I make a .pdf file out of it, and do all the editing in Acrobat. This is a much nicer interface, with mouse scroll, typeset pages, pencil, hyperlinks, find command, etc. I then make all sorts of specialized passes, that is, passes in which only one thing is looked at each time. I take lots of 5 minute breaks, say every page - I stop the moment I feel tired or when I see I am working too mechanically.
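The find-replace step for the 'proofread mode' version can also be scripted. The following is only a minimal sketch in Python, not the procedure actually used for this entry: the function name to_proofread_mode is made up for illustration, and it assumes that a blank line is the desired paragraph break (the '//' of the manual find-replace is replaced by that assumption). Like the manual find-replace, it treats every period followed by whitespace as the end of a sentence, so abbreviations such as "e.g. " will be split as well.

```python
import re


def to_proofread_mode(text, marker="STOP", paragraph_break="\n\n"):
    """Put each sentence in its own paragraph, ending with a marker word.

    Assumption: a sentence ends at a period followed by whitespace.
    A final period with nothing after it is left untouched, exactly as
    a plain find-replace of '. ' would leave it.
    """
    # A function is used as the replacement so that the marker is
    # inserted literally, without any regex escape processing.
    return re.sub(r"\.\s+", lambda m: f". {marker}{paragraph_break}", text)


if __name__ == "__main__":
    draft = "Let x be real. Then y is real. Done."
    print(to_proofread_mode(draft))
```

Run on a copy of the .tex file, never on the original, since periods inside formulas or commands followed by a space would be affected too.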

Every time a large pass is finished, or the .pdf file gets crammed with marks, I go to the .tex file and do lots of "find" commands, making the changes mechanically but carefully. Actually making the changes is quick and easy (and the work is further divided into simpler parts). That said, an especially insidious kind of mistake is to mark a problem in the .pdf file and then fail to fix it afterwards because the mark is overlooked. I try to keep all marks in vertical order in the margin, so that there is a clear algorithm for carrying out the fixes, and, if possible, I only scroll down the .pdf file far enough for the next mark to barely appear. I fix it, and scroll again. That way no mark can, in principle, be missed.

The Rounds

The error types into which I subdivide the rounds are the following (not necessarily in this order).

Sentence Parsing
This is probably the longest round. Pretend that you are not reading English but some foreign language in which you aren't very fluent. Make sure that the syntax is right, by decomposing (i.e. parsing) every single sentence and every part thereof. It does not matter which noun is the subject of the sentence, only that the subject is a noun, etc. In particular, the content is utterly irrelevant and should stay out of the thought process. Do some homework by reading up on when commas should be placed, etc. This sentence parsing round worked wonders for me: it took me a long time to do for 30 journal pages, but I did it leisurely (one could almost say I enjoyed it), and after it all the silly comma mistakes and the like really did disappear.
Equation Parsing
The same as for sentence parsing, but now only looking at the equations. Check that all Greek letters show up as φ and not phi, that parentheses open and close, that indices are in the right place, and that the spacing is correct. It is debatable how much of the content to check here (has an x been written wrongly as y?). If one is to stick to the points above, one shouldn't check the content at all in this round.
Definition Order
Make a list of all the new terms that are introduced, starting after the preliminaries section, and while you do it make sure that every single technical term mentioned in the text has been introduced appropriately, i.e. already shows up in the list when you come to it. Try to group definitions together, and to include most of them in the first section. That way one avoids omitting a definition, or using a term before it is defined, which probably looks worse.
The de-we-ing Issue
There seem to be two modes of writing math (both equally dry). One is the we-mode, in which the authors (and the readers) 'together' follow the arguments, and where many verbs are conjugated in the first person plural. The other is the impersonal mode, where the proverbial "let" comes up again and again. The thing is, and this is a comment that is passed on from mathematician to mathematician, the two modes should be kept separate, and not mixed, say, in the same paragraph. Thus, one can for example keep the impersonal mode for a while, and then shift into we-mode while giving an introduction or so. Since the we-mode is characterised by this pronoun (and the other one isn't necessarily characterised by "let"), one can reduce a passage to the impersonal mode alone by searching for 'we' throughout and de-we-ing the text.
Spell Check
Run a program (for instance in your tex application) to find spelling errors and double words.
Sentence Length
Simply split those sentences that are too long, say, those with more than 30 words. This is where the STOP signs come in handy.
References to Equations, Lemmas and Theorems
Check this once at the end, just to make sure that the references are OK.
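
Some of the rounds above have a purely mechanical part that a small script can do as a first filter before the human pass. The following is a rough sketch in Python under stated assumptions, not a tool that was used for this entry: all function names are made up for illustration, sentences are split naively at '.', '!' or '?' followed by whitespace, only the LaTeX commands \label, \ref and \eqref are considered, and the bracket check looks at characters only (so a half-open interval like [0,1) will be reported as unbalanced).

```python
import re


def split_sentences(text):
    """Naive splitter: a sentence ends at '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def long_sentences(text, max_words=30):
    """Sentence Length: sentences with more than max_words words."""
    return [s for s in split_sentences(text) if len(s.split()) > max_words]


def doubled_words(text):
    """Spell Check: words repeated back to back, such as 'the the'."""
    pattern = r"\b(\w+)\s+\1\b"
    return [m.group(1) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


def we_sentences(text):
    """De-we-ing: sentences that contain the standalone word 'we'."""
    return [s for s in split_sentences(text)
            if re.search(r"\bwe\b", s, flags=re.IGNORECASE)]


def brackets_balanced(equation):
    """Equation Parsing: do (), [] and {} open and close in matching order?"""
    closers = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in equation:
        if ch in "([{":
            stack.append(ch)
        elif ch in closers:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack


def dangling_refs(tex):
    r"""References: keys used in \ref or \eqref that have no \label."""
    labels = set(re.findall(r"\\label\{([^}]*)\}", tex))
    refs = re.findall(r"\\(?:eq)?ref\{([^}]*)\}", tex)
    return sorted(set(refs) - labels)


if __name__ == "__main__":
    sample = r"We note that the the bound holds. See \eqref{eq:bound}."
    print(doubled_words(sample))
    print(we_sentences(sample))
    print(dangling_refs(sample))
```

Each function sticks to one error type, in the spirit of the separate rounds. None of them replaces the actual reading; they only point to places worth a look.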
