Finding Duplicate Paragraphs in Microsoft Word

If you’ve used MS Word, you’ll be familiar with searching for words and phrases. To bring up the search dialog box in older versions of Word (pre-2007), press Ctrl-F. In newer versions, Ctrl-F still works, but the Advanced Find box is harder to get at: click the down arrow at the right end of the search box. I’ve written a post about Advanced Find in new versions of Word here.

That’s fine, but what about finding duplicate paragraphs in your document? This can happen when several collaborators work on a document and independently paste in the same text. Well, I’ve discovered that there’s a way to find repeated paragraphs. I had to edit another author’s document last week. After getting most of the way through it, a paragraph sounded very familiar. I checked back through the document and, sure enough, he had used the exact same paragraph earlier; both copies had probably been pasted into the document on different occasions. I then found further obvious repeated paragraphs, and it occurred to me: what if I’ve missed less obvious duplicates? Can Word find these repeated paragraphs for me automatically?

I searched around on Google and found one answer that seems to work, and I’m indebted to Klaus Linke, who commented on the Wordbanter forum. Go to the top of your document and open the Advanced Find box as outlined at the start of this post. Paste the following into the Find what box: (^13[!^13]@^13)*\1 (I’ve no idea what it means!) and make sure Use wildcards is checked.

Find repeat paragraphs

When you click Find Next, Word selects a block of text with the repeated paragraph at both the beginning and the end of the selection. Delete the last selected paragraph, then return to the top of the document and repeat the procedure until all the duplicate paragraphs have been removed. Remember that this only works if the paragraphs are exactly the same (same capitalization, same word spacing, etc.). There must also always be at least one paragraph (or at least a double carriage return) between the repeated paragraphs.
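For anyone who prefers to see the idea outside Word, here is a rough Python sketch of the same routine. It is my own approximation, not Klaus Linke's method. As far as I can tell, ^13 is a paragraph mark, [!^13]@ is one or more characters that are not paragraph marks, * is any run of text, and \1 repeats whatever the bracketed part matched. The function name and the newline padding are assumptions of mine (Word has no paragraph mark before the first paragraph, so the padding is a convenience that Word itself does not offer).

```python
import re

# Approximate Python translation of the Word wildcard search (^13[!^13]@^13)*\1:
#   (\n[^\n]+\n)  one whole paragraph, including the marks on both sides
#   .*?           any text in between (standing in for Word's * wildcard)
#   \1            the same paragraph again, character for character
DUPLICATE = re.compile(r"(\n[^\n]+\n).*?\1", re.DOTALL)


def remove_duplicate_paragraphs(text: str) -> str:
    """Repeat 'find, then delete the last selected paragraph' until no match.

    Hypothetical helper: paragraphs are assumed to be separated by "\n".
    """
    # Assumption: pad with newlines so the first and last paragraphs
    # can take part in a match.
    padded = "\n" + text + "\n"
    while True:
        match = DUPLICATE.search(padded)
        if match is None:
            break
        paragraph = match.group(1)
        # The selection ends with the second copy; drop it but keep one mark.
        cut_start = match.end() - len(paragraph)
        padded = padded[:cut_start] + "\n" + padded[match.end():]
    return padded[1:-1]  # strip the padding again


if __name__ == "__main__":
    sample = "Intro text.\nShared boilerplate.\nMiddle text.\nShared boilerplate.\nEnd."
    print(remove_duplicate_paragraphs(sample))
```

The sketch also shows why something must sit between the two copies: the bracketed part swallows the paragraph mark on each side, so two copies directly next to each other share a mark and cannot both match.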

I admit, it’s not the most elegant of solutions, but it works for me. I’m running Word 2010, but the routine should also work in older versions of Word. Have you found a better solution to find repeated paragraphs? Drop a comment below.

4 Responses

  1. Matthew Schmidt Says:

    Thanks for the helpful tip. I am looking for duplicate phrases, not paragraphs, in a lengthy Word doc, so this didn’t solve my problem. Have you heard of a way to search for duplicate phrases? That would be really helpful.

  2. Marius Says:

    Thanks. It’s working

  3. Terry Johnston Says:

    Another option is to convert the text to a one column table, paste it into Excel and use the find duplicates function in Excel.

  4. Ant Says:

    The search term (<?{20,}>)*\1 will find duplicate phrases.

    (Note from techandlife: The less-than symbol and the question mark in the search term above are adjacent, with no space between them.) The “20” represents the minimum length of phrase (in characters). For longer phrases, higher values will help reduce false hits. (Set it too low and it’ll catch repeated words and even letters!) Bear in mind, though, that longer times between finds could give the impression Word has hung, especially with long documents. This wasn’t far from a first attempt, so it could be improved upon.

    This was a handy tip. I hadn’t considered Word would be able to do this. Thanks.
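To illustrate the phrase search from the last comment, here is a rough Python sketch of the same idea. It is only an approximation of the Word search term, on the assumption that < and > mark the start and end of a word and ?{20,} means 20 or more characters. The names MIN_LENGTH and first_repeated_phrase are made up for this example.

```python
import re

MIN_LENGTH = 20  # plays the same role as the "20" in the Word search term

# A phrase here is at least MIN_LENGTH characters that start and end on a
# word boundary; the pattern then looks for the same phrase later in the text.
PHRASE = re.compile(
    r"(\b\w.{" + str(MIN_LENGTH - 2) + r",}?\w\b).*?\1",
    re.DOTALL,
)


def first_repeated_phrase(text: str):
    """Return the first repeated phrase found, or None if there is none."""
    match = PHRASE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "the quick brown fox jumps over. Later, the quick brown fox jumps again."
    print(first_repeated_phrase(sample))
```

As with the Word version, expect this to slow down a lot on long texts, because every candidate phrase has to be checked against everything that follows it.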

