Effective searching for text

To find all copies of one piece of your work online, you need to learn strategies for effective web searching. The single most important tip is this: don't search for words that describe an article; search for phrases that appear in that article - and in nobody else's.

If you are searching for web pages that may contain rip-offs of one of your photos, that means that you should be searching for words that would appear together in a decent caption for that photo, and nowhere else.

Ego-surfing

First, of course, you will look for your own name, in case the rippers-off have been silly enough to include it. This can happen when they're simply unaware that they need your permission to copy your work.

You almost always want to put names in quote marks, thus -

"Jane Smith"

- so that the search engine should find only web pages with that name - "Jane Smith" together as a phrase - not those that merely mention a Charles Smith and a Jane Dickens.

Leave it out!

If you are unfortunate enough to have namesakes widely referenced on the web, you may be able to exclude them. For example Jane Smith could exclude a namesake who sells quilting patterns:

"Jane Smith" -quilt -quilting

- the minus sign should exclude pages containing either of the words "quilt" or "quilting". (The minus sign is the same as the hyphen on a computer keyboard! You will want to use your own name, not Jane's.)

You want to head for the "advanced search" option, if there is one, and look for the place to enter words that "must not be included" or to search for pages that include "none of these words".
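If you run the same kind of search often, it may help to assemble the search string in a few lines of code. What follows is only a sketch in Python: the function name build_query and its parameters are invented for this example, and it does nothing more than put the phrase in quote marks and a minus sign in front of each unwanted word.

```python
def build_query(phrase, exclude=()):
    """Return the phrase in quote marks, followed by -word for each exclusion.

    build_query is a hypothetical helper, not part of any search engine.
    """
    # Strip any quote marks already typed, then add exactly one pair.
    parts = ['"' + phrase.strip().strip('"') + '"']
    for word in exclude:
        # A leading minus sign asks the search engine to leave the word out.
        parts.append("-" + word.strip().lstrip("-"))
    return " ".join(parts)


print(build_query("Jane Smith", exclude=["quilt", "quilting"]))
# prints: "Jane Smith" -quilt -quilting
```

You would paste the printed line into the search box, using your own name rather than Jane's.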

Narrowing it down

Or you could add extra words that narrow the search results down to your work. Think laterally. If you are Jane Smith and you are looking for reports on nuclear power, you may get a lot of results. Instead think of a word that few others use:

"Jane Smith" Magnox

Note this general principle: the more words you put into a search form, the fewer web pages you should get back. That's because the search engine should try to find you pages that contain all the words.
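To see why extra words shrink the results, here is a toy model in Python. It is a sketch only: the four "pages" are invented for illustration, and a real search engine is far more elaborate. The search function keeps a page only if it contains every word asked for, which is the behaviour the principle above describes.

```python
import re

# Hypothetical pages, invented purely for this illustration.
PAGES = {
    "page-a": "Jane Smith reports on the Magnox reactors and nuclear power",
    "page-b": "Jane Smith shares nuclear power statistics",
    "page-c": "Jane Smith sells quilting patterns",
    "page-d": "Charles Smith and Jane Dickens discuss nuclear power",
}


def words_of(text):
    """Return the set of lower-case words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(terms, pages=PAGES):
    """Return the names of pages that contain ALL of the given words."""
    wanted = [term.lower() for term in terms]
    return sorted(
        name
        for name, text in pages.items()
        if all(word in words_of(text) for word in wanted)
    )


print(search(["smith"]))                       # all four pages
print(search(["smith", "nuclear"]))            # three pages
print(search(["smith", "nuclear", "magnox"]))  # one page
```

Note that this toy treats every word separately, so page-d matches "jane" and "smith" even though it mentions no Jane Smith. That is the problem the quote marks solve.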

Of course, it's better if you had the foresight to start doing your journalism under a unique name. There is a reason that the actors' union Equity insists that no two members share a stage name.

Originality rewarded

Far better than searching for your name, though, is searching for the most unusual phrase in an article.

If, for example, you've reviewed a book entitled "Logic Made Easy", a search for that will produce everything everyone's posted to the web about it, plus every page that uses this phrase about anything else. But a search for the phrase

"cooked up by envious environmentalists"

(with quote marks) should produce that article, and only that article. (And of course this page.)
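If you have the text of your article to hand, plus a few other pieces on the same subject, a short script can suggest candidate phrases. The Python below is a rough sketch under stated assumptions: the function names are invented, the sample sentences are made up, and "unusual" here only means "not found in the comparison texts you supply". It cannot know what is on the rest of the web, so treat its output as suggestions to try in a search engine.

```python
import re


def tokens(text):
    """Split a text into lower-case words, keeping apostrophes and hyphens."""
    return re.findall(r"[a-z0-9'-]+", text.lower())


def phrases(text, n):
    """Return every run of n consecutive words in the text, in order."""
    words = tokens(text)
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def distinctive_phrases(article, other_texts, n=4):
    """Return n-word phrases from the article that appear in none of the other texts."""
    seen_elsewhere = set()
    for other in other_texts:
        seen_elsewhere.update(phrases(other, n))
    result = []
    for phrase in phrases(article, n):
        if phrase not in seen_elsewhere and phrase not in result:
            result.append(phrase)
    return result


# Made-up sample sentences, for illustration only.
mine = "The scare was cooked up by envious environmentalists"
theirs = ["The scare was cooked up by the press"]
for phrase in distinctive_phrases(mine, theirs):
    print('"' + phrase + '"')
```

With these samples the script prints the two four-word phrases that contain "envious", each already wrapped in quote marks for pasting into a search form.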

Life beyond Google™

You need to use other search engines as well as www.google.com.

Google's advantage is a patented scheme that frequently displays the web pages that you want first, before less-relevant pages. But www.yahoo.com may include more web pages from many sites than Google does - and if you search cleverly, as described here, you can easily outdo Google's cleverness.

That said, the market dominance of Google has narrowed the field a lot since we first published this advice. Yahoo search is currently powered by Microsoft's www.bing.com search engine. The site www.duckduckgo.com markets itself as preserving your privacy and now apparently has its own search engine, as well as offering results from www.bing.com and www.yandex.ru - which you can use directly by going to www.yandex.com to get its interface in English.
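Trying the same search in several engines means a lot of retyping. One way round that is to generate a link for each engine, as in this Python sketch. The web-address patterns in ENGINES are assumptions based on how these sites' results pages are addressed at the time of writing; they are not guaranteed and may change, so check them in your browser first. The name search_urls is invented for this example.

```python
from urllib.parse import quote_plus

# Assumed address patterns for each engine's results page; verify before relying on them.
ENGINES = {
    "google": "https://www.google.com/search?q=",
    "bing": "https://www.bing.com/search?q=",
    "duckduckgo": "https://duckduckgo.com/?q=",
    "yandex": "https://yandex.com/search/?text=",
}


def search_urls(query):
    """Return a dictionary of engine name -> results-page link for the query."""
    # quote_plus turns spaces into + and quote marks into %22 so the link is valid.
    return {name: prefix + quote_plus(query) for name, prefix in ENGINES.items()}


for name, url in search_urls('"Jane Smith" Magnox').items():
    print(name, url)
```

Each printed link can be opened in its own browser tab to compare what the different engines find.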

All search engines appear to be moving away from strict searching and toward using machine learning (hyped as "artificial intelligence") to try to work out what they "think" you wanted instead of what you specified. This backfires on skilled searchers: the main effect it has is to show you more things you didn't want.

This is annoying. The following tips don't work as well as they used to, but they still help.

Improbable combinations of words

If your article doesn't include any improbable phrases, you can search for an improbable combination of words or phrases:

cooked "envious environmentalists"

should find the above review, and only it (and this), using Google. (In August 2021 it also found a list of apparently-random phrases.) An exercise for you: how does this search differ from the one above that contains the same words?

The general principle is, again: search for a combination of words and phrases that should appear in the web page you are looking for, and no other. Looking for a different piece, the bizarre search

"bicameral mind" "bed-covers"

produced only the original review when we first checked in Google, but now turns up a book quoting it, sometimes an unauthorised "web reprint", this page and a bunch of randoms.
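The difference between a whole quoted phrase and a combination of words and phrases can be explored with a small Python model. This is a sketch of strict matching only, with invented function names and made-up sample sentences; as noted above, real search engines increasingly depart from strict matching.

```python
import re


def normalise(text):
    """Lower-case a text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def split_terms(query):
    """Split a query into terms: each "quoted phrase" is one term, as is each bare word."""
    return [phrase or word for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query)]


def matches(query, page_text):
    """Return True if the page contains every term, with phrases kept in word order."""
    page = " " + normalise(page_text) + " "
    return all(" " + normalise(term) + " " in page for term in split_terms(query))


# Made-up sample sentences, for illustration only.
review = "A scare cooked up by envious environmentalists"
other = "Envious environmentalists cooked the books"

print(matches('"cooked up by envious environmentalists"', review))  # True
print(matches('"cooked up by envious environmentalists"', other))   # False
print(matches('cooked "envious environmentalists"', review))        # True
print(matches('cooked "envious environmentalists"', other))         # True
```

The combination search is looser: it accepts any page where the word and the phrase both appear, wherever they fall. That is why it helps to pick words and phrases that are unlikely to meet on anyone else's page.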

By the way, there's little point searching for a phrase of more than about 10 words, at least initially. Most search engines will ignore excess words and some may get confused.
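If you are copying phrases out of an article by script, you may want to cut them to length automatically. A minimal Python sketch follows; the name trim_phrase is invented, and the default of 10 words is simply the rough figure mentioned above, not a documented limit of any search engine.

```python
def trim_phrase(phrase, max_words=10):
    """Return at most the first max_words words of the phrase."""
    words = phrase.split()
    return " ".join(words[:max_words])


long_phrase = "one two three four five six seven eight nine ten eleven twelve"
print(trim_phrase(long_phrase))
# prints: one two three four five six seven eight nine ten
```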

TIP: Given a page of search results, Windows users can click on a link with the right mouse button and Mac users can hold down the Control key while clicking on it. A menu pops up and you can select either "open in new window" or "open in new tab" from it. This makes it easy to keep your place in the search results.

You should soon acquire the skill of scanning the 20-word extract from a page and its URL to see whether it's worth a quick flip over to look at it. (Like learning to swim, this skill is hard to describe in writing.)

TIP: Google provides an alerts service. Once you have found the perfect search term to find copies of a particular piece of your work, you can ask Google to email you every time a new matching page shows up. When an email arrives you can quickly check whether it's telling you about a new unauthorised copy.

Scanning individual sites

Some sites have useful search facilities of their own. But you can often get better results using the facility Google provides to scan a particular site, with searches such as these increasingly specific examples:

site:channel4.com
site:channel4.com/news
site:channel4.com/news glastonbury

The rule when you specify site: at the start of your "search term" is that immediately after it you type part of the URL - up to and including the .com or .co.uk or .ac.uk or whatever - and then optionally add part of the stuff after the "slash", for example /news. After that you can type a space and more words. Yahoo Advanced Search offers a similar facility - enter this information under "only search in this domain/site".
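The rule can be written out as a few lines of Python. This is a sketch only: site_query and its parameters are names invented for this example, and all it does is join the pieces in the order described, with no space after site: and a space before any extra words.

```python
def site_query(domain, path="", words=()):
    """Build a site: search term from a domain, an optional path and optional words."""
    # No space between "site:" and the domain.
    target = "site:" + domain.strip().rstrip("/")
    if path:
        # Add the part after the slash, for example /news.
        target += "/" + path.strip("/")
    # Extra search words follow, separated by spaces.
    return " ".join([target, *words])


print(site_query("channel4.com"))
print(site_query("channel4.com", "/news"))
print(site_query("channel4.com", "/news", ["glastonbury"]))
```

The three printed lines reproduce the increasingly specific channel4.com examples given earlier in this section.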

More advice and links...
* search.yahoo.com search engine
* www.bing.com from Microsoft
* www.duckduckgo.com/ search anonymously
* Uploaded 09/06/2323: if you have a printout, check the current version at www.londonfreelance.org/feesguide/GeTraRat.html

Text © Mike Holderness & previous contributors; Moral rights asserted. The collection (database right) © National Union of Journalists. Comments to ffg@londonfreelance.org please. You may find the glossary helpful.

The National Union of Journalists must not, can not and would not wish to dictate rates or terms of engagement to members or to editors. The information presented here is for guidance and as an aid to equitable negotiation only.

Suggestions apply to contracts governed by UK law only. In any event, nothing here should be construed as legal advice.