Why Does Google Scholar Not Find My Research Paper?
Google Scholar is an incredibly useful tool for scientists and academics as well as their potential readers, including those all-important hiring committees and funding bodies, but it is essential to be aware that its indexing of publications and citations is far from perfect. Inaccuracies of various kinds are common, as are duplications, omissions and misattributions of publications, and most of these problems stem from the way in which search robots and parser software gather information on scholarly publications. Among the most frustrating situations for academics and scientists arises when Google Scholar simply will not find an important paper and therefore cannot present that publication and the citations it may have received to users or include that vital information in the author’s profile.
Without examining each case in detail, it is difficult to determine exactly why a particular paper might have been ignored by Google Scholar. As the following list demonstrates, however, there may be several different reasons why publications are either omitted altogether or indexed in such a way that they prove virtually impossible to find.
• Google Scholar accepts a wide range of academic and scientific document types, but publications must be accessible on a website with content that is primarily scholarly. Personal websites presenting research papers or their abstracts may not appear scholarly enough to the search robots, but university and publisher sites generally do.
• The full text of a paper, its author-written abstract or its first page must be immediately available at all times to users who follow Google Scholar’s links. If advertisements, login boxes and other distractions are associated with the link to a paper, Google Scholar may ignore it.
• If the website containing scholarly publications is particularly slow, configured incorrectly for Google Scholar’s use or riddled with server errors, the search robots may not find and index the papers available there. The same applies if more than ten clicks are required to move from the home page to the papers.
• The text of papers and abstracts indexed in Google Scholar must be available in searchable pdf files with one paper or abstract per file. A paper that consists of several files or a single file that contains several papers is unlikely to be indexed accurately.
• Documents that exceed 5MB in size will not be indexed by Google Scholar’s search robots. Long theses, monographs or articles with many images should therefore be uploaded to Google Books first and they will then be automatically indexed in Google Scholar.
• Papers that use an unconventional format may not be indexed because Google Scholar’s search robots detect academic and scientific papers through conventional indications of content, such as a large-font title followed by author names at the beginning of the document and a traditional bibliographical list of references at the end.
• Bibliographical information for scholarly papers should also be presented in a conventional manner and must be both complete and accurate or Google Scholar’s parsers may wreak havoc with it. A list of publications with full bibliographical information presented on a simple html page with clear links to pdf files will facilitate indexing, and arranging papers by date of publication can be helpful too.
• When the bibliographical information available for a paper is incomplete, incorrect or presented in an unusual manner, errors can and do occur. This means that a paper may be indexed but not turn up in search results because of the errors, so it is important to search in a variety of ways – by title, author name, journal title, keywords, DOI etc.
• Although Google Scholar aims to be inclusive in terms of research areas and kinds of scholarly documents, the humanities have not yet received the wide and thorough international coverage available for scientific and technical publications, so the topic of a paper may affect its indexing.
• If there are multiple versions of a paper (a conference presentation and a preprint as well as the published journal article, for instance), the final version may not turn up as quickly as it would were it the only version. The versions will eventually be merged in most cases, but changes in the title or author names can delay updated indexing.
• Journals and presses that have been deemed predatory are not included in Google Scholar, so papers published in such venues may not be indexed.
• A paper may be dropped from Google Scholar if it is moved from its original location to a new online home. Whenever a paper is moved, an http 301 redirect to the new location and url should therefore be used.
• It takes weeks, sometimes months, for new papers to be added and even more time (up to a year or more) for updates when changes are made to a paper, so patience is definitely a virtue.
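Two of the points in the list above, the 5MB file limit and the ten-click distance from the home page, lend themselves to a quick check before a paper goes online. The Python sketch below is only an illustration and does not reproduce how Google Scholar’s robots actually crawl a site. It assumes that you can describe your own site’s links as a simple dictionary, and the function names, the example paths and the reading of 5MB as 5 × 1024 × 1024 bytes are all assumptions made for this example.

```python
from collections import deque
from pathlib import Path

# Limits taken from the list above. Treating 5MB as 5 * 1024 * 1024 bytes
# is an assumption of this sketch.
MAX_FILE_BYTES = 5 * 1024 * 1024
MAX_CLICKS = 10


def file_problems(path):
    """Return a list of problems found with one paper file (empty if none)."""
    path = Path(path)
    problems = []
    if path.suffix.lower() != ".pdf":
        problems.append("not a pdf file")
    if path.stat().st_size > MAX_FILE_BYTES:
        problems.append("larger than 5MB")
    return problems


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def hard_to_reach(links, home, papers):
    """Return papers that are unreachable or more than MAX_CLICKS clicks from home."""
    depths = click_depths(links, home)
    return [p for p in papers if depths.get(p, MAX_CLICKS + 1) > MAX_CLICKS]


if __name__ == "__main__":
    # Hypothetical site map: each page lists the pages it links to.
    site_links = {
        "/": ["/publications"],
        "/publications": ["/papers/2021-smith.pdf"],
    }
    papers = ["/papers/2021-smith.pdf", "/old/draft.pdf"]
    # "/old/draft.pdf" is not linked from anywhere, so it is reported.
    print(hard_to_reach(site_links, "/", papers))
```

A paper that appears in the output of `hard_to_reach` is either not linked from the site at all or sits too deep in it; linking it from a single publications page near the home page would address both cases.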
This list does not exhaust the possibilities and contains only the most basic information about omitted papers, but Google Scholar’s technical inclusion guidelines offer detailed instructions on making your website and papers Google Scholar-friendly, as well as troubleshooting notes on identifying and resolving common coverage issues. Alternatively, you may want to set up a Google Scholar profile, which will allow you to add missing papers manually and claim them as your own.