Open Access repositories may be institutionally-based, enhancing the visibility and impact of the institution, or they may be centralised, subject-based collections like the economics repository RePEc (Research Papers in Economics) or the physics repository, arXiv. Institutional repositories are digital collections of the outputs created within a university or research institution. Whilst the purposes of repositories may vary (for example, some universities have teaching/learning repositories for educational materials), in most cases they are established to provide Open Access to the institution’s research output.
There are currently just over 1,400 repositories around the world. Over the past three years the number has been growing at an average rate of one per day. Statistics on the number of repositories and their locations can be found in the Registry of Open Access Repositories (ROAR) and in the Directory of Open Access Repositories (OpenDOAR). Repositories are also shown on a world map at Repository66.
Repositories adhere to an internationally agreed set of technical standards, which means that they all expose the metadata (the bibliographic details such as author names, institutional affiliation, date, title of the article, abstract and so forth) of each item they contain in the same basic way. In other words, they are ‘interoperable’. The common protocol to which they all adhere is called the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The contents of all repositories are then indexed by Web search engines such as Google and Google Scholar, creating online Open Access databases of freely-available global research. As the level of self-archiving (the process by which authors deposit their work in repositories) grows, the Open Access corpus will represent an increasingly large proportion of the scholarly literature.
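To make the idea of interoperability concrete, the following is a minimal sketch in Python of how a harvester might ask a repository for its records over OAI-PMH and read the bibliographic details from the reply. It is an illustration under stated assumptions, not a full harvester: the base URL repository.example.org, the record identifier and the sample response are all invented for the example, and the sketch assumes the repository serves the basic Dublin Core format (metadataPrefix oai_dc). No network request is made; the sample reply stands in for what a real repository would return.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH responses and by Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL for an OAI-PMH ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract identifier, titles, creators and dates from a ListRecords reply."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
            "dates": [e.text for e in record.iterfind(".//dc:date", NS)],
        })
    return records


# Hypothetical reply, shortened to a single invented record.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Article</dc:title>
          <dc:creator>Author, A.</dc:creator>
          <dc:date>2009-01-15</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    for item in parse_records(SAMPLE_RESPONSE):
        print(item["identifier"], item["titles"], item["creators"], item["dates"])
```

Because every compliant repository answers the same request in the same structure, the same few lines can in principle be pointed at any repository's OAI-PMH address, which is what allows aggregators to collect metadata from many repositories at once. A real harvester would also need to handle large result sets delivered in several parts and error responses, which this sketch omits.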
What types of content do repositories collect?
The primary type of content in OA repositories is the peer-reviewed journal literature. Authors benefit from increased visibility for their work and concomitant citation impact. A collection of the journal articles published from an institution, provided in Open Access through the repository, also gives the institution’s research programme worldwide visibility and increases its impact. OA repositories can collect journal articles because the majority of journals permit authors to archive their post-prints in an institutional or disciplinary repository. These include major commercial publishers, such as Elsevier, and many of the large scholarly society publishers. The SHERPA-ROMEO service, which monitors publishers’ so-called ‘self-archiving’ policies, shows that the majority of publishers allow authors to deposit a copy of their published paper into a repository.
Now that research data are increasingly created in digital form, repositories are developing methods for collecting the data that underpin peer-reviewed articles. More and more research funders are requiring their grant-holders to make their data Open Access, once they have themselves analysed and published their findings from the data. This is in order that other researchers can use the data to verify results, to compare with their own data or to re-use in some way to generate new data and knowledge. Datasets may be of many types – spreadsheets, photographs, audio files, video files, representations of artwork, diagrams, charts and so on. They may even be ‘complex objects’, that is, combinations of several types of data, such as a numerical dataset recording weather patterns with accompanying satellite images.
A growing number of institutional repositories also contain books or book chapters. Books are often written for monetary gain (royalties on sales) and in such cases authors may be reluctant to make them available for free in a repository. In these cases it is still important for the book to be deposited, with the metadata (title, author, synopsis, publisher details, etc) on display, but the text may be ‘hidden’ from viewers. Having the metadata visible means that the book is counted in the institution’s assessment procedures and it can be located by Web search engines. The evidence is accumulating, however, to show that when the entire content of a book is visible in a repository, sales of the book frequently rise. This is because the visibility in the repository is raising awareness of the book and promoting it to an audience which is then likely to buy the book if it seems relevant to their work. It is analogous to what Amazon offers with its ‘Look inside’ facility.
As well as the types of content described above, institutional repositories frequently contain theses, dissertations and other research-related outputs such as presentations and images.
Who uses institutional repositories?
Because Google and the other Web search engines index the content of repositories, anyone with internet access can find themselves arriving at an article or dataset in a university or research institution’s repository via a Web search. But there are other ways that repositories are used, too. Users may search a particular repository for work by a specific researcher at that institution. Or they may follow a link from another researcher’s website or blog. Although these specific ‘referrals’ are not uncommon, by far the most usual way for searchers to arrive in a repository is through a Web search engine such as Google. Les Carr’s data on how the repository at Southampton University is used showed that Web search engines accounted for 64% of user traffic into the repository. This underlines how important the informal ‘world research databases’ created by the Web search engines are for repositories and their institutions.
How do repositories fit into the scholarly communication landscape?
Repositories form a permanent and critically important part of the scholarly communication process. Their first role is to provide the Open Access literature. Additionally, services may be added to repositories to provide extra functionality. For example, a usage-reporting service gives authors and the institution information on how the content of the repository is being used. A search service may help users find specific items more easily. A service that organises content in specific ways may help authors, for example, to download a list of articles into their CV, or help institutions to assess their research programme, report data to governments or meet other statutory requirements. We may be looking forward to a time when repositories play a formal role in the publishing process. Repositories could collect articles from the institution’s authors when they are ready for peer review, and a peer review service could then collect them from the repository for processing. There are already signs of these things happening. A few scholarly society publishers encourage authors to notify them when a paper has been deposited in a repository and is ready to be peer reviewed and published. Some university presses are working hand-in-hand with the repository when publishing books by institutional authors.
Libraries and institutional repositories
The ARL report, The Research Library’s Role in Digital Repository Services, states that "the delivery of repository services is a crucial function of research libraries" (p. 35). Libraries deploy repositories to support open access, but also to collect, preserve and provide access to a broad range of content produced by the university community. Libraries can choose from a variety of open source or proprietary software platforms, or contract with a company to manage the repository for them. Most institutional repositories (IRs) are organized according to disciplinary community, and libraries usually work with individual research units to determine collection policies for each community. Building a repository is a fairly simple process for those who have the appropriate technical expertise, and the costs of managing an IR are not excessive, involving mainly staffing to maintain the software and promote the repository on campus.
There are many national and international repository initiatives that aim to enhance the value of individual repositories through content aggregation and the development of common services. Examples include DRIVER (EU), DRF (Japan), JISC (UK), DARE (Netherlands), HAL (France) and the CARL IR Program (Canada), among others.