Embedding a repository implies that it is no longer a separate data silo, cut off from the mainstream of the institution’s overall research management processes. Such a separate repository will necessarily struggle to get the attention of most researchers, who will have to engage in a deliberate discrete process to deposit in it. Consequently the repository will have difficulty reaching a critical mass of publications. This has indeed been the experience of many repositories.
Simplifying deposit workflows
One of the key requirements for successful embedding of a repository, regardless of the precise configuration of the other systems in operation to manage research information, is that the submission workflow for researchers is as simple as possible.
Such a workflow should simultaneously:
- Minimise the time and effort required to submit to the repository
- Ensure that all the information needed for multiple purposes – for example, to feed web staff profiles, to support the Research Excellence Framework (REF), to support the institution’s Business and Community Engagement programmes – is captured as efficiently as possible, without duplication of effort
Re-use of information incentivises deposit, but the workflow or flows must not put barriers in the way.
A number of projects are under way that attempt to move deposit closer to users’ current workflows, for example by allowing them to deposit from research publications management systems (RePosit) and from authoring packages such as Word (DepositMO); see below.
This Guide focuses on research, but if learning materials are included in the repository, or in a separate repository for internal use or Open Educational Resources purposes, it will be crucial also to integrate deposit into the workflow for creating teaching and learning materials.
It is also worth considering whether new or changed workflows should be introduced in phases, or as a single change. It may be less disruptive to introduce one single change if the underlying systems will support it: that will avoid resistance to future changes.
“Once they’ve [the academics] put the information into GALA [Greenwich Academic Literature Archive], that’s all they need to do, they’re not asked to repeat that anywhere else…Quite a few schools reuse the data, the information from GALA is fed into staff profile pages.” Nadine Edwards
User requirements and needs
Designing a smooth and simple workflow starts from where the users are currently, even if there is going to be a reconfiguration of the systems for managing research information. For example, as part of the embedding process, the central repository may replace a number of departmental publications databases. These are probably more time-consuming to populate and less efficient; they will certainly have fewer possibilities for re-use and the data could be held in different formats – EndNote or Excel or Access. But from the researchers’ point of view a transition to a central database is not self-evidently positive; there must be no increase of burden on them (preferably a decrease) plus demonstrable benefits.
Equally, an embedded repository may have to integrate with a number of departmental or faculty systems. The I-WIRE project developed a tool called Manage My Publications, within a portal environment, for the submission, indexing, and re-purposing of research outputs in Cardiff University’s Institutional Repository, ORCA. The tool works directly within the intranet, so that when authors log in they can deposit from the portlet presented to them. The project team had to deal with the challenge of working with already established local systems for managing research information, for example in the Medical School, which uses the Symplectic system.
There is a danger of being distracted into seeing such integration as primarily a technical challenge. It is perfectly possible to achieve technical integration and yet, through neglect of user requirements, create a system that does not fulfil its potential because it doesn’t meet the needs of its users. As the Cardiff project team noted:
“Although this sounds like a technical development project, we are conscious that the success of the project depends on how effectively we engage with our users. A crucial first step is to understand the ways in which authors, Schools and administrative Directorates currently manage research data, so that we design a process that best meets their needs.”
The project took a user-centric approach to gathering requirements, with individual interviews with stakeholders and group discussions with academic authors, as described in the final project report:
“We gathered user requirements by arranging individual interviews, using a structured questionnaire [this is included as Appendix B in the report], with research administrators from a wide cross-section of the 29 academic schools and departments. For help in doing this, we utilised existing structures. The subject librarians in particular played an important role in engaging stakeholders and helped the project team gain rapid and direct access to researchers, research directors and heads of school in some instances. We also arranged group sessions with academics using LEAN methodology; these sessions captured and documented the current processes that authors follow for the management of their publications, along with associated issues. The process was looked at end-to-end from identification of research opportunities to production of reports for Schools management, in order to identify any opportunities for linking the deposit process with other processes such as Performance Management. This also gave us a baseline as-is process from which to measure the outcomes of the project. The group then agreed and documented an enhanced and simplified future state process that the same set of authors agreed would encourage them to self-deposit in the repository, along with other requirements that they may have.
After the completion of the requirements capture phase, the project team held group sessions that analysed results of the requirements gathering, accepted and rejected findings. These were then written up as a set of user stories. The Design phase was broken into number of iterations, grouped by theme of user story. We analysed each requirement that had arisen from the interviews and group sessions to see what was feasible within the scope of the project.”
It is worth emphasising that, although it is clearly vital to ensure that depositing is easy and part of a seamless process linked to research workflows, repositories also need to be concerned with usage, both as a goal in itself and in order to incentivise researchers to deposit. This in turn implies an understanding of search and discovery workflows, which have been much studied in recent years, for example in a series of studies sponsored by the Research Information Network. See also the Driving and Measuring Usage section.
It is also important to ensure that researchers have an easy way to correct their records. At Glasgow, this has now been integrated into the university’s helpdesk system, so that researchers can easily send ‘Request a Correction’ emails if they have been mistakenly assigned a publication, for example.
It is very important to test the workflows with users themselves; it may take far longer than you think for a non-expert user to deposit something and the process may have pitfalls which are not apparent to the expert.
Depositing non-text research outputs
Repositories initially focused on textual outputs, generally PDFs. It has become clear that this has short-changed creative arts researchers, who may be keener to have other outlets for disseminating their work than researchers in other disciplines, and therefore be natural users of repositories. JISC has funded several projects, notably KULTUR and KULTIVATE, which have developed solutions for these users.
The KULTUR final report describes the project:
“A detailed user analysis established that the arts community needed a repository that could manage complex objects, capture processes as well as outputs, provide a flexible metadata schema/workflow, and offer a range of options for protecting the copyright of visual and time-based works, all through a highly visual user interface. The added value of a repository for this community resided in its potential to assist researchers in mediating between academic and professional art environments. Using EPrints software, the project developed a demonstrator repository tailored to these needs. The demo was populated with over 300 records of events and artefacts, and was continually refined in response to community feedback. This formed the basis of two new institutional repositories for University of the Arts London and University for the Creative Arts, and enhanced the University of Southampton’s existing institutional repository. The project also investigated policy for the effective management and population of a repository specialising in creative material, with particular attention to rights issues. As well as local benefits to the project partners, the outcomes of the project have a broader application for other institutions seeking a framework for the management of practice-led research outputs.”
The KULTIVATE project is building on KULTUR to share and support best practice in building repositories appropriate to the needs of creative and visual arts researchers.
There is increased interest in using repositories to hold research data, and some subject repositories already provide this facility. Many projects on research data management and curation are currently being carried out (see, for example, the JISC programme), and understanding of the requirements is advancing all the time. One of the key problems is actually identifying what the data is and where it is currently held. On the other hand, many repository teams are already struggling to get enough resources to stimulate and handle greater text deposits, and are reluctant to take on curating data and making it accessible as well. There are also major challenges to meet in metadata, preservation and storage costs.
Finding the right workflow for deposit is not always going to be straightforward. Different institutions will have different approaches, and which one is right for your repository depends on the context—especially what your long-term objectives are. There are several possible variations in use across the spectrum of repositories in UK HEIs.
As part of understanding and designing workflows, the SONEX project identified different deposit scenarios or use cases:
- deposit by author
- deposit by administrative assistant
- deposit as part of research management (e.g. for institutional or funder records)
- deposit by publishers in various ways
- combinations of those (e.g. metadata from an external source, followed by author deposit of the final manuscript).
As an example of an operational system, Cardiff’s Manage My Publications tool offers three choices of deposit route:
- quick deposit, which asks for the minimum amount of publication data to be entered and is auto-populated as much as possible for ease of deposit
- DOI deposit, which uses CrossRef to automatically populate the publication data
- Web of Science deposit, which searches the Web of Science database for an author’s publications.
All three were developed from user suggestions, as was the Selected Publications feature.
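To illustrate how a DOI route can auto-populate publication data, the following is a minimal sketch rather than Cardiff's actual implementation. It assumes the work description has already been fetched and has the general shape that the CrossRef REST API returns under its "message" key (list-valued "title" and "container-title", an "author" list, and "issued" expressed as "date-parts"). The function name, the output field names and the sample record are hypothetical.

```python
from __future__ import annotations

from typing import Any


def record_from_crossref(message: dict[str, Any]) -> dict[str, Any]:
    """Map a CrossRef-style work description to a flat deposit record."""
    # Join given and family names, tolerating entries that lack one of them.
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    # Dates arrive as nested "date-parts" lists, e.g. [[2011, 3]].
    date_parts = message.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    return {
        "doi": message.get("DOI"),
        "title": (message.get("title") or [None])[0],
        "authors": authors,
        "journal": (message.get("container-title") or [None])[0],
        "year": year,
    }


# Hypothetical sample data, invented purely for illustration.
sample = {
    "DOI": "10.1234/example.5678",
    "title": ["An Example Article"],
    "author": [{"given": "Ada", "family": "Lovelace"}, {"family": "Consortium"}],
    "container-title": ["Journal of Examples"],
    "issued": {"date-parts": [[2011, 3]]},
}
record = record_from_crossref(sample)
```

In a workflow of this kind the author would only need to confirm the pre-filled fields and attach the full text, which is the saving in effort that the DOI route is intended to provide.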
Allowing several categories of depositor, including authors themselves, is the most generally accepted model. As part of the Names project, repository managers were surveyed on a number of issues, including what categories of user were entitled to deposit to their repository. 65 responded. The majority of repositories (over 83%) surveyed indicated that IR managers/administrators, authors/contributors and faculty members are entitled to submit resources to their repository. In a few repositories, materials can also be added by departmental administrators. Nine respondents indicated that only repository staff could submit items to the repository (i.e. the repository was a fully mediated one).
The needs of a diverse group of depositors should be considered particularly carefully when designing interfaces and instructions.
Deposit from within other systems
In order to get closer to users’ workflows, it’s necessary to find ways to enable deposit from within other systems, rather than as a discrete repository upload process. JISC has funded several projects under the Depo programme to test processes and design tools for this purpose.
The changing nature of the information ecosystem should be borne in mind when considering deposit from other systems. The other systems themselves may be subject to change and development, so care needs to be taken to ensure that any systems and interfaces used in such workflows are kept up to date, and that the effort required is factored into relevant plans and budgets.
Although there is no single established model of deposit from other systems, several implementations and projects indicate possible directions for an embedded repository. These provide some indication of what is feasible and the main implementation issues:
From CRIS or publications management systems
The JISC-funded RePosit project involves five universities (Exeter, Keele, Leeds, Plymouth and Queen Mary University of London) at various stages of implementing publications management systems. It focuses on increasing deposit by using these systems (in this case Symplectic, which is also a project partner) as the primary interface to the repository. With the Pure CRIS, by comparison, deposit and workflow take place in Pure, and the full text plus related metadata is pushed through to the repository (subject to any embargo period and copyright clearance by repository staff) with no additional metadata entry or workflow within the repository itself.
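The two conditions attached to a pushed item (embargo period and copyright clearance) can be sketched as a simple release check. This is an illustrative assumption about how such a rule might be expressed, not a description of how Pure, Symplectic or any repository platform implements it. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PushedItem:
    """An item pushed from a CRIS to the repository (hypothetical model)."""
    title: str
    has_full_text: bool
    copyright_cleared: bool               # set by repository staff after checking
    embargo_until: Optional[date] = None  # None means no embargo applies


def full_text_visible(item: PushedItem, today: date) -> bool:
    """Release full text only once it is cleared and any embargo has passed."""
    if not item.has_full_text or not item.copyright_cleared:
        return False
    return item.embargo_until is None or today >= item.embargo_until


embargoed = PushedItem("Example paper", True, True, date(2012, 6, 1))
uncleared = PushedItem("Example chapter", True, False)
```

Under this sketch the metadata record can appear immediately while the full text stays hidden until both conditions are met, so nothing needs to be re-entered in the repository itself.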
In relation to Symplectic, the team responsible for the White Rose Research Online (WRRO) repository has noted that the implementation at Leeds has changed their processes:
“It is likely that, for Leeds, Symplectic will become the primary ingest route for both metadata and full text. This changes the way we work in some significant ways. Potentially, we have a source of high quality metadata for publications. But we also lose control of metadata quality as the Symplectic installation becomes our metadata authority source for any records that co-occur in Symplectic and WRRO…..As Symplectic is set up to email individual authors directly, we potentially have a new mechanism for reaching out to authors and reminding them to deposit their research outputs.”
Clearly this type of workflow is likely in a number of institutions with pre-existing repositories where the decision has been made to purchase external systems or to upgrade existing research management systems to cope with the demands of the REF. Examples of how metadata flows in these types of embedded repositories are in the Metadata Creation and Flows section.
From authoring software (office applications and content management systems)
The DepositMO project is collaborating with Microsoft to enable the easy deposit of in-progress and completed works from common authoring packages on the user’s desktop to repositories built on the most popular software platforms. It is also examining the question of using repositories as collaborative authoring environments. DepositMO employs the SWORD (Simple Web-service Offering Repository Deposit) protocol, for which there are clients for the desktop, Facebook, Microsoft Office and the web.
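As a rough illustration of what a SWORD client assembles when depositing a file, the sketch below builds (but does not send) an HTTP request description. It is not DepositMO's code. The header names and the packaging URI follow our reading of the SWORD 2.0 profile and should be checked against the specification and the target repository's service document. The collection URL, function name and file content are hypothetical.

```python
def build_sword_deposit(
    collection_url: str,
    filename: str,
    payload: bytes,
    content_type: str = "application/octet-stream",
    in_progress: bool = False,
) -> dict:
    """Describe a binary file deposit as an HTTP POST (nothing is sent)."""
    headers = {
        "Content-Type": content_type,
        "Content-Disposition": f"attachment; filename={filename}",
        # Assumed packaging identifier for a single binary file.
        "Packaging": "http://purl.org/net/sword/package/Binary",
        # "true" signals a work in progress that the author will complete later.
        "In-Progress": "true" if in_progress else "false",
        "Content-Length": str(len(payload)),
    }
    return {"method": "POST", "url": collection_url, "headers": headers, "body": payload}


# Hypothetical collection URL and file content, for illustration only.
request = build_sword_deposit(
    "https://repository.example.ac.uk/sword2/collection/articles",
    "draft-paper.pdf",
    b"%PDF-1.4 example bytes",
    content_type="application/pdf",
    in_progress=True,
)
```

The in-progress flag is what would let an authoring package save an unfinished draft to the repository and complete the deposit later, which matches the project's interest in both in-progress and completed works.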
Open Access Repository Junction (OARJ) is a project to facilitate the deposit of outputs into more than one repository and to improve interoperability between repositories. This is to be achieved in one of two ways: either by giving depositors information about suitable repositories for their output (based on the metadata or the file itself), leaving them in control of the deposit process, or through an automated ‘deposit broker’ service. As the repositories may not even be in the UK, this project has a wide scope. The broker service helps to fulfil organisational mandates because a depositor can register with the broker instead of registering with each repository, and the target repositories can also register, so that the broker establishes the trusted relationship with both parties and can provide a traceable record.
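The broker idea can be sketched as a small registry that matches an output's metadata to registered repositories and keeps a record of what was sent where. This is a simplified illustration under our own assumptions (matching on subject keywords only), not OARJ's design. All class names, repository names and metadata fields are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    subjects: set[str]  # subjects this repository accepts (assumed matching rule)


@dataclass
class Broker:
    repositories: list[Repository] = field(default_factory=list)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def register(self, repo: Repository) -> None:
        """Target repositories register once with the broker."""
        self.repositories.append(repo)

    def suggest(self, metadata: dict) -> list[str]:
        """Advisory mode: name suitable repositories, depositor decides."""
        wanted = set(metadata.get("subjects", []))
        return [r.name for r in self.repositories if r.subjects & wanted]

    def deposit(self, depositor: str, metadata: dict) -> list[str]:
        """Automated mode: route to every match and keep a traceable record."""
        targets = self.suggest(metadata)
        for name in targets:
            self.audit_log.append((depositor, metadata["title"], name))
        return targets


broker = Broker()
broker.register(Repository("Institutional Repository", {"medicine", "economics"}))
broker.register(Repository("Life Sciences Subject Repository", {"medicine"}))
broker.register(Repository("Social Science Subject Repository", {"economics"}))

paper = {"title": "Example trial", "subjects": ["medicine"]}
sent_to = broker.deposit("author@example.ac.uk", paper)
```

The `suggest` method corresponds to the advisory mode, in which the depositor stays in control, and `deposit` to the automated broker mode. The audit log stands in for the traceable record that the broker can provide to depositors and repositories.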
Glasgow is co-operating with OARJ to implement functionality between its repository and both UKPubMed and the publishers of Nature. Cardiff has also developed a system for sending data from its repository to the Social Science Research Network (SSRN) when papers in the relevant disciplines are deposited.
Boosting deposit through automation and services
There is an extremely useful account with practical examples of how the White Rose Research Online repository explored how to increase its level of deposits through taking advantage of opportunities to automate deposits, use plug-ins and tools, and import records in bulk from local publication databases, researchers’ personal web pages and subject repositories. See the IncReASe final report.