DSpace Basics
Training
Documentation
https://wiki.lyrasis.org/display/DSDOC7x
Technical FAQ
https://wiki.lyrasis.org/display/DSPACE/TechnicalFAQ
How to Guides
https://wiki.lyrasis.org/display/DSPACE/How+To+Guides
3rd party Tutorials
Objects Definition:
Object | Example |
Community | Laboratory of Computer Science; Oceanographic Research Center |
Collection | LCS Technical Reports; ORC Statistical Data Sets |
Item | A technical report; a data set with accompanying description; a video recording of a lecture |
Bundle | A group of HTML and image bitstreams making up an HTML document |
Bitstream | A single HTML file; a single image file; a source code file |
Bitstream Format | Microsoft Word version 6.0; JPEG encoded image format |
Authorization
DSpace's authorization system is based on associating actions with objects and the lists of EPeople who can perform them. The associations are called Resource Policies, and the lists of EPeople are called Groups. There are two built-in groups: 'Administrators', who can do anything in a site, and 'Anonymous', which is a list that contains all users. Assigning a policy for an action on an object to anonymous means giving everyone permission to do that action. (For example, most objects in DSpace sites have a policy of 'anonymous' READ.) Permissions must be explicit - lack of an explicit permission results in the default policy of 'deny'. Permissions also do not 'commute'; for example, if an e-person has READ permission on an item, they might not necessarily have READ permission on the bundles and bitstreams in that item. Currently Collections, Communities and Items are discoverable in the browse and search systems regardless of READ authorization.
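The model described above can be illustrated with a small sketch. This is not DSpace code: the class and function names (`ResourcePolicy`, `PolicyStore`, `is_authorized`) and the object IDs are hypothetical, chosen only to show default-deny behaviour, the implicit Anonymous membership, and the fact that a policy on an item does not carry over to its bitstreams.

```python
from dataclasses import dataclass, field

ANONYMOUS = "Anonymous"
ADMINISTRATORS = "Administrators"


@dataclass(frozen=True)
class ResourcePolicy:
    """Associates one action on one object with one group (hypothetical model)."""
    object_id: str
    action: str
    group: str


@dataclass
class PolicyStore:
    policies: set = field(default_factory=set)
    memberships: dict = field(default_factory=dict)  # eperson -> set of group names

    def groups_of(self, eperson):
        # Every user is implicitly a member of the Anonymous group.
        return {ANONYMOUS} | self.memberships.get(eperson, set())

    def is_authorized(self, eperson, action, object_id):
        groups = self.groups_of(eperson)
        if ADMINISTRATORS in groups:
            return True  # administrators can do anything in the site
        # Default deny: only an explicit policy on this exact object grants access.
        return any(
            ResourcePolicy(object_id, action, group) in self.policies
            for group in groups
        )
```

With a single 'anonymous' READ policy on an item, any visitor can read that item, but a bitstream inside it stays unreadable until it receives its own policy.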
Collection
ADD/REMOVE | add or remove items (ADD = permission to submit items) |
DEFAULT_ITEM_READ | inherited as READ by all submitted items |
DEFAULT_BITSTREAM_READ | inherited as READ by Bitstreams of all submitted items. Note: only affects Bitstreams of an item at the time it is initially submitted. If a Bitstream is added later, it does not get the same default read policy. |
COLLECTION_ADMIN | collection admins can edit items in a collection, withdraw items, map other items into this collection. |
Item
ADD/REMOVE | add or remove bundles |
READ | can view item (item metadata is always viewable) |
WRITE | can modify item |
Bundle
ADD/REMOVE | add or remove bitstreams to a bundle |
Bitstream
READ | view bitstream |
WRITE | modify bitstream |
DSpace Item State Definitions
https://wiki.lyrasis.org/display/DSDOC7x/DSpace+Item+State+Definitions
Workspace item
An item that is under submission and active edit by an authorized user. The workspace item is visible only to the submitter and the system administrators. (Currently there is no simple way to find/browse such items other than with the direct item ID or to use the supervisor functionality). Using the supervisor functionality, a system admin can allow other authorized users to see/edit the item in the workspace state.
Expected use cases:
- Self deposit
- Collaboration over an in-progress submission for a small group of researchers. (This use case is implemented only with major limitations, using the supervision feature – concurrency, lack of delegation: supervision must be defined by the system administrators, etc.)
Workflow Item
An item that is under review for quality control and policy compliance. The workflow item is visible to the original submitter (currently only basic metadata are visible out-of-box in the mydspace summary list), users assigned to the specific workflow step where the item resides, and system administrators. (Currently there is no simple way to find/browse such items other than with the direct item ID or to use the abort workflow functionality).
Expected use cases:
- Quality control
- Improvements to the bibliographic record (metadata available in workflow can be different than those asked of the submitter)
- Check of policy / copyright
Withdrawn item
Withdrawal is a logical deletion. The item can be restored, and the withdrawn state keeps a record of what was once available on the public site.
Expected use cases:
- Staging area for item to be removed when copyright issues arise with publisher. If the copyright issue is confirmed, the item will be permanently deleted or kept in the withdrawn state for future reference.
- Logical deletion delegated to community/collection admin, where permanent deletion is reserved to system administrators
- Logical deletion, where permanent deletion is not an option for an organization
- Removal of an old version of an item, forcing redirect to a new up-to-date version of the item (this use case is not currently implemented out-of-box in DSpace, see )
Private item
This state should only refer to the discoverable nature of the item. A private item will not be included in any system that aims to help users to find items. So it will not appear in:
- Browse
- Recent submission
- Search result
- OAI-PMH (at least for the ListRecords and ListIdentifiers verb; though the OAI-PMH specification is not clear about inconsistent implementation of the ListRecords and GetRecord verb)
- REST list and search methods
It should be accessible under the actual ACL rules of DSpace using direct URL or query method such as:
- Splash page access (i.e. /handle/<xxxxx>/<yyyyy>)
- OAI-PMH GetRecord verb
- REST direct access /rest/item/<item-id> or equivalent
Expected use cases:
- Provide a light rights awareness feature where discovery is not enabled for search and/or browse
- Hide “special items” such as repository presentations, guides or support materials
- Hide an old version of an Item in cases where real versioning is not appropriate or liked
- Hide specific types of item such as “Item used to record Journal record: Journal Title, ISSN, Publisher etc.” used as authority file for metadata (dc.relation.ispartof) of “normal item”
Archived/Published item
An item that is in a stable state, available in the repository under the defined ACL rule. Changes to these items are possible only for a restricted group of users (administrators) and should produce versioning according to the Institution's policy.
Embargoed Item
https://wiki.lyrasis.org/display/DSDOC7x/Embargo
Embargoed items are a special case of the Archived/Published item. The item has a time-based access policy attached to it and/or to its underlying bitstreams: specifically, READ permission for some EPerson Group that takes effect from a defined date. Typically the embargo is applied to the bitstreams, so that the "fulltext" initially has very limited access (normally administrators or other "repository staff" groups) and becomes visible to all users (the Anonymous group) only after a defined date. This scenario is used to implement typical "embargo requirements" from publishers (see Delayed Open Access).
If the metadata of the item should be visible only to a specific group of users, it is also possible to define an embargo policy for the ITEM itself. A READ policy for a specific group means that only the users in that group will be able to access the item splash page. Note that currently only some UIs (JSPUI/XMLUI) are fully rights aware (see Discovery documentation for more information, especially the section on "Access Rights Awareness"). This means that in other UIs, some metadata of a restricted item could be exposed to unauthorized users. When you need to work with UIs that are not fully rights aware, a workaround is to use the "Private Item" flag to make the item undiscoverable so that metadata will not be exposed to unauthorized users. Please note that this workaround has several major limitations:
- No one, not even authorized users, is able to find the item by browsing or searching the repository.
- You need to manage externally a schedule that alerts you when the embargo is expired so that you may re-enable the discoverable nature of the item.
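The time-based policy described above can be pictured as a READ policy with a start date. The sketch below is only an illustration under stated assumptions: `ReadPolicy`, `can_read`, and the group name "Repository Staff" are hypothetical and do not correspond to DSpace classes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ReadPolicy:
    """A READ policy for one group, optionally delayed until start_date (hypothetical model)."""
    group: str
    start_date: Optional[date] = None  # None means the policy is already in effect


def can_read(policies, user_groups, today):
    """Return True if any READ policy for one of the user's groups is active today."""
    for policy in policies:
        if policy.group not in user_groups:
            continue
        if policy.start_date is None or today >= policy.start_date:
            return True
    return False
```

An embargoed bitstream would then carry two policies: an immediate one for the staff group and a delayed one for Anonymous whose start date is the embargo end date.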
Bulk Import Items
https://wiki.lyrasis.org/display/DSDOC7x/Importing+and+Exporting+Items+via+Simple+Archive+Format
DSpace uses the Simple Archive Format to export items and to bulk import them into the repository. The basic concept behind DSpace's Simple Archive Format is to create an archive, which is a directory containing one subdirectory per item. Each item directory contains a file for the item's descriptive metadata, and the files that make up the item.
archive_directory/
    item_000/
        dublin_core.xml         -- qualified Dublin Core metadata for metadata fields belonging to the dc schema
        metadata_[prefix].xml   -- metadata in another schema; the prefix is the name of the schema as registered with the metadata registry
        contents                -- text file containing one line per filename
        collections             -- (optional) text file that contains the handles of the collections the item will belong to, one handle per line; the collection on the first line is the owning collection
        file_1.doc              -- files to be added as bitstreams to the item
        file_2.pdf
    item_001/
        dublin_core.xml
        contents
        file_1.png
    ...
A sample zip archive is available to use as a model; a sample can also be obtained by exporting an item from DSpace.
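Before running an import, it can help to check an archive directory against the layout above. The following is a minimal sketch, not part of DSpace: `check_archive` is a hypothetical helper, and it assumes every item directory should contain both `dublin_core.xml` and `contents`, which may be stricter than your DSpace version requires.

```python
from pathlib import Path


def check_archive(archive_dir):
    """Return a list of human-readable problems found in an archive directory."""
    problems = []
    for item_dir in sorted(p for p in Path(archive_dir).iterdir() if p.is_dir()):
        if not (item_dir / "dublin_core.xml").is_file():
            problems.append(f"{item_dir.name}: missing dublin_core.xml")
        contents = item_dir / "contents"
        if not contents.is_file():
            problems.append(f"{item_dir.name}: missing contents file")
            continue
        for line in contents.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            # The file name is the part before the first tab; options may follow it.
            filename = line.split("\t", 1)[0].strip()
            if not (item_dir / filename).is_file():
                problems.append(
                    f"{item_dir.name}: '{filename}' listed in contents but not found"
                )
    return problems
```

An empty list means every item directory has its metadata file and every file named in `contents` is present.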
The dublin_core.xml or metadata_[prefix].xml file has the following format, where each metadata element has its own entry within a <dcvalue> tagset. The <dcvalue> tag currently accepts three attributes:
- element - the Dublin Core element
- qualifier - the element's qualifier
- language - (optional) ISO language code for the element
<dublin_core>
    <dcvalue element="title" qualifier="none">A Tale of Two Cities</dcvalue>
    <dcvalue element="date" qualifier="issued">1990</dcvalue>
    <dcvalue element="title" qualifier="alternative" language="fr">J'aime les Printemps</dcvalue>
</dublin_core>
- (Note the optional language attribute, which tells the system that the alternative title is in French.)
Every metadata field used must first be registered via the metadata registry of the DSpace instance; see Metadata and Bitstream Format Registries.
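When item metadata comes from a spreadsheet or database, the file can be generated rather than typed by hand. The sketch below uses Python's standard `xml.etree.ElementTree` and reproduces the example above; `build_dublin_core` is a hypothetical helper name, and the tuple layout is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def build_dublin_core(fields):
    """Build a <dublin_core> element from (element, qualifier, language, value) tuples."""
    root = ET.Element("dublin_core")
    for element, qualifier, language, value in fields:
        attrs = {"element": element, "qualifier": qualifier or "none"}
        if language:
            attrs["language"] = language  # optional ISO language code
        ET.SubElement(root, "dcvalue", attrs).text = value
    return root


fields = [
    ("title", None, None, "A Tale of Two Cities"),
    ("date", "issued", None, "1990"),
    ("title", "alternative", "fr", "J'aime les Printemps"),
]
xml_text = ET.tostring(build_dublin_core(fields), encoding="unicode")
```

Writing `xml_text` to `dublin_core.xml` inside the item directory gives the importer the descriptive metadata for that item. ElementTree escapes special characters in values, which hand-edited files often get wrong.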
Recommended Metadata
It is recommended to minimally provide "dc.title" and, where applicable, "dc.date.issued". Obviously you can (and should) provide much more detailed metadata about the Item. For more information see: Metadata Recommendations.
The contents file simply enumerates, one file per line, the bitstream file names. See the following example:
file_1.doc
file_2.pdf
license
Note that the license is optional. If you wish to include one, place the file in the item directory (for example, .../item_001/).
The bitstream name may optionally be followed by any of the following:
- \tbundle:BUNDLENAME
- \tpermissions:PERMISSIONS
- \tdescription:DESCRIPTION
- \tprimary:true
Where '\t' is the tab character.
'BUNDLENAME' is the name of the bundle to which the bitstream should be added. If no bundle is specified, the bitstream goes into the default bundle, ORIGINAL.
'PERMISSIONS' is text with the following format: -[r|w] 'group name'
'DESCRIPTION' is the text of the file's description.
'primary:true' marks the bitstream as the item's primary bitstream.
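The tab-separated options are easy to get wrong in a text editor that converts tabs to spaces, so generating the lines can be safer. A minimal sketch follows; `contents_line` and `parse_contents_line` are hypothetical helpers, the group name 'Staff' is invented for the example, and the option order simply follows the list above (whether the importer accepts other orders is not confirmed here).

```python
def contents_line(filename, bundle=None, permissions=None, description=None, primary=False):
    """Format one line of a contents file, joining the options with tab characters."""
    parts = [filename]
    if bundle:
        parts.append(f"bundle:{bundle}")
    if permissions:
        parts.append(f"permissions:{permissions}")
    if description:
        parts.append(f"description:{description}")
    if primary:
        parts.append("primary:true")
    return "\t".join(parts)


def parse_contents_line(line):
    """Split a contents line into the file name and a dict of its options."""
    filename, *options = line.rstrip("\n").split("\t")
    parsed = {}
    for option in options:
        key, _, value = option.partition(":")
        parsed[key] = value
    return filename, parsed
```

For example, `contents_line("file_2.pdf", bundle="ORIGINAL", permissions="-r 'Staff'")` yields the file name followed by two tab-separated options, and `parse_contents_line` reverses the operation for checking existing files.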
Configuring metadata_[prefix].xml for Different Schema
It is possible to use other schemas such as EAD, VRA Core, etc. Make sure you have defined the new schema in the DSpace Metadata Schema Registry.
- Create a separate file for the other schema named metadata_[prefix].xml, where the [prefix] is replaced with the schema's prefix.
- Inside the xml file use the same Dublin Core syntax, but on the <dublin_core> element include the attribute schema=[prefix].
- Here is an example for ETD metadata, which would be in the file metadata_etd.xml:
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core schema="etd">
    <dcvalue element="degree" qualifier="department">Computer Science</dcvalue>
    <dcvalue element="degree" qualifier="level">Masters</dcvalue>
    <dcvalue element="degree" qualifier="grantor">Michigan Institute of Technology</dcvalue>
</dublin_core>
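When an item mixes fields from several schemas, each schema needs its own file. The sketch below groups fully qualified field names by prefix and builds one XML tree per schema; `metadata_filename` and `build_metadata_files` are hypothetical helpers, and the `schema.element.qualifier` input notation is an assumption made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def metadata_filename(schema):
    """Return the file name that holds fields of the given schema prefix."""
    return "dublin_core.xml" if schema == "dc" else f"metadata_{schema}.xml"


def build_metadata_files(fields):
    """Group ('schema.element[.qualifier]', value) pairs into one XML tree per schema."""
    by_schema = defaultdict(list)
    for name, value in fields:
        schema, element, *rest = name.split(".")
        by_schema[schema].append((element, rest[0] if rest else "none", value))

    files = {}
    for schema, entries in by_schema.items():
        root = ET.Element("dublin_core")
        if schema != "dc":
            root.set("schema", schema)  # non-dc files declare their schema prefix
        for element, qualifier, value in entries:
            node = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
            node.text = value
        files[metadata_filename(schema)] = root
    return files
```

Each returned tree can be written into the item directory with `ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True)`, which also emits the XML declaration shown in the example above.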