Import

Moved to https://ub-basel.atlassian.net/wiki/spaces/SWISSBIB/pages/1815478285/Import

General description

swissbib has two paths for importing structured metadata, and both are in use:

  1. The primary import path goes through the data preparation module and is best suited to library metadata that is not unique
  2. The secondary import path goes directly into the search engine's index and is well suited to unique materials that do not need to be deduplicated (archival collections and the like)


Primary import path

During the import, the data is analysed and transformed to Pica+, the internal format of the swissbib data preparation engine CBS (Central Bibliographic System). At this stage some of the data is excluded because it is not useful for swissbib.

The conversion to the internal format, as well as other steps, can be defined per source in order to get the most out of the data delivered by each local library network.

Many of the conversion steps are shared between the different networks.
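
The idea of per-source conversion with shared steps can be sketched as follows. This is only an illustration under stated assumptions, not the actual CBS configuration: the source names `network_a` and `network_b`, the step functions, and the dictionary-based record are all hypothetical, and no real Pica+ conversion is performed.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[Record], Record]


def strip_whitespace(record: Record) -> Record:
    """Shared step: trim surrounding whitespace from every field value."""
    return {field: value.strip() for field, value in record.items()}


def add_source_code(code: str) -> Step:
    """Source-specific step factory: tag the record with a (made-up) source code."""
    def step(record: Record) -> Record:
        return {**record, "source": code}
    return step


# Hypothetical per-source pipelines; the shared step appears in both.
PIPELINES: Dict[str, List[Step]] = {
    "network_a": [strip_whitespace, add_source_code("A")],
    "network_b": [strip_whitespace, add_source_code("B")],
}


def convert(source: str, record: Record) -> Record:
    """Run the record through the steps configured for its source, in order."""
    for step in PIPELINES[source]:
        record = step(record)
    return record
```

Keeping the pipelines as plain lists makes it easy to see which steps are shared and which are specific to one network.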

Criteria for excluding data

swissbib applies general criteria for the exclusion of data. They are mainly centred on two topics:

  1. data quality and richness (minimal requirements to be met)
  2. usefulness of the information provided

For details about the criteria for excluding records, see: Import criteria.
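
As a rough sketch, the two kinds of criteria could be applied as a filter like the one below. The concrete rules here (a required `title` field, an excluded `order_record` type) are assumptions made up for illustration; the real rules are those documented under Import criteria.

```python
from typing import Dict, Optional

Record = Dict[str, str]

# Assumed for illustration only; not the actual swissbib criteria.
REQUIRED_FIELDS = ("title",)
EXCLUDED_TYPES = {"order_record"}


def exclusion_reason(record: Record) -> Optional[str]:
    """Return why a record would be excluded, or None if it passes both checks."""
    # 1. Data quality and richness: minimal requirements must be met.
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            return f"quality: missing {field}"
    # 2. Usefulness of the information provided.
    if record.get("type") in EXCLUDED_TYPES:
        return "usefulness: record type not relevant"
    return None
```

Returning a reason rather than a bare boolean makes it possible to report back to a source why its records were dropped.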

Data sources imported through primary import path

  • all library data (12 sources)
  • the Swiss posters collection (1 source)
  • the institutional repositories (3 sources)
  • the e-periodica collection (1 source)
  • the e-codices collection (1 source)


Secondary import path

The secondary import path is used to index full text, authority data and structured metadata that describes unique data.

Full text

swissbib uses full-text indexing to enrich the library metadata with abstracts and indexes scanned by the Swiss libraries. Unfortunately, not all institutions share their content openly.

Authority data

Alternative forms of names, titles, places and subject headings from the Gemeinsame Normdatei (GND) are indexed if a heading is present in the bibliographic data.
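
The rule "index alternative forms only if the heading occurs in the bibliographic data" can be sketched like this. The lookup table is sample data made up for illustration, not actual GND content, and matching on the heading string is an assumption; a real implementation may well match on authority identifiers instead.

```python
from typing import Dict, Iterable, List, Set

# Toy authority file: preferred heading -> alternative forms (illustrative sample).
AUTHORITY: Dict[str, List[str]] = {
    "Zürich": ["Zurich", "Turicum"],
    "Genf": ["Genève", "Geneva"],
}


def alternative_forms(headings: Iterable[str],
                      authority: Dict[str, List[str]]) -> Set[str]:
    """Collect alternative forms only for headings that occur in the record."""
    extra: Set[str] = set()
    for heading in headings:
        # Headings without an authority entry contribute nothing.
        extra.update(authority.get(heading, []))
    return extra
```

For a record whose only heading is "Zürich", the forms for "Genf" are never added to the index.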

Structured metadata

For unique data (mainly archival data) it makes little sense to transform it and mix it with library data in the "data preparation module", because of the nature of this data. It is much easier to index it directly. To smooth the process, the data is converted into a MARC field structure that can be indexed with the same definitions as the library data.

Other transformation scenarios are possible, but it is not clear what would be gained from them.
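
A minimal sketch of such a conversion is shown below. The archival key names (`creator`, `title`, `scope`) and the choice of target fields are assumptions for illustration; the tags themselves (100, 245, 520) are standard MARC 21 fields, but the mapping swissbib actually uses is not documented here.

```python
from typing import Dict, List, Tuple

# (tag, subfield code) targets in MARC 21; the archival key names are hypothetical.
FIELD_MAP: Dict[str, Tuple[str, str]] = {
    "creator": ("100", "a"),  # main entry, personal name
    "title": ("245", "a"),    # title statement
    "scope": ("520", "a"),    # summary / scope and content note
}


def to_marc_fields(archival: Dict[str, str]) -> List[Dict[str, object]]:
    """Map a flat archival description onto MARC-like fields, sorted by tag."""
    fields = [
        {"tag": tag, "subfields": {code: archival[key]}}
        for key, (tag, code) in FIELD_MAP.items()
        if archival.get(key)  # skip keys that are missing or empty
    ]
    return sorted(fields, key=lambda field: str(field["tag"]))
```

Because the output uses the same tags as the library records, the same index definitions can be applied to both.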

Currently this applies to the following data:

  • Swiss National Library: digitized content of the Swiss Literary Archives (SLA)
  • Swiss National Library: digitized content of the Federal Archives of Historic Monuments (FAHM)
  • Verbund Handschriften Archive und Nachlässe (HAN): digitized content and collection records