swissbib has two options for importing structured metadata, and both are currently in use:
- The primary import path goes through the data preparation module and is best suited for library metadata that is not unique.
- The secondary import path goes directly into the search engine's index and is well suited for unique materials that do not need to be deduplicated (archival collections and the like).
Primary import path
During the import the data is analysed and transformed to Pica+, the internal format of the swissbib data preparation engine CBS (Central Bibliographic System). At this stage some of the data is excluded because it is not useful for swissbib.
The conversion to the internal format, as well as other processing steps, can be defined per source in order to get the most out of the individual data delivered by the local library networks.
Many of the conversion steps are shared between the different networks.
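The idea of shared steps plus per-source steps can be sketched as a simple pipeline. This is an illustrative sketch only: the function names, the toy mappings and the `SOURCE_STEPS` table are assumptions, not the actual CBS configuration.

```python
# Hypothetical per-source conversion pipeline: shared steps run for every
# network, and each source can define additional steps on top of them.

def normalize_encoding(record):
    # Shared step: strip stray whitespace from all field values.
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}

def map_language_codes(record):
    # Shared step: map legacy language codes to ISO 639-2/T (toy mapping).
    mapping = {"ger": "deu", "fre": "fra"}
    lang = record.get("language")
    if lang in mapping:
        record["language"] = mapping[lang]
    return record

def fix_local_call_numbers(record):
    # Source-specific step for one imagined network.
    if "call_number" in record:
        record["call_number"] = record["call_number"].upper()
    return record

SHARED_STEPS = [normalize_encoding, map_language_codes]
SOURCE_STEPS = {
    "network_a": [fix_local_call_numbers],
    "network_b": [],
}

def convert(record, source):
    # Shared steps first, then whatever the source defines on top of them.
    for step in SHARED_STEPS + SOURCE_STEPS.get(source, []):
        record = step(record)
    return record
```

For example, `convert({"title": " Example ", "language": "ger", "call_number": "abc 123"}, "network_a")` runs two shared steps and one source-specific step on the same record.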
Criteria for excluding data
swissbib applies general criteria for the exclusion of data. They are mainly centred around two topics:
- data quality and richness (minimal requirements to be met)
- usefulness of the information provided
For details about the criteria for excluding records, see: Import criteria.
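The two topics above could be expressed as a minimal filter. This is a toy sketch under assumed field names and rules; the real import criteria are documented separately and are considerably richer.

```python
# Illustrative minimal-requirements filter; REQUIRED_FIELDS and the
# "on_order" rule are invented examples, not the actual swissbib criteria.

REQUIRED_FIELDS = {"title", "identifier"}

def meets_minimal_requirements(record):
    # Data quality and richness: required fields must be present and non-empty.
    return all(record.get(field) for field in REQUIRED_FIELDS)

def is_useful(record):
    # Usefulness of the information: e.g. skip pure acquisition records.
    return record.get("status") != "on_order"

def keep(record):
    # A record is imported only if it passes both kinds of checks.
    return meets_minimal_requirements(record) and is_useful(record)
```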
Data sources imported through the primary import path
- all library data (12 sources)
- the Swiss posters collection (1 source)
- the institutional repositories (3 sources)
- the e-periodica collection (1 source)
- the e-codices collection (1 source)
Secondary import path
The secondary import path is used to index full text, authority data and structured metadata describing unique materials.
swissbib uses full-text indexing to enrich the library metadata with abstracts and indexes scanned by the Swiss libraries. Unfortunately, not all institutions share their content openly.
Alternative forms of names, titles, places and subject headings from the Gemeinsame Normdatei (GND) are indexed if a heading is present in the bibliographic data.
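The GND enrichment step can be sketched as a lookup: when a heading in the bibliographic record matches an authority record, its alternative forms are added to the index document. The lookup table below is a toy stand-in for the real authority file, and the field names are assumptions.

```python
# Toy GND lookup: heading -> alternative forms (stand-in for the real GND).
GND_ALTERNATIVES = {
    "Goethe, Johann Wolfgang von": [
        "Gete, Iogann Vol'fgang",
        "Goethe, J. W. von",
    ],
}

def enrich_with_gnd(index_doc):
    # Collect alternative forms for every heading present in the record.
    extra = []
    for heading in index_doc.get("author_headings", []):
        extra.extend(GND_ALTERNATIVES.get(heading, []))
    if extra:
        # Alternative forms go into a searchable (not displayed) field.
        index_doc["author_variants"] = extra
    return index_doc
```

The same pattern would apply to titles, places and subject headings: the variant forms make the record findable under names that never appear in the bibliographic data itself.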
For unique data (mainly archival data) it makes little sense, given the nature of this data, to transform it and mix it with library data in the data preparation module; it is much easier to index it directly. To streamline the process, the data is converted into a MARC field structure that can be indexed with the same definitions as the library data.
Other transformation scenarios would be possible, but it is not clear what would be gained from them.
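The conversion into a MARC field structure could look roughly like this. The field and subfield choices (100 for the main entry, 245 for the title, 300 for the extent) follow common MARC 21 usage, but the source field names and the actual swissbib mapping are assumptions.

```python
# Hedged sketch: map an archival record into (tag, subfields) pairs so it
# can be indexed with the same definitions as the library data.

def to_marc_fields(archival_record):
    fields = []
    if "creator" in archival_record:
        fields.append(("100", {"a": archival_record["creator"]}))
    if "title" in archival_record:
        fields.append(("245", {"a": archival_record["title"]}))
    if "extent" in archival_record:
        fields.append(("300", {"a": archival_record["extent"]}))
    return fields
```

Because the output mimics MARC tags and subfields, the existing index definitions for library data apply without a separate indexing configuration for archival sources.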
Currently this applies to the following data:
- Swiss National Library: digitized content of the Swiss Literary Archives SLA
- Swiss National Library: digitized content of the Federal Archives of Historic Monuments (FAHM)
- Verbund Handschriften Archive und Nachlässe (HAN): digitized content and collection records