mySobek Home   |   Help
Skip Navigation Links.
MISSING BANNER
Technical Help >> Architecture >> Database Design >> Metadata Searching Module

Metadata Searching Database Support

Overview

All metadata searching occurs within these tables in the database. This allows any metadata updates to be immediately searchable from the SobekCM web application. There are two seperate table structures here; one supports advanced searching with particular metadata fields selected and faceting while the second provides extremely quick basic searching capabilities.

To provide extremely quick results, the full-text indexing in SQL 2008 is used. This allows for a new grammar to be used which provides extremely efficient results with relevancy ranking.

Tables and Relationships

Below is the diagram of the tables which make up this module:

Figure 1: Tables and Relationships within the Metadata Searching Module

Notes on Tables

Below are notes regarding some specific tables within this module:

  • SobekCM_Metadata_Unique_Search_Table

    This table contains unique metadata value-type pairs. So, a single row in this table might be 'Haiti', with a metadata type that links to 'Subject' in the SobekCM_Metadata_Types table. Any item that uses that metadata value as a subject would link to this same row. However, if 'Haiti' was used as a TITLE, for example, then a new row would appear with 'Haiti' linking to the metadata type 'Title'.

    The MetadataValue column has a full-text index ( UniqueMetadataCatalog ) and is a unicode-encoded string with the maximum possible length [ nvarchar(max) ]. For some purposes, such as exact searches or for saving the metadata related to an item, it is necessary to do a more standard match on the metadata, rather than employing the full-text index. However, a standard index cannot be placed on a column of type nvarchar(max). For this reason, the MetadataValueStart column was added and a standard index placed upon it. This column holds the first 100 characters of the metadata value and results in a huge performance boon when performing such tasks. For exact matches, this indexed column is first checked, and then the entire value is checked afterwards.

  • SobekCM_Metadata_Basic_Search_Table

    This table allows for efficient basic searching, where no particular fields are indicated. Each item in the database will have one row in this table which includes the entire full citation of the item. This full citation is created from the data held in the other metadata search tables by running the SobekCM_Create_Full_Citation_Value stored procedure which loops through each individual piece of metadata linked to the item and adds it to the full citation.

    The FullCitation column has a full-text index ( BasicSearchCatalog ) and is a unicode-encoded string with the maximum possible length [ nvarchar(max) ]. This column is only ever used for metadata searching.

  • SobekCM_Search_Stop_Words

    This contains all of the stop words used by the MS SQL full-text indexes and allows the SobekCM application to filter these words out of the query before passing them along to the database stored procedures. This list is routinely updated to reflect any changes in the lists held within the database.

  • SobekCM_Item_Aggregation_Metadata_Link

    While metadata is actually linked to items, there are many times that the metadata must be pulled by item aggregation, such as facets when browsing all items or when browsing metadata directly. To increase the speed of this type of action, this table contains links directly between the metadata and the item aggregation, and includes statistical information about how many times a piece of metadata appears within a collection for a certain metadata field. This data is updated nightly through the Admin_Update_Cached_Aggregation_Metadata_Links stored procedure. This cached table is actually only used if the number of items in the aggregation exceeds some threshold, such as 1000 items. Otherwise, the requested metadata values are determined on the fly for more accuracy.

Last modified: Wednesday, April 22, 2015 mvs