Configure Vocabularies in a Pipeline

Follow these steps to configure vocabularies in a pipeline.

  1. Click on the Vocabularies tab. This is also the default tab in the pipeline editor.
  2. Click on the (+) button in the top-right corner of the vocabularies screen.
  3. Select the vocabularies you want to register in the pipeline for consumption and click on Save. The pop-up screen groups the vocabularies into a Predefined Vocabularies section and a Managed Vocabularies section, according to their type.
  4. The selected vocabularies will be added to the vocabularies list.
  5. Click on each added vocabulary, one at a time, to configure its behavior in the pipeline.
  6. Specify the vocabulary category. By default, the category field is populated with the category already assigned to the vocabulary.
  7. The Search checkbox is selected by default. Uncheck it if you are not going to use the vocabulary for search query expansion. Otherwise, follow the steps given below.
    1. Whenever the engine detects a matching concept in the vocabulary, it returns the expansion name with the extracted value. This name can be configured in the search engine for expansion with appropriate boosts.
    2. Click on the (+) button in the Expansion column. This adds a row at the end of the table. Specify the desired expansion name. By default, the system provides three expansions: standardname, synonyms and narrower. You can add more names, such as broader, or any custom name you need.
    3. In the next column, named Labels, specify the list of predicates whose literal values should populate the expansion. Multiple predicates can be configured. For example, the SKOS predicates prefLabel and altLabel may both qualify as synonyms.
    4. Specify the maximum number of values to be extracted in the COUNTS column.
    5. If you want to extract information about a related concept instead of the detected concept, click on the Recursive arrow button in the expansion name column.
    6. Select the predicate for navigating to the related concept. As in sub-step 3, specify the Labels. Note that the labels specified here extract values from the related concept and not from the detected concept.
    7. Click on the Traverse Object checkbox to follow the relation from the object to the subject.
  8. Check the Facts checkbox if you want to extract the raw facts about the concept, and specify the predicates by clicking the (+) button.
  9. Finally, the pipeline allows you to blacklist tokens from a specific vocabulary by adding values to the Terms list. Terms added here are never detected or extracted by the pipeline for that vocabulary.
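
To make the effect of these settings more concrete, the following Python sketch models them over a tiny in-memory set of SKOS-style triples. It is illustrative only: the class names, field names (`via`, `traverse_object`, `terms`), the `ex:` concepts and the helper functions are hypothetical and do not correspond to the product's actual configuration format or API. In particular, it assumes that Traverse Object means the detected concept is the object of the relation and the related concept is its subject.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical model: field names mirror the UI labels described above,
# not the product's real configuration schema or API.
Triple = tuple[str, str, str]  # (subject, predicate, object)
LABEL_PREDICATES = {"skos:prefLabel", "skos:altLabel"}

@dataclass
class Expansion:
    name: str                      # expansion name, e.g. "synonyms"
    labels: list[str]              # predicates whose literal values are collected
    count: int = 5                 # maximum number of values to extract
    via: Optional[str] = None      # predicate leading to a related concept
    traverse_object: bool = False  # assumed: detected concept is the object

@dataclass
class VocabularyConfig:
    category: str
    search: bool = True
    expansions: list[Expansion] = field(default_factory=list)
    facts: list[str] = field(default_factory=list)  # predicates for raw facts
    terms: set[str] = field(default_factory=set)    # blacklisted tokens (lowercase)

def detect(token, config, triples):
    """Return the concept labelled by token, or None if blacklisted or unknown."""
    if token.lower() in config.terms:
        return None
    for s, p, o in triples:
        if p in LABEL_PREDICATES and o.lower() == token.lower():
            return s
    return None

def targets(concept, exp, triples):
    """Concepts whose labels feed the expansion (the concept itself or related ones)."""
    if exp.via is None:
        return [concept]
    if exp.traverse_object:
        return [s for s, p, o in triples if p == exp.via and o == concept]
    return [o for s, p, o in triples if p == exp.via and s == concept]

def expand(concept, config, triples):
    """Map each expansion name to at most `count` label values."""
    if not config.search:
        return {}
    result = {}
    for exp in config.expansions:
        nodes = targets(concept, exp, triples)
        values = [o for s, p, o in triples if s in nodes and p in exp.labels]
        result[exp.name] = values[:exp.count]
    return result

def raw_facts(concept, config, triples):
    """Return (predicate, object) pairs for the configured fact predicates."""
    return [(p, o) for s, p, o in triples if s == concept and p in config.facts]

TRIPLES = [
    ("ex:car", "skos:prefLabel", "car"),
    ("ex:car", "skos:altLabel", "automobile"),
    ("ex:car", "skos:altLabel", "motorcar"),
    ("ex:car", "skos:broader", "ex:vehicle"),
    ("ex:vehicle", "skos:prefLabel", "vehicle"),
    ("ex:suv", "skos:broader", "ex:car"),
    ("ex:suv", "skos:prefLabel", "SUV"),
]

config = VocabularyConfig(
    category="product",
    expansions=[
        Expansion("standardname", ["skos:prefLabel"], count=1),
        Expansion("synonyms", ["skos:prefLabel", "skos:altLabel"], count=2),
        Expansion("broader", ["skos:prefLabel"], via="skos:broader"),
        Expansion("narrower", ["skos:prefLabel"], via="skos:broader",
                  traverse_object=True),
    ],
    facts=["skos:broader"],
    terms={"motorcar"},
)

if __name__ == "__main__":
    concept = detect("automobile", config, TRIPLES)
    print(concept, expand(concept, config, TRIPLES))
```

In this sketch, the token "automobile" resolves to `ex:car`, the synonyms expansion is capped at two values by its count, and the narrower expansion reuses the `skos:broader` predicate in the reverse direction. The blacklisted token "motorcar" is never detected, mirroring the Terms list behavior.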