thredds.cataloggen
Class CollectionLevelScanner

java.lang.Object
  extended by thredds.cataloggen.CollectionLevelScanner

public class CollectionLevelScanner
extends Object

CollectionLevelScanner maps between the CrawlableDataset realm and the InvCatalog/InvDataset realm. It scans a single level of a dataset collection and generates a catalog. The generated catalog contains InvCatalogRef objects for all contained collection datasets.

Three different levels of the dataset collection must be provided to properly map from CrawlableDataset to InvCatalog/InvDataset:

  1. the collection level is the top of the data collection (the data root);
  2. the catalog level is the level in the collection for which a catalog is to be constructed; and
  3. the current level (only different from the catalog level when the resulting single-level catalog will be used in the construction of a multi-level catalog) is the level in the collection for which a catalog is currently being constructed.

Besides the three CrawlableDatasets that define the collection to be cataloged, there are a variety of ways to modify or enhance the resulting catalog. For more details, see the documentation for the various setters (setCollectionId(), setIdentifier(), setNamer(), setDoAddDataSize(), setSorter(), setProxyDsHandlers()) and for addChildEnhancer().

Example

Here we'll look at the parameters used to construct a CollectionLevelScanner and to generate a catalog for the following request:

http://my.server:8080/thredds/ncep/nam/80km/catalog.xml

In the constructor, we have:

The two datasets we'll use in the example are:

Following are the details on how the resulting InvDataset and InvCatalogRef objects are created.

  • The ID of a catalog dataset element is the ID of the parent dataset and the name of the corresponding CrawlableDataset separated by a "/". In effect, it is the collectionId (set using setCollectionId()) followed by the path of the corresponding CrawlableDataset, starting at the point where the collection CrawlableDataset path ends. Example:
     <dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"/>
     <catalogRef xlink:title="2000archive" ID="NCEP/nam/80km/2000archive" />
     
    where the values were determined as follows:
  • The urlPath of a dataset element is the collectionPath plus the path of the corresponding CrawlableDataset starting at the point where the collection CrawlableDataset path ends. Example:
     <dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"
     urlPath="ncep/nam/80km/20060208_1200_nam80km.grib" />
     
    where the values were determined as follows:
  • The xlink:href of a catalogRef element is the path of the corresponding CrawlableDataset starting at the point where the catalogLevel CrawlableDataset ends plus "/catalog.xml". Example:
     <catalogRef xlink:title="2000archive" xlink:href="2000archive/catalog.xml"/>
     
    where the values were determined as follows:
  • See DatasetScanCatalogBuilder for more details on how a THREDDS server config file (catalog.xml) and the contained datasetScan elements map into CollectionLevelScanner.
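
The three rules above (ID, urlPath, and xlink:href) can be sketched with plain strings. This is an illustration only, not the library's implementation: the class and helper names are hypothetical, the file-system paths "/data/nam" and "/data/nam/80km" are assumed locations for the collection level and catalog level, and the collectionId "NCEP/nam" and collectionPath "ncep/nam" are assumed values chosen to be consistent with the example elements shown above.

```java
// Hypothetical sketch: models the ID / urlPath / xlink:href rules with strings.
// It does not use or reproduce the actual CollectionLevelScanner code.
class CatalogPathSketch {

    // Path of datasetPath relative to ancestorPath, without a leading "/".
    static String relativePath(String ancestorPath, String datasetPath) {
        if (datasetPath.equals(ancestorPath)) {
            return "";
        }
        String prefix = ancestorPath.endsWith("/") ? ancestorPath : ancestorPath + "/";
        if (!datasetPath.startsWith(prefix)) {
            throw new IllegalArgumentException(datasetPath + " is not under " + ancestorPath);
        }
        return datasetPath.substring(prefix.length());
    }

    // Join two path segments with "/", skipping empty parts.
    static String join(String base, String rel) {
        if (base == null || base.isEmpty()) return rel;
        if (rel.isEmpty()) return base;
        return base + "/" + rel;
    }

    // ID: collectionId + path below the collection level.
    static String datasetId(String collectionId, String collectionLevelPath, String datasetPath) {
        return join(collectionId, relativePath(collectionLevelPath, datasetPath));
    }

    // urlPath: collectionPath + path below the collection level.
    static String urlPath(String collectionPath, String collectionLevelPath, String datasetPath) {
        return join(collectionPath, relativePath(collectionLevelPath, datasetPath));
    }

    // xlink:href: path below the catalog level + "/catalog.xml".
    static String catalogRefHref(String catalogLevelPath, String datasetPath) {
        return relativePath(catalogLevelPath, datasetPath) + "/catalog.xml";
    }

    public static void main(String[] args) {
        String collectionLevel = "/data/nam";      // assumed data root
        String catalogLevel = "/data/nam/80km";    // assumed catalog level
        String file = catalogLevel + "/20060208_1200_nam80km.grib";
        String subdir = catalogLevel + "/2000archive";

        System.out.println(datasetId("NCEP/nam", collectionLevel, file));
        System.out.println(urlPath("ncep/nam", collectionLevel, file));
        System.out.println(datasetId("NCEP/nam", collectionLevel, subdir));
        System.out.println(catalogRefHref(catalogLevel, subdir));
    }
}
```

Note that the ID and urlPath are both computed relative to the collection level, while the xlink:href is computed relative to the catalog level, which is why the href in the example is the short relative form "2000archive/catalog.xml".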

    Multi-level Catalogs

    Resulting single level catalogs can be used to construct multi-level catalogs by replacing InvCatalogRef objects with the catalogs generated for the corresponding CrawlableDataset objects. Construction of multi-level catalogs is supported in several ways:

    NOTE: The StandardCatalogBuilder class is an example of using CollectionLevelScanner to construct multi-level catalogs.

    Since:
    Jun 14, 2005T12:41:23 PM
    Author:
    edavis

    Constructor Summary
    CollectionLevelScanner(CollectionLevelScanner cs)
              Copy constructor
    CollectionLevelScanner(String collectionPath, CrawlableDataset collectionLevel, CrawlableDataset catalogLevel, CrawlableDataset currentLevel, CrawlableDatasetFilter filter, InvService service)
              Construct a CollectionLevelScanner.
     
    Method Summary
     void addChildEnhancer(DatasetEnhancer childEnhancer)
              Add the given DatasetEnhancer to the list that will be applied to each of the child datasets.
     InvCatalogImpl generateCatalog()
               
     InvCatalogImpl generateProxyDsResolverCatalog(ProxyDatasetHandler pdh)
              Generate the catalog for a resolver request of the given ProxyDatasetHandler.
    protected  String getCollectionId()
               
    protected  String getCollectionName()
               
    protected  boolean getDoAddDataSize()
               
    protected  CrawlableDatasetLabeler getIdentifier()
               
    protected  CrawlableDatasetLabeler getNamer()
               
     Map getProxyDsHandlers()
               
     CrawlableDatasetSorter getSorter()
               
     void scan()
              Scan the collection and gather information on contained datasets.
     void setCollectionId(String collectionId)
              Set the value of the base dataset ID.
     void setCollectionName(String collectionName)
              Set the value of the collection Name.
     void setDoAddDataSize(boolean doAddDataSize)
              Determines if datasetSize metadata will be added to each InvDataset built during catalog generation.
     void setIdentifier(CrawlableDatasetLabeler identifier)
              Set the CrawlableDatasetLabeler used to determine the ID of the InvDataset built during catalog generation.
     void setNamer(CrawlableDatasetLabeler namer)
              Set the CrawlableDatasetLabeler used to determine the name of each InvDataset built during catalog generation.
     void setProxyDsHandlers(Map proxyDsHandlers)
               
     void setSorter(CrawlableDatasetSorter sorter)
              Set the sorter with which to sort the list of child CrawlableDatasets.
     void setTopLevelMetadataContainer(InvDatasetImpl topLevelMetadataContainer)
              Set the InvDatasetImpl that contains the metadata for the top level dataset.
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Constructor Detail

    CollectionLevelScanner

    public CollectionLevelScanner(String collectionPath,
                                  CrawlableDataset collectionLevel,
                                  CrawlableDataset catalogLevel,
                                  CrawlableDataset currentLevel,
                                  CrawlableDatasetFilter filter,
                                  InvService service)
    Construct a CollectionLevelScanner.

    The collectionLevel and catalogLevel parameters are used to properly determine the dataset urlPath. The catalogLevel must either be the collectionLevel or be a descendant of the collectionLevel. The currentLevel, if not null, must either be the catalogLevel or be a descendant of the catalogLevel.

    The currentLevel parameter indicates what level is to be scanned. It is the same as the catalogLevel except for the case when catalogRefs are not used for all collection levels. (The urlPath is still determined as described above. Only the location of the datasets is changed.)

    Parameters:
    collectionPath - the path of the collection, used as the base of all resulting dataset@urlPath values (may be ""; if null, "" is used).
    collectionLevel - the root of the collection to be cataloged (must not be a CrawlableDatasetAlias).
    catalogLevel - the location, within the collection, for which a catalog is being generated.
    currentLevel - the location, at or below the catalog level, which is to be scanned for datasets. Only necessary when multiple catalogs are to be aggregated. May be null. If null, assumed to be same as catalog level.
    filter - determines which CrawlableDatasets are accepted as part of the collection.
    service - the default service of all InvDatasets in the generated catalog.
    Throws:
    IllegalArgumentException
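
The constraints on the three levels can be sketched as a simple check on paths. This is a hypothetical model under stated assumptions, not the constructor's actual code: levels are represented here as plain path strings rather than CrawlableDataset objects, and the class and method names are invented for illustration.

```java
// Hypothetical sketch of the level constraints described above.
// Levels are modeled as path strings; the real constructor takes CrawlableDatasets.
class LevelCheckSketch {

    // True if path equals ancestorPath or lies below it.
    static boolean isSameOrDescendant(String ancestorPath, String path) {
        if (path.equals(ancestorPath)) {
            return true;
        }
        String prefix = ancestorPath.endsWith("/") ? ancestorPath : ancestorPath + "/";
        return path.startsWith(prefix);
    }

    // Validates the levels and returns the level that would be scanned.
    // A null currentLevel is treated as the catalog level.
    static String resolveCurrentLevel(String collectionLevel, String catalogLevel, String currentLevel) {
        if (!isSameOrDescendant(collectionLevel, catalogLevel)) {
            throw new IllegalArgumentException("catalogLevel must be at or below collectionLevel");
        }
        if (currentLevel == null) {
            return catalogLevel;
        }
        if (!isSameOrDescendant(catalogLevel, currentLevel)) {
            throw new IllegalArgumentException("currentLevel must be at or below catalogLevel");
        }
        return currentLevel;
    }

    public static void main(String[] args) {
        // Assumed example paths.
        System.out.println(resolveCurrentLevel("/data/nam", "/data/nam/80km", null));
        System.out.println(resolveCurrentLevel("/data/nam", "/data/nam/80km", "/data/nam/80km/2000archive"));
    }
}
```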

    CollectionLevelScanner

    public CollectionLevelScanner(CollectionLevelScanner cs)
    Copy constructor

    Method Detail

    getSorter

    public CrawlableDatasetSorter getSorter()

    setSorter

    public void setSorter(CrawlableDatasetSorter sorter)
    Set the sorter with which to sort the list of child CrawlableDatasets.

    Parameters:
    sorter - the CrawlableDatasetSorter that will be used to sort the list of child CrawlableDatasets.

    getProxyDsHandlers

    public Map getProxyDsHandlers()

    setProxyDsHandlers

    public void setProxyDsHandlers(Map proxyDsHandlers)

    setCollectionId

    public void setCollectionId(String collectionId)
    Set the value of the base dataset ID. The value is used to construct the value of the dataset@ID attribute for all datasets.

    Parameters:
    collectionId - the base dataset ID.

    getCollectionId

    protected String getCollectionId()

    setCollectionName

    public void setCollectionName(String collectionName)
    Set the value of the collection Name. The value is used to name the top-level dataset in the top-level collection catalog (that is, only when the catalog level is the same as the collection level).

    Parameters:
    collectionName - the name of the collection.

    getCollectionName

    protected String getCollectionName()

    setIdentifier

    public void setIdentifier(CrawlableDatasetLabeler identifier)
    Set the CrawlableDatasetLabeler used to determine the ID of the InvDataset built during catalog generation. The labeler is applied to the CrawlableDataset that corresponds to each InvDataset built.

    Parameters:
    identifier - the CrawlableDatasetLabeler used to determine dataset IDs.

    getIdentifier

    protected CrawlableDatasetLabeler getIdentifier()

    setNamer

    public void setNamer(CrawlableDatasetLabeler namer)
    Set the CrawlableDatasetLabeler used to determine the name of each InvDataset built during catalog generation. The labeler is applied to the CrawlableDataset that corresponds to each InvDataset built.

    Parameters:
    namer - the CrawlableDatasetLabeler used to determine dataset names.

    getNamer

    protected CrawlableDatasetLabeler getNamer()

    setDoAddDataSize

    public void setDoAddDataSize(boolean doAddDataSize)
    Determines if datasetSize metadata will be added to each InvDataset built during catalog generation. The CrawlableDataset.length() method is used to determine the size of the dataset.

    Parameters:
    doAddDataSize -

    getDoAddDataSize

    protected boolean getDoAddDataSize()

    addChildEnhancer

    public void addChildEnhancer(DatasetEnhancer childEnhancer)
    Add the given DatasetEnhancer to the list that will be applied to each of the child datasets. A DatasetEnhancer modifies only InvDataset objects but can use the corresponding CrawlableDataset for information.

    Parameters:
    childEnhancer - the DatasetEnhancer to add to the list applied to each child dataset.

    setTopLevelMetadataContainer

    public void setTopLevelMetadataContainer(InvDatasetImpl topLevelMetadataContainer)
    Set the InvDatasetImpl that contains the metadata for the top level dataset.

    Parameters:
    topLevelMetadataContainer - the InvDatasetImpl that contains the metadata for the top level dataset.

    scan

    public void scan()
              throws IOException
    Scan the collection and gather information on contained datasets.

    Throws:
    IOException - if an I/O error occurs while locating the contained datasets.

    generateCatalog

    public InvCatalogImpl generateCatalog()
                                   throws IOException
    Throws:
    IOException

    generateProxyDsResolverCatalog

    public InvCatalogImpl generateProxyDsResolverCatalog(ProxyDatasetHandler pdh)
    Generate the catalog for a resolver request of the given ProxyDatasetHandler.

    Parameters:
    pdh - the ProxyDatasetHandler corresponding to the resolver request.
    Returns:
    the catalog for a resolver request of the given proxy dataset.
    Throws:
    IllegalStateException - if this collection has not yet been scanned.
    IllegalArgumentException - if the given ProxyDatasetHandler is not known by this CollectionLevelScanner.


    Copyright © 1999-2011 UCAR/Unidata. All Rights Reserved.