thredds.catalog.crawl
Class CatalogCrawler

java.lang.Object
  extended by thredds.catalog.crawl.CatalogCrawler

public class CatalogCrawler
extends Object

Crawls a catalog tree and sends its datasets to a listener. The crawl type selects whether all of the datasets or only some of them are sent. A "direct" dataset is one for which hasAccess() is true, meaning it has one or more access elements.

Example use:

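 // The listener receives every dataset the crawler selects.
 CatalogCrawler.Listener listener = new CatalogCrawler.Listener() {
   public void getDataset(InvDataset dd) {
     if (dd.isHarvest())
       doHarvest(dd);  // doHarvest is supplied by the caller
   }
 };
 // Select all direct datasets; false means DatasetScan elements are not skipped.
 CatalogCrawler crawler = new CatalogCrawler(CatalogCrawler.USE_ALL_DIRECT, false, listener);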
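
The callback pattern used in the example can be tried without the THREDDS library. The following is a minimal sketch under stated assumptions: SketchDataset, ListenerSketch and its Listener interface are hypothetical stand-ins, not the real InvDataset or CatalogCrawler.Listener types, and the traversal is only an illustration of a depth-first walk that hands each dataset to a listener.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a catalog dataset; not the THREDDS InvDataset class.
final class SketchDataset {
  final String name;
  final boolean harvest;
  final List<SketchDataset> children = new ArrayList<>();

  SketchDataset(String name, boolean harvest) {
    this.name = name;
    this.harvest = harvest;
  }

  SketchDataset add(SketchDataset child) {
    children.add(child);
    return this;
  }
}

final class ListenerSketch {
  // Hypothetical callback playing the role of CatalogCrawler.Listener.
  interface Listener {
    void getDataset(SketchDataset dd);
  }

  // Depth-first walk that hands every dataset to the listener.
  static void crawl(SketchDataset ds, Listener listener) {
    listener.getDataset(ds);
    for (SketchDataset child : ds.children) crawl(child, listener);
  }

  // Builds a small tree and returns the names the listener chose to harvest.
  static List<String> demo() {
    SketchDataset root = new SketchDataset("root", false)
        .add(new SketchDataset("model", true)
            .add(new SketchDataset("run1.nc", false)))
        .add(new SketchDataset("radar", true));
    List<String> harvested = new ArrayList<>();
    crawl(root, dd -> {
      if (dd.harvest) harvested.add(dd.name);
    });
    return harvested;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

As in the example above, the filtering decision lives in the listener, so the traversal itself stays generic.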
 

Author:
John Caron

Nested Class Summary
static interface CatalogCrawler.Listener
           
 
Field Summary
static int USE_ALL
          return all datasets
static int USE_ALL_DIRECT
          return all direct datasets, i.e. those that have an access URL
static int USE_FIRST_DIRECT
          return first dataset in each collection of direct datasets.
static int USE_RANDOM_DIRECT
          return one random dataset in each collection of direct datasets.
static int USE_RANDOM_DIRECT_NOT_FIRST_OR_LAST
          return one random dataset in each collection of direct datasets, excluding the first and last.
 
Constructor Summary
CatalogCrawler(int type, boolean skipDatasetScan, CatalogCrawler.Listener listen)
          Constructor.
 
Method Summary
 int crawl(InvCatalogImpl cat, CancelTask task, PrintStream out, Object context)
          Crawl a catalog that has already been opened.
 int crawl(String catUrl, CancelTask task, PrintStream out, Object context)
          Open a catalog and crawl (depth first) all the datasets in it.
 void crawlDataset(InvDataset ds, CancelTask task, PrintStream out, Object context, boolean release)
          Crawl this dataset recursively, sending all datasets to the listener.
 void crawlDirectDatasets(InvDataset ds, CancelTask task, PrintStream out, Object context, boolean release)
          Crawl this dataset recursively, sending back only direct datasets.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

USE_ALL

public static final int USE_ALL
return all datasets

See Also:
Constant Field Values

USE_ALL_DIRECT

public static final int USE_ALL_DIRECT
return all direct datasets, i.e. those that have an access URL

See Also:
Constant Field Values

USE_FIRST_DIRECT

public static final int USE_FIRST_DIRECT
return first dataset in each collection of direct datasets.

See Also:
Constant Field Values

USE_RANDOM_DIRECT

public static final int USE_RANDOM_DIRECT
return one random dataset in each collection of direct datasets.

See Also:
Constant Field Values

USE_RANDOM_DIRECT_NOT_FIRST_OR_LAST

public static final int USE_RANDOM_DIRECT_NOT_FIRST_OR_LAST
return one random dataset in each collection of direct datasets, excluding the first and last.

See Also:
Constant Field Values
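
The USE_* constants above differ in which members of a collection of direct datasets reach the listener. The following is a minimal sketch of those selection rules, not the library's implementation. The Mode enum and the select method are hypothetical names introduced for illustration, and the fallback for collections with fewer than three members is an assumption, since this page does not say how that case is handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SelectionSketch {
  // Hypothetical mode names mirroring the USE_* constants; the real values are ints.
  enum Mode { ALL_DIRECT, FIRST_DIRECT, RANDOM_DIRECT, RANDOM_DIRECT_NOT_FIRST_OR_LAST }

  // Returns the members of one collection of direct datasets selected by the mode.
  static <T> List<T> select(List<T> direct, Mode mode, Random random) {
    List<T> chosen = new ArrayList<>();
    int n = direct.size();
    if (n == 0) return chosen;
    switch (mode) {
      case ALL_DIRECT:
        chosen.addAll(direct);
        break;
      case FIRST_DIRECT:
        chosen.add(direct.get(0));
        break;
      case RANDOM_DIRECT:
        chosen.add(direct.get(random.nextInt(n)));
        break;
      case RANDOM_DIRECT_NOT_FIRST_OR_LAST:
        // Assumption: with fewer than three members, any member may be picked.
        int index = (n < 3) ? random.nextInt(n) : 1 + random.nextInt(n - 2);
        chosen.add(direct.get(index));
        break;
    }
    return chosen;
  }

  public static void main(String[] args) {
    List<String> direct = List.of("a.nc", "b.nc", "c.nc", "d.nc");
    System.out.println(select(direct, Mode.FIRST_DIRECT, new Random()));
  }
}
```

Passing the Random in as a parameter keeps the sketch testable with a fixed seed.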
Constructor Detail

CatalogCrawler

public CatalogCrawler(int type,
                      boolean skipDatasetScan,
                      CatalogCrawler.Listener listen)
Constructor.

Parameters:
type - one of the CatalogCrawler.USE_XXX constants: on reaching a dataset that contains leaf datasets, process all of them, only the first, or one chosen at random.
skipDatasetScan - if true, do not recurse into DatasetScan elements. This is useful if you are looking only for collection-level metadata.
listen - the listener that is called for each dataset.
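
The effect of skipDatasetScan can be pictured with a small tree. This is a minimal sketch with hypothetical stand-in types (Element and its isScan flag are not THREDDS classes). It assumes the scan element itself is still reported while its contents are skipped, which this page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipScanSketch {
  // Hypothetical stand-in for a catalog element; isScan marks a DatasetScan-like node.
  static final class Element {
    final String name;
    final boolean isScan;
    final List<Element> children = new ArrayList<>();

    Element(String name, boolean isScan) {
      this.name = name;
      this.isScan = isScan;
    }

    Element add(Element child) {
      children.add(child);
      return this;
    }
  }

  // Collects names depth first; a scan element is not expanded when skipScan is true.
  static void visit(Element e, boolean skipScan, List<String> out) {
    out.add(e.name);
    if (skipScan && e.isScan) return;
    for (Element child : e.children) visit(child, skipScan, out);
  }

  // Walks a fixed example tree and returns the visited names.
  static List<String> names(boolean skipScan) {
    Element root = new Element("top", false)
        .add(new Element("scan", true)
            .add(new Element("file1", false))
            .add(new Element("file2", false)))
        .add(new Element("static", false));
    List<String> out = new ArrayList<>();
    visit(root, skipScan, out);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(names(true));
    System.out.println(names(false));
  }
}
```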
Method Detail

crawl

public int crawl(String catUrl,
                 CancelTask task,
                 PrintStream out,
                 Object context)
Open a catalog and crawl (depth first) all the datasets in it. Close catalogs and release their resources as you go.

Parameters:
catUrl - URL of the catalog to open
task - user can cancel the task (may be null)
out - destination for status messages (may be null)
context - caller can pass this object in (used for thread safety)
Returns:
number of catalog references opened and crawled
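
The optional task and out parameters and the returned count can be illustrated as follows. This is a minimal sketch, assuming a hypothetical Cancel interface and Cat class in place of CancelTask and the catalog types. It shows only the null checks, the early exit on cancellation, and the counting of referenced catalogs, not how the library opens or releases catalogs.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

final class CancelSketch {
  // Hypothetical stand-in for CancelTask; only the cancel check is modeled.
  interface Cancel {
    boolean isCancel();
  }

  // Hypothetical catalog: a name plus the catalogs it references.
  static final class Cat {
    final String name;
    final List<Cat> refs = new ArrayList<>();

    Cat(String name) {
      this.name = name;
    }

    Cat ref(Cat c) {
      refs.add(c);
      return this;
    }
  }

  // Returns how many referenced catalogs were opened; task and out may be null.
  static int crawl(Cat cat, Cancel task, PrintStream out) {
    int count = 0;
    for (Cat ref : cat.refs) {
      if (task != null && task.isCancel()) break;  // stop early on request
      if (out != null) out.println("opening " + ref.name);
      count += 1 + crawl(ref, task, out);
    }
    return count;
  }

  public static void main(String[] args) {
    Cat root = new Cat("root").ref(new Cat("a").ref(new Cat("a1"))).ref(new Cat("b"));
    System.out.println(crawl(root, null, System.out));
  }
}
```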

crawl

public int crawl(InvCatalogImpl cat,
                 CancelTask task,
                 PrintStream out,
                 Object context)
Crawl a catalog that has already been opened. On reaching a dataset that contains leaf datasets, process all of them, only the first, or one chosen at random.

Parameters:
cat - the catalog
task - user can cancel the task (may be null)
out - destination for status messages (may be null)
context - caller can pass this object in (used for thread safety)
Returns:
number of catalog references opened and crawled
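
One way to read the context parameter is that per-call state travels with the call instead of living in the listener. The following is a minimal sketch of that idea, resting on assumptions: the Listener interface here is hypothetical, and it receives the context alongside each dataset, which this page does not show. Two threads share one stateless listener and each passes its own result list as the context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ContextSketch {
  // Hypothetical listener that receives the caller's context with each dataset.
  interface Listener {
    void getDataset(String dataset, Object context);
  }

  // Stateless listener: all results go into the context supplied by the caller.
  static final Listener COLLECT = (dataset, context) -> {
    @SuppressWarnings("unchecked")
    List<String> sink = (List<String>) context;
    sink.add(dataset);
  };

  // Stand-in for a crawl: reports each dataset together with the context.
  static void crawl(List<String> datasets, Listener listener, Object context) {
    for (String ds : datasets) listener.getDataset(ds, context);
  }

  // Two threads share one listener but keep separate result lists.
  static List<List<String>> demo() {
    List<String> first = new ArrayList<>();
    List<String> second = new ArrayList<>();
    Thread t1 = new Thread(() -> crawl(Arrays.asList("a", "b"), COLLECT, first));
    Thread t2 = new Thread(() -> crawl(Arrays.asList("c"), COLLECT, second));
    t1.start();
    t2.start();
    try {
      t1.join();
      t2.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return Arrays.asList(first, second);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```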

crawlDataset

public void crawlDataset(InvDataset ds,
                         CancelTask task,
                         PrintStream out,
                         Object context,
                         boolean release)
Crawl this dataset recursively, sending all datasets to the listener.

Parameters:
ds - the dataset
task - user can cancel the task (may be null)
out - destination for status messages (may be null)
context - caller can pass this object in (used for thread safety)

crawlDirectDatasets

public void crawlDirectDatasets(InvDataset ds,
                                CancelTask task,
                                PrintStream out,
                                Object context,
                                boolean release)
Crawl this dataset recursively, sending back only direct datasets.

Parameters:
ds - the dataset
task - user can cancel the task (may be null)
out - destination for status messages (may be null)
context - caller can pass this object in (used for thread safety)
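
The difference between crawlDataset and crawlDirectDatasets comes down to which datasets are reported during the same recursive walk. The following is a minimal sketch under stated assumptions: Ds is a hypothetical stand-in for InvDataset, its hasAccess field plays the role of hasAccess(), and a single directOnly flag replaces the two methods. Type-based selection such as USE_FIRST_DIRECT is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class DirectSketch {
  // Hypothetical dataset: hasAccess marks a "direct" dataset.
  static final class Ds {
    final String name;
    final boolean hasAccess;
    final List<Ds> nested = new ArrayList<>();

    Ds(String name, boolean hasAccess) {
      this.name = name;
      this.hasAccess = hasAccess;
    }

    Ds add(Ds child) {
      nested.add(child);
      return this;
    }
  }

  // Depth-first walk; with directOnly set, datasets without access are
  // still traversed but are not reported.
  static void walk(Ds ds, boolean directOnly, List<String> out) {
    if (!directOnly || ds.hasAccess) out.add(ds.name);
    for (Ds child : ds.nested) walk(child, directOnly, out);
  }

  // Walks a fixed example tree and returns the reported names.
  static List<String> collect(boolean directOnly) {
    Ds root = new Ds("collection", false)
        .add(new Ds("grid.nc", true))
        .add(new Ds("sub", false).add(new Ds("point.nc", true)));
    List<String> out = new ArrayList<>();
    walk(root, directOnly, out);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(collect(false));
    System.out.println(collect(true));
  }
}
```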


Copyright © 1999-2011 UCAR/Unidata. All Rights Reserved.