
<!DOCTYPE html>
<html lang="en-US">
<head>
<title>ERDDAP™ - Working with the datasets.xml File</title>
<meta charset="UTF-8">
<link rel="shortcut icon" href="https://coastwatch.pfeg.noaa.gov/erddap/images/favicon.ico">
<link href="../images/erddap2.css" rel="stylesheet" type="text/css">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>

<body> 
<table class="compact nowrap" style="width:100%; background-color:#128CB5;"> 
  <tr> 
    <td style="text-align:center; width:80px;"><a rel="bookmark"
      href="https://www.noaa.gov/"><img 
      title="National Oceanic and Atmospheric Administration" 
      src="../images/noaab.png" alt="NOAA"
      style="vertical-align:middle;"></a></td> 
    <td style="text-align:left; font-size:x-large; color:#FFFFFF; ">
      <strong>ERDDAP™</strong>
      <br><small><small><small>Easier access to scientific data</small></small></small>
      </td> 
    <td style="text-align:right; font-size:small;"> 
      &nbsp; &nbsp;
      <br>Brought to you by 
      <a title="National Oceanic and Atmospheric Administration" rel="bookmark"
      href="https://www.noaa.gov">NOAA</a>  
      <a title="National Marine Fisheries Service" rel="bookmark"
      href="https://www.fisheries.noaa.gov">NMFS</a>  
      <a title="Southwest Fisheries Science Center" rel="bookmark"
      href="https://www.fisheries.noaa.gov/about/southwest-fisheries-science-center">SWFSC</a> 
      <a title="Environmental Research Division" rel="bookmark"
      href="https://www.fisheries.noaa.gov/about/environmental-research-division-southwest-fisheries-science-center">ERD</a>  
      &nbsp; &nbsp;
      </td> 
  </tr> 
</table>

<div class="standard_width"> 

&nbsp;
<!-- <br><button type="button" onClick="history.go(-1);return true;">Back</button> -->

<h1 style="text-align:center">Working with the datasets.xml File</h1>

[This web page will only be of interest to ERDDAP™ administrators.]
<p>After you have followed the ERDDAP™ 
<a rel="help" href="https://erddap.github.io/setup.html">installation instructions</a>, 
you must edit the datasets.xml file 
in <i>tomcat</i>/content/erddap/ to describe the datasets that your ERDDAP™ installation will serve. 

<h2><a class="selfLink" id="TableOfContents" href="#TableOfContents" rel="bookmark">Table of Contents</a></h2>
<ul>
<li><a rel="help" href="#introduction"><strong>Introduction</strong></a> (Please read all of this.)
  <ul>
  <li><a rel="help" href="#effort">Some Assembly Required</a> 
  <li><a rel="help" href="#DataProviderForm">Data Provider Form</a> 
  <li><a rel="help" href="#Tools">Tools</a>
  <li><a rel="help" href="#basicStructure">The basic structure of the datasets.xml file</a>
  <li><a rel="help" href="#xinclude">XInclude</a>
    <br>&nbsp;
  </ul>

<li><a rel="help" href="#notes">Notes</a> (Please read all of this.)
  <ul>
  <li><a rel="help" href="#useCtrlF">Use Ctrl-F To Find Things On This Web Page</a>
  <li><a rel="help" href="#InternalLinks">Internal Links</a>
  <li><a rel="help" href="#ChoosingTheDatasetType">Choosing the Dataset Type</a>
  <li><a rel="help" href="#ServingTheDataAsIs">Serving the Data As Is</a>
  <li><a rel="help" href="#encodingSpecialCharacters">Encoding Special Characters</a>
  <li><a rel="help" href="#noSyntaxErrors">XML doesn't tolerate syntax errors.</a>
  <li><a rel="help" href="#diagnoseProblems">Other Ways To Help Diagnose Problems With Datasets</a>
  <li><a rel="help" href="#LLAT">The longitude, latitude, altitude (or depth), and time (LLAT) 
    variables are special.</a>
  <li><a rel="help" href="#dataStructures">Why just two basic data structures?</a>
  <li><a rel="help" href="#differentDimensions">What if the grid variables in the source dataset DON'T share the 
      same axis variables?</a>
  <li><a rel="help" href="#projections">Projected Gridded Data</a>
  <li><a rel="help" href="#dataTypes">Data Types</a>
  <li><a rel="help" href="#MediaFiles">Media Files</a>
  <li><a rel="help" href="#AwsS3Files">AWS S3 Files</a>
  <li><a rel="help" href="#NcML">NcML</a>
  <li><a rel="help" href="#NCO">NCO</a>
  <li><a rel="help" href="#limits">Limits to the Size of a Dataset</a>
  <li><a rel="help" href="#switchToACDD13">Switch to ACDD-1.3</a> 
  <li><a rel="help" href="#Zarr">Zarr</a>
    <br>&nbsp;
  </ul>
<li><a rel="help" href="#datasetTypes"><strong>List of Dataset Types</strong></a> (Read as needed)
  <br>&nbsp;

<li><a rel="help" href="#datasetDescriptions">Detailed Descriptions of Dataset Types</a> (Read as needed)
  <br>&nbsp;

<li><a rel="help" href="#details">Details</a> (Read as needed)
  <ul>
  <li><a rel="help" href="#cacheMinutes"><kbd>&lt;cacheMinutes&gt;</kbd></a>
  <li><a rel="help" href="#convertInterpolateDatasetIDVariableExample"><kbd>&lt;convertInterpolateDatasetIDVariableExample&gt;</kbd></a>
  <li><a rel="help" href="#convertInterpolateDatasetIDVariableList"><kbd>&lt;convertInterpolateDatasetIDVariableList&gt;</kbd></a>
  <li><a rel="help" href="#convertToPublicSourceUrl"><kbd>&lt;convertToPublicSourceUrl&gt;</kbd></a>
  <li><a rel="help" href="#dataImagePngBase64"><kbd>data:image/png;base64</kbd></a>
  <li><a rel="help" href="#drawLandMask"><kbd>&lt;drawLandMask&gt;</kbd></a>
  <li><a rel="help" href="#graphBackgroundColor"><kbd>&lt;graphBackgroundColor&gt;</kbd></a>
  <li><a rel="help" href="#ipAddressMaxRequests"><kbd>&lt;ipAddressMaxRequests&gt;</kbd></a>
  <li><a rel="help" href="#ipAddressMaxRequestsActive"><kbd>&lt;ipAddressMaxRequestsActive&gt;</kbd></a>
  <li><a rel="help" href="#ipAddressUnlimited"><kbd>&lt;ipAddressUnlimited&gt;</kbd></a>
  <li><a rel="help" href="#loadDatasetsMinMinutes"><kbd>&lt;loadDatasetsMinMinutes&gt;</kbd></a>
  <li><a rel="help" href="#loadDatasetsMaxMinutes"><kbd>&lt;loadDatasetsMaxMinutes&gt;</kbd></a>
  <li><a rel="help" href="#logLevel"><kbd>&lt;logLevel&gt;</kbd></a>
  <li><a rel="help" href="#partialRequestMaxBytes"><kbd>&lt;partialRequestMaxBytes&gt;</kbd></a>
  <li><a rel="help" href="#partialRequestMaxCells"><kbd>&lt;partialRequestMaxCells&gt;</kbd></a>
  <li><a rel="help" href="#requestBlacklist"><kbd>&lt;requestBlacklist&gt;</kbd></a>
  <li><a rel="help" href="#slowDownTroubleMillis"><kbd>&lt;slowDownTroubleMillis&gt;</kbd></a>
  <li><a rel="help" href="#standardText">Standard Text</a>
  <li><a rel="help" href="#subscriptionEmailBlacklist"><kbd>&lt;subscriptionEmailBlacklist&gt;</kbd></a>
  <li><a rel="help" href="#unusualActivity"><kbd>&lt;unusualActivity&gt;</kbd></a>
  <li><a rel="help" href="#updateMaxEvents"><kbd>&lt;updateMaxEvents&gt;</kbd></a>


  <li><a rel="help" href="#user"><kbd>&lt;user&gt;</kbd></a>

  <li><a rel="help" href="#dataset"><kbd>&lt;dataset&gt;</kbd></a> 
    <ul>
    <li><a rel="help" href="#datasetID"><kbd>datasetID="..."</kbd></a>
    <li><a rel="help" href="#active"><kbd>active="..."</kbd></a>

    <li><a rel="help" href="#accessibleTo"><kbd>&lt;accessibleTo&gt;</kbd></a>
    <li><a rel="help" href="#graphsAccessibleTo"><kbd>&lt;graphsAccessibleTo&gt;</kbd></a>
    <li><a rel="help" href="#accessibleViaFiles"><kbd>&lt;accessibleViaFiles&gt;</kbd></a>
    <li><a rel="help" href="#accessibleViaWMS"><kbd>&lt;accessibleViaWMS&gt;</kbd></a>
    <li><a rel="help" href="#addVariablesWhere"><kbd>&lt;addVariablesWhere&gt;</kbd></a>
    <li><a rel="help" href="#defaultDataQuery"><kbd>&lt;defaultDataQuery&gt;</kbd></a>
    <li><a rel="help" href="#defaultGraphQuery"><kbd>&lt;defaultGraphQuery&gt;</kbd></a>
    <li><a rel="help" href="#fgdcFile"><kbd>&lt;fgdcFile&gt;</kbd></a>
    <li><a rel="help" href="#iso19115File"><kbd>&lt;iso19115File&gt;</kbd></a>
    <li><a rel="help" href="#onChange"><kbd>&lt;onChange&gt;</kbd></a>

    <li><a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>
    <li><a rel="help" href="#updateEveryNMillis"><kbd>&lt;updateEveryNMillis&gt;</kbd></a>
    <li><a rel="help" href="#sourceCanConstrainStringEQNE"><kbd>&lt;sourceCanConstrainStringEQNE&gt;</kbd></a>
    <li><a rel="help" href="#sourceCanConstrainStringGTLT"><kbd>&lt;sourceCanConstrainStringGTLT&gt;</kbd></a>
    <li><a rel="help" href="#sourceCanConstrainStringRegex"><kbd>&lt;sourceCanConstrainStringRegex&gt;</kbd></a>
    <li><a rel="help" href="#sourceNeedsExpandedFP_EQ"><kbd>&lt;sourceNeedsExpandedFP_EQ&gt;</kbd></a>
    <li><a rel="help" href="#sourceUrl"><kbd>&lt;sourceUrl&gt;</kbd></a>

    <li><a rel="help" href="#addAttributes"><kbd>&lt;addAttributes&gt;</kbd></a> 
    <li><a rel="help" href="#globalAttributes">Global Attributes / Global <kbd>&lt;addAttributes&gt;</kbd></a>
      <!-- ul>
      <li><a rel="help" href="#cdm_data_type">cdm_data_type</a>
      <li><a rel="help" href="#globalDrawLandMask">drawLandMask</a>
      <li><a rel="help" href="#history">history</a>
      <li><a rel="help" href="#infoUrl">infoUrl</a>
      <li><a rel="help" href="#institution">institution</a>
      <li><a rel="help" href="#license">license</a>
      <li><a rel="help" href="#sourceUrlAttribute">sourceUrl</a>
      <li><a rel="help" href="#subsetVariables">subsetVariables</a>
      <li><a rel="help" href="#summary">summary</a>
      <li><a rel="help" href="#title">title</a>
      </ul -->
    <li><a rel="help" href="#axisVariable"><kbd>&lt;axisVariable&gt;</kbd></a>
    <li><a rel="help" href="#dataVariable"><kbd>&lt;dataVariable&gt;</kbd></a>
    <li><a rel="help" href="#variableAttributes">Variable Attributes / Variable <kbd>&lt;addAttributes&gt;</kbd></a>
      <!-- ul>
      <li><a rel="help" href="#actual_range">actual_range</a>
      <li><a rel="help" href="#colorBar">Color Bar Attributes</a>
      <li><a rel="help" href="#data_min">data_min and data_max</a>
      <li><a rel="help" href="#variableDrawLandMask">drawLandMask</a>
      <li><a rel="help" href="#ioos_category">ioos_category</a>
      <li><a rel="help" href="#long_name">long_name</a>
      <li><a rel="help" href="#missing_value">missing_value and _FillValue</a>
      <li><a rel="help" href="#scale_factor">scale_factor and add_offset</a>
      <li><a rel="help" href="#standard_name">standard_name</a>
      <li><a rel="help" href="#units">units</a>
      </ul -->
      <li><a rel="help" href="#removeMVRows"><kbd>&lt;removeMVRows&gt;</kbd></a>
      <br>&nbsp;
    </ul>
  </ul>
<li><a rel="bookmark" href="#contact">Contact</a> 
</ul>

<br>&nbsp;
<hr>
<h2><a class="selfLink" id="introduction" href="#introduction" rel="bookmark">Introduction</a></h2>

<p><a class="selfLink" id="effort" href="#effort" rel="bookmark"><strong>Some Assembly Required</strong></a> 
<br>Setting up a dataset in ERDDAP™ isn't just a matter of pointing to the dataset's
directory or URL. You have to write a chunk of XML for datasets.xml which describes the dataset.
<ul>
<li>For gridded datasets, in order to make the dataset conform to ERDDAP's data structure for gridded data,
  you have to identify a subset of the dataset's variables which share the same dimensions.
   (<a rel="help" href="#dataStructures">Why?</a> <a rel="help" href="#differentDimensions">How?</a>)
<li>The dataset's current metadata is imported automatically.
  But if you want to modify that metadata or add other metadata, you have to specify it in datasets.xml.
  And ERDDAP™ needs other metadata, including <a rel="help" href="#globalAttributes">global attributes</a>
    (such as infoUrl, institution, 
  sourceUrl, summary, and title) and <a rel="help" href="#variableAttributes">variable attributes</a> 
    (such as long_name and units).
  Just as the metadata that is currently in the dataset adds descriptive information to the dataset,
  the metadata requested by ERDDAP™ adds descriptive information to the dataset.
  The additional metadata is a good addition to your dataset and helps ERDDAP™ do a better job of 
  presenting your data to users who aren't familiar with it.
<li>ERDDAP™ needs you to do special things with the 
  <a rel="help" href="#LLAT">longitude, latitude, altitude (or depth), and time variables</a>.
</ul>
If you buy into these ideas and expend the effort to create the XML for datasets.xml,
you get all the advantages of ERDDAP™, including:
<ul>
  <li>Full text search for datasets
  <li>Search for datasets by category
  <li>Data Access Forms (<i>datasetID</i>.html) so you can request a subset of data in lots of different file formats
  <li>Forms to request graphs and maps (<i>datasetID</i>.graph) 
  <li>Web Map Service (WMS) for gridded datasets
  <li>RESTful access to your data
</ul>
Making the datasets.xml takes considerable effort for the first few datasets, but <strong>it gets easier</strong>.
After the first dataset, you can often re-use a lot of your work for the next dataset.
Fortunately, ERDDAP™ comes with two <a rel="help" href="#Tools">Tools</a> to help you create the XML for each 
dataset in datasets.xml.
<br>If you get stuck, please send an email with the details to <kbd>erd dot data at noaa dot gov</kbd>.
    <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.

<p><a class="selfLink" id="DataProviderForm" href="#DataProviderForm" rel="bookmark"><strong>Data Provider Form</strong></a> 
<br>When a data provider comes to you hoping to add some data to your ERDDAP,
    it can be difficult and time-consuming to collect all of the metadata
    (information about the dataset) needed to add the dataset into ERDDAP. 
    Many data sources (for example, .csv files, Excel files, databases) 
    have no internal metadata,  
    so ERDDAP™ has a Data Provider Form which gathers metadata 
    from the data provider and gives the data provider
    some other guidance, including extensive guidance for 
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/dataProviderForm1.html#databases"
    >Data In Databases</a>. 
    The information submitted is converted into the datasets.xml format and then
    emailed to the ERDDAP™ administrator (you) and written (appended) to 
    <i>bigParentDirectory</i>/logs/dataProviderForm.log .
    Thus, the form semi-automates the process of getting a dataset into ERDDAP,
    but the ERDDAP™ administrator still has to complete the datasets.xml chunk 
    and deal with getting the data file(s) from the provider or connecting to the database.

    <p>The submission of actual data files from external sources is a huge security risk,
    so ERDDAP™ does not deal with that. You have to figure out a solution that 
    works for you and the data provider, for example, email (for small files), 
    pull from the cloud (for example, Dropbox or Google Drive),
    an sftp site (with passwords), or sneakerNet (a USB thumb drive or external hard drive). 
    You should probably only accept files from people you know.
    You will need to scan the files for viruses and take other security precautions.

    <p>There isn't a link in ERDDAP™ to the Data Provider Form 
    (for example, on the ERDDAP™ home page).
    Instead, when someone tells you they want to have their data served by your ERDDAP,
    you can send them an email saying something like:
<br><kbd>Yes, we can get your data into ERDDAP. To get started, 
please fill out the form at https://<i>yourUrl</i>/erddap/dataProviderForm.html (or http:// if https:// isn't enabled).
<br>After you finish, I'll contact you to work out the final details.
</kbd>
    <br>If you just want to look at the form (without filling it out),
    you can see the form on ERD's ERDDAP:
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/dataProviderForm.html"
    >Introduction</a>,  
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/dataProviderForm1.html"
    >Part&nbsp;1</a>,
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/dataProviderForm2.html"
    >Part&nbsp;2</a>,
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/dataProviderForm3.html"
    >Part&nbsp;3</a>, and
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/dataProviderForm4.html"
    >Part&nbsp;4</a>.
    These links on the ERD ERDDAP™ send information to me, not you, so don't submit
    information with them unless you actually want to add data to the ERD ERDDAP.

    <p>If you want to remove the Data Provider Form from your ERDDAP™, put
    <br><kbd>&lt;dataProviderFormActive&gt;false&lt;/dataProviderFormActive&gt;</kbd>
    <br>in your setup.xml file.

    <p>The impetus for this was NOAA's 2014 <a rel="help"
    href="https://www.glerl.noaa.gov/review2016/reviewer_docs/NOAA_PARR_Plan_v5.04.pdf"
    >Public Access to Research Results (PARR) directive<img 
      src="../images/external.png" alt=" (external link)" 
      title="This is a link to an external website."/></a>,
    which requires that all NOAA environmental data funded through taxpayer dollars 
    be made available via a data service (not just files) within 12 months of creation.
    So there is increased interest in using ERDDAP™ to make datasets available via a service ASAP.
    We needed a more efficient way to deal with a large number of data providers.

    <p>Feedback/Suggestions? 
    This form is new, so please email erd dot data at noaa dot gov
    if you have any feedback or suggestions for improving this.

<p><a class="selfLink" id="Tools" href="#Tools" rel="bookmark"><strong>Tools</strong></a> 
<br>ERDDAP™ comes with two command line programs which are tools to help you create the XML 
   for each dataset that you want your ERDDAP™ to serve.
   Once you have set up ERDDAP™ and run it (at least one time),
   you can find and use these programs in the <i>tomcat</i>/webapps/erddap/WEB-INF directory.
   There are Linux/Unix shell scripts (with the extension .sh) and
   Windows scripts (with the extension .bat) for each program.
   [On Linux, run these tools as the same user (tomcat?) that will run Tomcat.]
   When you run each program, it will ask you questions.
   For each question, type a response, then press Enter. 
   Or press ^C to exit a program at any time.

   <p><a class="selfLink" id="OldVersionOfJava" href="#OldVersionOfJava" rel="bookmark">Program won't run?</a>
   <ul>
   <li>If you get an <kbd>unknown program</kbd> (or similar) error message, 
     the problem is probably that the operating system couldn't find Java.
     You need to figure out where Java is on your computer, then
     edit the java reference in the .bat or .sh file that you are trying to use.

   <li>If you get a <kbd>jar file not found</kbd> or <kbd>class not found</kbd> 
     error message, then Java couldn't find one of the classes listed in the 
     .bat or .sh file you are trying to use. The solution is to figure out
     where that .jar file is, and edit the java reference to it in the .bat or .sh file.

   <li>If you are using a version of Java that is too old for a program,
     the program won't run and you will see an error message like
     <br><kbd>Exception in thread "main" java.lang.UnsupportedClassVersionError: 
     <br><i>some/class/name</i>: Unsupported major.minor version <i>someNumber</i></kbd>
     <br>The solution is to update to the most recent version of Java
     and make sure the .sh or .bat file for the program is using it.
   </ul>
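   <p>The <i>someNumber</i> in the <kbd>UnsupportedClassVersionError</kbd> message is the 
   major version of the class file. As a rough aid, the Python sketch below extracts that number 
   and converts it to the Java release that is needed. It is not part of ERDDAP™, and the function name 
   <kbd>required_java_release</kbd> is hypothetical. It relies on the fact that, from Java 5 onward, 
   the release number is the major version minus 44 (for example, 52 is Java 8 and 61 is Java 17), 
   and it only understands the message format shown above.

<pre>
```python
import re


def required_java_release(error_message):
    """Return the Java release needed to load the class, or None if unknown."""
    found = re.search(r"major\.minor version (\d+)", error_message)
    if found is None:
        return None
    major = int(found.group(1))
    # From Java 5 (major version 49) onward: release = major - 44.
    if major >= 49:
        return major - 44
    return None


if __name__ == "__main__":
    message = ('Exception in thread "main" java.lang.UnsupportedClassVersionError: '
               "some/class/name: Unsupported major.minor version 61.0")
    print(required_java_release(message))
```
</pre>

   <p>Compare the result with the output of <kbd>java -version</kbd> for the Java 
   that the .sh or .bat file actually uses.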

   <p><a class="selfLink" id="ErrorVsWarning" href="#ErrorVsWarning" rel="bookmark">The tools print various diagnostic messages:</a>
    <ul>
    <li>The word "ERROR" is used when something went so wrong that the procedure failed to complete.
      Although it is annoying to get an error, the error forces you to deal with the problem.
    <li>The word "WARNING" is used when something went wrong, but the procedure was able to be completed. 
      These are pretty rare.
    <li>Anything else is just an informative message. 
      You can add <kbd>-verbose</kbd> to the 
      <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> or
      <a rel="help" href="#DasDds">DasDds</a>
      command line to get 
      additional informative messages, which sometimes helps solve problems.
    </ul>

   <p>The two tools are a big help, but you still must read all of these instructions
   on this page carefully and make important decisions yourself. 

    <ul>
    <li><a class="selfLink" id="GenerateDatasetsXml" href="#GenerateDatasetsXml" rel="bookmark"><strong>GenerateDatasetsXml</strong></a> 
      is a command line program that can generate a rough draft
      of the dataset XML for almost any type of dataset.

      <p>We STRONGLY RECOMMEND that you use GenerateDatasetsXml instead of creating 
      chunks of datasets.xml by hand because:
      <ul>
      <li>GenerateDatasetsXml works in seconds. Doing this by hand is at least an hour's work, 
        even when you know what you're doing.
      <li>GenerateDatasetsXml does a better job.  
        Doing this by hand requires extensive knowledge of 
        how ERDDAP™ works.  It is unlikely that you will do a better job by hand.
        (Bob Simons always uses GenerateDatasetsXml for the first draft, and he wrote ERDDAP.)
      <li>GenerateDatasetsXml always generates a valid chunk of datasets.xml.  
        Any chunk of datasets.xml 
        that you write will probably have at least a few errors that prevent 
        ERDDAP™ from loading the dataset. 
        It often takes people hours to diagnose these problems.
        Don't waste your time. Let GenerateDatasetsXml do the hard work.
        Then you can refine the .xml by hand if you want.
      </ul>

      <p>When you use the GenerateDatasetsXml program:
      <ul>
      <li>On Windows, the first time you run GenerateDatasetsXml, you need to edit the 
        GenerateDatasetsXml.bat file with a text editor to change the path to the java.exe file
        so that Windows can find Java.
      <li>GenerateDatasetsXml first asks you to specify the EDDType 
        (Erd Dap Dataset Type)
        of the dataset.  See the 
        <a rel="help" href="#datasetTypes">List of Dataset Types</a> 
        (in this document)
        to figure out which type is appropriate for the dataset you are working on.
        In addition to the regular EDDTypes, there are also a few 
        <a rel="help" href="#SpecialPseudoDatasetTypes">Special/Pseudo Dataset Types</a>
        (e.g., one which crawls a THREDDS catalog to generate a chunk of 
        datasets.xml for each of the datasets in the catalog).
        
      <li>GenerateDatasetsXml then asks you a series of questions 
        specific to that EDDType. 
        The questions gather the information needed for ERDDAP™ to access the 
        dataset's source.
        To understand what ERDDAP™ is asking for, 
        see the documentation for the EDDType that you specified 
        by clicking on the same dataset type in the
        <a rel="help" href="#datasetTypes">List of Dataset Types</a>.

        <p>If you need to enter a string with special characters (e.g., 
        whitespace characters at the beginning or end, non-ASCII characters),
        enter a 
        <a rel="help" href="https://www.json.org/json-en.html" >JSON-style string<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
        (with special characters escaped with \ characters). 
        For example, to enter just a tab character, enter "\t" (with the surrounding double quotes,
        which tell ERDDAP™ that this is a JSON-style string).

      <li>Often, one of your answers won't be what GenerateDatasetsXml needs.
        You can then try again, with revised answers to the questions, 
        until GenerateDatasetsXml can successfully find and understand the source data.
      <li>If you answer the questions correctly (or sufficiently correctly), 
         GenerateDatasetsXml will connect
         to the dataset's source and gather basic information (for example, variable names and metadata).
         <br>For datasets that are from local NetCDF .nc and related files,
         GenerateDatasetsXml will often print the ncdump-like structure of the
         file after it first reads the file. This may give you information
         to answer the questions better on a subsequent loop through GenerateDatasetsXml.
      <li>GenerateDatasetsXml will then generate a rough draft of the dataset XML for that dataset.
      <li>Diagnostic information and the rough draft of the dataset XML will be written to
        <i>bigParentDirectory</i>/logs/GenerateDatasetsXml.log .
      <li>The rough draft of the dataset XML will be written to 
        <i>bigParentDirectory</i>/logs/GenerateDatasetsXml.out .
      <li><a class="selfLink" id="GenerateDatasetsXml_0Files" href="#GenerateDatasetsXml_0Files" rel="bookmark">"0 files" Error Message</a>
          <br>If you run GenerateDatasetsXml or 
          <a rel="help" href="#DasDds">DasDds</a>, 
          or if you try to load an 
          EDDGridFrom...Files or EDDTableFrom...Files dataset in ERDDAP™, 
          and you get a "0 files" error message indicating that
          ERDDAP™ found 0 matching files in the directory 
          (when you think that there are matching files in that directory):
          <ul>
          <li>Check that you have specified the full name of the directory.
            And if you specified the sample filename, make sure you specified
            the file's full name, including the full directory name.
          <li>Check that the files really are in that directory.
          <li>Check the spelling of the directory name.
          <li>Check the fileNameRegex. It's really, really easy to make mistakes with regexes.
            For test purposes, try the regex .* which should match all filenames.
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
          <li>Check that the user who is running the program (e.g., user=tomcat (?) for Tomcat/ERDDAP)
            has 'read' permission for those files. 
          <li>In some operating systems (for example, SELinux) and depending on system settings, 
            the user who ran the program must have 'read' permission for the 
            whole chain of directories leading to the directory that has the files. 
          </ul>
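          <p>Several of the checks above can be done outside of ERDDAP™. 
          The following Python sketch is one way to do that; it is not an ERDDAP™ tool, 
          and the function name <kbd>diagnose_zero_files</kbd> and the report layout are hypothetical. 
          It assumes that the fileNameRegex must match the whole file name and that Python's 
          regex syntax behaves like Java's for simple patterns such as <kbd>.*\.asc</kbd>, 
          which may not hold for more exotic regexes. 
          Run it as the same user that runs Tomcat, because the permission checks apply to the current user.

<pre>
```python
import os
import re
import tempfile


def diagnose_zero_files(directory, file_name_regex):
    """Collect hints about why no files in directory match file_name_regex."""
    report = {"problems": [], "matching": []}
    directory = os.path.abspath(directory)
    if not os.path.isdir(directory):
        report["problems"].append("not a directory: " + directory)
        return report

    # Check 'read' permission for the directory and every parent above it.
    current = directory
    while True:
        if not os.access(current, os.R_OK):
            report["problems"].append("no read permission: " + current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent

    try:
        names = sorted(os.listdir(directory))
    except PermissionError:
        return report  # already reported above as a permission problem

    # Assumption: the regex must match the whole file name.
    pattern = re.compile(file_name_regex)
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and pattern.fullmatch(name):
            report["matching"].append(name)
            if not os.access(path, os.R_OK):
                report["problems"].append("no read permission: " + path)
    if not report["matching"]:
        report["problems"].append("regex matched 0 files; try .* as a test")
    return report


if __name__ == "__main__":
    # Demo with throwaway files; replace with your own directory and regex.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("a.asc", "b.asc", "notes.txt"):
            with open(os.path.join(demo_dir, name), "w") as f:
                f.write("demo\n")
        print(diagnose_zero_files(demo_dir, r".*\.asc")["matching"])
        print(diagnose_zero_files(demo_dir, r".*\.nc")["problems"])
```
</pre>

          <p>If the regex <kbd>.*</kbd> finds files but your own regex does not, 
          the regex is the likely culprit; if even <kbd>.*</kbd> finds nothing, 
          look at the directory name and the permissions.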

      <li>If you have problems that you can't solve, 
        <a rel="help" href="#diagnoseProblems">send an email to Bob</a> with as much 
        information as possible.
        Similarly, if it seems like the appropriate EDDType for a given dataset
        doesn't work with that dataset, or if there is no appropriate EDDType,
        please send an email to Bob with the details (and a sample file if relevant).
        <br>&nbsp;

      <li><a class="selfLink" id="EditingGDXOutput" href="#EditingGDXOutput" rel="bookmark"
        ><strong>You need to edit the output from GenerateDatasetsXml to make it better.</strong></a>
        <br>&nbsp;
        <ul>
        <li>DISCLAIMER:
          <br>THE CHUNK OF datasets.xml MADE BY GenerateDatasetsXml ISN'T PERFECT.
          YOU MUST READ AND EDIT THE XML BEFORE USING IT IN A PUBLIC ERDDAP.
          GenerateDatasetsXml RELIES ON A LOT OF RULES-OF-THUMB WHICH AREN'T ALWAYS CORRECT.
          YOU ARE RESPONSIBLE FOR ENSURING THE CORRECTNESS OF THE XML THAT YOU
          ADD TO ERDDAP'S datasets.xml FILE.

          <p>(Fun Fact: I'm not shouting. For historical legal reasons, disclaimers must be written in all caps.)

          <p>The output of GenerateDatasetsXml is a rough draft.           
          <br>You will almost always need to edit it. 
          <br>We've made and continue to make a huge effort
          to make the output as ready-to-go as possible, but there are limits.
          Often, needed information is simply not available from the source metadata.

          <p>A fundamental problem is that we're asking a computer program (GenerateDatasetsXml) 
          to do a task where, if you gave the same task to 100 people, 
          you would get 100 different results. There is no single "right" answer. 
          Obviously, the program comes closest to reading Bob's mind (not yours), 
          but even so, it isn't an all-understanding AI program, 
          just a bunch of heuristics cobbled together to do an AI-like task.
          (That day of an all-understanding AI program may come, but it hasn't yet. 
          If/when it does, we humans may have bigger problems. Be careful what you wish for.)          

        <li>For informational purposes, the output shows the global 
          sourceAttributes and variable sourceAttributes as comments.
          ERDDAP™ combines sourceAttributes and addAttributes (which have
          precedence) to make the combinedAttributes that are shown to the user.
          (And other attributes are automatically added to longitude, latitude,
          altitude, depth, and time variables when ERDDAP™ actually makes the dataset).
          <br>&nbsp;
        <li>If you don't like a sourceAttribute, overwrite it by adding an
          addAttribute with the same name but a different value
          (or no value, if you want to remove it).
          <br>&nbsp;
        <li>All of the addAttributes are computer-generated suggestions. Edit them!
          If you don't like an addAttribute, change it.
          <br>&nbsp;
        <li>If you want to add other addAttributes, add them.
          <br>&nbsp;
        <li>If you want to change a destinationName, change it.
          But don't change sourceNames.
          <br>&nbsp;
        <li>You can change the order of the dataVariables or remove any of them.
          <br>&nbsp;
        </ul>
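        <p>To make the precedence rule concrete, here is a small Python sketch of how 
        sourceAttributes and addAttributes could be merged into combinedAttributes. 
        It only illustrates the rule described above and is not ERDDAP's actual code; 
        the function name <kbd>combine_attributes</kbd> is hypothetical, and representing 
        "no value" as an empty string or <kbd>None</kbd> is an assumption made for the example.

<pre>
```python
def combine_attributes(source_attributes, add_attributes):
    """Merge two attribute dicts; add_attributes takes precedence."""
    combined = dict(source_attributes)
    for name, value in add_attributes.items():
        if value is None or value == "":
            # "No value" removes the attribute (assumed representation).
            combined.pop(name, None)
        else:
            # Same name, different value: the addAttribute wins.
            combined[name] = value
    return combined


if __name__ == "__main__":
    source = {"title": "raw title", "units": "degC", "comment": "obsolete"}
    add = {"title": "Sea Surface Temperature", "comment": "",
           "ioos_category": "Temperature"}
    print(combine_attributes(source, add))
```
</pre>

        <p>In this example, <kbd>title</kbd> is overwritten, <kbd>comment</kbd> is removed, 
        <kbd>units</kbd> passes through unchanged, and <kbd>ioos_category</kbd> is added.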
      
      <li>You can then use <a rel="help" href="#DasDds">DasDds</a>
        (see below) to repeatedly test the XML for that dataset
        to ensure that the resulting dataset appears as you want it to in ERDDAP.
      <li>Feel free to make small changes to the datasets.xml chunk that was generated, 
        for example, supply a better <kbd>infoUrl, summary,</kbd> or <kbd>title</kbd>.             
      <li><a class="selfLink" id="doNotAddStandardNames" href="#doNotAddStandardNames" rel="bookmark">-doNotAddStandardNames</a> --
        If you include <kbd>-doNotAddStandardNames</kbd>
        as a command line parameter when you run GenerateDatasetsXml, 
        GenerateDatasetsXml will not add <kbd>standard_name</kbd> to the <kbd>addAttributes</kbd>
        for any variables other than variables named <kbd>latitude, longitude, altitude, depth</kbd> or 
        <kbd>time</kbd> (which have obvious standard_names).
        This can be useful if you are using the output from GenerateDatasetsXml directly in 
        ERDDAP™ without editing the output, because GenerateDatasetsXml often guesses
        standard_names incorrectly. (Note that we always recommend that you 
        do edit the output before using it in ERDDAP.) Using this parameter 
        will have other minor related effects because the guessed standard_name
        is often used for other purposes, e.g., to create a new long_name,
        and to create the colorBar settings.

      <li><a class="selfLink" id="ScriptingGenerateDatasetsXml" href="#ScriptingGenerateDatasetsXml" rel="bookmark">Scripting:</a> 
        As an alternative to answering the questions interactively 
        at the keyboard and looping to generate additional datasets, 
        you can provide command line arguments to answer 
        all of the questions to generate one dataset.
        GenerateDatasetsXml will process those parameters, 
        write the output to the output file, and exit the program.
        <p>To set this up, first use the program in interactive mode 
        and write down your answers. Here's a partial example:
        <br>Let's say you run the script: <kbd>./GenerateDatasetsXml.sh</kbd>
        <br>Then enter: <kbd>EDDTableFromAsciiFiles</kbd>
        <br>Then enter: <kbd>/u00/data/</kbd>
        <br>Then enter: <kbd>.*\.asc</kbd>
        <br>Then enter: <kbd>/u00/data/sampleFile.asc</kbd>
        <br>Then enter: <kbd>ISO-8859-1</kbd>
        <p>To run this in a non-interactive way, 
        use this command line:
        <br><kbd>./GenerateDatasetsXml.sh EDDTableFromAsciiFiles /u00/data/ .*\.asc /u00/data/sampleFile.asc 
        ISO-8859-1</kbd>
        <br>So basically, you just list all the answers on the command line.
        <br>This should be useful for datasets that change frequently in a way that 
        necessitates re-running GenerateDatasetsXml (notably EDDGridFromThreddsCatalog).
        <p>Details:
        <ul>
        <li>If a parameter contains a space or some special character, then encode the
          parameter as a 
          <a rel="help" href="https://www.json.org/json-en.html" >JSON-style string<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>, e.g., 
          <kbd>"my parameter with spaces and two\nlines"</kbd>.
        <li>If you want to specify an empty string as a parameter, use: <kbd>nothing</kbd>
        <li>If you want to specify the default value of a parameter, use: <kbd>default</kbd>
            <br>&nbsp;
        </ul>
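        <p>As a rough illustration of the parameter rules above, here is a minimal Python sketch that turns a list of answers into command line tokens. The function name <kbd>encode_param</kbd>, the use of <kbd>None</kbd> to mean "use the default", and the set of characters treated as "special" (whitespace, non-printable characters, and double quotes) are assumptions made for this example; they are not part of GenerateDatasetsXml. Any additional quoting that your shell requires is not covered here.</p>

```python
import json


def encode_param(value):
    """Encode one answer as a command line token (illustrative sketch only)."""
    if value is None:
        return "default"   # request the parameter's default value
    if value == "":
        return "nothing"   # stands for an empty string
    # Assumption: whitespace, non-printable characters, and double quotes are "special".
    if any(ch.isspace() or ch == '"' or not ch.isprintable() for ch in value):
        return json.dumps(value)   # JSON-style string with escapes such as \n
    return value


# The answers from the interactive example above, in the same order.
answers = ["EDDTableFromAsciiFiles", "/u00/data/", r".*\.asc",
           "/u00/data/sampleFile.asc", "ISO-8859-1"]
tokens = [encode_param(answer) for answer in answers]
print(" ".join(tokens))
```

        <p>Keeping the answers in a list makes it easy to re-run the same generation later with one answer changed.</p>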
      <li>GenerateDatasetsXml supports a -i<i>datasetsXmlName</i>#<i>tagName</i>
        command line parameter which inserts the output into the specified datasets.xml file 
        (the default is <i>tomcat</i>/content/erddap/datasets.xml). 
        GenerateDatasetsXml looks for two lines in datasetsXmlName:
        <br><kbd>&lt;!-- Begin GenerateDatasetsXml #<i>tagName someDatetime</i> --&gt;</kbd>
        <br>and 
        <br><kbd>&lt;!-- End GenerateDatasetsXml #<i>tagName someDatetime</i> --&gt;</kbd>
        <br>and replaces everything in between those lines with the new content, and changes the someDatetime.
        <ul>
        <li>The -i switch is only processed (and changes to datasets.xml are only made)
          if you run GenerateDatasetsXml with command line arguments which specify all
          the answers to all of the questions for one loop of the program. (See 'Scripting' above.)
          (The thinking is: This parameter is for use with scripts.
          If you use the program in interactive mode (typing info on the keyboard), you are
          likely to generate some incorrect chunks of XML before you generate
          the one you want.)
        <li>If the Begin and End lines are not found, then those lines and the new content
          are inserted right before &lt;/erddapDatasets&gt;.
        <li>There is also a -I (capital i) switch for testing purposes which 
          works the same as -i, 
          but creates a file called datasets.xml<i>DateTime</i> and doesn't make 
          changes to datasets.xml.
        <li>Don't run GenerateDatasetsXml with -i in two processes at once. 
          There is a chance only one set of changes will be kept. 
          There may be serious trouble (for example, corrupted files).
        </ul>
      </ul>
      If you use "GenerateDatasetsXml -verbose", it will print more diagnostic messages than usual.


      <p><a class="selfLink" id="SpecialPseudoDatasetTypes" href="#SpecialPseudoDatasetTypes" rel="bookmark">Special/Pseudo Dataset Types</a>
      <br>In general, the EDDType options in GenerateDatasetsXml 
      match the EDD types described in this document 
      (see the 
        <a rel="help" href="#datasetTypes">List of Dataset Types</a>) 
      and
      generate one datasets.xml chunk to create one dataset from one specific data source.
      There are a few exceptions and special cases:
      <ul>
      <li>EDDGridFromErddap 
        <br>This EDDType generates all of the datasets.xml chunks needed to make
        <a rel="help" href="#EDDGridFromErddap">EDDGridFromErddap</a> datasets 
        from all of the EDDGrid datasets in a remote ERDDAP.
        You will have the option of keeping the original datasetIDs 
        (which may duplicate some datasetIDs already in your ERDDAP) 
        or generating new names which will be unique (but usually aren't as human-readable).
        <br>&nbsp;

      <li>EDDTableFromErddap 
        <br>This EDDType generates all of the datasets.xml chunks needed to make
        <a rel="help" href="#EDDTableFromErddap">EDDTableFromErddap</a> datasets 
        from all of the EDDTable datasets in a remote ERDDAP.
        You will have the option of keeping the original datasetIDs 
        (which may duplicate some datasetIDs already in your ERDDAP) 
        or generating new names which will be unique (but usually aren't as human-readable).
        <br>&nbsp;

      <li><a class="selfLink" id="EDDGridFromThreddsCatalog" href="#EDDGridFromThreddsCatalog" rel="bookmark">EDDGridFromThreddsCatalog</a> 
          <br>This EDDType generates all of the datasets.xml chunks needed for all of 
          the <a rel="help" href="#EDDGridFromDap">EDDGridFromDap</a> datasets 
          that it can find by crawling recursively through a THREDDS (sub) catalog.
          There are many forms of THREDDS catalog URLs.
          This option REQUIRES a THREDDS .xml URL with <kbd>/catalog/</kbd> in it, for example,
          <br>https://oceanwatch.pfeg.noaa.gov/thredds/catalog/catalog.xml or
          <br>https://oceanwatch.pfeg.noaa.gov/thredds/catalog/Satellite/aggregsatMH/chla/catalog.xml
          <br>(a related .html catalog is at
          <br>https://oceanwatch.pfeg.noaa.gov/thredds/Satellite/aggregsatMH/chla/catalog.html
          , which is not acceptable for EDDGridFromThreddsCatalog).
          <br>If you have problems with EDDGridFromThreddsCatalog:
          <ul>
          <li>Make sure the URL you are using is valid, includes <kbd>/catalog/</kbd>,
            and ends with /catalog.xml .
          <li>If possible, use a public domain name (for example, https://oceanwatch.pfeg.noaa.gov)
            in the URL, not a local numeric IP address (for example, https://12.34.56.78).
            If the THREDDS is only accessible via the local numeric IP address, you can use
            <a rel="help" href="#convertToPublicSourceUrl"><kbd>&lt;convertToPublicSourceUrl&gt;</kbd></a>
            so ERDDAP™ users see the public address, even though ERDDAP™ gets data from the 
            local numeric address. 
          <li>If you have problems that you can't solve, 
            <a rel="help" href="#diagnoseProblems">send an email to Bob</a> with as much 
            information as possible.
          <li>The low-level code for this now uses the Unidata netcdf-java 
            catalog crawler code (thredds.catalog classes) 
            so that it can handle all THREDDS catalogs 
            (which can be surprisingly complex).
            Thanks to Unidata for that code.
          <br>&nbsp;
          </ul>     
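          <p>The URL requirements above can be checked before you start a long crawl. The following Python sketch is only an illustration; the function name <kbd>looks_like_thredds_xml_catalog</kbd> is hypothetical, and the check only inspects the text of the URL (it does not contact the server).</p>

```python
from urllib.parse import urlparse


def looks_like_thredds_xml_catalog(url):
    """Return True if the URL has the form described above (sketch only)."""
    parsed = urlparse(url)
    return (
        parsed.scheme in ("http", "https")
        and "/catalog/" in parsed.path              # must include /catalog/
        and parsed.path.endswith("/catalog.xml")    # must end with /catalog.xml
    )


for candidate in (
    "https://oceanwatch.pfeg.noaa.gov/thredds/catalog/catalog.xml",
    "https://oceanwatch.pfeg.noaa.gov/thredds/Satellite/aggregsatMH/chla/catalog.html",
):
    print(candidate, looks_like_thredds_xml_catalog(candidate))
```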
          
      <li><a class="selfLink" id="EDDGridLonPM180FromErddapCatalog" href="#EDDGridLonPM180FromErddapCatalog" rel="bookmark">EDDGridLonPM180FromErddapCatalog</a>
          <br>This EDDType generates the datasets.xml to make 
            <a rel="help" href="#EDDGridLonPM180">EDDGridLonPM180</a> datasets
            from all of the EDDGrid datasets in an ERDDAP
            that have any longitude values greater than 180.
          <ul>
          <li>If possible, use a public domain name (for example, https://oceanwatch.pfeg.noaa.gov)
            in the URL, not a local numeric IP address (for example, https://12.34.56.78).
            If the ERDDAP™ is only accessible via the local numeric IP address, you can use
            <a rel="help" href="#convertToPublicSourceUrl"><kbd>&lt;convertToPublicSourceUrl&gt;</kbd></a>
            so ERDDAP™ users see the public address, even though ERDDAP™ gets data from the 
            local numeric address. 
            <br>&nbsp;
          </ul>
          
      <li><a class="selfLink" id="EDDGridLon0360FromErddapCatalog" href="#EDDGridLon0360FromErddapCatalog" rel="bookmark">EDDGridLon0360FromErddapCatalog</a>
          <br>This EDDType generates the datasets.xml to make 
            <a rel="help" href="#EDDGridLon0360">EDDGridLon0360</a> datasets
            from all of the EDDGrid datasets in an ERDDAP
            that have any longitude values less than 0.
          <ul>
          <li>If possible, use a public domain name (for example, https://oceanwatch.pfeg.noaa.gov)
            in the URL, not a local numeric IP address (for example, https://12.34.56.78).
            If the ERDDAP™ is only accessible via the local numeric IP address, you can use
            <a rel="help" href="#convertToPublicSourceUrl"><kbd>&lt;convertToPublicSourceUrl&gt;</kbd></a>
            so ERDDAP™ users see the public address, even though ERDDAP™ gets data from the 
            local numeric address. 
            <br>&nbsp;
          </ul>
          
      <li><a class="selfLink" id="EDDsFromFiles" href="#EDDsFromFiles" rel="bookmark">EDDsFromFiles</a> 
        <br>Given a start directory, 
        this traverses the directory and all subdirectories and tries
        to create a dataset for each group of data files that it finds.
        <ul>
        <li>This assumes that when a dataset is found, the dataset includes all 
          subdirectories.
        <li>If a dataset is found, similar sibling directories will be treated 
          as separate datasets
          (for example, directories for the 1990's, the 2000's, the 2010's, 
          will generate separate datasets).
          They should be easy to combine by hand -- just change the 
          first dataset's <kbd>&lt;fileDir&gt;</kbd> to the parent directory and delete all the
          subsequent sibling datasets.
        <li>This will only try to generate a chunk of datasets.xml for the most
          common type of file extension in a directory (not counting .md5, which is ignored).
          So, given a directory with 10 .nc files and 5 .txt files, 
          a dataset will be generated for the .nc files only.
        <li>This assumes that all files in a directory with the same extension 
          belong in the same dataset. If a directory has some .nc files with SST data 
          and some .nc files with chlorophyll data, just one sample .nc 
          file will be read (SST? chlorophyll?) and just one dataset
          will be created for that type of file.  That dataset will probably fail
          to load because of complications from trying to load two types
          of files into the same dataset.
        <li>If there are fewer than 4 files with the most common extension in a directory, 
          this assumes that they aren't data files and just skips the directory.
        <li>If there are 4 or more files in a directory, 
          but this can't successfully generate a chunk of datasets.xml for the files
          (for example, an unsupported file type), this will generate an
          <a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a> 
          dataset for the files.
        <li>At the end of the diagnostics that this writes to the log file, just before
          the datasets.xml chunks, this will print a table with a summary of information
          gathered by traversing all the subdirectories. The table will 
          list every subdirectory and indicate the most common type of file extension,
          the total number of files, and which type of dataset was created for
          these files (if any). If you are faced with a complex, deeply nested 
          file structure, consider running GenerateDatasetsXml with EDDType=EDDsFromFiles
          just to generate this information.
        <li>This option may not do a great job of guessing the best EDDType for a given 
          group of data files, but it is quick, easy, and worth a try.
          If the source files are suitable, it works well and is a good first 
          step in generating the datasets.xml for a file system with lots of 
          subdirectories, each with data files from different datasets.
          <br>&nbsp;
        </ul>      
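        <p>The per-directory rules above (most common extension wins, .md5 is ignored, fewer than 4 files means the directory is skipped) can be pictured with a short Python sketch. This is not ERDDAP's code; the names <kbd>dataset_extension</kbd>, <kbd>MIN_FILES</kbd>, and <kbd>IGNORED_EXTENSIONS</kbd> are hypothetical, and the handling of ties and of files without an extension is an assumption made for this example.</p>

```python
from collections import Counter
import os

MIN_FILES = 4                    # fewer files than this: the directory is skipped
IGNORED_EXTENSIONS = {".md5"}    # not counted when looking for the most common type


def dataset_extension(file_names):
    """Return the extension a dataset would be generated for, or None (sketch)."""
    extensions = (os.path.splitext(name)[1] for name in file_names)
    # Assumption: files without any extension are not counted.
    counts = Counter(ext for ext in extensions
                     if ext and ext not in IGNORED_EXTENSIONS)
    if not counts:
        return None
    # Assumption: on a tie, the extension seen first wins.
    ext, count = counts.most_common(1)[0]
    return ext if count >= MIN_FILES else None


names = [f"sst{i}.nc" for i in range(10)] + [f"notes{i}.txt" for i in range(5)]
print(dataset_extension(names))
```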

      <li><a class="selfLink" id="EDDTableFromEML" href="#EDDTableFromEML" rel="bookmark">EDDTableFromEML and EDDTableFromEMLBatch</a>
          <br>These special EDDTypes generate the datasets.xml to make an 
            <a rel="help" href="#EDDTableFromAsciiFiles">EDDTableFromAsciiFiles</a> dataset
            from each of the tables described in an 
            <a rel="help"
            href="https://knb.ecoinformatics.org/external//emlparser/docs/index.html"
            >Ecological Metadata Language<img 
          src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
            XML file. 
        The "Batch" variant works on all of the EML files in a local or remote directory.  
        Please see the separate
            <a rel="help"
            href="https://erddap.github.io/EDDTableFromEML.html"
            >documentation&nbsp;for&nbsp;EDDTableFromEML</a>.
          <br>&nbsp;

      <li><a class="selfLink" id="EDDTableFromInPort" href="#EDDTableFromInPort" rel="bookmark">EDDTableFromInPort</a>
          <br>This special EDDType generates the datasets.xml to make an 
            <a rel="help" href="#EDDTableFromAsciiFiles">EDDTableFromAsciiFiles</a> dataset
            from the information in an  
            <a rel="help" href="https://inport.nmfs.noaa.gov/inport"
            >inport-xml<img 
          src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
            file. 
        If you can get access to the source data file (the inport-xml file should
        have clues for where to find it), you can make a working dataset in ERDDAP.
        
        <p>The following steps outline how to use GenerateDatasetsXml 
          with an inport-xml file in order to get a working dataset in ERDDAP.
        <ol>
        <li>Once you have access to the inport-xml file (either as a URL or a local file):
          run GenerateDatasetsXml, specify <kbd>EDDType=EDDTableFromInPort</kbd>, 
          specify the inport-xml URL or full filename,
          specify <kbd>whichChild=0</kbd>, and specify the other requested information (if known).
          (At this point, you don't need to have the source data file or specify its name.)
          The <kbd>whichChild=0</kbd> setting tells GenerateDatasetsXml to 
          write out the information for <strong>all</strong> of the
          &lt;entity-attribute-information&gt;&lt;entity&gt;'s in the inport-xml file
          (if there are any).
          It also prints out a <kbd>Background</kbd> information summary, including
          all of the <kbd>download-url</kbd>'s listed in the inport-xml file. 
        <li>Look through all that information (including the <kbd>Background</kbd>
          information that GenerateDatasetsXml prints) 
          and visit the <kbd>download-url</kbd>(s)
          in order to try to find the source data file(s).
          If you can find it(them), download it(them) into a directory that is 
          accessible to ERDDAP. 
          (If you can't find any source data files, there is no point in proceeding.)
        <li>Run GenerateDatasetsXml again.
          <br>If the source data file 
            corresponds to one of the inport-xml file's 
            &lt;entity-attribute-information&gt;&lt;entity&gt;'s, 
            specify <kbd>whichChild=<i>thatEntity'sNumber</i></kbd> (e.g., 1, 2, 3, ...). 
            ERDDAP™ will try to match the column names in the source data file
            to names in the entity information, and prompt to accept/reject/fix
            any discrepancies.
          <br>Or, if the inport-xml file doesn't have any 
            &lt;entity-attribute-information&gt;&lt;entity&gt;'s, specify <kbd>whichChild=0</kbd>.
        <li>In the chunk of datasets.xml that was made by GenerateDatasetsXml,
          revise the 
          <a rel="help" href="#globalAttributes">global &lt;addAttributes&gt;</a> 
          as needed/desired.
        <li>In the chunk of datasets.xml that was made by GenerateDatasetsXml, 
          add/revise the 
          <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a> 
          information as needed/desired to describe each of the variables.
          Be sure you properly identify each variable's
          <br><a rel="help" href="#sourceName">&lt;sourceName&gt;</a> 
            (as it appears in the source), 
          <br><a rel="help" href="#destinationName">&lt;destinationName&gt;</a>
            (which has more limitations on allowed characters than sourceName), 
          <br><a rel="help" href="#units">&lt;units&gt;</a>
            (especially if it is a 
            <a rel="help" href="#timeStampVariable">time or timestamp variable</a>
            where the units need to 
            specify the format), and
          <br><a rel="help" href="#missing_value">&lt;missing_value&gt;</a>.
        <li>When you are close to finishing, repeatedly use the
          <a rel="help" href="#DasDds">DasDds</a> 
          tool to quickly see if the dataset description is valid and if
          the dataset will appear in ERDDAP™ as you want it to.
          <br>&nbsp;
        </ol>

        It would be great if groups using InPort to document their datasets 
        would also use ERDDAP™ to make the actual data available: 
        <ul>
        <li>ERDDAP™ is a solution that you can use today to fulfill NOAA's
          <a rel="help" 
          href="https://nosc.noaa.gov/EDMC/PD.DSP.php"
          >Public Access to Research Results (PARR) requirements</a>,
          not at some vague time in the future.
        <li>ERDDAP™ makes the actual data available to users, not just the metadata.
          (What good is metadata without data?)
        <li>ERDDAP™ supports metadata (notably, the units of variables), 
          unlike some other data server software being considered. 
          (What good is data without metadata?)
          To use software that doesn't support metadata is to invite the data to be 
          misunderstood and misused.          
        <li>ERDDAP™ is free and open-source software
          unlike some other software being considered.
          Ongoing development of ERDDAP™ is already paid for.
          Support for ERDDAP™ users is free.
        <li>ERDDAP's appearance can be easily customized to reflect
          and highlight your group (not ERD or ERDDAP). 
        <li>ERDDAP™ offers a consistent way to access all datasets.
        <li>ERDDAP™ can read data from many types of data files and from relational 
          databases. 
        <li>ERDDAP™ can deal with large datasets, including datasets where
          the source data is in many data files.
        <li>ERDDAP™ can write data to many types of data files, at the user's request,
          including scientific data file types like netCDF, ESRI .csv, and ODV .txt. 
        <li>ERDDAP™ can make custom graphs and maps of subsets of the data, 
          based on the user's specifications.
        <li>ERDDAP™ can deal with non-data datasets such as collections
          of image, video, or audio files.
        <li>ERDDAP™ has been installed and used at 
        <a rel="bookmark" 
        href="https://erddap.github.io/setup.html#organizations"
        >more than 60 institutions around the world</a>.
        <li>ERDDAP™ is listed as one of the data servers recommended for use within NOAA
           in the
        <a rel="bookmark" 
href="https://www.ngdc.noaa.gov/wiki/index.php/Data_Access_Technical_Recommendations#Software_implementations">NOAA Data Access Procedural Directive<img 
  src="../images/external.png" alt=" (external link)" 
  title="This is a link to an external website."/></a>, 
          unlike some other software being considered.
        <li>ERDDAP™ is a product of NMFS/NOAA, so using it within NMFS and NOAA 
          should be a point of pride for NMFS and NOAA.
        </ul>
        Please give ERDDAP™ a try. If you need help, please post a message in the ERDDAP™ Google group.
        <br>&nbsp;

      <li>addFillValueAttributes
          <br>This special EDDType option isn't a dataset type. It is a tool 
          which can add _FillValue attributes to some variables in some datasets.
          See 
          <a href="#addFillValueAttributes" rel="bookmark">addFillValueAttributes</a>.
        <br>&nbsp;

      <li><a class="selfLink" id="findDuplicateTime" href="#findDuplicateTime" rel="bookmark">findDuplicateTime</a>
          <br>This special EDDType option isn't a dataset type. 
          Instead, it tells GenerateDatasetsXml to search through a collection of 
          gridded .nc (and related) files to find and print out a list of files with duplicate time values.
          When it looks at the time values, it converts them from the original units 
          to "seconds since 1970-01-01" in case different files use different units strings.
          You need to provide the starting directory (with or without the trailing slash),
          the file name regular expression (e.g., .*\.nc ), and the name of the 
          time variable in the files.
        <br>&nbsp;
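          <p>The idea behind findDuplicateTime (normalize every time value to "seconds since 1970-01-01", then look for values that occur more than once) can be sketched in Python as follows. This is an illustration, not ERDDAP's code: it works on an in-memory dictionary instead of reading .nc files, the function names are hypothetical, and it assumes units of the form "<i>unit</i> since <i>ISO date</i>" with a UTC base time and no time zone offset.</p>

```python
from collections import defaultdict
from datetime import datetime, timezone

SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def to_epoch_seconds(value, units):
    """Convert a value with units like 'days since 1970-01-01' to epoch seconds."""
    unit, _, base = units.partition(" since ")
    # Assumption: the base time is in UTC and carries no offset.
    base_time = datetime.fromisoformat(base.strip()).replace(tzinfo=timezone.utc)
    return base_time.timestamp() + value * SECONDS_PER_UNIT[unit.strip().lower()]


def find_duplicate_times(files):
    """files maps a file name to (list of time values, units string)."""
    seen = defaultdict(list)
    for name, (values, units) in sorted(files.items()):
        for value in values:
            seen[to_epoch_seconds(value, units)].append(name)
    # Keep only the time values that were found more than once.
    return {t: names for t, names in seen.items() if len(names) > 1}


sample = {
    "a.nc": ([0, 86400], "seconds since 1970-01-01"),
    "b.nc": ([1, 2], "days since 1970-01-01"),
}
print(find_duplicate_times(sample))
```

          <p>Here the two files use different units strings, but both contain the time 1970-01-02T00:00:00Z, so both file names are reported for that value.</p>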

      <li><a class="selfLink" id="ncdump" href="#ncdump" rel="bookmark">ncdump</a>
          <br>This special EDDType option isn't a dataset type. 
          Instead, it tells GenerateDatasetsXml to print an 
            <a rel="help"
            href="https://linux.die.net/man/1/ncdump"
            >ncdump<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>-like
          printout of an .nc, .ncml, or .hdf file.
          It actually uses netcdf-java's 
            <a rel="help"
            href="https://docs.unidata.ucar.edu/netcdf-java/5.4/javadoc/ucar/nc2/write/Ncdump.html"
            >NCdump<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>,
          which is a more limited tool than the C version of NCdump.
          If you use this option, GenerateDatasetsXml will ask you to use one
          of the options: "-h" (header), "-c" (coordinate vars),
          "-vall" (default), "-v var1;var2", "-v var1(0,0:10,0:20)".
          This is useful because, without ncdump, it is hard to know what is in
          an .nc, .ncml, or .hdf file and thus which EDDType you should specify for GenerateDatasetsXml.
          For an .ncml file, this will print the ncdump output for the result
          of the .ncml file changes applied to the underlying .nc or .hdf file.
        <br>&nbsp;

      </ul>


    <li><a class="selfLink" id="DasDds" href="#DasDds" rel="bookmark"><strong>DasDds</strong></a> is a command line program that you can use
      after you have created a first attempt at the XML for a new dataset in datasets.xml.
      With DasDds, you can repeatedly test and refine the XML.
      When you use the DasDds program:
      <ol>
      <li>On Windows, the first time you run DasDds, you need to edit the 
        DasDds.bat file with a text editor to change the path to the java.exe file
        so that Windows can find Java.
      <li>DasDds asks you for the datasetID for the dataset you are working on.
      <li>DasDds tries to create the dataset with that datasetID.
        <ul>
        <li>DasDds always prints lots of diagnostic messages.
          <br>If you use "DasDds -verbose", DasDds will print more diagnostic messages than usual.
        <li>For safety, DasDds always deletes all of the cached dataset information (files) 
          for the dataset before trying
          to create the dataset.  
          This is the equivalent of setting a 
          <a rel="help" href="https://erddap.github.io/setup.html#hardFlag">hard flag</a>.
          So for aggregated datasets, you might want to adjust the 
          fileNameRegex temporarily to limit the number of files the data constructor finds.
        <li>If the dataset fails to load (for whatever reason), 
          DasDds will stop and show you the error message for the first error it finds.
          <br><strong>Don't try to guess what the problem might be. Read the ERROR message carefully.</strong>
          <br>If necessary, read the preceding diagnostic messages to find more clues and information, too.
        <li><strong>Make a change to the dataset's XML to try to solve THAT problem</strong> 
          <br>and let DasDds try to create the dataset again.
        <li><strong>If you repeatedly solve each problem, you will eventually solve all the problems</strong>
          <br>and the dataset will load.
        </ul>
      <li>All DasDds output (diagnostics and results) is written to the screen and to
        <i>bigParentDirectory</i>/logs/DasDds.log .
      <li>If DasDds can create the dataset, DasDds will then show you the 
        <a rel="help" 
        href="https://coastwatch.pfeg.noaa.gov/erddap/griddap/documentation.html#fileType_das"
        >.das (Dataset Attribute Structure)</a>,
        <a rel="help" 
        href="https://coastwatch.pfeg.noaa.gov/erddap/griddap/documentation.html#fileType_dds"
        >.dds (Dataset Descriptor Structure)</a>, and 
        <a rel="help" 
        href="https://coastwatch.pfeg.noaa.gov/erddap/griddap/documentation.html#timeGaps"
        >.timeGaps (time gaps)</a>         
         information for the dataset on your screen and write them to 
         <i>bigParentDirectory</i>/logs/DasDds.out .
      <li>Often, you will want to make some small 
          change to the dataset's XML to clean up the dataset's metadata and rerun DasDds.
      </ol>
    </ul>

<a class="selfLink" id="ERDDAPlint" href="#ERDDAPlint" rel="bookmark"><strong>Bonus Third-Party Tool: ERDDAP-lint</strong></a> 
    <br>ERDDAP-lint is a program from Rob Fuller and Adam Leadbetter of the Irish Marine Institute 
       that you can use to improve the metadata of your ERDDAP™ datasets.
       ERDDAP-lint "contains rules and a simple static web application for running 
       some verification tests against your ERDDAP™ server. All the tests are run in the web browser."
       As with the 
       <a rel="help"
            href="https://en.wikipedia.org/wiki/Lint_(software)"
            >Unix/Linux lint tool<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>,
       you can edit the existing rules or add new rules. 
       See <a rel="help"
            href="https://github.com/IrishMarineInstitute/erddap-lint"
            >ERDDAP-lint<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
        for more information.  

       <p>This tool is especially useful for datasets that you created some time ago 
       and now want to bring up-to-date with your current metadata preferences.
       For example, early versions of GenerateDatasetsXml didn't put any effort
       into creating global creator_name, creator_email, creator_type, or creator_url
       metadata. You could use ERDDAP-lint to identify the datasets that lack 
       those metadata attributes.

       <p>Thanks to Rob and Adam for creating this tool
        and making it available to the ERDDAP™ community.
        <br>&nbsp;


<p><strong><a class="selfLink" id="basicStructure" href="#basicStructure" rel="bookmark">The Basic Structure of the datasets.xml File</a></strong>
<br>The required and optional tags allowed in a datasets.xml file 
  (and the number of times they may appear) are shown below.
  In practice, your datasets.xml will have lots of &lt;dataset&gt; tags and 
  only use the other tags within &lt;erddapDatasets&gt; as needed.
<pre>
&lt;?xml version="1.0" encoding="ISO-8859-1" ?&gt;
&lt;erddapDatasets&gt;
  <a rel="help" href="#angularDegreeUnits">&lt;angularDegreeUnits&gt;</a>...&lt;/angularDegreeUnits&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#angularDegreeTrueUnits">&lt;angularDegreeTrueUnits&gt;</a>...&lt;/angularDegreeTrueUnits&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#cacheMinutes">&lt;cacheMinutes&gt;</a>...&lt;/cacheMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#commonStandardNames">&lt;commonStandardNames&gt;</a>...&lt;/commonStandardNames&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#convertInterpolateDatasetIDVariableExample">&lt;convertInterpolateDatasetIDVariableExample /&gt;</a> &lt;!-- 0 or more --&gt;
  <a rel="help" href="#convertInterpolateDatasetIDVariableList">&lt;convertInterpolateDatasetIDVariableList /&gt;</a> &lt;!-- 0 or more --&gt;
  <a rel="help" href="#convertToPublicSourceUrl">&lt;convertToPublicSourceUrl /&gt;</a> &lt;!-- 0 or more --&gt;
  <a rel="help" href="#decompressedCacheMaxGB">&lt;decompressedCacheMaxGB&gt;</a>...&lt;/decompressedCacheMaxGB&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#decompressedCacheMaxMinutesOld">&lt;decompressedCacheMaxMinutesOld&gt;</a>...&lt;/decompressedCacheMaxMinutesOld&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#drawLandMask">&lt;drawLandMask&gt;</a>...&lt;/drawLandMask&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#emailDiagnosticsToErdData">&lt;emailDiagnosticsToErdData&gt;</a>...&lt;/emailDiagnosticsToErdData&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphBackgroundColor">&lt;graphBackgroundColor&gt;</a>...&lt;/graphBackgroundColor&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#ipAddressMaxRequests">&lt;ipAddressMaxRequests&gt;</a>...&lt;/ipAddressMaxRequests&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#ipAddressMaxRequestsActive">&lt;ipAddressMaxRequestsActive&gt;</a>...&lt;/ipAddressMaxRequestsActive&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#ipAddressUnlimited">&lt;ipAddressUnlimited&gt;</a>...&lt;/ipAddressUnlimited&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#loadDatasetsMinMinutes">&lt;loadDatasetsMinMinutes&gt;</a>...&lt;/loadDatasetsMinMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#loadDatasetsMaxMinutes">&lt;loadDatasetsMaxMinutes&gt;</a>...&lt;/loadDatasetsMaxMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#logLevel">&lt;logLevel&gt;</a>...&lt;/logLevel&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nGridThreads">&lt;nGridThreads&gt;</a>...&lt;/nGridThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nTableThreads">&lt;nTableThreads&gt;</a>...&lt;/nTableThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#palettes">&lt;palettes&gt;</a>...&lt;/palettes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#partialRequestMaxBytes">&lt;partialRequestMaxBytes&gt;</a>...&lt;/partialRequestMaxBytes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#partialRequestMaxCells">&lt;partialRequestMaxCells&gt;</a>...&lt;/partialRequestMaxCells&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#requestBlacklist">&lt;requestBlacklist&gt;</a>...&lt;/requestBlacklist&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#slowDownTroubleMillis">&lt;slowDownTroubleMillis&gt;</a>...&lt;/slowDownTroubleMillis&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#subscriptionEmailBlacklist">&lt;subscriptionEmailBlacklist&gt;</a>...&lt;/subscriptionEmailBlacklist&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#unusualActivity">&lt;unusualActivity&gt;</a>...&lt;/unusualActivity&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#updateMaxEvents">&lt;updateMaxEvents&gt;</a>...&lt;/updateMaxEvents&gt; &lt;!-- 0 or 1 --&gt;

  <a rel="help" href="#standardText">&lt;standardLicense&gt;</a>...&lt;/standardLicense&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;standardContact&gt;</a>...&lt;/standardContact&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;standardDataLicenses&gt;</a>...&lt;/standardDataLicenses&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;standardDisclaimerOfEndorsement&gt;</a>...&lt;/standardDisclaimerOfEndorsement&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;standardDisclaimerOfExternalLinks&gt;</a>...&lt;/standardDisclaimerOfExternalLinks&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;standardGeneralDisclaimer&gt;</a>...&lt;/standardGeneralDisclaimer&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;standardPrivacyPolicy&gt;</a>...&lt;/standardPrivacyPolicy&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;startHeadHtml5&gt;</a>...&lt;/startHeadHtml5&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;startBodyHtml5&gt;</a>...&lt;/startBodyHtml5&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;theShortDescriptionHtml&gt;</a>...&lt;/theShortDescriptionHtml&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#standardText">&lt;endBodyHtml5&gt;</a>...&lt;/endBodyHtml5&gt; &lt;!-- 0 or 1 --&gt;

  <a rel="help" href="#user">&lt;user username="..." password="..." roles="..." /&gt;</a> &lt;!-- 0 or more --&gt;

  <a rel="help" href="#datasetTypes">&lt;dataset&gt;</a>...&lt;/dataset&gt; &lt;!-- 1 or more --&gt;
&lt;/erddapDatasets&gt;
</pre>
It is possible that other encodings will be allowed in the future, but for now, only ISO-8859-1 is recommended.

<br>&nbsp;
<p><strong><a class="selfLink" id="xinclude" href="#xinclude" rel="bookmark">XInclude</a></strong>
  <br>Support for XInclude is new in version 2.25. It requires that you use the SAX parser (<kbd>&lt;useSaxParser&gt;true&lt;/useSaxParser&gt;</kbd> in your setup.xml).
  XInclude lets you write each dataset in its own file and then include them all in the main datasets.xml, reuse parts of dataset definitions, or both.
  If you want to see an example, <a href="https://github.com/ERDDAP/erddap/blob/main/src/test/java/testDataset/EDDTestDataset.java">EDDTestDataset.java</a> sets up XInclude to reuse variable definitions.
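<p>As a rough sketch only (not taken from the ERDDAP™ source), a main datasets.xml that pulls each 
  dataset from its own file might look like the following. 
  The file names are hypothetical. The <kbd>xi</kbd> namespace URI and the <kbd>xi:include</kbd> element 
  come from the W3C XInclude standard, but declaring the namespace on the root element and using 
  paths relative to the main file are assumptions. Check them against the test example above.
<pre>
&lt;?xml version="1.0" encoding="ISO-8859-1" ?&gt;
&lt;erddapDatasets xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
  &lt;!-- hypothetical files, each assumed to hold one complete dataset definition --&gt;
  &lt;xi:include href="datasets/buoyData.xml" /&gt;
  &lt;xi:include href="datasets/satelliteSst.xml" /&gt;
&lt;/erddapDatasets&gt;
</pre>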
  
<br>&nbsp;
<hr>
<h2><a class="selfLink" id="notes" href="#notes" rel="bookmark">Notes</a></h2>
Working with the datasets.xml file is a non-trivial project. 
Please read all of these notes carefully.
After you pick a <a rel="help" href="#datasetTypes">dataset type</a>, 
please read the detailed description of it carefully.

<ul>
<li><strong><a class="selfLink" id="useCtrlF" href="#useCtrlF" rel="bookmark">Use Ctrl-F To Find Things On This Web Page</a></strong> 
<br>All of the information about working with datasets.xml is on this one, 
very long, .html web page,
not several .html pages as some people prefer.
The advantage of one .html web page is that you can use Ctrl-F (Command-F on a Mac) 
in your web browser to search for text (for example, <kbd>time_precision</kbd>) within this web page.
<p>Alternatively, at the top of this document, there is a 
<a href="#TableOfContents" rel="help">Table of Contents</a>.

<li><strong><a class="selfLink" id="InternalLinks" href="#InternalLinks"
rel="bookmark">Internal Links</a></strong> 
<br>ERDDAP's web pages 
have a large number of almost invisible, internal links 
(the text is black and not underlined). 
If you hover over one of these links (usually the first few words of headings 
and paragraphs), the cursor becomes a hand.
If you click on the link, the URL in your browser's address bar becomes the internal link 
to that section of the document. This makes it easy to refer to specific sections of ERDDAP™ web pages.
As an example, hover over, and click on, the bold "Internal Links" at the start 
of this paragraph.
<br>&nbsp;

<li><strong><a class="selfLink" id="ChoosingTheDatasetType" href="#ChoosingTheDatasetType" rel="bookmark">Choosing the Dataset Type</a></strong> 
<br>In most cases, 
there is just one ERDDAP™ dataset type that is appropriate 
for a given data source. In a few cases (e.g., .nc files), there are a 
few possibilities, but usually one of them is definitely best. 
The first and biggest decision you must make is: is it appropriate to treat the 
dataset as a group of multidimensional arrays (if so see the 
  <a rel="help" href="#EDDGrid">EDDGrid dataset types</a>)
or as a database-like table of data (if so see the 
  <a rel="help" href="#EDDTable">EDDTable dataset types</a>).
<br>&nbsp;

<li><strong><a class="selfLink" id="ServingTheDataAsIs" href="#ServingTheDataAsIs" rel="bookmark">Serving the Data As Is</a></strong>
<br>Usually, there is no need to modify the data source
(e.g., convert the files to some other file type) so that ERDDAP™ can serve it. 
One of the assumptions of ERDDAP™ is that the data source will be used as is. 
Usually this works fine. Some exceptions are:
  <ul>
  <li>Relational Databases and Cassandra -- ERDDAP™ can serve data directly
    from relational databases and Cassandra. But for security, load balancing, 
    and performance reasons,
    you may choose to set up another database with the same data or
    save the data to NetCDF v3 .nc files and have ERDDAP™ serve the data from
    the new data source. See
    <a rel="help" href="#EDDTableFromDatabase">EDDTableFromDatabase</a> and
    <a rel="help" href="#EDDTableFromCassandra">EDDTableFromCassandra</a>.

  <li>Not Supported Data Sources -- ERDDAP™ can support a large number of types
  of data sources, but the world is filled with thousands (millions?) of different
  data sources (notably, data file structures). If ERDDAP™ doesn't support 
  your data source:
    <ul>
    <li>If the data source is NetCDF .nc files, you can use 
      <a rel="help" href="#NcML">NcML</a> to modify the data files on-the-fly,
      or use <a rel="help" href="#NCO">NCO</a> to permanently modify the data files.
    <li>You can write the data to a data source type that ERDDAP™ supports.
      NetCDF-3 .nc files are a good, general recommendation because they are binary files
      that ERDDAP™ can read very quickly.
      For tabular data, consider storing the data in a collection of 
      .nc files that use the 
      <a rel="help"
      href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
      >CF Discrete Sampling Geometries (DSG)<img 
          src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
      Contiguous Ragged Array data structures and so can be handled with ERDDAP's
      <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>.
      If they are logically organized (each with data for a chunk of space and time), 
      ERDDAP™ can extract data from them very quickly.
    <li>You can request that support for that data source be added to ERDDAP™ by 
      emailing Chris.John at noaa.gov.
    <li>You can add support for that data source by writing the code to handle
      it yourself. See 
      <a rel="help" 
      href="https://erddap.github.io/setup.html#programmersGuide"
      >the ERDDAP™ Programmer's Guide</a>.
    </ul>

  <li>Speed -- ERDDAP™ can read data from some data sources much faster than others. 
    For example, reading NetCDF v3 .nc files is fast and reading ASCII files is slower. 
    And if there is a large (&gt;1000) or huge (&gt;10,000) number of source data files,
    ERDDAP™ will respond to some data requests slowly.
    Usually, the difference isn't noticeable to humans. 
    However, if you think ERDDAP™ is slow for a given dataset, 
    you may choose to solve the problem by writing the data to a more efficient setup 
    (usually: a few, well-structured, NetCDF v3 .nc files). For tabular data, see
    <a rel="help" href="#EDDTableFromFiles_MillionsOfFiles">this advice</a>.
    <br>&nbsp;
  </ul>

<li><strong>Hint</strong>
<br>It is often easier to generate the XML for a dataset by 
making a copy of a working dataset description in datasets.xml and then modifying it.
<br>&nbsp;

<li><strong><a class="selfLink" id="encodingSpecialCharacters" href="#encodingSpecialCharacters" rel="bookmark">Encoding Special Characters</a></strong>
<br>Since datasets.xml is an XML file, you MUST 
<a rel="help" href="https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML">&amp;-encode<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a> 
"&amp;", "&lt;", and "&gt;" 
  in any content as "&amp;amp;", "&amp;lt;", and "&amp;gt;".
  <br>Wrong: <kbd>&lt;title&gt;Time &amp; Tides&lt;/title&gt;</kbd>
  <br>Right: &nbsp;&nbsp;<kbd>&lt;title&gt;Time &amp;amp; Tides&lt;/title&gt;</kbd> 
<br>&nbsp;

<li><strong><a class="selfLink" id="noSyntaxErrors" href="#noSyntaxErrors" rel="bookmark">XML doesn't tolerate syntax errors.</a></strong> 
<br>After you edit the datasets.xml file, it is a good idea 
to verify that the result is 
<a rel="help" href="https://www.w3schools.com/xml/xml_dtd.asp">well-formed XML<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a> 
by pasting the XML text into an XML checker like
<a rel="help" href="https://www.xmlvalidation.com/">xmlvalidation<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
<br>&nbsp;

<li><strong><a class="selfLink" id="diagnoseProblems" href="#diagnoseProblems" rel="bookmark">Other Ways To Diagnose</a>
  <a class="selfLink" id="errorMessages" href="#errorMessages" rel="bookmark">Problems With Datasets</a></strong>
<br>In addition to the two main <a rel="help" href="#Tools">Tools</a>, 
the following can help you diagnose problems:
<ul>
<li><a rel="help" href="https://erddap.github.io/setup.html#log">log.txt</a> 
  is a log file with all of ERDDAP's diagnostic messages.
<li>The <a rel="help" href="https://erddap.github.io/setup.html#dailyReport">Daily Report</a> 
    has more information than the status page, including a list of datasets that 
  didn't load and the exceptions (errors) they generated.
<li>The <a rel="help" href="https://erddap.github.io/setup.html#statusPage">Status Page</a> 
  is a quick way to check ERDDAP's status from any web browser.
  It includes a list of datasets that didn't load (although not the related exceptions) and 
  taskThread statistics (showing the progress of 
    <a rel="help" href="#EDDGridCopy">EDDGridCopy</a> and 
    <a rel="help" href="#EDDTableCopy">EDDTableCopy</a> datasets
    and any <a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a> or 
    <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a> datasets that use 
    <a rel="help" href="#cacheFromUrl">cacheFromUrl</a> (but not cacheSizeGB)). 
<li>If you get stuck, please send an email with the details to <kbd>erd dot data at noaa dot gov</kbd>.
    <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.
<br>&nbsp;
</ul>


<li><strong><a class="selfLink" id="LLAT" href="#LLAT" rel="bookmark">The longitude, latitude, altitude (or depth), 
  and time (LLAT) variable</a> <a rel="help" href="#destinationName">destinationName</a>s are special.</strong>
  <ul>
  <li>In general:
    <ul>
    <li>LLAT variables are made known to ERDDAP™ if the axis variable's 
      (for EDDGrid datasets) or data variable's (for EDDTable datasets) 
      <a rel="help" href="#destinationName">destinationName</a> 
      is "longitude", "latitude", "altitude", "depth", or "time".
    <li>We strongly encourage you to use these standard names for these variables 
      whenever possible. None of them is required.
      If you don't use these special variable names, ERDDAP™ won't recognize their significance.
      For example, LLAT variables are treated specially by Make A Graph (<i>datasetID</i>.graph): 
      if the X Axis variable is 
      "longitude" and the Y Axis variable is "latitude", you will get a map (using a standard
      projection, and with a land mask, political boundaries, etc.) instead of a graph.    
    <li>ERDDAP™ will automatically add lots of metadata to LLAT variables (for example, 
      "<a rel="help" href="#ioos_category">ioos_category</a>", 
      "<a rel="help" href="#units">units</a>", 
      and several standards-related attributes like "_CoordinateAxisType"). 
    <li>ERDDAP™ will automatically, on-the-fly, add lots of global metadata 
      related to the LLAT values 
      of the selected data subset (for example, "geospatial_lon_min"). 
    <li>Clients that support these metadata standards will be able to take 
      advantage of the added metadata
      to position the data in time and space.
    <li>Clients will find it easier to generate queries that include LLAT variables
      because the variables' 
      names are the same in all relevant datasets.
    </ul>
  <li>For the <a class="selfLink" id="longitudeVariable" href="#longitudeVariable" rel="bookmark">"longitude"</a> variable and 
      the <a class="selfLink" id="latitudeVariable" href="#latitudeVariable" rel="bookmark">"latitude"</a> variable:
    <ul>
    <li>Use the <a rel="help" href="#destinationName">destinationName</a>s "longitude" and "latitude" only if the <a rel="help" href="#units">units</a> are degrees_east 
      and degrees_north, respectively.
      If your data doesn't fit these requirements, use different variable names 
      (for example, x, y, lonRadians, latRadians).
    <li>If you have longitude and latitude data expressed in different units and thus 
      with different destinationNames, for example, 
      lonRadians and latRadians, Make A Graph (<i>datasetID</i>.graph) 
      will make graphs (for example, time series) instead of maps.
    </ul>
  <li>For the <a class="selfLink" id="altitudeVariable" href="#altitudeVariable" rel="bookmark">"altitude"</a> variable 
    and the <a class="selfLink" id="depthVariable" href="#depthVariable" rel="bookmark">"depth"</a> variable:
    <ul>
    <li>Use the <a rel="help" href="#destinationName">destinationName</a> 
      "altitude" to identify the data's distance 
      above sea level (positive="up" values).  
      Optionally, you may use "altitude" for distances below sea level if the 
      values are negative below the sea (or if you use, for example,
      <br>
      <a rel="help" href="#scale_factor"><kbd>&lt;att name="scale_factor" 
        type="int"&gt;-1&lt;/att&gt;</kbd></a> 
      to convert depth values into altitude values).
    <li>Use the destinationName "depth" to identify the data's distance 
      below sea level (positive="down" values).  
    <li>A dataset may not have both "altitude" and "depth" variables.
    <li>For these variable names, the <a rel="help" href="#units">units</a>
      must be "m", "meter", or "meters".
      If the units are different (for example, fathoms), you can use
      <br>
      <a rel="help" href="#scale_factor"><kbd>&lt;att name="scale_factor"&gt;<i>someValue</i>&lt;/att&gt;</kbd></a> and
      <a rel="help" href="#units"><kbd>&lt;att name="units"&gt;meters&lt;/att&gt;</kbd></a> 
      to convert the units to meters.
    <li>If your data doesn't fit these requirements, 
      use a different destinationName (for example, aboveGround, distanceToBottom).
    <li>If you know the vertical CRS please specify it in the metadata, e.g.,    
       "EPSG:5829" (instantaneous height above sea level), 
       "EPSG:5831" (instantaneous depth below sea level), or "EPSG:5703" (NAVD88 height).
    </ul>
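    <p>As a minimal sketch of the two conversions described above, the fragments below show 
      two alternative variable definitions for an EDDTable dataset. 
      The source names (<kbd>z</kbd>, <kbd>depth_fathoms</kbd>) are hypothetical, 
      other tags (for example, <kbd>&lt;dataType&gt;</kbd>) are omitted, 
      and the fathom factor assumes 1 fathom = 1.8288 m. 
      Use only one of the two in a given dataset, since a dataset may not have both 
      "altitude" and "depth" variables.
<pre>
&lt;!-- Alternative 1: source depths in meters (positive down), served as altitude (positive up) --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;z&lt;/sourceName&gt;                         &lt;!-- hypothetical source name --&gt;
  &lt;destinationName&gt;altitude&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="scale_factor" type="int"&gt;-1&lt;/att&gt;
    &lt;att name="units"&gt;m&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;

&lt;!-- Alternative 2: source depths in fathoms, served as depth in meters --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;depth_fathoms&lt;/sourceName&gt;             &lt;!-- hypothetical source name --&gt;
  &lt;destinationName&gt;depth&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="scale_factor" type="double"&gt;1.8288&lt;/att&gt;  &lt;!-- assumed: meters per fathom --&gt;
    &lt;att name="units"&gt;meters&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>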
  <li>For the <a class="selfLink" id="timeVariable" href="#timeVariable" rel="bookmark">"time"</a> variable:
    <ul>
    <li>Use the <a rel="help" href="#destinationName">destinationName</a> "time" only for variables that include the entire date+time 
      (or date, if that is all there is).
      If, for example, there are separate columns for date and timeOfDay, don't use the variable name "time".
    <li>See <a rel="help" href="#timeUnits">units</a> for more information about the units attribute for time and timeStamp variables.
    <li>The time variable and related 
      <a rel="help" href="#timeStampVariable">timeStamp variables</a> are unique in that they
      always convert data values from the source's time format (whatever it is) into a numeric value
      (seconds since 1970-01-01T00:00:00Z) or a String value (ISO 8601:2004(E) format), depending on the
      situation.
    <li>When a user requests time data, they can request it by specifying the time as a numeric value
      (seconds since 1970-01-01T00:00:00Z) or a String value (ISO 8601:2004(E) format).
    <li>ERDDAP™ has a utility to 
      <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html">Convert
      a Numeric Time to/from a String Time</a>.
    <li>See <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html#erddap">How ERDDAP
       Deals with Time</a>.
      <br>&nbsp;
    </ul>
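    <p>As a hedged sketch of how a source time variable might be declared, 
      the fragment below assumes a hypothetical numeric source variable named <kbd>obs_time</kbd> 
      that stores days since 1900. Other tags (for example, <kbd>&lt;dataType&gt;</kbd>) are omitted. 
      See the units documentation linked above for the formats that are actually accepted.
<pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;obs_time&lt;/sourceName&gt;              &lt;!-- hypothetical source name --&gt;
  &lt;destinationName&gt;time&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;!-- describes the SOURCE values; ERDDAP converts them to
         seconds since 1970-01-01T00:00:00Z (or ISO 8601 Strings) --&gt;
    &lt;att name="units"&gt;days since 1900-01-01T00:00:00Z&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>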
  </ul>

<li><strong><a class="selfLink" id="dataStructures" href="#dataStructures" rel="bookmark">Why just two basic data structures?</a></strong>
  <ul>
  <li>Since it is difficult for human clients and computer clients to deal with a complex set of 
    possible dataset structures, ERDDAP™ uses just two basic data structures: 
    <ul>
    <li>a <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/griddap/documentation.html#dataModel">gridded 
    data structure</a> (for example, for satellite data and model data) and 
    <li>a <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/tabledap/documentation.html#dataModel">tabular 
    data structure</a> (for example, for in-situ buoy, station, and trajectory data).
    </ul>
  <li>Certainly, not all data can be expressed in these structures, but much of it can. 
    Tables, in particular, are very flexible data structures
    (look at the success of relational database programs).
  <li>This makes data queries easier to construct.
  <li>This makes data responses have a simple structure, which makes it easier to serve the data 
    in a wider variety of standard file types (which often just support simple data structures). 
    This is the main reason that we set up ERDDAP™ this way.
  <li>This, in turn, makes it very easy for us (or anyone) to write client software which works with all
    ERDDAP™ datasets.
  <li>This makes it easier to compare data from different sources. 
  <li>We are very aware that if you are used to working with data in other data structures 
    you may initially think that this approach is simplistic or insufficient.
    But all data structures have tradeoffs. None is perfect. 
    Even the do-it-all structures have their downsides: working with them is complex and 
    the files can only be written or read with special software libraries.
    If you accept ERDDAP's approach enough to try to work with it, you may find that it has its
    advantages (notably the support for multiple file types that can hold the data responses).
    The 
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/images/erddapTalk/erddapTechTalk.html">ERDDAP™ slide show</a>
    (particularly the
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/images/erddapTalk/erddapTechTalk.html#dataStructures">data
    structures slide</a>)
    talks a lot about these issues.
  <li>And even if this approach sounds odd to you, most ERDDAP™ clients will never notice --
    they will simply see that all of the datasets have a nice simple structure
    and they will be thankful that they can get data from a wide variety of sources returned in a
    wide variety of file formats.
    <br>&nbsp;
  </ul>

<li><strong><a class="selfLink" id="differentDimensions" href="#differentDimensions" rel="bookmark">What if the grid variables in the source dataset DON'T share the same axis variables?</a></strong>
<br>In EDDGrid datasets, all data variables MUST use (share) all of the axis variables. 
  So if a source dataset has some variables with one set of dimensions, and other variables with
  a different set of dimensions, you will have to make two datasets in ERDDAP.
  For example, you might make one ERDDAP™ dataset entitled "Some Title (at surface)" to hold variables
  that just use [time][latitude][longitude] dimensions and make another ERDDAP™ dataset entitled 
  "Some Title (at depths)" to hold the variables that use [time][altitude][latitude][longitude].
  Or perhaps you can change the data source to add a dimension with a single value (for example,
  altitude=0) to make the variables consistent. 
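  <p>The two-dataset approach can be sketched roughly as follows. 
    This is an outline under stated assumptions, not a complete configuration: 
    the datasetIDs and data variable names (<kbd>sst</kbd>, <kbd>temperature</kbd>) are hypothetical, 
    <kbd>EDDGridFromNcFiles</kbd> is just one possible dataset type, 
    and the other required tags are omitted.
<pre>
&lt;!-- Dataset 1: variables that use [time][latitude][longitude] --&gt;
&lt;dataset type="EDDGridFromNcFiles" datasetID="someTitleSurface" active="true"&gt;
  &lt;!-- other required tags omitted --&gt;
  &lt;addAttributes&gt;
    &lt;att name="title"&gt;Some Title (at surface)&lt;/att&gt;
  &lt;/addAttributes&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;time&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;latitude&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;longitude&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;dataVariable&gt;&lt;sourceName&gt;sst&lt;/sourceName&gt;&lt;/dataVariable&gt;
&lt;/dataset&gt;

&lt;!-- Dataset 2: variables that use [time][altitude][latitude][longitude] --&gt;
&lt;dataset type="EDDGridFromNcFiles" datasetID="someTitleDepths" active="true"&gt;
  &lt;!-- other required tags omitted --&gt;
  &lt;addAttributes&gt;
    &lt;att name="title"&gt;Some Title (at depths)&lt;/att&gt;
  &lt;/addAttributes&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;time&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;altitude&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;latitude&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;axisVariable&gt;&lt;sourceName&gt;longitude&lt;/sourceName&gt;&lt;/axisVariable&gt;
  &lt;dataVariable&gt;&lt;sourceName&gt;temperature&lt;/sourceName&gt;&lt;/dataVariable&gt;
&lt;/dataset&gt;
</pre>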
  <p>ERDDAP™ doesn't handle more complicated datasets (for example, models that use a mesh
  of triangles) well. You can serve these datasets in ERDDAP™ by creating two or more datasets in
  ERDDAP™ (so that all data variables in each new dataset share the same set of axis variables),
  but that isn't what users want.  
  For some datasets, you might consider making a regular gridded version of the dataset and offering
  that in addition to the original data. Some client software can only deal with a regular grid, 
  so by doing this, you reach additional clients.
  <br>&nbsp;

<li><strong><a class="selfLink" id="projections" href="#projections" rel="bookmark">Projected Gridded Data</a></strong>
<br>Some gridded data has a complex structure.
  For example, satellite level 2 ("along track") data does not use a simple projection.
  Modelers (and others) often work with gridded data on various
  non-cylindrical projections (for example, conic, polar stereographic, tripolar)
  or in unstructured grids (a more complex data structure).  
  Some end users want this data as is, so there is no loss of information. 
  For those clients, ERDDAP™ can serve the data, as is, only if the ERDDAP™ administrator 
  breaks the original dataset into a few datasets, with each part including variables 
  which share the same axis variables. 
  Yes, that seems odd to people involved and it is different from most OPeNDAP servers.
  But ERDDAP™ emphasizes making the data available in many formats. 
  That is possible because ERDDAP™ uses/requires a more uniform data structure. 
  Although it is a little awkward (i.e., different than expected), ERDDAP™ can distribute 
  the projected data.

<p>[Yes, ERDDAP™ could have looser requirements for the data structure, 
  but keep the requirements for
  the output formats. But that would lead to confusion among many users, particularly newbies,
  since many seemingly valid requests for data with different structures would be invalid
  because the data wouldn't fit into the file type. 
  We keep coming back to the current system's design.]

<p>Some end users want data in a lat lon cylindrical projection (like 
  Equirectangular / plate carr&eacute;e or Mercator)
  for ease-of-use in different situations. 
  For these situations, we encourage the ERDDAP™ administrator to use some
  other software (NCO? Matlab? R? IDV? ...?) to re-project the 
  data onto a geographic (Equirectangular projection / plate carr&eacute;e) or other
  cylindrical projection and serve that form of the data in ERDDAP™ as a different dataset. 
  This is similar to what people do when they convert satellite level 2 data into level 3 data.
One such tool is 
<a rel="help" href="https://nco.sourceforge.net/nco.html#Regridding">NCO<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
which offers extensive options for regridding data.

<p><strong>GIS and <a class="selfLink" id="ReprojectingData" href="#ReprojectingData" rel="bookmark">Reprojecting Data</a></strong> -- 
Since the GIS world is often map oriented, GIS programs usually offer
support for reprojecting the data, 
i.e., plotting the data on a map with a different projection.

<p>Currently, ERDDAP™ does not have tools to reproject data. Instead, we recommend
that you use an external tool to make a variant of the dataset, 
where data has been reprojected from its original form onto a rectangular 
(latitude longitude) array suitable for ERDDAP.

<p>In our opinion, the CF/DAP world is a little different than the GIS world 
and works at a slightly lower level. ERDDAP™ reflects that.
In general, ERDDAP™ is designed to work primarily with data (not maps) 
and doesn't want to change (e.g., reproject) that data. For ERDDAP™, 
gridded data is often/usually/preferably associated with lat lon values and a cylindrical projection,
and not some projection's x,y values. In any case, ERDDAP™ doesn't do anything 
with the data's projection; it just passes the data through, as is, 
with its current projection, on the theory that a reprojection is a significant 
change to the data and ERDDAP™ doesn't want to be involved with significant changes. 
Also, subsequent users might naively reproject the data again,
which would not be as good as doing just one reprojection.
(So, if the ERDDAP™ administrator wants to offer the data in a different projection, fine; 
just reproject the data offline and offer that as a different dataset in ERDDAP. 
Lots of satellite-based datasets are offered as what NASA calls 
Level 2 (swath) and as Level 3 (Equirectangular projection) versions.)  
When ERDDAP™ makes maps (directly or via WMS or KML), 
ERDDAP™ currently only offers to make maps with the Equirectangular / plate carr&eacute;e 
projection which, fortunately, is accepted by most mapping programs.


<p>We hope that ERDDAP™ will have built-in tools to offer maps with other projections in the future. 
We also hope to have better connections to the GIS world in the future 
(other than the current WMS service). 
It is terrible that in this "modern" world, the links between the CF/DAP world 
and the GIS world are still so weak.
Both of those things are on the To Do list.
(If you want to help, notably with connecting ERDDAP™ to MapServer, 
please email Chris.John at noaa.gov .)


<li><strong><a class="selfLink" id="dataTypes" href="#dataTypes" rel="bookmark">Data Types</a></strong>
<br>ERDDAP™ supports the following data types
<br>(the names are case sensitive; the 'u' prefix stands for "unsigned"; the number in many of the names in other systems is the number of bits):
<ul>
<li><strong>byte</strong> has signed integer values with a range of -128 to 127.
  <br>In other systems, this is sometimes called int8. 
  <br>This is called "tinyint" by SQL and Cassandra.
  <br>ERDDAP™ converts <a rel="help" href="#booleanData">boolean</a> from some sources (e.g., SQL and Cassandra)
    into bytes in ERDDAP™ with a value of 0=false, 1=true, and 127=missing_value.
<li><strong>ubyte</strong> has unsigned integer values with a range of 0 to 255.
  <br>In other systems, this is sometimes called uint8.
<li><strong>short</strong> has signed integer values with a range of -32768 to 32767.
  <br>In other systems, this is sometimes called int16.
  <br>This is called "smallint" by SQL and Cassandra.
<li><strong>ushort</strong> has unsigned integer values with a range of 0 to 65535.
  <br>In other systems, this is sometimes called uint16.
<li><strong>int</strong> has signed integer values with a range of -2147483648 to 2147483647.
  <br>In other systems, this is sometimes called int32. 
  <br>This is called "integer|numeric(?)" by SQL and "int" by Cassandra.
<li><strong>uint</strong> has unsigned integer values with a range of 0 to 4294967295.
  <br>In other systems, this is sometimes called uint32. 
<li><strong>long</strong> has signed integer values with a range of -9223372036854775808 to 9223372036854775807.
  <br>In other systems, this is sometimes called int64.
  <br>This is called "bigint|numeric(?)" by SQL and "bigint" by Cassandra.
  <br>Because many file types don't support long data, the use of long is discouraged.
    When possible, use double instead (see below).
<li><strong>ulong</strong> has unsigned integer values with a range of 0 to 18446744073709551615.
  <br>In other systems, this is sometimes called uint64.
  <br>Because many file types don't support ulong data, the use of ulong is discouraged.
    When possible, use double instead (see below).
<li><strong>float</strong> is an IEEE 754 float with a range of approximately +/- 3.402823466e+38.
  <br>In other systems, this is sometimes called float32. 
  <br>This is called "real|float(?)|decimal(?)|numeric(?)" by SQL and "float" by Cassandra.
  <br>The special value <kbd>NaN</kbd> means Not-a-Number.
  <br>ERDDAP™ converts positive and negative infinity values to NaN.
<li><strong>double</strong> is an IEEE 754 double with a range of approximately 
  <br>+/- 1.7976931348623157E+308.
  <br>In other systems, this is sometimes called float64. 
  <br>This is called "double precision|float(?)|decimal(?)|numeric(?)" by SQL and "double" by Cassandra.
  <br>The special value <kbd>NaN</kbd> means Not-a-Number.
  <br>ERDDAP™ converts positive and negative infinity values to NaN.
<li><a class="selfLink" id="charData" href="#charData" rel="bookmark"><strong>char</strong></a>
  is a single, 2-byte (16-bit)
  <a rel="help" href="https://en.wikipedia.org/wiki/UTF-16"
      >Unicode UCS-2 character<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
   ranging from \u0000 (#0) through \uffff (#65535).
  <br>\uffff's definition is Not-a-Character, analogous to a double value of NaN.
  <br>The use of char is discouraged because many file types either don't support chars or only support
      1-byte chars (see below). Consider using String instead.
  <br>Users can use char variables to make graphs. ERDDAP™ will convert
      the characters to their Unicode code point number, which 
      can be used as numeric data.
<li><a class="selfLink" id="StringData" href="#StringData" rel="bookmark"><strong>String</strong></a>
  is a sequence of 0 or more, 2-byte (16-bit)
  <a rel="help" href="https://en.wikipedia.org/wiki/UTF-16"
      >Unicode UCS-2 characters<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
  <br>ERDDAP™ uses/interprets a 0-length string as a missing value. ERDDAP™ does not support a true null string.
  <br>The theoretical maximum string length is 2147483647 characters, 
  but there are probably various problems in various places even with somewhat shorter Strings.
  <br>Use ERDDAP's String for SQL's character, varchar, character varying, binary, varbinary, interval, 
    array, multiset, xml, and any other database data type that doesn't fit cleanly
    with any other ERDDAP™ data type.
  <br>Use ERDDAP's String for Cassandra's "text" and any other Cassandra data type 
    that doesn't fit cleanly with any other ERDDAP™ data type.
  <br>&nbsp;
</ul>
Before ERDDAP™ v2.10, ERDDAP™ did not support unsigned integer types internally
  and offered limited support in its data readers and writers.

<p><a class="selfLink" id="DataTypeLimitations" href="#DataTypeLimitations" rel="bookmark"
    ><strong>Data Type Limitations</strong></a>
<br>You can think of ERDDAP™ as a system which has virtual datasets,
and which works by reading data from a dataset's source into an internal data model
and writing data to various services (e.g., (OPeN)DAP, WMS) and file types in response to user requests.
  <ul>
  <li>Each input reader supports a subset of the data types that ERDDAP™ supports.
     So reading data into ERDDAP's internal data structures isn't a problem.
  <li>Each output writer also supports a subset of data types. That's a problem because ERDDAP
    has to squeeze, for example, long data into file types that don't support long data.
    <br>&nbsp;
  </ul>
Below are explanations of the limitations (or none) of various output writers
and how ERDDAP™ deals with the problems.
Such complications are an inherent part of ERDDAP's goal of making disparate systems interoperable.
  <ul>
  <li><a class="selfLink" id="AsciiDataTypes" href="#AsciiDataTypes" rel="bookmark"
    >ASCII (.csv, .tsv, etc.) text files</a> -
    <ul>
    <li>All numeric data is written via its String representation
      (with missing data values appearing as 0-length strings).
    <li>Although ERDDAP™ writes long and ulong values correctly to ASCII text files,
      many readers (e.g., spreadsheet programs) can't correctly deal with long and ulong values and instead
      convert them to double values (with loss of precision in some cases).
    <li>Char and String data are written via JSON Strings, which handle all Unicode characters
      (notably, the "unusual" characters beyond ASCII #127, e.g., the Euro character appears as "\u20ac").
    </ul>
    <br>&nbsp;
  <li><a class="selfLink" id="JsonDataTypes" href="#JsonDataTypes" rel="bookmark"
    >JSON (.json, .jsonlCSV, etc.) text files</a> -
    <ul>
    <li>All numeric data is written via its String representation.
    <li>Char and String data are written as JSON Strings, which handle all Unicode characters
    (notably, the "unusual" characters beyond ASCII #127, e.g., the Euro character appears as "\u20ac").
    <li>Missing values for all numeric data types appear as <kbd>null</kbd>.
      <br>&nbsp;
    </ul>
  <li><a class="selfLink" id="nc3DataTypes" href="#nc3DataTypes" rel="bookmark">.nc3 files</a> -
    <ul>
    <li>.nc3 files don't natively support any unsigned integer data types. 
      Before CF v1.9, CF did not support unsigned integer types.
      To deal with this, ERDDAP™ 2.10+
      follows the NUG standard and always adds an "_Unsigned" attribute with a value of "true" or "false"
      to indicate if the data is from an unsigned or signed variable. All integer attributes
      are written as signed attributes (e.g., byte) with signed values. For example, a ubyte actual_range attribute
      with values 0 to 255 appears as a byte attribute with values 0 to -1 (the
      two's complement interpretation of the out-of-range value 255). There is no easy way to know which (signed) integer
      attributes should be read as unsigned attributes.
      ERDDAP™ supports the "_Unsigned" attribute when it reads .nc3 files.
    <li>.nc3 files don't support the long or ulong data types.
      ERDDAP™ deals with this by temporarily converting them to be double variables.
      Doubles can exactly represent all integer values up to +/- 9,007,199,254,740,992 which is 2^53.
      This is an imperfect solution. Unidata refuses to make a minor
      upgrade to .nc3 to deal with this and related problems, citing .nc4 (a major change) as the solution.
    <li>The CF specification (before v1.9) said it supports a char data type
      but it is unclear if char is intended only as the building blocks of char arrays, which are effectively Strings.
      Questions to their mailing list yielded only confusing answers.
      Because of these complications, it is best to avoid char variables in ERDDAP™ and use String variables whenever possible.
    <li>Traditionally, .nc3 files only supported strings with ASCII-encoded (7-bit, #0 - #127) characters.
      NUG (and ERDDAP) extend that (starting ~2017) by including the attribute "_Encoding" with a value of
      "ISO-8859-1" (an extension of ASCII which defines all 256 values of each 8-bit character)
      or "UTF-8" to indicate how the String data is encoded. Other encodings may be legal but are discouraged.
      <br>&nbsp;
    </ul>
  <li><a class="selfLink" id="nc4DataTypes" href="#nc4DataTypes" rel="bookmark">.nc4 files</a>
    support all of ERDDAP's data types.
    <br>&nbsp;
  <li><a class="selfLink" id="NccsvDataTypes" href="#NccsvDataTypes" rel="bookmark">NCCSV files</a> -
    <br>NCCSV 1.0 files don't support any unsigned integer data types.
    <br><a rel="help" href="https://erddap.github.io/NCCSV.html">NCCSV 1.1+ files</a>
    support all unsigned integer data types.
    <br>&nbsp;
  <li><a class="selfLink" id="OpendapDataTypes" href="#OpendapDataTypes" rel="bookmark"
    >(OPeN)DAP (.das, .dds, .asc ASCII files, and .dods binary files)</a> -
    <ul>
    <li>(OPeN)DAP handles short, ushort, int, uint, float and double values correctly.
    <li>(OPeN)DAP has a "byte" data type that it defines as unsigned, whereas historically,
      THREDDS and ERDDAP™ have treated "byte" as signed in their (OPeN)DAP services. 
      To deal with this better, ERDDAP™ 2.10+
      follows the NUG standard and always adds an "_Unsigned" attribute with a value of "true" or "false"
      to indicate if the data is what ERDDAP™ calls byte or ubyte. All byte and ubyte attributes
      are written as "byte" attributes with signed values. For example, a ubyte actual_range attribute
      with values 0 to 255 appears as a byte attribute with values 0 to -1 (the
      two's complement interpretation of the out-of-range value 255). There is no easy way to know which "byte"
      attributes should be read as ubyte attributes.
    <li>(OPeN)DAP does not support signed or unsigned longs.
      ERDDAP™ deals with this by temporarily converting them to be double variables and attributes.
      Doubles can exactly represent all integer values up to +/- 9,007,199,254,740,992 which is 2^53.
      This is an imperfect solution. OPeNDAP (the organization) refuses to make a minor
      upgrade to DAP 2.0 to deal with this and related problems, citing DAP4 (a major change) as the solution.
    <li>Because (OPeN)DAP has no separate char data type and technically only supports 
       1-byte ASCII characters (#0 - #127) in Strings,
       char data variables will appear as 1-character-long Strings in 
       (OPeN)DAP .das, .dds, and .dods responses.
    <li>Technically, the (OPeN)DAP specification only supports strings with ASCII-encoded characters (#0 - #127).
      NUG (and ERDDAP) extend that (starting ~2017) by including the attribute "_Encoding" with a value of
      "ISO-8859-1" (an extension of ASCII which defines all 256 values of each 8-bit character)
      or "UTF-8" to indicate how the String data is encoded. Other encodings may be legal but are discouraged.
      <br>&nbsp;
    </ul>
  </ul>  
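<p>The signed/unsigned reinterpretation described above for .nc3 and (OPeN)DAP can be
sketched as follows. This is only an illustration of two's complement arithmetic, not
ERDDAP's code; the function names are hypothetical.

```python
def unsigned_to_signed(value: int, bits: int = 8) -> int:
    """Reinterpret an unsigned integer's bit pattern as a signed two's complement value."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in {bits} unsigned bits")
    # Values with the top bit set wrap around to negative numbers.
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def signed_to_unsigned(value: int, bits: int = 8) -> int:
    """Recover the unsigned value, as a reader can do when _Unsigned is "true"."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # Masking keeps only the low 'bits' bits of the two's complement pattern.
    return value & ((1 << bits) - 1)


# A ubyte actual_range of 0 to 255 is stored as the signed bytes 0 to -1.
stored = [unsigned_to_signed(v) for v in (0, 255)]
restored = [signed_to_unsigned(v) for v in stored]
print(stored, restored)
```

<p>Without the "_Unsigned" attribute, a reader has no way to tell whether a stored -1
means -1 or 255, which is why the attribute is always written.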
  
<a class="selfLink" id="dataTypesOtherComments" href="#dataTypesOtherComments" rel="bookmark">Other comments:</a>
  <ul>
  <li>Because of the poor support for long, ulong, and char data in many file types,
    we discourage the use of these data types in ERDDAP. When possible,
    use double instead of long and ulong, and use String instead of char.
    <br>&nbsp;
  <li>Metadata - Because (OPeN)DAP's .das and .dds responses don't support long or ulong attributes or data types
     (and instead show them as doubles),
     you may instead want to use ERDDAP's tabular representation of metadata as seen in the
     http.../erddap/<strong>info</strong>/<i>datasetID</i>.html web page 
      (for example, <a rel="help"
        href="https://coastwatch.pfeg.noaa.gov/erddap/info/cwwcNDBCMet/index.html"
             >https://coastwatch.pfeg.noaa.gov/erddap/info/cwwcNDBCMet/index.html</a> )
      (which you can also get in other file types, e.g., 
      .csv, .htmlTable,  .itx, .json, .jsonlCSV1, .jsonlCSV, .jsonlKVP, .mat, .nc, .nccsv, .tsv, .xhtml)
      or the .nccsvMetadata response
      (for example, <a rel="help"
        href="https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.nccsvMetadata"
             >https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.nccsvMetadata</a> 
      although .nccsvMetadata is only available for tabular datasets),
      both of which support all data types (notably, long, ulong, and char).
    <br>&nbsp;
  </ul> 
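<p>To see why converting long and ulong values to doubles is an imperfect solution,
here is a small sketch that checks whether an integer survives a round trip through a
64-bit double. Python's float is an IEEE 754 double; the helper name is hypothetical.

```python
MAX_EXACT = 2 ** 53  # 9,007,199,254,740,992


def survives_double(value: int) -> bool:
    """True if the integer is unchanged after a round trip through a 64-bit double."""
    return int(float(value)) == value


# Every integer up to 2^53 in magnitude is exact; beyond that, some are not.
for candidate in (MAX_EXACT, -MAX_EXACT, MAX_EXACT + 1, 2 ** 63 - 1, 2 ** 64 - 1):
    print(candidate, survives_double(candidate))
```

<p>The largest long (2^63 - 1) and the largest ulong (2^64 - 1) both fail this check,
so such values lose precision when shown as doubles.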


<li><strong><a class="selfLink" id="MediaFiles" href="#MediaFiles" rel="bookmark"
  >Media Files</a></strong>
<br>Not all data are arrays of numbers or text. Some datasets consist of 
or include media files, such as image, audio and video files.
ERDDAP™ has some special features to make it easier for users to get access
to media files. It's a two-step process:
<br>&nbsp;

<ol>
<li>Make each file accessible via its own URL, via a system that supports byte range requests.
<br>The easiest way to do this is to put the files in a directory 
that ERDDAP™ has access to. (If they are in a container like a .zip file, unzip them,
although you may want to offer the .zip file to users too.)
Then, make an
<a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a>
dataset to make those files accessible via ERDDAP™, notably via ERDDAP's
<a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/files/documentation.html"
  >"files" system</a>.

<p>All files made accessible via EDDTableFromFileNames and ERDDAP's "files" system 
support
<a rel="help" href="https://en.wikipedia.org/wiki/Byte_serving">byte&nbsp;range&nbsp;requests<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
Normally, when a client (e.g., a browser) makes a request to a URL, it gets the 
entire file as the response. 
But with a byte range request, the request specifies a range of bytes from the file,
and the server only returns those bytes.
This is relevant here because the audio and video players in browsers only work 
if the file can be accessed via byte range requests.

<p>Optional: If you have more than one dataset with associated media files,
you can make just one EDDTableFromFileNames which has a subfolder for each
group of files. The advantage is that when you want to add new media files for 
a new dataset, all you have to do is create a new folder and put the files
in that folder. The folder and files will be automatically added to the 
EDDTableFromFileNames dataset.

<li>Optional: If you have a dataset which includes references to media files, add it to ERDDAP.
<br>For example, you may have a .csv file with a row for each time someone saw 
a whale and a column which includes the name of an image file related to that sighting.
If the name of the image file is just the filename, e.g., Img20141024T192403Z,
not a full URL, then you need to add 
<a href="#fileAccessBaseUrl" rel="bookmark">fileAccessBaseUrl and/or fileAccessSuffix</a> 
attributes to the metadata for that dataVariable
which specifies the baseURL and suffix for those filenames.
If you made the files accessible via EDDTableFromFileNames, the URL will be in the form
<br><i>baseUrl</i>/erddap/files/<i>datasetID</i>/
<br>For example,
<br><kbd>&lt;att name="fileAccessBaseUrl"&gt;<i>someBaseURL</i>&lt;/a&gt;</kbd>
<br><kbd>&lt;att name="fileAccessSuffix"&gt;.png&lt;/a&gt;</kbd>
<p>If there is a .zip or other container file with all of the media files related to 
a data variable, we recommend that you also make that file accessible to users 
(see step 1 above) and then identify
it with a
<a href="#fileAccessArchiveUrl" rel="bookmark">fileAccessArchiveUrl</a> attribute.
</ol>
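<p>The byte range requests mentioned in step 1 can be pictured with the simplified
sketch below. It handles only a single range in the HTTP "Range: bytes=start-end" form
and is not ERDDAP's implementation; the function name and return shape are hypothetical.

```python
import re


def serve_byte_range(data: bytes, range_header: str):
    """Return (status, content_range, body) for a single 'bytes=start-end' range."""
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or match.group(1) == match.group(2) == "":
        return 200, None, data  # unusable header: fall back to the whole file
    start_text, end_text = match.groups()
    size = len(data)
    if start_text == "":
        # Suffix form 'bytes=-N' asks for the last N bytes.
        start = max(size - int(end_text), 0)
        end = size - 1
    else:
        start = int(start_text)
        end = min(int(end_text), size - 1) if end_text else size - 1
    if start > end or start >= size:
        return 416, f"bytes */{size}", b""  # range not satisfiable
    return 206, f"bytes {start}-{end}/{size}", data[start:end + 1]


file_bytes = b"0123456789"
print(serve_byte_range(file_bytes, "bytes=2-5"))
print(serve_byte_range(file_bytes, "bytes=-3"))
```

<p>A browser's audio or video player relies on this kind of partial response to seek
within a file without downloading all of it.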

<p>[Starting in ERDDAP™ v1.82] If you do the first step above (or both steps), 
then when a user views the ERDDAP™ "files" system for 
that dataset 
(or asks to see a subset of the dataset via an .htmlTable request, if you did the second step),
ERDDAP™ will show a '?' icon to the left of the filename. 
If the user hovers over that icon, they will see a popup showing the image,
or an audio player, or a video player. 
Browsers only support a limited number of types of 
<ul>
<li>image (usually .gif, .jpg, and .png), 
<li>audio (usually .mp3, .ogg, and .wav), and 
<li>video files (usually .mp4, .ogv, and .webm).
</ul>
<p>Support varies with different versions of different browsers on different
operating systems. So if you have a choice of which file type to offer,
it makes sense to offer these types. 

<p>Or, if a user clicks on the filename shown on an ERDDAP™ web page, 
their browser will show the image,
audio or video file as a separate web page. This is mostly useful
to see a very large image or video scaled to full screen, instead of in a popup.
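<p>As a rough picture of how the fileAccessBaseUrl and fileAccessSuffix attributes from
step 2 relate to the link a user sees, the sketch below assumes the URL is simply the
base URL, the stored file name, and the suffix joined together. The host name and
datasetID are hypothetical, and the helper is not part of ERDDAP.

```python
def media_file_url(file_name: str, base_url: str = "", suffix: str = "") -> str:
    """Join the base URL, the stored file name, and the suffix into one URL."""
    return f"{base_url}{file_name}{suffix}"


# Hypothetical host and datasetID; the file name is the one used in step 2.
url = media_file_url(
    "Img20141024T192403Z",
    base_url="https://example.com/erddap/files/whaleImages/",
    suffix=".png",
)
print(url)
```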


<li><a class="selfLink" id="AwsS3Files" href="#AwsS3Files" rel="bookmark"><strong>Working with AWS S3 Files</strong></a>
<br><a rel="help" href="https://aws.amazon.com">Amazon Web Services (AWS)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
is a seller of 
<a rel="help" href="https://en.wikipedia.org/wiki/Cloud_computing">cloud computing<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
services.
<a rel="help" href="https://aws.amazon.com/s3/">S3<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
is an object storage system offered by AWS.
Instead of the hierarchical system of directories and files of a traditional file system
(like a hard drive in your PC),
S3 offers just "buckets" which hold "objects" (we'll call them "files").

<p>For ASCII files (e.g., .csv), ERDDAP™ can work with the files in the buckets
directly. The only thing you need to do is specify the &lt;fileDir&gt; for the 
dataset using a specific format for the AWS bucket, e.g.,
https://<i>bucketName</i>.s3.<i>aws-region</i>.amazonaws.com/<i>subdirectory</i>/ .
You should not use &lt;cacheFromUrl&gt; . 
See below for details.

<p>But for binary files (e.g., .nc, .grib, .bufr, and .hdf files), you do need to use 
the &lt;cacheFromUrl&gt; system described below.
ERDDAP, netcdf-java (which ERDDAP™ uses to read data from these files),
and other scientific data software are designed to work with 
files in a traditional file system which offers 
<a rel="help" href="https://en.wikipedia.org/wiki/Block-level_storage"
  >block level<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
access to files (which permits reading chunks of a file),
but S3 only offers 
<a rel="help" href="https://en.wikipedia.org/wiki/Block-level_storage"
  >file level (object)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
access to files (which only permits reading the entire file).
AWS offers an alternative to S3,
<a rel="help" href="https://aws.amazon.com/ebs/"
  >Elastic Block Store (EBS)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
which supports block level access to files 
but it is more expensive than S3, so it is rarely used for bulk storage of large
quantities of data files.
(So when people say storing data in the cloud (S3) is cheap, it is usually an 
apples to oranges comparison.)

<p><a class="selfLink" id="AwsS3Buckets" href="#AwsS3Buckets" rel="bookmark"
  ><strong>The Contents of a Bucket. Keys. Objects. Delimiters.</strong></a>
<br>Technically, S3 buckets aren't organized in a hierarchical file structure like a file system on a computer.
Instead, buckets only contain "objects" (files), each of which has a "key" (a name).
An example of a key in the noaa-goes17 bucket is 
<pre>ABI-L1b-RadC/2019/235/22/OR_ABI-L1b-RadC-M6C01_G17_s20192352201196_e20192352203569_c20192352204013.nc</pre>
The corresponding URL for that object is 
<pre><a rel="help" 
  href="https://noaa-goes17.s3.us-east-1.amazonaws.com/ABI-L1b-RadC/2019/235/22/OR_ABI-L1b-RadC-M6C01_G17_s20192352201196_e20192352203569_c20192352204013.nc"
       >https://noaa-goes17.s3.us-east-1.amazonaws.com/ABI-L1b-RadC/2019/235/22/OR_ABI-L1b-RadC-M6C01_G17_s20192352201196_e20192352203569_c20192352204013.nc<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a></pre>
AWS supports a little variation in how that URL is constructed, 
but ERDDAP™ requires this one specific format:
<br>&nbsp;&nbsp;https://<i>bucketName</i>.s3.<i>region</i>.amazonaws.com/<i>key</i> 
<br>It is common practice, as with this example, to make key names look like a hierarchical path 
plus a file name, but technically they aren't.
Since it is common and useful, ERDDAP™ treats keys with /'s as if they are a hierarchical path plus file name,
and this documentation will refer to them as such.
If a bucket's keys don't use /'s (e.g., a key like 
<br>ABI-Lib.2018.052.22.OR_ABI-L1b-RadM2-M3C10_G16_s20180522247575), then 
ERDDAP™ will just treat the whole key as a long file name.
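<p>The URL format and the "path plus file name" reading of a key can be sketched like
this. The regular expression and function name are illustrative assumptions, not
ERDDAP's own parsing code.

```python
import re

# https://bucketName.s3.region.amazonaws.com/key
S3_URL = re.compile(
    r"https://(?P<bucket>[^/]+?)\.s3\.(?P<region>[^./]+)\.amazonaws\.com/(?P<key>.*)"
)


def split_s3_url(url: str):
    """Return (bucket, region, directory, file_name) for a URL in the format above."""
    match = S3_URL.fullmatch(url)
    if match is None:
        raise ValueError(f"not in the expected S3 URL format: {url}")
    # Everything before the last '/' is treated as a directory path;
    # a key without any '/' is treated as one long file name.
    directory, _, file_name = match.group("key").rpartition("/")
    return match.group("bucket"), match.group("region"), directory, file_name


example = (
    "https://noaa-goes17.s3.us-east-1.amazonaws.com/ABI-L1b-RadC/2019/235/22/"
    "OR_ABI-L1b-RadC-M6C01_G17_s20192352201196_e20192352203569_c20192352204013.nc"
)
print(split_s3_url(example))
```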

<p>Private vs Public Buckets -- The administrator for the S3 bucket may make
the bucket and its contents public or private. If public, any file in the bucket
may be downloaded by anyone using the URL for the file. Amazon has an
<a rel="help" 
  href="https://aws.amazon.com/opendata/">Open Data<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
program which hosts public datasets (including data from NOAA, NASA, and USGS) for free
and doesn't charge for anyone to download the files from those buckets.
If a bucket is private, files in the bucket are only accessible to authorized
users and AWS charges a fee (usually paid by the bucket's owner) for downloading
files to a non-AWS S3 computer. ERDDAP™ can work with data in public and private 
buckets.
<ul>
<li><a class="selfLink" id="AwsCredentials" href="#AwsCredentials" rel="bookmark"
   >AWS Credentials</a> -- 
   To make it so that ERDDAP™ can read the contents of private buckets, you need AWS credentials and
   you need to store a credentials file in the standard place so ERDDAP™ can find the information.
  See the AWS SDK for Java 2.x documentation: <a rel="help" 
    href="https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/setup.html#setup-credentials"
         >Set default credentials<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
   (The option to store the values as Java command line parameters in [tomcat]/bin/setenv.sh
   may be a good option.)

<li>/files/ system -- The ERDDAP™ <a rel="help" href="#accessibleViaFiles">/files/ system</a> 
  allows users to download the source files for a dataset. 
  We recommend that you turn this on for all datasets with source files because
  many users want to download the original source files.
  <ul>
  <li>If the files are in a private S3 bucket, the user's request to download a file
    will be handled by ERDDAP™, which will read the data from the file and then
    transmit it to the user, thus increasing the load on your ERDDAP™, 
    using incoming and outgoing bandwidth, and making you (the ERDDAP™ administrator) 
    pay the data egress fee to AWS. 
  <li>If the files are in a public S3 bucket, the user's request to download a file
    will be redirected to the AWS S3 URL for that file, so the data won't flow
    through ERDDAP™, thus reducing the load on ERDDAP.
    And if the files are in an Amazon Open Data (free) public bucket, 
    then you (the ERDDAP™ administrator) won't have to pay any data egress fee to AWS.
    Thus, there is a big advantage serving data from public (not private) S3 buckets,
    and a huge advantage to serving data from Amazon Open Data (free) buckets.
  </ul>

</ul>

<p><a class="selfLink" id="ErddapAndS3Buckets" href="#ErddapAndS3Buckets" rel="bookmark"
  ><strong>ERDDAP™ and AWS S3 Buckets</strong></a>
<br>Fortunately, after much effort, 
ERDDAP™ has a number of features which allow it to deal with the 
inherent problems of working with S3's lack of block level access to files
in a reasonably efficient way:
<ul>
<li>[Disclaimer: Working with AWS S3 buckets is a lot of extra work.
  AWS is a huge ecosystem of services and features. 
  There's a lot to learn. 
  It takes time and effort, but it is do-able.
  Be patient and you'll get things working.
  Look/ask for help
  <br>(<a rel="help" href="https://aws.amazon.com/documentation/gettingstarted/"
    >AWS documentation<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>,
  websites like 
  <a rel="help" href="https://stackoverflow.com/"
    >Stack Overflow<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>,
  and the regular 
  <br><a rel="bookmark" href="#contact">ERDDAP™ support options</a>)
  if/when you get stuck.]
  <br>&nbsp;

<li>It can be hard to even find out the directory structure and file names of 
  the files in an S3 bucket. ERDDAP™ has a solution for this problem:
  EDDTableFromFileNames has a special 
  <a href="#fromOnTheFly" rel="help">***fromOnTheFly</a> option which 
  lets you make an EDDTableFromFileNames dataset which allows users to 
  browse the contents of an S3 bucket (and download files)
  via the dataset's "files" option.
  There is an 
  <a href="#AwsS3ViewBucketContents" rel="help">example of this below</a>.
  <br>&nbsp;

<li>ERDDAP™ can read data from 
  <a rel="help" href="#ExternallyCompressedFiles">externally compressed data files</a>,
  so it is fine if the files on S3 are stored as 
  .gz, .gzip, .bz2, .Z, or other types of externally compressed data files,
  which can dramatically (2 - 20X) cut down on file storage costs. 
  There is often no time penalty for using externally compressed files,
  since the time saved by transferring a smaller file from S3 to ERDDAP
  roughly balances the extra time needed for ERDDAP™ to decompress the file.
  To use this feature, you just have to make sure that the dataset's
  <kbd>&lt;fileNameRegex&gt;</kbd> allows for the compressed file type (e.g.,
  by adding <kbd>(|.gz)</kbd> to the end of the regex).
  <br>&nbsp;

<li>For the most common case, where you have an ERDDAP™ installed on your PC
for test/development and where the dataset has binary data files which are stored
as objects in an S3 bucket, one approach to getting the dataset in ERDDAP™ is:
  <ol>
  <li>Create a directory on your PC to hold a few test data files.
  <li>Download two data files from the source to the directory you just created.
  <li>Use 
    <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>
    to generate the chunk of datasets.xml for the dataset
    based on the two local data files.
  <li>Check that that dataset works as desired with 
    <a rel="help" href="#DasDds">DasDds</a> and/or your local ERDDAP.
    <p><strong>The following steps make a copy of that dataset
    (which will get data from the S3 bucket) on a public ERDDAP.</strong>
  <li>Copy the chunk of datasets.xml for the dataset to the datasets.xml for the 
    public ERDDAP™ that will serve the data.
  <li>Create a directory on the public ERDDAP's local hard drive 
    to hold a cache of temporary files. 
    The directory won't use a lot of disk space
    (see cacheSizeGB below). 
  <li>Change the value of the dataset's <kbd>&lt;fileDir&gt;</kbd> tag so that 
    it points to the directory you just created (even though the directory is empty).
  <li>Add a <a rel="help" href="#cacheFromUrl"><kbd>cacheFromUrl</kbd></a> tag
    which specifies the dataset's bucket name and optional prefix (i.e., directory) 
    in the specific
    <a rel="help" href="#AwsS3URLFormat">AWS S3 URL Format that ERDDAP™ requires</a>.    
  <li>Add a <a rel="help" href="#cacheFromUrl">&lt;cacheSizeGB&gt;</a> tag
    to the dataset's xml (e.g., 10 is a good value for most datasets)
    to tell ERDDAP™ to limit the size of the local cache 
    (i.e., don't try to cache all of the remote files).
  <li>See if that works in the public ERDDAP.
    Note that the first time ERDDAP™ loads the dataset, it will take a
    long time to load, because ERDDAP™ needs to download and read
    all of the data files. 
    <p>If the dataset is a huge collection of huge gridded data files, 
    this will take a very long time and be impractical. 
    In some cases, for gridded data files, 
    ERDDAP™ can extract the needed information
    (e.g., the time point for the data in a gridded data file) from the file name
    and avoid this problem.
    See <a rel="help" href="#EDDGridFromFiles_AggregationViaFileNames">Aggregation via File Names</a>.
  <li>Optionally (but especially for EDDTableFromFiles datasets), you can add an
    <a href="#nThreads" rel="bookmark">nThreads</a> tag to the dataset to tell ERDDAP
    to use more than 1 thread when responding to a user's request for data.
    This minimizes the effects of the delay that occurs when ERDDAP™ 
    reads data files from (remote) AWS S3 buckets into the local cache
    and (perhaps) decompresses them.
  </ol>

</ul>
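<p>The effect of adding <kbd>(|.gz)</kbd> to the end of a
<kbd>&lt;fileNameRegex&gt;</kbd> can be checked with a quick sketch. Python's
re.fullmatch is used here as a stand-in for ERDDAP's Java regex handling, on the
assumption that the regex must match the whole file name; the pattern and the file
names are hypothetical.

```python
import re

# Hypothetical fileNameRegex: any .nc file, optionally externally compressed as .gz
FILE_NAME_REGEX = r".*\.nc(|.gz)"


def is_selected(file_name: str, regex: str = FILE_NAME_REGEX) -> bool:
    """True if the whole file name matches the regex."""
    return re.fullmatch(regex, file_name) is not None


for name in ("sst_20190823.nc", "sst_20190823.nc.gz", "sst_20190823.nc.bz2", "readme.txt"):
    print(name, is_selected(name))
```

<p>Note that the unescaped "." in <kbd>(|.gz)</kbd> matches any single character;
<kbd>(|\.gz)</kbd> is a stricter way to write the same idea.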

<p><a class="selfLink" id="AwsS3OpenData" href="#AwsS3OpenData" rel="bookmark"
  ><strong>AWS S3 Open Data</strong></a>
<br>As part of NOAA's
<a rel="help" href="https://www.noaa.gov/nodd/about"
  >Big Data Program<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
NOAA has partnerships with five organizations, including AWS, 
"to explore the potential benefits of storing copies of key observations and 
model outputs in the Cloud to allow computing directly on the data without 
requiring further distribution". 
AWS includes the datasets it gets from NOAA as part of its program to offer 
public access to a large collection of 
<a rel="help" href="https://registry.opendata.aws/"
  >Open Data on AWS S3<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
from any computer, whether it is 
an Amazon compute instance (a rented computer) on the AWS network or your own PC on any network.
The example below assumes you are working with a publicly accessible dataset. 

<p><a class="selfLink" id="AwsS3URLFormat" href="#AwsS3URLFormat" rel="bookmark"
><strong>Accessing Files in an AWS S3 Bucket</strong></a>
<br>For a private S3 data bucket, the bucket's owner must give you access to the bucket.
(See the AWS documentation.)

<p>In all cases, you will need an AWS account because the AWS SDK for Java 
(which ERDDAP™ uses to retrieve information about the contents of a bucket)
requires AWS account credentials. (more on this below)

<p>ERDDAP™ can only access AWS S3 buckets
if you specify the <a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt;</a>
(or &lt;fileDir&gt;) in a specific format: 
<br><kbd>https://<i>bucketName</i>.s3.<i>aws-region</i>.amazonaws.com/<i>prefix/</i></kbd>
<br>where
<ul>
<li>The bucketName is the short form of the bucket name, e.g. noaa-goes17 .
<li>The aws-region, e.g., us-east-1, is from the "Region" column in one of 
  the tables of 
  <a rel="help" 
  href="https://docs.aws.amazon.com/general/latest/gr/rande.html"
  >AWS Service Endpoints<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
  where the bucket is actually located.
<li>The prefix is optional. If present, it must end with '/'.
</ul>
For example, <kbd>https://noaa-goes17.s3.us-east-1.amazonaws.com/ABI-L1b-RadC/</kbd>
<br>This URL format is one of the AWS S3 recommendations: see
<a rel="help" 
  href="https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html"
  >Accessing a Bucket<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
 and 
<a rel="help" 
  href="https://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysHierarchy.html"
  >this description of prefixes<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
ERDDAP™ requires that you combine the bucket URL and the optional prefix into one URL
in order to specify the &lt;cacheFromUrl&gt; (or &lt;fileDir&gt;) where the files are located.
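<p>A small sketch of assembling a URL in this format from its parts, including the rule
that a prefix, if present, must end with '/'. The helper name is hypothetical and the
checks are deliberately minimal.

```python
def s3_base_url(bucket_name: str, aws_region: str, prefix: str = "") -> str:
    """Build https://bucketName.s3.aws-region.amazonaws.com/prefix/ from its parts."""
    if not bucket_name or not aws_region:
        raise ValueError("bucket name and region are both required")
    if prefix and not prefix.endswith("/"):
        raise ValueError("a non-empty prefix must end with '/'")
    return f"https://{bucket_name}.s3.{aws_region}.amazonaws.com/{prefix}"


print(s3_base_url("noaa-goes17", "us-east-1", "ABI-L1b-RadC/"))
```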

<p><a class="selfLink" id="AwsS3Test" href="#AwsS3Test" rel="bookmark">For public buckets, you can and should test the bucket URL</a>
of the AWS S3 directory in your browser, e.g., 
<br><a rel="help" 
  href="https://noaa-goes17.s3.us-east-1.amazonaws.com"
       >https://noaa-goes17.s3.us-east-1.amazonaws.com<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>  
If the bucket URL is correct and appropriate for ERDDAP,
it will return an XML document which has a (partial) listing of the contents of that bucket.
Unfortunately, the full URL (i.e., bucket URL plus prefix) 
that ERDDAP™ wants for a given dataset doesn't work in a browser.
AWS doesn't offer a system to browse the hierarchy of a bucket easily in your browser.
(If that is incorrect, please email Chris.John at noaa.gov. 
Otherwise, Amazon, please add support for this!)
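<p>If you save the XML document that the bucket URL returns, a sketch like the one
below can pull out the keys and tell you whether the listing is partial. The sample
XML is a short, hand-written stand-in with hypothetical keys; it assumes the usual
ListBucketResult layout and namespace, so check it against what your bucket returns.

```python
import xml.etree.ElementTree as ET

S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Hand-written illustration of a listing, not real output from any bucket.
SAMPLE_LISTING = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>noaa-goes17</Name>
  <Prefix></Prefix>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>true</IsTruncated>
  <Contents><Key>ABI-L1b-RadC/2019/235/22/example1.nc</Key><Size>123</Size></Contents>
  <Contents><Key>ABI-L1b-RadC/2019/235/22/example2.nc</Key><Size>456</Size></Contents>
</ListBucketResult>"""


def summarize_listing(xml_text: str):
    """Return (keys, truncated) from a ListBucketResult document."""
    root = ET.fromstring(xml_text)
    keys = [element.text for element in root.findall("s3:Contents/s3:Key", S3_NS)]
    truncated = root.findtext("s3:IsTruncated", default="false", namespaces=S3_NS) == "true"
    return keys, truncated


keys, truncated = summarize_listing(SAMPLE_LISTING)
print(len(keys), "keys; partial listing:", truncated)
```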

<p><a class="selfLink" id="AwsS3ViewBucketContents" href="#AwsS3ViewBucketContents" rel="bookmark"><strong>Viewing the Contents of a Bucket</strong></a>
<br>S3 buckets often contain a couple of categories of files,
in a couple of pseudo subdirectories, which could become a couple of ERDDAP™ datasets.
To make the ERDDAP™ datasets, you need to know the starting directory for &lt;cacheFromUrl&gt; 
(or &lt;fileDir&gt;) and the format of the file names which identify that subset of files.
If you try to view the entire contents of a bucket in a browser, 
S3 will just show you the first 1000 files, which is insufficient.
Currently, the best way for you to view all of the contents of a bucket is to make an 
<a rel="help" 
  href="#EDDTableFromFileNames"
        >EDDTableFromFileNames</a>
dataset (on your PC's ERDDAP™ and/or on your public ERDDAP), which also gives you an 
easy way to browse the directory structure and download files. 
The &lt;fileDir&gt; for that will be the URL you made above, e.g., 
<kbd>https://noaa-goes17.s3.us-east-1.amazonaws.com</kbd> .
[Why doesn't AWS S3 offer a quick and easy way for anyone to do this without an AWS account?]
Note that when I do this on my PC on a non-Amazon network, 
it appears that Amazon slows down the response to a trickle (about 100(?) files per chunk)
after the first few chunks (of 1000 files per chunk) are downloaded. 
Since buckets may have a huge number of files (noaa-goes17 has 26 million), 
getting all of the contents of a bucket may take EDDTableFromFileNames several hours
(e.g., 12!) to finish. 
[Amazon, is that right?!]
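<p>The idea of browsing a flat list of keys as pseudo subdirectories can be sketched as
follows. This is only an illustration of the "/" convention, not how
EDDTableFromFileNames is implemented; the function name and the keys are hypothetical.

```python
def list_pseudo_directory(keys, prefix=""):
    """Return (subdirectories, files) found directly under the given prefix."""
    subdirectories, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, separator, _ = rest.partition("/")
        if separator:
            subdirectories.add(head + "/")  # something deeper lives under head/
        elif rest:
            files.append(rest)  # a file directly under this prefix
    return sorted(subdirectories), sorted(files)


sample_keys = [
    "ABI-L1b-RadC/2019/235/a.nc",
    "ABI-L1b-RadC/2019/236/b.nc",
    "ABI-L1b-RadF/2019/001/c.nc",
    "index.html",
]
print(list_pseudo_directory(sample_keys))
print(list_pseudo_directory(sample_keys, "ABI-L1b-RadC/2019/"))
```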

<p><a class="selfLink" id="AwsS3MakeAnEDDTableFromFileNamesDataset" href="#AwsS3MakeAnEDDTableFromFileNamesDataset" rel="bookmark"
  ><strong>Making an EDDTableFromFileNames Dataset with the Contents of an AWS S3 Bucket</strong></a>
<br>If you have a bucket name, but don't already have a list of files in the S3 bucket
or the prefix that identifies the location of the relevant files in the bucket,
use the instructions below to make an EDDTableFromFileNames dataset so you can
browse the directory hierarchy of the S3 bucket via ERDDAP's "files" system. 
<ol>
<li>Open an AWS Account
<br>ERDDAP™ uses the 
<a rel="help" 
  href="https://docs.aws.amazon.com/sdk-for-java/index.html"
       >AWS SDK for Java<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
to get bucket information from AWS,
so you need to 
<a rel="help" 
  href="https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/"
       >create and activate an AWS account<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
That's a pretty big job, with lots of things to learn.
<br>&nbsp;

<li>Put your AWS Credentials where ERDDAP™ can find them.
<br>Follow the instructions at 
<a rel="help" 
  href="https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/setup.html#setup-credentials"
       >Set up AWS Credentials and Region for Development<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
so ERDDAP™ (specifically, the AWS SDK for Java) will be able to find and use your AWS credentials.
If ERDDAP™ can't find the credentials, you will see a
<br><kbd>java.lang.IllegalArgumentException: profile file cannot be null</kbd>
error in ERDDAP's log.txt file.

<p>Hint for Linux and Mac OS: the credentials must be in a file called ~/.aws/credentials ,
where ~ is the home directory of the user that is running Tomcat (and ERDDAP)
(for this paragraph, we'll assume user=tomcat).
Don't assume that ~ is /home/tomcat -- while logged in as that user, use <kbd>cd ~</kbd> 
followed by <kbd>pwd</kbd> to find out 
where the operating system thinks ~ for user=tomcat is. Create the .aws directory if it doesn't exist.
Also, after you put the credentials file in place, make sure the user and group for 
the file are <kbd>tomcat</kbd> and then use <kbd>chmod 400 credentials</kbd> 
to make sure the file is read-only for user=tomcat. 

<!-- also:?
  https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair -->

<li>Create the bucket URL in the 
<a rel="help" href="#AwsS3URLFormat">format that ERDDAP™ requires</a>, e.g., 
<br><a rel="help" 
  href="https://noaa-goes17.s3.us-east-1.amazonaws.com"
       >https://noaa-goes17.s3.us-east-1.amazonaws.com<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> ,
and (for public buckets) test it in a browser to make sure it returns an XML document which has a partial 
listing of the contents of that bucket.
<br>&nbsp;

<li>Use <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>
to create an
<a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a> dataset:
  <ul>
  <li>&nbsp;&nbsp;For the <kbd>Starting directory</kbd>, use this syntax:
    <br><kbd>***fromOnTheFly,<i>yourBucketUrl</i></kbd>
    <br>for example,
    <br><kbd>***fromOnTheFly,https://noaa-goes17.s3.us-east-1.amazonaws.com/</kbd>
  <li><kbd>File name regex? .*</kbd>
  <li><kbd>Recursive? true</kbd>
  <li><kbd>reloadEveryNMinutes? 10080</kbd>
  <li><kbd>infoUrl? https://registry.opendata.aws/noaa-goes/</kbd>
  <li><kbd>institution? NOAA</kbd>
  <li><kbd>summary? nothing</kbd> (ERDDAP™ will create a decent summary automatically.)
  <li><kbd>title? nothing</kbd> &nbsp; &nbsp; (ERDDAP™ will create a decent title automatically.)
  </ul>
  As usual, you should edit the resulting XML 
  to verify correctness and make improvements before 
  using that chunk of XML in datasets.xml.

<li><a class="selfLink" id="AwsS3Example"
                        href="#AwsS3Example" rel="bookmark"
  >If you follow the instructions above and load the dataset in ERDDAP,</a>
  you have created an EDDTableFromFileNames dataset.
  As an example, and to make it easier for anyone to browse and download
  files from the AWS Open Data buckets, we have created EDDTableFromFileNames
  datasets (see the list at
  <br><a rel="help" 
  href="https://upwell.pfeg.noaa.gov/erddap/search/index.html?searchFor=awsS3Files_"
       >https://upwell.pfeg.noaa.gov/erddap/search/index.html?searchFor=awsS3Files_</a>)
  for almost all of the 
  <a rel="help" href="https://registry.opendata.aws/"
  >AWS S3 Open Data buckets<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
<br>[The few buckets that we didn't include 
  either have a large number of files in the root directory (more than can be downloaded in a reasonable amount of time),
  or don't allow public access (aren't they all supposed to be public?),
  or are Requester Pays buckets (e.g., Sentinel).]

<br>If you click on the "files" link for one of these datasets, 
  you can browse the directory tree and files in that S3 bucket.
  Because ***fromOnTheFly EDDTableFromFileNames datasets get the directory listings
  from AWS on-the-fly, these listings are always perfectly up-to-date.
  If you click down the directory tree to an actual file name and click on the file name,
  ERDDAP™ will redirect your request to AWS S3 so that you can download the file 
  directly from AWS. You can then inspect that file.

<p>Trouble? 
<br>If your EDDTableFromFileNames dataset won't load in ERDDAP™ (or DasDds), look in the 
log.txt file for an error message.
If you see a 
<br><kbd>java.lang.IllegalArgumentException: profile file cannot be null</kbd>
error, the problem is that the AWS SDK for Java (used by ERDDAP) isn't
finding the credentials file. See the credentials instructions above.
<br>&nbsp;
</ol>

It is unfortunate that AWS doesn't simply allow people to use a browser
to view the contents of a public bucket.
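
<p>As a rough illustration of the result (a sketch, not verbatim GenerateDatasetsXml output),
the chunk of datasets.xml for the noaa-goes17 example might look something like the following.
The datasetID shown is hypothetical, the &lt;dataVariable&gt; tags that GenerateDatasetsXml writes are omitted,
and the other values are simply the answers listed in the steps above.
<pre>
&lt;dataset type="EDDTableFromFileNames" datasetID="awsS3Files_noaa_goes17" active="true"&gt;
    &lt;fileDir&gt;***fromOnTheFly,https://noaa-goes17.s3.us-east-1.amazonaws.com/&lt;/fileDir&gt;
    &lt;fileNameRegex&gt;.*&lt;/fileNameRegex&gt;
    &lt;recursive&gt;true&lt;/recursive&gt;
    &lt;reloadEveryNMinutes&gt;10080&lt;/reloadEveryNMinutes&gt;
    &lt;addAttributes&gt;
        &lt;att name="infoUrl"&gt;https://registry.opendata.aws/noaa-goes/&lt;/att&gt;
        &lt;att name="institution"&gt;NOAA&lt;/att&gt;
        &lt;!-- title, summary, and other attributes as generated --&gt;
    &lt;/addAttributes&gt;
    &lt;!-- dataVariable tags as generated (omitted in this sketch) --&gt;
&lt;/dataset&gt;
</pre>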

<p><strong>Then you can make ERDDAP™ datasets that give users access to the data in the files.</strong>
<br>See the instructions in 
<a href="#ErddapAndS3Buckets" rel="bookmark">ERDDAP™ and S3 Buckets</a> (above).
<br>For the sample EDDTableFromFileNames dataset that you made above, 
  if you do a little poking around with the directory and file names in the directory tree,
  it becomes clear that the top level directory names (e.g., ABI-L1b-RadC)
  correspond to what ERDDAP™ would call separate datasets.
  The bucket you are working with may be similar.
  You could then pursue creating separate datasets in ERDDAP™ for each of those datasets, using, e.g., 
  <br>https://noaa-goes17.s3.us-east-1.amazonaws.com/ABI-L1b-RadC/
  <br>as the <kbd>&lt;cacheFromUrl&gt;</kbd>. Unfortunately, for this particular
  example, the datasets in the bucket all seem to be level 1 or level 2 datasets,
  which ERDDAP™ 
  <a href="#differentDimensions" rel="bookmark">isn't particularly good at</a>,
  because the dataset is a more complicated collection of variables which use different dimensions.
<br>&nbsp;
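
<p>For a bucket whose files ERDDAP™ can handle, the relevant part of such a dataset's
chunk of datasets.xml might look like this sketch. The local &lt;fileDir&gt; path and the
&lt;fileNameRegex&gt; are assumptions chosen for illustration;
only the &lt;cacheFromUrl&gt; value comes from the example above.
<pre>
&lt;!-- inside the dataset tag of an EDD...FromFiles dataset --&gt;
&lt;cacheFromUrl&gt;https://noaa-goes17.s3.us-east-1.amazonaws.com/ABI-L1b-RadC/&lt;/cacheFromUrl&gt;
&lt;fileDir&gt;/data/cache/noaa-goes17/ABI-L1b-RadC/&lt;/fileDir&gt;  &lt;!-- hypothetical local cache directory --&gt;
&lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;  &lt;!-- assumed file name pattern --&gt;
&lt;recursive&gt;true&lt;/recursive&gt;
</pre>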


<li><a class="selfLink" id="NcML" href="#NcML" rel="bookmark"><strong>NcML .ncml Files</strong></a>
<br>NcML files let you specify on-the-fly changes to one or more
  original source NetCDF (v3 or v4) .nc, .grib, .bufr, or .hdf (v4 or v5) files, 
  and then have ERDDAP™ treat the .ncml files as the source files.
  ERDDAP™ datasets will accept .ncml files whenever .nc files are expected.
  The NcML files MUST have the extension .ncml.
  See the 
  <a rel="help" 
  href="https://docs.unidata.ucar.edu/netcdf-java/current/userguide/ncml_overview.html"
  >Unidata NcML documentation<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  NcML is useful because you can do some things with it (for example, making different changes to 
  different files in a collection, including adding a dimension with a specific value to a file)
  that you can't do with ERDDAP's datasets.xml.
  <ul>
  <li>Changes to an .ncml file's lastModified time will cause the file to be 
    reloaded whenever the dataset is reloaded, but changes to the underlying .nc data files
    won't be directly noticed.
  <li>Hint: NcML is *very* sensitive to the order of some 
    items in the NcML file. Think of NcML as specifying a series
    of instructions in the specified order, with the intention of
    changing the source files (the state at the start/top of the NcML file)
    into the destination files (the state at the end/bottom of the NcML file).
  </ul>

  <p>An alternative to NcML is the 
  <a rel="help" href="#NCO">NetCDF Operators (NCO)</a>.
  The big difference is that NcML is a system for making changes on-the-fly
  (so the source files aren't altered), whereas NCO can be used to make
  changes to (or new versions of) the files.
Both NCO and NcML are very, very flexible and allow you to make almost any
change you can think of to the files.
For both, it can be challenging to figure out exactly how to do what you want to do --
check the web for similar examples.
Both are useful tools for preparing netCDF and HDF files for use with ERDDAP,
notably, to make changes beyond what ERDDAP's manipulation system can do.

  <p>Example #1: Adding a Time Dimension with a Single Value
  <br>Here's an .ncml file that creates a new outer dimension (time, with
1 value: 1041379200, which is 2003-01-01T00:00:00Z when interpreted as seconds since 
1970-01-01T00:00:00Z) and adds that dimension to the pic variable in the file named
A2003001.L3m_DAY_PIC_pic_4km.nc:
<pre>
&lt;netcdf xmlns='https://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2'&gt;
  &lt;variable name='time' type='int' shape='time' /&gt;
  &lt;aggregation dimName='time' type='joinNew'&gt;
    &lt;variableAgg name='pic'/&gt;
    &lt;netcdf location='A2003001.L3m_DAY_PIC_pic_4km.nc' coordValue='1041379200'/&gt;
  &lt;/aggregation&gt;
&lt;/netcdf&gt;
</pre>

  <p>Example #2: Changing an Existing Time Value
  <br>Sometimes the source .nc file already has a time dimension and time value,
    but the value is incorrect (for your purposes). 
    This .ncml file says: for the data file named "19810825230030-NCEI...",
    for the dimension variable "time", set the units attribute to be
    'seconds since 1970-01-01T00:00:00Z' and set the time value to be 367588800
    (i.e., 1981-08-25T12:00:00Z).
<pre>
&lt;netcdf xmlns='https://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2'
  location="19810825230030-NCEI-L3C_GHRSST-SSTskin-AVHRR_Pathfinder-PFV5.3_NOAA07_G_1981237_day-v02.0-fv01.0.nc"&gt;
  &lt;variable name="time"&gt;
    &lt;attribute name='units' value='seconds since 1970-01-01T00:00:00Z' /&gt;
    &lt;values&gt;367588800&lt;/values&gt;
  &lt;/variable&gt;
&lt;/netcdf&gt;
</pre>


<li><a class="selfLink" id="NCO" href="#NCO" rel="bookmark"><strong>NetCDF Operators (NCO)</strong></a>
<br>"The netCDF Operators (NCO) comprise a dozen standalone, command-line
programs that take netCDF [v3 or v4], HDF [v4 or v5], [.grib, .bufr,] and/or DAP 
files as input, then operate 
(e.g., derive new data, compute statistics, print, hyperslab, manipulate metadata)
and output the results to screen or files in text, binary, or netCDF formats. 
NCO aids analysis of gridded scientific data. The shell-command style of NCO 
allows users to manipulate and analyze files interactively, or with expressive 
scripts that avoid some overhead of higher-level programming environments." 
(from the
<a rel="help" href="https://nco.sourceforge.net/">NCO<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
homepage).

<p>An alternative to NCO is  
<a rel="help" href="#NcML">NcML</a>. 
The big difference is that NcML is a system for making changes on-the-fly
(so the source files aren't altered), whereas NCO can be used to make
changes to (or new versions of) the files.
Both NCO and NcML are very, very flexible and allow you to make almost any
change you can think of to the files.
For both, it can be challenging to figure out exactly how to do what you want to do --
check the web for similar examples.
Both are useful tools for preparing netCDF and HDF files for use with ERDDAP,
notably, to make changes beyond what ERDDAP's manipulation system can do.

<p>For example, you can use NCO to make the units of the time variable
consistent in a group of files where they weren't consistent originally.
Or, you can use NCO to apply scale_factor and add_offset in a group of 
files where scale_factor and add_offset have different values in different source files.
<br>(Or, you can now deal with those problems in ERDDAP™ via
<a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a>,
which is a variant of EDDGridFromNcFiles which unpacks packed data and
standardizes time values at a low level in order to deal with a collection of files 
that have different scale_factor and add_offset values, or different time units.)


<p>NCO is Free and Open Source Software which uses the 
<a rel="help" href="https://www.gnu.org/licenses/gpl-3.0.html">GPL 3.0<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
license.

<p>Example #1: Making Units Consistent
<br>EDDGridFromFiles and EDDTableFromFiles insist that the units for a given variable
be identical in all of the files. If some of the files are trivially (not functionally) 
different from others (e.g., time units of 
<br>"seconds since 1970-01-01 00:00:00 UTC" versus 
<br>"seconds since 1970-01-01T00:00:00Z"),
you could use NCO's 
<a rel="help"
href="https://nco.sourceforge.net/nco.html#ncatted-netCDF-Attribute-Editor">ncatted<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
to change the units in all of the files to be identical. Because ncatted edits one input
file per invocation, loop over the files, for example in a Unix shell:
<br><kbd>for f in *.nc; do nco/ncatted -a units,time,o,c,'seconds since 1970-01-01T00:00:00Z' "$f"; done</kbd>
<br>[For many problems like this in EDDTableFrom...Files datasets, you can now use 
<a rel="help" href="#EDDTableFromFiles_standardizeWhat">standardizeWhat</a> to tell ERDDAP
to standardize the source files as they are read into ERDDAP.]

<li><strong><a class="selfLink" id="limits" href="#limits" rel="bookmark">Limits to the Size of a Dataset</a></strong>
<br>You'll see many references to "2 billion" below. 
More accurately, that is a reference to 2,147,483,647 (2^31-1),
which is the maximum value of a 32-bit signed integer. 
In some computer languages, for example Java (which ERDDAP™ is written in), 
that is the largest data type that can be used for 
many data structures (for example, the size of an array). 

<p>For String values (for example, for variable names, 
  attribute names, String attribute values, and String data values),
  the maximum number of characters per String in ERDDAP™ is ~2 billion. 
  But in almost all cases, there will be small or large problems if 
  a String exceeds a reasonable size
  (e.g., 80 characters for variable names and attribute names,
  and 255 characters for most String attribute values and data values).
  For example, web pages which display long variable names will be awkwardly wide
  and long variable names will be truncated if they exceed the limit of the 
  response file type.

<p>For gridded datasets:
<ul>
<li>The maximum number of axisVariables is ~2 billion.
<br>The maximum number of dataVariables is ~2 billion.
<br>But if a dataset has &gt;100 variables, it will be cumbersome for users to use.
<br>And if a dataset has &gt;1 million variables, 
  your server will need a lot of physical memory and there will be other problems.
<li>The maximum size of each dimension (axisVariable) is ~2 billion values.
<li>I think the maximum total number of cells (the product of all dimension sizes)
    is unlimited, but it may be ~9e18.
</ul>

<p>For tabular datasets:
<ul>
<li>The maximum number of dataVariables is ~2 billion.
<br>But if a dataset has &gt;100 variables, it will be cumbersome for users to use.
<br>And if a dataset has &gt;1 million variables, 
  your server will need a lot of physical memory and there will be other problems.
<li>The maximum number of sources (for example, files) that can be aggregated is ~2 billion.
<li>In some cases, the maximum number of rows from an individual source 
  (for example, a file, but not a database) is ~2 billion rows. 
<li>I don't think there are other limits.
</ul>

<p>For both gridded and tabular datasets, there are some internal 
limits on the size of the subset that can be requested by a user in a single request
(often related to &gt;2 billion of something or ~9e18 of something),
but it is far more likely that a user will hit the file-type-specific limits.
<ul>
<li>NetCDF version 3 .nc files are limited to 2GB. 
  (If this is really a problem for someone, let me know: 
  I could add support for the NetCDF version 3 .nc 64-bit extension
  or NetCDF Version 4, 
  which would increase the limit significantly, but not infinitely.) 
<li>Browsers crash after only ~500MB of data, 
  so ERDDAP™ limits the response to .htmlTable requests to ~400MB of data. 
<li>Many data analysis programs have similar limits
  (for example, the maximum size of a dimension is often ~2 billion values),
  so there is no reason to work hard to get around the file-type-specific limits.
<li>The file-type-specific limits are useful in that they prevent naive requests for truly 
  huge amounts of data (for example, "give me all of this dataset" when the dataset has
  20TB of data), which would take weeks or months to download.
  The longer the download, the more likely it will fail for a variety of reasons.
<li>The file-type-specific limits are useful in that they force the user to deal 
  with reasonably-sized subsets (for example, dealing with 
  a large gridded dataset via files with data from one time point each).
  <br>&nbsp;
</ul>


<li><strong><a class="selfLink" id="switchToACDD13" href="#switchToACDD13" rel="bookmark">Switch to ACDD-1.3</a></strong>
<br>We (notably 
<a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>)
currently recommend
  <a rel="bookmark" 
  href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3">ACDD version 1.3</a>,
  which was ratified in early 2015 
  and which is referred to as "ACDD-1.3" in the global Conventions attribute.
  Prior to ERDDAP™ version 1.62 (released in June 2015),
  ERDDAP™ used/recommended the original, version 1.0,
  of the <a rel="bookmark" 
  href="https://wiki.esipfed.org/ArchivalCopyOfVersion1">NetCDF Attribute Convention for Dataset Discovery</a>
  which was referred to as "Unidata Dataset Discovery v1.0" in the
  global Conventions and Metadata_Conventions attributes.
  
  <p>If your datasets use earlier versions of ACDD, we RECOMMEND that you 
  switch to ACDD-1.3. 
  It isn't hard. ACDD-1.3 is highly backward compatible with version 1.0.
  To switch, for all
  datasets (except EDDGridFromErddap and EDDTableFromErddap datasets):
  <ol>
  <li>Remove the newly deprecated global <kbd>Metadata_Conventions</kbd> 
    attribute by adding (or by changing the existing <kbd>Metadata_Conventions</kbd> attribute) 
    <br><kbd>&lt;att name="Metadata_Conventions"&gt;null&lt;/att&gt;</kbd>
    <br>to the dataset's global &lt;addAttributes&gt;.
    <br>&nbsp;

  <li>If the dataset has a <kbd>Conventions</kbd> attribute in the global &lt;addAttributes&gt;,
    change all "Unidata Dataset Discovery v1.0" references to "ACDD-1.3".
    <br>If the dataset doesn't have a <kbd>Conventions</kbd> attribute in the 
    global &lt;addAttributes&gt;, then add one that refers to ACDD-1.3.
    For example,
    <br><kbd>&lt;att name="Conventions"&gt;COARDS, CF-1.6, ACDD-1.3&lt;/att&gt;</kbd>    
    <br>&nbsp;

  <li>If the dataset has a global <kbd>standard_name_vocabulary</kbd> attribute,
    please change the format of the value to, for example, 
    <br><kbd>&lt;att name="standard_name_vocabulary"&gt;CF Standard Name Table v65&lt;/att&gt;</kbd>
    <br>If the reference is to an older version of the 
    <a rel="help" href="https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html">CF standard name table<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
    it is probably a good idea to switch to the current version (65, as we write this),
    since new standard names are added to that table with subsequent versions,
    but old standard names are rarely deprecated and never removed.
    <br>&nbsp;

  <li>Although ACDD-1.0 included global attributes for 
    <kbd>creator_name, creator_email, creator_url</kbd>, 
    <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>
    didn't automatically add them until sometime around ERDDAP™ v1.50. 
    This is important information:
    <ul>
    <li>creator_name lets users know/cite the creator of the dataset.
    <li>creator_email tells users the preferred email address for contacting
      the creator of the dataset, for example if they have questions about the dataset.
    <li>creator_url gives users a way to find out more about the creator.
    <li>ERDDAP™ uses all of this information when generating FGDC and ISO 19115-2/19139
      metadata documents for each dataset. Those documents are often used
      by external search services.
    </ul>
    Please add these attributes to the dataset's global &lt;addAttributes&gt;.
    <br><kbd>&lt;att name="creator_name"&gt;NOAA NMFS SWFSC ERD&lt;/att&gt;
    <br>&lt;att name="creator_email"&gt;erd.data@noaa.gov&lt;/att&gt;
    <br>&lt;att name="creator_url"&gt;https://www.pfeg.noaa.gov&lt;/att&gt;</kbd>
    <br>&nbsp;
  </ol>     
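  <p>Putting the four steps together, a dataset's global &lt;addAttributes&gt; might end up
  including lines like these. This is only a sketch that collects the example values 
  from the steps above; substitute the conventions, standard name table version, and
  creator information that apply to your dataset.
<pre>
&lt;addAttributes&gt;
    &lt;att name="Metadata_Conventions"&gt;null&lt;/att&gt;
    &lt;att name="Conventions"&gt;COARDS, CF-1.6, ACDD-1.3&lt;/att&gt;
    &lt;att name="standard_name_vocabulary"&gt;CF Standard Name Table v65&lt;/att&gt;
    &lt;att name="creator_name"&gt;NOAA NMFS SWFSC ERD&lt;/att&gt;
    &lt;att name="creator_email"&gt;erd.data@noaa.gov&lt;/att&gt;
    &lt;att name="creator_url"&gt;https://www.pfeg.noaa.gov&lt;/att&gt;
    &lt;!-- the dataset's other global attributes go here, unchanged --&gt;
&lt;/addAttributes&gt;
</pre>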
  That's it. I hope that wasn't too hard.
  <br>&nbsp;

<li><strong><a class="selfLink" id="Zarr" href="#Zarr" rel="bookmark">Zarr</a></strong>
<br>
<p>As of version 2.25, ERDDAP™ can read local Zarr files using
  <a rel="help" href="#EDDTableFromNcFiles">EDDTableFromNcFiles</a> and <a rel="help" href="#EDDGridFromNcFiles">EDDGridFromNcFiles</a>.

<p>(As of August 2019)
    We could easily be wrong, but we are not yet convinced that 
    <a rel="help" href="https://github.com/zarr-developers/zarr-python">Zarr<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>,
    or similar systems which break data files up into smaller chunks, are great
    solutions to the problem of ERDDAP™ reading data stored in cloud services like Amazon AWS S3. 
    Zarr is a great technology that has shown its usefulness in a variety of situations;
    we're just not sure that ERDDAP+S3 will be one of those situations.

    <p>The problems with accessing data in the cloud are latency (the lag to first get data)
    and file-level access (rather than block-level access).
    Zarr solves the file-level access problem, but does nothing about latency. 
    Compared to just downloading the file (so it can be read as a local file with block-level access), 
    Zarr may even exacerbate the latency problem because, with Zarr, 
    reading a file now involves a series of several calls to read different parts of the file 
    (each with its own lag).
    The latency problem can be solved by parallelizing the requests, but that is
    a higher-level solution, not dependent on Zarr.

    <p>And with Zarr (as with relational databases), 
    we lose the convenience of having a data file be a simple, single file 
    that you can easily verify the integrity of, or make/download a copy of.

    <p>ERDDAP™ (as of v2) has a system for maintaining a local cache of files from a 
    URL source (e.g., S3)  
    (see <a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt; and &lt;cacheMaxGB&gt;</a>).
    And the new 
    <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>
    should minimize the latency problem by parallelizing data retrieval at a high level. 
    &lt;cacheFromUrl&gt; seems to work very well for many scenarios. 
    (We're not sure how beneficial &lt;nThreads&gt; is without further tests.) 
    We admit we haven't done timing tests on an AWS instance with a good network connection, 
    but we have successfully tested with various remote URL sources of files.
    And ERDDAP's &lt;cacheFromUrl&gt; works with any type of data file (e.g., .nc, .hdf,
    .csv, .jsonlCSV), even if externally compressed (e.g., .gz), without any changes
    to the files (e.g., rewriting them as Zarr collections). 
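
    <p>As a sketch of how those settings fit together, a files-based dataset that caches 
    from a remote source might include tags like the following in its chunk of datasets.xml. 
    All of the values here (the URL, the local directory, the cache size, and the thread count) 
    are hypothetical; see the &lt;cacheFromUrl&gt;, &lt;cacheMaxGB&gt;, and &lt;nThreads&gt; 
    documentation for the exact behavior and defaults.
<pre>
&lt;cacheFromUrl&gt;https://example-bucket.s3.us-east-1.amazonaws.com/some/prefix/&lt;/cacheFromUrl&gt;
&lt;cacheMaxGB&gt;50&lt;/cacheMaxGB&gt;   &lt;!-- hypothetical approximate size limit for the local cache, in GB --&gt;
&lt;fileDir&gt;/data/cache/example-bucket/&lt;/fileDir&gt;   &lt;!-- hypothetical local cache directory --&gt;
&lt;nThreads&gt;4&lt;/nThreads&gt;   &lt;!-- hypothetical number of threads for data retrieval --&gt;
</pre>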

    <p>It is likely that different scenarios will favor different solutions, e.g.,
    only need to read part of a file once (Zarr will win), vs. need to read all of a file once, 
    vs. need to read part or all of a file repeatedly (&lt;cacheFromUrl&gt; will win). 

    <p>Mostly we're saying: before we rush to make the effort to store all our data in Zarr, 
    let's do some tests to see if it is actually a better solution.

</ul>

<br>&nbsp;
<hr>
<p><strong><a class="selfLink" id="datasetTypes" href="#datasetTypes" rel="bookmark">List of Dataset Types</a></strong>
<p>If you need help choosing the right dataset type, see 
  <a rel="help" href="#ChoosingTheDatasetType">Choosing the Dataset Type</a>.

<p>The types of datasets fall into two categories.  (<a rel="help" href="#dataStructures">Why?</a>)
<ul>
<li><a class="selfLink" id="EDDGrid" href="#EDDGrid" rel="bookmark"><strong>EDDGrid</strong></a> datasets handle gridded data. 
  <ul>
  <li>In EDDGrid datasets, data variables are multi-dimensional arrays of data.
  <li>There MUST be an axis variable for each dimension.
    Axis variables MUST be specified in the order that the data variables use them.
  <li>In EDDGrid datasets, all data variables MUST use (share) all of the axis variables. 
    <br>(<a rel="help" href="#dataStructures">Why?</a> <a rel="help" href="#differentDimensions">What if they don't?</a>)
  <li><a class="selfLink" id="SortedDimensionValues" href="#SortedDimensionValues" rel="bookmark">Sorted Dimension Values</a> -
    In all EDDGrid datasets, each dimension MUST be in sorted order (ascending or descending). 
    Each can be irregularly spaced. There can be no ties.
    This is a requirement of the
    <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
    >CF metadata standard<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    If any dimension's values aren't in sorted order, the dataset won't be loaded
    and ERDDAP™ will identify the first unsorted value in the log file,
    <i>bigParentDirectory</i>/logs/log.txt .

    <p>A few subclasses have additional restrictions (notably, 
    EDDGridAggregateExistingDimension requires that the outer (leftmost, first) dimension
    be ascending).
    <p>Unsorted dimension values almost always indicate a problem with the source dataset.
    This most commonly occurs when a misnamed or inappropriate file is 
    included in the aggregation, which leads to an unsorted time dimension. 
    To solve this problem, see the error message in the 
    ERDDAP™ log.txt file to find the offending time value. 
    Then look in the source files to find the corresponding file 
    (or one before or one after) that doesn't belong in the aggregation.
  <li>See the more complete description of the 
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/griddap/documentation.html#dataModel">EDDGrid data model</a>.
  <li>The EDDGrid dataset types are:
      <ul>
      <li><a rel="help" href="#EDDGridFromAudioFiles">EDDGridFromAudioFiles</a> 
        aggregates data from a group of local audio files.
      <li><a rel="help" href="#EDDGridFromDap">EDDGridFromDap</a> 
        handles gridded data from DAP servers.
      <li><a rel="help" href="#EDDGridFromEDDTable">EDDGridFromEDDTable</a> 
        lets you convert a tabular dataset into a gridded dataset.
      <li><a rel="help" href="#EDDGridFromErddap">EDDGridFromErddap</a> 
        handles gridded data from a remote ERDDAP.
      <li><a rel="help" href="#EDDGridFromEtopo">EDDGridFromEtopo</a> 
        just handles the built-in ETOPO topography data.
      <li><a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a> 
        is the superclass of all EDDGridFrom...Files classes.
      <li><a rel="help" href="#EDDGridFromMergeIRFiles">EDDGridFromMergeIRFiles</a> 
        aggregates data from a group of local MergeIR .gz files.
      <li><a rel="help" href="#EDDGridFromNcFiles">EDDGridFromNcFiles</a> 
        aggregates data from a group of local NetCDF (v3 or v4) .nc and related files.
      <li><a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a> 
        is a variant of EDDGridFromNcFiles which also aggregates data from a 
        group of local NetCDF (v3 or v4) .nc and related files, which ERDDAP™ unpacks at a low level.
      <li><a rel="help" href="#EDDGridLonPM180">EDDGridLonPM180</a> 
        modifies the longitude values of a child EDDGrid so that they are in 
        the range -180 to 180.
      <li><a rel="help" href="#EDDGridLon0360">EDDGridLon0360</a> 
        modifies the longitude values of a child EDDGrid so that they are in 
        the range 0 to 360.
      <li><a rel="help" href="#EDDGridSideBySide">EDDGridSideBySide</a> 
        aggregates two or more EDDGrid datasets side by side.
      <li><a rel="help" href="#EDDGridAggregateExistingDimension">EDDGridAggregateExistingDimension</a> 
           aggregates two or more EDDGrid datasets, each of which has 
           a different range of values for the first dimension, 
           but identical values for the other dimensions.
      <li><a rel="help" href="#EDDGridCopy">EDDGridCopy</a> can make a local copy of
        another EDDGrid's data and serves data from the local copy.
        <br>&nbsp;
      </ul>
  <li>All EDDGrid datasets support an <kbd>nThreads</kbd> setting,
    which tells ERDDAP™ how many threads to use when responding to a request.
    See the <a href="#nThreads" rel="bookmark">nThreads</a> documentation for details.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDTable" href="#EDDTable" rel="bookmark"><strong>EDDTable</strong></a> datasets handle tabular data. 
  <ul>
  <li>Tabular data can be represented as a database-like table with rows and columns. 
    Each column (a data variable) has a name, a set of attributes, 
    and stores just one type of data. 
    Each row has an observation (or group of related values). 
    The data source may have the data in a different data structure, 
    a more complicated data structure, and/or multiple data files,     
    but ERDDAP™ needs to be able to flatten the source data into a database-like table
    in order to present the data as a tabular dataset to users of ERDDAP.
  <li>See the more complete description of the 
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/tabledap/documentation.html#dataModel"
    >EDDTable data model</a>.
  <li>The EDDTable dataset types are:
      <ul>
      <li><a rel="help" href="#EDDTableFromAllDatasets">EDDTableFromAllDatasets</a> 
      is a higher-level dataset which has information about all the other datasets
      in your ERDDAP.
      <li><a rel="help" href="#EDDTableFromAsciiFiles">EDDTableFromAsciiFiles</a> 
      aggregates data from 
      comma-, tab-, semicolon-, or space-separated tabular ASCII data files.
      <li><a rel="help" href="#EDDTableFromAsciiService">EDDTableFromAsciiService</a> 
      is the superclass
        of all EDDTableFromAsciiService... classes. 
      <li><a rel="help" href="#EDDTableFromAsciiServiceNOS">EDDTableFromAsciiServiceNOS</a>
        handles data from some of the NOAA NOS web services.
      <li><a rel="help" href="#EDDTableFromAudioFiles">EDDTableFromAudioFiles</a> 
        aggregates data from a group of local audio files.
      <!-- li><a rel="help" href="#EDDTableFromBMDE">EDDTableFromBMDE</a> 
        handles tabular data from BMDE servers. -->
      <li><a rel="help" href="#EDDTableFromAwsXmlFiles">EDDTableFromAwsXmlFiles</a> 
        aggregates data from a set of Automatic Weather Station (AWS) XML files.
      <li><a rel="help" href="#EDDTableFromCassandra">EDDTableFromCassandra</a> 
        handles tabular data from one Cassandra table.
      <li><a rel="help" href="#EDDTableFromColumnarAsciiFiles">EDDTableFromColumnarAsciiFiles</a>
      aggregates data from tabular ASCII data files with fixed-width data columns.
      <li><a rel="help" href="#EDDTableFromDapSequence">EDDTableFromDapSequence</a> 
        handles tabular data from DAP sequence servers.
      <li><a rel="help" href="#EDDTableFromDatabase">EDDTableFromDatabase</a> 
        handles tabular data from one database table.
      <li><a rel="help" href="#EDDTableFromEDDGrid">EDDTableFromEDDGrid</a>
        lets you create an EDDTable dataset from an EDDGrid dataset.
      <li><a rel="help" href="#EDDTableFromErddap">EDDTableFromErddap</a> 
        handles tabular data from a remote ERDDAP.
      <li><a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a> 
        creates a dataset from information about a group of files 
        in the server's file system, 
        but it doesn't serve data from within the files.
      <li><a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a> 
        is the superclass of all EDDTableFrom...Files classes.
      <!--li><a rel="help" href="#EDDTableFromMWFS">EDDTableFromMWFS</a> 
        handles tabular data from microWFS servers. -->
      <li><a rel="help" href="#EDDTableFromHttpGet">EDDTableFromHttpGet</a> 
        is ERDDAP's only system for data import as well as data export.
      <li><a rel="help" href="#EDDTableFromHyraxFiles">EDDTableFromHyraxFiles</a> (DEPRECATED)
        aggregates data from files with several variables with shared 
        dimensions served by a 
        <a rel="help" href="https://www.opendap.org/software/hyrax-data-server">Hyrax OPeNDAP server<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
      <li><a rel="help" href="#EDDTableFromInvalidCRAFiles">EDDTableFromInvalidCRAFiles</a> 
        aggregates data from NetCDF (v3 or v4) .nc files which use 
        a specific, invalid, variant of the CF DSG Contiguous Ragged Array (CRA) files.
        Although ERDDAP™ supports this file type, it is an invalid file type 
        that no one should start using. Groups that currently use this file type are 
        strongly encouraged to use ERDDAP™ to generate valid CF DSG CRA files
        and stop using these files.
      <li><a rel="help" href="#EDDTableFromJsonlCSVFiles">EDDTableFromJsonlCSVFiles</a> 
        aggregates data from 
        <a rel="help"
          href="https://jsonlines.org/examples/"
          >JSON Lines CSV files<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
      <li><a rel="help" href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> 
        aggregates data from NetCDF (v3 or v4) .nc files with several variables 
        with shared dimensions.
      <li><a rel="help" href="#EDDTableFromNcFiles">EDDTableFromNcFiles</a> 
        aggregates data from NetCDF (v3 or v4) .nc files with several variables 
        with shared dimensions.
        It is fine to continue using this dataset type for existing datasets,
        but for new datasets we recommend using EDDTableFromMultidimNcFiles instead.
      <li><a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a> 
        aggregates data from NetCDF (v3 or v4) .nc files which use one of the 
        file formats specified by the 
        <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries">CF Discrete Sampling Geometries (DSG)<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a> 
        conventions. 
        But for files using one of the multidimensional CF DSG variants, use
        <a rel="help" href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> 
        instead.
      <li><a rel="help" href="#EDDTableFromNccsvFiles">EDDTableFromNccsvFiles</a> 
        aggregates data from 
        <a rel="help" href="https://erddap.github.io/NCCSV.html">NCCSV</a>
        ASCII .csv files.
      <li><a rel="help" href="#EDDTableFromNOS">EDDTableFromNOS</a> (DEPRECATED)
        handles tabular data from NOS XML servers.
      <li><a rel="help" href="#EDDTableFromOBIS">EDDTableFromOBIS</a> 
        handles tabular data from OBIS servers.
      <li><a rel="help" href="#EDDTableFromParquetFiles">EDDTableFromParquetFiles</a> 
          handles data from 
          <a rel="help"
            href="https://parquet.apache.org/"
            >Parquet<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.
      <li><a rel="help" href="#EDDTableFromSOS">EDDTableFromSOS</a> 
        handles tabular data from SOS servers.
      <li><a rel="help" href="#EDDTableFromThreddsFiles">EDDTableFromThreddsFiles</a> (DEPRECATED)
        aggregates data from files with several variables with
        shared dimensions served by a 
          <a rel="help" 
          href="https://www.unidata.ucar.edu/software/tds/"
          >THREDDS OPeNDAP server<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
      <li><a rel="help" href="#EDDTableFromWFSFiles">EDDTableFromWFSFiles</a> (DEPRECATED)
        makes a local copy of all of the data from an ArcGIS MapServer WFS server
        so the data can then be re-served quickly to ERDDAP™ users.
      <li><a rel="help" href="#EDDTableAggregateRows">EDDTableAggregateRows</a>
        can make an EDDTable dataset from a group of EDDTable datasets.
      <li><a rel="help" href="#EDDTableCopy">EDDTableCopy</a> can make a local 
        copy of many types of EDDTable datasets and then re-serve the data quickly from the 
        local copy.
      </ul>
  </ul>
</ul>

<br>&nbsp;
<hr>
<h2><a class="selfLink" id="datasetDescriptions" href="#datasetDescriptions" rel="bookmark">Detailed Descriptions of Dataset Types</a></h2>


<p><a class="selfLink" id="EDDGridFromDap" href="#EDDGridFromDap" rel="bookmark"><strong>EDDGridFromDap</strong></a> handles grid variables from 
<a rel="help" href="https://www.opendap.org/">DAP<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a> servers.
<ul>
<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can gather the information you need to tweak that or create your own XML 
  for an EDDGridFromDap dataset by looking 
  at the source dataset's DDS and DAS files in your browser (by adding .das and .dds to the sourceUrl,
  for example, 
<a href="https://thredds1.pfeg.noaa.gov/thredds/dodsC/satellite/BA/ssta/5day.dds">https://thredds1.pfeg.noaa.gov/thredds/dodsC/satellite/BA/ssta/5day.dds<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>).
  <br>&nbsp;
<li>EDDGridFromDap can get data from any multi-dimensional variable from a DAP data server.
  (Previously, EDDGridFromDap was limited to variables designated as "grid"s, but that is no longer
  a requirement.)
  <br>&nbsp;
<li>Sorted Dimension Values -
   The values for each dimension MUST be in sorted order (ascending or descending). 
   The values can be irregularly spaced. There can be no ties.
   This is a requirement of the
   <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
    >CF metadata standard<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    If any dimension's values aren't in sorted order, the dataset won't be loaded
    and ERDDAP™ will identify the first unsorted value in the log file,
    <i>bigParentDirectory</i>/logs/log.txt .
   <p>Unsorted dimension values almost always indicate a problem with the source dataset.
   This most commonly occurs when a misnamed or inappropriate file is 
   included in the aggregation, which leads to an unsorted time dimension. 
   To solve this problem, see the error message in the 
   ERDDAP™ log.txt file to find the offending time value. 
   Then look in the source files to find the corresponding file 
   (or one before or one after) that doesn't belong in the aggregation.
<li><a class="selfLink" id="EDDGridFromDapSkeletonXML" href="#EDDGridFromDapSkeletonXML" rel="bookmark">The skeleton XML for an EDDGridFromDap dataset is:</a>

<pre>
&lt;dataset type="EDDGridFromDap" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt; 
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. 
    For EDDGridFromDap, this gets the remote .dds and then gets the new
    leftmost (first) dimension values. --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#axisVariable">&lt;axisVariable&gt;</a>...&lt;/axisVariable&gt; &lt;!-- 1 or more --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more --&gt;
&lt;/dataset&gt;
</pre>
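<p>As an illustration only, here is a sketch of how the skeleton might be filled in
for the example sourceUrl mentioned above. The datasetID, the sourceNames, 
the title, and the reload interval are all hypothetical; take the real names from 
the source's .dds and .das responses or from the GenerateDatasetsXml output.
<pre>
&lt;dataset type="EDDGridFromDap" datasetID="hypotheticalSst5day" active="true"&gt;
  &lt;sourceUrl&gt;https://thredds1.pfeg.noaa.gov/thredds/dodsC/satellite/BA/ssta/5day&lt;/sourceUrl&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;addAttributes&gt;
    &lt;att name="title"&gt;Hypothetical SST, 5 Day Composite&lt;/att&gt;
  &lt;/addAttributes&gt;
  &lt;!-- Hypothetical axis names; use the dimension names from the .dds. --&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;time&lt;/sourceName&gt;
  &lt;/axisVariable&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;lat&lt;/sourceName&gt;
    &lt;destinationName&gt;latitude&lt;/destinationName&gt;
  &lt;/axisVariable&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;lon&lt;/sourceName&gt;
    &lt;destinationName&gt;longitude&lt;/destinationName&gt;
  &lt;/axisVariable&gt;
  &lt;!-- Hypothetical data variable name. --&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;sourceSst&lt;/sourceName&gt;
    &lt;destinationName&gt;sst&lt;/destinationName&gt;
  &lt;/dataVariable&gt;
&lt;/dataset&gt;
</pre>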
&nbsp;
</ul>


<p><a class="selfLink" id="EDDGridFromEDDTable" href="#EDDGridFromEDDTable" rel="bookmark"><strong>EDDGridFromEDDTable</strong></a> 
   lets you convert an EDDTable tabular dataset into an EDDGrid gridded dataset.
   Remember that ERDDAP™ treats datasets as either
   <a rel="help" href="#dataStructures">gridded datasets (subclasses of EDDGrid) 
   or tabular datasets (subclasses of EDDTable)</a>.

<ul>
<li>Normally, if you have gridded data, you just set up an EDDGrid dataset directly.
  Sometimes this isn't possible, for example, when you have the data
  stored in a relational database that ERDDAP™ can only access via EDDTableFromDatabase. 
  The EDDGridFromEDDTable class lets you remedy that situation.
  <br>&nbsp;

<li>Obviously, the data in the underlying EDDTable dataset must be (basically) 
  gridded data, but
  in a tabular form. For example, the EDDTable dataset may have CTD data: measurements of
  eastward and northward current, at several depths, at several times. 
  Since the depths are the same at each time point, EDDGridFromEDDTable can
  create a gridded dataset with a time and a depth dimension which accesses the
  data via the underlying EDDTable dataset.
  <br>&nbsp;

<li>GenerateDatasetsXml -- We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can gather the information you need to improve the rough draft.
  <br>&nbsp;

<li>Source Attributes -- As with all other types of datasets, EDDGridFromEDDTable
  has the idea that there are global sourceAttributes 
  and <a rel="help" href="#globalAttributes">global addAttributes</a>
  (specified in datasets.xml),
  which are combined to make the global combinedAttributes, which are
  what users see. For global sourceAttributes, EDDGridFromEDDTable uses
  the global combinedAttributes of the underlying EDDTable dataset.
  (If you think about it for a minute, it makes sense.)

  <p>Similarly, for each axisVariable and dataVariable, 
  EDDGridFromEDDTable uses the variable's combinedAttributes from the underlying 
  EDDTable dataset as the EDDGridFromEDDTable variable's sourceAttributes.
  The variable's <a rel="help" href="#addAttributes">addAttributes</a> 
  (specified in datasets.xml) are then combined with those sourceAttributes.

  <p>As a consequence, if the EDDTable has good metadata, the EDDGridFromEDDTable
  often needs very little addAttributes metadata -- just a few tweaks here and there.

<li>dataVariables versus axisVariables -- The underlying EDDTable has only dataVariables.
  An EDDGridFromEDDTable dataset will have some axisVariables (created from 
  some of the EDDTable dataVariables) and some dataVariables (created from the
  remaining EDDTable dataVariables). 
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> 
  will make a guess 
  as to which EDDTable dataVariables should become EDDGridFromEDDTable axisVariables,
  but it is just a guess. You need to modify the output of GenerateDatasetsXml
  to specify which dataVariables will become axisVariables, and in which 
  order.
  <br>&nbsp;

<li>axisValues -- There is nothing about the underlying EDDTable to tell
  EDDGridFromEDDTable the possible values of the axisVariables in the gridded version
  of the dataset, so you MUST provide
  that information for each axisVariable via one of these attributes:
  <ul>
  <li><kbd>axisValues</kbd> -- lets you specify a list of values. For example,
    <br><kbd>&lt;att name="axisValues" 
      <a rel="help" href="#attributeType">type="doubleList"</a>&gt;2, 2.5, 3, 
      3.5, 4&lt;/att&gt;</kbd>
    <br>Note the use of a <a rel="help" href="#dataTypes">data type</a> plus the word <kbd>List</kbd>. 
      Also, the type of list (for example, double),
      MUST match the dataType of the variable in the EDDTable and EDDGridFromEDDTable
      datasets.
  <li><kbd>axisValuesStartStrideStop</kbd> -- lets you specify a sequence of
    regularly spaced values by specifying the start, stride, and stop values. 
    Here is an example that is equivalent to the <kbd>axisValues</kbd> example above:
    <br><kbd>&lt;att name="axisValuesStartStrideStop" 
      <a rel="help" href="#attributeType">type="doubleList"</a>&gt;2, 0.5, 4&lt;/att&gt;</kbd>
    <br>Again, note the use of a list data type.  Also, the type of list (for example, double),
      MUST match the dataType of the variable in the EDDTable and EDDGridFromEDDTable
      datasets.  
    <br>&nbsp;
  </ul>
  Updates -- Just as there is no way for EDDGridFromEDDTable to determine the axisValues
  from the EDDTable initially, there is also no reliable way for EDDGridFromEDDTable
  to determine from the EDDTable when the axisValues have changed
  (notably, when there are new values for the time variable). Currently, the only solution 
  is to change the axisValues attribute in datasets.xml and reload the dataset. 
  For example, you could write a script to
  <ol>
  <li>Search datasets.xml for 
    <br><kbd>datasetID="<i>theDatasetID</i>"</kbd> 
    <br>so you are working with the correct dataset.
  <li>Search datasets.xml for the next occurrence of 
    <br><kbd>&lt;sourceName&gt;<i>theVariablesSourceName</i>&lt;/sourceName&gt;</kbd>
    <br>so you are working with the correct variable.
  <li>Search datasets.xml for the next occurrence of 
    <br><kbd>&lt;att name="axisValuesStartStrideStop" type="doubleList"&gt;</kbd>
    <br>so you know the start position of the tag.
  <li>Search datasets.xml for the next occurrence of 
    <br><kbd>&lt;/att&gt;</kbd>
    <br>so you know the end position of the axis values.
  <li>Replace the old start, stride, stop values with the new values.
  <li>Contact the <a rel="help" 
    href="https://erddap.github.io/setup.html#setDatasetFlag">flag URL</a> for the dataset to tell ERDDAP™ to reload the dataset.
  </ol>
  This isn't ideal, but it works.
  <br>&nbsp;

<li>precision -- When EDDGridFromEDDTable responds to a user's request for data,
  it moves a row of data from the 
  EDDTable response table into the EDDGrid response grid.  To do this, 
  it has to figure out if the "axis" values on a given row in the table 
  match some combination
  of axis values in the grid. For integer data types, it is easy to determine if 
  two values are equal. But for floats and
  doubles, this brings up the horrible problem of floating point numbers 
  <a rel="help" href="https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/">not matching exactly<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
  (for example, 0.2 versus 0.199999999999996). 
  To (try to) deal with this, EDDGridFromEDDTable
  lets you specify a <kbd>precision</kbd> attribute for any of the axisVariables,
  which specifies the total number of decimal digits which must be identical.
  <ul>
  <li>For example, <kbd>&lt;att name="precision" type="int"&gt;5&lt;/att&gt;</kbd>
  <li>For different types of data variables, there are different default precision values.
    The defaults are usually appropriate.  
    If they aren't, you need to specify different values.
  <li>For axisVariables that are 
    <a rel="help" href="#timeStampVariable">time or timeStamp variables</a>, 
    the default is full precision (an exact match).
  <li>For axisVariables that are floats, the default precision is 5.
  <li>For axisVariables that are doubles, the default precision is 9.
  <li>For axisVariables that have integer data types, 
    EDDGridFromEDDTable ignores the <kbd>precision</kbd> attribute and
    always uses full precision (an exact match).
    <br>&nbsp;

  <li><strong>WARNING!</strong> When doing the conversion of a chunk of tabular data into
    a chunk of gridded data, 
    if EDDGridFromEDDTable can't match an EDDTable "axis" value
    to one of the expected EDDGridFromEDDTable axis values, EDDGridFromEDDTable
    silently (no error) throws away the data from that row of the table.
    For example, there may be other data (not on the grid) in the EDDTable dataset.
    (And if stride &gt; 1, it isn't obvious to EDDGridFromEDDTable which axis values are 
    desired values and which ones are to be skipped because of the stride.)
    So, if the precision values are too high,  
    the user will see missing values in the data response 
    when valid data values actually exist.

    <p>Conversely, if the precision values are set too low, EDDTable "axis" values
    that shouldn't match EDDGridFromEDDTable axis values will (erroneously) match.

    <p>These potential problems are horrible, because the user gets the wrong data
    (or missing values) when they should get the right data (or at least an error message).
    <br>This is not a flaw in EDDGridFromEDDTable. 
      EDDGridFromEDDTable can't solve this problem. 
      The problem is inherent in the conversion of tabular data into gridded data
      (unless other assumptions can be made, but they can't be made here).
    <br>It is up to you, the ERDDAP™ administrator, to 
      <strong>test your EDDGridFromEDDTable thoroughly</strong>
      to ensure that the precision values are set to avoid these potential problems. 
  </ul>
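  <p>As a sketch of how this might look in datasets.xml, here is a hypothetical 
  float axisVariable that overrides the default precision. The sourceName and 
  the values are made up for illustration, and the placement of the attributes 
  within the axisVariable's addAttributes is an assumption. The intent, 
  as described above, is that values such as 0.2 and 0.199999999999996 
  are treated as the same axis value.
<pre>
&lt;axisVariable&gt;
  &lt;sourceName&gt;depth&lt;/sourceName&gt; &lt;!-- hypothetical float variable --&gt;
  &lt;addAttributes&gt;
    &lt;!-- The list type (floatList) matches the variable's dataType (float). --&gt;
    &lt;att name="axisValuesStartStrideStop" type="floatList"&gt;0.2, 0.2, 1.0&lt;/att&gt;
    &lt;!-- Hypothetical setting: fewer digits must match than with the float default of 5. --&gt;
    &lt;att name="precision" type="int"&gt;4&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/axisVariable&gt;
</pre>
  After a change like this, test requests at and between the expected axis values 
  to check that no rows are dropped or matched erroneously.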

<li><a class="selfLink" id="egfetGapThreshold" href="#egfetGapThreshold" rel="bookmark">gapThreshold</a> -- 
  This is a very unusual type of dataset. 
  Since the types of queries that can be made to (handled by) an EDDGrid dataset
  (related to the ranges and strides of the axisVariables)
  are very different from the types of queries that can be made to (handled by) 
  an EDDTable dataset (just related to the ranges of some variables), 
  the performance of EDDGridFromEDDTable datasets 
  will vary greatly depending on the exact request which is made 
  and the speed of the underlying EDDTable dataset.
  For requests that have a stride value &gt; 1, EDDGridFromEDDTable 
  may ask the underlying EDDTable for a relatively big chunk of data (as if stride=1) 
  and then sift through the results, keeping the data from some rows and throwing away
  the data from others. If it has to sift through a lot of data to get
  the data it needs, the request will take longer to fill.

  <p>If EDDGridFromEDDTable can tell that there will be large gaps (with
  rows of unwanted data) between the rows with desired data,
  EDDGridFromEDDTable may choose to make several subrequests
  to the underlying EDDTable instead of one big request,
  thereby skipping the unwanted rows of data in the large gaps.
  The sensitivity for this decision is controlled by the gapThreshold value
  as specified in the &lt;gapThreshold&gt; tag (default=1000 rows of source data).
  Setting gapThreshold to a smaller number will lead to the dataset
  making (generally) more subrequests.
  Setting gapThreshold to a larger number will lead to the dataset
  making (generally) fewer subrequests.

  <p>If gapThreshold is set too small, EDDGridFromEDDTable will operate more
  slowly because the overhead of multiple requests will be greater than 
  the time saved by getting some excess data.
  If gapThreshold is set too big, EDDGridFromEDDTable will operate more slowly
  because so much excess data will be retrieved from the EDDTable, only to be discarded.
  (As Goldilocks discovered, the middle is "just right".)
  The overhead for different types of EDDTable datasets varies greatly,
  so the only way to know the actual best setting for your dataset is via experimentation.  
  But you won't go too far wrong sticking to the default.

  <p>A simple example: Imagine an EDDGridFromEDDTable with just one axisVariable 
  (time, with a size of 100000), one dataVariable (temperature), and the
  default gapThreshold of 1000.
  <ul>
  <li>If a user requests temperature[0:100:5000], 
    the stride is 100 so the gap size is 99, which 
    is less than the gapThreshold. 
    So EDDGridFromEDDTable will make just one request to EDDTable for
    all of the data needed for the request (equivalent to temperature[0:5000])
    and throw away all the rows of data it doesn't need.
  <li>If a user requests temperature[0:2500:5000], 
    that stride is 2500 so the gap size is 2499, which
    is greater than the gapThreshold. 
    So EDDGridFromEDDTable will make separate requests to 
    EDDTable which are equivalent to temperature[0], temperature[2500],
    temperature[5000]. 
  </ul>
  Calculation of the gap size is more complicated when there are multiple axes.
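  <p>To illustrate the same rule with a non-default setting, suppose (hypothetically) 
  that the one-axis dataset in the example above instead specified
<pre>
&lt;gapThreshold&gt;50&lt;/gapThreshold&gt;
</pre>
  Then a request for temperature[0:100:5000] would have a gap size of 99, 
  which is greater than 50. By the reasoning above, the dataset would presumably 
  make separate subrequests equivalent to temperature[0], temperature[100], ... 
  temperature[5000] (5000/100 + 1 = 51 subrequests) 
  instead of one request equivalent to temperature[0:5000].
  This is only a sketch of the single-axis case; the value 50 is an assumption 
  chosen to make the arithmetic easy to follow, not a recommended setting.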

  <p>For each user request, EDDGridFromEDDTable prints diagnostic messages 
  related to this in the
  <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt</a>
  file.
  <ul>
  <li>If <a rel="help" href="#logLevel"><kbd>&lt;logLevel&gt;</kbd></a> in datasets.xml
    is set to <kbd>info</kbd>, this prints a message like
    <br><kbd>* nOuterAxes=1 of 4 nOuterRequests=22</kbd>
    <br>If nOuterAxes=0, gapThreshold wasn't exceeded and only one request
      will be made to EDDTable.
    <br>If nOuterAxes&gt;0, gapThreshold was exceeded and nOuterRequests will be made
      to EDDTable, corresponding to each requested combination of the 
      leftmost nOuterAxes. For example, if the dataset has 4 axisVariables and
      dataVariables like
      eastward[time][latitude][longitude][depth], the leftmost (first) axis variable
      is time.
  <li>If <kbd>&lt;logLevel&gt;</kbd> in datasets.xml is set to <kbd>all</kbd>, 
    additional information is written to the log.txt file.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDGridFromEDDTableSkeletonXML" href="#EDDGridFromEDDTableSkeletonXML" rel="bookmark">The skeleton XML for an EDDGridFromEDDTable
dataset is:</a>

<pre>
&lt;dataset type="EDDGridFromEDDTable" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt; 
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. 
    For EDDGridFromEDDTable, this only works if the underlying EDDTable
    supports updateEveryNMillis. --&gt;
  <a rel="help" href="#egfetGapThreshold">&lt;gapThreshold&gt;</a>...&lt;/gapThreshold&gt; &lt;!-- 0 or 1. The default is 1000. --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#axisVariable">&lt;axisVariable&gt;</a>...&lt;/axisVariable&gt; &lt;!-- 1 or more --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- The underlying source EDDTable dataset. --&gt;
&lt;/dataset&gt;
</pre>
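<p>As an illustration only, here is a sketch of how the pieces described above 
might fit together for a dataset with a time axis and a depth axis. 
All datasetIDs, sourceNames, and axis values are hypothetical, 
the time values assume a double variable holding seconds, 
and the definition of the underlying EDDTable dataset is elided.
<pre>
&lt;dataset type="EDDGridFromEDDTable" datasetID="hypotheticalGridFromTable" active="true"&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;gapThreshold&gt;1000&lt;/gapThreshold&gt; &lt;!-- the default, shown explicitly --&gt;
  &lt;!-- The first axisVariable is the leftmost dimension. --&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;time&lt;/sourceName&gt;
    &lt;addAttributes&gt;
      &lt;!-- Hypothetical: 25 hourly values, 0 through 86400 seconds. --&gt;
      &lt;att name="axisValuesStartStrideStop" type="doubleList"&gt;0, 3600, 86400&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/axisVariable&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;depth&lt;/sourceName&gt;
    &lt;addAttributes&gt;
      &lt;att name="axisValues" type="doubleList"&gt;2, 2.5, 3, 3.5, 4&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/axisVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;eastward&lt;/sourceName&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;northward&lt;/sourceName&gt;
  &lt;/dataVariable&gt;
  &lt;!-- The underlying source EDDTable dataset (details elided). --&gt;
  &lt;dataset type="EDDTableFromDatabase" datasetID="hypotheticalSourceTable" active="true"&gt;
    ...
  &lt;/dataset&gt;
&lt;/dataset&gt;
</pre>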
&nbsp;
</ul>


<p><a class="selfLink" id="EDDGridFromErddap" href="#EDDGridFromErddap" rel="bookmark"><strong>EDDGridFromErddap</strong></a> handles gridded data from 
a remote ERDDAP™ server.
<br><a class="selfLink" id="EDDTableFromErddap" href="#EDDTableFromErddap" rel="bookmark"><strong>EDDTableFromErddap</strong></a> handles tabular data from 
a remote ERDDAP™ server.
<ul>
<li>EDDGridFromErddap and EDDTableFromErddap behave differently from all other 
  types of datasets in ERDDAP.
  <ul>
  <li>Like other types of datasets, these datasets get information about the 
    dataset from the source and keep it in memory.
  <li>Like other types of datasets, when ERDDAP™ searches for datasets, 
    displays the Data Access Form (<i>datasetID</i>.html), 
    or displays the Make A Graph form (<i>datasetID</i>.graph), 
    ERDDAP™ uses the information about the dataset which is in memory.
  <li>Unlike other types of datasets, when ERDDAP™ receives a request for data or 
    images from these datasets, ERDDAP
     <a rel="help" href="https://en.wikipedia.org/wiki/URL_redirection">redirects<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
     the request to the remote ERDDAP™ server. The result is:
     <ul>
    <li>This is very efficient (CPU, memory, and bandwidth), because otherwise 
      <ol>
      <li>The composite ERDDAP™ has to send the request to the other ERDDAP™ (which takes time).
      <li>The other ERDDAP™ has to get the data, reformat it, and transmit the data
        to the composite ERDDAP.
      <li>The composite ERDDAP™ has to receive the data (using bandwidth), 
        reformat it (using CPU and memory), and transmit the data to the user 
        (using bandwidth).
      </ol>
      By redirecting the request and allowing the other ERDDAP™ to send the
      response directly to the user, the composite ERDDAP™ spends essentially
      no CPU time, memory, or bandwidth on the request.
    <li>The redirect is transparent to the user regardless of the client software
      (a browser or any other software or command line tool).
     </ul>
  <li><a class="selfLink" id="redirect" href="#redirect" rel="bookmark">You can tell ERDDAP™</a>
    not to redirect any user requests by setting 
    <kbd>&lt;redirect&gt;false&lt;/redirect&gt;</kbd>,
    but this negates most of the advantages of the ...FromErddap dataset type 
    (notably, dispersing the load on the front end ERDDAP™ to the remote/backend ERDDAP).
    <br>&nbsp;
  </ul>
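  <p>For example, a sketch of an EDDGridFromErddap chunk with redirects turned off 
  might look like this. The datasetID and the remote URL are hypothetical, 
  and the position of the tag among the dataset's other tags is an assumption.
<pre>
&lt;dataset type="EDDGridFromErddap" datasetID="hypotheticalRemoteGrid" active="true"&gt;
  &lt;!-- Hypothetical remote dataset URL. --&gt;
  &lt;sourceUrl&gt;https://remote.example.org/erddap/griddap/hypotheticalRemoteGrid&lt;/sourceUrl&gt;
  &lt;!-- With this setting, data requests pass through this ERDDAP instead of being redirected. --&gt;
  &lt;redirect&gt;false&lt;/redirect&gt;
&lt;/dataset&gt;
</pre>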

<li>EDDGridFromErddap and EDDTableFromErddap are the basis for
  <a rel="help" href="https://erddap.github.io/setup.html#grids"
  >grids/clusters/federations</a> 
  of ERDDAPs, which efficiently distribute the CPU usage (mostly for making maps), memory usage, 
  dataset storage, and bandwidth usage of a large data center.
    <br>&nbsp;


<li><a class="selfLink" id="EDDFromErddapSubscriptions" href="#EDDFromErddapSubscriptions" rel="bookmark">Subscriptions</a> -- Normally, 
  when an EDDGridFromErddap or EDDTableFromErddap dataset is (re)loaded on your ERDDAP,
  it tries to add a subscription to the remote dataset via the remote ERDDAP's
  email/URL subscription system.
  That way, whenever the remote dataset changes, the remote ERDDAP™ contacts the 
     <a rel="help" href="https://erddap.github.io/setup.html#setDatasetFlag"
     >setDatasetFlag URL</a>
  on your ERDDAP™ so that the local dataset is reloaded ASAP and so that the 
  local dataset is always perfectly up-to-date and mimics the remote dataset.
  So, the first time this happens, you should get an email requesting that you 
  validate the subscription. However, if the local ERDDAP™ can't send an email 
  or if the remote ERDDAP's email/URL subscription
  system isn't active, you should email the remote ERDDAP™ administrator and 
  request that s/he manually add 
    <kbd><a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt;</kbd>
    tags to all of the relevant datasets to call your dataset's 
    <a rel="help" href="https://erddap.github.io/setup.html#setDatasetFlag"
    >setDatasetFlag URLs</a>. 
  See your ERDDAP™ daily report for a list of setDatasetFlag URLs, but just send the ones for
  EDDGridFromErddap and EDDTableFromErddap datasets to the remote ERDDAP™ administrator.
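  <p>As a minimal sketch of what the remote administrator would add
  (the host name <kbd>your.erddap.example.org</kbd>, the datasetID, and the flagKey
  shown here are hypothetical placeholders; use the exact setDatasetFlag URL from your daily report),
  each relevant dataset on the remote ERDDAP™ might get a tag like:
<pre>
&lt;!-- Hypothetical URL: copy the real one from the local ERDDAP's daily report. --&gt;
&lt;onChange&gt;https://your.erddap.example.org/erddap/setDatasetFlag.txt?datasetID=myLocalDatasetID&amp;amp;flagKey=1234567890&lt;/onChange&gt;
</pre>
  Because datasets.xml is an XML file, the <kbd>&amp;</kbd> in the URL is written as <kbd>&amp;amp;</kbd>.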
  <p>Is this not working? Are your local datasets not staying in sync with the remote datasets?
  <br>Several things must all work correctly for this system to work so that 
    your datasets stay up-to-date.
    Check each of these things in order:
  <ol>
  <li>Your ERDDAP™ must be able to send out emails. 
    See the email settings in your setup.xml.
  <li>In general (but not always), your ERDDAP's &lt;baseUrl&gt; and 
    &lt;baseHttpsUrl&gt; must not have a port number (e.g., :8080, :8443).
    If they do, use a 
    <a rel="help" href="https://erddap.github.io/setup.html#ProxyPass">proxypass</a> 
    to remove the port from the URL.
  <li>In your setup.xml, &lt;subscribeToRemoteErddapDataset&gt; must be set to
    <kbd>true</kbd>.
  <li>When your local EDD...FromErddap dataset is reloaded, 
    it should send a request to the remote
    ERDDAP™ to subscribe to the remote dataset. Look in log.txt to see if this
    is happening.
  <li>You should get an email asking you to validate the subscription request.
  <li>You must click on the link in that email to validate the subscription request.
  <li>The remote ERDDAP™ should say that the validation was successful.
    At any time, you can request an email from the remote ERDDAP™ with a list 
    of your pending and valid subscriptions. See the form at
    <i>remoteErddapBaseUrl</i>/erddap/subscriptions/list.html .    
  <li>When the remote dataset changes (e.g., gets additional data), 
    the remote ERDDAP™ should try to contact the flagURL on your ERDDAP.
    You can't check this, but you can ask the administrator of the remote
    ERDDAP™ to check this.
  <li>Your ERDDAP™ should receive a request to set that flagURL.
    Look in your log.txt for "setDatasetFlag.txt?" request(s) and
    see if there is an error message associated with the requests.
  <li>Your ERDDAP™ should then try to reload that dataset (perhaps not immediately,
    but ASAP).
    <br>&nbsp;
  </ol>

<li><a class="selfLink" id="upToDateMaxTime" href="#upToDateMaxTime"
      rel="bookmark">Up-to-date max(time)?</a>  
      <br>EDDGrid/TableFromErddap datasets
      only change their stored information 
      about each source dataset when the source dataset is 
      <a rel="help" href="#reloadEveryNMinutes">"reload"ed</a>       
      and some piece of metadata changes (e.g.,
      the time variable's actual_range), thereby generating a subscription notification.
      If the source dataset has data that changes frequently 
      (for example, new data every second) and uses the 
      <a rel="help" href="#updateEveryNMillis">"update"</a>       
      system to notice frequent changes to the underlying data, 
      the EDDGrid/TableFromErddap won't be notified about these frequent changes
      until the next dataset "reload", 
      so the EDDGrid/TableFromErddap won't be perfectly up-to-date. 
      You can minimize this problem by changing the
      source dataset's <kbd>&lt;reloadEveryNMinutes&gt;</kbd> to a smaller value
      (60? 15?) so that there are more subscription notifications to tell
      the EDDGrid/TableFromErddap to update its information about the source dataset.
    <p>Or, if your data management system knows when the source dataset has new data
      (e.g., via a script that copies a data file into place), and if that isn't 
      super frequent (e.g., every 5 minutes, or less frequent), there's a better solution:
      <ol>
      <li>Don't use &lt;updateEveryNMillis&gt; to keep the source dataset up-to-date.
      <li>Set the source dataset's &lt;reloadEveryNMinutes&gt; to a larger number (1440?).
      <li>Have the script contact the source dataset's 
        <a rel="help"
        href="https://erddap.github.io/setup.html#setDatasetFlag">flag URL</a>
        right after it copies a new data file into place.
        <br>&nbsp;
      </ol>
      That will lead to the source dataset being perfectly up-to-date 
      and cause it to generate a subscription notification, 
      which will be sent to the EDDGrid/TableFromErddap dataset.
      That will lead the EDDGrid/TableFromErddap dataset to be perfectly up-to-date 
      (well, within 5 seconds of new data being added).
      And all that will be done efficiently (without unnecessary dataset reloads).
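    <p>A minimal sketch of the source dataset's settings for this approach
      (the dataset type and datasetID are hypothetical placeholders, and the other
      required tags are omitted):
<pre>
&lt;dataset type="EDDGridFromNcFiles" datasetID="mySourceDataset" active="true"&gt;
  &lt;!-- Reload once per day as a fallback. The script's request to the
       flag URL triggers the timely reloads. --&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;!-- No updateEveryNMillis tag is used here. --&gt;
  ...
&lt;/dataset&gt;
</pre>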
    <br>&nbsp;


<li><a class="selfLink" id="EDDFromErddapNoAddAttributes" href="#EDDFromErddapNoAddAttributes" rel="bookmark">No &lt;addAttributes&gt;, &lt;axisVariable&gt;, or &lt;dataVariable&gt;</a> -
Unlike other types of datasets,
EDDTableFromErddap and EDDGridFromErddap datasets don't allow global 
&lt;addAttributes&gt;, &lt;axisVariable&gt;, or &lt;dataVariable&gt; 
sections in the datasets.xml for that dataset. The problem is that
allowing those would lead to inconsistencies:
<ol>
<li> Let's say it was allowed and you added a new global attribute.
<li> When a user asks your ERDDAP™ for the global attributes, the new attribute will appear.
<li> But when a user asks your ERDDAP™ for a data file, your ERDDAP™ redirects the request to the source ERDDAP. That ERDDAP™ is unaware of the new attribute. So if it creates a data file with metadata, e.g., a .nc file, the metadata won't have the new attribute.
</ol>

<p>There are two work-arounds:
  <ol>
  <li> Convince the admin of the source ERDDAP™ to make the changes that you want to the metadata.

  <li> Instead of EDDTableFromErddap, use 
    <a rel="help" href="#EDDTableFromDapSequence">EDDTableFromDapSequence</a>.
    Or instead of EDDGridFromErddap, use 
    <a rel="help" href="#EDDGridFromDap">EDDGridFromDap</a>.
    Those EDD types allow you to connect efficiently to a dataset on a remote ERDDAP™ (but without redirecting data requests) and they allow you to include global 
    &lt;addAttributes&gt;, &lt;axisVariable&gt;, or &lt;dataVariable&gt; 
    sections in the datasets.xml. 
    One other difference: you will need to manually subscribe to the remote dataset, 
    so that the dataset on your ERDDAP™ will be notified (via the 
    <a rel="help"
    href="https://erddap.github.io/setup.html#setDatasetFlag">flag URL</a>) 
    when there are changes to the remote dataset.
    Thus, you are creating a new dataset, instead of linking to a remote dataset. 
      <br>&nbsp;

  </ol>

<li>For security reasons, EDDGridFromErddap and EDDTableFromErddap don't support the 
  <a rel="help" href="#accessibleTo"><kbd>&lt;accessibleTo&gt;</kbd></a> tag
  and can't be used with remote datasets that
  require logging in
  (because they use <a rel="help" href="#accessibleTo"><kbd>&lt;accessibleTo&gt;</kbd></a>).
  See ERDDAP's 
    <a rel="help" href="https://erddap.github.io/setup.html"
    >security system</a>
    for restricting access to some datasets to some users.
  <br>&nbsp;

<li>Starting with ERDDAP™ v2.10, EDDGridFromErddap and EDDTableFromErddap support the 
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a> tag. 
  Unlike other types of datasets, the default is true, 
  but the dataset's files will be accessibleViaFiles only if the source dataset 
  also has &lt;accessibleViaFiles&gt; set to true.
  <br>&nbsp;

<li>You can use the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make the datasets.xml chunk for this type of dataset.
  But you can also write the XML for these types of datasets easily by hand.
  <br>&nbsp;
<li><a class="selfLink" id="EDDGridFromErddapSkeletonXML" href="#EDDGridFromErddapSkeletonXML" rel="bookmark"
  >The skeleton XML for an EDDGridFromErddap dataset is very simple,</a> 
  because the intent is just to mimic
  the remote dataset, which is already suitable for use in ERDDAP:
<pre>
&lt;dataset type="EDDGridFromErddap" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a>...&lt;/accessibleViaFiles&gt; &lt;!-- 0 or 1, default=true. --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1.  
    For EDDGridFromErddap, this gets the remote .dds and then gets
    the new leftmost (first) dimension values. --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#redirect">&lt;redirect&gt;</a>true(default)|false&lt;/redirect&gt; &lt;!-- 0 or 1 --&gt;

&lt;/dataset&gt;
</pre>
<li><a class="selfLink" id="EDDTableFromErddapSkeletonXML" href="#EDDTableFromErddapSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromErddap dataset 
  is very simple,</a> because the intent is just to mimic
  the remote dataset, which is already suitable for use in ERDDAP:
<pre>
&lt;dataset type="EDDTableFromErddap" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#redirect">&lt;redirect&gt;</a>true(default)|false&lt;/redirect&gt; &lt;!-- 0 or 1 --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
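<p>As a minimal sketch, filled-in versions with only the required tag might look like this
  (the datasetIDs and the remote host <kbd>remote.example.org</kbd> are hypothetical placeholders;
  the sourceUrl is assumed to be the remote dataset's griddap or tabledap URL, without a file extension):
<pre>
&lt;dataset type="EDDGridFromErddap" datasetID="exampleGrid" active="true"&gt;
  &lt;sourceUrl&gt;https://remote.example.org/erddap/griddap/exampleGrid&lt;/sourceUrl&gt;
&lt;/dataset&gt;

&lt;dataset type="EDDTableFromErddap" datasetID="exampleTable" active="true"&gt;
  &lt;sourceUrl&gt;https://remote.example.org/erddap/tabledap/exampleTable&lt;/sourceUrl&gt;
&lt;/dataset&gt;
</pre>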
</ul>


<p><a class="selfLink" id="EDDGridFromEtopo" href="#EDDGridFromEtopo" rel="bookmark"><strong>EDDGridFromEtopo</strong></a> just serves the 
<a rel="help" href="https://www.ngdc.noaa.gov/mgg/global/global.html"
  >ETOPO1 Global 1-Minute Gridded Elevation Data Set<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a> 
  (Ice Surface, grid registered, binary, 2byte int: etopo1_ice_g_i2.zip) which is 
  distributed with ERDDAP.
<ul>
<li>Only two datasetIDs are supported for EDDGridFromEtopo, so that you can
  access the data with longitude values -180 to 180, or longitude values 0 to 360. 
<li>There are never any sub tags, since the data is already described within ERDDAP. 
<li>So the two options for EDDGridFromEtopo datasets are (literally):
<pre>
  &lt;!-- etopo180 serves the data from longitude -180 to 180 --&gt;
  &lt;dataset type="EDDGridFromEtopo" datasetID="etopo180" /&gt; 
  &lt;!-- etopo360 serves the data from longitude 0 to 360 --&gt;
  &lt;dataset type="EDDGridFromEtopo" datasetID="etopo360" /&gt; 
</pre>
</ul>

<p><a class="selfLink" id="EDDGridFromFiles" href="#EDDGridFromFiles" rel="bookmark"><strong>EDDGridFromFiles</strong></a> is the superclass of all
EDDGridFrom...Files classes. 
You can't use EDDGridFromFiles directly. 
Instead, use a subclass of EDDGridFromFiles to handle the specific file type:
   <ul>
   <li><a rel="help" href="#EDDGridFromMergeIRFiles">EDDGridFromMergeIRFiles</a> 
     handles data from gridded 
      <a rel="help"
      href="https://www.cpc.ncep.noaa.gov/products/global_precip/html/README">MergeIR
      .gz<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> files.

   <li><a rel="help" href="#EDDGridFromAudioFiles">EDDGridFromAudioFiles</a> 
      aggregates data from a group of local audio files.

   <li><a rel="help" href="#EDDGridFromNcFiles">EDDGridFromNcFiles</a> handles data from gridded 
      <a rel="help" href="https://en.wikipedia.org/wiki/GRIB">GRIB .grb<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> files,
      <a rel="help" href="https://www.hdfgroup.org/">HDF (v4 or v5) .hdf<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> files,
      <a rel="help" href="#NcML">.ncml</a> files, and 
      <a rel="help" href="https://www.unidata.ucar.edu/software/netcdf/">NetCDF (v3 or v4) .nc<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> files.
      This may work with other file types (for example, BUFR); we just haven't tested it.
      Please send us some sample files if you are interested.

   <li><a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a> 
   is a variant of EDDGridFromNcFiles which handles data from gridded 
   NetCDF (v3 or v4) .nc and related files, 
   which ERDDAP™ unpacks at a low level.

   </ul>
Currently, no other file types are supported. 
But it is usually relatively easy to add support for other file types. 
Contact us if you have a request.
Or, if your data is in an old file format that you would like to move away from, 
we recommend converting the files to NetCDF v3 .nc files. 
NetCDF is a widely supported binary format that
allows fast random access to the data and is already supported by ERDDAP.

<p><a class="selfLink" id="EDDGridFromFiles_Details" href="#EDDGridFromFiles_Details" rel="bookmark">Details</a> -- 
The following information applies to all of the subclasses of EDDGridFromFiles.
<ul>
<li><a class="selfLink" id="EDDGridFromFiles_Aggregation" href="#EDDGridFromFiles_Aggregation" rel="bookmark"><strong>Aggregation of an Existing Dimension</strong></a> -- 
  <br>All variations of EDDGridFromFiles can 
  aggregate data from local files, where each file has 1 (or more)
  different values for the leftmost (first) dimension, usually [time], which will be aggregated. 
  For example, the dimensions might be [time][altitude][latitude][longitude], 
    and the files might have the data for one (or a few) time value(s) per file.
  The resulting dataset appears as if all of the files' data had been combined.
  The big advantages of aggregation are: 
    <ul>
    <li>The size of the aggregated dataset can be much larger than a single 
      file can conveniently be (~2GB).
    <li>For near-real-time data, it is easy to add a new file with the latest chunk of data.
       You don't have to rewrite the entire dataset.
    </ul>
  The requirements for aggregation are:
  <ul>
  <li>The local files needn't have the same dataVariables 
    (as defined in the dataset's datasets.xml).
    The dataset will have the dataVariables defined in datasets.xml.
    If a given file doesn't have a given dataVariable, ERDDAP™ will add
    missing values as needed.
  <li>All of the dataVariables MUST use the same axisVariables/dimensions 
    (as defined in the dataset's datasets.xml).
    The files will be aggregated based on the first (left-most) dimension, 
    sorted in ascending order.
  <li>Each file MAY have data for one or more values of the first dimension, 
    but there can't be any overlap between files.
    If a file has more than one value for the first dimension, 
    the values MUST be sorted in ascending order, with no ties.
  <li>All files MUST have exactly the same values for all of the other dimensions.
    The precision of the testing is determined by  
    <a rel="help" href="#matchAxisNDigits"><kbd>matchAxisNDigits</kbd></a>.
  <li>All files MUST have exactly the same <a rel="help" href="#units">units</a> 
    metadata for all axisVariables and dataVariables.
    If this is a problem, you may be able to use 
    <a rel="help" href="#NcML">NcML</a> or 
    <a rel="help" href="#NCO">NCO</a>
    to fix the problem.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDGridFromFiles_AggregationViaFileNames" href="#EDDGridFromFiles_AggregationViaFileNames" rel="bookmark"
><strong>Aggregation via File Names or Global Metadata</strong></a> -- 
  <br>All variations of EDDGridFromFiles can also aggregate a group of files 
    by adding a new leftmost (first)
    dimension, usually time, based on a value derived from each filename
    or from the value of a global attribute that is in each file.
    For example, the filename might include the time value for the data
    in the file. ERDDAP™ would then create a new time dimension.

  <p>Unlike the similar feature in THREDDS, ERDDAP™ always creates
    an axisVariable with numeric values (as required by CF), never String values 
    (which are not allowed by CF). 
    Also, ERDDAP™ will sort 
    the files in the aggregation based on the numeric axisVariable value which
    is assigned to each file, so that the axis variable will always have
    sorted values as required by CF.
    The THREDDS approach of doing a lexicographic sort based on the file names
    leads to aggregations where the axis values aren't sorted (which is not allowed by CF)
    when the file names sort differently than the derived axisVariable values.

  <p>To set up one of these aggregations in ERDDAP™, you will define a new leftmost (first) 
    <a rel="help" href="#axisVariable">axisVariable</a> with a special, pseudo
    <kbd>&lt;sourceName&gt;</kbd>, which tells ERDDAP™ where and how to find
    the value for the new dimension from each file. 
    <ul>
    <li>The format for the pseudo sourceName which gets the value from a filename (just filename.ext) is
      <br><kbd>***fileName,<i><a rel="help" href="#dataTypes">dataType</a></i>,<i>extractRegex</i>,<i>captureGroupNumber</i></kbd>
    <li>The format for the pseudo sourceName which gets the value from a file's absolute path name is
      <br><kbd>***pathName,<i><a rel="help" href="#dataTypes">dataType</a></i>,<i>extractRegex</i>,<i>captureGroupNumber</i></kbd>
      <br>[For this, the path name always uses '/' as the directory separator character, never '\'.]
    <li>The format for the pseudo sourceName which gets the value from a global attribute is
      <br><kbd>***global:<i>attributeName</i>,<i><a rel="help" href="#dataTypes">dataType</a></i>,<i>extractRegex</i>,<i>captureGroupNumber</i></kbd>
    <li>This pseudo sourceName option works differently from the others: instead of creating a 
      new leftmost (first) axisVariable, this replaces the value of the current axisVariable
      with a value extracted from the filename (just filename.ext). The format is
      <br><kbd>***replaceFromFileName,<i><a rel="help" href="#dataTypes">dataType</a></i>,<i>extractRegex</i>,<i>captureGroupNumber</i></kbd>
      <br>&nbsp;
    </ul>
    The descriptions of the parts you need to provide are:
    <ul>
    <li><i>attributeName</i> -- the name of the global attribute which
      is in each file and which contains the dimension value.
    <li><i>dataType</i> -- This specifies the data type
      that will be used to store the values.  See the standard list of 
      <a rel="help" href="#dataTypes">dataTypes</a> that ERDDAP™ supports,
      except that <kbd>String</kbd> is not allowed here since
      axis variables in ERDDAP™ can't be String variables.

      <p>There is an additional pseudo dataType, <kbd>timeFormat=<i>stringTimeFormat</i></kbd>, 
      which tells ERDDAP™ that the value is a String timeStamp with
      <a rel="help" href="#stringTimeUnits">units suitable for string times</a>.
      In most cases, the stringTimeFormat you need will be a variation of one of these 
      formats:
      <ul>
      <li><kbd>yyyy-MM-dd'T'HH:mm:ss.SSSZ</kbd> -- which is the ISO 8601:2004(E)
        date time format. You may need a shortened version of this, e.g.,
        yyyy-MM-dd'T'HH:mm:ss or yyyy-MM-dd.
      <li><kbd>yyyyMMddHHmmss.SSS</kbd> -- which is the compact version of the ISO 8601
        date time format. You may need a shortened version of this, e.g.,
        yyyyMMddHHmmss or yyyyMMdd.
      <li><kbd>M/d/yyyy H:mm:ss.SSS</kbd> -- which is the U.S. slash date format. 
        You may need a shortened version of this, e.g., M/d/yyyy .
      <li><kbd>yyyyDDDHHmmssSSS</kbd> -- which is the year plus the zero-padded day of the year 
        (e.g., 001 = Jan 1, 365 = Dec 31 in a non-leap year; this is sometimes 
        erroneously called the Julian date).
        You may need a shortened version of this, e.g., yyyyDDD .
      </ul>
      If you use this pseudo dataType, add this to the new variable's &lt;addAttributes&gt;:
      <br><kbd>&lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;</kbd>
      <br>If you want to shift all of the time values, shift the time value in <kbd>units</kbd>, e.g.,
      <br><kbd>1970-01-01T12:00:00Z</kbd>.


    <li><i>extractRegex</i> -- This is the 
      <a rel="help"
      href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
      >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
      (<a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
      >tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>)
      which includes a capture group (in parentheses) which 
      describes how to extract the value from the filename or global attribute value.
      For example, given a filename like S19980011998031.L3b_MO_CHL.nc,
      capture group #1, "\d{7}", in the regular expression 
      <kbd>S(\d{7})\d{7}\.L3b.*</kbd>
      will capture the first 7 digits after 'S': 1998001.

    <li><i>captureGroupNumber</i> -- This is the number of the capture group
      (within a pair of parentheses) in the regular expression 
      which contains the information of interest.
      It is usually 1, the first capture group. Sometimes you need to 
      use capture groups for other purposes in the regex, so then the
      important capture group number will be 2 (the second capture group) 
      or 3 (the third), etc.
    </ul>

    <p>A full example of an axisVariable which 
    makes an aggregated dataset with a new time axis 
    which gets the time values from the filename of each file is
    <pre>
  &lt;axisVariable&gt;
    &lt;sourceName&gt;***fileName,timeFormat=yyyyDDD,S(\d{7})\.L3m.*,1&lt;/sourceName&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
  &lt;/axisVariable&gt;</pre>
    When you use the "timeFormat=" pseudo dataType, ERDDAP™ will add 2 attributes
    to the axisVariable so that they appear to be coming from the source:<kbd>
    <br>&lt;att name="standard_name"&gt;time&lt;/att&gt;
    <br>&lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;</kbd>
    <br>So in this case, ERDDAP™ will create a new axis named "time"
    with double values (seconds since 1970-01-01T00:00:00Z) by extracting the 7 digits 
    after 'S' and before ".L3m" in the filename
    and interpreting those as time values formatted as yyyyDDD.

    <p>You can override the default base time (1970-01-01T00:00:00Z) by adding an 
    <a rel="help" href="#addAttributes">addAttribute</a> which 
    specifies a different units attribute with a different base time.
    A common situation is: there are groups of data files,
    each with a 1 day composite of a satellite dataset,
    where you want the time value to be noon of the day mentioned in the filename
    (the centered time of each day) and want the variable's long_name to be 
    "Centered Time". An example which does this is: 
    <pre>
  &lt;axisVariable&gt;
    &lt;sourceName&gt;***fileName,timeFormat=yyyyDDD,S(\d{7})\.L3m.*,1&lt;/sourceName&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;addAttributes&gt;
      &lt;att name="long_name"&gt;Centered Time&lt;/att&gt;
      &lt;att name="units"&gt;seconds since 1970-01-01T12:00:00Z&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/axisVariable&gt;</pre>
    Note hours=12 in the base time, which adds 12 hours relative to the 
    original base time of 1970-01-01T00:00:00Z.

    <p>A full example of an axisVariable which 
    makes an aggregated dataset with a new "run" axis (with int values)
    which gets the run values from the "runID" global attribute
    in each file (with values like "r17_global", where 17 is the run number) is
    <pre>
  &lt;axisVariable&gt; 
    &lt;sourceName&gt;***global:runID,int,(r|s)(\d+)_global,2&lt;/sourceName&gt;
    &lt;destinationName&gt;run&lt;/destinationName&gt;
    &lt;addAttributes&gt;
      &lt;att name="ioos_category"&gt;Other&lt;/att&gt;
      &lt;att name="units"&gt;count&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/axisVariable&gt;</pre>
    Note the use of the capture group number 2 to capture the digits which
    occur after 'r' or 's', and before "_global".
    This example also shows how to add additional attributes 
    (e.g., ioos_category and units) to the axis variable.
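    <p>The <kbd>***pathName</kbd> format follows the same pattern.
    As a sketch, assuming a hypothetical directory layout with one file per run directory
    (e.g., /data/model/run17/output.nc), an axisVariable like this would create
    a new "run" axis with int values taken from the directory name:
    <pre>
  &lt;axisVariable&gt;
    &lt;sourceName&gt;***pathName,int,.*/run(\d+)/[^/]*\.nc,1&lt;/sourceName&gt;
    &lt;destinationName&gt;run&lt;/destinationName&gt;
    &lt;addAttributes&gt;
      &lt;att name="ioos_category"&gt;Other&lt;/att&gt;
      &lt;att name="units"&gt;count&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/axisVariable&gt;</pre>
    The regular expression here is written to match the whole path name, using '/' as
    the directory separator, so that capture group 1 holds only the digits after "run".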
    <br>&nbsp;

<li><a class="selfLink" id="ExternallyCompressedFiles" href="#ExternallyCompressedFiles" rel="bookmark"
><strong>Externally Compressed Files</strong></a>
  <ul>
  <li>Datasets that are subclasses of EDDGridFromFiles and EDDTableFromFiles
    can serve data directly from externally compressed data files,
    including .tgz, .tar.gz, .tar.gzip, .gz, .gzip, .zip, .bz2, and .Z files.
    <br>&nbsp;
  <li><strong>This works surprisingly well!</strong> 
    <br>In most cases, the slowdown related to decompressing small and 
    medium-sized data files is minor.
    If you need to conserve disk space, 
    we strongly encourage using this feature, especially for older files
    that are rarely accessed.
    <br>&nbsp;
  <li><strong>Save money!</strong>
    <br>This is one of the few features in ERDDAP™ that offers you 
    a chance to save a lot of money (although at the cost of slightly decreased 
    performance). If the compression ratio is e.g., 6:1 (sometimes it will be much higher),
    then the dataset's data files will only need 1/6 the disk space. 
    Then perhaps you can get by with 1 RAID (of a given size) instead of 6 RAIDs (of the same size).
    That is a huge cost savings. Hopefully, the ability to compress some files
    in a collection (the older ones?) and not compress others (the newer ones?),
    and to change that at any time,
    lets you minimize the downside to compressing some of the files (slower access).
    And if the choice is between storing the files on tape (and only accessible 
    upon request, after a delay)
    vs storing them compressed on a RAID (and accessible via ERDDAP),
    then there is a huge advantage to using compression so that users get interactive
    and (relatively) quick access to the data.
    And if this can save you from purchasing an additional RAID, 
    this feature can save you about $30,000.
    <br>&nbsp;
  <li>For all EDDGridFromFiles subclasses, if the data files have an extension
    indicating that they are externally compressed files
    (currently: .tgz, .tar.gz, .tar.gzip, .gz, .gzip, .zip, .bz2, or .Z),
    ERDDAP™ will decompress the files to the dataset's cache directory when it reads them
    (if they aren't already in the cache).
    The same is true for binary file (e.g., .nc) subclasses of EDDTableFromFiles.
    <br>&nbsp;
  <li>For EDDTableFromFiles subclasses for non-binary files (e.g., .csv),
    data files with an extension
    indicating that they are externally compressed files
    will be decompressed on-the-fly as the file is read.
    <br>&nbsp;
  <li>REQUIREMENT: If the type of externally compressed file used (e.g., .tgz or .zip) 
    supports more than 1 file inside the compressed file,
    the compressed file must contain just 1 file.
    <br>&nbsp;
  <li>REQUIREMENT: This feature assumes that the contents of the externally compressed files 
    don't change, so that a cached decompressed file can be reused. If some or all 
    of a dataset's data files are sometimes changed, don't compress those files.
    This is consistent with common usage, since people don't normally compress files that they 
    sometimes need to change.
    <br>&nbsp;
  <li>&lt;fileNameRegex&gt; --
    To make this work, the dataset's &lt;fileNameRegex&gt; must match the compressed files' names.
    Obviously, regexes like .* will match all file names.
    If you specify a specific file type, e.g., .*\.nc, then you need to modify the regex
    to include the compression extension too, e.g., .*\.nc\.gz 
    (if all of the files will be <i>something</i>.nc.gz files) .
    <br>&nbsp;
  <li>It is fine if your dataset includes a mix of compressed and not compressed
    files.
    This may be useful if you believe that some files (e.g., older files)
    will be used less often and therefore it would be useful to save disk space
    by compressing them. To make this work, the &lt;fileNameRegex&gt; must match 
    the compressed and not compressed files' names, e.g., .* or .*\.nc(|\.gz) 
    (where the capture group at the end specifies that .gz is optional).
    <br>&nbsp;
  <li>It is fine if you compress or decompress specific files in the collection 
    at any time.
    <br>If the dataset doesn't use 
    <a rel="help" href="#updateEveryNMillis"><kbd>&lt;updateEveryNMillis&gt;</kbd></a>,
    set the dataset's 
    <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a>
    to tell ERDDAP™ to reload the dataset and thus notice the changes.
    Interestingly, you could use different compression algorithms and settings for different files
    in the same dataset (e.g., .bz2 for rarely used files, .gz for not often used files, and
    no compression for frequently used files), just be sure that the regex supports
    all of the file extensions that are in use, e.g., .*\.nc(|\.gz|\.bz2) .
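    <p>As a sketch, a dataset whose files may be uncompressed .nc, .nc.gz, or .nc.bz2 files
    could use the following tag in its datasets.xml chunk
    (the choice of these three extensions is just an illustration):
<pre>
&lt;!-- Matches something.nc, something.nc.gz, and something.nc.bz2 --&gt;
&lt;fileNameRegex&gt;.*\.nc(|\.gz|\.bz2)&lt;/fileNameRegex&gt;
</pre>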
    <br>&nbsp;
  <li>Of course, compression ratios and speeds for the different compression algorithms 
    vary with the source file and the settings (e.g., compression level).
    If you want to optimize this system for your files, do a test of the different
    compression methods with your files and with a range of compression settings.
    If you want a reliably good (not necessarily the best) setup,
    we mildly recommend gzip (.gz).
    gzip doesn't make the smallest compressed file (it's reasonably close),
    but it compresses the file very quickly
    and (more important for ERDDAP™ users) decompresses the file very quickly.
    Plus, gzip software comes standard with every Linux and Mac OS installation
    and is readily available for Windows via free tools like 7-Zip and Linux-like
    add-ons such as Git Bash.
    For example, to compress a source file into the .gz version of the file 
    (same filename, but with .gz appended), use (in Linux, Mac OS, and Git Bash)
    <br><kbd>gzip <i>sourceName</i></kbd>
    <br>To decompress a .gz file back to the original, use 
    <br><kbd>gunzip <i>sourceName.gz</i></kbd>
    <br>To compress each of the source files in a directory and its subdirectories, recursively, use
    <br><kbd>gzip -r <i>directoryName</i></kbd>
    <br>To decompress each of the .gz files in a directory and its subdirectories, recursively, use
    <br><kbd>gunzip -r <i>directoryName</i></kbd>
    <br>&nbsp;
  <li>WARNING: Don't externally compress (gzip) files that are already internally compressed!
    <br>Many files already have compressed data internally. If you gzip these
    files, the resulting files won't be much smaller (&lt; 5%) and ERDDAP™ will waste time
    decompressing them when it needs to read them. For example:
    <ul>
    <li>data files:  e.g., .nc4 and .hdf5 files: Some files use internal compression; some don't. 
       How to tell: compressed variables have "_ChunkSize" attributes.
       Also, if a group of gridded .nc or .hdf files are all different sizes, 
         they are likely internally compressed. 
         If they are all the same size, they are not internally compressed.
    <li>image files: e.g., .gif, .jpg, and .png
    <li>audio files: e.g., .mp3, and .ogg.  
    <li>video files: e.g., .mp4, .ogv, and .webm.
    </ul>
    <br>One unfortunate odd case: .wav audio files are huge and not internally compressed.
    It would be nice to compress (gzip) them, but generally you shouldn't because
    if you do, users won't be able to play the compressed files in their browser.
    <br>&nbsp;
  <li>Test Case: compressing (with gzip) a dataset with 1523 gridded .nc files.
    <ul>
    <li>The data in the source files was sparse (lots of missing values).
    <li>Total disk space went from 57 GB before compression to 7 GB after.
    <li>A request for lots of data from 1 time point is &lt; 1 s before and after compression.
    <li>A request for 1 data point for 365 time points (the worst case situation) went from 4 s to 71 s.
    <br>&nbsp;
    </ul>
    To me that is a reasonable trade-off for any dataset, and certainly for 
    datasets that are infrequently used.
    <br>&nbsp;

  <li>Internal versus External Compression --
    <br>Compared to the internal file compression offered by .nc4 and .hdf5 files,
    ERDDAP's approach for externally compressed binary files has advantages and disadvantages.
    The disadvantage is: for a one time read of a small part of one file, 
    internal compression is better because EDDGridFromFiles only needs to decompress
    a few chunk(s) of the file, not the entire file.
    But ERDDAP's approach has some advantages:
    <ul>
    <li>ERDDAP™ supports compression of all types of data files
      (binary and non-binary, e.g., .nc3 and .csv)
      not just .nc4 and .hdf5.
    <li>If the bulk of a file needs to be read more than once in a short period of time,
      then it saves time to decompress the file once and read it many times.
      This happens in ERDDAP™ when a user uses Make-A-Graph for the dataset
      and makes a series of small changes to the graph.
    <li>The ability to have compressed and uncompressed files in the same
      collection gives
      you more control over which files are compressed and which aren't. 
      And this added control comes without really modifying the source file
      (since you can compress a file with e.g., .gz and then decompress it 
      to get the original file). 
    <li>The ability to change at any time whether a given file is compressed
      and how it is compressed (different algorithms and settings)
      gives you more control over the performance of the system.
      And you can easily recover the original uncompressed file at any time.
    </ul>
    While neither approach is a winner in all situations, it is clear that ERDDAP's ability 
    to serve data from externally compressed files makes external compression
    a reasonable alternative to the internal compression offered by .nc4 and .hdf5.
    That is significant given that internal compression is one of the main
    reasons people choose to use .nc4 and .hdf5.
    <br>&nbsp;
  <li><a class="selfLink" id="decompressedCacheMaxGB" href="#decompressedCacheMaxGB" rel="bookmark">ERDDAP™</a>
     <a class="selfLink" id="decompressedCacheMaxMinutesOld" href="#decompressedCacheMaxMinutesOld" rel="bookmark">makes</a>
    a decompressed version of any compressed binary (e.g., .nc) data file when it needs
    to read the file. The decompressed files are kept in the dataset's directory
    within <i>bigParentDirectory</i>/decompressed/ .
    Decompressed files which haven't been used recently will be deleted
    to free up space when the cumulative file size is &gt;10GB.
    You can change that by setting &lt;decompressedCacheMaxGB&gt; (default=10) in datasets.xml, e.g.,
    <br><kbd>&lt;decompressedCacheMaxGB&gt;40&lt;/decompressedCacheMaxGB&gt;</kbd>
    <br>Also, decompressed files that haven't been used in the last 15 minutes
    will be deleted at the start of each major dataset reload.
    You can change that by setting &lt;decompressedCacheMaxMinutesOld&gt; (default=15) in datasets.xml, e.g.,
    <br><kbd>&lt;decompressedCacheMaxMinutesOld&gt;60&lt;/decompressedCacheMaxMinutesOld&gt;</kbd>
    <br>Larger numbers are nice, but the cumulative size of the decompressed files
    may cause <i>bigParentDirectory</i> to run out of disk space, which causes severe problems.
    <br>&nbsp;
  <li>Because decompressing a file can take a significant amount of time (0.1 to 10 seconds), 
    datasets with compressed files may benefit from setting the dataset's
    <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>
    setting to a higher number (2? 3? 4?). The downsides to even higher numbers (e.g., 5? 6? 7?) 
    are diminishing returns and that one user's request can then use a high percentage 
    of the system's resources, thus significantly slowing down the processing of 
    other users' requests. 
    Thus, there is no ideal nThreads setting, just different consequences in different 
    situations with different settings.
    <br>&nbsp;
   </ul>
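   <p>As a rough illustration of the compression-related settings discussed above, 
   the fragment below sketches how they might fit together in datasets.xml. 
   The datasetID, directory, and numeric values are hypothetical. 
   The placement of the two decompressedCache tags directly inside 
   &lt;erddapDatasets&gt; (rather than inside a dataset) is an assumption about 
   the intended usage, so check it against your installation.
<pre>
&lt;erddapDatasets&gt;
  &lt;!-- Hypothetical values: allow up to 40 GB of decompressed files 
       and keep unused decompressed files for up to 60 minutes. --&gt;
  &lt;decompressedCacheMaxGB&gt;40&lt;/decompressedCacheMaxGB&gt;
  &lt;decompressedCacheMaxMinutesOld&gt;60&lt;/decompressedCacheMaxMinutesOld&gt;

  &lt;dataset type="EDDGridFromNcFiles" datasetID="exampleCompressedSst" active="true"&gt;
    &lt;!-- Extra threads may hide some of the decompression time. --&gt;
    &lt;nThreads&gt;3&lt;/nThreads&gt;
    &lt;fileDir&gt;/data/exampleSst/&lt;/fileDir&gt;
    &lt;recursive&gt;true&lt;/recursive&gt;
    &lt;!-- Matches <i>something</i>.nc, <i>something</i>.nc.gz, and <i>something</i>.nc.bz2 --&gt;
    &lt;fileNameRegex&gt;.*\.nc(|\.gz|\.bz2)&lt;/fileNameRegex&gt;
    ...
  &lt;/dataset&gt;
&lt;/erddapDatasets&gt;
</pre>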

<li><a class="selfLink" id="EDDGridFromFiles_SortedDimensionValues" href="#EDDGridFromFiles_SortedDimensionValues" rel="bookmark"><strong>Sorted Dimension Values</strong></a> -
   The values for each dimension MUST be in sorted order (ascending or descending,
   except for the first (left-most) dimension which must be ascending). 
   The values can be irregularly spaced. There can't be any ties.
   This is a requirement of the
   <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
    >CF metadata standard<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    If any dimension's values aren't in sorted order, the dataset won't be loaded
    and ERDDAP™ will identify the first unsorted value in the log file,
    <i>bigParentDirectory</i>/logs/log.txt .
   <p>Unsorted dimension values almost always indicate a problem with the source dataset.
   This most commonly occurs when a misnamed or inappropriate file is 
   included in the aggregation, which leads to an unsorted time dimension. 
   To solve this problem, see the error message in the 
   ERDDAP™ log.txt file to find the offending time value. 
   Then look in the source files to find the corresponding file 
   (or one before or one after) that doesn't belong in the aggregation.

<li><a class="selfLink" id="EDDGridFromFiles_Directories" href="#EDDGridFromFiles_Directories" rel="bookmark"><strong>Directories</strong></a> -- 
  The files MAY be in one directory, or in a directory
  and its subdirectories (recursively).
  If there are a large number of files (for example, &gt;1,000), the operating system 
  (and thus EDDGridFromFiles) will operate much more efficiently if you store the files 
  in a series of
  subdirectories (one per year, or one per month for datasets with very frequent files),
  so that there are never a huge number of files in a given directory.
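  <p>For instance, a minimal sketch (the directory names are hypothetical) for files 
  stored in one subdirectory per year, e.g., /data/exampleSst/2019/ and 
  /data/exampleSst/2020/, might use:
<pre>
&lt;fileDir&gt;/data/exampleSst/&lt;/fileDir&gt;
&lt;recursive&gt;true&lt;/recursive&gt;          &lt;!-- also look in the yearly subdirectories --&gt;
&lt;pathRegex&gt;.*&lt;/pathRegex&gt;            &lt;!-- the default: accept every subdirectory --&gt;
&lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
</pre>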
  <br>&nbsp;

<li><strong><kbd>&lt;cacheFromUrl&gt;</kbd></strong> -
    <br>All EDDGridFromFiles and all EDDTableFromFiles datasets support a
    set of tags which tell ERDDAP™ to download and maintain a copy of 
    all of a remote dataset's files, or a cache of a few files (downloaded as needed).
    This can be incredibly useful.
    See the <a rel="help" href="#cacheFromUrl">cacheFromUrl documentation</a>.
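    <p>A minimal sketch of what those tags might look like (the URL, the directory, 
    and the cache size are hypothetical; see the cacheFromUrl documentation 
    for the exact behavior of each tag):
<pre>
&lt;cacheFromUrl&gt;https://data.example.org/sst/daily/&lt;/cacheFromUrl&gt;
&lt;!-- Hypothetical limit. Our assumption is that specifying a size asks ERDDAP™ 
     to keep a cache of recently needed files instead of a copy of every remote file. --&gt;
&lt;cacheSizeGB&gt;20&lt;/cacheSizeGB&gt;
&lt;!-- Assumed to be the local directory that receives the downloaded files. --&gt;
&lt;fileDir&gt;/data/exampleSstCache/&lt;/fileDir&gt;
</pre>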
  <br>&nbsp;

<li><a class="selfLink" id="remoteDirectories" href="#remoteDirectories" rel="bookmark"
  ><strong>Remote Directories and 
  HTTP Range Requests</strong></a> -
  <br>(AKA Byte Serving, Byte Range Requests, Accept-Ranges http header) 
  <br>EDDGridFromNcFiles, EDDTableFromMultidimNcFiles, EDDTableFromNcFiles, 
  and EDDTableFromNcCFFiles,  
  can <i>sometimes</i> serve data from .nc files on remote servers and accessed via HTTP
  if the server supports
  <a rel="help" href="https://en.wikipedia.org/wiki/Byte_serving"
    >Byte Serving<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
  via HTTP range requests (the HTTP mechanism for byte serving).
  This is possible because netcdf-java (which ERDDAP™ uses to read .nc files)
  supports reading data from remote .nc files via HTTP range requests.

  <p><strong>Don't do this!</strong> It is horribly inefficient and slow.
  <br>Instead, use the 
    <a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt; system</a>.

  <p><a class="selfLink" id="accessingErddapFiles" href="#accessingErddapFiles" rel="bookmark"
  >Accessing</a> ERDDAP™ datasets as files via byte range requests -- 
  <br>Flipping this around, given that you can (in theory) think of a dataset 
  in ERDDAP™ as a giant .nc file by appending ".nc" to the base OPeNDAP URL
  for a given dataset (e.g., https://myserver.org/erddap/griddap/datasetID.nc
  and also by adding a ?query after that to specify a subset),
  it is perhaps reasonable to ask whether you can use netcdf-java, Ferret, or some other 
  NetCDF client software to read data via HTTP Range Requests from ERDDAP.
  The answer is no, because there isn't really a huge ".nc" file.
  If you want to do this, instead do one of these options:
  <ul>
  <li>Use (OPeN)DAP client software to connect to the griddap services
    offered by ERDDAP. That is what DAP (and thus ERDDAP) was designed for.
    It is very efficient.
  <li>Or, download the source file(s) from the "files" system
    (or a subset file via a .nc?query) to your computer and use netcdf-java, Ferret, 
    or some other NetCDF client software to read the (now) local file(s).
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDGridFromFiles_CachedFileInformation" href="#EDDGridFromFiles_CachedFileInformation" rel="bookmark"
><strong>Cached File Information</strong></a> -- 
  When an EDDGridFromFiles dataset is first loaded, 
  EDDGridFromFiles reads information from all of the relevant files 
  and creates tables (one row for each file) with information about each valid file 
  and each "bad" (different or invalid) file. 
  <ul>
  <li>The tables are also stored on disk, as NetCDF v3 .nc files in <i>bigParentDirectory</i>/dataset/<i>last2CharsOfDatasetID</i>/<i>datasetID</i>/ in
    files named:
    <br>&nbsp;&nbsp;dirTable.nc (which holds a list of unique directory names),
    <br>&nbsp;&nbsp;fileTable.nc (which holds the table with each valid file's information),
    <br>&nbsp;&nbsp;badFiles.nc (which holds the table with each bad file's information).
  <li>To speed up access to an EDDGridFromFiles dataset (but at the expense of using more memory),
    you can use
    <br><a rel="help" href="#fileTableInMemory"><kbd>&lt;fileTableInMemory&gt;true&lt;/fileTableInMemory&gt;</kbd></a>
    <br>to tell ERDDAP™ to keep a copy of the file information tables in memory.
  <li>The copy of the file information tables on disk is also useful 
    when ERDDAP™ is shut down and restarted: 
    it saves EDDGridFromFiles from having to re-read all of the data files.
  <li>When a dataset is reloaded, ERDDAP™ only needs to read the data in new files
    and files that have changed.
  <li>If a file has a different structure from the other files 
    (for example, a different data type
    for one of the variables, or a different value for the 
    "<a rel="help" href="#units">units</a>" attribute), ERDDAP
    adds the file to the list of "bad" files. Information about the problem with the file
    will be written to the <i>bigParentDirectory</i>/logs/log.txt file.    
  <li>You shouldn't ever need to delete or work with these files.
    One exception is: if you are still making changes to a dataset's datasets.xml setup,
    you may want to delete these files to force ERDDAP™ to reread all of the files
    since the files will be read/interpreted differently.
    If you ever do need to delete these files, you can do it when ERDDAP™ is running. 
    (Then set a 
    <a rel="help"
    href="https://erddap.github.io/setup.html#setDatasetFlag">flag</a>
    to reload the dataset ASAP.)
    However, ERDDAP™ usually notices that the datasets.xml information doesn't match
    the fileTable information and deletes the file tables automatically.
  <li>If you want to encourage ERDDAP™ to update the stored dataset information 
    (for example, if you just added, removed, or changed some files in the dataset's data directory), 
    use the 
      <a rel="help" href="https://erddap.github.io/setup.html#flag">flag system</a>
      to force ERDDAP™ to update the cached file information.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDGridFromFiles_HandlingRequests" href="#EDDGridFromFiles_HandlingRequests" rel="bookmark"
><strong>Handling Requests</strong></a> -- 
    When a client's request for data is processed, EDDGridFromFiles can quickly look 
  in the table with the valid file information to see which files have the requested data.
  <br>&nbsp;

<li><a class="selfLink" id="EDDGridFromFiles_Updating" href="#EDDGridFromFiles_Updating" rel="bookmark"
  ><strong>Updating the Cached File Information</strong></a> -- 
  Whenever the dataset is reloaded,
  the cached file information is updated.
  <ul>
    <li>The dataset is reloaded periodically as determined by the 
      <kbd>&lt;reloadEveryNMinutes&gt;</kbd> 
      in the dataset's information in datasets.xml.
    <li>The dataset is reloaded as soon as possible whenever ERDDAP™ detects
      that you have added, removed, 
      <a rel="help" href="https://en.wikipedia.org/wiki/Touch_(Unix)">touch'd<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        (to change the file's lastModified time), or changed a data file.
    <li>The dataset is reloaded as soon as possible if you use the 
        <a rel="help" href="https://erddap.github.io/setup.html#flag">flag system</a>.
    </ul>
  When the dataset is reloaded, ERDDAP™ compares the currently available files 
  to the cached file information tables.
  New files are read and added to the valid files table.
  Files that no longer exist are dropped from the valid files table.
  Files where the file timestamp has changed are read and their information is updated.
  The new tables replace the old tables in memory and on disk.
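  <p>As a sketch, the reload-related and caching-related tags mentioned in this part of 
  the documentation might be combined as follows in a dataset's datasets.xml chunk 
  (the numbers are hypothetical, not recommendations):
<pre>
&lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;   &lt;!-- full reload about once a day --&gt;
&lt;updateEveryNMillis&gt;10000&lt;/updateEveryNMillis&gt;     &lt;!-- look for new/changed files more often --&gt;
&lt;fileTableInMemory&gt;true&lt;/fileTableInMemory&gt;        &lt;!-- faster access, but uses more memory --&gt;
</pre>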
  <br>&nbsp;

<li><a class="selfLink" id="EDDGridFromFiles_BadFiles" href="#EDDGridFromFiles_BadFiles" rel="bookmark"><strong>Bad Files</strong></a> -- 
  The table of bad files and the reasons the files were
  declared bad (corrupted file, 
  missing variables, etc.) is emailed to the emailEverythingTo email address (probably you)
  every time the dataset is reloaded. You should replace or repair these files 
  as soon as possible.
  <br>&nbsp;

<li><a class="selfLink" id="EDDGridFromFiles_MissingVariables" href="#EDDGridFromFiles_MissingVariables" rel="bookmark"><strong>Missing Variables</strong></a> -- 
  If some of the files don't have some of the dataVariables defined in the dataset's datasets.xml chunk, that's okay.
  When EDDGridFromFiles reads one of those files, it will act as if the file had the variable, but with all missing values.
  <br>&nbsp;

<li><a class="selfLink" id="EDDGridFromFiles_FTP" href="#EDDGridFromFiles_FTP" rel="bookmark"><strong>FTP Trouble/Advice</strong></a> -- 
  If you FTP new data files to the ERDDAP™ server 
  while ERDDAP™ is running,
  there is the chance that ERDDAP™ will be reloading the dataset during the FTP process.
  It happens more often than you might think!
  If it happens, the file will appear to be valid (it has a valid name), 
  but the file isn't yet valid.
  If ERDDAP™ tries to read data from that invalid file, the resulting error will
  cause the file to be added to the table of invalid files.
  This is not good.
  To avoid this problem, use a temporary filename when FTP'ing the file, for example, ABC2005.nc_TEMP .
  Then, the fileNameRegex test (see below) will indicate that this is not a relevant file.
  After the FTP process is complete, rename the file to the correct name.
  The renaming process will cause the file to become relevant in an instant.
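  <p>For example, with file names like ABC2005.nc, a fileNameRegex such as the 
  hypothetical one below accepts the finished files but not the temporary upload names, 
  because the regex has to match the entire file name:
<pre>
&lt;!-- Matches ABC2005.nc, but not ABC2005.nc_TEMP --&gt;
&lt;fileNameRegex&gt;ABC[0-9]{4}\.nc&lt;/fileNameRegex&gt;
</pre>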
  <br>&nbsp;

<li><a class="selfLink" id="EDDGridFromFiles_0Files" href="#EDDGridFromFiles_0Files" rel="bookmark"><strong>"0 files" Error Message</strong></a> -- 
  If you run 
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> 
  or <a rel="help" href="#DasDds">DasDds</a>, 
  or if you try to load an EDDGridFrom...Files
  dataset in ERDDAP™, and you get a "0 files" error message indicating that
  ERDDAP™ found 0 matching files in the directory 
  (when you think that there are matching files in that directory):
  <ul>
  <li>Check that the files really are in that directory.
  <li>Check the spelling of the directory name.
  <li>Check the fileNameRegex. It's really, really easy to make mistakes with regexes.
    For test purposes, try the regex .* which should match all filenames.
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
  <li>Check that the user who is running the program (e.g., user=tomcat (?) for Tomcat/ERDDAP)
    has 'read' permission for those files. 
  <li>In some operating systems (for example, SELinux) and depending on system settings, 
    the user who ran the program must have 'read' permission for the 
    whole chain of directories leading to the directory that has the files. 
    <br>&nbsp;
  </ul>
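  <p>While troubleshooting, it may help to temporarily loosen the file-matching tags 
  as much as possible and then tighten them again once files are found. 
  A sketch (the directory is hypothetical):
<pre>
&lt;fileDir&gt;/data/exampleSst/&lt;/fileDir&gt;      &lt;!-- absolute path; check the spelling --&gt;
&lt;recursive&gt;true&lt;/recursive&gt;
&lt;pathRegex&gt;.*&lt;/pathRegex&gt;
&lt;fileNameRegex&gt;.*&lt;/fileNameRegex&gt;         &lt;!-- for testing only: matches all file names --&gt;
</pre>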

<li><a class="selfLink" id="EDDGridFromFilesSkeletonXML" href="#EDDGridFromFilesSkeletonXML" rel="bookmark"><strong>The skeleton XML</strong> 
  for all EDDGridFromFiles subclasses is:</a>
<pre>
&lt;dataset type="EDDGridFrom...Files" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. For
    EDDGridFromFiles subclasses, this uses Java's WatchDirectory system 
    to notice new/deleted/changed files quickly and efficiently. --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#matchAxisNDigits">&lt;matchAxisNDigits&gt;</a>...&lt;/matchAxisNDigits&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;fileDir&gt;...&lt;/fileDir&gt; &lt;!-- The directory (absolute) with the 
    data files. --&gt;
  &lt;recursive&gt;true|false&lt;/recursive&gt; &lt;!-- 0 or 1. Indicates if 
    subdirectories of fileDir have data files, too. --&gt;
  <a rel="help" href="#pathRegex">&lt;pathRegex&gt;</a>...&lt;/pathRegex&gt;  &lt;!-- 0 or 1. Only directory names which 
    match the pathRegex (default=".*") will be accepted. --&gt;
  &lt;fileNameRegex&gt;...&lt;/fileNameRegex&gt; &lt;!-- 0 or 1. A 
    <a rel="help" href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
  >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> (<a rel="help"
    href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html">tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) describing valid data
    file names, for example, ".*\.nc" for all .nc files. --&gt;
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a>true|false(default)&lt;/accessibleViaFiles&gt; 
    &lt;!-- 0 or 1 --&gt;
  &lt;metadataFrom&gt;...&lt;/metadataFrom&gt; &lt;!-- The file to get 
    metadata from ("first" or "last" (the default) based on file's 
    lastModifiedTime). --&gt;
  <a rel="help" href="#fileTableInMemory">&lt;fileTableInMemory&gt;</a>...&lt;/fileTableInMemory&gt; &lt;!-- 0 or 1 (true or 
    false (the default)) --&gt;
  <a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt;</a>...&lt;/cacheFromUrl&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#cacheFromUrl">&lt;cacheSizeGB&gt;</a>...&lt;/cacheSizeGB&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#axisVariable">&lt;axisVariable&gt;</a>...&lt;/axisVariable&gt; &lt;!-- 1 or more --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more --&gt;
&lt;/dataset&gt;
</pre>
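<p>To make the skeleton more concrete, here is a hedged sketch of a filled-in chunk 
for an EDDGridFromNcFiles dataset. Every value (datasetID, directory, variable names, 
attributes) is hypothetical, and a real dataset will usually need more metadata 
and more variables. In practice, GenerateDatasetsXml is the better way to produce 
a first draft.
<pre>
&lt;dataset type="EDDGridFromNcFiles" datasetID="exampleSst" active="true"&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;fileDir&gt;/data/exampleSst/&lt;/fileDir&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
  &lt;metadataFrom&gt;last&lt;/metadataFrom&gt;
  &lt;addAttributes&gt;
    &lt;att name="title"&gt;Example Sea Surface Temperature&lt;/att&gt;
    &lt;att name="summary"&gt;A hypothetical gridded dataset used for illustration.&lt;/att&gt;
  &lt;/addAttributes&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;time&lt;/sourceName&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
  &lt;/axisVariable&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;lat&lt;/sourceName&gt;
    &lt;destinationName&gt;latitude&lt;/destinationName&gt;
  &lt;/axisVariable&gt;
  &lt;axisVariable&gt;
    &lt;sourceName&gt;lon&lt;/sourceName&gt;
    &lt;destinationName&gt;longitude&lt;/destinationName&gt;
  &lt;/axisVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;sst&lt;/sourceName&gt;
    &lt;destinationName&gt;sst&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="ioos_category"&gt;Temperature&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
&lt;/dataset&gt;
</pre>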
&nbsp;
</ul>

<p><a class="selfLink" id="EDDGridFromAudioFiles" href="#EDDGridFromAudioFiles" rel="bookmark"><strong>EDDGridFromAudioFiles</strong></a> and
   <a class="selfLink" id="EDDTableFromAudioFiles" href="#EDDTableFromAudioFiles" rel="bookmark"><strong>EDDTableFromAudioFiles</strong></a> aggregate data 
  from a collection of local audio files. (These first appeared in ERDDAP™ v1.82.)
  The difference is that EDDGridFromAudioFiles treats the data as a multidimensional
  dataset (usually with 2 dimensions: [file startTime] and [elapsedTime within a file]),
  whereas EDDTableFromAudioFiles treats the data as tabular data
  (usually with columns for the file startTime, the elapsedTime within the file,
  and the data from the audio channels).
  EDDGridFromAudioFiles requires that all files have the same number of samples,
  so if that is not true, you must use EDDTableFromAudioFiles.
  Otherwise, which EDD type to use is entirely your choice.
  One advantage of EDDTableFromAudioFiles: you can add other variables with 
  other information, e.g., stationID, stationType.
  In both cases, the lack of a unified time variable makes it more difficult
  to work with the data from these EDD types,
  but there was no good way to set up a unified time variable.
  
<p>See these classes' superclasses, 
<a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a> and
<a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
for general information on how these classes work and how to use them.

<p>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  Since audio files have no metadata other than information related to the 
  encoding of the sound data, you will have to edit the output from 
  GenerateDatasetsXml to provide essential information (e.g., title, summary,
  creator_name, institution, history).

<p>Details:
<ul>
<li>There are a large number of audio file formats.
  Currently, ERDDAP™ can read data from most .wav and .au files. 
  It currently can't read other types of audio files, e.g., .aiff or .mp3.
  If you need support for other audio file formats or other variants of .wav and .au, 
  please email your request to Chris.John at noaa.gov .
  Or, as a workaround you can use right now, you can convert your audio
  files into PCM_SIGNED (for integer data) or PCM_FLOAT (for floating point data) 
  .wav files so that ERDDAP™ can work with them.
<li>Currently, ERDDAP™ can read audio files with what Java's AudioFormat class calls
  PCM_FLOAT, PCM_SIGNED, PCM_UNSIGNED, ALAW, and ULAW encodings.
  ERDDAP™ converts PCM_UNSIGNED values (e.g., 0 to 255) into signed values
  (e.g., -128 to 127) by rearranging the bits in the data values.
  ERDDAP™ converts ALAW and ULAW encoded data from its native encoded byte format into 
  short (int16) values.
  Since Java wants bigEndian=true data, ERDDAP™ rearranges the bytes of data 
  stored with bigEndian=false (little endian) in order to read the values correctly.
  For all other encodings (PCM), ERDDAP™ reads the data as is.
<li>When ERDDAP™ reads data from audio files, it converts the file's available 
  audio metadata into global attributes. This will always include 
  (with sample values shown)
  <pre>
  String audioBigEndian  "false";     //true or false
  int    audioChannels   1;              
  String audioEncoding   "PCM_SIGNED";
  float  audioFrameRate  96000.0;     //per second
  int    audioFrameSize  2;           //# of data bytes per frame
  float  audioSampleRate 96000.0;     //per second
  int    audioSampleSizeInBits 16;    //# of bits per channel per sample</pre>
  For ERDDAP's purposes, a frame is synonymous with a sample, which is the data
  for one point in time.
  <br>The attributes in ERDDAP™ will have the information describing the
  data as it was in the source files. 
  ERDDAP™ will often have changed this while reading
  the data, e.g., PCM_UNSIGNED, ALAW, and ULAW encoded data are converted to 
  PCM_SIGNED, and bigEndian=false data is converted to bigEndian=true 
  data (which is how Java wants to read it).  In the end, data values in 
  ERDDAP™ will always be the 
  <a rel="help" href="https://en.wikipedia.org/wiki/Pulse-code_modulation"
  >PCM-encoded <img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
  data values (i.e., simple digitized
  samples of the sound wave).  
<li>When ERDDAP™ reads data from audio files, it reads the entire file. 
  ERDDAP™ can read as many as about 2 billion samples per channel.  
  For example, if the sample rate is 44,100 samples per second, 
  2 billion samples translates to about 756 minutes of sound data per file. 
  If you have audio files with more than this amount of data, you need to 
  break up the files into smaller chunks so that ERDDAP™ can read them.
<li>Because ERDDAP™ reads entire audio files, 
  ERDDAP™ must have access to a large amount of memory to work with large audio files.
  See 
  <a rel="help" href="https://erddap.github.io/setup.html#memory"
  >ERDDAP's memory settings</a>.
  Again, if this is a problem, a workaround that you can use right now is to   
  break up the files into smaller chunks so that ERDDAP™ can read them with less memory.
<li>Some audio files were written incorrectly. ERDDAP™ makes a small effort
  to deal with such cases. But in general, when there is an error, 
  ERDDAP™ will throw an Exception (and reject that file) or (if the error
  is undetectable) read the data (but the data will be incorrect).
<li>ERDDAP™ does not check or alter the volume of the sound.
  Ideally, integer audio data is scaled to use the entire range of the data 
  type.
<li>Audio files and audio players have no system for missing values (e.g., -999 or Float.NaN).
  So audio data shouldn't have any missing values. 
  If there are missing values (e.g., if you need to lengthen an audio file), 
  use a series of 0's which will be interpreted as perfect silence.
<li>When ERDDAP™ reads data from audio files, it always creates 
  a column called elapsedTime with the time for each sample, in seconds (stored as doubles), 
  relative to the first sample (which is assigned elapsedTime=0.0 s).
  With EDDGridFromAudioFiles, this becomes the elapsedTime axis variable.
<li>EDDGridFromAudioFiles requires that all files have the same number of samples.
  So if that is not true, you must use EDDTableFromAudioFiles.
<li>For EDDGridFromAudioFiles, we recommend that you set  
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>
  to <kbd>false</kbd> (as is recommended by GenerateDatasetsXml),
  because the time dimension often has a huge number of values.
<li>For EDDGridFromAudioFiles, you should almost always use the EDDGridFromFiles 
  system for 
  <a rel="help" href="#EDDGridFromFiles_AggregationViaFileNames"
  >Aggregation via File Names</a>,
  almost always by extracting the recording's start dateTime from the filenames.
  For example, 
<pre><kbd>&lt;sourceName&gt;***fileName,"timeFormat=yyyyMMdd'_'HHmmss",aco_acoustic\.([0-9]{8}_[0-9]{6})\.wav,1&lt;/sourceName&gt;</kbd></pre>
  GenerateDatasetsXml will encourage this and help you with this.
<li>For EDDTableFromAudioFiles, you should almost always use the EDDTableFromFiles system
  for <a href="#fileNameSourceNames" rel="help"><kbd>***fileName</kbd> pseudo sourceNames</a>
  to extract information from the file's name 
  (almost always the start dateTime for the recording)
  and promote it to be a column of data. 
  For example, 
<pre><kbd>&lt;sourceName&gt;***fileName,aco_acoustic\.([0-9]{8}_[0-9]{6})\.wav,1&lt;/sourceName&gt;</kbd></pre>
  The time format should then be specified as the units attribute:
  <kbd>&lt;att name="units"&gt;yyyyMMdd'_'HHmmss&lt;/att&gt;</kbd>
  <br>&nbsp;
</ul>
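<p>Putting the last item together, a time variable for an EDDTableFromAudioFiles dataset 
might be sketched as follows. The file name pattern is the one from the example above. 
The destinationName and dataType shown are assumptions about how 
<kbd>***fileName</kbd> pseudo sourceNames are usually set up, so verify them against 
the output of GenerateDatasetsXml.
<pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;***fileName,aco_acoustic\.([0-9]{8}_[0-9]{6})\.wav,1&lt;/sourceName&gt;
  &lt;destinationName&gt;time&lt;/destinationName&gt;
  &lt;dataType&gt;String&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;!-- The time format of the text captured from the file name --&gt;
    &lt;att name="units"&gt;yyyyMMdd'_'HHmmss&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>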


<p><a class="selfLink" id="EDDGridFromMergeIRFiles" href="#EDDGridFromMergeIRFiles" rel="bookmark"><strong>EDDGridFromMergeIRFiles</strong></a> aggregates data 
  from local, <a rel="help"
  href="https://www.cpc.ncep.noaa.gov/products/global_precip/html/README">MergeIR<img
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
  files, which are from the 
  <a rel="bookmark"
  href="https://trmm.gsfc.nasa.gov">Tropical Rainfall Measuring Mission (TRMM)<img
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
  which is a joint mission between NASA and the Japan Aerospace Exploration Agency (JAXA).
  MergeIR files can be downloaded from   
  <a rel="bookmark"
  href="ftp://disc2.nascom.nasa.gov/data/s4pa/TRMM_ANCILLARY/MERG/">NASA<img
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>. 
  
<p>EDDGridFromMergeIRFiles.java was written and contributed to the ERDDAP™ project by
  Jonathan Lafite and Philippe Makowski of R.Tech Engineering
   (license: copyrighted open source).

<p>EDDGridFromMergeIRFiles is a little unusual:
<ul>
<li>EDDGridFromMergeIRFiles supports compressed or uncompressed
  source data files, in any combination, in the same dataset. This allows you, for example,
  to compress older files that are rarely accessed, but uncompress new files that
  are often accessed. Or, you can change the type of compression from the original
  .Z to, for example, .gz. 
<li>If you have compressed and uncompressed versions of the same data files
  in the same directory, please make sure the &lt;fileNameRegex&gt; for your dataset 
  matches the filenames that you want it to match and doesn't match filenames that
  you don't want it to match.
<li>Uncompressed source data files must have no file extension 
  (i.e., no "." in the filename).
<li>Compressed source data files must have a file extension, but ERDDAP™ determines 
  the type of compression by inspecting the contents of the file, not by looking at 
  the file's file extension (for example, ".Z"). The supported compression types include
  "gz", "bzip2", "xz", "lzma", "snappy-raw", "snappy-framed", "pack200", and "z".
  When ERDDAP™ reads compressed files, it decompresses on-the-fly, without writing
  to a temporary file.
<li>All source data files must use the original file naming system: 
  i.e., merg_<i>YYYYMMDDHH</i>_4km-pixel (where <i>YYYYMMDDHH</i> indicates the 
  time associated with the data in the file),
  plus a file extension if the file is compressed.
</ul>
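<p>The naming rules above can be checked with a small script before you point a dataset
  at a directory. This is a sketch under our own assumptions: the regex is a hypothetical
  &lt;fileNameRegex&gt; candidate (not one that ERDDAP™ supplies), and it accepts any single
  alphanumeric extension as "compressed" because ERDDAP™ decides the compression type from the
  file contents, not the extension.

```python
import re

# Hypothetical fileNameRegex: the original name, optionally followed by one extension.
MERGEIR_NAME = re.compile(r"merg_([0-9]{10})_4km-pixel(\.[A-Za-z0-9]+)?")

def classify(file_name):
    """Return (YYYYMMDDHH, 'compressed' or 'uncompressed'), or None if the name doesn't fit."""
    match = MERGEIR_NAME.fullmatch(file_name)
    if match is None:
        return None
    kind = "uncompressed" if match.group(2) is None else "compressed"
    return match.group(1), kind

# Hypothetical file names, for illustration only.
for name in ["merg_2015012212_4km-pixel", "merg_2015012212_4km-pixel.gz", "readme.txt"]:
    print(name, classify(name))
```

<p>If compressed and uncompressed copies of the same file share a directory, a stricter regex
  (for example, one without the optional extension group) is one way to pick just one of them.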

<p>See this class' superclass, <a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a>, 
for general information on how this class works and how to use it.

<p>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.
<br>&nbsp;


<p><a class="selfLink" id="EDDGridFromNcFiles" href="#EDDGridFromNcFiles" rel="bookmark"
  ><strong>EDDGridFromNcFiles</strong></a> aggregates data from local, gridded, 
    <a rel="help" href="https://en.wikipedia.org/wiki/GRIB">GRIB .grb and .grb2<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> files,
    <a rel="help" href="https://www.hdfgroup.org/">HDF (v4 or v5) .hdf<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
files,
    <a rel="help" href="#NcML">.ncml</a> files, 
    <a rel="help" href="https://www.unidata.ucar.edu/software/netcdf/">NetCDF (v3 or v4) .nc<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> files, and
    <a rel="help" href="https://github.com/zarr-developers/zarr-python">Zarr<img src="../images/external.png"
        alt=" (external link)" title="This link to an external website does not constitute an endorsement."></a> files
    (as of version 2.25). Zarr files have slightly different behavior and require either the fileNameRegex or the
    pathRegex to include "zarr".
<p>This may work with other file types (for example, BUFR), but we haven't tested it. If you have
such files, please send us some samples.

<ul>
<li>For GRIB files, ERDDAP™ will make a .gbx index file the first time it reads each GRIB file.
So the GRIB files must be in a directory where the "user" that ran Tomcat has read+write permission.

<li>See this class' superclass, <a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a>, for information
on how this class works and how to use it.

<li>Starting with ERDDAP™ v2.12, EDDGridFromNcFiles and EDDGridFromNcFilesUnpacked
  can read data from "structures" in .nc4 and .hdf4 files.
  To identify a variable that is from a structure, the &lt;sourceName&gt; must use the format: 
  <kbd><i>fullStructureName</i>|<i>memberName</i></kbd>, for example group1/myStruct|myMember .

<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p><a class="selfLink" id="GroupsInGriddedNcFiles" href="#GroupsInGriddedNcFiles" rel="bookmark"
  >NetCDF-4 files can contain groups.</a> 
  ERDDAP™ just makes a dataset
  from the variables in one group and all of its parent groups.
  You can specify a specific group name in GenerateDatasetsXml
  (omit the trailing slash), or use "" to have GenerateDatasetsXml 
  search all groups for the variables that use the most dimensions,
  or use "[root]" to have GenerateDatasetsXml just look for variables
  in the root group.

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.

</ul>
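<p>The <kbd><i>fullStructureName</i>|<i>memberName</i></kbd> format for structure members
  can be illustrated with a short Python sketch. The function name is hypothetical, and
  this only shows how the two parts of the name relate. It is not how ERDDAP™ parses the
  name internally.

```python
def split_structure_source_name(source_name):
    """Split 'fullStructureName|memberName' into its two parts.

    Returns (None, source_name) for an ordinary variable with no '|'.
    """
    structure, separator, member = source_name.rpartition("|")
    if separator == "":
        return None, source_name
    return structure, member

print(split_structure_source_name("group1/myStruct|myMember"))
print(split_structure_source_name("sst"))  # hypothetical ordinary variable name
```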

<p><a class="selfLink" id="EDDGridFromNcFilesUnpacked" href="#EDDGridFromNcFilesUnpacked" rel="bookmark"
  ><strong>EDDGridFromNcFilesUnpacked</strong></a> 
is a variant of 
<a rel="help" href="#EDDGridFromNcFiles">EDDGridFromNcFiles</a>
which aggregates data from local, gridded NetCDF (v3 or v4) .nc and related files.
The difference is that this class unpacks
each data file before EDDGridFromFiles looks at the files: 
  <ul>
  <li>It unpacks variables that are packed with 
     <a rel="help" href="#scale_factor">scale_factor and/or add_offset</a>.
  <li>It converts _FillValue and missing_value values to be NaN's
      (or MAX_VALUE for integer data types).
  <li>It converts time and timestamp values to "seconds since 1970-01-01T00:00:00Z".
  </ul>
The big advantage of this class is that it provides a way to deal 
with different values of 
scale_factor, add_offset, _FillValue, missing_value, or time units
in different source files in a collection.
Otherwise, you would have to use a tool like 
<a rel="help" href="#NcML">NcML</a> or
<a rel="help" href="#NCO">NCO</a>
to modify each file to remove the differences so that the files could be handled 
by EDDGridFromNcFiles.
For this class to work properly, the files must follow the CF standards for the
related attributes.
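<p>As a rough illustration of the unpacking described above, this Python sketch applies the
  CF rule <kbd>unpacked = packed * scale_factor + add_offset</kbd> and turns the fill value
  into NaN. The function name and sample numbers are hypothetical. The sketch omits the time
  conversion and the MAX_VALUE handling for integer data types, and it is not
  ERDDAP™'s actual (Java) implementation.

```python
import math

def unpack(packed_values, scale_factor=1.0, add_offset=0.0, fill_value=None):
    """Apply the CF unpacking rule to a list of packed numbers."""
    unpacked = []
    for packed in packed_values:
        if fill_value is not None and packed == fill_value:
            unpacked.append(math.nan)  # missing value becomes NaN
        else:
            unpacked.append(packed * scale_factor + add_offset)
    return unpacked

# Two hypothetical files that pack the same temperatures differently.
file_a = unpack([2000, -9999, 2150], scale_factor=0.01, add_offset=0.0, fill_value=-9999)
file_b = unpack([0, 32767, 15], scale_factor=0.1, add_offset=20.0, fill_value=32767)
print(file_a)
print(file_b)
```

<p>After unpacking, the two files hold comparable values even though their packing attributes
  differ, which is why the files can then be aggregated.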

<ul>
<li>If you try to make an EDDGridFromNcFilesUnpacked from a group of files
  with which you previously tried and failed to use EDDGridFromNcFiles,
  cd to 
  <br><kbd><i>bigParentDirectory</i>/dataset/<i>last2Letters</i>/<i>datasetID</i>/</kbd>
  <br>where <kbd><i>last2Letters</i></kbd> is the last 2 letters of the datasetID,
  <br>and delete all of the files in that directory.

<li>Starting with ERDDAP™ v2.12, EDDGridFromNcFiles and EDDGridFromNcFilesUnpacked
  can read data from "structures" in .nc4 and .hdf4 files.
  To identify a variable that is from a structure, the &lt;sourceName&gt; must use the format: 
  <kbd><i>fullStructureName</i>|<i>memberName</i></kbd>, for example group1/myStruct|myMember .

<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p>NetCDF-4 files can contain groups.
  See <a href="#GroupsInGriddedNcFiles" rel="bookmark">this documentation</a>. 

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file
  <strong>before</strong> it is unpacked.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.

</ul>
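<p>The directory to clear when switching from EDDGridFromNcFiles to EDDGridFromNcFilesUnpacked
  follows the pattern given above. This small Python sketch just builds that path. The
  <kbd>/data/erddap</kbd> value for <i>bigParentDirectory</i> is a made-up example, and the
  function name is ours.

```python
from pathlib import PurePosixPath

def dataset_cache_dir(big_parent_directory, dataset_id):
    """Build bigParentDirectory/dataset/last2Letters/datasetID as described above."""
    last_2_letters = dataset_id[-2:]
    return PurePosixPath(big_parent_directory) / "dataset" / last_2_letters / dataset_id

# Hypothetical bigParentDirectory, for illustration only.
print(dataset_cache_dir("/data/erddap", "erdMBsstdmday"))
```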


<p><a class="selfLink" id="EDDGridLonPM180" href="#EDDGridLonPM180" rel="bookmark"><strong>EDDGridLonPM180</strong></a> 
modifies the longitude values of a child (enclosed) EDDGrid dataset 
that has some longitude values greater than 180 (for example, 0 to 360) so that 
they are in the range -180 to 180 (Longitude Plus or Minus 180, hence the name).
<ul>
<li>This provides a way to make datasets that have longitude 
  values greater than 180 compliant with OGC services 
  (for example, the WMS server in ERDDAP™), since all OGC services 
  require longitude values within -180 to 180.

<li>Working near a discontinuity causes problems, regardless of whether
  the discontinuity is at longitude 0 or at longitude 180. 
  This dataset type lets you avoid those problems for everyone,
  by offering two versions of the same dataset:
  <br>one with longitude values in the range 0 to 360 ("Pacificentric"?),
  <br>one with longitude values in the range -180 to 180 ("Atlanticentric"?).

<li>For child datasets with all longitude values greater than 180,
    all of the new longitude values are simply 360 degrees lower.
    For example, a dataset with longitude values of 180 to 240 would 
    become a dataset with longitude values of -180 to -120.

<li>For child datasets that have longitude values for the entire globe
    (roughly 0 to 360), the new longitude values will be rearranged
    to be (roughly) -180 to 180: 
    <br>The original 0 to almost 180 values are unchanged.
    <br>The original 180 to 360 values are converted to -180 to 0 and
      shifted to the beginning of the longitude array.

<li>For child datasets that span 180 but don't cover the globe,
   ERDDAP™ inserts missing values as needed to make a dataset which
   covers the globe.
   For example, a child dataset with longitude values of 140 to 200
   would become a dataset with longitude values of -180 to 180.
   <br>The child values of 180 to 200 would become -180 to -160.
   <br>New longitude values would be inserted from -160 to 140.
      The corresponding data values will be _FillValues.
   <br>The child values of 140 to almost 180 would be unchanged.
   <br>The insertion of missing values may seem odd, but it avoids several
     problems that result from having longitude values that
     jump suddenly (e.g., from -160 to 140).

<li>In <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>,
    there is a special "dataset type", EDDGridLonPM180FromErddapCatalog,
    that lets you generate the datasets.xml for EDDGridLonPM180 datasets
    from each of the EDDGrid datasets in an ERDDAP
    that have any longitude values greater than 180.
    This facilitates offering two versions of these datasets:
    <br>the original, with longitude values in the range 0 to 360, 
    <br>and the new dataset, with longitude values in the range -180 to 180.

    <p>The child dataset within each EDDGridLonPM180 dataset will
    be an EDDGridFromErddap dataset which points to the original dataset.
    <br>The new dataset's datasetID will be the name of the original dataset
      plus "_LonPM180".
    <br>For example,
<pre>&lt;dataset type="EDDGridLonPM180" datasetID="erdMBsstdmday_LonPM180" active="true"&gt;
  &lt;dataset type="EDDGridFromErddap" datasetID="erdMBsstdmday_LonPM180Child"&gt;
    &lt;!-- SST, Aqua MODIS, NPP, 0.025 degrees, Pacific Ocean, Daytime 
      (Monthly Composite) minLon=120.0 maxLon=320.0 --&gt;
    &lt;sourceUrl&gt;https://coastwatch.pfeg.noaa.gov/erddap/griddap/erdMBsstdmday
    &lt;/sourceUrl&gt;
  &lt;/dataset&gt;
&lt;/dataset&gt; </pre>
    Put the EDDGridLonPM180 dataset <strong>below</strong> the original dataset in datasets.xml.
      That avoids some possible problems.

    <p>Alternatively, you can replace the EDDGridFromErddap child dataset with
    the original dataset's datasets.xml. Then, there will be only one version
    of the dataset: the one with longitude values within -180 to 180.
    We discourage this because there are times when each version of the 
    dataset is more convenient. 

<li>If you offer two versions of a dataset, for example, one with longitude 0 to 360 and
    one with longitude -180 to 180:
  <ul>
  <li>You can use the optional 
    <a rel="help"
href="#accessibleViaWMS"><kbd>&lt;accessibleViaWMS&gt;false&lt;/accessibleViaWMS&gt;</kbd></a> 
with the 0-360 dataset to forcibly disable the WMS service for that dataset.
    Then, only the LonPM180 version of the dataset will be accessible via WMS. 
  <li>There are a couple of ways to keep the LonPM180 dataset up-to-date
    with changes to the underlying dataset:
    <ul>
    <li>If the child dataset is an EDDGridFromErddap dataset that references
      a dataset in the same ERDDAP™, the LonPM180 dataset will try to 
      directly subscribe to the underlying dataset so that it is always up-to-date.
      Direct subscriptions don't generate emails asking you to validate the subscription -
      validation should be done automatically.
    <li>If the child dataset is not an EDDGridFromErddap dataset that is
      on the same ERDDAP™, the LonPM180 dataset will try to use
      the regular subscription system to subscribe to the underlying dataset.
      If you have the subscription system in your ERDDAP™ turned on, 
      you should get emails asking you to validate the subscription. Please do so.
    <li>If you have the subscription system in your ERDDAP™ turned off,
      the LonPM180 dataset may sometimes have outdated metadata until the
      LonPM180 dataset is reloaded. So if the subscription system is turned off,
      you should set the 
      <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>
      setting of the LonPM180 dataset to a smaller number, so that it is more 
      likely to catch changes to the child dataset sooner.
    </ul>
  </ul>

<li><a class="selfLink" id="EDDGridLonPM180SkeletonXML" href="#EDDGridLonPM180SkeletonXML" rel="bookmark">The skeleton XML for an 
  EDDGridLonPM180 dataset is:</a>
<pre>
&lt;dataset type="EDDGridLonPM180" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt; 
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. For
    EDDGridFromDap, this gets the remote .dds and then gets the new 
    leftmost (first) dimension values. --&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- The child EDDGrid dataset. --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
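<p>The longitude rearrangement that EDDGridLonPM180 performs can be sketched in a few lines
  of Python. This is our simplified model of the behavior described above, not ERDDAP™'s
  code: it assumes the child longitudes are ascending and evenly spaced, and the function
  name and <kbd>spacing</kbd> parameter are hypothetical.

```python
def lon_pm180(child_lons, spacing):
    """Map ascending child longitudes (some >= 180) into the -180 to 180 range.

    Returns (new_lon, child_index) pairs; child_index is None where a
    column of missing values would be inserted.
    """
    # Values of 180 or more are lowered by 360 and moved to the front.
    shifted = [(lon - 360, i) for i, lon in enumerate(child_lons) if lon >= 180]
    # Values below 180 keep their original value.
    kept = [(lon, i) for i, lon in enumerate(child_lons) if lon < 180]
    filler = []
    if shifted and kept:
        # Fill the gap between the two pieces with evenly spaced longitudes.
        lon = shifted[-1][0] + spacing
        while kept[0][0] - lon >= spacing / 2:
            filler.append((lon, None))
            lon += spacing
    return shifted + filler + kept

# The 140 to 200 example from the text, with an assumed 20 degree spacing.
for new_lon, child_index in lon_pm180([140, 160, 180, 200], 20):
    print(new_lon, child_index)
```

<p>In the output, the child values 180 and 200 appear first as -180 and -160, the inserted
  longitudes have no child index (their data would be _FillValues), and 140 and 160 come last,
  unchanged.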


<p><a class="selfLink" id="EDDGridLon0360" href="#EDDGridLon0360" rel="bookmark"><strong>EDDGridLon0360</strong></a> 
modifies the longitude values of a child (enclosed) EDDGrid dataset 
that has some longitude values less than 0 (for example, -180 to 180) so that 
they are in the range 0 to 360 (hence the name).
<ul>
<li>Working near a discontinuity causes problems, regardless of whether
  the discontinuity is at longitude 0 or at longitude 180. 
  This dataset type lets you avoid those problems for everyone,
  by offering two versions of the same dataset:
  <br>one with longitude values in the range -180 to 180 ("Atlanticentric"?),
  <br>one with longitude values in the range 0 to 360 ("Pacificentric"?).

<li>For child datasets with all longitude values less than 0,
    all of the new longitude values are simply 360 degrees higher.
    For example, a dataset with longitude values of -180 to -120 would 
    become a dataset with longitude values of 180 to 240.

<li>For child datasets that have longitude values for the entire globe
    (roughly -180 to 180), the new longitude values will be rearranged
    to be (roughly) 0 to 360: 
    <br>The original -180 to 0 values are converted to 180 to 360 and
      shifted to the end of the longitude array.
    <br>The original 0 to almost 180 values are unchanged.

<li>For child datasets that span lon=0 but don't cover the globe,
   ERDDAP™ inserts missing values as needed to make a dataset which
   covers the globe.
   For example, a child dataset with longitude values of -40 to 20
   would become a dataset with longitude values of 0 to 360.
   <br>The child values of 0 to 20 would be unchanged.
   <br>New longitude values would be inserted from 20 to 320.
      The corresponding data values will be _FillValues.
   <br>The child values of -40 to 0 would become 320 to 360.
   <br>The insertion of missing values may seem odd, but it avoids several
     problems that result from having longitude values that
     jump suddenly (e.g., from 20 to 320).

<li>In <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>,
    there is a special "dataset type", EDDGridLon0360FromErddapCatalog,
    that lets you generate the datasets.xml for EDDGridLon0360 datasets
    from each of the EDDGrid datasets in an ERDDAP
    that have any longitude values less than 0.
    This facilitates offering two versions of these datasets:
    <br>the original, with longitude values in the range -180 to 180, 
    <br>and the new dataset, with longitude values in the range 0 to 360.

    <p>The child dataset within each EDDGridLon0360 dataset will
    be an EDDGridFromErddap dataset which points to the original dataset.
    <br>The new dataset's datasetID will be the name of the original dataset
      plus "_Lon0360".
    <br>For example,
<pre>&lt;dataset type="EDDGridLon0360" datasetID="erdMBsstdmday_Lon0360" active="true"&gt;
  &lt;dataset type="EDDGridFromErddap" datasetID="erdMBsstdmday_Lon0360Child"&gt;
    &lt;!-- SST, Aqua MODIS, NPP, 0.025 degrees, Pacific Ocean, Daytime 
      (Monthly Composite) minLon=-40.0 maxLon=20.0 --&gt;
    &lt;sourceUrl&gt;https://coastwatch.pfeg.noaa.gov/erddap/griddap/erdMBsstdmday
    &lt;/sourceUrl&gt;
  &lt;/dataset&gt;
&lt;/dataset&gt; </pre>
    Put the EDDGridLon0360 dataset <strong>below</strong> the original dataset in datasets.xml.
      That avoids some possible problems.

    <p>Alternatively, you can replace the EDDGridFromErddap child dataset with
    the original dataset's datasets.xml. Then, there will be only one version
    of the dataset: the one with longitude values within 0 to 360.
    We discourage this because there are times when each version of the 
    dataset is more convenient. 

<li>If you offer two versions of a dataset, for example, one with longitude 0 to 360 and
    one with longitude -180 to 180:
  <ul>
  <li>You can use the optional 
    <a rel="help"
href="#accessibleViaWMS"><kbd>&lt;accessibleViaWMS&gt;false&lt;/accessibleViaWMS&gt;</kbd></a> 
with the 0 to 360 dataset to forcibly disable the WMS service for that dataset.
    Then, only the -180 to 180 version of the dataset will be accessible via WMS. 
  <li>There are a couple of ways to keep the Lon0360 dataset up-to-date
    with changes to the underlying dataset:
    <ul>
    <li>If the child dataset is an EDDGridFromErddap dataset that references
      a dataset in the same ERDDAP™, the Lon0360 dataset will try to 
      directly subscribe to the underlying dataset so that it is always up-to-date.
      Direct subscriptions don't generate emails asking you to validate the subscription -
      validation should be done automatically.
    <li>If the child dataset is not an EDDGridFromErddap dataset that is
      on the same ERDDAP™, the Lon0360 dataset will try to use
      the regular subscription system to subscribe to the underlying dataset.
      If you have the subscription system in your ERDDAP™ turned on, 
      you should get emails asking you to validate the subscription. Please do so.
    <li>If you have the subscription system in your ERDDAP™ turned off,
      the Lon0360 dataset may sometimes have outdated metadata until the
      Lon0360 dataset is reloaded. So if the subscription system is turned off,
      you should set the 
      <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>
      setting of the Lon0360 dataset to a smaller number, so that it is more 
      likely to catch changes to the child dataset sooner.
    </ul>
  </ul>

<li><a class="selfLink" id="EDDGridLon0360SkeletonXML" href="#EDDGridLon0360SkeletonXML" rel="bookmark">The skeleton XML for an 
  EDDGridLon0360 dataset is:</a>
<pre>
&lt;dataset type="EDDGridLon0360" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt; 
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. For
    EDDGridFromDap, this gets the remote .dds and then gets the new 
    leftmost (first) dimension values. --&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- The child EDDGrid dataset. --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
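<p>The longitude rearrangement that EDDGridLon0360 performs can be sketched in Python as
  follows. This is our simplified model of the behavior described above, not ERDDAP™'s
  code: it assumes the child longitudes are ascending and evenly spaced, and the function
  name and <kbd>spacing</kbd> parameter are hypothetical.

```python
def lon_0360(child_lons, spacing):
    """Map ascending child longitudes (some below 0) into the 0 to 360 range.

    Returns (new_lon, child_index) pairs; child_index is None where a
    column of missing values would be inserted.
    """
    # Values of 0 or more keep their value and stay at the front.
    kept = [(lon, i) for i, lon in enumerate(child_lons) if lon >= 0]
    # Negative values are raised by 360 and moved to the end.
    shifted = [(lon + 360, i) for i, lon in enumerate(child_lons) if lon < 0]
    filler = []
    if kept and shifted:
        # Fill the gap between the two pieces with evenly spaced longitudes.
        lon = kept[-1][0] + spacing
        while shifted[0][0] - lon >= spacing / 2:
            filler.append((lon, None))
            lon += spacing
    return kept + filler + shifted

# The -40 to 20 example from the text, with an assumed 20 degree spacing.
for new_lon, child_index in lon_0360([-40, -20, 0, 20], 20):
    print(new_lon, child_index)
```

<p>In the output, 0 and 20 come first, unchanged. The inserted longitudes have no child index
  (their data would be _FillValues), and the child values -40 and -20 appear last as 320 and 340.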


<p><a class="selfLink" id="EDDGridSideBySide" href="#EDDGridSideBySide" rel="bookmark"><strong>EDDGridSideBySide</strong></a> 
aggregates two or more EDDGrid datasets (the children) side by side.
<ul>
<li>The resulting dataset has all of the variables from all of the child datasets.
<li>The parent dataset and all of the child datasets MUST have different datasetIDs.
  If any names in a family are exactly the same, the dataset will fail to load
  (with the error message that the values of the aggregated axis are not in sorted order).
<li>All children MUST have the same source values for axisVariables[1+] 
  (for example, latitude, longitude). 
  The precision of the testing is determined by  
  <a rel="help" href="#matchAxisNDigits"><kbd>matchAxisNDigits</kbd></a>.
<li>The children may have different source values for axisVariables[0]
  (for example, time), but they are usually largely the same.
<li>The parent dataset will appear to have all of the axisVariables[0] source
  values from all of the children.
<li>For example, this lets you combine a source dataset with a vector's 
  u-component and another source dataset with a vector's v-component, 
  so the combined data can be served.
<li>Children created by this method are held privately.
  They are not separately accessible datasets (for example, by client data
  requests or by 
    <a rel="help" href="https://erddap.github.io/setup.html#flag">flag files</a>).
<li>The global metadata and settings for the parent come from the global
  metadata and settings for the first child.
<li>If there is an exception while creating the first child, 
  the parent will not be created.
<li>If there is an exception while creating other children, this sends an email to
  <kbd>emailEverythingTo</kbd> (as specified in 
  <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a>) 
  and continues with the other children.
<li><a class="selfLink" id="EDDGridSideBySideSkeletonXML" href="#EDDGridSideBySideSkeletonXML" rel="bookmark"
>The skeleton XML for an EDDGridSideBySide dataset is:</a>
<pre>
&lt;dataset type="EDDGridSideBySide" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#matchAxisNDigits">&lt;matchAxisNDigits&gt;</a>...&lt;/matchAxisNDigits&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- 2 or more --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
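<p>The two axis rules for EDDGridSideBySide (matching values for axisVariables[1+],
  combined values for axisVariables[0]) can be illustrated with this Python sketch.
  The function names and sample values are hypothetical. In particular, treating
  <kbd>matchAxisNDigits</kbd> as a relative tolerance of 10<sup>-n</sup> is only our
  assumption for illustration; see the <kbd>matchAxisNDigits</kbd> documentation
  for the actual rule.

```python
import math

def axes_match(axis_a, axis_b, n_digits):
    """Return True if two axes have equal length and agree to about n_digits
    significant digits (an assumed reading of matchAxisNDigits)."""
    if len(axis_a) != len(axis_b):
        return False
    tolerance = 10.0 ** (-n_digits)
    return all(math.isclose(a, b, rel_tol=tolerance, abs_tol=tolerance)
               for a, b in zip(axis_a, axis_b))

def combined_axis0(children_axis0):
    """The parent's axisVariables[0]: every distinct value from every child, sorted."""
    return sorted(set().union(*children_axis0))

# Hypothetical u-component and v-component children with overlapping time values.
u_time = [0, 86400, 172800]
v_time = [86400, 172800, 259200]
print(combined_axis0([u_time, v_time]))
print(axes_match([30.0, 30.25], [30.0, 30.2500000001], 8))
```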

<p><a class="selfLink" id="EDDGridAggregateExistingDimension" href="#EDDGridAggregateExistingDimension" rel="bookmark"><strong>EDDGridAggregateExistingDimension</strong></a> 
aggregates two or more EDDGrid datasets each of which has 
a different range of values for the first dimension, 
but identical values for the other dimensions.
<ul>
<li>For example, one child dataset might have 366 values (for 2004) for the time
dimension and another child might have 365 values (for 2005) for the time dimension.
<li>All the values for all of the other dimensions (for example, latitude, longitude)
  MUST be identical for all of the children.
    The precision of the testing is determined by  
    <a rel="help" href="#matchAxisNDigits"><kbd>matchAxisNDigits</kbd></a>.
<li>Sorted Dimension Values -
   The values for each dimension MUST be in sorted order (ascending or descending). 
   The values can be irregularly spaced. There can be no ties.
   This is a requirement of the
   <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
    >CF metadata standard<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    If any dimension's values aren't in sorted order, the dataset won't be loaded
    and ERDDAP™ will identify the first unsorted value in the log file,
    <i>bigParentDirectory</i>/logs/log.txt .
   <p>Unsorted dimension values almost always indicate a problem with the source dataset.
   This most commonly occurs when a misnamed or inappropriate file is 
   included in the aggregation, which leads to an unsorted time dimension. 
   To solve this problem, see the error message in the 
   ERDDAP™ log.txt file to find the offending time value. 
   Then look in the source files to find the corresponding file 
   (or one before or one after) that doesn't belong in the aggregation.
<li>The parent dataset and the child dataset MUST have different datasetIDs.
  If any names in a family are exactly the same, the dataset will fail to load
  (with the error message that the values of the aggregated axis are not in sorted order).
<li>Currently, the child dataset MUST be an EDDGridFromDap dataset and MUST have
  the lowest values of the aggregated dimension (usually the oldest time values). 
  All of the other children MUST be almost identical datasets (differing just in the values
  for the first dimension) and are specified by just their sourceUrl.
<li>The aggregate dataset gets its metadata from the first child.
<li>The <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  can make a rough draft of the datasets.xml for an
  EDDGridAggregateExistingDimension based on a set of files served by a Hyrax or THREDDS server.
  For example, use this input for the program (the "/1988" in the URL makes the example run faster):<kbd>
  <br>&nbsp;&nbsp;EDDType? EDDGridAggregateExistingDimension
  <br>&nbsp;&nbsp;Server type (hyrax, thredds, or dodsindex)? hyrax
  <br>&nbsp;&nbsp;Parent URL (for example, for hyrax, ending in "contents.html"; 
  <br>&nbsp;&nbsp;&nbsp;&nbsp;for thredds, ending in "catalog.xml")
  <br>&nbsp;&nbsp;? https://opendap.jpl.nasa.gov/opendap/ocean_wind/ccmp/L3.5a/data/
  <br>&nbsp;&nbsp;&nbsp;&nbsp;flk/1988/contents.html
  <br>&nbsp;&nbsp;File name regex (for example, ".*\.nc")? month.*flk\.nc\.gz
  <br>&nbsp;&nbsp;ReloadEveryNMinutes (for example, 10080)? 10080</kbd>
  <br>You can use the resulting <kbd>&lt;sourceUrl&gt;</kbd> tags or 
  delete them and uncomment the <kbd>&lt;sourceUrls&gt;</kbd> tag 
  (so that new files are noticed each time
  the dataset is reloaded).
<li><a class="selfLink" id="EDDGridAggregateExistingDimensionSkeletonXML" href="#EDDGridAggregateExistingDimensionSkeletonXML" rel="bookmark"
  >The skeleton XML for an EDDGridAggregateExistingDimension dataset is:</a>

<pre>
&lt;dataset type="EDDGridAggregateExistingDimension" <a rel="help" href="#datasetID">datasetID</a>="..." 
    <a rel="help" href="#active">active</a>="..." &gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- This is a regular <a rel="help" href="#EDDGridFromDap">EDDGridFromDap</a> dataset
    description child with the lowest values for the aggregated 
    dimensions. --&gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt; &lt;!-- 0 or many; the sourceUrls for 
    other children.  These children must be listed in order of 
    ascending values for the aggregated dimension. --&gt;
  &lt;sourceUrls serverType="..." regex="..." recursive="true" 
    <a rel="help" href="#pathRegex">pathRegex</a>=".*"
    &gt;https://<i>someServer/someDirectory/someSubdirectory</i>/catalog.xml&lt;/sourceUrls&gt; 
    &lt;!-- 0 or 1. This specifies how to find the other children, 
    instead of using separate sourceUrl tags for each child.  The 
    advantage of this is: new children will be detected each time 
    the dataset is reloaded. The serverType must be "thredds", 
    "hyrax", or "dodsindex".  
    An example of a <a rel="help" 
    href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
    >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> (regex) (<a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html">tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) is .*\.nc 
    recursive can be "true" or "false".  
    Only directory names which match the 
    <a rel="help" href="#pathRegex"><kbd>&lt;pathRegex&gt;</kbd></a>
    (default=".*") will be accepted. 
    A thredds catalogUrl MUST include "/thredds/catalog/".
    An example of a thredds catalogUrl is
    <a href="https://thredds1.pfeg.noaa.gov/thredds/catalog/Satellite/aggregsatMH/chla/catalog.xml"
      >https://thredds1.pfeg.noaa.gov/thredds/catalog/Satellite/aggregsatMH/
      chla/catalog.xml<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    An example of a hyrax catalogUrl is
    <a href="https://opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/1988/contents.html"
      >https://opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/
    flk/1988/contents.html<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    An example of a dodsindex URL is 
    <a href="https://opendap.jpl.nasa.gov/opendap/GeodeticsGravity/tellus/L3/mascon/RL06/JPL/v02/CRI/netcdf/contents.html"
      >https://opendap.jpl.nasa.gov/opendap/GeodeticsGravity/tellus/L3/mascon/RL06/JPL/v02/CRI/netcdf/contents.html<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    (Note the "OPeNDAP" logo at the top of the page.)
    When these children are sorted by filename, they must be in 
    order of ascending values for the aggregated dimension. --&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#matchAxisNDigits">&lt;matchAxisNDigits&gt;</a>...&lt;/matchAxisNDigits&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dimensionValuesInMemory">&lt;dimensionValuesInMemory&gt;</a>...&lt;/dimensionValuesInMemory&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>


<p><a class="selfLink" id="EDDGridCopy" href="#EDDGridCopy" rel="bookmark"><strong>EDDGridCopy</strong></a> makes and maintains a local copy 
of another EDDGrid's data and serves data from the local copy.
<ul>
<li>EDDGridCopy (and for tabular data, <a rel="help" href="#EDDTableCopy">EDDTableCopy</a>)
    is a very easy-to-use and very effective
  <br><strong>solution to some of the biggest problems with serving data from a remote data source:</strong>
  <ul>
  <li>Accessing data from a remote data source can be slow.
    <ul>
    <li>It may be slow because it is inherently slow (for example, an
      inefficient type of server), 
    <li>because it is overwhelmed by too many requests,
    <li>or because your server or the remote server is bandwidth limited.
    </ul>
  <li>The remote dataset is sometimes unavailable (again, for a variety of reasons).
  <li>Relying on one source for the data doesn't scale well (for example, 
    when many users and many ERDDAPs utilize it).
    <br>&nbsp;
  </ul>

<li>How It Works -- EDDGridCopy solves these problems by automatically making and
  maintaining a local copy of the data and serving data from the local copy.
  ERDDAP™ can serve data from the local copy very, very quickly.
  And making a local copy relieves the burden on the remote server.
  And the local copy is a backup of the original, which is useful in case
  something happens to the original.

  <p>There is nothing new about making a local copy of a dataset. 
  What is new here is that this class
  makes it *easy* to create and *maintain* a local copy of data from a *variety* of types 
  of remote data sources and *add metadata* while copying the data.

<li>Chunks of Data -- EDDGridCopy makes the local copy of the data by requesting 
  chunks of data from the remote <kbd>&lt;dataset&gt;</kbd> .
  There will be a chunk for each value of the leftmost (first) axis variable.
  EDDGridCopy doesn't rely on the remote dataset's index numbers 
  for the axis -- those may change.

  <p>WARNING: If the size of a chunk of data is so big (&gt; 2GB) that it causes 
  problems, EDDGridCopy can't be used. 
  (Sorry, we hope to have a solution for this problem in the future.)

<li>[An alternative to EDDGridCopy -
  <br>If the remote data is available via downloadable files, 
   not a web service, use the
   <a rel="help" href="#cacheFromUrl">cacheFromUrl option for EDDGridFromFiles</a>,
   which makes a local copy of the remote files and serves the data from the local files.]

<li>Local Files -- Each chunk of data is stored in a separate NetCDF file in a 
  subdirectory of
  <i>bigParentDirectory</i>/copy/<i>datasetID</i>/ (as specified in 
    <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a>).
  Filenames created from axis values are modified to make them file-name-safe 
  (for example, hyphens are replaced by "x2D") -- this doesn't affect the actual data.
  <br>&nbsp;

<li>New Data -- Each time EDDGridCopy is reloaded, 
  it checks the remote <kbd>&lt;dataset&gt;</kbd> to see what chunks are available.
  If the file for a chunk of data doesn't already exist, a request to get the 
  chunk is added to a queue.
  ERDDAP's <a class="selfLink" id="taskThread" href="#taskThread" rel="bookmark">taskThread</a> processes all the queued 
  requests for chunks of data, one-by-one.
  You can see statistics for the taskThread's activity on the
     <a rel="help" href="https://erddap.github.io/setup.html#statusPage">Status Page</a> and in the 
     <a rel="help" href="https://erddap.github.io/setup.html#dailyReport">Daily Report</a>.
  (Yes, ERDDAP™ could assign multiple tasks to this process, but that would use 
  up lots of the remote data source's bandwidth, memory, and CPU time,
  and lots of the local ERDDAP's bandwidth,
  memory, and CPU time, neither of which is a good idea.)

  <p>NOTE: The very first time an EDDGridCopy is loaded, (if all goes well) 
  lots of requests for chunks
  of data will be added to the taskThread's queue, but no local data files will
  have been created.
  So the constructor will fail but taskThread will continue to work and create local files.
  If all goes well, the taskThread will make some local data files and the next attempt to 
  reload the dataset (in ~15 minutes) will succeed, but initially with a very 
  limited amount of data.

  <p>NOTE: After the local dataset has some data and appears in your ERDDAP,
  if the remote dataset is temporarily or permanently not accessible,
  the local dataset will still work.

  <p>WARNING: If the remote dataset is large and/or the remote server is slow
  (that's the problem, isn't it?!), it will take a long time to make a complete
  local copy.
  In some cases, the time needed will be unacceptable.
  For example, transmitting 1 TB of data over a T1 line (1.544 Mbit/s, about 0.19 MB/s) takes at 
  least 60 days, under optimal conditions.
  Plus, it uses lots of bandwidth, memory, and CPU time on the remote and local computers.
  The solution is to mail a hard drive to the administrator of the remote dataset so that
  they can make a copy of the dataset and mail the hard drive back to you.
  Use that data as a starting point and EDDGridCopy will add data to it.
  (That is one way that 
  <a rel="help" href="https://aws.amazon.com/importexport/">Amazon's EC2 Cloud Service<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    handles the problem, even though their system
  has lots of bandwidth.)

  <p>WARNING: If a given value for the leftmost (first) axis variable disappears from 
  the remote dataset, EDDGridCopy does NOT delete the local copied file. 
  If you want to, you can delete it yourself.

<li><a class="selfLink" id="gridCopy_checkSourceData" href="#gridCopy_checkSourceData" rel="bookmark"><kbd>&lt;checkSourceData&gt;</kbd></a> -- 
The datasets.xml for this dataset can have an optional tag
<br><kbd>&lt;checkSourceData&gt;true&lt;/checkSourceData&gt;</kbd>
<br>The default value is true. If/when you set it to false, the dataset won't ever
check the source dataset to see if there is additional data available.
<br>&nbsp;

<li><a class="selfLink" id="onlySince" href="#onlySince"  rel="bookmark"><kbd>&lt;onlySince&gt;</kbd></a> -- 
  You can tell EDDGridCopy to make a copy of a subset of the source
  dataset, instead of the entire source dataset, by adding a tag in the form 
  <kbd>&lt;onlySince&gt;<i>someValue</i>&lt;/onlySince&gt;</kbd>
  to the dataset's datasets.xml chunk.
  EDDGridCopy will only download data values related to the values of the first dimension 
  (usually the time dimension) which are greater than <i>someValue</i>.
  <i>someValue</i> can be:
  <ul>
  <li>A relative time specified via <kbd>now-<i>nUnits</i></kbd>.
    <br>For example, <kbd>&lt;onlySince&gt;now-2years&lt;/onlySince&gt;</kbd>
    tells the dataset to only make local copies of the data
    for data where the outer dimension's values (usually time values)
    are within the last 2 years (which is re-evaluated 
    each time the dataset is reloaded, which is when it looks for 
    new data to copy). See the 
    <a rel="help"
    href="https://coastwatch.pfeg.noaa.gov/erddap/tabledap/documentation.html#now"
    ><kbd>now-<i>nUnits</i></kbd> syntax description</a>.
    This is useful if the first dimension has time data, which it usually does.

    <p>EDDGridCopy does not delete local data files which have data 
    that, over time, becomes older than <kbd>now-<i>nUnits</i></kbd>.   
    You can delete those files any time if you choose to. 
    If you do, we strongly recommend that you set a 
    <a rel="help"
    href="https://erddap.github.io/setup.html#flag">flag</a>
    after you delete the files to tell EDDGridCopy
    to update the list of cached files.

  <li>A fixed point in time specified as an ISO 8601 string
    <kbd>yyyy-MM-ddTHH:mm:ssZ</kbd> .
    <br>For example,
    <kbd>&lt;onlySince&gt;2000-01-01T00:00:00Z&lt;/onlySince&gt;</kbd> 
    tells the dataset only to make local copies of data where the
    first dimension's value is 
    <span class="N">&gt;=2000-01-01T00:00:00Z .</span>
    This is useful if the first dimension has time data, which it usually does.
    <br>&nbsp;

  <li>A floating point number.
    <br>For example, <kbd>&lt;onlySince&gt;946684800.0&lt;/onlySince&gt;</kbd> .
    The units will be the destination units of the first dimension.
    For example, for time dimensions, the units in ERDDAP™ are always
    "seconds since 1970-01-01T00:00:00Z".
    So <kbd>946684800.0</kbd> "seconds since 1970-01-01T00:00:00Z"
    is equivalent to 2000-01-01T00:00:00Z.
    This is always a useful option, 
    but is especially useful when the first dimension doesn't have time data.
    <br>&nbsp;
  </ul>   

<li>Recommended use -
  <ol>
  <li>Create the <kbd>&lt;dataset&gt;</kbd> entry (the native type, not EDDGridCopy)
    for the remote data source. 
    <br><strong>Get it working correctly, including all of the desired metadata.</strong> 
  <li>If it is too slow, add XML code to wrap it in an EDDGridCopy dataset.
    <ul>
    <li>Use a different datasetID (perhaps by changing the old datasetID slightly).
    <li>Copy the <kbd>&lt;accessibleTo&gt;, &lt;reloadEveryNMinutes&gt;</kbd> and 
      <kbd>&lt;onChange&gt;</kbd> tags from the
      remote EDDGrid's XML to the EDDGridCopy's XML.
      (Their values for EDDGridCopy matter; their values for the inner dataset 
      become irrelevant.)
    </ul>
  <li>ERDDAP™ will make and maintain a local copy of the data.
    <br>&nbsp;
  </ol>
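  <p>The wrapping in step 2 might look like the following sketch.
  The datasetIDs and the sourceUrl are hypothetical placeholders, the tag values are
  only illustrative, and the inner dataset is abbreviated: substitute the working
  <kbd>&lt;dataset&gt;</kbd> chunk from step 1. The two datasetIDs must differ.
<pre>
&lt;dataset type="EDDGridCopy" datasetID="someDatasetID_Copy" active="true"&gt;
  &lt;!-- Hypothetical value. This tag now matters here, 
    not in the inner dataset. --&gt;
  &lt;reloadEveryNMinutes&gt;10080&lt;/reloadEveryNMinutes&gt;
  &lt;!-- Optional: only copy chunks for the last 2 years. --&gt;
  &lt;onlySince&gt;now-2years&lt;/onlySince&gt;
  &lt;!-- The working dataset from step 1 (abbreviated here). --&gt;
  &lt;dataset type="EDDGridFromDap" datasetID="someDatasetID" active="true"&gt;
    &lt;sourceUrl&gt;https://<i>someServer/someDirectory/someDataset</i>&lt;/sourceUrl&gt;
    ...
  &lt;/dataset&gt;
&lt;/dataset&gt;
</pre>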

<li>WARNING: EDDGridCopy assumes that the data values for each chunk don't ever change. 
  If/when they do, you need to manually delete the chunk files in 
     <i>bigParentDirectory</i>/copy/<i>datasetID</i>/
  which changed and 
  <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a> 
     the dataset to be reloaded so that the deleted chunks will be replaced.
  If you have an email subscription to the dataset, you will get two emails:
  one when the dataset first reloads and starts to copy the data, 
  and another when the dataset loads again (automatically) and detects the new 
  local data files.
  <br>&nbsp;

<li>All axis values must be equal.
  <br>For each of the axes except the leftmost (first), all of the values must be equal for all children.
  The precision of the testing is determined by  
  <a rel="help" href="#matchAxisNDigits"><kbd>matchAxisNDigits</kbd></a>.
  <br>&nbsp;

<li>Settings, Metadata, Variables -- EDDGridCopy uses settings, metadata,
  and variables from the enclosed source dataset. 
  <br>&nbsp;

<li>Change Metadata -- If you need to change any addAttributes or change the 
  order of the variables associated with the source dataset:
  <ol>
  <li>Change the addAttributes for the source dataset in datasets.xml, as needed. 
  <li>Delete one of the copied files.
  <li>Set a <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a> 
    to reload the dataset immediately.
    If you do use a flag and you have an email subscription to the dataset, you will get two emails:
    one when the dataset first reloads and starts to copy the data, 
    and another when the dataset loads again (automatically) and detects the new local data files.
  <li>The deleted file will be regenerated with the new metadata. 
    If the source dataset is ever unavailable, the EDDGridCopy dataset will get metadata 
    from the regenerated file, since it is the youngest file.
    <br>&nbsp;
  </ol>


<li><a class="selfLink" id="EDDGridCopySkeletonXML" href="#EDDGridCopySkeletonXML" rel="bookmark">The skeleton XML for an EDDGridCopy dataset is:</a>
<pre>
&lt;dataset type="EDDGridCopy" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a>true|false(default)&lt;/accessibleViaFiles&gt; 
    &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#matchAxisNDigits">&lt;matchAxisNDigits&gt;</a>...&lt;/matchAxisNDigits&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fileTableInMemory">&lt;fileTableInMemory&gt;</a>...&lt;/fileTableInMemory&gt; &lt;!-- 0 or 1 (true or false 
    (the default)) --&gt;
  <a rel="help" href="#gridCopy_checkSourceData">&lt;checkSourceData&gt;</a>...&lt;/checkSourceData&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onlySince">&lt;onlySince&gt;</a>...&lt;/onlySince&gt; &lt;!-- 0 or 1 --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- 1 --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>


<!-- THIS IS INACTIVE. THERE ARE NO KNOWN VALID DATASETS. TESTING IS NO LONGER DONE.
p><a class="selfLink" id="EDDTableFromBMDE" href="#EDDTableFromBMDE" rel="bookmark"><strong>EDDTableFromBMDE</strong></a> handles data from a
<a rel="help" href="https://avianknowledge.net">Bird Monitoring Data Exchange (BMDE)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> server.
<ul>
<li>BMDE servers expect an XML request and return an XML response.
<li>All BMDE servers return XML responses with the same variables, as specified in the 
<a rel="help" href="WAS http://akn.ornith.cornell.edu/Schemas/bmde/BMDE-Bandingv1.38.08.xsd">BMDE schema<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  The BMDE schema defines 515 data variables(!), but most BMDE servers only return data in a few of the variables.
  You should ask the BMDE administrator for a list of the data variables which are active for a given sourceCode.
  Then, just create ERDDAP™ dataVariables for those variables.
  ERDDAP™ will check that each dataVariable's sourceName is a valid BMDE name (without the "bmde:" prefix).
  ERDDAP™ will add some standard <a rel="help" href="#globalAttributes">sourceAttributes</a> for each dataVariable,
    so you usually will not need to define any <a rel="help" href="#globalAttributes">addAttributes</a>. 
<li><a class="selfLink" id="EDDTableFromBMDESkeletonXML" href="#EDDTableFromBMDESkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromBMDE dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromBMDE" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#sourceCode">&lt;sourceCode&gt;</a>...&lt;/sourceCode&gt;
    &lt;!- - If you read the XML response from the sourceUrl, the source code (for example, prbo05) 
    is the value from one of the &lt;resource&gt;&lt;code&gt; tags. - -&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#accessibleViaWMS">&lt;accessibleViaWMS&gt;</a>...&lt;/accessibleViaWMS&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!- - 0 or 1 - - &gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;! - - 0 or 1 - - &gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!- - 0 or more - -&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!- - 1 or more - -&gt;

&lt;/dataset&gt;
</pre>
&nbsp;
</ul -->

<p><a class="selfLink" id="EDDTableFromCassandra" href="#EDDTableFromCassandra" rel="bookmark"><strong>EDDTableFromCassandra</strong></a> 
handles data from one 
<a rel="help" 
    href="https://cassandra.apache.org/">Cassandra<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
table.
Cassandra is a NoSQL database.

<!-- or <a rel="help" href="https://en.wikipedia.org/wiki/View_(database)">view<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>. -->
<ul>
<li>ERDDAP™ can work with Cassandra v2 and v3 with no changes or differences in setup.
  We have tested with 
  <a rel="help" 
    href="https://cassandra.apache.org/download/">Cassandra v2 and v3 from Apache<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
  It is likely that ERDDAP™ can also work with Cassandra downloaded from DataStax.
  <br>&nbsp;

<li>From Aug 2019 to May 2021, we had trouble getting Cassandra to work with AdoptOpenJdk Java v8
  (it threw an EXCEPTION_ACCESS_VIOLATION). But now (May 2021), that problem is gone: we
  can successfully use Cassandra v2.1.22 and AdoptOpenJdk jdk8u292-b10. 
  <br>&nbsp;

<li><a class="selfLink" id="CassandraOneTable" href="#CassandraOneTable" rel="bookmark">One Table</a> -- 
  Cassandra doesn't support "joins" in the way that relational databases do.
  One ERDDAP™ EDDTableFromCassandra dataset maps to one (perhaps a subset of one)
  Cassandra table.
  <br>&nbsp;

<li><a class="selfLink" id="CassandraDatasetsXml" href="#CassandraDatasetsXml" rel="bookmark">datasets.xml</a><br>
  
  <!-- It is difficult to create the correct 
  datasets.xml information needed for ERDDAP™ to establish
  a connection to Cassandra.  Be patient. Be methodical. -->
  <ul>
  <li>ERDDAP™ comes with the Cassandra Java driver, so you don't need to install 
    it separately.
  <li>Carefully read all of this document's information about EDDTableFromCassandra.
    Some of the details are very important.
  <li>The Cassandra Java driver is intended to work with Apache Cassandra (1.2+) 
     and DataStax Enterprise (3.1+).
     If you are using Apache Cassandra 1.2.x, you must edit the cassandra.yaml file for 
     each node to set <kbd>start_native_transport: true</kbd>, then restart each node.
     <!-- https://www.datastax.com/documentation/developer/java-driver/2.1/common/drivers/introduction/driverDependencies_r.html -->
  <li>We strongly recommend using the
    <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
    to make a rough draft of the datasets.xml chunk for this dataset.
    You can then edit that to fine tune it (especially
    <a rel="help"
    href="#CassandraPartitionKeySourceNames">&lt;partitionKeySourceNames&gt;</a>).
    You can gather most of the information you need to create the XML for an
    EDDTableFromCassandra dataset by contacting the Cassandra administrator and 
    by searching the web. 
    <p>GenerateDatasetsXml has two special options for EDDTableFromCassandra:
    <ol>
    <li>If you enter "!!!LIST!!!" (without the quotes) for the keyspace,
      the program will display a list of keyspaces.
    <li>If you enter a specific keyspace and then enter "!!!LIST!!!" 
      (without the quotes) for the tablename,
      the program will display a list of tables in that keyspace and their columns.
    </ol>
  <li><a class="selfLink" id="CassandraQuotes" href="#CassandraQuotes" rel="bookmark">Case-insensitive Keyspace and Table Names</a> -
    <br>Cassandra treats keyspace and table names in a case-insensitive way.
    Because of this, you MUST NEVER use a reserved word (but with a different case)
    as a Cassandra keyspace or table name.
  <li>Case-insensitive Column Names -- 
    <br>By default, Cassandra treats column names in a case-insensitive way.
    If you use one of Cassandra's reserved words as a column name (please don't!),
    you MUST use 
    <br><kbd>&lt;columnNameQuotes&gt;"&lt;/columnNameQuotes&gt;</kbd> 
    <br>in datasets.xml for this dataset so that Cassandra and ERDDAP™ will treat 
    the column names in a case-sensitive way.
    This will likely be a massive headache for you, because it is hard to determine
    the case-sensitive versions of the column names -- Cassandra almost always
    displays the column names as all lower-case, regardless of the true case.
  <li>Work closely with the Cassandra administrator, who may have relevant experience.
    If the dataset fails to load, read the 
    <a rel="help" href="#errorMessages">error message</a> carefully to find out why.
  <br>&nbsp;
  </ul>  

<li><a class="selfLink" id="CassandraConnectionProperty" href="#CassandraConnectionProperty" rel="bookmark">&lt;connectionProperty&gt;</a>   
    <br>Cassandra has connection properties which can be specified in datasets.xml. 
    Many of these will affect the performance of the Cassandra-ERDDAP™ connection. 
    Unfortunately, Cassandra properties must be set programmatically in Java,
    so ERDDAP™ must have code for each property ERDDAP™ supports.
    Currently, ERDDAP™ supports these properties:
    <br>(The defaults shown are what we see. Your system's defaults may be different.)
    <ul>
    <li><strong>General Options</strong>
    <!-- https://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/ProtocolOptions.html -->
    <br>&lt;connectionProperty 
    name="<strong>compression</strong>"&gt;<i>none|LZ4|snappy</i>&lt;/connectionProperty&gt;   (case-insensitive, default=none)
    <br>(General compression advice: use 'none' if the connection between 
    Cassandra and ERDDAP™ is local/fast and use 'LZ4' if the connection is remote/slow.)
    <br>&lt;connectionProperty 
    name="<strong>credentials</strong>"&gt;<i>username/password</i>&lt;/connectionProperty&gt; (that's a literal '/')
    <br>&lt;connectionProperty 
    name="<strong>metrics</strong>"&gt;<i>true|false</i>&lt;/connectionProperty&gt;   (2021-01-25 was default=true, now ignored and always false)
    <br>&lt;connectionProperty 
    name="<strong>port</strong>"&gt;<i>anInteger</i>&lt;/connectionProperty&gt;   (default for native binary protocol=9042)
    <br>&lt;connectionProperty 
    name="<strong>ssl</strong>"&gt;<i>true|false</i>&lt;/connectionProperty&gt;   (default=false)
    <br>(My quick attempt to use ssl failed. If you succeed, please tell me how 
    you did it.)

    <li><strong>Query Options</strong>
    <!-- https://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/QueryOptions.html -->
    <br>&lt;connectionProperty 
    name="<strong>consistencyLevel</strong>"&gt;<i>all|any|each_quorum|local_one|local_quorum|
    local_serial|one|quorum|serial|three|two</i>&lt;/connectionProperty&gt;   (case-insensitive, default=ONE)
    <br>&lt;connectionProperty 
    name="<strong>fetchSize</strong>"&gt;<i>anInteger</i>&lt;/connectionProperty&gt;  (default=5000)
    <br>(Do not set fetchSize to a smaller value.)
    <br>&lt;connectionProperty 
    name="<strong>serialConsistencyLevel</strong>"&gt;<i>all|any|each_quorum|local_one|local_quorum|
    local_serial|one|quorum|serial|three|two</i>&lt;/connectionProperty&gt;   (case-insensitive, default=SERIAL)

    <li><strong>Socket Options</strong>
    <!-- https://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/SocketOptions.html -->
    <br>&lt;connectionProperty
    name="<strong>connectTimeoutMillis</strong>"&gt;<i>anInteger</i>&lt;/connectionProperty&gt;  (default=5000)
    <br>(Do not set connectTimeoutMillis to a smaller value.)
    <br>&lt;connectionProperty 
    name="<strong>keepAlive</strong>"&gt;<i>true|false</i>&lt;/connectionProperty&gt;
    <br>&lt;connectionProperty 
    name="<strong>readTimeoutMillis</strong>"&gt;<i>anInteger</i>&lt;/connectionProperty&gt;   <br>(Cassandra's default readTimeoutMillis is 12000, but ERDDAP™ changes the default to 120000. If Cassandra is throwing readTimeouts, increasing this may not help, 
    because Cassandra sometimes throws them before this time. The problem is more
    likely that you are storing too much data per partitionKey combination.)
    <br>&lt;connectionProperty 
    name="<strong>receiveBufferSize</strong>"&gt;<i>anInteger</i>&lt;/connectionProperty&gt;
    <br>(It is unclear what the default receiveBufferSize is. Don't set this to a small value.)
    <br>&lt;connectionProperty 
    name="<strong>soLinger</strong>"&gt;<i>anInteger</i>&lt;/connectionProperty&gt;
    <br>&lt;connectionProperty 
    name="<strong>tcpNoDelay</strong>"&gt;<i>true|false</i>&lt;/connectionProperty&gt;   (default=null)
    <!-- 
    <li>&lt;connectionProperty 
    name=""&gt;<i>a</i>&lt;/connectionProperty&gt; -->
    </ul>
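    <p>For example, a dataset that connects to a remote Cassandra cluster
    might include tags like these. This is only a sketch: the credentials are
    placeholders, and the other values simply echo the advice and defaults listed above.
<pre>
&lt;connectionProperty name="compression"&gt;LZ4&lt;/connectionProperty&gt;
&lt;connectionProperty name="credentials"&gt;<i>someUsername/somePassword</i>&lt;/connectionProperty&gt;
&lt;connectionProperty name="port"&gt;9042&lt;/connectionProperty&gt;
&lt;connectionProperty name="readTimeoutMillis"&gt;120000&lt;/connectionProperty&gt;
</pre>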
    <p>If you need to be able to set other connection properties, please 
    send an email with the details to 
    <br><kbd>erd dot data at noaa dot gov</kbd>. 
    <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.

    <p>For a given startup of Tomcat, 
    connectionProperties are only used the first time a dataset
    is created for a given Cassandra URL. All reloads of that dataset 
    and all subsequent datasets that share the same URL 
    will use those original connectionProperties.

<li><a class="selfLink" id="CassandraCQL" href="#CassandraCQL" rel="bookmark">CQL</a> -- 
    The Cassandra Query Language (CQL) is superficially like SQL, 
    the query language used by traditional databases.
    Because OPeNDAP's tabular data requests were designed to mimic SQL tabular data requests,
    it is possible for ERDDAP™ to convert tabular data requests into  
    CQL Bound/PreparedStatements. 
    ERDDAP™ logs the statement in 
    <a rel="help" 
    href="https://erddap.github.io/setup.html#log">log.txt</a>
    as<kbd> 
    <br>statement as text: <i>theStatementAsText</i></kbd> 
    <br>The version of the statement you see
    will be a text representation of the statement and will 
    only have "?" where constraint values will be placed.
  <br>&nbsp;
  <br>Not so simple -- Unfortunately, CQL has many restrictions on which columns 
    can be queried
    with which types of constraints (for example, partition key columns can be 
    constrained with = and IN).
    So ERDDAP™ sends some constraints to Cassandra and applies all constraints 
    after the data is received from Cassandra. 
    To help ERDDAP™ deal efficiently with Cassandra, you need to specify
    <a rel="help" href="#CassandraPartitionKeySourceNames">&lt;partitionKeySourceNames&gt;</a>,
    <a rel="help" href="#CassandraClusterColumnSourceNames">&lt;clusterColumnSourceNames&gt;</a>, 
    and
    <a rel="help" href="#CassandraIndexColumnSourceNames">&lt;indexColumnSourceNames&gt;</a> 
    in datasets.xml for this dataset.
    These are the most important ways to help ERDDAP™ work efficiently with Cassandra.
    If you don't tell ERDDAP™ this information, 
    the dataset will be painfully slow in ERDDAP™ and use tons of Cassandra resources.
  <br>&nbsp;

<li><a class="selfLink" id="CassandraPartitionKeySourceNames" href="#CassandraPartitionKeySourceNames" rel="bookmark">&lt;partitionKeySourceNames&gt;</a> -
  Because partition keys play a central role in Cassandra tables,
  ERDDAP™ needs to know their sourceNames and, if relevant, other information about how
  to work with them.
  <ul>
  <li>You MUST specify a comma-separated list of partition key source column names in 
    datasets.xml via &lt;partitionKeySourceNames&gt;.
  <br>Simple example,
    <br><kbd>&lt;partitionKeySourceNames&gt;station, deviceid<wbr>&lt;/partitionKeySourceNames&gt;</kbd>
  <br>More complex example,
    <br><kbd>&lt;partitionKeySourceNames&gt;deviceid=1007, date/sampletime/1970-01-01<wbr>&lt;/partitionKeySourceNames&gt;</kbd>  

  <li>TimeStamp Partition Keys -- If one of the partition key columns is a 
    timestamp column that 
    has a coarser version of another timestamp column, specify this via 
    <br><kbd><i>partitionKeySourceName/otherColumnSourceName/time_precision</i></kbd>
    <br>where time_precision is one of the 
      <a rel="help" href="#time_precision">time_precision</a> strings used 
      elsewhere in ERDDAP.
    <br>The trailing Z in the time_precision string is the default, 
     so it doesn't matter if the time_precision string ends in Z or not. 
    <br>For example, ERDDAP™ will interpret 
      <kbd>date/sampletime/1970-01-01</kbd> as
    "Constraints for date can be constructed from constraints on sampletime by using
    this time_precision." The actual conversion of constraints is more complex,
    but that is the overview.
    <br><strong>Use this whenever it is relevant.</strong> 
    It enables ERDDAP™ to work efficiently with Cassandra.
    If this relationship between columns exists in a Cassandra table 
    and you don't tell ERDDAP™, 
    the dataset will be painfully slow in ERDDAP™ and use tons of Cassandra resources.

  <li>Single Value Partition Keys -- If you want an ERDDAP™ dataset to work with only 
    one value of one partition key, specify <kbd><i>partitionKeySourceName=value</i></kbd>. 
    <br>Don't use quotes for a numeric column, for example, <kbd>deviceid=1007</kbd> 
    <br>Do use quotes for a String column, for example, <kbd>stationid="Point Pinos"</kbd>

  <li>Dataset Default Sort Order -- The order of the partition key &lt;dataVariable&gt;'s
    in datasets.xml determines the default sort order of the results from Cassandra.
    Of course, users can request a different sort order for a given set of results
    by appending <kbd>&amp;orderBy("<i>comma-separated list of variables</i>")</kbd>
    to the end of their query.

  <li>By default, Cassandra and ERDDAP™ treat column names in a case-insensitive way.
    But if you set <a rel="help" href="#CassandraQuotes">columnNameQuotes</a> to ", 
    ERDDAP™ will treat Cassandra column names in a case-sensitive way.
  <br>&nbsp;
    
  </ul>
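
  <p>To show how the pieces above fit together, here is a sketch for a hypothetical 
  table whose partition key is (deviceid, date), where date is a coarser 
  (1970-01-01 precision) version of the sampletime column, 
  and where the dataset is restricted to one device.
  The column names and the value 1007 are assumptions made for illustration. 
  Note that both timestamp columns also appear as dataVariables.
<pre>
&lt;partitionKeySourceNames&gt;deviceid=1007, date/sampletime/1970-01-01&lt;/partitionKeySourceNames&gt;
&lt;!-- ... other dataset tags omitted ... --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;date&lt;/sourceName&gt;   &lt;!-- hypothetical coarse timestamp partition key --&gt;
  &lt;dataType&gt;double&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;sampletime&lt;/sourceName&gt;   &lt;!-- hypothetical full-precision timestamp --&gt;
  &lt;destinationName&gt;time&lt;/destinationName&gt;
  &lt;dataType&gt;double&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>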

<li><a class="selfLink" id="CassandraPartitionKeyCSV" href="#CassandraPartitionKeyCSV" rel="bookmark">&lt;partitionKeyCSV&gt;</a> -

    If this is specified, ERDDAP™ will use it instead of asking Cassandra for the 
    partitionKey information each time the dataset is reloaded.
    This provides the list of distinct partition key values, in the order they'll be used.
    Times must be specified as seconds since 1970-01-01T00:00:00Z.
    But there are also two special alternate ways to specify times:
    <br>1) time(anISO8601Time)    (MAY be encoded as a string)
    <br>2) "times(anISO8601StartTime, strideSeconds, stopTime)"  (MUST be encoded as a string)
    <br>stopTime can be an ISO8601Time or a "now-nUnits" time (e.g., "now-3minutes").
    <br>stopTime doesn't have to be an exact match of startTime + x strideSeconds.
    <br>A row with a times() value gets expanded into multiple rows before every query,
  so the list of partitionKeys can always be perfectly up-to-date.
  <br>For example,<pre>
&lt;partitionKeyCSV&gt;
deviceid,date
1001,"times(2014-11-01T00:00:00Z, 86400, 2014-11-02T00:00:00Z)"
1007,"time(2014-11-07T00:00:00Z)"
1008,time(2014-11-08T00:00:00Z)
1009,1.4154912E9
&lt;/partitionKeyCSV&gt;</pre>
expands into this table of partition key combinations:<pre>
deviceid,date
1001,1.4148E9
1001,1.4148864E9
1007,1.4153184E9
1008,1.4154048E9
1009,1.4154912E9 </pre>
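
<p>Because a times() row is expanded before every query, a relative stopTime can keep
the list of partition keys current. 
The following is a sketch, assuming a hypothetical device 1001 
whose daily partitions begin on 2014-11-01:
<pre>
&lt;partitionKeyCSV&gt;
deviceid,date
1001,"times(2014-11-01T00:00:00Z, 86400, now-3minutes)"
&lt;/partitionKeyCSV&gt;</pre>
Under the rules above, this would be expected to expand to one row per day 
(a stride of 86400 seconds), starting at 2014-11-01T00:00:00Z 
and ending at or before the stopTime.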

<li><a class="selfLink" id="CassandraClusterColumnSourceNames" href="#CassandraClusterColumnSourceNames" rel="bookmark">&lt;clusterColumnSourceNames&gt;</a> -
  Cassandra accepts SQL-like constraints on cluster columns, 
  which are the columns that form the second part of the primary key 
  (after the partition key(s)).
  So, it is essential that you identify these columns via &lt;clusterColumnSourceNames&gt;.
  This enables ERDDAP™ to work efficiently with Cassandra.
  If there are cluster columns and you don't tell ERDDAP,
  the dataset will be painfully slow in ERDDAP™ and use tons of Cassandra resources.
  <ul>
  <li>For example, <kbd>&lt;clusterColumnSourceNames&gt;<i>myClusterColumn1, myClusterColumn2</i>&lt;/clusterColumnSourceNames&gt;</kbd>
  <li>If a Cassandra table has no cluster columns, either don't specify 
    &lt;clusterColumnSourceNames&gt;, or specify it with no value.
  <li>By default, Cassandra and ERDDAP™ treat column names in a case-insensitive way.
    But if you set <a rel="help" href="#CassandraQuotes">columnNameQuotes</a> to ", 
    ERDDAP™ will treat Cassandra column names in a case-sensitive way.
  <br>&nbsp;
  </ul>

<li><a class="selfLink" id="CassandraIndexColumnSourceNames" href="#CassandraIndexColumnSourceNames" rel="bookmark">&lt;indexColumnSourceNames&gt;</a> -
  Cassandra accepts '=' constraints on secondary index columns, 
  which are the columns that you have explicitly created indexes for 
  via 
  <br><kbd>CREATE INDEX <i>indexName</i> ON <i>keyspace.tableName</i>
  (<i>columnName</i>);</kbd>
  <br>(Yes, the parentheses are required.) 
  <br>So, it is very useful if you identify these columns via &lt;indexColumnSourceNames&gt;.
  This enables ERDDAP™ to work efficiently with Cassandra.
  If there are index columns and you don't tell ERDDAP,
  some queries will be needlessly, painfully slow in ERDDAP™ and use tons of Cassandra resources.
  <ul>
  <li>For example, <kbd>&lt;indexColumnSourceNames&gt;<i>myIndexColumn1, myIndexColumn2</i>&lt;/indexColumnSourceNames&gt;</kbd>
  <li>If a Cassandra table has no index columns, either don't specify 
    &lt;indexColumnSourceNames&gt;, or specify it with no value.
  <li>WARNING: Cassandra indexes aren't like database indexes.
    Cassandra indexes only help with '=' constraints. 
    And they are only 
    <a rel="help" href="https://cassandra.apache.org/doc/latest/cql/indexes.html">recommended<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
    for columns that have far fewer distinct values than total values.
  <li>By default, Cassandra and ERDDAP™ treat column names in a case-insensitive way.
    But if you set <a rel="help" href="#CassandraQuotes">columnNameQuotes</a> to ", 
    ERDDAP™ will treat Cassandra column names in a case-sensitive way.
  <br>&nbsp;
  </ul>

<li><a class="selfLink" id="maxRequestFraction" href="#maxRequestFraction" rel="bookmark">&lt;maxRequestFraction&gt;</a> -
  When ERDDAP™ (re)loads a dataset, ERDDAP™ gets from Cassandra the list of 
  distinct combinations of the partition keys. 
  For a huge dataset, the number of combinations will be huge.
  If you want to prevent user requests from requesting most or all of the dataset
  (or even a request that asks ERDDAP™ to download most or all of the data 
  in order to further filter it),
  you can tell ERDDAP™ only to allow requests that reduce the number of combinations
  by some amount via &lt;maxRequestFraction&gt;, which is a floating point number
  between 1e-10 (which means that the request can't need more than 1 combination 
  in 10 billion) and 1 (the default, which means that the request can be for the 
  entire dataset).
  <br>For example, if a dataset has 10000 distinct combinations of the partition keys
  and maxRequestFraction is set to 0.1,
  <br>then requests which need data from 1001 or more combinations will generate 
    an error message,
  <br>but requests which need data from 1000 or fewer combinations will be allowed.

  <p>Generally, the larger the dataset, the lower you should set
    &lt;maxRequestFraction&gt;. 
    So you might set it to 1 for a small dataset, 0.1 for a medium-sized dataset,
    0.01 for a large dataset, and 0.0001 for a huge dataset.

  <p>This approach is far from perfect. It will lead to some reasonable requests being
  rejected and some too-big requests being allowed. But it is a difficult problem 
  and this solution is much better than nothing.
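
  <p>For instance, the example above (10000 distinct combinations, 
  with at most 1000 allowed per request) corresponds to this setting 
  in the dataset's chunk of datasets.xml:
<pre>
&lt;maxRequestFraction&gt;0.1&lt;/maxRequestFraction&gt;
</pre>
  With that setting, a request that needs 1000 of the 10000 combinations 
  (1000/10000 = 0.1) is allowed, 
  but a request that needs 1001 combinations (1001/10000 = 0.1001) is rejected.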

<li><a rel="help" href="#subsetVariables"><kbd>subsetVariables</kbd></a> -
  As with other EDDTable datasets, you can specify a comma-separated
  list of <kbd>&lt;dataVariable&gt;</kbd> destinationNames 
  in a global attribute called "subsetVariables"
  to identify variables which have a limited number of
  values.
  The dataset will then have a .subset web page 
  and show lists of distinct values for those variables in drop-down
  lists on many web pages.

  <p>Including just partition key variables and static columns
  in the list is STRONGLY ENCOURAGED.
  Cassandra will be able to generate the list of distinct combinations very 
  quickly and easily each time the dataset is reloaded.
  One exception is timestamp partition keys that are coarse versions of
  some other timestamp column -- it is probably best to leave these
  out of the list of subsetVariables since there are a large number of values
  and they aren't very useful to users.

  <p>If you include non-partition key, non-static variables in the list, 
  it will probably be <strong>very</strong> computationally expensive for Cassandra
  each time the dataset is reloaded, because ERDDAP™ has to look through
  every row of the dataset to generate the information.
  In fact, the query is likely to fail.
  So, except for very small datasets, this is STRONGLY DISCOURAGED.   
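
  <p>As a sketch, assuming a hypothetical dataset whose only partition key variables 
  have the destinationNames deviceid and station, 
  the global attribute might look like this:
<pre>
&lt;addAttributes&gt;
  &lt;!-- deviceid and station are hypothetical destinationNames of partition key variables. --&gt;
  &lt;att name="subsetVariables"&gt;deviceid, station&lt;/att&gt;
&lt;/addAttributes&gt;
</pre>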

<li><a class="selfLink" id="CassandraDataTypes" href="#CassandraDataTypes" rel="bookmark">Cassandra DataTypes</a> -- 
  Because there is some ambiguity about which
  <a rel="help" 
  href="https://cassandra.apache.org/doc/latest/cql/types.html">Cassandra data types</a>
  map to which ERDDAP™ data types, you need to specify a 
  <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> tag
  for each <a rel="help" href="#dataVariable"><kbd>&lt;dataVariable&gt;</kbd></a>
  to tell ERDDAP™ which dataType to use.
  The standard ERDDAP™ dataTypes (and the most common corresponding Cassandra data types) 
  are: 
  <ul>
  <li><a rel="help" href="#booleanData">boolean</a> (boolean), which ERDDAP™ then stores as bytes
  <li>byte (int, if the range is -128 to 127)
  <li>short (int, if the range is -32768 to 32767)
  <li>int (int, counter?, varint?, if the range is -2147483648 to 2147483647)
  <li>long (bigint, counter?, varint?, if the range is -9223372036854775808 to 9223372036854775807)
  <li>float (float)
  <li>double (double, decimal (with possible loss of precision), timestamp)
  <li>char (ascii or text, if they never have more than 1 character)
  <li>String (ascii, text, varchar, inet, uuid, timeuuid, blob, map, set, list?)
  </ul>
  <p>Cassandra's <a rel="help" href="#CassandraTimeStamp">timestamp</a>
  is a special case: use ERDDAP's double dataType.

  <p>If you specify a String dataType in ERDDAP™ for a Cassandra map, set or list,
  the map, set or list on each Cassandra row will be converted to a single string 
  on a single row in the ERDDAP™ table. 
  ERDDAP™ has an alternative system for lists; see below. 

  <p><i>type</i>Lists -- ERDDAP's 
  <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> 
  tag for Cassandra dataVariables can include the regular 
  ERDDAP™ dataTypes (see above) plus several special dataTypes that can be used 
  for Cassandra list columns: booleanList, 
  byteList, ubyteList, shortList, ushortList, intList, uintList,
  longList, ulongList, floatList, doubleList, 
  charList, StringList.
  When one of these list columns is in the results being passed to ERDDAP™, 
  each row of source data will be expanded
  to list.size() rows of data in ERDDAP; simple dataTypes (for example, int)
  in that source data row will be duplicated list.size() times. 
  If the results contain more than one list variable, all lists on a given 
  row of data MUST have the same size and MUST be "parallel" lists,
  or ERDDAP™ will generate an error message. For example,
  for currents measurements from an ADCP,
  <br>&nbsp;&nbsp;depth[0], uCurrent[0], vCurrent[0], and zCurrent[0] are all related, and 
  <br>&nbsp;&nbsp;depth[1], uCurrent[1], vCurrent[1], and zCurrent[1] are all related, ...  
  <br>Alternatively, if you don't want ERDDAP™ to expand a list into multiple
  rows in the ERDDAP™ table, specify String as the dataVariable's dataType
  so the entire list will be represented as one String on one row in ERDDAP.
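
  <p>For the ADCP example above, the dataVariables might be sketched as follows.
  This assumes that depth and uCurrent are parallel list columns of floating point 
  numbers in the hypothetical Cassandra table; vCurrent and zCurrent 
  would be defined in the same way.
<pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;depth&lt;/sourceName&gt;
  &lt;dataType&gt;floatList&lt;/dataType&gt;   &lt;!-- each list item becomes its own row in ERDDAP --&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;uCurrent&lt;/sourceName&gt;
  &lt;dataType&gt;floatList&lt;/dataType&gt;   &lt;!-- must be the same size as depth on every source row --&gt;
&lt;/dataVariable&gt;
</pre>
  With these definitions, a source row whose lists each have 10 items 
  would become 10 rows in ERDDAP, with depth[0] paired with uCurrent[0], 
  depth[1] paired with uCurrent[1], and so on.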

<li><a class="selfLink" id="CassandraTimeStamp" href="#CassandraTimeStamp" rel="bookmark">Cassandra TimeStamp Data</a> -- 
  Cassandra's timestamp data is always aware of time zones. 
  If you enter timestamp data without specifying a timezone, Cassandra assumes
  the timestamp uses the local time zone. 

  <p>ERDDAP™ supports timestamp data and always presents the data in the Zulu/GMT time zone.
  So if you enter timestamp data in Cassandra using a time zone other than Zulu/GMT,
  remember that you need to do all queries for timestamp data in ERDDAP™ using the Zulu/GMT time zone.
  So don't be surprised when the timestamp values that come out of ERDDAP
  are shifted by several hours because of the time zone switch from local to Zulu/GMT time.
 
  <ul>
  <li>In ERDDAP's datasets.xml, in the <kbd>&lt;dataVariable&gt;</kbd> tag for 
    a timestamp variable, set
    <br>&nbsp;&nbsp;<kbd>&lt;dataType&gt;double&lt;/dataType&gt;</kbd> 
    <br>and in <kbd>&lt;addAttributes&gt;</kbd> set
    <br>&nbsp;&nbsp;<kbd>&lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;</kbd> .
  <li>Suggestion: If the data is a time range, it is useful to have the timestamp
  values refer to 
  the center of the implied time range (for example, noon).
  For example, if a user has data for 2010-03-26T13:00Z from another dataset 
  and they want the closest data from this Cassandra dataset that has data for each day, 
  then the data for 2010-03-26T12:00Z (representing Cassandra data for that date) 
  is obviously the best 
  (as opposed to the midnight before or after, where it is less obvious which is best).
  <li>ERDDAP™ has a utility to 
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html"
    >Convert a Numeric Time to/from a String Time</a>.
  <li>See <a rel="help" 
    href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html#erddap">How
     ERDDAP™ Deals with Time</a>.
     <br>&nbsp;
  </ul>

<li><a class="selfLink" id="CassandraNulls" href="#CassandraNulls" rel="bookmark">Integer nulls</a> -- 
  Cassandra supports nulls in Cassandra int (ERDDAP™ int) and bigint (ERDDAP™ long) columns, 
  but ERDDAP™ doesn't support true nulls for any integer data type.
  <br>By default, Cassandra integer nulls will be converted in ERDDAP™ 
  to 2147483647 for int columns,
  or 9223372036854775807 for long columns.
  These will appear as "NaN" in some types of text output files (for example, .csv),
  "" in other types of text output files (for example, .htmlTable),
  and the specific number (2147483647 for missing int values) in other types of files
  (for example, binary files like .nc and .mat).
  A user can search for rows of data with this type of missing value 
  by referring to "NaN", e.g., "&amp;windSpeed=NaN".
  <p>If you use some other integer value to indicate missing values in your Cassandra 
    table, please identify that value in datasets.xml:
    <br><kbd>    &lt;att name="missing_value" <a rel="help" href="#attributeType">type="int"</a>&gt;-999&lt;/att&gt;</kbd>
  <p>For Cassandra floating point columns, nulls get converted to NaNs in ERDDAP.
    For Cassandra data types that are converted to Strings in ERDDAP™, 
    nulls get converted to empty Strings. That shouldn't be a problem.
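
  <p>To show where the missing_value attribute belongs, 
  here is a sketch of a complete dataVariable.
  The column name windspeed and the value -999 are assumptions made for illustration.
<pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;windspeed&lt;/sourceName&gt;   &lt;!-- hypothetical Cassandra int column --&gt;
  &lt;destinationName&gt;windSpeed&lt;/destinationName&gt;
  &lt;dataType&gt;int&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;att name="missing_value" type="int"&gt;-999&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>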

<li><a class="selfLink" id="CassandraRepreparingQuery" href="#CassandraRepreparingQuery" rel="bookmark">"WARNING:
  Re-preparing already prepared query"</a>
  in <i>tomcat</i>/logs/catalina.out (or some other Tomcat log file)
  <br>Cassandra documentation says there is trouble if the same query is
  made into a PreparedStatement twice (or more).
  (See this 
  <a rel="help" href="https://datastax-oss.atlassian.net/browse/JAVA-236">bug
  report<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.)
  To avoid making Cassandra mad, ERDDAP™ caches all PreparedStatements 
  so it can reuse them.
  That cache is lost if/when Tomcat/ERDDAP™ is restarted, but I think that is
  okay because the PreparedStatements are associated with a given session
  (between Java and Cassandra), which is also lost.
  So, you may see these messages. I know of no other solution.
  Fortunately, it is a warning, not an error 
  (although Cassandra threatens that it may lead to performance problems).
  <p>Cassandra claims that PreparedStatements are good forever,
  so ERDDAP's cached PreparedStatements should never become out-of-date/invalid. 
  If that isn't true, and you get errors about certain PreparedStatements
  being out-of-date/invalid, then you need to restart ERDDAP™ to clear 
  ERDDAP's cache of PreparedStatements.

<li><a class="selfLink" id="CassandraSecurity" href="#CassandraSecurity" rel="bookmark">Security</a> 
  <br>See <a rel="help" 
  href="https://cassandra.apache.org/doc/latest/operating/security.html">
  Securing Cassandra</a>

  <p>When working with Cassandra, you need to do things as safely and securely as
  possible
  to avoid allowing a malicious user to damage your Cassandra or gain access to 
  data they shouldn't have access to.
  ERDDAP™ tries to do things in a secure way, too.
  <ul>
  <li>We encourage you to set up ERDDAP™ to connect to Cassandra as a 
    Cassandra user that only has
    access to the <strong>relevant</strong> table(s) and only has READ privileges.
  <li>We encourage you to set up the connection from ERDDAP™ to Cassandra so that it 
    <ul>
    <li>always uses SSL, 
    <li>only allows connections from one IP address (or one block of addresses) 
      and from the one ERDDAP™ user, and 
    <li>only transfers passwords in their MD5 hashed form.
    </ul>
  <li>[KNOWN PROBLEM] The connectionProperties (including the password!) are
    stored as plain text in datasets.xml.
    We haven't found a way to allow the administrator to enter the Cassandra 
    password during
    ERDDAP's startup in Tomcat (which occurs without user input), so the password 
    must be accessible in a file.
    To make this more secure:
    <ul>
    <li>You (the ERDDAP™ administrator) should be the owner of datasets.xml and have
      READ and WRITE access.
    <li>Make a group that includes only user=tomcat. 
      Use chgrp to make that the group for datasets.xml, with just READ privileges.
    <li>Use chmod to assign o-rwx privileges (no READ or WRITE access for "other" users) 
      for datasets.xml.
    </ul>
  <li>When in ERDDAP™, the password and other connection properties are stored in
    "private" Java variables.
  <li>Requests from clients are parsed and checked for validity before generating 
    the CQL requests for Cassandra.
  <li>Requests to Cassandra are made with CQL Bound/PreparedStatements, to prevent
    CQL injection.
    In any case, Cassandra is inherently less susceptible to CQL injection
    than traditional databases are to 
    <a rel="help" href="https://en.wikipedia.org/wiki/SQL_injection">SQL injection</a>.
  <br>&nbsp;
  </ul>

<li><a class="selfLink" id="CassandraSpeed" href="#CassandraSpeed" rel="bookmark">Speed</a> -- Cassandra can be fast or slow. 
 There are some things you can do to make it fast:
  <ul>
  <li>In General -
    <br>The nature of CQL is that queries are 
    <a rel="help" href="https://en.wikipedia.org/wiki/Declarative_programming">declarative<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    They just specify what the user wants. 
    They don't include a specification or hints for how the query is to be handled
    or optimized.
    So there is no way for ERDDAP™ to generate the query in such a way that it helps 
    Cassandra optimize the query (or in any way specifies how the query is to be handled). 
    In general, it is up to the Cassandra administrator to set things up 
    (for example, indexes) to optimize for certain types of queries.
  <br>&nbsp;

  <li>Specifying the timestamp columns that are related to coarser-precision timestamp
    partition keys via
    <a rel="help" href="#CassandraPartitionKeySourceNames">&lt;partitionKeySourceNames&gt;</a> 
    is the most important way to help ERDDAP™ work efficiently with Cassandra.
    If this relationship exists in a Cassandra table and you don't tell ERDDAP™, 
    the dataset will be painfully slow in ERDDAP™ and use tons of Cassandra resources.
  <br>&nbsp;

  <li>Specifying the cluster columns via
    <a rel="help" href="#CassandraClusterColumnSourceNames">&lt;clusterColumnSourceNames&gt;</a> 
    is the second most important way to help ERDDAP™ work efficiently with Cassandra.
    If there are cluster columns and you don't tell ERDDAP,
    a large subset of the possible queries for data 
    will be needlessly, painfully slow in ERDDAP™ and use tons of Cassandra resources.
  <br>&nbsp;

  <li>Make <a rel="help"
    href="https://cassandra.apache.org/doc/latest/cql/indexes.html"
    >Indexes<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    for Commonly Constrained Variables -- 
    <br>You can speed a few queries by creating indexes for Cassandra columns
    that are often constrained with "=" constraints.  

    <p>Cassandra can't make indexes for list, set, or map columns.

  <li>Specifying the index columns via
    <a rel="help" href="#CassandraIndexColumnSourceNames">&lt;indexColumnSourceNames&gt;</a> 
    is an important way to help ERDDAP™ work efficiently with Cassandra.
    If there are index columns and you don't tell ERDDAP,
    some queries for data 
    will be needlessly, painfully slow in ERDDAP™ and use tons of Cassandra resources.
  <br>&nbsp;

  <li><a class="selfLink" id="CassandraStats" href="#CassandraStats" rel="bookmark">"Cassandra stats" Diagnostic Messages</a> -- 
    For every ERDDAP™ user query to a Cassandra dataset, 
    ERDDAP™ will print a line in the log file, <i>bigParentDirectory</i>/logs/log.txt,
    with some statistics related to the query, for example,
    <br><kbd>* Cassandra stats: partitionKeyTable: 2/10000=2e-4 &lt; 0.1 nCassRows=1200 nErddapRows=12000 nRowsToUser=7405</kbd>
    <br>Using the numbers in the example above, this means:
    <ul>
    <li>When ERDDAP™ last (re)loaded this dataset, Cassandra told ERDDAP™ that there
      were 10000 distinct combinations of the partition keys.
      ERDDAP™ cached all of the distinct combinations in a file.
    <li>Due to the user's constraints, ERDDAP™ identified 2 combinations 
      out of the 10000 that might
      have the desired data. So, ERDDAP™ will make 2 calls to Cassandra, one for each
      combination of the partition keys. (That's what Cassandra requires.) 
      Clearly, it is troublesome if a large dataset has
      a large number of combinations of the partition keys
      and a given request doesn't drastically reduce that.
      You can require that each request reduce the key space by setting
      <a rel="help" href="#maxRequestFraction">&lt;maxRequestFraction&gt;</a>.
      Here, 2/10000=2e-4, which is less than the maxRequestFraction (0.1),
      so the request was allowed.
    <li>After applying the constraints on the partition keys, 
      <a rel="help" href="#CassandraClusterColumnSourceNames">cluster columns</a>, and
      <a rel="help" href="#CassandraIndexColumnSourceNames">index columns</a>
      which were sent by ERDDAP™, 
      Cassandra returned 1200 rows of data to ERDDAP™ in the ResultSet.
    <li>The ResultSet must have had 
      <a rel="help" href="#CassandraDataTypes">dataType=<i>sometype</i>List</a> 
      columns (with an average 
      of 10 items per list), because ERDDAP™ expanded the 1200 rows from Cassandra
      into 12000 rows in ERDDAP.
    <li>ERDDAP™ always applies all of the user's constraints to the data from 
      Cassandra. In this case, constraints which Cassandra had not handled
      reduced the number of rows to 7405. That is the number of rows sent to the user.
    </ul>
    The most important use of these diagnostic messages is to make sure that 
    ERDDAP™ is doing what you think it is doing.  
    If it isn't 
    (for example, is it not reducing the number of distinct combinations as expected?), 
    then you can use the information to try to figure out what's going wrong.
    <br>&nbsp;

  <li>Research and experiment to find and set better 
    <a rel="help" href="#CassandraConnectionProperty">&lt;connectionProperty&gt;</a>'s. 
  <br>&nbsp;  

  <li>Check the speed of the network connection between Cassandra and ERDDAP.
    If the connection is slow, see if you can improve it.
    The best situation is when ERDDAP™ is running on a server
    attached to the same (fast) switch as the server running the Cassandra node
    to which you are connecting. 
    <br>&nbsp;

  <li>Please be patient. Read the information here and in the Cassandra documentation
    carefully. Experiment. Check your work.
    If the Cassandra-ERDDAP™ connection is still slower than you expect, please
    email your Cassandra table's schema and your ERDDAP™ chunk of datasets.xml
    to <kbd>erd dot data at noaa dot gov</kbd>.
    <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.
    <br>&nbsp;

  <li>If all else fails, 
    <br>consider storing the data in a collection of NetCDF v3 .nc files 
    (especially .nc files that use the 
    <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
    >CF Discrete Sampling Geometries (DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    Contiguous Ragged Array data structures and so can be handled with ERDDAP's
    <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>).
    If they are logically organized (each with data for a chunk of space and time), 
    ERDDAP™ can extract data from them very quickly.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDTableFromCassandraSkeletonXML" href="#EDDTableFromCassandraSkeletonXML" rel="bookmark">The skeleton XML for an 
EDDTableFromCassandra dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromCassandra" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;ipAddress&gt;</a>...&lt;/ipAddress&gt;
    &lt;!-- The Cassandra URL without the port number, for example, 
    127.0.0.1 REQUIRED. --&gt;
  &lt;<a rel="help" href="#CassandraConnectionProperty">connectionProperty</a> name="<i>name</i>"&gt;<i>value</i>&lt;/connectionProperty&gt;
    &lt;!-- The names (for example, "readTimeoutMillis") and values 
      of the Cassandra properties that ERDDAP™ needs to change. 
      0 or more. --&gt; 
  &lt;keyspace&gt;...&lt;/keyspace&gt; &lt;!-- The name of the keyspace that has 
    the table. REQUIRED. --&gt;
  &lt;tableName&gt;...&lt;/tableName&gt; &lt;!-- The name of the table, default = "".
    REQUIRED. --&gt;
  <a rel="help" href="#CassandraPartitionKeySourceNames">&lt;partitionKeySourceNames&gt;</a>...&lt;/partitionKeySourceNames&gt; 
    &lt;!-- REQUIRED. --&gt;
  <a rel="help" href="#CassandraClusterColumnSourceNames">&lt;clusterColumnSourceNames&gt;</a>...&lt;/clusterColumnSourceNames&gt; 
    &lt;!-- OPTIONAL. --&gt;
  <a rel="help" href="#CassandraIndexColumnSourceNames">&lt;indexColumnSourceNames&gt;</a>...&lt;/indexColumnSourceNames&gt; &lt;!-- OPTIONAL. --&gt;
  <a rel="help" href="#maxRequestFraction">&lt;maxRequestFraction&gt;</a>...&lt;/maxRequestFraction&gt; 
    &lt;!-- OPTIONAL double between 1e-10 and 1 (the default). --&gt;
  <a rel="help" href="#CassandraQuotes">&lt;columnNameQuotes&gt;</a>...&lt;/columnNameQuotes&gt; &lt;!-- OPTIONAL.
    Options: [nothing] (the default) or ". --&gt;
  <a rel="help" href="#sourceNeedsExpandedFP_EQ">&lt;sourceNeedsExpandedFP_EQ&gt;</a>true(default)|false&lt;/sourceNeedsExpandedFP_EQ&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more.
     Each dataVariable MUST include a <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> tag. See 
       <a rel="help" href="#CassandraDataTypes">Cassandra DataTypes</a>.
     For <a rel="help" href="#CassandraTimeStamp">Cassandra timestamp columns</a>, set dataType=double and 
     units=seconds since 1970-01-01T00:00:00Z --&gt;
&lt;/dataset&gt;
</pre>
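<p>As a sketch of how the skeleton might be filled in, here is a minimal example 
for a hypothetical table with the primary key ((deviceid, date), sampletime).
The datasetID, keyspace, table, column names, and all values are assumptions
made for illustration, and most global attributes are omitted.
Treat it as a starting point, not a tested configuration.
<pre>
&lt;dataset type="EDDTableFromCassandra" datasetID="cass_hypothetical_buoys" active="true"&gt;
  &lt;ipAddress&gt;127.0.0.1&lt;/ipAddress&gt;
  &lt;connectionProperty name="readTimeoutMillis"&gt;120000&lt;/connectionProperty&gt;
  &lt;keyspace&gt;mykeyspace&lt;/keyspace&gt;
  &lt;tableName&gt;buoydata&lt;/tableName&gt;
  &lt;!-- date is a coarser version of sampletime. --&gt;
  &lt;partitionKeySourceNames&gt;deviceid, date/sampletime/1970-01-01&lt;/partitionKeySourceNames&gt;
  &lt;clusterColumnSourceNames&gt;sampletime&lt;/clusterColumnSourceNames&gt;
  &lt;maxRequestFraction&gt;0.1&lt;/maxRequestFraction&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;addAttributes&gt;
    &lt;!-- Other global attributes are omitted from this sketch. --&gt;
    &lt;att name="subsetVariables"&gt;deviceid&lt;/att&gt;
  &lt;/addAttributes&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;deviceid&lt;/sourceName&gt;
    &lt;destinationName&gt;deviceid&lt;/destinationName&gt;
    &lt;dataType&gt;int&lt;/dataType&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;date&lt;/sourceName&gt;
    &lt;destinationName&gt;date&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;sampletime&lt;/sourceName&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;watertemp&lt;/sourceName&gt;
    &lt;destinationName&gt;waterTemp&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
  &lt;/dataVariable&gt;
&lt;/dataset&gt;
</pre>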
&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromDapSequence" href="#EDDTableFromDapSequence" rel="bookmark"><strong>EDDTableFromDapSequence</strong></a> handles variables within 
1- and 2-level sequences from 
<a rel="help" href="https://www.opendap.org/">DAP<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    servers such as DAPPER (was at https://www.pmel.noaa.gov/epic/software/dapper/, now discontinued).
<ul>
<li><p>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine-tune it.
  You can gather the information you need 
  by looking at the source dataset's DDS and DAS files in your browser 
  (by adding .das and .dds to the sourceUrl;
  an example was at https://dapper.pmel.noaa.gov/dapper/epic/tao_time_series.cdp.dds ).
<li>A variable is in a DAP sequence if the .dds response indicates that the data structure holding
  the variable is a "sequence" (case insensitive).
<li>In some cases, you will see a sequence within a sequence, a 2-level sequence -- 
  EDDTableFromDapSequence handles these, too.
<li><a class="selfLink" id="EDDTableFromDapSequenceSkeletonXML" href="#EDDTableFromDapSequenceSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromDapSequence dataset is:</a>

<pre>
&lt;dataset type="EDDTableFromDapSequence" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;outerSequenceName&gt;...&lt;/outerSequenceName&gt;
    &lt;!-- The name of the outer sequence for DAP sequence data. 
    This tag is REQUIRED. --&gt;
  &lt;innerSequenceName&gt;...&lt;/innerSequenceName&gt;
    &lt;!-- The name of the inner sequence for DAP sequence data. 
    This tag is OPTIONAL; use it if the DAP data is a two-level 
    sequence. --&gt;
  <a rel="help" href="#sourceNeedsExpandedFP_EQ">&lt;sourceNeedsExpandedFP_EQ&gt;</a>true(default)|false&lt;/sourceNeedsExpandedFP_EQ&gt;
  <a rel="help" href="#sourceCanConstrainStringEQNE">&lt;sourceCanConstrainStringEQNE&gt;</a>true|false&lt;/sourceCanConstrainStringEQNE&gt;
  <a rel="help" href="#sourceCanConstrainStringGTLT">&lt;sourceCanConstrainStringGTLT&gt;</a>true|false&lt;/sourceCanConstrainStringGTLT&gt;
  <a rel="help" href="#sourceCanConstrainStringRegex">&lt;sourceCanConstrainStringRegex&gt;</a>...&lt;/sourceCanConstrainStringRegex&gt;
  &lt;skipDapperSpacerRows&gt;...&lt;/skipDapperSpacerRows&gt;
    &lt;!-- skipDapperSpacerRows specifies whether the dataset 
    will skip the last row of each innerSequence other than the 
    last innerSequence (because Dapper servers put NaNs in the 
    row to act as a spacer).  This tag is OPTIONAL. The default 
    is false.  It is recommended that you set this to true for 
    all Dapper sources and false for all other data sources. --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
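<p>As an illustration of the skeleton above, here is a minimal sketch of a 
datasets.xml chunk for a 2-level sequence from a Dapper-style server.
The datasetID, the sourceUrl, the sequence names, and the variable names are
hypothetical placeholders (assumptions, not values from a real server);
take the real names from the source's .dds response.
<pre>
&lt;dataset type="EDDTableFromDapSequence" datasetID="exampleDapSequence" active="true"&gt;
  &lt;!-- hypothetical URL; use the sequence's URL without .dds or .das --&gt;
  &lt;sourceUrl&gt;https://example.com/dapper/example.cdp&lt;/sourceUrl&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;!-- hypothetical sequence names; copy the real ones from the .dds response --&gt;
  &lt;outerSequenceName&gt;location&lt;/outerSequenceName&gt;
  &lt;innerSequenceName&gt;time_series&lt;/innerSequenceName&gt;
  &lt;!-- true because this sketch assumes a Dapper source --&gt;
  &lt;skipDapperSpacerRows&gt;true&lt;/skipDapperSpacerRows&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;lat&lt;/sourceName&gt;
    &lt;destinationName&gt;latitude&lt;/destinationName&gt;
  &lt;/dataVariable&gt;
  &lt;!-- more dataVariable tags, as needed --&gt;
&lt;/dataset&gt;
</pre>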


<p><a class="selfLink" id="EDDTableFromDatabase" href="#EDDTableFromDatabase" rel="bookmark"><strong>EDDTableFromDatabase</strong></a> 
handles data from one relational database table or 
<a rel="help" href="https://en.wikipedia.org/wiki/View_(database)">view<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
<ul>
<li><a class="selfLink" id="databaseOneTable" href="#databaseOneTable" rel="bookmark">One Table</a> (or <a class="selfLink" id="databaseViews" href="#databaseViews" rel="bookmark">View</a>) 
  <br>If the data you want to serve is in two or more tables 
  (and thus needs a JOIN to extract data from both tables at once), 
  you need to make one 
      <a rel="help" href="https://en.wikipedia.org/wiki/Denormalization">denormalized<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    (already joined) table or
      <a rel="help" href="https://en.wikipedia.org/wiki/View_(SQL)">view<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    with all of the data that you want to make available as one dataset in ERDDAP.

    <p>For large, complex databases, it may make sense to separate out several chunks
    as denormalized tables,
    each with a different type of data, which will become separate datasets in ERDDAP.

    <p>Making a denormalized table for use in ERDDAP™ may sound like a crazy idea to you.
    Please trust us. There are several reasons why ERDDAP™ works with
    denormalized tables:
    <ul>
    <li>It's vastly easier for users.
      <br>When ERDDAP™ presents the dataset as one, simple, denormalized, single table,
      it is very easy for anyone to understand the data. Most users have never
      heard of normalized tables, and very few understand
      keys, foreign keys, or table joins,
      and they almost certainly don't know the details of the different types of joins,
      or how to specify the SQL to do a join (or multiple joins) correctly. Using a denormalized
      table avoids all those problems.  This reason alone justifies the use of a
      denormalized single table for the presentation of a dataset to ERDDAP™ users.
      <br>&nbsp;
    <li>Normalized tables (multiple tables related by key columns) are great for 
      storing data in a database. 
      <br>But even in SQL, the result that is returned to the user is a denormalized 
      (joined) single table. So it seems reasonable to present the dataset to users
      as a huge, denormalized, single table from which they can then request subsets
      (e.g., show me rows of the table where temperature&gt;30).
      <br>&nbsp;
    <li>You can make changes for ERDDAP™ without changing your tables.
      <br>ERDDAP™ has a few requirements that may be different from how you have set
      up your database.
      <br>For example, ERDDAP™ requires that timestamp data be stored in 'timestamp
      with time zone' fields.
      <br>By making a separate table/view for ERDDAP™, you can make these changes
      when you make the denormalized table for ERDDAP.
      Thus, you don't have to make any changes to your tables.
      <br>&nbsp;
    <li>ERDDAP™ will recreate some of the structure of the normalized tables.
      <br>You can specify which columns of data come from the 'outer' tables and therefore
      have a limited number of distinct values.  ERDDAP™ will collect all of the
      different combinations of values in these columns
      and present them to users on a special .subset web page that 
      helps users quickly select subsets of the dataset.
      The distinct values for each column are also shown in drop-down lists 
      on the dataset's other web pages.
      <br>&nbsp;
    <li>A denormalized table makes the data hand-off from you to the ERDDAP
      administrator easy.
      <br>You're the expert for this dataset,
      so it makes sense that you make the decisions
      about which tables and which columns to join and how to join them.
      So you don't have to hand us (or worse, the end users) 
      several tables and detailed instructions for how to join them; 
      you just have to give us access to the denormalized table.
      <br>&nbsp;
    <li>A denormalized table allows for efficient access to the data.
      <br>The denormalized form is usually faster to access than the normalized form.
      Joins can be slow. Multiple joins can be very slow.
      <br>&nbsp;
    </ul>
    In order to get the data from two or more tables in the database into ERDDAP™, 
    there are three options:
    <br>&nbsp;
    <ul>
    <li>Recommended Option:
      <br>You can create a comma- or tab-separated-value file with
      the data from the denormalized table.
      <br>If the dataset is huge, then it makes sense to create several files,
      each with a cohesive subset of the denormalized table
      (for example, data from a smaller time range).
      <p>The big advantage here is that ERDDAP™ will be able to handle user requests
      for data without any further effort by your database.
      So ERDDAP™ won't be a burden on your database or a security risk.
      This is the best option under almost all circumstances because ERDDAP™ can usually
      get data from files faster than from a database
      (if we convert the .csv files to .ncCF files). (Part of the reason is that
      ERDDAP+files is a read-only system and doesn't have to deal with making changes
      while providing
      <a rel="help" href="https://en.wikipedia.org/wiki/ACID">ACID<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
      (Atomicity, Consistency, Isolation, Durability).)
      Also, you probably won't need a separate server since we can store the data
      on one of our RAIDs and access it with an existing ERDDAP™ on an existing server.

    <li>Okay Option:
      <br>You set up a new database on a different computer with just the
      denormalized table.
      <br>Since that database can be a free and open source database like MariaDB, MySQL, or PostgreSQL,
      this option needn't cost a lot.
      <p>The big advantage here is that ERDDAP™ will be able to handle user requests
      for data without any further effort by your current database.
      So ERDDAP™ won't be a burden on your current database.
      This also eliminates a lot of security concerns since ERDDAP™ will not have
      access to your current database.

    <li>Discouraged Option:
      <br>We can connect ERDDAP™ to your current database.
      <br>To do this, you need to:
      <ul>
      <li>Create a separate table or view with the denormalized table of data.
      <li>Create an "erddap" user who has read-only access to only the denormalized table(s).
        <br>&nbsp;
      </ul>
      This is an option if the data changes very frequently and you want to give
      ERDDAP™ users instant access to those changes; however, even so,
      it may make sense to use the file option above and periodically
      (every 30 minutes?) replace the file that has today's data.
      <br>The huge disadvantages of this approach are that ERDDAP™ user requests will
      probably place an unbearably large burden on your database and that
      the ERDDAP™ connection is a security risk (although we can minimize/manage the risk).
    </ul>
  
  <p>Making the denormalized table or view for ERDDAP™ is
  a good opportunity to make a few changes that ERDDAP™ needs, 
  in a way that doesn't affect your original tables:
  <ul>
  <li>Change the date and timestamp fields/columns to use the dataType that Postgres calls
    <a rel="help" href="#databaseDate">timestamp with time zone</a> 
    (or the equivalent in your database).
      <br>Timestamps without time zone information don't work correctly in ERDDAP.
  <li>Make indexes for the columns that users often search. 
  <li>Be very aware of <a rel="help" href="#databaseQuotes">the case of the field/column names</a>
    (for example, use all lowercase) when you type them.
  <li>Don't use reserved words for the table and for the field/column names.
  </ul>
  <p>If you need help making the denormalized table or view, please 
    contact your database administrator.
  <br>If you want to talk about this whole approach or strategize how best to do it, 
    please email Chris.John at noaa.gov .

<li><a class="selfLink" id="databaseDatasetsXml" href="#databaseDatasetsXml" rel="bookmark">datasets.xml</a> -- 
  It is difficult to create the correct 
  datasets.xml information needed for ERDDAP™ to establish
  a connection to the database.  Be patient. Be methodical.
  <ul>
  <li>We strongly recommend using the
    <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
    to make a rough draft of the datasets.xml chunk for this dataset.
    You can then edit that to fine tune it.     
    <p>GenerateDatasetsXml has three special options for EDDTableFromDatabase:
    <ol>
    <li>If you enter "!!!LIST!!!" (without the quotes) for the catalog name,
      the program will display a list of the catalog names.
    <li>If you enter "!!!LIST!!!" (without the quotes) for the schema name,
      the program will display a list of the schema names.
    <li>If you enter "!!!LIST!!!" (without the quotes) for the table name,
      the program will display a list of tables and their columns.
    </ol>
    The first "!!!LIST!!!" entry that you make is the one that will be used.
  <li>Carefully read all of this document's information about EDDTableFromDatabase.
  <li>You can gather most of the information you need to create the XML for an EDDTableFromDatabase
    dataset by contacting the database administrator and by searching the web.
  <li>Although databases often treat column names and table names in a case-insensitive way,
    they are case-sensitive in ERDDAP.
    So if an error message from the database says that a column name is unknown 
    (for example, "Unknown identifier='<i>column_name</i>'") even though you know it exists,
    try using all capitals, for example, <i>COLUMN_NAME</i>, which is often the true, case-sensitive 
    version of the column name.
  <li>Work closely with the database administrator, who may have relevant experience.
    If the dataset fails to load, read the 
    <a rel="help" href="#errorMessages">error message</a> carefully to find out why.
  <br>&nbsp;
  </ul>  

<li><a class="selfLink" id="databaseDriver" href="#databaseDriver" rel="bookmark">JDBC Driver and &lt;driverName&gt;</a> -- 
  You must get the appropriate JDBC 3 or JDBC 4 driver .jar file 
  for your database and
  <br>put it in <i>tomcat</i>/webapps/erddap/WEB-INF/lib after you install ERDDAP.
  Then, in your datasets.xml for this dataset, you must specify the <kbd>&lt;driverName&gt;</kbd> 
  for this driver, which is (unfortunately) different from the filename.
  Search on the web for the JDBC driver for your database and the
  driverName that Java needs to use it.
  <ul>
  <li>For MariaDB, try
    <a rel="help" href="https://mariadb.com/kb/en/about-the-mariadb-java-client/"
      >https://mariadb.com/kb/en/about-the-mariadb-java-client/<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    <br>The <kbd>&lt;driverName&gt;</kbd> to use in datasets.xml (see below) is probably <kbd>org.mariadb.jdbc.Driver</kbd> .
  <li>For MySQL and Amazon RDS, try 
    <a rel="help" href="https://dev.mysql.com/downloads/connector/j/"
      >https://dev.mysql.com/downloads/connector/j/<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>        
    <br>The <kbd>&lt;driverName&gt;</kbd> to use in datasets.xml (see below) is probably <kbd>com.mysql.jdbc.Driver</kbd> (or <kbd>com.mysql.cj.jdbc.Driver</kbd> for Connector/J 8.0 and later) .
  <li>For Oracle, try 
    <a rel="help" href="https://www.oracle.com/database/technologies/appdev/jdbc-downloads.html"
      >https://www.oracle.com/database/technologies/appdev/jdbc-downloads.html<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    <br>The <kbd>&lt;driverName&gt;</kbd> to use in datasets.xml (see below) is probably <kbd>oracle.jdbc.driver.OracleDriver</kbd> .
  <li>For PostgreSQL, we got the JDBC 4 driver from 
    <a rel="help" href="https://mvnrepository.com/artifact/org.postgresql/postgresql">https://mvnrepository.com/artifact/org.postgresql/postgresql<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    <br>The <kbd>&lt;driverName&gt;</kbd> to use in datasets.xml (see below) is probably <kbd>org.postgresql.Driver</kbd> .
  <li>For SQL Server, you can get the JTDS JDBC driver from 
    <a rel="help" href="https://jtds.sourceforge.net">https://jtds.sourceforge.net<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    <br>The <kbd>&lt;driverName&gt;</kbd> to use in datasets.xml (see below) is probably <kbd>net.sourceforge.jtds.jdbc.Driver</kbd> .
  </ul>
  After you put the JDBC driver .jar in the ERDDAP™ lib directory, you need to 
  add a reference to that .jar file in the .bat and/or .sh script files for 
  GenerateDatasetsXml, DasDds, and ArchiveADataset which are in the
  <i>tomcat</i>/webapps/erddap/WEB-INF/ directory; 
  otherwise, you'll get a ClassNotFoundException when you run those scripts.
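
  <p>For example, a minimal sketch of the driver-related tags for a PostgreSQL
  database might look like the following. The host, port, and database name in
  the sourceUrl are hypothetical placeholders, and the URL format shown is the usual
  form for the PostgreSQL JDBC driver; check your driver's documentation for the
  exact form that your database needs.
<pre>
  &lt;!-- hypothetical host, port, and database name --&gt;
  &lt;sourceUrl&gt;jdbc:postgresql://db.example.com:5432/exampledb&lt;/sourceUrl&gt;
  &lt;driverName&gt;org.postgresql.Driver&lt;/driverName&gt;
</pre>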

  <p>Unfortunately, JDBC is sometimes the source of trouble. In its role as intermediary between
  ERDDAP™ and the database, it sometimes makes subtle changes to the standard/generic database SQL request
  that ERDDAP™ creates, thereby causing problems (for example, related to 
  <a rel="help" href="#databaseQuotes">upper/lowercase identifiers</a>
  and related to 
  <a rel="help" href="#databaseDate">date/time timezones</a>).  
  Please be patient, read the information here carefully, check your work,
  and email <kbd>erd dot data at noaa dot gov</kbd> if you have problems that you can't resolve.
    <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.


<li><a class="selfLink" id="databaseConnectionProperty" href="#databaseConnectionProperty" rel="bookmark">&lt;connectionProperty&gt;</a>  -- In the datasets.xml for your dataset, 
  you must define several connectionProperty tags to tell ERDDAP™ how to connect
  to your database (for example, to specify the user name, password, SSL connection, and 
  <a rel="help" href="#databaseFetchSize">fetch size</a>).
  These are different for every situation and are a little hard to figure out.
  Search the web for examples of using a JDBC driver to connect to your database.
  The <kbd>&lt;connectionProperty&gt;</kbd> names (for example, "user",
  "password", and "ssl"), and some of the connectionProperty values can be found by searching
  the web for "JDBC connection properties <i>databaseType</i>" 
  (for example, Oracle, MySQL, Amazon RDS, MariaDB, PostgreSQL).
  <br>&nbsp;
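  <p>As a sketch, the connectionProperty tags for a PostgreSQL database might
  look like this. The user name and password are hypothetical placeholders,
  and the property names shown are the ones that the PostgreSQL JDBC driver uses;
  other drivers may use different names, so treat this as a starting point
  and not as a definitive list.
<pre>
  &lt;!-- hypothetical read-only database user --&gt;
  &lt;connectionProperty name="user"&gt;erddap_reader&lt;/connectionProperty&gt;
  &lt;connectionProperty name="password"&gt;changeThisPassword&lt;/connectionProperty&gt;
  &lt;connectionProperty name="ssl"&gt;true&lt;/connectionProperty&gt;
  &lt;connectionProperty name="defaultRowFetchSize"&gt;10000&lt;/connectionProperty&gt;
</pre>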

<li><a class="selfLink" id="databaseQuotes" href="#databaseQuotes" rel="bookmark">Quotes for Field/Column Names; Case Sensitivity</a> --
  By default, EDDTableFromDatabase puts ANSI-SQL-standard double
  quotes around field/column names 
  in SELECT statements in case you have used a reserved word as a field/column name,
  or a special character in a field/column name.
  The double quotes also thwart certain types of SQL injection attacks.
  You can tell ERDDAP™ to use ", ', or no quotes via 
  &lt;columnNameQuotes&gt; in datasets.xml for this dataset.

  <p>For many databases, using any type of quotes causes the database to work with
  field/column names in a case sensitive way 
  (instead of the default database case insensitive way).
  Databases often display field/column names as all upper-case, 
  when in reality the case sensitive form is different.
  In ERDDAP™, please always treat database column names as case sensitive.
  <ul>
  <li>For MariaDB, you need to run the database with 
    <a rel="help" href="https://mariadb.com/kb/en/mysql-command-line-client/">--sql-mode=ANSI_QUOTES<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> . 
  <li>For MySQL and Amazon RDS, you need to run the database with
    <a rel="help"
    href="https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_ansi_quotes"
    >--sql-mode=ANSI_QUOTES<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> .
  <li>Oracle supports ANSI-SQL-standard double quotes 
    <a rel="help"
    href="https://docs.oracle.com/database/121/SQLRF/sql_elements008.htm#SQLRF00223"
    >by default</a>.
  <li>PostgreSQL supports ANSI-SQL-standard double quotes by default.
  </ul>  
  <br>Don't use a reserved word for a database, catalog, schema or table's name.
    ERDDAP™ doesn't put quotes around them. 
  <p>If possible, use all lower-case for database, catalog, schema, table names and field names 
    when creating the database table (or view) and when referring to the field/column names in datasets.xml in ERDDAP. 
    Otherwise, you may get an error message saying the database, catalog, schema, table,
    and/or field wasn't found.
    If you do get that error message, try using the case-sensitive version, the all upper-case version, 
    and the all lower-case version of the name in ERDDAP. One of them may work. If not,
    you need to change the name of the database, catalog, schema, and/or table to all lower-case.
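  <p>For example, assuming a hypothetical schema named <kbd>public</kbd> and a hypothetical
  view named <kbd>erddap_obs</kbd> with all lower-case column names, 
  the relevant tags in datasets.xml might look like this sketch
  (the schemaName and tableName tags are the ones in the EDDTableFromDatabase skeleton):
<pre>
  &lt;!-- hypothetical, all lower-case schema and view names --&gt;
  &lt;schemaName&gt;public&lt;/schemaName&gt;
  &lt;tableName&gt;erddap_obs&lt;/tableName&gt;
  &lt;!-- double quotes are the default; shown here only to make the choice explicit --&gt;
  &lt;columnNameQuotes&gt;"&lt;/columnNameQuotes&gt;
</pre>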

<li><a class="selfLink" id="DatabaseDataTypes" href="#DatabaseDataTypes" rel="bookmark">Database</a> <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> Tags -- 
  Because there is some ambiguity about which
  <a rel="help" 
  href="https://www.w3schools.com/sql/sql_datatypes_general.asp">database data types</a>
  map to which ERDDAP™ data types, you need to specify a 
  <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> tag
  for each <a rel="help" href="#dataVariable"><kbd>&lt;dataVariable&gt;</kbd></a>
  to tell ERDDAP™ which dataType to use.
  Part of the problem is that different databases use different terms
  for the various data types -- so always try to match the definitions, not
  just the names.
  See the description of the 
  <a rel="help" href="#dataTypes">standard ERDDAP™ dataTypes</a>,
  which includes references to the corresponding SQL data types. 
  <a rel="help" href="#databaseDate">Date and timestamp</a> are special cases: 
  use ERDDAP's double dataType.
  <br>&nbsp;

<li><a class="selfLink" id="databaseDate" href="#databaseDate" rel="bookmark">Database Date Time Data</a> -- 
  Some database date time columns have no explicit time zone.
  Such columns are trouble for ERDDAP. 
  Databases support the concept of a date (with or without a time) without a time zone, as an
  approximate range of time.
  But Java (and thus ERDDAP) only deals with instantaneous date+times with a timezone. 
  So you may know that the date time data is based on a local time zone (with or without daylight saving time)
  or the  GMT/Zulu time zone, but Java (and ERDDAP) don't.
  We originally thought we could work around this problem (e.g., by specifying a time zone for the
  column), but the database+JDBC+Java interactions made this an unreliable solution.
  <ul>
  <li>So, ERDDAP™ requires that you store all date and date time data in the database table 
    with a database data type that corresponds to the JDBC type "timestamp with time zone" 
    (ideally, that uses the GMT/Zulu time zone). 
  <li>In ERDDAP's datasets.xml, in the <kbd>&lt;dataVariable&gt;</kbd> tag for 
    a timestamp variable, set
    <br>&nbsp;&nbsp;<a rel="help" href="#dataType"><kbd>&lt;dataType&gt;double&lt;/dataType&gt;</kbd></a> 
    <br>and in <kbd>&lt;addAttributes&gt;</kbd> set
    <br>&nbsp;&nbsp;<kbd>&lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;</kbd> .
  <li>Suggestion: If the data is a time range, it is useful to have the timestamp values refer to 
  the center of the implied time range (for example, noon).
  For example, if a user has data for 2010-03-26T13:00Z from another dataset 
  and they want the closest data from a database dataset that has data for each day, 
  then the database data for 2010-03-26T12:00Z (representing data for that date) 
  is obviously the best 
  (as opposed to the midnight before or after, where it is less obvious which is best).
  <li>ERDDAP™ has a utility to 
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html">Convert
    a Numeric Time to/from a String Time</a>.
  <li>See <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html#erddap">How ERDDAP
     Deals with Time</a>.
     <br>&nbsp;
  </ul>
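  <p>Putting the settings above together, a sketch of the dataVariable for a
  timestamp column might look like this. The sourceName <kbd>obs_time</kbd> is a
  hypothetical column name; use the case-sensitive name of your own column.
<pre>
  &lt;dataVariable&gt;
    &lt;!-- hypothetical "timestamp with time zone" column --&gt;
    &lt;sourceName&gt;obs_time&lt;/sourceName&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
</pre>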

<li><a class="selfLink" id="databaseNulls" href="#databaseNulls" rel="bookmark">Integer nulls</a> -- 
  Databases support nulls in integer (int, smallint, tinyint) columns, but ERDDAP™ doesn't support true nulls.
  <br>Database nulls will be converted in ERDDAP™ to
   127 for byte columns, 255 for ubyte columns,  
   32767 for short columns, 65535 for ushort columns,
   2147483647 for int columns, 4294967295 for uint columns,
   9223372036854775807 for long columns, or 18446744073709551615 for ulong columns.
   If you use those defaults, please identify those missing_values for the 
   dataset's users in ERDDAP™ with 
  <br><kbd>    &lt;att name="_FillValue" <a rel="help" href="#attributeType">type="int"</a>&gt;2147483647&lt;/att&gt;</kbd>
  <br>or
  <br><kbd>    &lt;att name="_FillValue" <a rel="help" href="#attributeType">type="short"</a>&gt;32767&lt;/att&gt;</kbd>
  <br>Alternatively, you can use the "missing_value" attribute instead of "_FillValue".
  <br>GenerateDatasetsXml automatically adds these _FillValue attributes when it generates the
    suggested datasets.xml for database datasets.
  <p>For database floating point columns, nulls get converted to NaNs in ERDDAP.
  <br>For database data types that are converted to Strings in ERDDAP™, 
    nulls get converted to empty Strings.
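  <p>For instance, a sketch of the dataVariable for a nullable database integer column
  might look like this. The column name <kbd>sample_count</kbd> is hypothetical;
  the _FillValue matches the value that ERDDAP™ uses for nulls in int columns.
<pre>
  &lt;dataVariable&gt;
    &lt;!-- hypothetical nullable integer column --&gt;
    &lt;sourceName&gt;sample_count&lt;/sourceName&gt;
    &lt;destinationName&gt;sample_count&lt;/destinationName&gt;
    &lt;dataType&gt;int&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="_FillValue" type="int"&gt;2147483647&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
</pre>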

<li><a class="selfLink" id="databaseSecurity" href="#databaseSecurity" rel="bookmark">Security</a> -- 
  When working with databases, you need to do things as safely and securely as possible
  to avoid allowing a malicious user to damage your database or gain access to data they shouldn't
  have access to.
  ERDDAP™ tries to do things in a secure way, too.
  <ul>
  <li>Consider replicating, on a different computer, the database and database tables 
    with the data that you want ERDDAP™ to serve.  
    (Yes, for commercial databases like Oracle, this involves
    additional licensing fees. But for open source databases, 
    like PostgreSQL, MySQL, Amazon RDS, and MariaDB,
    this costs nothing.)  This gives you a high level of security and also 
    prevents ERDDAP™ requests from slowing down the original database.
  <li>We encourage you to set up ERDDAP™ to connect to the database as a database user that only has
    access to the <strong>relevant</strong> database(s) and only has READ privileges.
  <li>We encourage you to set up the connection from ERDDAP™ to the database so that it 
    <ul>
    <li>always uses SSL, 
    <li>only allows connections from one IP address (or one block of addresses) and from the one
       ERDDAP™ user, and 
    <li>only transfers passwords in their MD5 hashed form.
    </ul>
  <li>[KNOWN PROBLEM] The connectionProperties (including the password!) are stored as plain text
    in datasets.xml.
    We haven't found a way to allow the administrator to enter the database 
    password during
    ERDDAP's startup in Tomcat (which occurs without user input), so the password 
    must be accessible in a file.
    To make this more secure:
    <ul>
    <li>You (the ERDDAP™ administrator) should be the owner of datasets.xml and have
      READ and WRITE access.
    <li>Make a group that includes only user=tomcat. 
      Use chgrp to make that the group for datasets.xml, with just READ privileges.
    <li>Use chmod to assign o-rwx privileges (no READ or WRITE access for "other" users) 
      for datasets.xml.
    </ul>
  <li>When in ERDDAP™, the password and other connection properties are stored in "private" 
    Java variables.
  <li>Requests from clients are parsed and checked for validity before generating the SQL requests
    for the database.
  <li>Requests to the database are made with SQL PreparedStatements, to prevent 
    <a rel="help" href="https://en.wikipedia.org/wiki/SQL_injection">SQL injection</a>.
  <li>Requests to the database are submitted with executeQuery (not executeUpdate or execute) to limit
    requests to be read-only (so attempted SQL injection to alter the database will fail for this
    reason, too).
  <br>&nbsp;
  </ul>

<li><a class="selfLink" id="databaseSQL" href="#databaseSQL" rel="bookmark">SQL</a> -- 
    Because OPeNDAP's tabular data requests were designed to mimic SQL tabular data requests,
    it is easy for ERDDAP™ to convert tabular data requests into simple SQL PreparedStatements. 
    For example, the ERDDAP™ request<kbd>
    <br>time,temperature&amp;time&gt;=2008-01-01T00:00:00Z&amp;time&lt;=2008-02-01T00:00:00Z</kbd>
    <br>will be converted into the SQL PreparedStatement<kbd>
    <br>SELECT "time", "temperature" FROM <i>tableName</i> 
    <br>WHERE "time" &gt;= 2008-01-01T00:00:00Z AND "time" &lt;= 2008-02-01T00:00:00Z</kbd>
    <br>ERDDAP™ requests with <kbd>&amp;distinct()</kbd> and/or <kbd>&amp;orderBy(<i>variables</i>)</kbd> will add
      <kbd>DISTINCT</kbd> and/or <kbd>ORDER BY <i>variables</i></kbd> to the SQL prepared statement. In general, this will greatly slow down the response from the database.
    <br>ERDDAP™ logs the PreparedStatement in 
    <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt</a>
    as<kbd> 
    <br>statement=<i>thePreparedStatement</i></kbd>
    <br>This will be a text representation of the PreparedStatement, which 
    may be slightly different from the actual PreparedStatement. 
    For example, in the PreparedStatement, times are encoded in a special way.
    But in the text representation, they appear as ISO 8601 date times.
  <br>&nbsp;

<li><a class="selfLink" id="databaseSpeed" href="#databaseSpeed" rel="bookmark">Speed</a> -- Databases can be slow. There are some things you can do:
  <ul>
  <li>In General --
    <br>The nature of SQL is that queries are 
    <a rel="help" href="https://en.wikipedia.org/wiki/Declarative_programming">declarative<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    They just specify what the user wants. 
    They don't include a specification or hints for how the query is to be handled or optimized.
    So there is no way for ERDDAP™ to generate the query in such a way that it helps the 
    database optimize the query (or in any way specifies how the query is to be handled). 
    In general, it is up to the database administrator to set things up (for example, indexes) to optimize 
    for certain types of queries.

  <li><a class="selfLink" id="databaseFetchSize" href="#databaseFetchSize" rel="bookmark">Set the Fetch Size</a> -- 
    <br>Databases return the data to ERDDAP™ in chunks.
    By default, different databases return a different number of rows in the chunks.
    Often this number is very small and so very inefficient.
    For example, the default for Oracle is 10!
    Read the JDBC documentation for your database's JDBC driver
    to find the connection property to set
    in order to increase this, and add this to the dataset's description
    in datasets.xml. For example, 
    <br>For MySQL and Amazon RDS, use
    <br><kbd>&lt;connectionProperty name="defaultFetchSize"&gt;10000&lt;/connectionProperty&gt;</kbd>
    <br>For MariaDB, there is currently no way to change the fetch size. 
      But it is a requested feature, so search the web to see if this has been implemented.
    <br>For Oracle, use
    <br><kbd>&lt;connectionProperty name="defaultRowPrefetch"&gt;10000&lt;/connectionProperty&gt;</kbd>
    <br>For PostgreSQL, use
    <br><kbd>&lt;connectionProperty name="defaultRowFetchSize"&gt;10000&lt;/connectionProperty&gt;</kbd>
    <br>Feel free to change the number in any of these examples. Setting the number too high will
    cause ERDDAP™ to use lots of memory and be more likely to run out of memory.

  <li><a class="selfLink" id="databaseConnectionProperties" href="#databaseConnectionProperties" rel="bookmark">ConnectionProperties</a> -- 
    <br>Each database has other connection properties which
    can be specified in datasets.xml. Many of these will affect the performance
    of the connection between the database and ERDDAP™. Please read the documentation for 
    your database's JDBC driver to see the options.
    If you find connection properties that are useful, please 
    send an email with the details to <kbd>erd dot data at noaa dot gov</kbd>.

  <li>Make a Table -- 
    <br>You will probably get faster responses if you periodically
    (every day? whenever there is new data?) generate an actual table (similar to how you
    generated the VIEW) and tell ERDDAP™ to get data from the table instead of the VIEW. 
    Since any request to the table can then be
    fulfilled without JOINing another table, the response will be much faster.

  <li>Vacuum the Table -
    <br>MySQL and Amazon RDS will respond much faster if you use
    <a rel="help" href="https://dev.mysql.com/doc/refman/5.7/en/optimize-table.html">OPTIMIZE TABLE<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    <br>MariaDB will respond much faster if you use
    <a rel="help" href="https://mariadb.com/kb/en/optimize-table/">OPTIMIZE TABLE<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    <br>PostgreSQL will respond much faster if you 
    <a rel="help" href="https://www.postgresql.org/docs/8.3/static/sql-vacuum.html">VACUUM<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> the table.
    <br>Oracle doesn't have or need an analogous command.

  <li>Make <a rel="help" href="https://en.wikipedia.org/wiki/Database_index">Indexes<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    for Commonly Constrained Variables -- 
    <br>You can speed up many/most queries by creating indexes in the database for the variables 
    (which databases call "columns")
    that are often constrained in the user's query.  
    In general, these are the same variables specified by 
    <a rel="help" href="#subsetVariables"><kbd>&lt;subsetVariables&gt;</kbd></a> and/or
    the latitude, longitude, and time variables.

  <li><a class="selfLink" id="databaseConnectionPooling" href="#databaseConnectionPooling" rel="bookmark">Use Connection Pooling</a> -
    <br>Normally, ERDDAP™ makes a separate connection to the database for each request.
      This is the most reliable approach. The faster alternative 
      is to use a DataSource which supports connection pooling. 
      To set it up, specify (for example)
      <br><kbd>&lt;dataSourceName&gt;java:comp/env/jdbc/postgres/erddap&lt;/dataSourceName&gt;</kbd>
      <br>right next to &lt;sourceUrl&gt;, &lt;driverName&gt;, and &lt;connectionProperty&gt;.
      <br>And in <i>tomcat</i>/conf/context.xml, define a resource with the same information, for example, 
      <br><kbd>&lt;Resource
      <br>  name="jdbc/postgres/erddap" auth="Container" type="javax.sql.DataSource"
      <br>  driverClassName="org.postgresql.Driver"
      <br>  url="<i>jdbc:postgresql://somehost:5432/myDatabaseName</i>"
      <br>  username="<i>myUsername</i>" password="<i>myPassword</i>"
      <br>  initialSize="0" maxActive="8" minIdle="0" maxIdle="0" maxWait="-1"/&gt;</kbd>
      <br>General information about using a DataSource is at
        <a rel="help" href="https://docs.oracle.com/javase/tutorial/jdbc/basics/sqldatasources.html"
             >https://docs.oracle.com/javase/tutorial/jdbc/basics/sqldatasources.html<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
      <br>See 
        <a rel="help"
        href="https://tomcat.apache.org/tomcat-7.0-doc/jndi-resources-howto.html#JDBC_Data_Sources">Tomcat
        DataSource information<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a> and 
        <a rel="help"
        href="https://tomcat.apache.org/tomcat-7.0-doc/jndi-datasource-examples-howto.html">Tomcat DataSource examples<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
        or search the web for examples of using DataSources with other application servers.


  <li>If all else fails, 
    <br>consider storing the data in a collection of NetCDF v3 .nc files 
    (especially .nc files that use the 
    <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
    >CF Discrete Sampling Geometries (DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    Contiguous Ragged Array data structures and so can be handled with ERDDAP's
    <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>).
    If they are logically organized (each with data for a chunk of space and time), 
    ERDDAP™ can extract data from them very quickly.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDTableFromDatabaseSkeletonXML" href="#EDDTableFromDatabaseSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromDatabase dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromDatabase" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
    &lt;!-- The format varies for each type of database, but will be 
      something like: 
      For MariaDB:    jdbc:mariadb://<i>xxx.xxx.xxx.xxx</i>:3306/<i>databaseName</i>
      For MySQL:      jdbc:mysql://<i>xxx.xxx.xxx.xxx</i>:3306/<i>databaseName</i>
      For Amazon RDS: jdbc:mysql://<i>xxx.xxx.xxx.xxx</i>:3306/<i>databaseName</i>
      For Oracle:     jdbc:oracle:thin:@<i>xxx.xxx.xxx.xxx</i>:1521:<i>databaseName</i>
      For PostgreSQL: jdbc:postgresql://<i>xxx.xxx.xxx.xxx</i>:5432/<i>databaseName</i>
      where <i>xxx.xxx.xxx.xxx</i> is the host computer's numeric IP address 
      followed by :<i>PortNumber</i> (4 digits), which may be different for your
      database.  REQUIRED. --&gt;
  &lt;<a rel="help" href="#databaseDriver">driverName</a>&gt;...&lt;/driverName&gt;
    &lt;!-- The high-level name of the database driver, for example, 
      "org.postgresql.Driver".  You need to put the actual database 
      driver .jar file (for example, postgresql.jdbc.jar) in 
      <i>tomcat</i>/webapps/erddap/WEB-INF/lib.  REQUIRED. --&gt;
  &lt;<a rel="help" href="#databaseConnectionProperty">connectionProperty</a> name="<i>name</i>"&gt;<i>value</i>&lt;/connectionProperty&gt;
    &lt;!-- The names (for example, "user", "password", and "ssl") 
      and values of the properties needed for ERDDAP™ to establish 
      the connection to the database.  0 or more. --&gt;
  <a rel="help" href="#databaseConnectionPooling">&lt;dataSourceName&gt;</a>...&lt;/dataSourceName&gt;  &lt;!-- 0 or 1 --&gt;
  &lt;catalogName&gt;...&lt;/catalogName&gt;
    &lt;!-- The name of the catalog which has the schema which has the 
      table, default = "".  OPTIONAL.  Some databases don't use 
      this. --&gt;
  &lt;schemaName&gt;...&lt;/schemaName&gt; &lt;!-- The name of the 
    schema which has the table, default = "".  OPTIONAL. --&gt;
  &lt;tableName&gt;...&lt;/tableName&gt;  &lt;!-- The name of the 
    table, default = "".  REQUIRED. --&gt;
  <a rel="help" href="#databaseQuotes">&lt;columnNameQuotes&gt;</a>...&lt;/columnNameQuotes&gt; &lt;!-- OPTIONAL. Options: 
    " (the default), ', [nothing]. --&gt;
  &lt;orderBy&gt;...&lt;/orderBy&gt;  &lt;!-- A comma-separated list of
    <a rel="help" href="#sourceName">sourceName</a>s to be used in an ORDER BY clause at the end of 
    every query sent to the database (unless the user's request
    includes an &amp;orderBy() filter, in which case the user's 
    orderBy is used).  The order of the sourceNames is important. 
    The leftmost (first) sourceName is most important; subsequent 
    sourceNames are only used to break ties.  Only relevant 
    sourceNames are included in the ORDER BY clause for a given user 
    request.  If this is not specified, the order of the returned 
    values is not specified. Default = "".  OPTIONAL. --&gt;
  <a rel="help" href="#sourceCanOrderBy">&lt;sourceCanOrderBy&gt;</a>no(default)|partial|yes&lt;/sourceCanOrderBy&gt; 
    &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#sourceCanDoDistinct">&lt;sourceCanDoDistinct&gt;</a>no(default)|partial|yes&lt;/sourceCanDoDistinct&gt;
    &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#sourceNeedsExpandedFP_EQ">&lt;sourceNeedsExpandedFP_EQ&gt;</a>true(default)|false&lt;/sourceNeedsExpandedFP_EQ&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more.
    Each dataVariable MUST include a <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> tag. 
    See <a rel="help" href="#DatabaseDataTypes">Database DataTypes</a>.
    For <a rel="help" href="#databaseDate">database date and timestamp columns</a>, set dataType=double and 
    units=seconds since 1970-01-01T00:00:00Z --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromEDDGrid" href="#EDDTableFromEDDGrid" rel="bookmark"><strong>EDDTableFromEDDGrid</strong></a>  
lets you create an EDDTable dataset from any EDDGrid dataset. 
<ul>
<li>Some common reasons for doing this are:
  <ul>
  <li>This allows the dataset to be queried with OPeNDAP selection constraints,
    which is a type of "query by value" (which a user may have requested).
  <li>The dataset is inherently a tabular dataset. 
  </ul>

<li>The value of the global attribute "maxAxis0" (usually of type="int"; 
  the default is 10) will be used to limit the number of axis[0]
  (usually the "time" axis) values of the enclosed EDDGrid dataset 
  that can be accessed per request for data. 
  If you don't want there to be any limit, specify a value of 0.
  This setting is important because, otherwise, it would be too easy
  for a user to ask EDDTableFromEDDGrid to look through all of the 
  gridded dataset's data. That would take a long time and would
  almost certainly fail with a timeout error. This is the setting
  that makes it safe to have EDDTableFromEDDGrid datasets in your ERDDAP
  without fear that they will lead to an unreasonable use of computing resources.
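  <p>As a rough sketch of how such a limit behaves (not ERDDAP™'s actual code),
  the check might look like the following Python. The function name
  <kbd>check_axis0_request</kbd> is hypothetical, and the sketch assumes that a request
  which needs more than maxAxis0 axis[0] values is rejected with an error.

```python
DEFAULT_MAX_AXIS0 = 10  # the default described in the text


def check_axis0_request(n_axis0_values, max_axis0=DEFAULT_MAX_AXIS0):
    """Return n_axis0_values if the request is within the limit.

    Hypothetical illustration: a max_axis0 of 0 means "no limit".
    Assumption: a request over the limit raises an error.
    """
    if max_axis0 > 0 and n_axis0_values > max_axis0:
        raise ValueError(
            f"request needs {n_axis0_values} axis[0] values, "
            f"but maxAxis0={max_axis0}")
    return n_axis0_values


print(check_axis0_request(10))               # within the default limit
print(check_axis0_request(5000, max_axis0=0))  # 0 disables the limit
```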

<li>If the enclosed EDDGrid is an 
  <a rel="help" href="#EDDGridFromErddap">EDDGridFromErddap</a>
  and the ERDDAP™ it refers to is this same ERDDAP™,
  then EDDTableFromEDDGrid will always use the currently available
  version of the referenced dataset directly. This is a very efficient 
  way for EDDTableFromEDDGrid to access the gridded data.

<li>This class's 
  <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>
  is what counts. 
  The enclosed EDDGrid's <kbd>&lt;reloadEveryNMinutes&gt;</kbd> is ignored. 

<li>If a value for 
  <a rel="help" href="#updateEveryNMillis"><kbd>&lt;updateEveryNMillis&gt;</kbd></a>
  is supplied for this dataset, it is ignored. 
  The enclosed EDDGrid's <kbd>&lt;updateEveryNMillis&gt;</kbd> is what matters.

<li><a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> has an 
  option for dataset type=EDDTableFromEDDGrid which asks for the URL of an ERDDAP™
  (usually the same ERDDAP™), ending in "/erddap/", and a regular expression.
  GenerateDatasetsXml will then generate the XML for an EDDTableFromEDDGrid
  dataset for each gridded dataset in the ERDDAP™ which has a datasetID
  which matches the regular expression (use .* to match all datasetIDs 
  for gridded datasets).

  <p>The chunk of XML that is generated by GenerateDatasetsXml for each 
  dataset includes:
  <ul>
  <li>A datasetID which is the EDDGrid's datasetID plus "_AsATable".
  <li>A new summary global attribute which is the EDDGrid's summary 
    plus a new first paragraph describing what this dataset is.
  <li>A new title global attribute which is the EDDGrid's title 
    plus ", (As A Table)".
  <li>A new maxAxis0 global attribute with a value of 10.
  </ul>

<li><a class="selfLink" id="EDDTableFromEDDGridSkeletonXML" href="#EDDTableFromEDDGridSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromEDDGrid dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromEDDGrid" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. 
    For EDDTableFromEDDGrid, this calls lowUpdate() of the underlying 
    EDDGrid. --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt;  &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#EDDGrid">&lt;dataset&gt;</a>...&lt;/dataset&gt; &lt;!-- 1 
     Any type of EDDGrid dataset.  You can even use an 
     EDDGridFromErddap to access an independent EDDGrid dataset on 
     this server. --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromFileNames" href="#EDDTableFromFileNames" rel="bookmark"><strong>EDDTableFromFileNames</strong></a> 
creates a dataset from information about a group of files in the server's file system, 
including a URL for each file so that users can download the files via ERDDAP's 
<a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/files/documentation.html">"files" system</a>.
Unlike all of the 
<a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a> 
subclasses, this dataset type does not serve data from within the files. 

<ul>
<li>EDDTableFromFileNames is useful when:
  <ul>
  <li>You have a group of files that you want to distribute as whole files
    because they don't contain "data" in the same way that regular data files have data. 
    For example, image files, video files, Word documents, Excel spreadsheet files, 
    PowerPoint presentation files, or text files with unstructured text.
  <li>You have a group of files which have data in a format 
    that ERDDAP™ can't yet read.
    For example, a project-specific, custom, binary format.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDTableFromFileNamesData" href="#EDDTableFromFileNamesData" rel="bookmark">The data
  in an EDDTableFromFileNames dataset</a> is a table that ERDDAP™ creates
  on-the-fly with information about a group of local files.
  In the table, there is a row for each file. 
  Four special attributes in the    
  <a rel="help" href="#EDDTableFromFileNamesSkeletonXML">datasets.xml for this dataset</a>
  determine which files will be included in this dataset:
    <ul>
    <li>&lt;fileDir&gt; -- This specifies the source directory in the server's
      file system with the files for this dataset.
      The files that are actually located in the server's file system
      in &lt;fileDir&gt; will appear in the url column of this dataset 
      within a virtual directory named
      https://<i>serverUrl</i>/erddap/files/<i>datasetID/</i> .
      <br>For example, if the datasetID is jplMURSST,
      <br>and the &lt;fileDir&gt; is /home/data/mur/ ,
      <br>and that directory has a file named jplMURSST20150103000000.png, 
      <br>then the URL that will be shown to users for that file will be
      <br>https://<i>serverUrl</i>/erddap/files/jplMURSST/jplMURSST20150103000000.png .
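      <p>The mapping from a file in &lt;fileDir&gt; to its URL in the virtual
      directory can be sketched as follows. This is a Python illustration only:
      the function name <kbd>files_url</kbd> is hypothetical, and the sketch assumes
      Unix-style paths and that files in subdirectories keep their relative path.

```python
import posixpath


def files_url(server_url, dataset_id, file_dir, file_path):
    """Build the "files" system URL for one file (hypothetical helper)."""
    # Path of the file relative to the dataset's <fileDir>.
    relative = posixpath.relpath(file_path, file_dir)
    if relative.startswith(".."):
        raise ValueError("file is not inside fileDir: " + file_path)
    return f"{server_url.rstrip('/')}/erddap/files/{dataset_id}/{relative}"


print(files_url("https://serverUrl", "jplMURSST", "/home/data/mur/",
                "/home/data/mur/jplMURSST20150103000000.png"))
```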

      <p>In addition to using a local directory for the &lt;fileDir&gt;,
      you can also specify the URL of a remote, directory-like web page.
      This works with: 
      <ul>
      <li>Unaggregated datasets in THREDDS, e.g., 
      <br>https://data.nodc.noaa.gov/thredds/catalog/aquarius/nodc_binned_V3.0/monthly/ [2020-10-21 This server is no longer reliably available.]
      <li>Unaggregated datasets in Hyrax, e.g., 
      <br><a rel="help" href="https://podaac-opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/"
                             >https://podaac-opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
      <li>Most Apache-like directory listings, e.g., 
      <br><a rel="help" href="https://www1.ncdc.noaa.gov/pub/data/cmb/ersst/v5/netcdf/"
                             >https://www1.ncdc.noaa.gov/pub/data/cmb/ersst/v5/netcdf/<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
      </ul>

      <p><a class="selfLink" id="fromOnTheFly" href="#fromOnTheFly" rel="bookmark"
        >***fromOnTheFly</a> -- For some huge S3 buckets 
      (like noaa-goes17, which has 26 million files), it may take ERDDAP™ 
      up to 12 hours to download all the information about the contents of the bucket
      (and then there are other problems). To get around this, 
      there is a special way to use &lt;fileDir&gt;
      in EDDTableFromFileNames to make a dataset with the directory and file names 
      from an AWS S3 bucket.  The dataset won't have the list of 
      all of the S3 bucket's directories and file names that a user can 
      search via requests to the dataset. But the dataset will get the names of 
      directories and files on-the-fly if the user traverses the directory hierarchy with 
      the dataset's "files" option. Thus, this allows users to browse the 
      S3 bucket's file hierarchy and files via the dataset's "files" system.
      To do this, instead of specifying the URL for the S3 bucket as the 
      "Starting directory" (in GenerateDatasetsXml)
      or &lt;fileDir&gt; (in datasets.xml), use:
      <br><kbd>***fromOnTheFly,<i>theS3BucketUrl</i></kbd>
      <br>for example:
      <br><kbd>***fromOnTheFly,https://noaa-goes17.s3.us-east-1.amazonaws.com/</kbd>
      <br>See the documentation for <a rel="help" href="#AwsS3Files">working with S3 Buckets in ERDDAP™</a>,
      notably the description of the specific format that must be used for the S3 bucket URL.
      And see 
      <br><a rel="help" href="#AwsS3MakeAnEDDTableFromFileNamesDataset">these details and an example</a>
        of using ***fromOnTheFly.

    <li>&lt;recursive&gt; -- Files in subdirectories of &lt;fileDir&gt;
      with names which match &lt;fileRegex&gt; will appear in the same subdirectories
      in the "files" URL if &lt;recursive&gt; is set to <kbd>true</kbd>.  
      The default is <kbd>false</kbd>.
    <li><a rel="help" href="#pathRegex"><kbd>&lt;pathRegex&gt;</kbd></a>
      -- If recursive=true, only directory names which 
      match the pathRegex (default=".*") will be accepted.
      If recursive=false, this is ignored.
      This is rarely used, but can be very useful in unusual circumstances.
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
    <li>&lt;fileRegex&gt; -- Only the filenames where the whole filename
      (not including the directory name) matches the &lt;fileRegex&gt;
      will be included in this dataset. For example, <kbd>jplMURSST.{14}\.png</kbd> .
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
      <br>&nbsp;
    </ul>  
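  <p>The way &lt;fileDir&gt;, &lt;recursive&gt;, and &lt;fileRegex&gt; work together
  can be sketched in Python. This is a simplified illustration, not ERDDAP™'s code:
  the function name <kbd>select_files</kbd> and the sample paths are hypothetical,
  &lt;pathRegex&gt; is left out, and Python's <kbd>re</kbd> module stands in for the
  Java regular expressions that ERDDAP™ uses.

```python
import posixpath
import re


def select_files(paths, file_dir, file_regex, recursive=False):
    """Return the paths that would be included in the dataset (sketch)."""
    root = file_dir.rstrip("/")
    selected = []
    for path in paths:
        directory, name = posixpath.split(path)
        in_root = directory == root
        in_subdir = directory.startswith(root + "/")
        # Files in subdirectories only count when recursive is true.
        if not (in_root or (recursive and in_subdir)):
            continue
        # The regex must match the whole file name, without the directory.
        if re.fullmatch(file_regex, name):
            selected.append(path)
    return selected


sample = [
    "/home/data/mur/jplMURSST20150103000000.png",
    "/home/data/mur/readme.txt",
    "/home/data/mur/2015/jplMURSST20150104000000.png",
]
print(select_files(sample, "/home/data/mur/", r"jplMURSST.{14}\.png"))
print(select_files(sample, "/home/data/mur/", r"jplMURSST.{14}\.png",
                   recursive=True))
```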
  In the table, there will be columns with:
  <ul>
  <li>url -- The URL that users can use to download the file via ERDDAP's
    <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/files/documentation.html">"files" system</a>.
  <li>name -- The file's name (without a directory name).
  <li>lastModified -- The time the file was last modified (stored as doubles
      with "seconds since 1970-01-01T00:00:00Z").
      This variable is useful because users can see if/when the contents of a
      given file last changed.
      This variable is a 
      <a rel="help" href="#timeStampVariable">timeStamp&nbsp;variable</a>,
      so the data may appear as numeric values (seconds since 1970-01-01T00:00:00Z) 
      or a String value (ISO 8601:2004(E) format), depending on the situation.
  <li>size -- The size of the file in bytes, stored as doubles.
     They are stored as doubles because some files may be larger than 
     ints allow and longs are not supported in some response file types.
     Doubles will give the exact size, even for very large files.
  <li>additional columns defined by the ERDDAP™ administrator with information 
    extracted from the filename (for example, the time associated with the data in the file)
    based on two attributes that you specify in the metadata for each additional
    column/dataVariable: 
    <ul>
    <li>extractRegex -- This is a <a rel="help" 
    href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
    >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
      (<a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html">tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>).  
        The entire regex must match the entire filename 
        (not including the directory name).
       The regex must include at least one capture group 
       (a section of a regular expression that is enclosed by parentheses)
       which ERDDAP™ uses to determine which section of the filename to extract
       to become data.
    <li>extractGroup -- This is the number of the capture group (#1 is the first 
       capture group) in the regular expression. The default is 1.
       A capture group is a section of a regular expression that is enclosed by 
       parentheses.
    </ul> 
    Here are two examples:
    <pre>
    &lt;dataVariable&gt;
        &lt;sourceName&gt;time&lt;/sourceName&gt;
        &lt;destinationName&gt;time&lt;/destinationName&gt;
        &lt;dataType&gt;String&lt;/dataType&gt;
        &lt;addAttributes&gt;
            &lt;att name="extractRegex"&gt;jplMURSST(.{14})\.png&lt;/att&gt;
            &lt;att name="extractGroup" type="int"&gt;1&lt;/att&gt;
            &lt;att name="units"&gt;yyyyMMddHHmmss&lt;/att&gt;
        &lt;/addAttributes&gt;
    &lt;/dataVariable&gt;
    &lt;dataVariable&gt;
        &lt;sourceName&gt;day&lt;/sourceName&gt;
        &lt;destinationName&gt;day&lt;/destinationName&gt;
        &lt;dataType&gt;int&lt;/dataType&gt;
        &lt;addAttributes&gt;
            &lt;att name="extractRegex"&gt;jplMURSST.{6}(..).{6}\.png&lt;/att&gt;
            &lt;att name="extractGroup" type="int"&gt;1&lt;/att&gt;
            &lt;att name="ioos_category"&gt;Time&lt;/att&gt;
        &lt;/addAttributes&gt;
    &lt;/dataVariable&gt; </pre>
      In the case of the time variable, 
      if a file has the name jplMURSST20150103000000.png,
      the extractRegex will match the filename, 
      extract the characters which match the first capture group ("20150103000000")
      as dataType=String, 
      then use the 
      <a rel="help" href="#stringTimeUnits">units suitable for string times</a>
      to parse the strings into time data values (2015-01-03T00:00:00Z).
      <p>In the case of the day variable, if a file has the name
      jplMURSST20150103000000.png,
      the extractRegex will match the filename, 
      extract the characters which match the first capture group ("03")
      as <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a>=int, 
      yielding a data value of 3.
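      <p>Both extractions can be tried out with a short Python sketch.
      It is only an illustration: the function name <kbd>extract</kbd> is hypothetical,
      Python's <kbd>re.fullmatch</kbd> stands in for the Java regular expressions that
      ERDDAP™ uses, and the <kbd>strptime</kbd> format is the Python equivalent of the
      yyyyMMddHHmmss units.

```python
import re
from datetime import datetime, timezone


def extract(file_name, extract_regex, extract_group=1):
    """Return the text of one capture group, or None if there is no match."""
    # The entire regex must match the entire file name.
    match = re.fullmatch(extract_regex, file_name)
    if match is None:
        return None
    return match.group(extract_group)


name = "jplMURSST20150103000000.png"

# The time variable: a String which is then parsed with the time units.
time_text = extract(name, r"jplMURSST(.{14})\.png")
time_value = datetime.strptime(time_text, "%Y%m%d%H%M%S").replace(
    tzinfo=timezone.utc)

# The day variable: the same file name, a different capture group, dataType=int.
day = int(extract(name, r"jplMURSST.{6}(..).{6}\.png"))

print(time_text, time_value.isoformat(), day)
```

      A file name that doesn't match the regex yields <kbd>None</kbd> in this sketch.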
    </ul>

<li>No <a rel="help" href="#updateEveryNMillis"><kbd>&lt;updateEveryNMillis&gt;</kbd></a> -- 
  This type of dataset doesn't need and can't use the &lt;updateEveryNMillis&gt; tag.
  The information served by EDDTableFromFileNames is always perfectly up-to-date
  because ERDDAP™ queries the file system in order to respond to each request for data.
  Even if there are a huge number of files, this approach should work reasonably well.
  A response may be slow if there are a huge number of files 
  and the dataset hasn't been queried for a while.
  But for several minutes after that, the operating system keeps the information in a cache,
  so responses should be very fast.
  <br>&nbsp;

<li>You can use the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make the datasets.xml chunk for this type of dataset.
  You can add/define additional columns with information extracted from the filename,
  as shown above.
  <br>&nbsp;

<!--  clean up documentation for ***fromFiles (file info stored in jsonlCSV files).
<li>
    <ol>
    <li>Obtain the bucket URL in the 
    <a rel="help" href="#AwsS3URLFormat">format that ERDDAP™ requires</a>, e.g., 
    <a rel="help" 
      href="https://noaa-goes17.s3.us-east-1.amazonaws.com"
           >https://noaa-goes17.s3.us-east-1.amazonaws.com<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> ,
    and test it in a browser to make sure it returns an XML document which has a partial 
    listing the contents of that bucket.
    <br>&nbsp;

    <li>Use <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>
    to try to create an
    <a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a> dataset.
    <br>&nbsp;&nbsp;For the <kbd>Starting directory</kbd>, specify the bucket URL, e.g., 
      <a rel="help" 
      href="https://noaa-goes17.s3.us-east-1.amazonaws.com"
           >https://noaa-goes17.s3.us-east-1.amazonaws.com/</a> 
    <br>&nbsp;&nbsp;For the <kbd>File name regex</kbd>, specify .*
    <br>&nbsp;&nbsp;For <kbd>Recursive</kbd>, specify <kbd>true</kbd>
    <br>For this example, that AWS bucket has 26 million files, 
    so GenerateDatasets will run for a long time (several hours),
    create a huge temporary file (4TB?) with the 
    S3 file names in <i>bigParentDirectory</i>/dataset/_FileVisitor/,
    then fail at the last step because the file is too big for GenerateDatasetsXml.
    The file's name will be a file-name-safe version of the URL plus the start time plus ".jsonlCsv",
    e.g., 
    <br>https_noaa-goes17.s3.us-east-1.amazonaws.com_20190730135148_596.jsonlCsv
    <br>That's okay. We'll get around this problem below.
    <br>&nbsp;

    <li>Make a directory to hold the data for the dataset.
    In this example, we'll use the name /data/awsS3NoaaGoes17/ .
    <br>&nbsp;

    <li>Copy the huge temporary file with the 
    S3 file names from <i>bigParentDirectory</i>/dataset/_FileVisitor/ 
    into the new directory for the dataset (e.g., /data/awsS3NoaaGoes17/ ).
    <br>&nbsp;

    <li>Split the huge temporary file with S3 file names into multiple files 
    with 100000 lines each so that ERDDAP™ can work with the information.
    An easy way to do this with Linux (or GitBash) is:
    <br><kbd>split -a 4 -d -l 100000 input_file output_file</kbd>
    <br>Or, in this example:
    <pre>split -a 4 -d -l 100000 https_noaa-goes17.s3.us-east-1.amazonaws.com_20190730135148_596.jsonlCsv awsS3NoaaGoes17_</pre>
    That says: split the file into chunks with 100000 'l'ines each, and make file names with 4 character, 'd'ecimal suffixes.
    <br>That makes files named: awsS3NoaaGoes17_0000, awsS3NoaaGoes17_0001, awsS3NoaaGoes17_0002, etc.
    which will sort nicely.
    For details, see the <a rel="help" 
      href="https://www.computerhope.com/unix/usplit.htm"
           ><kbd>split</kbd> documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.
    <br>&nbsp;
       
    <li>Make a file with the column names.
    <br>First, use a text editor to make a small file with the column names on one, jsonlCSV-style line:
    <br><kbd>["directory","name","lastModified","size"]</kbd>
    <br>Make sure there is a newline at the end of that line, but no other blank lines in the file.
    <br>&nbsp;

    <li>Add column names to the files.
    An easy way to do this with Linux (or GitBash) is:
    <pre>cd /data/awsS3NoaaGoes17/
    for f in awsS3* ; do cat columnNames.csv "$f" > ../awsS3NoaaGoes17/"$f".jsonlCSV ; done</pre>

    <li>Now you can use GenerateDatasetsXml to create an EDDTableFromFileNames
    to display all the names of the 26 million files in the noaa-goes17 bucket.
    Specify:
    <pre>Which EDDType? EDDTableFromFileNames
    fileDir? ***fromFiles, jsonlCSV, /u00/data/points/awsS3NoaaGoes17/, awsS3NoaaGoes17_....\.jsonlCSV(|.gz), https://noaa-goes17.s3.us-east-1.amazonaws.com/
    File name regex? .*
    Recursive? true
    reloadEveryNMinutes? 10080
    infoUrl? https://registry.opendata.aws/noaa-goes/
    institution? NOAA
    summary? This is a test of displaying file names from the AWS S3 noaa-goes17 Bucket. Use ERDDAP's "files" system to download the files.
    title? AWS S3 File Names from the noaa-goes17 Bucket
    </pre>

    <li>If you follow the instructions above, you have created a dataset like 
    <br><a rel="help" 
      href="https://coastwatch.pfeg.noaa.gov/erddap/info/awsS3NoaaGoes17/index.html"
           >https://coastwatch.pfeg.noaa.gov/erddap/info/awsS3NoaaGoes17/index.html</a>
    <br>which has the file names from the noaa-goes17 bucket.
      If you click on the "files" link there, 
      and click down the directory tree to an actual file name and click on the file name,
      ERDDAP™ will redirect your request to AWS S3 so that you can download the file 
      from AWS.
    <br>&nbsp;

    <li>Make ERDDAP™ datasets.
    <br>If you do a little poking around with the directory and file names in the directory tree for that dataset, 
      it becomes clear that the top level directory names (e.g., ABI-L1b-RadC)
      correspond to what ERDDAP™ would call separate datasets.
      You could then pursue creating separate datasets in ERDDAP™ for each of those datasets, using, e.g., 
      <br>https://coastwatch.pfeg.noaa.gov/erddap/files/awsS3NoaaGoes17/ABI-L1b-RadC/
      <br>as the <kbd>&lt;cacheFromUrl&gt;</kbd>. Unfortunately, for this particular
      example, the datasets all seem to be level 1 or level 2 datasets,
      which ERDDAP™ isn't particularly good at.
    <br>&nbsp;
     
    </ol>

-->

<li><a class="selfLink" id="EDDTableFromFileNamesSkeletonXML" href="#EDDTableFromFileNamesSkeletonXML" rel="bookmark">The 
  skeleton XML for an EDDTableFromFileNames dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromFileNames" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  <a rel="help" href="#EDDTableFromFileNamesData">&lt;fileDir&gt;</a>...&lt;/fileDir&gt; 
  <a rel="help" href="#EDDTableFromFileNamesData">&lt;recursive&gt;</a>...&lt;/recursive&gt;  &lt;!-- true or false (the default) --&gt;
  <a rel="help" href="#pathRegex">&lt;pathRegex&gt;</a>...&lt;/pathRegex&gt;  &lt;!-- 0 or 1. Only directory names which 
    match the pathRegex (default=".*") will be accepted. --&gt;
  <a rel="help" href="#EDDTableFromFileNamesData">&lt;fileNameRegex&gt;</a>...&lt;/fileNameRegex&gt; 
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more.
     Each dataVariable MUST include a <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> tag. --&gt;
&lt;/dataset&gt;
</pre>
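<p>As an illustration of the skeleton, here is a minimal sketch of a filled-in dataset.
The datasetID, directory, and fileNameRegex are hypothetical, and the sketch assumes
the four standard source columns for this dataset type (url, name, lastModified, size).
Most attributes are omitted, so treat it as a starting point, not a complete definition.
<pre>
&lt;dataset type="EDDTableFromFileNames" datasetID="myFileNames" active="true"&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;fileDir&gt;/data/myFiles/&lt;/fileDir&gt;  &lt;!-- hypothetical directory --&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;pathRegex&gt;.*&lt;/pathRegex&gt;
  &lt;fileNameRegex&gt;.*\.png&lt;/fileNameRegex&gt;  &lt;!-- hypothetical: only .png files --&gt;
  &lt;addAttributes&gt;...&lt;/addAttributes&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;url&lt;/sourceName&gt;
    &lt;destinationName&gt;url&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;name&lt;/sourceName&gt;
    &lt;destinationName&gt;name&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;lastModified&lt;/sourceName&gt;
    &lt;destinationName&gt;lastModified&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;size&lt;/sourceName&gt;
    &lt;destinationName&gt;size&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="units"&gt;bytes&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
&lt;/dataset&gt;
</pre>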
&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromFiles" href="#EDDTableFromFiles" rel="bookmark"><strong>EDDTableFromFiles</strong></a> is the superclass of all
EDDTableFrom...Files classes. 
You can't use EDDTableFromFiles directly. 
Instead, use a subclass of EDDTableFromFiles to handle the specific file type:
  <ul>
  <li><a rel="help" href="#EDDTableFromAsciiFiles">EDDTableFromAsciiFiles</a> aggregates data from 
       comma-, tab-, semicolon-, or space-separated tabular ASCII data files.
  <li><a rel="help" href="#EDDTableFromAudioFiles">EDDTableFromAudioFiles</a> 
        aggregates data from a group of local audio files.
  <li><a rel="help" href="#EDDTableFromAwsXmlFiles">EDDTableFromAwsXmlFiles</a> aggregates data from 
       a set of Automatic Weather Station (AWS) XML files.
  <li><a rel="help" href="#EDDTableFromColumnarAsciiFiles">EDDTableFromColumnarAsciiFiles</a> 
    aggregates data from tabular ASCII data files with fixed-width data columns.
  <li><a rel="help" href="#EDDTableFromHyraxFiles">EDDTableFromHyraxFiles</a> (DEPRECATED)
    aggregates data with several variables, each with shared dimensions
    (for example, time, altitude (or depth), latitude, longitude), and served by a 
      <a rel="help" href="https://www.opendap.org/software/hyrax-data-server">Hyrax OPeNDAP server<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  <li><a rel="help" href="#EDDTableFromInvalidCRAFiles">EDDTableFromInvalidCRAFiles</a> 
        aggregates data from NetCDF (v3 or v4) .nc files which use 
        a specific, invalid variant of the CF DSG Contiguous Ragged Array (CRA) file format.
        Although ERDDAP™ supports this file type, it is an invalid file type 
        that no one should start using. Groups that currently use this file type are 
        strongly encouraged to use ERDDAP™ to generate valid CF DSG CRA files
        and stop using these files.
  <li><a rel="help" href="#EDDTableFromJsonlCSVFiles">EDDTableFromJsonlCSVFiles</a> 
        aggregates data from 
        <a rel="help"
          href="https://jsonlines.org/examples/"
          >JSON Lines CSV files<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
  <li><a rel="help" href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> 
    aggregates data from NetCDF (v3 or v4) .nc 
    (or <a rel="help" href="#NcML">.ncml</a>)
    files with several variables, 
    each with shared dimensions (for example, time, altitude (or depth), latitude, longitude).
  <li><a rel="help" href="#EDDTableFromNcFiles">EDDTableFromNcFiles</a> aggregates
    data from NetCDF (v3 or v4) .nc 
    (or <a rel="help" href="#NcML">.ncml</a>)
    files with several variables, 
    each with shared dimensions (for example, time, altitude (or depth), latitude, longitude).
        It is fine to continue using this dataset type for existing datasets,
        but for new datasets we recommend using EDDTableFromMultidimNcFiles instead.
  <li><a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a> 
    aggregates data from NetCDF (v3 or v4) .nc 
    (or <a rel="help" href="#NcML">.ncml</a>)
    files which use one of the file formats specified by the 
    <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
    >CF Discrete Sampling Geometries (DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    conventions.
    But for files using one of the multidimensional CF DSG variants, use
    <a rel="help" href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> 
    instead.

  <li><a rel="help" href="#EDDTableFromNccsvFiles">EDDTableFromNccsvFiles</a> 
    aggregates data from 
    <a rel="help" href="https://erddap.github.io/NCCSV.html">NCCSV</a>
    ASCII .csv files.

  <li><a rel="help" href="#EDDTableFromParquetFiles">EDDTableFromParquetFiles</a> 
    handles data from 
    <a rel="help"
      href="https://parquet.apache.org/"
      >Parquet<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.

  <li><a rel="help" href="#EDDTableFromThreddsFiles">EDDTableFromThreddsFiles</a> (DEPRECATED)
    aggregates data from files with several variables with shared dimensions served by a 
    <a rel="help" 
    href="https://www.unidata.ucar.edu/software/tds/"
    >THREDDS OPeNDAP server<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  <li><a rel="help" href="#EDDTableFromWFSFiles">EDDTableFromWFSFiles</a> (DEPRECATED)
    makes a local copy of all of the data from an ArcGIS MapServer WFS server
    so the data can then be re-served quickly to ERDDAP™ users.
  </ul>
Currently, no other file types are supported. 
But it is usually relatively easy to add support for other file types. Contact us if you have a request.
Or, if your data is in an old file format that you would like to move away from, 
we recommend converting the files to NetCDF v3 .nc files
(and especially .nc files with the  
<a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
>CF Discrete Sampling Geometries (DSG)<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a> 
Contiguous Ragged Array data structure -- 
ERDDAP™ can extract data from them very quickly). 
NetCDF is a widely supported binary format that
allows fast random access to the data and is already supported by ERDDAP™.

<p><a class="selfLink" id="EDDTableFromFiles_Details" href="#EDDTableFromFiles_Details" rel="bookmark">Details</a> -- 
The following information applies to all of the subclasses of EDDTableFromFiles.
<ul>
<li><a class="selfLink" id="EDDTableFromFiles_Aggregation" href="#EDDTableFromFiles_Aggregation" rel="bookmark"><strong>Aggregation</strong></a> -- 
  This class aggregates data from local files. Each file holds a (relatively)
  small table of data.
  <ul>
  <li>The resulting dataset appears as if all of the files' tables had been combined
    (all of the rows of data from file #1, plus all of the rows from file #2, ...).
  <li>The files don't all have to have all of the specified variables.
    If a given file doesn't have a specified variable, ERDDAP™ will add missing 
    values as needed.
  <li>The variables in all of the files MUST have the same values for the 
      <a rel="help" href="#add_offset">add_offset</a>,
      <a rel="help" href="#missing_value">missing_value</a>, 
      <a rel="help" href="#FillValue">_FillValue</a>, 
      <a rel="help" href="#scale_factor">scale_factor</a>, and 
      <a rel="help" href="#units">units</a> attributes (if any). 
    ERDDAP™ checks, but it is an imperfect test: if the values differ, ERDDAP™
    can't know which value is correct and therefore which files are invalid.
    If this is a problem, you may be able to use 
    <a rel="help" href="#NcML">NcML</a> or 
    <a rel="help" href="#NCO">NCO</a>
    to fix the problem.
    <br>&nbsp;
  </ul>

<li><strong>Compressed Files</strong>
  <br>The source data files for all EDDTableFromFiles subclasses 
    can be externally compressed
    (e.g., .tgz, .tar.gz, .tar.gzip, .gz, .gzip, .zip, .bz2, or .Z).
    See the <a rel="help" href="#ExternallyCompressedFiles">Externally Compressed Files documentation</a>.
    <br>&nbsp;

<li><a class="selfLink" id="EDDTableFromFiles_CachedFileInformation" href="#EDDTableFromFiles_CachedFileInformation" rel="bookmark"><strong>Cached File Information</strong></a>
-- When an EDDTableFromFiles dataset is first loaded, 
  EDDTableFromFiles reads information from all of the relevant files 
  and creates tables (one row for each file) with information about each valid file and each "bad" 
  (different or invalid) file. 
  <ul>
  <li>The tables are also stored on disk, as NetCDF v3 .nc files in <i>bigParentDirectory</i>/dataset/<i>last2CharsOfDatasetID</i>/<i>datasetID</i>/ in
    files named:
    <br>&nbsp;&nbsp;dirTable.nc (which holds a list of unique directory names),
    <br>&nbsp;&nbsp;fileTable.nc (which holds the table with each valid file's information),
    <br>&nbsp;&nbsp;badFiles.nc (which holds the table with each bad file's information).
  <li>To speed up access to an EDDTableFromFiles dataset (but at the expense of using more memory),
    you can use
    <br><a rel="help" href="#fileTableInMemory"><kbd>&lt;fileTableInMemory&gt;true&lt;/fileTableInMemory&gt;</kbd></a>
    <br>to tell ERDDAP™ to keep a copy of the file information tables in memory.
  <li>The copy of the file information tables on disk is also useful when ERDDAP™ is shut down and restarted: 
    it saves EDDTableFromFiles from having to re-read all of the data files.
  <li>When a dataset is reloaded, ERDDAP™ only needs to read the data in new files
    and files that have changed.
  <li>If a file has a different structure from the other files (for example, a different data type
    for one of the variables, or a different value for the 
    "<a rel="help" href="#units">units</a>" attribute), ERDDAP
    adds the file to the list of "bad" files. Information about the problem with the file
    will be written to the <i>bigParentDirectory</i>/logs/log.txt file.    
  <li>You shouldn't ever need to delete or work with these files.
    One exception is: if you are still making changes to a dataset's datasets.xml setup,
    you may want to delete these files to force ERDDAP™ to reread all of the files
    since the files will be read/interpreted differently.
    If you ever do need to delete these files, you can do it when ERDDAP™ is running. 
    (Then set a 
    <a rel="help"
    href="https://erddap.github.io/setup.html#setDatasetFlag">flag</a>
    to reload the dataset ASAP.)
    However, ERDDAP™ usually notices that the datasets.xml information doesn't match
    the fileTable information and deletes the file tables automatically.
  <li>If you want to encourage ERDDAP™ to update the stored dataset information 
    (for example, if you just added, removed, or changed some files in the dataset's data directory), 
    use the 
      <a rel="help" href="https://erddap.github.io/setup.html#flag">flag system</a>
      to force ERDDAP™ to update the cached file information.
    <br>&nbsp;
  </ul>
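  <p>For example, a frequently queried dataset might keep its file information tables in memory.
  This is only a minimal sketch: the dataset type, datasetID, directory, and regex are
  hypothetical placeholders, and the other tags a real dataset needs are omitted.
<pre>
&lt;dataset type="EDDTableFromNcFiles" datasetID="myBuoyData" active="true"&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;!-- Keep a copy of the file information tables in memory:
       faster access, at the expense of using more memory. --&gt;
  &lt;fileTableInMemory&gt;true&lt;/fileTableInMemory&gt;
  &lt;fileDir&gt;/data/myBuoyData/&lt;/fileDir&gt;  &lt;!-- hypothetical directory --&gt;
  &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;!-- ... remaining tags (addAttributes, dataVariables, etc.) ... --&gt;
&lt;/dataset&gt;
</pre>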

<li><a class="selfLink" id="EDDTableFromFiles_HandlingRequests" href="#EDDTableFromFiles_HandlingRequests" rel="bookmark"><strong>Handling Requests</strong></a> -- 
  ERDDAP™ tabular data requests can put constraints on any variable.
  <ul>
  <li>When a client's request for data is processed, EDDTableFromFiles can quickly look 
    in the table with the valid file information to see which files might have relevant data.
    For example, if each source file has the data for one fixed-location buoy, EDDTableFromFiles
    can very efficiently determine which files might have data within a given longitude range and
    latitude range.
  <li>Because the valid file information table includes the minimum and maximum value of every 
    variable for every valid file, EDDTableFromFiles can often handle other queries quite efficiently.
    For example, if some of the buoys don't have an air pressure sensor, and a client requests 
    data for <kbd>airPressure!=NaN</kbd>, EDDTableFromFiles can efficiently determine which buoys 
    have air pressure data.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDTableFromFiles_Updating" href="#EDDTableFromFiles_Updating" rel="bookmark"><strong>Updating the Cached File Information</strong></a> -- 
    Whenever the dataset is reloaded, the cached file
  information is updated.
  <ul>
    <li>The dataset is reloaded periodically as determined by the 
      <kbd>&lt;reloadEveryNMinutes&gt;</kbd> in the 
      dataset's information in datasets.xml.
    <li>The dataset is reloaded as soon as possible whenever ERDDAP™ detects that you have added,
       removed, 
        <a rel="help" href="https://en.wikipedia.org/wiki/Touch_(Unix)">touch'd<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        (to change the file's lastModified time), or changed a data file.
    <li>The dataset is reloaded as soon as possible if you use the 
        <a rel="help" href="https://erddap.github.io/setup.html#flag">flag system</a>.
    </ul>
  When the dataset is reloaded, ERDDAP™ compares the currently available files to the cached file
  information table.
  New files are read and added to the valid files table.
  Files that no longer exist are dropped from the valid files table.
  Files where the file timestamp has changed are read and their information is updated.
  The new tables replace the old tables in memory and on disk.
  <br>&nbsp;

<li><a class="selfLink" id="EDDTableFromFiles_BadFiles" href="#EDDTableFromFiles_BadFiles" rel="bookmark"><strong>Bad Files</strong></a> -- 
  The table of bad files and the reasons the files were declared bad (corrupted file, 
  missing variables, incorrect axis values, etc.) is emailed to the emailEverythingTo email address 
  (probably you) every time the dataset is reloaded.
  You should replace or repair these files as soon as possible.  
  <br>&nbsp;

<li><a class="selfLink" id="EDDTableFromFiles_MissingVariables" href="#EDDTableFromFiles_MissingVariables" rel="bookmark"><strong>Missing Variables</strong></a> -- 
  If some of the files don't have some of the dataVariables defined in the dataset's datasets.xml chunk, that's okay.
  When EDDTableFromFiles reads one of those files, it will act as if the file had the variable, but with all missing values.
  <br>&nbsp;

<li><strong><a class="selfLink" id="EDDTableFromFiles_NearRealTimeData" href="#EDDTableFromFiles_NearRealTimeData" rel="bookmark">Near Real Time Data</a></strong> -- 
    EDDTableFromFiles treats requests for very recent data as a special case.
  The problem: If the files making up the dataset are updated frequently, it is likely that the
  dataset won't be updated every time a file is changed. So EDDTableFromFiles won't be aware of
  the changed files.
  (You could use the 
    <a rel="help" href="https://erddap.github.io/setup.html#flag">flag system</a>,
      but this might lead to ERDDAP™ reloading the dataset almost continually.
  So in most cases, we don't recommend it.)
  Instead, EDDTableFromFiles deals with this by the following system:
  When ERDDAP™ gets a request for data within the last 20 hours (for example, 8 hours ago until Now), 
  ERDDAP™ will search all files which have any data in the last 20 hours.
  Thus, ERDDAP™ doesn't need to have perfectly up-to-date data for all of the files in order to
  find the latest data.
  You should still set <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a> 
  to a reasonably
       small value (for example, 60),
  but it doesn't have to be tiny (for example, 3).
  <br>&nbsp;
  <ul>
  <li><strong>Not recommended</strong> organization of near-real-time data in the files:
  If, for example, you have a dataset that stores data for numerous stations (or buoys, or trajectories, ...)
  for many years, you could arrange the files so that, for example, there is one file per station.
  But then, every time new data for a station arrives, you have to read a large old file and
  write a large new file.
  And when ERDDAP™ reloads the dataset, it notices that some files have been modified, so it reads
  those files completely. 
  That is inefficient.
  <br>&nbsp;

  <li><strong>Recommended</strong> organization of near-real-time data in the files:
  Store the data in chunks, for example, all data for one station/buoy/trajectory for
  one year (or one month).
  Then, when a new datum arrives, only the file with this year's (or month's) data is affected.
  <ul>
  <li>Best: Use NetCDF v3 .nc files with an unlimited dimension (time). Then,
  to add new data, you can just append the new data without having to read and rewrite the entire file.
  The change is made very efficiently and essentially atomically,
  so the file isn't ever in an inconsistent state.
  <li>Otherwise: If you don't/can't use .nc files with an unlimited dimension (time),
  then, when you need to add new data, you have to read and rewrite the entire affected file
  (hopefully small because it just has a year's (or month's) worth of data).
  Fortunately, all of the files for previous years (or months) for that station remain unchanged.
  </ul>
  In both cases, when ERDDAP™ reloads the dataset, most files are unchanged; 
  only a few, small files have changed and need to be read.
  <br>&nbsp;

  </ul>

<li><a class="selfLink" id="EDDTableFromFiles_Directories" href="#EDDTableFromFiles_Directories" rel="bookmark"><strong>Directories</strong></a> -- 
  The files can be in one directory, or in a directory 
  and its subdirectories (recursively).
  If there are a large number of files (for example, &gt;1,000), the operating system
  (and thus EDDTableFromFiles) will operate much more efficiently if you store 
  the files in a
  series of subdirectories (one per year, or one per month for datasets with very frequent files),
  so that there are never a huge number of files in a given directory.
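  <p>A minimal sketch of a dataset laid out this way, assuming a hypothetical directory tree 
  with one subdirectory per year (e.g., /data/myBuoyData/2023/ and /data/myBuoyData/2024/) 
  and one small .nc file per station per year. The dataset type, datasetID, directory, 
  and file names are illustrative only, and the other required tags are omitted.
<pre>
&lt;dataset type="EDDTableFromNcFiles" datasetID="myBuoyData" active="true"&gt;
  &lt;!-- A reasonably small value, suitable for frequently updated files. --&gt;
  &lt;reloadEveryNMinutes&gt;60&lt;/reloadEveryNMinutes&gt;
  &lt;fileDir&gt;/data/myBuoyData/&lt;/fileDir&gt;  &lt;!-- hypothetical parent directory --&gt;
  &lt;!-- Also look in the per-year subdirectories. --&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;pathRegex&gt;.*&lt;/pathRegex&gt;
  &lt;!-- Hypothetical file names like station42_2024.nc --&gt;
  &lt;fileNameRegex&gt;station.*_[0-9]{4}\.nc&lt;/fileNameRegex&gt;
  &lt;!-- ... remaining tags (addAttributes, dataVariables, etc.) ... --&gt;
&lt;/dataset&gt;
</pre>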
  <br>&nbsp;

<li><strong>Remote Directories and HTTP Range Requests</strong> (AKA Byte Serving, Byte Range Requests) -- 
  <br>EDDGridFromNcFiles, EDDTableFromMultidimNcFiles,
  EDDTableFromNcFiles, and EDDTableFromNcCFFiles
  can sometimes serve data from .nc files that are on remote servers and accessed via HTTP,
  if the server supports
  <a href="https://en.wikipedia.org/wiki/Byte_serving"
    >Byte Serving<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
  via HTTP range requests (the HTTP mechanism for byte serving).
  This is possible because netcdf-java (which ERDDAP™ uses to read .nc files)
  supports reading data from remote .nc files via HTTP range requests.

  <p><strong>Don't do this!</strong>
  <br>Instead, use the 
    <a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt; system</a>.

<li><a class="selfLink" id="cacheFromUrl" href="#cacheFromUrl" rel="bookmark"
><strong><kbd>&lt;cacheFromUrl&gt;</kbd></strong></a> -
    <br>All EDDGridFromFiles and all EDDTableFromFiles datasets support a
    set of tags which tell ERDDAP™ to download and maintain a copy of 
    all of a remote dataset's files, or a cache of a few files (downloaded as needed).
    <strong>This is an incredibly useful feature.</strong>
    <ul>
    <li>The &lt;cacheFromUrl&gt; tag lets you specify the URL of a remote file list 
      which lists a remote dataset's files. Supported types of remote file lists include:
      <ul>
      <li>Unaggregated datasets in THREDDS, e.g., 
      <br>https://data.nodc.noaa.gov/thredds/catalog/aquarius/nodc_binned_V3.0/monthly/ [2020-10-21 This server is no longer reliably available.]
      <li>Unaggregated datasets in Hyrax, e.g., 
      <br><a rel="help" href="https://podaac-opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/"
                             >https://podaac-opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
      <li>Most Apache-like directory listings, e.g., 
      <br><a rel="help" href="https://www.ncei.noaa.gov/data/global-precipitation-climatology-project-gpcp-daily/"
                             >https://www.ncei.noaa.gov/data/global-precipitation-climatology-project-gpcp-daily/<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
      <li>S3 buckets, e.g., 
      <br><a rel="help" href="https://noaa-goes17.s3.us-east-1.amazonaws.com/"
                             >https://noaa-goes17.s3.us-east-1.amazonaws.com/<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
      <br>However, this may require an AWS account and more setup. 
        <br>See <a rel="help" href="#AwsS3Files">working with S3 Buckets in ERDDAP™</a>.
      <br>Also, you usually don't need to use cacheFromUrl with files in S3 buckets if the 
        files are ASCII files (e.g., .csv), because ERDDAP™ can efficiently read the data from the 
        bucket directly via a stream.
      </ul>

      <p>ERDDAP™ will copy or cache these files in the dataset's &lt;fileDir&gt; directory. 
      If you need support for another type of remote file list (e.g., FTP),
      please email your request to Chris.John at noaa.gov .
      <ul>
      <li>The default value for the &lt;cacheFromUrl&gt; tag is null.
        If you don't specify a value for the &lt;cacheFromUrl&gt; tag, 
        the copy/cache system won't be used for this dataset.
      <li>If the dataset's &lt;fileNameRegex&gt; setting is something other than .*,
        ERDDAP™ will only download files that match the fileNameRegex. 
      <li>If the dataset's &lt;recursive&gt; setting is <kbd>true</kbd> 
        and the remote files are in subdirectories, ERDDAP™ will look in
        remote subdirectories that match the dataset's <a rel="help" href="#pathRegex">&lt;pathRegex&gt;</a>, 
        create the same directory structure locally, 
        and put the local files in the same subdirectories.
      <li>In GenerateDatasetsXml, if you specify a &lt;cacheFromUrl&gt; value,
        GenerateDatasetsXml will create the local &lt;fileDir&gt; directory
        and copy 1 remote file into it. GenerateDatasetsXml will then 
        generate the datasets.xml chunk based on that sample file
        (specify <kbd>sampleFile=nothing</kbd>).
      <li>If the data source is a remote ERDDAP™, use 
        <a rel="help" href="#EDDGridFromErddap">EDDGridFromErddap</a> or
        <a rel="help" href="#EDDTableFromErddap">EDDTableFromErddap</a> instead
        of &lt;cacheFromUrl&gt;. That way, your local ERDDAP™ will appear to 
        have the dataset but won't need to store any of the data locally.
        The only reason to use &lt;cacheFromUrl&gt; to get data from a remote
        ERDDAP™ is when you have some other reason why you want to have a local
        copy of the data files. In that case:
        <ul>
        <li>This dataset will try to subscribe to the dataset on the remote ERDDAP
          so that changes to that dataset will call this dataset's flagUrl,
          causing this local dataset to reload and download the changed remote files.
          Thus, the local dataset will be up-to-date very soon after
          changes are made to the remote dataset.          
        <li>You should email the administrator of the remote
          ERDDAP™ to ask for the datasets.xml for the remote dataset so that you can 
          make the dataset in your local ERDDAP™ look like the dataset in the 
          remote ERDDAP.
        </ul>
      <li>If the data source is a remote ERDDAP™, the local dataset will try to 
        subscribe to the remote dataset. 
        <ul>
        <li>If the subscription succeeds, whenever the remote ERDDAP™
          reloads and has new data, it will contact the flagUrl for this dataset,
          causing it to reload and download the new and/or changed data files.
        <li>If the subscription fails (for whatever reason) or if you simply want
          to ensure that the local dataset is up-to-date, you can set a 
          <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a>
          for the local dataset, so it will reload and check for new and/or changed
          remote data files. 
        </ul>
      <li>If the data source isn't a remote ERDDAP™, the dataset will 
          check for new and/or changed remote files whenever it reloads. 
          Normally, this is controlled by 
          <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>.
          But if you know when there are new remote files, you can set a 
          <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a>
          for the local dataset, so it will reload and check for new and/or changed
          remote data files. 
          If this happens routinely at a certain time of day (e.g., at 7am),
          you can make a cron job to use curl to contact the flagUrl for 
          this dataset, so it will reload and check for new and/or changed remote data files.
      </ul>
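      <p>Putting these settings together, a dataset that copies remote files might be 
      sketched as follows. The remote URL is the Apache-like directory listing mentioned above.
      The dataset type, datasetID, local directory, and fileNameRegex are assumptions 
      for illustration (the sketch assumes the remote files are gridded .nc files), 
      and the other required tags are omitted.
<pre>
&lt;dataset type="EDDGridFromNcFiles" datasetID="myGpcpDaily" active="true"&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;!-- The remote file list to copy from. --&gt;
  &lt;cacheFromUrl&gt;https://www.ncei.noaa.gov/data/global-precipitation-climatology-project-gpcp-daily/&lt;/cacheFromUrl&gt;
  &lt;!-- The local directory where ERDDAP stores the copied files (hypothetical). --&gt;
  &lt;fileDir&gt;/data/myGpcpDaily/&lt;/fileDir&gt;
  &lt;!-- Recreate the remote subdirectories locally. --&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;pathRegex&gt;.*&lt;/pathRegex&gt;
  &lt;!-- Only download files that match this regex. --&gt;
  &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
  &lt;!-- ... remaining tags (addAttributes, axisVariables, dataVariables, etc.) ... --&gt;
&lt;/dataset&gt;
</pre>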

    <li>The &lt;cacheSizeGB&gt; tag specifies the size of the local cache.
      You probably only need to use this when working with cloud storage systems like 
      <a href="https://aws.amazon.com/s3/">Amazon S3<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
      which is a commonly used storage system that is part of 
      <a href="https://aws.amazon.com/"
      >Amazon Web Services (AWS)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>. 
      The default is -1.
      <ul>
      <li>If the value is &lt;=0 (e.g., the default value of -1), 
        <br>ERDDAP™ will download and maintain a 
        <strong>complete copy</strong> of all of the remote dataset's files
        in the dataset's &lt;fileDir&gt;. 
        <ul>
        <li>This is the setting which is recommended whenever possible.
        <li>Every time the dataset is reloaded, it compares the names, sizes, and
          lastModified times of the remote files and the local files, 
          and downloads any remote files which are new or have changed.
        <li>If a file that was on the remote server disappears, ERDDAP™ will
          not delete the corresponding local file (otherwise, if something
          was temporarily wrong with the remote server, ERDDAP™ might delete
          some or all of the local files!).
        <li>With this setting, usually you will set &lt;updateEveryNMillis&gt; to -1,
          since the dataset is aware of when it has copied new data files into place.
        </ul>
      <li>If the value is &gt;0, 
        <br>ERDDAP™ will download files from 
        the remote dataset as needed into a local 
        <strong>cache</strong> (in the dataset's &lt;fileDir&gt;) 
        with a threshold size of that specified number of GB.
        <ul>
        <li>The cache must be large enough to hold at least several data files.
        <li>In general, the larger the cache, the better, because the next
          requested data file will be more likely to already be in the cache.
        <li>Caching should only be used when ERDDAP™ is running in a cloud computing server
          (e.g., an AWS compute instance) and the remote files are in a cloud storage system
          (e.g., AWS S3). 
        <li>When the disk space used by the local files exceeds cacheSizeGB,
          ERDDAP™ will soon (maybe not immediately) delete some of the cached files 
          (currently, based on the Least Recently Used (LRU) algorithm) until
          the disk space used by the local files is &lt;0.75*cacheSizeGB
          (the "goal"). Yes, there are cases where LRU performs very badly --
          there is no perfect algorithm.
        <li>ERDDAP™ will never try to delete a cached file that ERDDAP™ started to use 
          in the last 10 seconds. 
          This is an imperfect system to deal with the cache system 
          and the data file reader system being only loosely integrated.
          Because of this rule, ERDDAP™ may not be able to 
          delete enough files to reach its goal, in which case it will 
          print a WARNING to the log.txt file,
          and the system will waste a lot of time trying to prune the cache,
          and it is possible that the size of the files
          in the cache may greatly exceed the cacheSizeGB.          
          If this ever occurs, use a larger cacheSizeGB setting for that dataset.
        <li>Currently, ERDDAP™ never checks if the remote server has a newer
          version of a file that is in the local cache. If you need this feature,
          please email Chris.John at noaa.gov .
        </ul>
      <li>Although the use of the same tag names might imply that the copy system
        and the cache system use the same underlying system, that is not correct.
        <ul>
        <li>The copy system proactively starts taskThread tasks to download
          new and changed files every time the dataset is reloaded.
          Only files that have actually been copied to the local directory
          are available via the ERDDAP™ dataset.
        <li>The cache system gets the remote file list every time the dataset is
          reloaded and pretends that all of those files are available via
          the ERDDAP™ dataset.
          Interestingly, all of the remote files even appear in the dataset's 
          /files/ web pages and are available for downloading 
          (although perhaps only after a delay while the file is first downloaded 
          from the remote server to the local cache.)
        </ul>
      <li>Datasets that use cacheSizeGB may benefit from using an 
        <a href="#nThreads" rel="bookmark">nThreads</a> setting greater than 1,
        because this will enable the dataset to download more than 1 remote
        file at a time.
      </ul>
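      <p>A minimal sketch of the two modes described above, assuming a hypothetical
      bucket URL and hypothetical local directories (the other tags that a complete
      dataset needs are omitted). To keep a complete local copy:
      <pre>
&lt;cacheFromUrl&gt;https://example-bucket.s3.us-east-1.amazonaws.com/myData/&lt;/cacheFromUrl&gt;
&lt;cacheSizeGB&gt;-1&lt;/cacheSizeGB&gt;  &lt;!-- &lt;=0: copy all of the remote files --&gt;
&lt;fileDir&gt;/data/myData/&lt;/fileDir&gt;
&lt;updateEveryNMillis&gt;-1&lt;/updateEveryNMillis&gt;
</pre>
      To use a local cache instead, and to allow more than one download at a time:
      <pre>
&lt;cacheFromUrl&gt;https://example-bucket.s3.us-east-1.amazonaws.com/myData/&lt;/cacheFromUrl&gt;
&lt;cacheSizeGB&gt;50&lt;/cacheSizeGB&gt;  &lt;!-- &gt;0: cache threshold in GB --&gt;
&lt;fileDir&gt;/data/cache/myData/&lt;/fileDir&gt;
&lt;nThreads&gt;3&lt;/nThreads&gt;
</pre>
      The values 50 and 3 are placeholders only; suitable values depend on the size
      of the data files and on the resources of your server.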

    <li>The &lt;cachePartialPathRegex&gt; tag is a rarely used tag that 
      can specify an alternative for the dataset's <a rel="help" href="#pathRegex">&lt;pathRegex&gt;</a>.
      The default is null.
      <ul>
      <li>Only use this if you are copying the entire dataset via the default
         &lt;cacheSizeGB&gt; value of -1. With &lt;cacheSizeGB&gt; values of &gt;0,
         this will be ignored because it is nonsensical.
      <li>See <a rel="help" href="#pathRegex">the documentation for &lt;pathRegex&gt;</a> 
        for guidance on how to construct the regex.
      <li>If this is specified, it will be used every time the dataset is reloaded,
        except the first time a dataset is reloaded at the beginning of a month.
      <li>This is useful when the remote dataset is stored in a labyrinth
        of subdirectories and when the vast majority of those files rarely, if ever, change.
        (&lt;cough&gt;NASA&lt;cough&gt;) You could, for example, specify a 
        &lt;cachePartialPathRegex&gt; which just matches the current year 
        or the current month. These regexes are very tricky to specify, 
        because all of the partial and full path names must match the &lt;cachePartialPathRegex&gt;
        and because the &lt;cachePartialPathRegex&gt; must work with the remote URLs
        and the local directories. A real life example is:
        <br>&lt;cacheFromUrl&gt;https://data.nodc.noaa.gov/ghrsst/GDS2/L4/GLOB/JPL/MUR/v4.1/&lt;/cacheFromUrl&gt;
          <br>&lt;!-- [2020-10-21 This server is no longer reliably available.] For most types of remote directories, omit the filename (e.g., contents.html for Hyrax). --&gt;
        <br>&lt;fileDir&gt;/u00/satellite/MUR41/&lt;/fileDir&gt;
        <br>&lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
        <br>&lt;recursive&gt;true&lt;/recursive&gt;
        <br>&lt;pathRegex&gt;.*&lt;/pathRegex&gt;
        <br>&lt;cachePartialPathRegex&gt;.*/v4\.1/(|2018/(|01./))&lt;/cachePartialPathRegex&gt;
        <br>The sample URL above has files in subdirectories based on year (e.g., 2018) 
          and day of year (e.g.,  001, 002, ..., 365 or 366).
        <br>Note that the &lt;cachePartialPathRegex&gt; starts with .*, 
        <br>then has a specific subdirectory which is common to the remote URLs and the local directories, e.g., /v4\.1/
        <br>then has a series of nested capture groups where the first option is nothing
        <br>and the second option is a specific value.
        <p>The example above will only match directories for days 010 through 019 of 2018, e.g.,
        <br>https://data.nodc.noaa.gov/ghrsst/GDS2/L4/GLOB/JPL/MUR/v4.1/2018/010/ [2020-10-21 This server is no longer reliably available.]
        <br>and days 011, 012, ..., 019.
        <br>(See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
        <br>If you need help creating &lt;cachePartialPathRegex&gt;, please 
          email the &lt;cacheFromUrl&gt; to Chris.John at noaa.gov .
      <li>A common approach: If you want to use &lt;cachePartialPathRegex&gt;, don't use it initially,
        because you want ERDDAP™ to download all of the files initially.
        After ERDDAP™ has downloaded all of the files, add it to the dataset's
        chunk of datasets.xml.
        <br>&nbsp;
      </ul>
    </ul>

<li><a class="selfLink" id="EDDTableFromFiles_ThousandsOfFiles" href="#EDDTableFromFiles_ThousandsOfFiles" rel="bookmark"><strong>Thousands of Files</strong></a> 
  <br>If your dataset has many thousands of files, ERDDAP™ may be slow to respond to requests for data from that dataset. 
  There are two issues here:
    <br>&nbsp;

  <ol>
  <li>The number of files per directory. 
    <br>Internally, ERDDAP™ operates at the same speed regardless of whether n files 
    are in one directory or dispersed in several directories.
    <br>&nbsp;

    <p>But there is a problem: The more files in a given directory, 
      the slower the operating system is at returning the list of files in the directory (per file) to ERDDAP. 
      The response time might be O(n log n). 
      It is hard to say how many files in one directory is too many, but 10,000 is probably too many.
      So if your setup is generating lots of files, a recommendation here might be: 
      put the files in logically organized subdirectories (e.g., station or station/year).

    <p>Another reason to use subdirectories: 
    if a user wants to use ERDDAP's "files" system to find the name of the oldest file for station X, 
    it is faster and more efficient if the files are in station/year subdirectories, 
    because much less information needs to be transferred.

  <li>The total number of files.
    <br>For tabular datasets, ERDDAP™ keeps track of the range of values for each variable in each file. 
    When a user makes a request, ERDDAP™ has to read all the data from all of the files 
    that might have data matching the user's request.
    If the user asks for data from a limited time (e.g., one day or one month), 
    then ERDDAP™ won't have to open too many files in your dataset.
    But there are extreme cases where almost every file might have matching data 
    (e.g., when waterTemperature=13.2C).
    Since it takes ERDDAP™ a little bit of time (partly the seek time on the HDD, 
    partly the time to read the file's header) just to open a given file 
    (and more if there are lots of files in the directory),
    there is a significant time penalty if the total number of files 
    that ERDDAP™ has to open is very large. Even opening 1000 files takes significant time.
    So there are benefits to periodically consolidating the daily files into larger chunks (e.g., 1 station for 1 year).
    I understand that you might not want to do this for various reasons, but it does lead to much faster responses.
    In extreme cases (e.g., I deal with a GTSPP dataset that has ~35 million source files), 
    serving data from a huge number of source files is impractical because ERDDAP's response 
    to simple queries can take hours and use tons of memory. 
    By consolidating source files into a smaller number (for GTSPP, I have 720 now, 2 per month), 
    ERDDAP™ can respond reasonably quickly. See 
    <a rel="help" href="#EDDTableFromFiles_MillionsOfFiles">Millions of Files</a>.
    <br>&nbsp;

  </ol>
  N.B. Solid State Drives are great!
  The fastest, easiest, cheapest way to help ERDDAP™ deal with a huge number of (small) 
  files is to use a solid state drive. See
  <a rel="help" href="https://erddap.github.io/setup.html#SSD">Solid State Drives are great!</a>
  <br>&nbsp;


<li><a class="selfLink" id="EDDTableFromFiles_MillionsOfFiles" href="#EDDTableFromFiles_MillionsOfFiles" rel="bookmark"><strong>Millions of Files</strong></a> -
  Some datasets have millions of source files. 
  ERDDAP™ can handle this, but with mixed results.
  <ul>
  <li>For requests that just involve variables listed in 
    <a rel="help" href="#subsetVariables">&lt;subsetVariables&gt;</a>,
    ERDDAP™ has all of the needed information already extracted from 
    the datafiles and stored in one file, so it can respond very, very quickly.
  <li>For other requests, ERDDAP™ can scan the dataset's
    <a rel="help" href="#EDDTableFromFiles_CachedFileInformation">cached file information</a>
    and figure out that only a few of the files
    might have data which is relevant to the request and thus respond quickly.
  <li>But for other requests (for example, waterTemperature=18 degree_C)
    where any file might have relevant data,
    ERDDAP™ has to open a large number of files to see if each of the files has
    any data which is relevant to the request.
    The files are opened sequentially. On any operating system and any file system
    (other than solid state drives), this takes a long time (so ERDDAP™ responds slowly)
    and really ties up the file system (so ERDDAP™ responds slowly to other requests).
  </ul>

  <p>Fortunately, there is a solution.
  <ol>
  <li>Set up the dataset on a non-public ERDDAP™ (your personal computer?).
  <li>Create and run a script which requests a series of .ncCF files, each with 
    a large chunk of the dataset, usually a time period
    (for example, all of the data for a given month).
    Choose the time period so that all of the resulting files are less 
    than 2GB (but hopefully greater than 1GB).
    If the dataset has near-real-time data, run the script to regenerate the 
    file for the current time period (e.g., this month) frequently 
    (every 10 minutes? every hour?).
    Requests to ERDDAP™ for .ncCF files create a NetCDF v3 .nc file that uses the 
    <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
    >CF Discrete Sampling Geometries (DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    Contiguous Ragged Array data structures.

  <li>Set up an <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>
    dataset on your public ERDDAP™ which gets data from the .nc(CF) files.
    ERDDAP™ can extract data from these files very quickly.
    And since there are now dozens or hundreds (instead of millions) of files, 
    even if ERDDAP™ has to open all of the files, it can do so quickly.
    
  </ol>
  Yes, this system takes some time and effort to set up, but it works very, very well. 
  Most data requests can be handled 100 times faster than before.
  <br>[Bob knew this was a possibility, but it was Kevin O'Brien who first did this
  and showed that it works well.  Now, Bob uses this for the GTSPP dataset 
  which has about 18 million source files and which ERDDAP™ now serves 
  via about 500 .nc(CF) files.]
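
  <p>As a rough sketch of step 3, the start of the public dataset's chunk of datasets.xml
  might look like the following. The datasetID, directory, and reload interval are hypothetical,
  and the remaining tags, global attributes, and dataVariables that a working dataset needs
  are omitted (<a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> is probably
  the easier way to create a complete draft):
  <pre>
&lt;dataset type="EDDTableFromNcCFFiles" datasetID="myConsolidatedData" active="true"&gt;
    &lt;reloadEveryNMinutes&gt;60&lt;/reloadEveryNMinutes&gt;
    &lt;fileDir&gt;/data/myConsolidatedData/&lt;/fileDir&gt;
    &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
    &lt;recursive&gt;false&lt;/recursive&gt;
    &lt;!-- ... other tags, global addAttributes, and dataVariables go here ... --&gt;
&lt;/dataset&gt;
</pre>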

  <p>N.B. Solid State Drives are great!
  The fastest, easiest, cheapest way to help ERDDAP™ deal with a huge number of (small) 
  files is to use a solid state drive. See
  <a rel="help" href="https://erddap.github.io/setup.html#SSD">Solid State Drives are great!</a>
  <br>&nbsp;

<li><a class="selfLink" id="EDDTableFromFiles_HugeFiles" href="#EDDTableFromFiles_HugeFiles" rel="bookmark"><strong>Huge Files</strong></a> -- 
A single huge data file (notably huge ASCII data files) can cause an OutOfMemoryError. 
If this is the problem, it should be obvious because ERDDAP™ will fail to load the dataset. 
The solution, if feasible, is to split the file into multiple files. 
Ideally, you can split the file into logical chunks. 
For example, if the file has 20 months' worth of data, split it into 20 files, 
each with 1 month's worth of data. 
But there are advantages even if the main file is split up arbitrarily. 
This approach has multiple benefits: 
a) This will reduce the memory needed to read the data files to 1/20th, 
because only one file is read at a time. 
b) Often, ERDDAP™ can deal with requests much faster because it only has to look 
in one or a few files to find the data for a given request. 
c) If data collection is ongoing, then the existing 20 files can remain unchanged, 
and you only need to modify one, small, new file to add the next month's worth of data to the dataset.
  <br>&nbsp;

<li><a class="selfLink" id="EDDTableFromFiles_FTP" href="#EDDTableFromFiles_FTP" rel="bookmark"><strong>FTP Trouble/Advice</strong></a> -- 
  If you FTP new data files to the ERDDAP™ server while ERDDAP™ is running,
  there is the chance that ERDDAP™ will be reloading the dataset during the FTP process.
  It happens more often than you might think!
  If it happens, the file will appear to be valid (it has a valid name), but the file isn't valid.
  If ERDDAP™ tries to read data from that invalid file, the resulting error will cause the file to
  be added to the table of invalid files.
  This is not good.
  To avoid this problem, use a temporary filename when FTP'ing the file, for example, ABC2005.nc_TEMP .
  Then, the fileNameRegex test (see below) will indicate that this is not a relevant file.
  After the FTP process is complete, rename the file to the correct name.
  The renaming process will cause the file to become relevant in an instant.
  <br>&nbsp;
<li><a class="selfLink" id="EDDTableFromFiles_FileNameExtracts" href="#EDDTableFromFiles_FileNameExtracts" rel="bookmark"><strong>File Name Extracts</strong></a>   <br>[This feature is DEPRECATED. Please use 
    <a href="#fileNameSourceNames" rel="help"><kbd>***fileName</kbd> pseudo sourceName</a> instead.]
  <br>EDDTableFromFiles has a system for extracting a String from each filename
  and using that to make a pseudo data variable.
  Currently, there is no system to interpret these Strings as dates/times.
  There are several XML tags to set up this system.
  If you don't need part or all of this system, just don't specify these tags or use "" values.
  <ul>
  <li><kbd>preExtractRegex</kbd> is a 
      <a rel="help" 
      href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
      >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
      (<a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html">tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) 
      used to identify text to be removed
    from the start of the filename. 
    The removal only occurs if the regex is matched.
    This usually begins with "^" to match the beginning of the filename.
  <li><kbd>postExtractRegex</kbd> is a regular expression used to identify text to be removed from
    the end of the filename.
    The removal only occurs if the regex is matched.
    This usually ends with "$" to match the end of the filename.
  <li><kbd>extractRegex</kbd> If present, this regular expression is used after preExtractRegex and 
    postExtractRegex to identify a string to be extracted from the filename (for example, the stationID).
    If the regex isn't matched, the entire filename is used (minus preExtract and postExtract).
    Use ".*" to match the entire filename that is left after preExtractRegex and postExtractRegex.
  <li><kbd>columnNameForExtract</kbd> is the data column source name for the extracted Strings.
    A dataVariable with this <a rel="help" href="#sourceName">sourceName</a> must be in the dataVariables list (with any data type,
    but usually String).
  </ul>
  For example, if a dataset has files with names like 
  <kbd>XYZAble.nc, XYZBaker.nc, XYZCharlie.nc,</kbd> ...,
  and you want to create a new variable (<kbd>stationID</kbd>) when each file is read 
  which will have station ID values
  (<kbd>Able, Baker, Charlie,</kbd> ...) extracted from the filenames, you could use these tags:
  <ul>
  <li><kbd>&lt;preExtractRegex&gt;^XYZ&lt;/preExtractRegex&gt;</kbd>
    <br>The initial ^ is a regular expression special character which 
    forces ERDDAP™ to look for <kbd>XYZ</kbd> at the beginning of the filename.
    This causes <kbd>XYZ</kbd>, if found at the beginning of the filename, to be removed
    (for example, the filename <kbd>XYZAble.nc</kbd> becomes <kbd>Able.nc</kbd>).
  <li><kbd>&lt;postExtractRegex&gt;\.nc$&lt;/postExtractRegex&gt;</kbd> 
    <br>The $ at the end is a regular expression special character which 
    forces ERDDAP™ to look for <kbd>.nc</kbd> at the end of the filename.
    Since . is a regular expression special character (which matches any character), 
    it is escaped as <kbd>\.</kbd> here so that it matches only a literal period.
    This causes <kbd>.nc</kbd>, if found at the end of the filename, to be removed
    (for example, the partial filename <kbd>Able.nc</kbd> becomes <kbd>Able</kbd>).
  <li><kbd>&lt;extractRegex&gt;.*&lt;/extractRegex&gt;</kbd> 
    <br>The .* regular expression matches all remaining characters
    (for example, the partial filename <kbd>Able</kbd> becomes the extract for the first file).
  <li><kbd>&lt;columnNameForExtract&gt;stationID&lt;/columnNameForExtract&gt;</kbd> 
    <br>This tells ERDDAP™ to create a new source column called <kbd>stationID</kbd> 
    when reading each file.
    Every row of data for a given file will have the text extracted 
    from its filename (for example, <kbd>Able</kbd>) as the value in the <kbd>stationID</kbd> column.
  </ul>
  In most cases, there are numerous values for these extract tags that will yield the same results --
  regular expressions are very flexible. But in a few cases, there is just one 
  way to get the desired results.  
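  <p>Since this feature is deprecated, the same <kbd>stationID</kbd> column could instead be
  defined with a <kbd>***fileName</kbd> pseudo sourceName. A minimal sketch, assuming the same
  <kbd>XYZAble.nc</kbd>-style filenames as above (any attributes that your setup requires
  for the variable are omitted):
  <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;***fileName,XYZ(.*)\.nc,1&lt;/sourceName&gt;
    &lt;destinationName&gt;stationID&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
&lt;/dataVariable&gt;
</pre>
  Here, capture group 1 is the text between <kbd>XYZ</kbd> and <kbd>.nc</kbd>
  (for example, <kbd>Able</kbd>).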
    <br>&nbsp;

<li><strong><a class="selfLink" id="EDDTableFromFiles_PseudoSourceNames" href="#EDDTableFromFiles_PseudoSourceNames" rel="bookmark"
  >Pseudo sourceNames</a></strong> 
  <br>Every variable in every dataset in ERDDAP™ has a 
  <a rel="help" href="#sourceName"><kbd>&lt;sourceName&gt;</kbd></a>
  which specifies the source's name for the variable. EDDTableFromFiles supports
  a few pseudo sourceNames which extract a value from some other place (e.g., the
  file's name or the value of a global attribute) and promote that value
  to be a column of constant values for that chunk of data (e.g., the table
  of that file's data).
  For these variables, you must specify the variable's data type via the
    <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd></a> tag.
  If the extracted information is a dateTime string, you specify the format
    of the dateTime string in the 
    <a rel="help" href="#stringTimeUnits"><kbd>units attribute</kbd></a>.
  The pseudo sourceName options are:
  <br>&nbsp;

  <ul>
  <li><strong><a class="selfLink" id="globalSourceNames" href="#globalSourceNames" rel="bookmark"
  ><kbd>global:</kbd> sourceNames</a></strong> 
  <br>A global metadata attribute in each source data file can be promoted to be a column of data.
  If a variable's <kbd>&lt;sourceName&gt;</kbd>
  has the format 
  <pre>&lt;sourceName&gt;global:<i>attributeName</i>&lt;/sourceName&gt;</pre>
  then when ERDDAP™ is reading the data from a file, 
  ERDDAP™ will look for a global attribute of that name (for example, <kbd>PI</kbd>) and 
  create a column filled with the attribute's value.
  This is useful when the attribute has different values in different source files,
  because otherwise, users would only see one of those values for the whole dataset.
  For example,
  <pre>&lt;sourceName&gt;global:PI&lt;/sourceName&gt;</pre>

  <p>When you promote an attribute to be data, ERDDAP™ removes the corresponding attribute.
  This is appropriate because the value is presumably different in every file;
  whereas in the aggregated dataset in ERDDAP™ it will have only one value.
  If you want, you can add a new value for the attribute for the whole dataset by adding
  <kbd>&lt;att name="<i>attributeName</i>"&gt;<i>newValue</i>&lt;/att&gt;</kbd>
  to the dataset's global <a rel="help" href="#addAttributes"><kbd>&lt;addAttributes&gt;</kbd></a>.
  For global attributes that ERDDAP™ requires, for example, <kbd>institution</kbd>, 
  you MUST add a new value for the attribute.
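  <p>For illustration, a complete variable definition for the example above might look
  something like this. The destinationName and the attribute values are assumptions
  chosen for this sketch, not requirements:
  <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;global:PI&lt;/sourceName&gt;
    &lt;destinationName&gt;PI&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="ioos_category"&gt;Identifier&lt;/att&gt;
        &lt;att name="long_name"&gt;Principal Investigator&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>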
  <br>&nbsp;
  
  <li><strong><a class="selfLink" id="variableSourceNames" href="#variableSourceNames" rel="bookmark"
  ><kbd>variable:</kbd> sourceNames</a></strong> 
  <br>A variable's metadata attribute in each file can be promoted to be a column of data.
  If a variable's <kbd>&lt;<a rel="help" href="#sourceName">sourceName</a>&gt;</kbd> has the format
  <pre>&lt;sourceName&gt;variable:<i>variableName</i>:<i>attributeName</i>&lt;/sourceName&gt;</pre>  
  then when ERDDAP™ is reading the data from a file, 
  ERDDAP™ will look for the specified attribute (for example, <kbd>ID</kbd>) 
  of the specified variable (for example, <kbd>instrument</kbd>) and 
  create a column filled with the attribute's value. 
  The parent variable (for example, <kbd>instrument</kbd>) needn't be one of the dataVariables
  included in the dataset's definition in ERDDAP. For example,
  <pre>&lt;sourceName&gt;variable:instrument:ID&lt;/sourceName&gt;</pre>
  This is useful when the attribute has different values in different source files,
  because otherwise, users would only see one of those values for the whole dataset.

  <p>When you promote an attribute to be data, ERDDAP™ removes the corresponding attribute.
  This is appropriate because the value is presumably different in every file;
  whereas in the aggregated dataset in ERDDAP™ it will have only one value.
  If you want, you can add a new value for the attribute for the whole dataset by adding
  <kbd>&lt;att name="<i>attributeName</i>"&gt;<i>newValue</i>&lt;/att&gt;</kbd>
  to the variable's <a rel="help" href="#addAttributes"><kbd>&lt;addAttributes&gt;</kbd></a>.
  For attributes that ERDDAP™ requires, for example, <kbd>ioos_category</kbd> (depending on your setup), 
  you MUST add a new value for the attribute.

  <li><strong><a class="selfLink" id="fileNameSourceNames" href="#fileNameSourceNames" rel="bookmark"
  ><kbd>***fileName</kbd> sourceNames</a></strong> 
  <br>You can extract part of a file's fileName and promote that to be a column of data.
  The format for this pseudo <a rel="help" href="#sourceName"><kbd>&lt;sourceName&gt;</kbd></a> is
  <pre>&lt;sourceName&gt;***fileName,<i>regex</i>,<i>captureGroupNumber</i>&lt;/sourceName&gt;</pre>
  For example, 
  <pre>&lt;sourceName&gt;***fileName,A(\d{12})\.slcpV1\.nc,1&lt;/sourceName&gt;</pre> 
  When EDDTableFromFiles is reading the data from a file, 
  it will make sure the fileName (for example, <kbd>A201807041442.slcpV1.nc</kbd>) 
  matches the specified regular expression ("regex")
  and extract the specified (in this case, the first) 
  capture group (which is a part surrounded by parentheses),
  for example, "201807041442".
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
  The regex may be specified as a string with or without surrounding quotes.
  If the regex is specified as a string with surrounding quotes,
  the string must be a
  <a rel="help" href="https://www.json.org/json-en.html" >JSON-style string<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
  (with special characters escaped with \ characters).
  The capture group number is usually 1 (the first capture group), but may be any number.
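  <p>If the extracted text is a dateTime string, the variable definition might look
  something like this sketch. It assumes that the 12 digits in the example filename are
  year, month, day, hour, and minute, which the filename alone doesn't confirm:
  <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;***fileName,A(\d{12})\.slcpV1\.nc,1&lt;/sourceName&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;!-- format of the extracted dateTime string, e.g., 201807041442 --&gt;
        &lt;att name="units"&gt;yyyyMMddHHmm&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>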
  <br>&nbsp;

  <li><strong><a class="selfLink" id="pathNameSourceNames" href="#pathNameSourceNames" rel="bookmark"
  ><kbd>***pathName</kbd> sourceNames</a></strong> 
  <br>You can extract part of a file's full pathName (/directories/fileName.ext) 
  and promote that to be a column of data.
  The format for this pseudo <a rel="help" href="#sourceName"><kbd>&lt;sourceName&gt;</kbd></a> is
  <pre>&lt;sourceName&gt;***pathName,<i>regex</i>,<i>captureGroupNumber</i>&lt;/sourceName&gt;</pre> 
  For example, 
  <pre>&lt;sourceName&gt;***pathName,/data/myDatasetID/([A-Z0-9]*)/B(\d{12})\.nc,1&lt;/sourceName&gt;</pre> 
  When EDDTableFromFiles is reading the data from a file, 
  it will make sure the full pathName 
  (for example, <kbd>/data/myDatasetID/BAY17/B201807041442.nc</kbd>)
  matches the specified regular expression ("regex")
  and extract the specified (in this case, the first) 
  capture group (which is a part surrounded by parentheses), for example, "BAY17".
  For this test, the directory separators will always be '/', never '\'.
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
  The regex may be specified as a string with or without surrounding quotes.
  If the regex is specified as a string with surrounding quotes,
  the string must be a
  <a rel="help" href="https://www.json.org/json-en.html" >JSON-style string<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
  (with special characters escaped with \ characters).
  The capture group number is usually 1 (the first capture group), but may be any number.
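  <p>For illustration, the example above might be written in either of the two forms below.
  The quoted form is a sketch which assumes the usual JSON rule that each literal
  backslash is written as \\ inside the quotes:
  <pre>
&lt;sourceName&gt;***pathName,/data/myDatasetID/([A-Z0-9]*)/B(\d{12})\.nc,1&lt;/sourceName&gt;
&lt;sourceName&gt;***pathName,"/data/myDatasetID/([A-Z0-9]*)/B(\\d{12})\\.nc",1&lt;/sourceName&gt;
</pre>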
  <br>&nbsp;
  </ul>

<li><a class="selfLink" id="EDDTableFromFiles_0Files" href="#EDDTableFromFiles_0Files" rel="bookmark"><strong>"0 files" Error Message</strong></a> -- 
  If you run 
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> or
  <a rel="help" href="#DasDds">DasDds</a>, 
  or if you try to load an EDDTableFrom...Files
  dataset in ERDDAP™, and you get a "0 files" error message indicating that
  ERDDAP™ found 0 matching files in the directory 
  (when you think that there are matching files in that directory):
  <ul>
  <li>Check that the files really are in that directory.
  <li>Check the spelling of the directory name.
  <li>Check the fileNameRegex. It's really, really easy to make mistakes with regexes.
    For test purposes, try the regex .* which should match all filenames.
          (See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.)
  <li>Check that the user who is running the program (e.g., user=tomcat (?) for Tomcat/ERDDAP)
    has 'read' permission for those files. 
  <li>On some systems (for example, Linux with SELinux enabled) and depending on system settings, 
    the user who ran the program must have 'read' permission for the 
    whole chain of directories leading to the directory that has the files. 
    <br>&nbsp;
  </ul>
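  <p>For example, while troubleshooting you might temporarily use settings like 
  the following in the dataset's chunk of datasets.xml. 
  This is only a sketch: the directory name is hypothetical, and the 
  fileNameRegex is the match-everything test value suggested above.
<pre>
&lt;fileDir&gt;/data/myDataset/&lt;/fileDir&gt;  &lt;!-- hypothetical absolute directory --&gt;
&lt;recursive&gt;true&lt;/recursive&gt;  &lt;!-- also look in subdirectories --&gt;
&lt;fileNameRegex&gt;.*&lt;/fileNameRegex&gt;  &lt;!-- test value: matches all file names --&gt;
</pre>
  If the files are found with these settings, the problem is probably in your 
  original fileNameRegex. A stricter regex such as <kbd>.*\.csv</kbd> 
  (all file names ending in ".csv") can then be restored and tested.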

<li><a class="selfLink" id="EDDTableFromFiles_standardizeWhat" href="#EDDTableFromFiles_standardizeWhat" rel="bookmark"
><strong>standardizeWhat</strong></a> -- 
When any subclass of EDDTableFromFiles is aggregating a set of source files, 
for a given variable, all of the source files MUST have identical attribute values for 
several attributes (scale_factor, add_offset, _Unsigned, missing_value,
_FillValue, and units). 
Think about it: if one file has windSpeed units=knots and another has windSpeed units=m/s,
then the data values from the two files shouldn't be included in the same aggregated dataset.
So, when EDDTableFromFiles first creates the dataset, 
it reads the attribute values from one file, 
then rejects all of the files that have different values for those important attributes.
For most collections of files, this is not a problem because the attributes of 
all the variables are consistent. However, for other collections of files, 
this can lead to 1%, 10%, 50%, 90%, or even 99% of the files being rejected as "bad" files.
That is trouble.

<p>EDDTableFromFiles has a system to deal with this problem: standardizeWhat.
The standardizeWhat setting tells EDDTableFromFiles to standardize
the files as soon as it reads them, before EDDTableFromFiles
looks at the attributes to see if they are consistent.

<p>The flip side is: if the dataset doesn't have this problem, don't use standardizeWhat.
standardizeWhat has some potential risks (discussed below) and inefficiencies,
so if you don't actually need its features, there is no 
reason to accept them. The biggest inefficiency is this:
when a dataset uses standardizeWhat options, it implies that the 
source files store data in significantly different ways 
(e.g., with different scale_factor and add_offset,
or with time strings using different formats).
Thus, for a given constraint in a user request,
ERDDAP™ cannot make a single source-level constraint that can be applied to all source files. 
Instead, ERDDAP™ can only apply the affected constraints at the higher, destination level,
which means it has to read the data from more files before applying them.
As a result, requests to datasets that use standardizeWhat take longer to process.

<p>To use this system, you need to specify 
<br><kbd>&lt;standardizeWhat&gt;<i>standardizeWhat</i>&lt;/standardizeWhat&gt;</kbd>
<br>in the  
<a href="#EDDTableFromFilesSkeletonXML" rel="bookmark">datasets.xml for the EDDTableFrom...Files dataset</a>
(within the <kbd>&lt;dataset&gt;</kbd> tag).

<p>The <i>standardizeWhat</i> value specifies which changes EDDTableFromFiles should
try to apply. The value is the sum of the numbers of the options you want, chosen from:

<ol>
<li value="1">Unpack
   <br>This does many common and safe operations to standardize numeric columns in the files: 
   <ul>
   <li>If scale_factor and/or add_offset attributes are present, remove them and 
     apply them to unpack the data values.
   <li>Unpack packed attributes (e.g., actual_min, actual_max, actual_range, 
     data_min, data_max, data_range, valid_min, valid_max, valid_range), if present, if the variable was packed, and 
     if the attribute values were packed (this is tricky, but reasonably reliable).
   <li>If _FillValue and/or missing_value are present, 
     convert those data values to ERDDAP's "standard" missing values:
     MAX_VALUE for integer types (e.g., 127 for bytes, 32,767 for shorts,
     2,147,483,647 for ints, and 9,223,372,036,854,775,807 for longs)
     and NaN for doubles and floats.
   <li>Remove the old _FillValue and/or missing_value attributes (if any),
      and replace them with just _FillValue=[the ERDDAP™ standard missing value].
      <br>&nbsp;
   </ul>
<li value="2">Standardize Numeric Times
  <br>If a numeric column has CF-style numeric time units 
  ("<i>timeUnits</i> since <i>baseTime</i>", e.g., "days since 1900-01-01"),
  this converts the dateTime values into 
  <span class="nowrap">"seconds since 1970-01-01T00:00:00Z"</span> values
  and changes the units attribute to indicate that.
  <br>If this is selected and there is a chance that this variable
    has scale_factor or add_offset, #1 MUST be selected also.
  <br>&nbsp;
<li value="4">Apply String missing_value
  <br>If a String column has _FillValue and/or missing_value attributes,
   this converts those values to "" and removes the attributes.
  <br>&nbsp;
<li value="256">Find Numeric missing_value
  <br>If a numeric column doesn't have _FillValue or missing_value attributes,
   this tries to identify an undefined numeric missing_value (e.g., -999, 9999, 1e37f) 
   and convert instances of it to the "standard" values 
   (MAX_VALUE for integer types, and NaN for doubles and floats).
   <br><strong>This option has a risk:</strong> if the largest or smallest valid data value looks
   like a missing value (e.g., 999), then those valid data values will be
   converted to missing values (e.g., NaN).
  <br>&nbsp;
<li value="512">Change String "N/A" to ""
  <br>For each String column, convert several strings commonly used to indicate
   a missing String value to "".
   Currently, this looks for 
   ".", "...", "-", "?", "???", "N/A", "NA", "none", "not applicable", "null", "unknown", "unspecified".   
   The string search is case-insensitive and applied after the strings are trimmed.
   "nd" and "other" are specifically not on the list.
   <br><strong>This option has a risk:</strong> Strings that you consider to 
   be valid values may be converted to "".
  <br>&nbsp;
<li value="1024">Standardize to String ISO 8601 DateTimes
  <br> For each String column, try to convert not-purely-numeric String dateTimes 
    (e.g., "Jan 2, 2018") to ISO 8601 String dateTimes ("2018-01-02").
    <br><strong>Note</strong> that all data values for the column must use the same format,
    otherwise, this option won't make any changes to a given column.
    <br><strong>This option has a risk:</strong> If there is a column with 
    string values that just happen to look like a common dateTime format,
    they will be converted to ISO 8601 String dateTimes.
  <br>&nbsp;
<li value="2048">Standardize Compact DateTimes To ISO 8601 DateTimes
  <br>For each String or integer-type column, try to convert purely-numeric String dateTimes
    (e.g., "20180102") to ISO 8601 String dateTimes ("2018-01-02").
    <br><strong>Note</strong> that all data values for the column must use the same format,
    otherwise, this option won't make any changes to a given column.
    <br><strong>This option has a risk:</strong> If there is a column with 
    values that aren't compact dateTimes but look like compact dateTimes,
    they will be converted to ISO 8601 String dateTimes.
  <br>&nbsp;
<li value="4096">Standardize Units
  <br>This tries to standardize the units string for each variable.
     For example, "meters per second", "meter/second", "m.s^-1", "m s-1", "m.s-1"
     will all be converted to "m.s-1".  This doesn't change the data values.
     This works well for valid UDUNITS
     units strings, but can have problems with invalid or complex strings.
     You can deal with problems by specifying specific from-to pairs
     in &lt;standardizeUdunits&gt; in ERDDAP's
     <br>[tomcat]/webapps/erddap/WEB-INF/classes/gov/noaa/pfel/erddap/util/messages.xml file. 
     Please email any changes
     you make to Chris.John at noaa.gov so they can be incorporated
     into the default messages.xml. 
     <br><strong>This option has a risk:</strong> This may mangle some 
     complex or invalid units; however, you can use the work-around
     described above to circumvent problems if they occur.
  <br>&nbsp;
</ol>

<p>The default value of standardizeWhat is 0, which doesn't do anything.
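<p>For example, to apply both Unpack (#1) and Standardize Numeric Times (#2), 
the value is 1 + 2 = 3. 
The following is a sketch of how that might look within a dataset's chunk of 
datasets.xml; which options are appropriate depends on your files.
<pre>
&lt;!-- 1 (Unpack) + 2 (Standardize Numeric Times) = 3 --&gt;
&lt;standardizeWhat&gt;3&lt;/standardizeWhat&gt;
</pre>
Similarly, adding Standardize Units (#4096) to those two options 
would give 1 + 2 + 4096 = 4099.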

<p>If/when you change the value of standardizeWhat, the next time the dataset is reloaded,
ERDDAP™ will reread all of the data files for the dataset in order to rebuild
the mini-database with information about each file. If the dataset has lots of
files, this will take a long time.

<p>Notes:
<ul>
<li>A tricky thing is -
<br>The standardizeWhat setting is used for all columns in the source file. So, for example,
using #2048 might successfully convert a column of compact String dateTimes into
ISO 8601 String dateTimes, but it might also mistakenly convert a column with
Strings that just happen to look like compact dateTimes.
<br>&nbsp;

<li>datasets.xml and GenerateDatasetsXml -
<br>It is especially tricky to get the settings correct in datasets.xml to
make your dataset work the way you want it to. The best approach (as always) is:
  <ol>
  <li>Use <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> 
    and specify the value of standardizeWhat that you would like to use.
  <li>Use <a rel="help" href="#DasDds">DasDds</a> to ensure that the dataset
    loads correctly and reflects the standardizeWhat setting that you specified. 
  <li>Test the dataset by hand when it is in ERDDAP™ to ensure that the affected 
    variables work as expected.
    <br>&nbsp;
  </ol>

<li>Risks -
<br>Options #256 and above are riskier, i.e., there is a greater chance that ERDDAP™ 
will make a change that shouldn't be made. For example, option #2048
might accidentally convert a variable with station ID strings
that all just happen to look like ISO 8601 "compact" dates (e.g., 20180102) into
ISO 8601 "extended" dates ("2018-01-02").
<br>&nbsp;

<li>Slow after a change -- 
<br>Since the value of standardizeWhat changes the data values that EDDTableFromFiles
sees for each data file, if you change the standardizeWhat setting, EDDTableFromFiles
will throw away all the cached information about each file (which includes the min and
max for each data variable in each file) and re-read each data file.
If a dataset has a large number of files, this can be very time consuming, 
so it will take a long time for the dataset to reload the first time ERDDAP™ reloads
it after you make the change. 
<br>&nbsp;

<li>Heuristics -
<br>Options #256 and above use heuristics to make their changes. If you come 
across a situation where the heuristics make a bad decision, please email
a description of the problem to Chris.John at noaa.gov so we can 
improve the heuristics.
<br>&nbsp;

<li>Alternatives -- 
<br>If one of the standardizeWhat options doesn't solve a problem for a given dataset, 
you may be able to solve the problem by making an 
<a rel="help" href="#NcML">.ncml file</a>
to parallel every data file and define changes
to things in the files so that the files are consistent. 
Then, tell the EDDTableFrom...Files dataset to aggregate the .ncml files.
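<p>As an illustration, an .ncml file of this kind might look like the sketch below.
The file name, variable name, and units value are hypothetical. 
The example only overrides an attribute and does not change any data values,
so it is suitable only when the files differ in how the units are written,
not in the units actually used.
<pre>
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
    location="station1_2018.nc"&gt;  &lt;!-- hypothetical data file --&gt;
  &lt;variable name="windSpeed"&gt;
    &lt;!-- replace, e.g., "meters per second" with a consistent units string --&gt;
    &lt;attribute name="units" value="m s-1" /&gt;
  &lt;/variable&gt;
&lt;/netcdf&gt;
</pre>
With one such file per data file, the dataset's fileNameRegex would then 
match the .ncml files (for example, <kbd>.*\.ncml</kbd>) instead of the original data files.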

<p>Or, use <a rel="help" href="#NCO">NCO</a>
to actually make changes to the files so that the files are consistent.

</ul>


<li><a class="selfLink" id="SeparateColumnsForYearMonthDateHourMinuteSecond" href="#SeparateColumnsForYearMonthDateHourMinuteSecond" rel="bookmark"><strong>Separate Columns for Year, Month, Date, Hour, Minute, Second</strong></a> -- 
<br>It is fairly common for tabular data files to have separate columns for 
year, month, date, hour, minute, second. 
Before ERDDAP™ v2.10, the only solution was to edit the data file to combine those columns
into a unified time column.
With ERDDAP™ 2.10+, you can use 
<br><a href="#sourceName" rel="help">&lt;sourceName&gt;=<i>expression</i>&lt;/sourceName&gt;</a>
to tell ERDDAP™ how to combine the source columns to make a unified time column,
so you no longer have to edit the source file.

<li><a class="selfLink" id="skipHeaderToRegex" href="#skipHeaderToRegex" rel="bookmark">&lt;skipHeaderToRegex&gt;</a> --
<br>OPTIONAL. (For EDDTableFromAsciiFiles and EDDTableFromColumnarAsciiFiles datasets only.)
<br>When EDDTableFromAsciiFiles reads a data file, 
it will ignore all of the lines up to and including the line 
that matches this regular expression. 
The default is "", which doesn't use this option. 
An example is 
<br><kbd>&lt;skipHeaderToRegex&gt;\*\*\* END OF HEADER.*&lt;/skipHeaderToRegex&gt;</kbd>
<br>which will ignore all lines up to and including a line that starts with "*** END OF HEADER".

<p>When you use this tag, <kbd>&lt;columnNamesRow&gt;</kbd> and <kbd>&lt;firstDataRow&gt;</kbd>
act as if the header has been removed before the file is read. 
For example, you would use columnNamesRow=0 if the column names are on the row right after the header.
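<p>As a sketch, suppose a (hypothetical) data file begins like this:
<pre>
Cruise: AB1234
Instrument: CTD
*** END OF HEADER ***
time,depth,temperature
2018-01-02T00:00:00Z,5,12.3
</pre>
With the setting
<pre>
&lt;skipHeaderToRegex&gt;\*\*\* END OF HEADER.*&lt;/skipHeaderToRegex&gt;
</pre>
ERDDAP™ would ignore the first three lines, so the file is treated as if 
it started with the "time,depth,temperature" line.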

<p>If you want to use generateDatasetsXml with a dataset that needs this tag:
<ol>
<li>Make a new, temporary, sample file by copying an existing file and removing the header.
<li>Run generateDatasetsXml and specify that sample file.
<li>Manually add the &lt;skipHeaderToRegex&gt; tag to the datasets.xml chunk.
<li>Delete the temporary, sample file.
<li>Use the dataset in ERDDAP.
</ol>

<li><a class="selfLink" id="skipLinesRegex" href="#skipLinesRegex" rel="bookmark">&lt;skipLinesRegex&gt;</a> -- 
<br>OPTIONAL. (For EDDTableFromAsciiFiles and EDDTableFromColumnarAsciiFiles datasets only.)
<br>When EDDTableFromAsciiFiles reads a data file, 
it will ignore all lines which match this regular expression.
The default is "", which doesn't use this option.
An example is
<br><kbd>&lt;skipLinesRegex&gt;#.*&lt;/skipLinesRegex&gt;</kbd>
<br>which will ignore all lines which start with "#".

<p>When you use this tag, <kbd>&lt;columnNamesRow&gt;</kbd> and <kbd>&lt;firstDataRow&gt;</kbd>
act as if all of the matching lines had been removed before the file is read. 
For example, you would use columnNamesRow=0 even if there are several lines starting with, 
for example, "#" at the start of the file.
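<p>As a sketch, suppose a (hypothetical) data file looks like this:
<pre>
# Station 46088
# Units: degree_C
time,temperature
2018-01-02T00:00:00Z,8.1
# sensor serviced
2018-01-02T01:00:00Z,8.3
</pre>
With the setting
<pre>
&lt;skipLinesRegex&gt;#.*&lt;/skipLinesRegex&gt;
</pre>
the three lines starting with "#" would be ignored wherever they occur, and the file 
would be read as if it contained only the column names line and the two data lines.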


<li><a class="selfLink" id="EDDTableFromFilesSkeletonXML" href="#EDDTableFromFilesSkeletonXML" rel="bookmark"><strong>The skeleton XML</strong>
  for all EDDTableFromFiles subclasses is:</a>
<pre>
&lt;dataset type="EDDTableFrom...Files" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  &lt;nDimensions&gt;...&lt;/nDimensions&gt;  &lt;!-- This was used prior to ERDDAP™ 
    version 1.30, but is now ignored. --&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. For 
    EDDTableFromFiles subclasses, this uses Java's WatchDirectory system 
    to notice new/deleted/changed files quickly and efficiently. --&gt;
  <a rel="help" href="#EDDTableFromFiles_standardizeWhat">&lt;standardizeWhat&gt;</a>...&lt;/standardizeWhat&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#nThreads">&lt;nThreads&gt;</a>...&lt;/nThreads&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;specialMode&gt;<i>mode</i>&lt;/specialMode&gt;  &lt;!-- This rarely-used, OPTIONAL tag 
    can be used with EDDTableFromThreddsFiles to specify that special,
    hard-coded rules should be used to determine which files should 
    be downloaded from the server. Currently, the only valid <i>mode</i>
    is SAMOS which is used with datasets from
    https://tds.coaps.fsu.edu/thredds/catalog/samos to download only the 
    files with the last version number. --&gt;
  &lt;sourceUrl&gt;...&lt;/sourceUrl&gt;  &lt;!-- For subclasses like 
    EDDTableFromHyraxFiles and EDDTableFromThreddsFiles, this is where 
    you specify the base URL for the files on the remote server.  
    For subclasses that get data from local files, ERDDAP™ doesn't use 
    this information to get the data, but does display the 
    information to users. So I usually use "(local files)". --&gt;
  &lt;fileDir&gt;...&lt;/fileDir&gt; &lt;!-- The directory (absolute) with the data 
    files. --&gt;
  &lt;recursive&gt;true|false&lt;/recursive&gt; &lt;!-- 0 or 1. Indicates if 
    subdirectories of fileDir have data files, too. --&gt;
  <a rel="help" href="#pathRegex">&lt;pathRegex&gt;</a>...&lt;/pathRegex&gt;  &lt;!-- 0 or 1. Only directory names which 
    match the pathRegex (default=".*") will be accepted. --&gt;
  &lt;fileNameRegex&gt;...&lt;/fileNameRegex&gt; &lt;!-- 0 or 1. A <a rel="help" href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
  >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
    (<a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html">tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) describing valid data file names, for example, 
        ".*\.nc" for all .nc files. --&gt;
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a>true|false(default)&lt;/accessibleViaFiles&gt; 
    &lt;!-- 0 or 1 --&gt;
  &lt;metadataFrom&gt;...&lt;/metadataFrom&gt; &lt;!-- The file to get metadata
    from ("first" or "last" (the default) based on file's 
    lastModifiedTime). --&gt;
  <a class="selfLink" id="ETFFcharset" href="#ETFFcharset" rel="bookmark">&lt;charset&gt;...&lt;/charset&gt;</a> 
    &lt;!-- (For EDDTableFromAsciiFiles and EDDTableFromColumnarAsciiFiles 
    only) This OPTIONAL tag specifies the character set (case 
    sensitive!) of the source files, for example, ISO-8859-1 
    (the default) and UTF-8.  --&gt; 
  <a href="#skipHeaderToRegex" rel="bookmark">&lt;skipHeaderToRegex&gt;</a>...&lt;/skipHeaderToRegex&gt; 
  <a href="#skipLinesRegex" rel="bookmark">&lt;skipLinesRegex&gt;</a>...&lt;/skipLinesRegex&gt;
  <a class="selfLink" id="columnNamesRow" href="#columnNamesRow" rel="bookmark">&lt;columnNamesRow&gt;...&lt;/columnNamesRow&gt;</a> &lt;!-- (For EDDTableFromAsciiFiles 
    only) This specifies the number of the row with the column 
    names in the files. (The first row of the file is "1".
    Default = 1.)  If you specify 0, ERDDAP™ will not look for 
    column names and will assign names: Column#1, Column#2, ... --&gt;
  &lt;firstDataRow&gt;...&lt;/firstDataRow&gt; 
    &lt;!-- (For EDDTableFromAsciiFiles and EDDTableFromColumnarAsciiFiles 
    only) This specifies the number of the first row with data in the 
    files. (The first row of the file is "1". Default = 2.) --&gt;
  &lt;dimensionsCSV&gt;...&lt;/dimensionsCSV&gt; &lt;!-- (For EDDTableFromNcFiles
    and EDDTableFromMultidimNcFiles only) This is a comma-separated
    list of dimension fullNames. If specified, ERDDAP™ will only read
    variables in the source files which use some or all of these 
    dimensions, plus all of the scalar variables. If a dimension 
    is in a group, you must specify its fullName,
    e.g., "<i>groupName/dimensionName</i>". --&gt;
  &lt;!-- The next four tags are DEPRECATED. For more information, see 
    <a rel="help" href="#fileNameSourceNames">File Name Extracts</a>. --&gt;
  &lt;preExtractRegex&gt;...&lt;/preExtractRegex&gt;
  &lt;postExtractRegex&gt;...&lt;/postExtractRegex&gt;
  &lt;extractRegex&gt;...&lt;/extractRegex&gt;
  &lt;columnNameForExtract&gt;...&lt;/columnNameForExtract&gt; 
  &lt;sortedColumnSourceName&gt;...&lt;/sortedColumnSourceName&gt; 
    &lt;!-- The <a rel="help" href="#sourceName">sourceName</a> of the numeric column that the data files are 
    usually already sorted by within each file, for example, "time".
    Don't specify this or use an empty string if no variable is 
    suitable. It is ok if not all files are sorted by this column.
    If present, this can greatly speed up some data requests. 
    For EDDTableFromHyraxFiles, EDDTableFromNcFiles and 
    EDDTableFromThreddsFiles, this must be the leftmost (first) axis variable. 
    EDDTableFromMultidimNcFiles ignores this because it has a better 
    system. --&gt;
  &lt;sortFilesBySourceNames&gt;...&lt;/sortFilesBySourceNames&gt;
    &lt;!-- This is a space-separated list of <a rel="help" href="#sourceName">sourceName</a>s 
    which specifies how the internal list of files should be sorted
    (in ascending order), for example "id time". 
    It is the minimum value of the specified columns in each file
    that is used for sorting.
    When a data request is filled, data is obtained from the files
    in this order. Thus it determines the overall order of the data
    in the response.  If you specify more than one column name, the
    second name is used if there is a tie for the first column; the
    third is used if there is a tie for the first and second 
    columns; ... This is OPTIONAL (the default is 
    fileDir+fileName order). --&gt;
  <!--&lt;isLocal&gt;false&lt;isLocal&gt; &lt;!- - (may be true or false, the default). This is only used by EDDTableFromNcCFFiles. It indicates if the files are local (actual files) or remote     (accessed via the web). The two types are treated slightly differently.-->  
  <a rel="help" href="#sourceNeedsExpandedFP_EQ">&lt;sourceNeedsExpandedFP_EQ&gt;</a>true(default)|false&lt;/sourceNeedsExpandedFP_EQ&gt;
  <a rel="help" href="#fileTableInMemory">&lt;fileTableInMemory&gt;</a>...&lt;/fileTableInMemory&gt; &lt;!-- 0 or 1 (true or 
    false (the default)) --&gt;
  <a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt;</a>...&lt;/cacheFromUrl&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#cacheFromUrl">&lt;cacheSizeGB&gt;</a>...&lt;/cacheSizeGB&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more --&gt;
    &lt;!-- For EDDTableFromHyraxFiles, EDDTableFromMultidimNcFiles, 
    EDDTableFromNcFiles, EDDTableFromNccsvFiles, and 
    EDDTableFromThreddsFiles, the source's axis variables (for 
    example, time) needn't be first or in any specific order. --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromAsciiService" href="#EDDTableFromAsciiService" rel="bookmark"><strong>EDDTableFromAsciiService</strong></a> 
is essentially a screen scraper.
It is intended to deal with data sources which have
a simple web service for requesting data (often an HTML form on a web page) 
and which can return the data in some structured ASCII format (for example, 
a comma-separated-value or columnar ASCII text format, 
often with other information before and/or after the data).  

<p>EDDTableFromAsciiService is the 
superclass of all EDDTableFromAsciiService... classes. 
You can't use EDDTableFromAsciiService directly. 
Instead, use a subclass of EDDTableFromAsciiService to handle specific types
of services:
  <ul>
  <li><a rel="help" href="#EDDTableFromAsciiServiceNOS">EDDTableFromAsciiServiceNOS</a> 
    gets data from NOAA NOS's ASCII services.
  </ul>
Currently, no other service types are supported. 
But it is usually relatively easy to support other services if they work in a similar way.
Contact us if you have a request.

<p><a class="selfLink" id="EDDTableFromAsciiService_Details" href="#EDDTableFromAsciiService_Details" rel="bookmark">Details</a> -- 
The following information applies to all of the subclasses of EDDTableFromAsciiService.
<ul>
<li>Constraints -- ERDDAP™ tabular data requests can put constraints on any variable. The underlying service may or may not allow constraints on all variables.
For example, many services only support constraints on station names, latitude,
longitude, and time. So when a subclass of EDDTableFromAsciiService gets a request 
for a subset of a dataset, 
it passes as many constraints as possible to the source data service
and then applies the remaining constraints to the data returned by the service,
before handing the data to the user.

<li>Valid Range -- Unlike many other dataset types, EDDTableFromAsciiService
usually doesn't know the range of data for each variable, so it can't quickly reject
requests for data outside of the valid range.

<li>Parsing the ASCII Text Response -- When EDDTableFromAsciiService gets a response
from an ASCII Text Service, it must validate that the response has the expected format and
information, and then extract the data. 
You can specify the format by using various special
tags in the chunk of XML for this dataset:
  <ul>
  <li><kbd>&lt;beforeData1&gt;</kbd> through <kbd>&lt;beforeData10&gt;</kbd> tags -- 
  You can use these tags to specify a series of pieces of text (as many as you want, up to 10) that
  EDDTableFromAsciiService must look for in the header of the ASCII text returned 
  by the service.
  For example, this is useful for verifying that the response includes
  the expected variables using the expected units.
  The last beforeData tag that you specify identifies the text that occurs
  right before the data starts.
  <li><kbd>&lt;afterData&gt;</kbd> -- 
  This specifies the text that EDDTableFromAsciiService will look for in the ASCII text returned by the service which signifies the end of the data.
  <li><kbd>&lt;noData&gt;</kbd> --   
  If EDDTableFromAsciiService finds this text in the ASCII text returned by the service, it concludes that there is no data which matches the request. 
  </ul>
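  <p>For example, for a (hypothetical) service whose response lists the 
  column names, then the data, then a closing line, the tags might be sketched as:
<pre>
&lt;beforeData1&gt;Station ID&lt;/beforeData1&gt;  &lt;!-- must appear in the header --&gt;
&lt;beforeData2&gt;Date Time&lt;/beforeData2&gt;  &lt;!-- last beforeData: the data starts right after this --&gt;
&lt;afterData&gt;End of data&lt;/afterData&gt;  &lt;!-- marks the end of the data --&gt;
&lt;noData&gt;No data was found&lt;/noData&gt;  &lt;!-- response text meaning "no matching data" --&gt;
</pre>
  All of the text values here are made up for illustration; use the text that 
  your service actually returns.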
<li><a class="selfLink" id="EDDTableFromAsciiServiceSkeletonXML" href="#EDDTableFromAsciiServiceSkeletonXML" rel="bookmark"><strong>The skeleton XML</strong>
  for all EDDTableFromAsciiService subclasses is:</a>
<pre>
&lt;dataset type="EDDTableFromAsciiService..." <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;sourceUrl&gt;...&lt;/sourceUrl&gt;  
  &lt;beforeData1&gt;...&lt;/beforeData1&gt; &lt;!-- 0 or 1 --&gt;
  ...
  &lt;beforeData10&gt;...&lt;/beforeData10&gt; &lt;!-- 0 or 1 --&gt;
  &lt;afterData&gt;...&lt;/afterData&gt; &lt;!-- 0 or 1 --&gt; 
  &lt;noData&gt;...&lt;/noData&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromAsciiServiceNOS" href="#EDDTableFromAsciiServiceNOS" rel="bookmark"><strong>EDDTableFromAsciiServiceNOS</strong></a> 
  makes EDDTable datasets from the ASCII text data services offered by NOAA's
  <a rel="bookmark" href="https://oceanservice.noaa.gov/"
    >National&nbsp;Ocean&nbsp;Service&nbsp;(NOS)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  For information on how this class works and how to use it, see 
  this class's superclass 
  <a rel="help" href="#EDDTableFromAsciiService">EDDTableFromAsciiService</a>.
  It is unlikely that anyone other than Bob Simons will need to use this subclass.

  <p>Since the data within the response from a NOS service uses a columnar 
  ASCII text format, data variables other than latitude and longitude
  must have a special attribute which specifies which characters of each data line 
  contain that variable's data, for example,
  <br><kbd>&lt;att name="responseSubstring"&gt;17, 25&lt;/att&gt;</kbd>
  <br>&nbsp;
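  <p>In context, such a variable might be sketched as follows. 
  The source name, destination name, data type, and character positions are hypothetical.
<pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;waterLevel&lt;/sourceName&gt;
  &lt;destinationName&gt;waterLevel&lt;/destinationName&gt;
  &lt;dataType&gt;float&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;att name="responseSubstring"&gt;17, 25&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>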


<!-- p><a class="selfLink" id="EDDTableFromMWFS" href="#EDDTableFromMWFS" rel="bookmark"><strong>EDDTableFromMWFS</strong></a> handles data from a 
<a rel="help" href="
WAS    https://csc-s-ial-p.csc.noaa.gov/DTL/DTLProjects/microwfs/microWFS.html">microWFS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> source.
<ul>
<li>This won't work for other WFS servers.
<li><a class="selfLink" id="EDDTableFromMWFSSkeletonXML" href="#EDDTableFromMWFSSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromMWFS dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromMWFS" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;! - - 0 or 1 - - &gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;! - - 0 or 1 - - &gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!- - 0 or more - -&gt;
  &lt;longitudeSourceMinimum&gt;...&lt;/longitudeSourceMinimum&gt; &lt;- - OPTIONAL - -&gt;
  &lt;longitudeSourceMaximum&gt;...&lt;/longitudeSourceMaximum&gt; &lt;- - OPTIONAL - -&gt;
  &lt;latitudeSourceMinimum&gt;...&lt;/latitudeSourceMinimum&gt; &lt;- - OPTIONAL - -&gt;
  &lt;latitudeSourceMaximum&gt;...&lt;/latitudeSourceMaximum&gt; &lt;- - OPTIONAL - -&gt;
  &lt;altitudeSourceMinimum&gt;...&lt;/altitudeSourceMinimum&gt; &lt;- - OPTIONAL - -&gt;
  &lt;altitudeSourceMaximum&gt;...&lt;/altitudeSourceMaximum&gt; &lt;- - OPTIONAL - -&gt;
  &lt;timeSourceMinimum&gt;...&lt;/timeSourceMinimum&gt; &lt;- - OPTIONAL, in yyyy-MM-dd'T'HH:mm:ssZ format - -&gt;
  &lt;timeSourceMaximum&gt;...&lt;/timeSourceMaximum&gt; &lt;- - OPTIONAL, in yyyy-MM-dd'T'HH:mm:ssZ format - -&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!- - 0 or 1 - -&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!- - 1 or more - -&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul -->

<p><a class="selfLink" id="EDDTableFromAllDatasets" href="#EDDTableFromAllDatasets" rel="bookmark"><strong>EDDTableFromAllDatasets</strong></a> is a higher-level dataset
which has information about all of the other datasets which are currently loaded
in your ERDDAP. 
Unlike other types of datasets, there is no specification for the allDatasets 
dataset in datasets.xml.
ERDDAP™ automatically creates one EDDTableFromAllDatasets dataset 
(with datasetID=allDatasets).
Thus, an allDatasets dataset will be created in each ERDDAP™ installation
and will work the same way in each ERDDAP™ installation.

<p>The allDatasets dataset is a tabular dataset.
It has a row of information for each dataset.
It has columns with information about each dataset, e.g., datasetID, accessible,
institution, title, minLongitude, maxLongitude, minLatitude, maxLatitude, minTime, maxTime,
etc. Because allDatasets is a tabular dataset, you can query it the same way you can 
query any other tabular dataset in ERDDAP™, and you can specify the file type
for the response. This lets users search for datasets of interest in very powerful ways.
<br>&nbsp;
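<p>As a rough illustration of querying allDatasets like any other tabular dataset, 
the following Python sketch builds a tabledap request URL. 
The base URL and the function name <kbd>all_datasets_url</kbd> are hypothetical; 
the column names are taken from the list above.

```python
from urllib.parse import quote

# Hypothetical ERDDAP base URL; substitute your own installation's address.
BASE_URL = "https://some.erddap.url/erddap"


def all_datasets_url(columns, constraints, file_type=".csv"):
    """Build a tabledap request URL for the allDatasets dataset.

    columns:     names of allDatasets columns to return.
    constraints: strings such as 'maxTime>=2016-01-01T00:00:00Z'.
    file_type:   the response file type, e.g. '.csv' or '.json'.
    """
    query = ",".join(columns)
    for constraint in constraints:
        # Percent-encode each constraint, leaving '=' readable.
        query += "&" + quote(constraint, safe="=")
    return BASE_URL + "/tabledap/allDatasets" + file_type + "?" + query


url = all_datasets_url(["datasetID", "title"], ["maxTime>=2016-01-01T00:00:00Z"])
```

<p>Changing <kbd>file_type</kbd> changes only the format of the response, not the query.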


<p><a class="selfLink" id="EDDTableFromAsciiFiles" href="#EDDTableFromAsciiFiles" rel="bookmark"
><strong>EDDTableFromAsciiFiles</strong></a> aggregates data from
  comma-, tab-, semicolon-, or space-separated tabular ASCII data files.
<ul>
<li>Most often, the files will have column names on the first row and data 
  starting on the second row. (Here, the first row of the file is called row number 1.)
  But you can use <kbd>&lt;columnNamesRow&gt;</kbd> and <kbd>&lt;firstDataRow&gt;</kbd> 
  in your datasets.xml file to
  specify a different row number.
<li>ERDDAP™ allows the rows of data to have different numbers of data values.
  ERDDAP™ assumes that the missing data values are the final columns in the row.
  ERDDAP™ assigns the standard missing values to the missing data values. (added v1.56)
<li>ASCII files are easy to work with, but they are not the most efficient way to store/retrieve data. 
  For greater efficiency, save the files as NetCDF v3 .nc files 
  (with one dimension, "row", shared by all variables) instead.
  You can 
  <a rel="help" href="#EDDTableFromFiles_MillionsOfFiles"
  >use ERDDAP™ to generate the new files</a>.
<li>See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.
<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  Because of the total lack of metadata in ASCII files, 
  you will always need to edit the results of GenerateDatasetsXml.
<li>WARNING: When ERDDAP™ reads ASCII data files, if it finds an error 
  on a given line (e.g., incorrect number of items), it logs a warning message 
  ("WARNING: Bad line(s) of data" ... with a list of the bad lines on subsequent lines) 
  to the 
  <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt file</a> 
  and then continues to read the rest of the data file.
  Thus, it is your responsibility to look periodically (or write a script to do so)
  for that message in the log.txt so that you can fix the problems in the data
  files. ERDDAP™ is set up this way so that users can continue to read all of
  the available valid data even though some lines of the file have flaws. 
  <br>&nbsp;
</ul>
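<p>The periodic check of log.txt mentioned in the warning above could be scripted 
along these lines. This is only a minimal sketch: the function name is hypothetical, 
and it relies on nothing more than the message text quoted above.

```python
# The message text quoted in the warning above.
MARKER = "WARNING: Bad line(s) of data"


def find_bad_line_warnings(log_text):
    """Return the 1-based line numbers in log_text that contain MARKER."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if MARKER in line
    ]


sample_log = "dataset loaded\nWARNING: Bad line(s) of data in some file\nline #7\n"
hits = find_bad_line_warnings(sample_log)
```

<p>In practice, you would pass in the contents of your own log.txt file 
and then inspect the lines that follow each reported line number.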


<p><a class="selfLink" id="EDDTableFromAwsXmlFiles" href="#EDDTableFromAwsXmlFiles" rel="bookmark"><strong>EDDTableFromAwsXmlFiles</strong></a> aggregates data from
  a set of Automatic Weather Station (AWS) XML data files using the 
  WeatherBug Rest XML API (which is no longer active).
<ul>
<li>This type of file is a simple but inefficient way to store the data, 
  because each file usually seems to contain the observation from just one time point.  
  So there may be a large number of files.  If you want to improve performance,
  consider consolidating groups of observations (a week's worth?) 
  in NetCDF v3 .nc files
  (best: .nc files with the 
  <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries">CF Discrete Sampling Geometries (DSG) Contiguous Ragged Array format<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>)
  and using <a href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> 
        (or <a href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>) 
  to serve the data.
  You can 
  <a rel="help" href="#EDDTableFromFiles_MillionsOfFiles"
  >use ERDDAP™ to generate the new files</a>.

<li>See this class' superclass, 
  <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, for information
  on how this class works and how to use it.
  <br>&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromColumnarAsciiFiles" href="#EDDTableFromColumnarAsciiFiles" rel="bookmark"><strong>EDDTableFromColumnarAsciiFiles</strong></a> 
  aggregates data from tabular ASCII data files with fixed-width columns.
<ul>
<li>Most often, the files will have column names on the first row and data 
  starting on the second row.
  The first line/row in the file is called row #1.
  But you can use <kbd>&lt;columnNamesRow&gt;</kbd> and <kbd>&lt;firstDataRow&gt;</kbd> 
  in your datasets.xml file to
  specify a different row number.

<li>The &lt;addAttributes&gt; for each &lt;dataVariable&gt; for these datasets
  MUST include these two special attributes:
  <ul>
  <li>&lt;att name="startColumn"&gt;<i>integer</i>&lt;/att&gt; -- 
    specifies the character column in each line that is the start of this data variable.
  <li>&lt;att name="stopColumn"&gt;<i>integer</i>&lt;/att&gt; -- 
    specifies the character column in each line that is 1 after the end of this data variable.
  </ul>
  The first character column is called column #0.
  <br>For example, for this file that has time values abutting temperature values :<pre>
  0         1         2        &lt;-- character column number 10's digit
  0123456789012345678901234567 &lt;-- character column number 1's digit
  time                temp
  2014-12-01T12:00:00Z12.3
  2014-12-02T12:00:00Z13.6
  2014-12-03T12:00:00Z11.0</pre>the time data variable would have <kbd>
  <br>&nbsp;&nbsp;&lt;att name="startColumn"&gt;0&lt;/att&gt;
  <br>&nbsp;&nbsp;&lt;att name="stopColumn"&gt;20&lt;/att&gt;  
  </kbd><br>and the temp data variable would have <kbd>
  <br>&nbsp;&nbsp;&lt;att name="startColumn"&gt;20&lt;/att&gt;
  <br>&nbsp;&nbsp;&lt;att name="stopColumn"&gt;24&lt;/att&gt;  
  </kbd>
  <br>These attributes MUST be specified for all variables except 
    <a rel="help" href="#fixedValue">fixed-value</a>
    and <a rel="help" href="#fileNameSourceNames">file-name-source-names</a>
    variables. 
<li>ASCII files are easy to work with, but they are not an efficient way to 
  store/retrieve data. 
  For greater efficiency, save the files as NetCDF v3 .nc files 
  (with one dimension, "row", shared by all variables) instead.
  You can 
  <a rel="help" href="#EDDTableFromFiles_MillionsOfFiles"
  >use ERDDAP™ to generate the new files</a>.
<li>See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.
<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  Because of the difficulty of determining the start and end positions for each
  data column and the total lack of metadata in ASCII files, 
  you will always need to edit the results from GenerateDatasetsXml.
  <br>&nbsp;
</ul>
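<p>The startColumn/stopColumn convention described above (0-based start, stop is 1 past 
the last character) is the same as a Python slice, so the sample file can be 
illustrated with a short sketch. The names <kbd>COLUMNS</kbd> and 
<kbd>parse_fixed_width</kbd> are hypothetical and are not part of ERDDAP™.

```python
# startColumn is 0-based and inclusive; stopColumn is 1 past the last character,
# which matches Python's slice convention.
COLUMNS = {"time": (0, 20), "temp": (20, 24)}


def parse_fixed_width(line, columns=COLUMNS):
    """Split one data line into a dict of stripped strings keyed by variable name."""
    return {name: line[start:stop].strip() for name, (start, stop) in columns.items()}


row = parse_fixed_width("2014-12-01T12:00:00Z12.3")
```

<p>Here, <kbd>row["time"]</kbd> holds the 20-character time string and 
<kbd>row["temp"]</kbd> holds the 4 characters that follow it.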


<!-- **************** -->
<p><a class="selfLink" id="EDDTableFromHttpGet" href="#EDDTableFromHttpGet" rel="bookmark"
><strong>EDDTableFromHttpGet</strong></a> is different 
from all other types of datasets in ERDDAP™ in that it has a system whereby 
specific "authors" can add data, revise data, or delete data from the dataset 
by regular HTTP GET or <a href="#HttpGetPost" rel="bookmark">POST</a>
requests from a computer program, a script or a browser. 
The dataset is queryable by users in the same way that all other EDDTable 
datasets are queryable in ERDDAP.
See the description of this class's superclass, 
<a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>,
to read about the features which are inherited from that superclass.
<p>The unique features of EDDTableFromHttpGet are described below. 
You need to read all of this initial section and understand it; otherwise, you may have
unrealistic expectations or get yourself into trouble that is hard to fix.

<ul>
<li><a class="selfLink" id="HttpGetIntendedUse" href="#HttpGetIntendedUse" rel="bookmark"
>Intended Use</a>
<br>This system is intended for:
  <ul>
  <li>Tabular (in situ) data, not gridded data.
  <li>Real time data -
    <br>The goal is to allow an author
      (e.g., the sensor, an automated QC script, or a specific human)
      to make a change to the dataset
      (via an <a href="#HttpGetInsertDelete" rel="help">.insert or .delete command</a>)
      and make that change accessible to ERDDAP™ users,
      all in less than 1 second, and possibly much faster.
      Most of that 1 second is network time. ERDDAP™ can process the
      request in about 1 ms and the data is immediately accessible to users.
      This is a <a href="#HttpGetSpeed" rel="help">fast</a>,
      <a href="#HttpGetRobust" rel="help">robust</a>, and
      <a href="#HttpGetSystemReliability" rel="help">reliable system</a>.
  <li>Almost any frequency of data -
    <br>This system can accept infrequent data (e.g., daily) through
      very frequent data (e.g., 100 Hz data). 
      If you optimize the system, it can handle higher frequency data
      (perhaps 10 kHz data if you go to extremes).
  <li>Data from one sensor or a collection of similar sensors.
  <li><a href="#HttpGetVersioning" rel="help">Versioning</a> / 
    <a rel="help" href="https://en.wikipedia.org/wiki/Reproducibility"
    >Reproducible Science<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a> 
    / DOIs -- 
    <br>Situations where you need to be able to make changes to the data
    (e.g., change a quality control flag), know which author made each change,
    know the timestamp of when the author made the change, 
    and (upon request) be able to see the original data from before the change was made.  
    Thus, these datasets are eligible for 
    <a rel="bookmark" href="https://en.wikipedia.org/wiki/Digital_object_identifier"
        >DOIs<img src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
    because they meet the DOI requirement
    that the dataset is unchanging, except by aggregation. In general,
    near real time datasets are not eligible for DOIs because the
    data is often retroactively changed (e.g., for QA/QC purposes).

    <br>&nbsp;
  </ul>
  Once data is in an EDDTableFromHttpGet dataset, any user can request
  data in the same way that they request data from any other EDDTable dataset.
  <br>&nbsp;


<li><a class="selfLink" id="HttpGetBeCareful" href="#HttpGetBeCareful" rel="bookmark"
>EXPERIMENTAL! Be Careful!</a>
<br>Since this system is new and since lost environmental data can't be reacquired,
you should treat EDDTableFromHttpGet as experimental. If you are transitioning from
another system, please run the old system and the new system in parallel 
until you are confident that the new system works well (weeks or months, not just hours or days).
In all cases, please make sure your system separately archives the .insert and .delete
URLs that are sent to the EDDTableFromHttpGet dataset 
(even if just in the Apache and/or Tomcat logs), at least for a while.
And in all cases, make sure that the data files created by your EDDTableFromHttpGet 
dataset are routinely backed up to external data storage devices. (Note that 
<a rel="help" href="https://en.wikipedia.org/wiki/Rsync">rsync<img 
  src="../images/external.png" alt=" (external link)" 
  title="This link to an external website does not constitute an endorsement."></a>
can back up the data files created by EDDTableFromHttpGet very efficiently.)
  <br>&nbsp;


<li><a class="selfLink" id="HttpGetInsertDelete" href="#HttpGetInsertDelete" rel="bookmark"
>.insert and .delete</a>
<br>For any dataset in ERDDAP™, 
when you send a request to ERDDAP™ for a subset of the data in a dataset,
you specify the file type that you want for the response, e.g.,
.csv, .htmlTable, .nc, .json.
EDDTableFromHttpGet extends this system to support two additional "file types"
which can insert (or change) or delete data in the dataset:
  <ul>
  <li>.insert
    <ul>
    <li>The request is formatted like a standard HTML form response, with 
       key=value pairs, separated by '&amp;'. For example, 
      <br><span class="N"><kbd>https://<i>some.erddap.url</i>/erddap/tabledap/myDataset<b>.insert</b>?stationID=46088&amp;time=2016-03-30T12:37:55Z&amp;latitude=10.1&amp;longitude=-150.1&amp;airTemp=17.23&amp;waterTemp=12.3&amp;author=JohnSmith_someKey1</kbd></span>
      <br>tells ERDDAP™ to add or change the data for stationID=46088 for the specified time.
    <li>The author of this change is JohnSmith and the key is <kbd>someKey1</kbd>.
    <li>The URL must include valid values (not missing values) for all of the 
      <a rel="help" href="#httpGetRequiredVariables">httpGetRequiredVariables</a>.
    <li>If the values of the httpGetRequiredVariables
      in the request (e.g., stationID and time) match the values on a row already 
      in the dataset, the new values
      effectively overwrite the old values (although the old values are still accessible
      if the user requests data from a previous 
      <a rel="help" href="#HttpGetVersioning">version</a> of the dataset).
    <li>The .insert URL must never include 
      <kbd>&amp;timestamp=</kbd> (ERDDAP™ generates that value) or 
      <kbd>&amp;command=</kbd> (that is specified by .insert (which is command=0) or .delete (which is command=1)).
    <li>If the .insert URL doesn't specify values for other columns that are in the 
      dataset, they are assumed to be the native missing values (MAX_VALUE for
      integer data types, NaN for floats and doubles, and "" for Strings).
      <br>&nbsp;
    </ul>
  <li>.delete
    <ul>
    <li>The request is formatted like a standard HTML form response, with 
       key=value pairs, separated by '&amp;'. For example, 
      <br><span class="N"><kbd>https://<i>some.erddap.url</i>/erddap/tabledap/myDataset<b>.delete</b>?stationID=46088&amp;time=2016-03-30T12:37:55Z&amp;author=JohnSmith_someKey1</kbd></span>
      <br>tells ERDDAP™ to delete the data for stationID=46088 at the specified time.
    <li>The author of this change is JohnSmith and the key is <kbd>someKey1</kbd>.
    <li>The URL must specify the 
      <a rel="help" href="#httpGetRequiredVariables">httpGetRequiredVariables</a>
      in the request (e.g., stationID and time).
      If those values match the values on a row already in the dataset 
      (which they usually will), the old values
      are effectively deleted (although the old values are still accessible
      if a user requests data from a previous 
      <a rel="help" href="#HttpGetVersioning">version</a> of the dataset).
    <li>There is no need to specify values for non-HttpGetRequiredVariables, other than
      <kbd>author</kbd>, which is needed to authenticate the request.
      <br>&nbsp;
    </ul>
  </ul>
  Details:
  <ul>
  <li>.insert and .delete requests are formatted like standard HTML form responses, with 
       key=value pairs, separated by '&amp;'.  
       The values must be 
        <a rel="help" href="https://en.wikipedia.org/wiki/Percent-encoding">percent encoded<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        Thus, you need to encode special characters into the form %HH, 
        where HH is the 2 digit hexadecimal value of the character.
        Usually, you just need to convert a few of the punctuation characters: % into %25, 
        &amp; into %26, " into %22, &lt; into %3C, = into %3D, &gt; into %3E, + into %2B,
        | into %7C, [ into %5B, ] into %5D, space into %20, 
        and convert all characters above #127 into their UTF-8 form and then percent encode
        each byte of the UTF-8 form into the %HH format (ask a programmer for help).
  <li>.insert and .delete requests must include the
    <a rel="help" href="#httpGetRequiredVariables">httpGetRequiredVariables</a>,
    e.g., stationID and time.
    For .insert requests, variables which are not specified in the request are assumed
    to be missing values (MAX_VALUE for integer variables, NaN for float and double variables,
    and an empty String for String variables).
    For .delete requests, values for non-HttpGetRequiredVariables 
    (other than <kbd>author</kbd>, which is required) are ignored.
  <li>.insert and .delete requests must include the name
    of the author and the author's key 
    via a parameter in the form
    <kbd>author=<i>author_key</i></kbd> as the last parameter in the request.
    Requiring this to be last ensures that the entire request has been received by ERDDAP.
    Only the author (not the key) will be stored in the data file.
    You must specify the list of allowed <kbd><i>author_key</i></kbd>'s 
    via the global attribute <a href="#httpGetKeys" rel="help">httpGetKeys</a>.
  <li>.insert and .delete parameters may be scalar (single) values or arrays
    of any length in the form <kbd>[value1,value2,value3,...,valueN]</kbd>.
    For a given request, all variables with arrays must have arrays with 
    the same number of values (else it is an error).
    If a request has scalar and array values, the scalar values are 
    replicated to become arrays with the same length as the specified arrays,
    e.g., <kbd>&amp;stationID=46088</kbd> might be treated as <kbd>&amp;stationID=[46088,46088,46088]</kbd>.
    Arrays are the key to 
    <a rel="help" href="#HttpGetSpeed">high throughput</a>.
    Without arrays, it will be challenging to .insert or .delete
    more than 8 rows of data per second from a remote author (because of all
    the overhead of the network).
    With arrays, it will be easy to .insert or .delete
    more than 1000 rows of data per second from a remote sensor.
  <li>.insert and .delete accept (without an error message) 
    floating point numbers when integers 
    are expected. In these cases, the dataset rounds the values to integers.
  <li>.insert and .delete accept (without an error message)
    integer and floating point numbers
    which are out-of-range of the variable's data type.
    In these cases, the dataset stores the values
    as ERDDAP's native missing values for that data type (MAX_VALUE for integer
    types and NaN for floats and doubles).
    <br>&nbsp;
  </ul>
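<p>To make the rules above concrete, here is a minimal Python sketch of how a client 
might assemble an .insert or .delete URL: values are percent encoded, lists are written 
in the array form, and <kbd>author</kbd> is always last. 
The function name and base URL are hypothetical. The sketch also assumes that 
ERDDAP™ percent-decodes each value before interpreting the array syntax, 
so encoding the commas inside an array is harmless.

```python
from urllib.parse import quote


def build_command_url(base, dataset_id, command, params, author_key):
    """Build an .insert or .delete URL with author=author_key as the last parameter.

    params maps variable names to a scalar or a list of values; a list is
    written in the [value1,value2,...] array form.
    """
    if command not in ("insert", "delete"):
        raise ValueError("command must be 'insert' or 'delete'")
    # All array parameters in one request must have the same length.
    lengths = {len(v) for v in params.values() if isinstance(v, (list, tuple))}
    if len(lengths) > 1:
        raise ValueError("all array parameters must have the same number of values")
    parts = []
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            text = "[" + ",".join(str(v) for v in value) + "]"
        else:
            text = str(value)
        # safe="" percent-encodes every reserved character, including [ ] , and :
        parts.append(name + "=" + quote(text, safe=""))
    # The author parameter must be the last one in the request.
    parts.append("author=" + quote(author_key, safe=""))
    return base + "/tabledap/" + dataset_id + "." + command + "?" + "&".join(parts)


url = build_command_url(
    "https://some.erddap.url/erddap", "myDataset", "insert",
    {"stationID": 46088, "time": "2016-03-30T12:37:55Z", "airTemp": [17.23, 17.5]},
    "JohnSmith_someKey1",
)
```

<p>Note that this request mixes a scalar with an array; per the rules above, 
the scalar values are replicated to match the array length, so this request 
is valid as long as the combination of required variables still identifies each row.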

<li><a class="selfLink" id="HttpGetResponse" href="#HttpGetResponse" rel="bookmark"
>Response</a>
<br>If the .insert or .delete URL succeeds, the HTTP response code will be 200 (OK)
and the response will be text with a .json object, e.g.,
<pre>{
"status":"success",
"nRowsReceived":1,
"stringTimestamp":"2018-11-05T22:12:19.517Z",
"numericTimestamp":1.541455939517E9
}</pre>
Note that the timestamps have millisecond precision.

<p>If the .insert or .delete URL fails, you will get an HTTP response 
code other than 200 (OK), e.g.,
Error 403 Forbidden if you submit an incorrect author_key value.
ERDDAP™ sends the HTTP response code (not, e.g., a .json formatted error)
because that's how things are done on the internet
and because errors can occur anywhere in the system (e.g., in the network,
which returns an HTTP error).
If the error is from ERDDAP™, the response may include some text (not .json) with a
more detailed explanation of what went wrong, but the HTTP response
code (200=OK, anything else is trouble) is the proper way to 
check if the .insert or .delete succeeded.
If checking the HTTP response code isn't possible or is inconvenient,
search for <kbd>"status":"success"</kbd> in the response text
which should be a reliable indication of success.
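<p>A client-side success check along these lines might look like the following sketch. 
The function name is hypothetical. It treats the HTTP response code as the primary 
check and falls back to the text search only if the body can't be parsed as a .json object.

```python
import json


def command_succeeded(status_code, body):
    """Return True if an .insert/.delete response indicates success.

    The HTTP status code is the primary check; the body is only
    inspected when the code is 200.
    """
    if status_code != 200:
        return False
    try:
        return json.loads(body).get("status") == "success"
    except (ValueError, AttributeError):
        # The body was not a JSON object; fall back to the text search.
        return '"status":"success"' in body


sample = (
    '{"status":"success","nRowsReceived":1,'
    '"stringTimestamp":"2018-11-05T22:12:19.517Z",'
    '"numericTimestamp":1.541455939517E9}'
)
ok = command_succeeded(200, sample)
```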

<li><a class="selfLink" id="HttpGetLogFiles" href="#HttpGetLogFiles" rel="bookmark"
>Log Files</a>
<br>When EDDTableFromHttpGet receives .insert and .delete commands, it
simply appends the information to the relevant file in a set of log files, 
each of which is a table stored in a 
<a rel="help"
        href="https://jsonlines.org/examples/"
        >JSON Lines CSV file<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>. 
When a user makes a request for data, ERDDAP™ quickly reads the relevant log files, 
applies the changes to the dataset in the order they were made, and then 
filters the request via the user's constraints like any other ERDDAP™ data request.
The partitioning of the data into various log files, the storage of 
various pieces of information (e.g., the timestamp of the command, and whether 
the command was .insert or .delete), and various aspects of the setup of the dataset,
all make it possible for ERDDAP
to store data to and retrieve data from this dataset very quickly and very efficiently.
<br>&nbsp;

<li><a class="selfLink" id="HttpGetSecurityAndAuthor" href="#HttpGetSecurityAndAuthor" rel="bookmark"
>Security and Author</a>
<br>Every .insert and .delete command must include <kbd>&amp;author=<i>author_key</i></kbd>
as the last parameter,
where author_key is composed of the author's identifier (you choose: name, initials,
pseudonym, number), an underscore, and a secret key. The ERDDAP™ administrator will work with authors
to generate the list of valid author_key values, which can be changed at any time.

<br>When EDDTableFromHttpGet receives an .insert or .delete command,
it makes sure that the author_key is the last parameter and valid. 
Because it is the last parameter, it indicates that the entire command line 
reached ERDDAP™ and wasn't truncated.
The secret key ensures that only specific authors may insert or delete
data in the dataset.

ERDDAP™ then extracts the authorID and saves that in the author variable, so that 
anyone can see who was responsible for a given change to the dataset.

<br>.insert and .delete commands can only be made via https: (secure) ERDDAP™ URLs.
This ensures that the information being transferred is kept secret during transit.
<br>&nbsp;

<li><a class="selfLink" id="HttpGetTimestamp" href="#HttpGetTimestamp" rel="bookmark"
>timestamp</a>
<br>As part of the log system, EDDTableFromHttpGet adds a timestamp (the time that ERDDAP
received the request) to each command that it stores in the log files.
Because ERDDAP™ generates the timestamp, not the authors,  
it doesn't matter if different authors are making changes from computers 
with clocks set to slightly different times. The timestamp reliably indicates 
the time when the change was made to the dataset.
<br>&nbsp;


<li><a class="selfLink" id="HttpGetPost" href="#HttpGetPost" rel="bookmark"
>"What about HTTP POST?!"</a>
<br>HTTP <a rel="help" href="https://en.wikipedia.org/wiki/POST_(HTTP)">POST<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>
is the better alternative (compared to HTTP GET) for sending information from a client to an HTTP server.
If you can, or if you really want to improve security, use POST instead of GET to 
send the information to ERDDAP. 
POST is more secure because: with GET and https, the URL is transmitted in a secure way,
but the entire URL (including parameters, including the author_key) will be 
written to the Apache, Tomcat, and ERDDAP™ log files, where someone could read them
if the files are not properly secured.
With POST, the parameters are transmitted in a secure way and aren't written
to the log files. 
POST is a little harder for clients to work with and isn't supported as widely by client software,
but programming languages do support it. 
The content that you send to the dataset via GET or POST will be the
same, just formatted in a different way. 
<br>&nbsp;

<li><a class="selfLink" id="httpGetRequiredVariables" href="#httpGetRequiredVariables" rel="bookmark"
><kbd>httpGetRequiredVariables</kbd> Global Attribute</a>
<br>An essential part of what makes this whole system work is the required global attribute
<kbd>httpGetRequiredVariables</kbd>, which is a comma-separated list of the dataVariable source names
which uniquely identify a row of data.
This should be as minimal as possible and will almost always include the time variable.
For example, here are the recommended httpGetRequiredVariables for each of the 
<a rel="help"
  href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
  >CF Discrete Sampling Geometries (DSG)<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>
    (Of course, the ID names may be different in your dataset.):
  <ul>
  <li>For TimeSeries: stationID, time
  <li>For Trajectory: trajectoryID, time
  <li>For Profile: time (assuming time is the profile_id), depth
  <li>For TimeSeriesProfile: stationID, time (assuming time is the profile_id), depth 
  <li>For TrajectoryProfile: trajectoryID, time (assuming time is the profile_id), depth
  </ul>
<br>Taking TimeSeries as an example: 
<br>Given a .insert command that includes stationID=46088 and 
time=2016-06-23T19:53:00Z (and other values for other variables):
  <ul>
  <li>If there is no existing data for that station and that time, then the effect
  will be to add the data to the dataset.
  <li>If there is existing data for that station and that time, then the effect
  will be to replace the existing row of data with this new data. (Of course,
  since ERDDAP™ keeps the log of every command it receives, the old data is 
  still in the log. If a user requests data from a version of the dataset
  before this change, they will see the older data.)
  <br>&nbsp;
  </ul>
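<p>The add-or-replace behavior described above, and the ability to see an earlier 
version, can be pictured with a small conceptual sketch. 
This is NOT ERDDAP's actual implementation; the function name <kbd>replay</kbd> 
and the in-memory list of rows are assumptions made purely for illustration. 
The <kbd>timestamp</kbd> and <kbd>command</kbd> columns (0=insert, 1=delete) 
are as described elsewhere in this section.

```python
def replay(log_rows, required, as_of=None):
    """Rebuild the dataset from command rows, applied in timestamp order.

    log_rows: dicts with the data columns plus 'timestamp' and 'command'
              (0 = insert, 1 = delete).
    required: names of the httpGetRequiredVariables, e.g. ("stationID", "time").
    as_of:    if given, commands with a later timestamp are ignored.
    """
    current = {}
    for row in sorted(log_rows, key=lambda r: r["timestamp"]):
        if as_of is not None and row["timestamp"] > as_of:
            break
        # The required variables together identify one row of data.
        key = tuple(row[name] for name in required)
        if row["command"] == 0:
            current[key] = row      # add a new row or replace the existing one
        else:
            current.pop(key, None)  # delete the row if it exists
    return list(current.values())


log = [
    {"stationID": "46088", "time": "2016-06-23T19:53:00Z", "airTemp": 17.0,
     "timestamp": 1.0, "command": 0},
    {"stationID": "46088", "time": "2016-06-23T19:53:00Z", "airTemp": 18.0,
     "timestamp": 2.0, "command": 0},
    {"stationID": "46088", "time": "2016-06-23T19:53:00Z",
     "timestamp": 3.0, "command": 1},
]
required = ("stationID", "time")
```

<p>Replaying the whole log yields no rows (the row was deleted last), 
while replaying with <kbd>as_of=2.0</kbd> yields the replaced row 
and <kbd>as_of=1.0</kbd> yields the original row.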

<li><a class="selfLink" id="httpGetDirectoryStructure" href="#httpGetDirectoryStructure" rel="bookmark"
><kbd>httpGetDirectoryStructure</kbd> Global Attribute and Data (Log) File Names</a>
<br>Part of what makes this whole system work efficiently is that ERDDAP™ creates
a set of data (log) files, each with a different chunk of the dataset. 
If these are set up well, ERDDAP™ will be able to respond quickly to 
most requests for data.
This setup is specified by the <kbd>httpGetDirectoryStructure</kbd> global attribute,
which is a String that looks like a relative filename,
e.g., "stationID/10years",
but is actually a specification for the directory structure.
The parts of that specification indicate how the directory names and filenames 
for the data (log) files will be constructed.
  <ul>
  <li>If a part is an integer (&gt;= 1) plus a timePeriod 
    (millisecond, second, minute,
    hour, date, month, year, or their plurals), e.g., 10years, then 
    the EDDTableFromHttpGet dataset will take the time value for the row of data
    (e.g., 2016-06-23T19:53:00Z), calculate the time truncated to that precision
    (e.g., 2010), and make a folder or fileName from that.
    <p>The goal is to get a reasonably large chunk of data into each file,
      but far less than 2GB.
      
  <li>Otherwise, the part of the specification must be a dataVariable's sourceName,
    e.g., stationID.
    In this case, EDDTableFromHttpGet will make a folder or filename
    from the value of that variable for the new row of data (e.g., "46088").
  </ul>
Because the .insert and .delete command data is stored in specific data (log) files,
EDDTableFromHttpGet usually only needs to open one or a few data (log) files to find the
data for a given user request. And because each data (log) file has all of the
relevant information for its chunk of the dataset, it is fast and easy
for EDDTableFromHttpGet to make a specific version (or the current version)
of the dataset for the data in that file (and not have to generate the requested 
version of the entire dataset).

<p>General guidelines are based on the
  quantity and frequency of the data.
  If we assume 100 bytes per row of data, then ...
  <table class="erd" style="border-collapse:collapse;">
  <tr> <th>Frequency<br>of measurements</th> <th>Recommended<br><kbd>httpGetDirectoryStructure</kbd></th> </tr>
  <tr> <td>&gt;=1 per second</td> <td><i>featureID</i>/1year/1day</td> </tr>
  <tr> <td>&gt;=1 per minute</td> <td><i>featureID</i>/2months</td> </tr>
  <tr> <td>&gt;=1 per hour</td>   <td><i>featureID</i>/10years</td> </tr>
  <tr> <td>&gt;=1 per day</td>    <td><i>featureID</i></td> </tr>
  </table>

<p>For example, if the directory structure is <kbd>stationID/2months</kbd>
and you insert data from two stations (46088 and 46155) with time values 
from Dec 2015 through May 2016, EDDTableFromHttpGet will create directories
named 46088 and 46155 and create files in each named 
2015-11.jsonl, 2016-01.jsonl, 2016-03.jsonl, 2016-05.jsonl (each holding 2 months worth
of data for the relevant station).
At any time in the future, 
if you use .insert or .delete to change or delete the data for,
for example, station 46088 at 2016-04-05T14:45:00Z,
EDDTableFromHttpGet will append that command to 46088/2016-03.jsonl,
the relevant data (log) file.
And clearly, it is fine to add data for other stations at any time in the future,
since the dataset will simply create additional directories
as needed to hold the data from the new stations.
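<p>The file naming in this example can be reproduced with a short sketch. 
It assumes that the <i>n</i>-month periods are counted within the calendar year 
starting at January, which is consistent with the file names listed above 
but is not confirmed as ERDDAP's exact rule. The function name is hypothetical.

```python
from datetime import datetime


def log_file_path(station_id, time_iso, n_months):
    """Return the relative log file name for a 'stationID/<n>months' structure.

    Assumption: months are grouped within the calendar year starting at
    January, which reproduces the 2-month file names in the example above.
    """
    t = datetime.strptime(time_iso, "%Y-%m-%dT%H:%M:%SZ")
    # Truncate the month to the first month of its n-month period.
    first_month = (t.month - 1) // n_months * n_months + 1
    return "{}/{:04d}-{:02d}.jsonl".format(station_id, t.year, first_month)


path = log_file_path("46088", "2016-04-05T14:45:00Z", 2)
```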

<li><a class="selfLink" id="httpGetKeys" href="#httpGetKeys" rel="bookmark"
>httpGetKeys</a>
<br>Every EDDTableFromHttpGet dataset must have a global attribute <kbd>httpGetKeys</kbd>
which specifies the list of allowed authors and their secret keys
as a comma-separated list of <kbd><i>author_key</i></kbd>, e.g., 
<kbd>JohnSmith_someKey1, HOBOLogger_someKey2, QCScript59_someKey3</kbd> .
  <ul>
  <li><kbd>author_key</kbd>'s are case-sensitive and must be entirely ASCII 
    characters (#33 - #126, and without any comma, " or ' characters).
  <li>Keys are like passwords, so they MUST be &gt;=8 characters, hard to guess,
    and without internal dictionary words. You should treat them as you would 
    treat passwords -- keep them private.
  <li>The first '_' character separates the author from the key, so the 
    author name can't include a '_' character (but a key can).
  <li>Any given author can have one or more <kbd>author_key</kbd>'s, e.g.,
    JohnSmith_someKey1, JohnSmith_someKey7, etc.
  <li>You can change the value of this attribute any time. The changes take
    effect the next time the dataset is loaded.
  <li>This information will be removed from the dataset's globalAttributes
    before it is made public.
  <li>Each request to the dataset to insert or delete data must include an 
    <kbd>&amp;author=<i>author_key</i></kbd> parameter. After verifying
    the validity of the key, ERDDAP™ only saves the author part (not the key) 
    in the data file.
  </ul>

</ul>
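<p>The character and length rules above can be checked mechanically before you put a value into <kbd>httpGetKeys</kbd>. The following is a minimal sketch, not ERDDAP's own validation code; the function names are hypothetical, and it does not attempt to judge whether a key is "hard to guess" or free of dictionary words.

```python
FORBIDDEN = {",", '"', "'"}  # characters the rules above exclude


def split_author_key(author_key):
    """Split one author_key at the first '_' and check the rules listed above."""
    for ch in author_key:
        # Allowed: ASCII #33 - #126, minus comma and both quote characters.
        if not (33 <= ord(ch) <= 126) or ch in FORBIDDEN:
            raise ValueError("invalid character %r in author_key" % ch)
    # The first '_' separates the author from the key (the key may contain '_').
    author, separator, key = author_key.partition("_")
    if not separator or not author:
        raise ValueError("expected the form author_key")
    if len(key) < 8:
        raise ValueError("the key must be at least 8 characters long")
    return author, key


def parse_http_get_keys(attribute_value):
    """Map each author to the set of keys listed for that author."""
    keys = {}
    for item in attribute_value.split(","):
        author, key = split_author_key(item.strip())
        keys.setdefault(author, set()).add(key)
    return keys
```

Because an author can have more than one key, the sketch stores a set of keys per author rather than a single value.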


<h3><a class="selfLink" id="HttpGetSetUp" href="#HttpGetSetUp" rel="bookmark"
>Set Up</a></h3>
Here are the recommended steps for setting up an EDDTableFromHttpGet dataset:
<ol>
<li>Make the main directory to hold this dataset's data. 
  For this example, let's use /data/testGet/ .
  The user running GenerateDatasetsXml and the user running ERDDAP™ must both have
  read-write access to this directory.
  <br>&nbsp;
<li>Use a text editor to make a sample .jsonl CSV file with the extension
  <kbd>.jsonl</kbd> in that directory.
  <br>The name isn't important. For example, you could call it <kbd>sample.jsonl</kbd>
  <br>Make a 2 line .jsonl CSV file, with column names on the
  first line and dummy/typical values (of the correct data type) on the second line. 
  Here is a sample file
  that is suitable for a collection of featureType=TimeSeries data that measured
  air and water temperature.
  <br>[For featureType=Trajectory, you might change <kbd>stationID</kbd> to be <kbd>trajectoryID</kbd>.]
  <br>[For featureType=Profile, you might change <kbd>stationID</kbd> to be <kbd>profileID</kbd> 
     and add a <kbd>depth</kbd> variable.]
<pre>
["stationID", "time", "latitude", "longitude", "airTemp", "waterTemp", "timestamp", "author", "command"]
["myStation", "2018-06-25T17:00:00Z", 0.0, 0.0, 0.0, 0.0, 0.0, "SomeBody", 0]
</pre>
  Note:
  <ul>
  <li>The actual data values don't matter because you will eventually delete this file,
    but they should be of the correct data type. Notably, the time variable 
    should use the same format that the actual data from the source will use.
  <li>For all variables, the sourceName will equal the destinationName, so use
    the correct/final variable names now, including <kbd>time, latitude, longitude</kbd>
    and sometimes <kbd>depth</kbd> or <kbd>altitude</kbd> if variables with that information
    will be included.
  <li>There will almost always be a variable named <kbd>time</kbd> which 
    records the time the observation was made. It can be
    dataType <kbd>String</kbd> with 
    <a rel="help" href="#stringTimeUnits">units suitable for string times</a>
    (e.g., <kbd>yyyy-MM-dd'T'HH:mm:ss.SSSZ</kbd>)
    or dataType <kbd>double</kbd> with 
    <a rel="help" href="#timeUnits">units suitable for numeric times</a>
    (e.g., <kbd>seconds since 1970-01-01T00:00:00Z</kbd>, or some other base time).
  <li>Three of the columns (usually the last three) must be <kbd>timestamp, author, command</kbd>.
  <li>The <kbd>timestamp</kbd> column will be used by EDDTableFromHttpGet to add a timestamp
     indicating when it added a given line of data to the data file. 
     It will have dataType <kbd>double</kbd> and units <kbd>seconds since 1970-01-01T00:00:00Z</kbd>.
  <li>The <kbd>author</kbd> column with dataType <kbd>String</kbd> will be used to record which 
    authorized author provided this line's data. Authorized authors are specified
    by the <a rel="help" href="#httpGetKeys"><kbd>httpGetKeys</kbd> global attribute</a>.
    Although the keys are specified as <kbd><i>author_key</i></kbd> and are in the 
    "request" URL in that form, only the author part is saved in the data file.
  <li>The <kbd>command</kbd> column with dataType <kbd>byte</kbd> will indicate 
    if the data on this line is an insertion (0) or a deletion (1).
  <br>&nbsp;
  </ul>

<li>Run GenerateDatasetsXml and tell it
   <ol>
   <li>The dataset type is <kbd>EDDTableFromHttpGet</kbd>
   <li>The directory is (for this example) <kbd>/data/testGet/</kbd>
   <li>The sample file is (for this example) <kbd>/data/testGet/sample.jsonl</kbd>
   <li>The httpGetRequiredVariables are (for this example) <kbd>stationID, time</kbd>
     See the description of 
     <a rel="help" href="#httpGetRequiredVariables">httpGetRequiredVariables</a> below.
   <li>If data is collected every 5 minutes, the httpGetDirectoryStructure for this example 
     is <kbd>stationID/2months</kbd> . See the description of 
     <a rel="help" href="#httpGetDirectoryStructure">httpGetDirectoryStructure</a> below.
   <li>The <a rel="help" href="#httpGetKeys">httpGetKeys</a>
   </ol>
  Add the output (the chunk of datasets.xml for the dataset) to datasets.xml.
  <br>&nbsp;

<li>Edit the datasets.xml chunk for this dataset to make it correct and complete.
  <br>Notably, replace all the <kbd>???</kbd> with correct content.
  <br>&nbsp;
<li>For the &lt;fileTableInMemory&gt; setting:
  <ul>
  <li>Set this to <kbd>true</kbd> if the dataset will usually get frequent 
    .insert and/or .delete requests (e.g., more often than once every 10 seconds).
    This helps EDDTableFromHttpGet respond faster to .insert and/or .delete requests. 
    If you set this to <kbd>true</kbd>, EDDTableFromHttpGet
    will still save the fileTable and related information to disk periodically
    (as needed, roughly every 5 seconds).
  <li>Set this to <kbd>false</kbd> (the default) if the dataset will usually get infrequent 
    .insert and/or .delete requests (e.g., less than once every 10 seconds).
    <br>&nbsp;
  </ul>
<li>Note: It is possible to use <kbd>&lt;cacheFromUrl&gt;</kbd> and related
  settings in datasets.xml for EDDTableFromHttpGet datasets
  as a way to make and maintain a local copy of a remote EDDTableFromHttpGet dataset
  on another ERDDAP.
  However, in this case, this local dataset will reject any .insert and .delete
  requests.
</ol>
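<p>If you prefer a script to a text editor for the sample file in step 2, something like the following sketch would do. It writes the same two lines shown in that step; the function name is hypothetical, and the demonstration writes to a temporary directory as a stand-in for <kbd>/data/testGet/</kbd>.

```python
import json
import os
import tempfile

COLUMNS = ["stationID", "time", "latitude", "longitude", "airTemp", "waterTemp",
           "timestamp", "author", "command"]
# Dummy values of the correct data types; the file is deleted once real data arrives.
SAMPLE_ROW = ["myStation", "2018-06-25T17:00:00Z", 0.0, 0.0, 0.0, 0.0, 0.0, "SomeBody", 0]


def write_sample_file(directory, name="sample.jsonl"):
    """Write a 2-line JSON Lines CSV file: column names, then one row of values."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for row in (COLUMNS, SAMPLE_ROW):
            f.write(json.dumps(row) + "\n")
    return path


# Demonstration in a temporary directory (a stand-in for /data/testGet/).
demo_dir = tempfile.mkdtemp()
sample_path = write_sample_file(demo_dir)
```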

<h3><a class="selfLink" id="HttpGetUsing" href="#HttpGetUsing" rel="bookmark"
>Using EDDTableFromHttpGet Datasets</a></h3>

<ul>
<li>Authors can make "requests" which 
  <a href="#HttpGetInsertDelete" rel="bookmark">insert data to or delete data from the dataset</a>.
  <br>&nbsp;

<li>After real data has been inserted into the dataset, you can and should 
  delete the original sample data file.
  <br>&nbsp;

<li>Users can request data from the dataset as they do for any other EDDTable dataset in ERDDAP.
  If the request doesn't include a constraint on the timestamp column,
  then the request gets data from the current version of the dataset (the log file
  after processing all of the insertion and deletion commands and re-sorting by the 
  <kbd>httpGetRequiredVariables</kbd>).
  <br>&nbsp;

<li>Users can also make requests which are specific to EDDTableFromHttpGet datasets:
  <ul>
  <li>If the request includes a &lt; or &lt;= constraint of the timestamp column,
  then ERDDAP™ processes rows of the log file up until the specified timestamp.
  In effect, this temporarily deletes all of the changes made to the dataset since that timestamp value. 
  For more info, see <a rel="help" href="#HttpGetVersioning">Versioning</a>.

  <li>If the request includes a &gt;, &gt;=, or = constraint of the timestamp column, 
  e.g., <kbd>&amp;timestamp&gt;=0</kbd>, then ERDDAP™ returns the data from 
  the data files as is, without processing the insertion and deletion commands.
  </ul>

<li>In the future, we envision that tools will be built (by us? by you?) for
  working with these datasets. For example, there could be a script which 
  reads the raw log files, applies a different calibration equation,
  and generates/updates a different dataset with that derived information. 
  Note that the script can get
  the original data via a request to ERDDAP™ (which gets the data in the file
  format which is easiest for the script to work with) 
  and generate/update the new dataset via .insert "requests" to ERDDAP. 
  The script doesn't need direct access to the
  data files; it can be on any authorized author's computer.
  <br>&nbsp;

</ul>
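<p>The three kinds of read request described above differ only in the constraint on the timestamp column. The sketch below builds such request URLs for a client script. The host <kbd>https://example.com/erddap</kbd>, the datasetID <kbd>testGet</kbd> and the function name are hypothetical; the <kbd>/tabledap/<i>datasetID</i>.<i>fileType</i>?<i>query</i></kbd> layout is the usual tabledap request form, and the &lt; and &gt; characters are percent-encoded.

```python
from urllib.parse import quote


def tabledap_url(base_url, dataset_id, variables, constraints=(), file_type=".csv"):
    """Build a tabledap data request URL with percent-encoded constraints."""
    query = ",".join(variables)
    for name, operator, value in constraints:
        # '<' and '>' become %3C and %3E; '=' and ':' are left as they are.
        query += "&" + name + quote(operator, safe="=") + quote(str(value), safe=":")
    return "%s/tabledap/%s%s?%s" % (base_url, dataset_id, file_type, query)


BASE = "https://example.com/erddap"   # hypothetical ERDDAP installation
VARS = ["stationID", "time", "airTemp"]

# Current version of the dataset: no constraint on timestamp.
current = tabledap_url(BASE, "testGet", VARS)
# The dataset as it was at a point in time: timestamp<=...
as_of = tabledap_url(BASE, "testGet", VARS, [("timestamp", "<=", "2016-06-23T16:32:22.128Z")])
# The raw log, without processing the insertion and deletion commands: timestamp>=0
raw_log = tabledap_url(BASE, "testGet", VARS + ["timestamp", "author", "command"],
                       [("timestamp", ">=", 0)])
```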

<h3>Detailed Information about EDDTableFromHttpGet</h3>

The topics are:
<ul>
<li><a href="#HttpGetDontChangeThings" rel="bookmark"  >DON'T change the setup!</a>
<li><a href="#HttpGetCRUD" rel="bookmark"              >CRUD</a>
<li><a href="#HttpGetInvalidRequests" rel="bookmark"   >InvalidRequests</a>
<li><a href="#HttpGetSpeed" rel="bookmark"             >Speed</a> 
<li><a href="#HttpGetRobust" rel="bookmark"            >Robust</a>
<li><a href="#HttpGetSystemReliability" rel="bookmark" >System Reliability</a>
<li><a href="#HttpGetJsonLinesCsv" rel="bookmark"      >"Why JSON Lines CSV files?!"</a>
<li><a href="#HttpGetVersioning" rel="bookmark"        >Versioning</a>
<li><a href="#HttpGetPut" rel="bookmark"               >"What about HTTP PUT and DELETE?!"</a>
<li><a href="#HttpGetNotes" rel="bookmark"             >Notes</a>
<li><a href="#HttpGetThanks" rel="bookmark"            >Thanks to CHORDS for the basic idea.</a>
</ul>

Here is the detailed information:
<ul>
<li><a class="selfLink" id="HttpGetDontChangeThings" href="#HttpGetDontChangeThings" rel="bookmark"
>DON'T change the setup!</a>
<br>Once the dataset has been created and you have added data to it:
  <ul>
  <li>DON'T add or remove any dataVariables.
  <li>DON'T change the sourceName or destinationName of the dataVariables.
  <li>DON'T change the dataType of the dataVariables.
      But you can change the dataVariable's metadata.
  <li>DON'T change the <kbd>httpGetRequiredVariables</kbd> global attribute.
  <li>DON'T change the <kbd>httpGetDirectoryStructure</kbd> global attribute.
  </ul>
If you need to change any of these things, make a new dataset and transfer all of the 
data to the new dataset.
  <br>&nbsp;


<li><a class="selfLink" id="HttpGetCRUD" href="#HttpGetCRUD" rel="bookmark"
>CRUD</a>
<br>In computer science, the four fundamental commands for working with a 
dataset are 
<a rel="help" href="https://en.wikipedia.org/wiki/Create,_read,_update_and_delete"
    >CREATE, READ, UPDATE, DELETE (CRUD)<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>.
SQL, the language for working with relational databases, has the equivalent in
INSERT, SELECT, UPDATE, and DELETE.
In EDDTableFromHttpGet, 
<ul>
<li>.insert is a combination of CREATE and UPDATE.
<li>.delete is DELETE.
<li>The regular system for requesting subsets of data is READ.
</ul>
Thus, EDDTableFromHttpGet supports all of the fundamental commands for working with a dataset.
<br>&nbsp;

  <li>.insert or .delete requests with no errors will return HTTP status code=200
    and a JSON object, e.g., <pre>
{
"status":"success",
"nRowsReceived":1,
"stringTimestamp":"2018-03-26T15:34:05.552Z",
"numericTimestamp":1.522078445552E9
}
</pre>The two timestamp values refer to the same millisecond, which is the 
    millisecond that will be stored in the <kbd>timestamp</kbd> variable
    for the rows of data that were inserted or deleted.
    ERDDAP™ won't change the name and formatting of these key-value pairs in the future. 
    ERDDAP™ may add additional key-value pairs to the JSON object in the future.
    <br>&nbsp;
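    <p>A client can use this JSON object to confirm that a request was accepted. Here is a minimal sketch (the function name is hypothetical) that parses the reply shown above, checks the status, and verifies that the two timestamp values refer to the same millisecond.

```python
import json
from datetime import datetime, timezone


def parse_insert_response(body):
    """Return (nRowsReceived, numericTimestamp, timestamps_agree) for a successful reply."""
    reply = json.loads(body)
    if reply.get("status") != "success":
        raise ValueError("request was not successful: %r" % reply)
    numeric = reply["numericTimestamp"]          # seconds since 1970-01-01T00:00:00Z
    parsed = datetime.strptime(reply["stringTimestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Allow for floating point rounding: agree to within half a millisecond.
    agree = abs(parsed.timestamp() - numeric) < 0.0005
    return reply["nRowsReceived"], numeric, agree


example_body = """{
"status":"success",
"nRowsReceived":1,
"stringTimestamp":"2018-03-26T15:34:05.552Z",
"numericTimestamp":1.522078445552E9
}"""
rows_received, numeric_timestamp, timestamps_agree = parse_insert_response(example_body)
```

Since additional key-value pairs may be added in the future, the sketch reads only the keys it needs and ignores the rest.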

  <li><a class="selfLink" id="HttpGetInvalidRequests" href="#HttpGetInvalidRequests" rel="bookmark"
>InvalidRequests</a>
    <br>Invalid .insert or .delete requests will return an HTTP error status code 
    other than status=200 and no change will be made to the dataset.   
    This includes requests with incorrect author information, incorrect
    variable names, different array lengths for different variables, 
    missing required variables, missing required variable values,
    etc. If the request involves more than one data file, it is possible
    that part of the request will succeed and part will fail.
    However this shouldn't be a problem if the sensor sending the request treats 
    any failure as a complete failure. For example, if you tell ERDDAP™ to 
    insert (or delete) the same data twice in a row, the worst case is that
    that information is stored twice, close together in the log file.
    It is hard to see how that could cause trouble.
    <br>&nbsp;

  <li><a class="selfLink" id="HttpGetSpeed" href="#HttpGetSpeed" rel="bookmark"
>Speed</a> 
   <br>For .insert or .delete requests (not counting HTTP overhead),
    ballpark figures for the speed are 
    <br>1ms per .insert with 1 row of data
    <br>2ms per .insert with 10 rows of data in arrays ([])
    <br>3ms per .insert with 100 rows of data in arrays ([])
    <br>13ms per .insert with 1000 rows of data in arrays ([])
    <br>Clearly arrays are the key to 
    <a rel="help" href="#HttpGetSpeed">high throughput</a>.
    Without arrays, it will be challenging to .insert or .delete
    more than 8 rows of data per second from a remote author (because of all
    the overhead of the network).
    With arrays, it will be easy to .insert or .delete
    more than 1000 rows of data per second from a remote sensor.

    <p>With very large amounts of data per request, you will hit Tomcat's
    limit to the maximum query length (default is 8KB?), but that can be 
    increased by editing the maxHttpHeaderSize setting in your 
    <i>tomcat</i>/conf/server.xml's HTTP/1.1 Connector entry.
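    <p>To illustrate the array approach, here is a sketch that packs several rows into one .insert request and estimates whether the URL stays under the 8KB figure mentioned above. The host, datasetID and function names are hypothetical, the URL layout (repeated <i>name=value</i> pairs, array values in [], <kbd>author</kbd> placed last) is an assumption to be checked against the description of .insert requests on this page, and string values are not quoted or escaped here. URL length is only a rough proxy for Tomcat's limit, which also counts the other request headers.

```python
from urllib.parse import quote


def insert_url(base_url, dataset_id, columns, author_key):
    """columns maps each variable name to a list of values (one entry per row)."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError("every variable needs the same number of values")
    parts = []
    for name, values in columns.items():
        array = "[" + ",".join(str(v) for v in values) + "]"
        parts.append(name + "=" + quote(array, safe=":,"))   # '[' -> %5B, ']' -> %5D
    parts.append("author=" + quote(author_key, safe=""))
    return "%s/tabledap/%s.insert?%s" % (base_url, dataset_id, "&".join(parts))


def probably_fits(url, limit_bytes=8 * 1024):
    """Rough check against the (assumed) default 8KB limit; leave room for other headers."""
    return len(url.encode("ascii")) < limit_bytes


url = insert_url(
    "https://example.com/erddap", "testGet",
    {"stationID": ["46088", "46088"],
     "time": ["2016-06-23T19:53:00Z", "2016-06-23T19:58:00Z"],
     "airTemp": [14.2, 14.3]},
    "JohnSmith_someKey1")
```

If <kbd>probably_fits</kbd> returns False, a sender could split the rows into smaller batches rather than raise the Tomcat limit.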

    <p>When ERDDAP™ reads the JSON Lines CSV 
     data (log) files, there is a small time penalty 
     compared to reading binary data files. We felt that this time penalty
     when reading was a reasonable price to pay for the speed and robustness
     of the system when writing data (which is of primary importance).

    <p><a class="selfLink" id="SSD" href="#SSD" rel="bookmark"
    >For greater speed,</a>  use a 
    <a rel="help" href="https://en.wikipedia.org/wiki/Solid-state_drive">Solid State Drive (SSD)</a>
    to store the data.
    They have a much faster file access time (&lt;0.1ms) than hard disk drives (3 - 12 ms).
    They also have a faster data transfer rate (200 - 2500 MB/s) than hard disk drives (~200 MB/s).
    Their cost has come down considerably in recent years.
    Although early SSD's had problems after a large number of writes
    to a given block, this problem is now greatly reduced.
    If you just use the SSD to write the data once then read it many times,
    even a consumer-grade SSD (which is considerably less expensive than an 
    enterprise-grade SSD) should last a long time.


<li><a class="selfLink" id="HttpGetRobust" href="#HttpGetRobust" rel="bookmark"
>Robust</a>
    <br>We have tried to make this system as easy-to-work-with and as robust as possible.
    <ul>
    <li>The system is designed to have multiple threads (e.g., the sensor, 
      an automated QC script, and a human) simultaneously working on 
      the same dataset and even the same file.
      Much of this is made possible by using a log file approach to storing the 
      data and by using a very simple file type, 
      <a rel="help"
        href="https://jsonlines.org/examples/"
        >JSON Lines CSV files<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
        to store the data.
    <li>Another huge advantage to JSON Lines CSV is that if a file ever does become 
      corrupted (e.g., invalid because of an error on a line), 
      it is easy to open the file in a text editor and fix the problem.
    <li>Another advantage is, if there is an error on a line in a file, the 
      system can still read all the data on lines before and after the error line.
      And the system can still log additional .insert and .delete information.
    <li>A huge advantage of using admin-accessible standard files (compared to
      a relational database or Cassandra or other software): There is no 
      other software which has to be maintained and which has to be running in order to
      store or retrieve data. And it is easy to back up standard files at any time
      and in an incremental way because the data is in chunks (after a while,
      only the current-time file for each station will be changing).
      In contrast, it takes considerable effort and system down time 
      to make external backup files from databases and from Cassandra.
      <br>&nbsp;
  </ul>
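  <p>The point about reading around a bad line can be seen in a few lines of code. This is an illustrative sketch of a tolerant reader, not ERDDAP's actual parser; the function name is hypothetical.

```python
import json


def read_json_lines_csv(text):
    """Return (column_names, rows, bad_line_numbers) for the text of a JSON Lines CSV file."""
    column_names, rows, bad_lines = None, [], []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            values = json.loads(line)
            if not isinstance(values, list):
                raise ValueError("not a JSON array")
        except ValueError:              # json.JSONDecodeError is a ValueError
            bad_lines.append(number)    # skip this line, keep reading the rest
            continue
        if column_names is None:
            column_names = values       # the first valid line holds the column names
        elif len(values) == len(column_names):
            rows.append(values)
        else:
            bad_lines.append(number)
    return column_names, rows, bad_lines


damaged = (
    '["stationID", "time", "airTemp"]\n'
    '["46088", "2016-04-05T14:45:00Z", 10.5]\n'
    '["46088", "2016-04-05T14:50:00Z", 10.\n'      # truncated (corrupted) line
    '["46088", "2016-04-05T14:55:00Z", 10.7]\n'
)
names, good_rows, bad = read_json_lines_csv(damaged)
```

Each line is parsed independently, so one damaged line costs one row, and the reported line numbers tell an administrator where to look in a text editor.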

<li><a class="selfLink" id="HttpGetSystemReliability" href="#HttpGetSystemReliability" rel="bookmark"
>System Reliability</a>
<br>It's reasonable to expect 
one server with ERDDAP™ to have 99.9% uptime -- that's about 9 hours of downtime per year
(although, you can use that up in one bad night!).
<br>If you are diligent and lucky, you might get 99.99% uptime (53 minutes downtime per year),
but that is hard to achieve, since just a few restarts for updates will take that much time.
<br>You would have to take extreme measures (a separate backup server, 
uninterruptible power supply, backup air conditioning, 
24x7x365 personnel to monitor the site, etc.) to have a slim chance at 99.999% uptime
(5.25 minutes downtime per year).  Even then, it is extremely unlikely that
you will attain 99.999% uptime (or even 99.99%) because problems are often 
outside of your control.
For example, Amazon Web Services and Google offer astonishingly reliable web services, 
yet big sections of them are sometimes down for hours. 

<p>Face it, everyone wants ERDDAP™ to have 100% uptime, or at least the vaunted "six nines" 
(99.9999% uptime equals 32 seconds of downtime per year), 
but there's no way you're going to get it no matter how much time, effort, and
money you spend. 
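<p>The downtime figures quoted above follow from simple arithmetic on a 365-day year. A small sketch (the function name is hypothetical):

```python
def downtime_seconds_per_year(uptime_percent, days_per_year=365):
    """Seconds of downtime per year allowed by a given uptime percentage."""
    seconds_per_year = days_per_year * 24 * 60 * 60
    return seconds_per_year * (100 - uptime_percent) / 100


for percent in (99.9, 99.99, 99.999, 99.9999):
    seconds = downtime_seconds_per_year(percent)
    print("%s%% uptime: %.1f hours = %.1f minutes = %.0f seconds of downtime per year"
          % (percent, seconds / 3600, seconds / 60, seconds))
```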

<p>But ERDDAP™ uptime isn't the real goal here. 
The goal is to build a reliable <strong>system</strong>, one that doesn't lose any data.
This is a solvable problem.  

<p>The solution is: build fault-tolerance into the computer software 
that is sending the data to ERDDAP.
Specifically, that software should maintain a queue
of data waiting to go to ERDDAP. 
When data is added to the queue, the software should try to send the queued data
to ERDDAP™ and check the response. If the response doesn't indicate success 
(HTTP status code=200 and <kbd>"status":"success"</kbd> in the returned JSON object),
then the software should leave the data in the queue.
When more data is generated and added to the queue, the software should again 
try to .insert the data in the queue (perhaps with the [] system).
It will succeed or fail. If it fails, it will try again later.
If you write the software to work this way and if the software is prepared to 
queue a few days worth of data, you actually do have a good chance of uploading 
100% of the sensor's data to ERDDAP. And you will have done it without
going to great effort or expense.
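<p>A minimal sketch of such a queue follows. It assumes that success is signalled by the JSON reply shown earlier on this page (<kbd>"status":"success"</kbd>); the class name and the <kbd>send</kbd> callable are hypothetical, and the actual HTTP request is left to the caller so that the queueing logic stays independent of any HTTP library. A real sender would also persist the queue to disk so that it survives a restart.

```python
import json
from collections import deque


class UploadQueue:
    """Keep rows until ERDDAP confirms that it has received them."""

    def __init__(self, send):
        # send: callable taking a list of rows and returning the response body
        # as a string; it may raise OSError when the network or server is down.
        self.send = send
        self.pending = deque()

    def add(self, row):
        """Queue a new row, then try to upload everything that is waiting."""
        self.pending.append(row)
        return self.flush()

    def flush(self):
        if not self.pending:
            return True
        batch = list(self.pending)           # send all queued rows in one request
        try:
            reply = json.loads(self.send(batch))
            ok = isinstance(reply, dict) and reply.get("status") == "success"
        except (OSError, ValueError):        # network failure or unparsable reply
            ok = False
        if ok:
            for _ in batch:                  # remove only the rows that were sent
                self.pending.popleft()
        return ok                            # on failure, the rows stay queued
```

Any failure is treated as a complete failure: nothing is removed from the queue unless the whole batch was acknowledged, so the next call simply retries.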

<p>[Background: We didn't think this up.
<a rel="help" href="https://en.wikipedia.org/wiki/Reliability_(computer_networking)"
    >This is how computer networks achieve reliability.<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>
Computer networks are inherently unreliable. So when you transfer a file from
one computer to another, the sending software knows/expects that some packets may be lost.
If it doesn't get a proper acknowledgment for a given packet from the receiver, 
it resends the lost packet. With this approach, relatively simple sender and
receiver software can build a reliable file transfer system on top of an unreliable 
network.]

<li><a class="selfLink" id="HttpGetJsonLinesCsv" href="#HttpGetJsonLinesCsv" rel="bookmark"
>"Why JSON Lines CSV files?!"</a>
<br>EDDTableFromHttpGet uses 
<a rel="help"
        href="https://jsonlines.org/examples/"
        >JSON Lines CSV files<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
for storing the data. The reasons are:
  <ul>
  <li>The main reason is: The simplicity of JSON Lines CSV
    files offers a fast, easy and reliable way to allow multiple threads to write to a given file
    (e.g., by synchronizing on the filename).
  <li>If a JSON Lines CSV file ever did become 
    corrupted (e.g., invalid because of an error on a line), 
    EDDTableFromHttpGet could still read all of the data on all of the lines 
    before and after the error line.
    And the .insert and .delete system could continue to add new data to the data file.
  <li>Because the JSON Lines CSV files are ASCII files, if a file ever did become 
    corrupted, it would be easy to fix (in a text editor).
  <li>JSON Lines CSV supports Unicode strings.
  <li>JSON Lines CSV supports variable length strings (not limited to some max length).
  <li>JSON Lines CSV supports 64-bit integers (longs).
  <li>The formal nature and extra syntax of JSON Lines CSV (vs old-school CSV)
    provides some extra assurance that a given line hasn't been corrupted.
  </ul>
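<p>To make the "synchronizing on the filename" idea concrete, here is a sketch of several threads appending lines to the same file, with one lock per file name. It illustrates the general technique only and is not ERDDAP's (Java) implementation; the function names and the demonstration file name are hypothetical.

```python
import json
import os
import tempfile
import threading

_locks = {}                        # one lock per file name
_locks_guard = threading.Lock()    # protects the _locks dictionary itself


def _lock_for(path):
    with _locks_guard:
        return _locks.setdefault(path, threading.Lock())


def append_row(path, row):
    """Append one row as a complete JSON Lines CSV line, one writer at a time per file."""
    line = json.dumps(row) + "\n"
    with _lock_for(path):
        with open(path, "a", encoding="utf-8") as f:
            f.write(line)


# Demonstration: two "authors" append to the same log file.
demo_path = os.path.join(tempfile.mkdtemp(), "2016-03.jsonl")
append_row(demo_path, ["46088", "2016-04-05T14:45:00Z", 10.5, "Sensor", 0])
append_row(demo_path, ["46088", "2016-04-05T14:45:00Z", 11.5, "QCScript59", 0])
```

Because each write is a single complete line appended under the lock, writers to different files never wait for each other, and writers to the same file cannot interleave partial lines.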

<p>We initially tried to use .nc3 files with an unlimited dimension. 
However, there were problems:
  <ul>
  <li>The main problem was: 
    There is no reliable way to allow multiple threads to write to a .nc3 file,
    even if the threads cooperate by doing the writes in a synchronized way. 
  <li>If an .nc3 file becomes corrupted,  
    the .insert and .delete system can't continue to use the file.
  <li>Because the .nc3 files are binary, if a file becomes corrupted (which 
    they do because of the multi-threading problem) they are exceedingly
    hard or impossible to fix. There are no tools to help with the repair.
  <li>CF has no way to specify the encoding of strings,
    so there is no official way to support Unicode, e.g., the UTF-8 encoding.   
    We tried to get CF to support an _Encoding attribute but were unable to make any progress.
    (Unidata, to their credit, does support the _Encoding attribute.)
  <li>.nc3 files only support fixed length strings.
    Again, we tried to get CF and Unidata to support variable length strings
    but were unable to make any progress.
  <li>.nc3 files don't support an easy way to distinguish single character variables 
    from String variables.
    Again, we tried to get CF and Unidata to support a system for 
    distinguishing these two data types, but were unable to make any progress.
  <li>.nc3 files only support 8-bit characters with an unspecified encoding. 
    Again, we tried to get CF and Unidata to support a system for 
    specifying the encoding, but were unable to make any progress.
  <li>.nc3 files don't support 64-bit integers (longs). 
    Again, we tried to get CF and Unidata to support a system for 
    longs, but were unable to make any progress.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="HttpGetVersioning" href="#HttpGetVersioning" rel="bookmark"
>Versioning</a>
<br>Because EDDTableFromHttpGet stores a log of all of the changes to the dataset
with the timestamp and the author of each change, it can quickly recreate that dataset 
as of any point in time. In a sense, there is a version for any point in time.
If a user's request for data includes a timestamp &lt;= constraint, 
e.g., &amp;timestamp&lt;=2016-06-23T16:32:22.128Z
(or any time point), but no constraint of author or command,
ERDDAP™ will respond to the request by first generating
a version of the dataset as of that point in time. Then, ERDDAP™ applies
the user's other constraints, as with any other request for data from ERDDAP.
EDDTableFromHttpGet is set up so that this process is very fast and efficient,
even for very large datasets.
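
<p>Conceptually, generating a version of the dataset is a replay of the log up to the requested timestamp. The sketch below shows that idea on an in-memory list of log rows. It is not ERDDAP's implementation; the function name is hypothetical, and it assumes that a later insertion for the same <kbd>httpGetRequiredVariables</kbd> values replaces the whole earlier row.

```python
def dataset_as_of(log_rows, column_names, required_variables, as_of=None):
    """Replay insert (0) and delete (1) commands with timestamp <= as_of."""
    key_columns = [column_names.index(name) for name in required_variables]
    timestamp_column = column_names.index("timestamp")
    command_column = column_names.index("command")
    current = {}
    for row in log_rows:                     # rows are in the order they were logged
        if as_of is not None and row[timestamp_column] > as_of:
            continue                         # this change was made after the version wanted
        key = tuple(row[i] for i in key_columns)
        if row[command_column] == 0:
            current[key] = row               # insertion: create or replace the row
        else:
            current.pop(key, None)           # deletion: remove the row if present
    # Re-sort by the required variables, as for the current version of the dataset.
    return [current[key] for key in sorted(current)]


columns = ["stationID", "time", "airTemp", "timestamp", "author", "command"]
log = [
    ["46088", "2016-04-05T14:45:00Z", 10.5, 100.0, "Sensor", 0],
    ["46088", "2016-04-05T14:50:00Z", 10.6, 200.0, "Sensor", 0],
    ["46088", "2016-04-05T14:45:00Z", 11.5, 300.0, "QCScript59", 0],   # a correction
    ["46088", "2016-04-05T14:50:00Z", None, 400.0, "JohnSmith", 1],    # a deletion
]
latest = dataset_as_of(log, columns, ["stationID", "time"])
earlier = dataset_as_of(log, columns, ["stationID", "time"], as_of=250.0)
```

No extra copy of the data is stored for any version: the log already contains everything needed, and the <kbd>as_of</kbd> value only decides where the replay stops.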

<p>Similarly, a user can find out when the dataset was last updated by requesting
<kbd>...?timestamp&amp;timestamp=max(timestamp)&amp;distinct()</kbd>

<p>And for any request for data, for any version of the dataset, users can see
which author made which changes, and when they made them.

<p>This versioning system enables 
<a rel="help" href="https://en.wikipedia.org/wiki/Reproducibility"
    >Reproducible Science<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>
because anyone, at any time, can request data from the version of the dataset 
at any point in time. 
This fine-grained versioning is not possible with any other system that we know of.
The underlying mechanism is very efficient, in that no extra storage space is needed, 
and the processing overhead is truly minimal.

<p>Not everyone has a need for this type of fine-grained versioning, 
but it is exceedingly useful, perhaps necessary, 
in the context of a large data management organization 
(e.g., OOI, EarthCube, DataONE, and NOAA's NCEI)
where a dataset can have multiple authors 
(e.g., the sensor, an automated QC script, and a human editor). 

<p>[History: 
The need for this type of versioning first came up for me (Bob) when reading about
and discussing OOI in 2008. 
At the time, OOI had a cumbersome, slow, inefficient system for versioning 
based on Git. Git is great for what it was designed for, but not this. 
In 2008, while at an OOI discussion, I designed an extensive, efficient 
alternative-to-OOI system for data management, 
including many of the features that I have added to ERDDAP™ since then,
and including this versioning system. 
At that time and since, OOI was committed to their versioning system 
and not interested in alternatives. 
In 2016, other aspects of this plan fell into place and I started to implement it.
Because there were lots of interruptions to work on other projects, I didn't finish until 2018.
Even now, I'm not aware of any other scientific data system that offers
such quick and easy access to a version of the data from any point in time,
for frequently changing datasets.
Simple file systems don't offer this. Relational databases don't. Cassandra doesn't.]


<li><a class="selfLink" id="HttpGetPut" href="#HttpGetPut" rel="bookmark"
>"What about HTTP PUT and DELETE?!"</a>
<br><a rel="help" href="https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol">Hypertext Transfer Protocol (HTTP)<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>
is the basis of the World Wide Web and the reason that web page URLs begin with "http://" or "https://".
HTTPS is HTTP with an additional security layer.
Every day, browsers, scripts and computer programs make billions of HTTP(S)
<strong>GET</strong> requests to get information from remote sources.
HTTP(S) also includes other 
<a rel="help" href="https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods"
>verbs<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>,
notably PUT (to push data to the server) and DELETE
(to delete data from the server). Yes, PUT and DELETE are the proper way to 
insert data into, and delete data from, a dataset via HTTP(S). 
But GET is supported by every piece of software that can work with HTTP(S).
GET is really easy to work with. 
Everyone already knows how to work with GET and many know how to use POST  
(which can be used in essentially the same way as GET),
so we made EDDTableFromHttpGet work with GET and POST.
Very few people (even few computer programmers) have ever worked with PUT and DELETE.
PUT and DELETE are generally only supported by computer languages,
so using them requires a skillful program. So PUT and DELETE are usually a much more
cumbersome approach given the way the tools have evolved. 
<br>&nbsp;

<li><a class="selfLink" id="HttpGetNotes" href="#HttpGetNotes" rel="bookmark"
>Notes</a>
<ul>
<li>No dataVariable may have dataType=char. Use dataType=String instead.
  If you really need dataType=char, email Chris.John at noaa.gov .
  <br>&nbsp;
</ul>

<li><a class="selfLink" id="HttpGetThanks" href="#HttpGetThanks" rel="bookmark"
>Thanks to CHORDS for the basic idea.</a>
<br>The basic idea for EDDTableFromHttpGet (i.e., using an HTTP GET request
to add data to a dataset) is from UCAR's (NCAR's?)
<a rel="help" href="https://github.com/earthcubeprojects-chords"
>Cloud-Hosted Real-time Data Services (CHORDS)<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a> project.
The format for the parameters in the request (repeated <i>name=value</i>, separated by &amp;'s)
is the same standard format that is used by HTML forms on web pages.
It is a simple and brilliant idea  
and even more so because it meshes so perfectly with 
ERDDAP's existing system for dealing with tabular data.
The idea is obvious in hindsight, but I (Bob) didn't think of it.
EDDTableFromHttpGet uses that basic idea, 
combined with our ideas of how to implement it, 
to make a system in ERDDAP™ for uploading data.
Other than the basic idea of using GET to push data into the system, 
the EDDTableFromHttpGet implementation is entirely different from, and entirely independent of, 
CHORDS and has different features (e.g., log files, chunking of data, 
different security system, CRUD support, reproducible data). 
Our exposure to CHORDS was just a webinar. 
We did not look at their code or read about their project because we immediately 
knew we wanted to implement the system a different way. 
But we are grateful to them for the basic idea.
The full reference to CHORDS is
<br>Daniels, M. D., Kerkez, B., Chandrasekar, V., Graves, S., Stamps, D. S., 
Martin, C., Dye, M., Gooch, R., Bartos, M., Jones, J., Keiser, K. (2014). 
Cloud-Hosted Real-time Data Services for the Geosciences (CHORDS) software. 
UCAR/NCAR -- Earth Observing Laboratory. 
<a rel="help" href="https://doi.org/10.5065/d6v1236q"
                   >https://doi.org/10.5065/d6v1236q<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>

<br>&nbsp;

</ul>


<p><a class="selfLink" id="EDDTableFromHyraxFiles" href="#EDDTableFromHyraxFiles" rel="bookmark"><strong>EDDTableFromHyraxFiles</strong></a> (deprecated)
  aggregates data files with several variables, each with
  one or more shared dimensions (for example, time, altitude (or depth), latitude, longitude), and served by a 
  <a rel="help" href="https://www.opendap.org/software/hyrax-data-server">Hyrax OPeNDAP server<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
<ul>
<li>This dataset type is <strong>DEPRECATED</strong>. 
  The newer and more general solution is to use the 
   <a rel="help" href="#cacheFromUrl">cacheFromUrl option for EDDTableFromFiles</a> (or a variant),
   which makes a local copy of the remote files and serves the data from the local files.
   The &lt;cacheFromUrl&gt; option can be used with any type of tabular data file.
   <strong><br>If you can't make that work for some reason, email Chris.John at noaa.gov .
   <br>If there are no complaints before 2020, this dataset type may be removed.</strong>
<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.
<li>In most cases, each file has multiple values for the leftmost (first) dimension, for example, time.
<li>The files often have (but don't have to have) a single value for the other dimensions 
  (for example, altitude (or depth), latitude, longitude).
<li>The files may have character variables with an additional dimension (for example, nCharacters).
<li>Hyrax servers can be identified by the "/dods-bin/nph-dods/" or "/opendap/" in the URL.
  <!-- For example, ??? -->
<li>This class screen-scrapes the Hyrax web pages with the lists of files in each directory.
  Because of this, it is very specific to the current format of Hyrax web pages.
  We will try to adjust ERDDAP™ quickly if/when future versions of Hyrax change how the files are listed.
<li>The <kbd>&lt;fileDir&gt;</kbd> setting is ignored. Since this class downloads
  and makes a local copy of each remote data file, ERDDAP™ forces the fileDir
  to be <kbd><i>bigParentDirectory</i>/copy/<i>datasetID</i>/</kbd>.
<li>For <kbd>&lt;sourceUrl&gt;</kbd>, use the URL of the base directory of the dataset in the Hyrax server, for example,
  <br>  <kbd>&lt;sourceUrl&gt;<wbr>http://edac-dap.northerngulfinstitute.org/dods-bin/nph-dods/WCOS/nmsp/wcos/<wbr>&lt;/sourceUrl&gt;</kbd>
  <br>(but put it on one line) (sorry, that server is no longer available).
  <br>The sourceUrl web page usually has "OPeNDAP Server Index of [directoryName]" at the top.
<li>Since this class always downloads and makes a local copy of each remote
  data file, you should never wrap this dataset in 
  <a rel="help" href="#EDDTableCopy">EDDTableCopy</a>.
<li>See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, for information
on how this class works and how to use it.
<li>See the 1D, 2D, 3D, and 4D examples for 
  <a rel="help" href="#EDDTableFromNcFiles">EDDTableFromNcFiles</a>.
  <br>&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableFromInvalidCRAFiles" 
  href="#EDDTableFromInvalidCRAFiles" rel="bookmark"><strong>EDDTableFromInvalidCRAFiles</strong></a> 
        aggregates data from NetCDF (v3 or v4) .nc files which use 
        a specific, invalid variant of the CF DSG Contiguous Ragged Array (CRA) file format.
        Although ERDDAP™ supports this file type, it is an invalid file type 
        that no one should start using. Groups that currently use this file type are 
        strongly encouraged to use ERDDAP™ to generate valid CF DSG CRA files
        and stop using these files.

  <p>Details: These files have multiple row_size variables, each with a sample_dimension attribute. 
   The files are non-CF-standard files because the multiple sample (obs) dimensions 
   are to be decoded and related to each other with this additional rule and promise 
   that is not part of the CF DSG specification:
   "you can associate a given e.g., temperature value (temp_obs dimension) 
   with a given depth value (z_obs dimension, the dimension with the most values), because: 
   the temperature row_size (for a given cast) will be either 0 or equal to the
   corresponding depth row_size (for that cast) (that's the rule).
   So, if the temperature row_size isn't 0, then the n temperature values for that cast relate 
   directly to the n depth values for that cast (that's the promise)."
 
 <p>Another problem with these files: the Principal_Investigator row_size variable doesn't have
  a sample_dimension attribute and doesn't follow the above rule.
 
 <p>Sample files for this dataset type can be found at 
  https://data.nodc.noaa.gov/thredds/catalog/ncei/wod/ [2020-10-21 This server is no longer reliably available]. 

  <p>See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.

  <p>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.
  <br>&nbsp;


<p><a class="selfLink" id="EDDTableFromJsonlCSVFiles" href="#EDDTableFromJsonlCSVFiles" rel="bookmark"><strong>EDDTableFromJsonlCSVFiles</strong></a> 
  aggregates data from 
  <a rel="help"
        href="https://jsonlines.org/examples/"
        >JSON Lines CSV files<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.

  <ul>
  <li>As jsonlines.org says, this format is "Better than CSV" 
    (and legally, as a federal employee, I can't agree or disagree with them -- how crazy is that?). 
    CSV has never been formally defined and is hampered by the historical 
    baggage related to its connection to the original spreadsheet programs. 
    JSON Lines CSV, in comparison, is fully defined and benefits from 
    its connection to the widely used JSON standard, which in turn 
    benefits from its connection to JavaScript and Java.
    Notably, there is full support for long integers 
    and for Unicode characters in strings,
    and a clear way to include other special characters (notably tabs and newlines)
    within strings. 

    <p>This format is particularly good for datasets where you need to 
    periodically append additional rows to the end of a given data file. 
    For that reason and others (see above), 
    <a rel="help" href="#EDDTableFromHttpGet">EDDTableFromHttpGet</a> 
    uses JSON Lines CSV files for data storage.

  <li>The input files are assumed to be UTF-8 encoded. However, given the 
    \u<i>dddd</i> format for encoding special characters (where <i>dddd</i> is
    4 hexadecimal digits; e.g., \u20ac is the encoding for the Euro character), 
    you have the option to write the files so that they contain only 7-bit ASCII 
    characters by using \u<i>dddd</i> to encode all characters above #127. 
    <br>&nbsp;

  <li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.

  <li>WARNING: When ERDDAP™ reads JSON Lines CSV data files, if it finds an error 
  on a given line (e.g., incorrect number of items), it logs a warning message 
  ("WARNING: Bad line(s) of data" ... with a list of the bad lines on subsequent lines) 
  to the 
  <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt file</a> 
  and then continues to read the rest of the data file.
  Thus, it is your responsibility to look periodically (or write a script to do so)
  for that message in the log.txt so that you can fix the problems in the data
  files. ERDDAP™ is set up this way so that users can continue to read all of
  the available valid data even though some lines of the file have flaws. 
  <br>&nbsp;

  </ul>


<p><a class="selfLink" id="EDDTableFromMultidimNcFiles" href="#EDDTableFromMultidimNcFiles" rel="bookmark"><strong>EDDTableFromMultidimNcFiles</strong></a> 
  aggregates data from NetCDF (v3 or v4) .nc 
  (or <a rel="help" href="#NcML">.ncml</a>)
  files with several variables, each with one or more shared dimensions.
  The files may have character variables with or without an additional dimension (for example, STRING14).
  See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.
<ul>
<li>If the files are multidimensional CF DSG variants, use this dataset type
  instead of  
  <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>.
    <br>&nbsp;

<li>For new tabular datasets from .nc files, use this option before trying the older 
  <a rel="help" href="#EDDTableFromNcFiles">EDDTableFromNcFiles</a>.
  Some advantages of this class are:
  <ul>
  <li>This class can read more variables from a wider variety of file structures.
    If you specify DimensionsCSV 
    (a comma-separated list of dimension names) in GenerateDatasetsXml
    (or &lt;dimensionsCSV&gt; in the datasets.xml info for one of these datasets), 
    then ERDDAP™ will only read variables in the source files which use
    some or all of these dimensions, plus all scalar variables.
    If a dimension is in a group, you must specify its fullName,
    e.g., "<i>groupName/dimensionName</i>".  
  <li>This class can often reject files very quickly if they don't match a request's constraints.
    So reading data from large collections will often go much faster.
  <li>This class handles true char variables (non-String variables) correctly.
  <li>This class can trim String variables when the creator didn't use 
    Netcdf-java's writeStrings (which appends char #0 to mark the end of the string).
  <li>This class is better at dealing with individual files that lack certain
    variables or dimensions.
  <li>This class can remove blocks of rows with missing values as 
    specified for 
    <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#_incomplete_multidimensional_array_representation"
    >CF Discrete Sampling Geometries (DSG) Incomplete Multidimensional Array files<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
    <br>&nbsp;
  </ul>

<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.

  <p>Group -- GenerateDatasetsXml will ask for a "Group". You can enter
  "" to have it search any/all groups, "<i>someGroup</i>" or "<i>someGroup/someSubGroup</i>"
  to have it search a specific group, or "[root]" to have it search just the
  root group. The "Group" string becomes &lt;group&gt; in the datasets.xml 
  info for the dataset (although "[root]" becomes "").

  <p>DimensionsCSV -- GenerateDatasetsXml will ask for a "DimensionsCSV" string.
  This is a comma-separated-value list of source names of a set of dimensions. 
  GenerateDatasetsXml will only read data variables in sample .nc files which use 
  some or all of those dimensions (and no other dimensions),
  plus all of the scalar variables in the file,
  and make the dataset from those data variables. 
  If a dimension is in a group, you must specify its fullName,
  e.g., "<i>groupName/dimensionName</i>".  
  <br>If you specify nothing (an empty string), GenerateDatasetsXml will look for 
  the variables with the most dimensions, on the theory that they will be
  the most interesting, but there may be times when you will want to 
  make a dataset from some other group of data variables that uses some
  other group of dimensions. 
  <br>If you just specify a dimension name that doesn't exist 
  (e.g., <kbd>NO_MATCH</kbd>), ERDDAP™ will just find all of the scalar variables. 
  <br>The "DimensionsCSV" string becomes &lt;dimensionsCSV&gt; in the datasets.xml 
    info for the dataset.
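
  <p>As a rough sketch of how the Group and DimensionsCSV answers might appear in 
  datasets.xml, the fragment below shows only the relevant tags. 
  The datasetID, directory, file name pattern, group name, and dimension names 
  are made-up placeholders, not values from a real dataset, and the tags that 
  GenerateDatasetsXml would also write are elided with "...".
<pre>
&lt;dataset type="EDDTableFromMultidimNcFiles" datasetID="myProfiles" active="true"&gt;
  &lt;fileDir&gt;/data/myProfiles/&lt;/fileDir&gt;          &lt;!-- hypothetical directory --&gt;
  &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
  ...
  &lt;!-- hypothetical group; "[root]" in GenerateDatasetsXml would become "" here --&gt;
  &lt;group&gt;someGroup&lt;/group&gt;
  &lt;!-- hypothetical dimensions; a dimension in a group is given by its fullName --&gt;
  &lt;dimensionsCSV&gt;someGroup/profile, someGroup/z&lt;/dimensionsCSV&gt;
  ...
&lt;/dataset&gt;
</pre>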

<li><a class="selfLink" id="treatDimensionsAs" href="#treatDimensionsAs" rel="bookmark">treatDimensionsAs</a> 
  <br>There is a category of invalid .nc files (because they don't follow the CF rules) 
  that have multiple dimensions (e.g., lat, lon, time) 
  when they should have used just one dimension (e.g., time),
  for example:
<pre>
dimensions:
    time = UNLIMITED ; // (1437 currently)
    depth = 10;
    lat = 1437 ;
    lon = 1437 ;
variables:
    double time(time) ;
    double lat(lat) ;
    double lon(lon) ;
    float temperature(time, depth) ;
</pre>
   EDDTableFromMultidimNcFiles has a special feature to deal with these files:
   if you add the global attribute "treatDimensionsAs" to the dataset's global
   addAttributes, you can tell ERDDAP™ to treat certain dimensions
   (e.g., lat and lon) as if they were another dimension (e.g., time).
   The attribute value must be a comma-separated list specifying the
   "from" dimensions and then the "to" dimension, e.g.,
   <br><kbd>&lt;att name="treatDimensionsAs"&gt;lat, lon, time&lt;/att&gt;</kbd>
   <br>Then ERDDAP™ will read the file as if it were:
<pre>
dimensions:
    time = UNLIMITED ; // (1437 currently)
    depth = 10;
variables:
    double time(time) ;
    double lat(time) ;
    double lon(time) ;
    float temperature(time, depth) ;
</pre>
  Of course, the current size of each of the dimensions in the list 
  must be the same; otherwise, ERDDAP™ will treat the file as a "Bad File".
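
  <p>For context, here is a minimal sketch of where the treatDimensionsAs attribute 
  sits within the dataset's chunk of datasets.xml. The datasetID is a hypothetical 
  placeholder, and all other tags are elided with "...".
<pre>
&lt;dataset type="EDDTableFromMultidimNcFiles" datasetID="myInvalidFiles" active="true"&gt;
  ...
  &lt;addAttributes&gt;   &lt;!-- the dataset's global addAttributes --&gt;
    &lt;!-- treat the "from" dimensions (lat, lon) as if they were the "to" dimension (time) --&gt;
    &lt;att name="treatDimensionsAs"&gt;lat, lon, time&lt;/att&gt;
    ...
  &lt;/addAttributes&gt;
  &lt;dataVariable&gt;...&lt;/dataVariable&gt;
  ...
&lt;/dataset&gt;
</pre>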

  <p>Note that these files are invalid because they don't follow CF rules.
  So even though ERDDAP™ can read them, we strongly recommend that
  you don't create files like this because other CF-based software tools
  won't be able to read them correctly.  If you already have such files,
  we strongly recommend replacing them with valid files as soon as possible.
  

</ul>


<p><a class="selfLink" id="EDDTableFromNcFiles" href="#EDDTableFromNcFiles" rel="bookmark"><strong>EDDTableFromNcFiles</strong></a> 
  aggregates data from NetCDF (v3 or v4) .nc 
  (or <a rel="help" href="#NcML">.ncml</a>)
  files and <a rel="help" href="https://github.com/zarr-developers/zarr-python">Zarr<img src="../images/external.png"
      alt=" (external link)" title="This link to an external website does not constitute an endorsement."></a>
  files (as of version 2.25) with several variables, each with one shared
  dimension (for example, time) or more than one shared dimension 
  (for example, time, altitude (or depth), latitude, longitude).
  The files must have the same dimension names.
  A given file may have multiple values for each of the dimensions and the values may be
  different in different source files.
  The files may have character variables with an additional dimension (for example, STRING14).
  See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.
   
  <p>Zarr files have slightly different behavior and require either the fileNameRegex or the
  pathRegex to include "zarr".
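
  <p>For example, the file-matching tags for a Zarr-based dataset might look something 
  like this sketch. The directory and the exact regular expressions are assumptions 
  for illustration only; the requirement stated above is just that one of the 
  two regexes includes "zarr".
<pre>
  &lt;fileDir&gt;/data/myZarr/&lt;/fileDir&gt;        &lt;!-- hypothetical directory --&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;pathRegex&gt;.*zarr.*&lt;/pathRegex&gt;         &lt;!-- this regex includes "zarr" --&gt;
  &lt;fileNameRegex&gt;.*&lt;/fileNameRegex&gt;
</pre>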

<ul>
<li>If the .nc files use one of the  
  <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries">CF Discrete Sampling Geometries (DSG)<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
  file formats, try using 
    <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a> before trying this.
    <br>&nbsp;

<li>For new tabular datasets from .nc files, try the newer 
  <a rel="help" href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> first.
    <br>&nbsp;

<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.

  <p>DimensionsCSV -- GenerateDatasetsXml will ask for a "DimensionsCSV" string.
  This is a comma-separated-value list of source names of a set of dimensions. 
  GenerateDatasetsXml will find the data variables in the .nc files which use 
  some or all of those dimensions, plus all scalar variables,
  and make the dataset from those data variables. 
  If you specify nothing (an empty string), GenerateDatasetsXml will look for 
  the variables with the most dimensions, on the theory that they will be
  the most interesting, but there may be times when you will want to 
  make a dataset from some other group of data variables that uses some
  other group of dimensions. 

<li>1D Example:
  1D files are somewhat different from 2D, 3D, 4D, ... files.
  <ul>
  <li>You might have a set of .nc data files where each file has one month's
    worth of data from one drifting buoy.
  <li>Each file will have 1 dimension, for example, time (size = [many]).
  <li>Each file will have one or more 1D variables which use that dimension, 
    for example, time, longitude, latitude, air temperature, ....
  <li>Each file may have 2D character variables, for example, with dimensions (time,nCharacters).
    <br>&nbsp;
  </ul>
<li>2D Example:
  <ul>
  <li>You might have a set of .nc data files where each file has one month's 
    worth of data from one drifting buoy.
  <li>Each file will have 2 dimensions, for example, time (size = [many]) and id (size = 1).
  <li>Each file will have 2 1D variables with the same names as the dimensions
    and using the same-name dimension, for example, time(time), id(id).
    These 1D variables should be included in the list of <kbd>&lt;dataVariable&gt;</kbd>'s 
    in the dataset's XML. 
  <li>Each file will have one or more 2D variables, for example, longitude, 
    latitude, air temperature, water temperature, ...
  <li>Each file may have 3D character variables, for example, with dimensions (time,id,nCharacters).
    <br>&nbsp;
  </ul>
<li>3D Example:
  <ul>
  <li>You might have a set of .nc data files where each file has one month's
    worth of data from one stationary buoy.
  <li>Each file will have 3 dimensions, for example, time (size = [many]), 
    lat (size = 1), and lon (size = 1).
  <li>Each file will have 3 1D variables with the same names as the dimensions 
    and using the same-name dimension, for example, time(time), lat(lat), lon(lon).
    These 1D variables should be included in the list of <kbd>&lt;dataVariable&gt;</kbd>'s 
    in the dataset's XML. 
  <li>Each file will have one or more 3D variables, for example, air temperature, 
    water temperature, ...
  <li>Each file may have 4D character variables, for example, with dimensions (time,lat,lon,nCharacters).
  <li>The file's name might include the buoy's name.
    <br>&nbsp;
  </ul>
<li>4D Example:
  <ul>
  <li>You might have a set of .nc data files where each file has one month's 
    worth of data from one
    station. At each time point, the station takes readings at a series of depths.
  <li>Each file will have 4 dimensions, for example, time (size = [many]), 
    depth (size = [many]), lat (size = 1), and lon (size = 1).
  <li>Each file will have 4 1D variables with the same names as the dimensions 
    and using the
    same-name dimension, for example, time(time), depth(depth), lat(lat), lon(lon).
    These 1D variables should be included in the list of <kbd>&lt;dataVariable&gt;</kbd>'s 
    in the dataset's XML. 
  <li>Each file will have one or more 4D variables, for example, air temperature, 
    water temperature, ...
  <li>Each file may have 5D character variables, for example, with dimensions (time,depth,lat,lon,nCharacters).
  <li>The file's name might include the buoy's name.
    <br>&nbsp;
  </ul>
</ul>
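
<p>As an illustration of the 3D example above, the list of dataVariables might look 
something like the following sketch. The source variable names, the dataTypes, and 
the single 3D variable shown are assumptions for illustration; the draft made by 
GenerateDatasetsXml for your files is the better starting point.
<pre>
&lt;!-- The 1D variables that share a name with their dimension: time(time), lat(lat), lon(lon) --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;time&lt;/sourceName&gt;
  &lt;destinationName&gt;time&lt;/destinationName&gt;
  &lt;dataType&gt;double&lt;/dataType&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;lat&lt;/sourceName&gt;
  &lt;destinationName&gt;latitude&lt;/destinationName&gt;
  &lt;dataType&gt;float&lt;/dataType&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;lon&lt;/sourceName&gt;
  &lt;destinationName&gt;longitude&lt;/destinationName&gt;
  &lt;dataType&gt;float&lt;/dataType&gt;
&lt;/dataVariable&gt;
&lt;!-- A hypothetical 3D variable, airTemperature(time, lat, lon) --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;airTemperature&lt;/sourceName&gt;
  &lt;destinationName&gt;air_temperature&lt;/destinationName&gt;
  &lt;dataType&gt;float&lt;/dataType&gt;
&lt;/dataVariable&gt;
</pre>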


<p><a class="selfLink" id="EDDTableFromNcCFFiles" href="#EDDTableFromNcCFFiles" rel="bookmark"><strong>EDDTableFromNcCFFiles</strong></a> 
  aggregates data from 
  NetCDF (v3 or v4) .nc 
  (or <a rel="help" href="#NcML">.ncml</a>)
  files which use one of the file formats specified by the 
  <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
    >CF Discrete Sampling Geometries (DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
  conventions.
  See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.

  <p>For files using one of the multidimensional CF DSG variants, use
    <a rel="help" href="#EDDTableFromMultidimNcFiles">EDDTableFromMultidimNcFiles</a> 
    instead.

  <p>The CF DSG conventions define dozens of file formats and include numerous
  minor variations. This class deals with all of the variations we are aware of, but
  we may have missed one (or more). So if this class can't read data from your CF DSG files,
  please email Chris.John at noaa.gov and include a sample file.
    <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.

  <p>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.
  <br>&nbsp;


<p><a class="selfLink" id="EDDTableFromNccsvFiles" href="#EDDTableFromNccsvFiles" rel="bookmark"><strong>EDDTableFromNccsvFiles</strong></a> 
  aggregates data from 
  <a rel="help" href="https://erddap.github.io/NCCSV.html">NCCSV</a>
  ASCII .csv files.
  See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.

  <ul>
  <li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <p>The first thing GenerateDatasetsXml does for this type of dataset 
  after you answer the questions is print the ncdump-like structure of the sample file.
  So if you enter a few goofy answers for the first loop through GenerateDatasetsXml,
  at least you'll be able to see if ERDDAP™ can read the file and see
  what dimensions and variables are in the file.
  Then you can give better answers for the second loop through GenerateDatasetsXml.

  <li>WARNING: When ERDDAP™ reads NCCSV data files, if it finds an error 
  on a given line (e.g., incorrect number of items), it logs a warning message 
  ("WARNING: Bad line(s) of data" ... with a list of the bad lines on subsequent lines) 
  to the 
  <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt file</a> 
  and then continues to read the rest of the data file.
  Thus, it is your responsibility to look periodically (or write a script to do so)
  for that message in the log.txt so that you can fix the problems in the data
  files. ERDDAP™ is set up this way so that users can continue to read all of
  the available valid data even though some lines of the file have flaws. 
  <br>&nbsp;

  </ul>


<p><a class="selfLink" id="EDDTableFromNOS" href="#EDDTableFromNOS" rel="bookmark"><strong>EDDTableFromNOS</strong></a> (DEPRECATED) handles data from a NOAA 
<a rel="help" href="https://opendap.co-ops.nos.noaa.gov/axis/">NOS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> source,
  which uses
  <a rel="help" href="https://www.w3schools.com/xml/xml_soap.asp">SOAP+XML</a>  
  for requests and responses.  It is very specific to NOAA NOS's XML.
  See the sample EDDTableFromNOS dataset in datasets2.xml.
  <br>&nbsp;


<p><a class="selfLink" id="EDDTableFromOBIS" href="#EDDTableFromOBIS" rel="bookmark"><strong>EDDTableFromOBIS</strong></a> handles data from an
Ocean Biogeographic Information System (OBIS) server (was http://www.iobis.org ).

It is possible that there are no more active servers which use this now out-of-date type of OBIS server system.
<ul>
<li>OBIS servers expect an XML request and return an XML response.
<li>Because all OBIS servers serve the same variables the same way
  (was http://iobis.org/tech/provider/questions),
  you don't have to specify much to set up an OBIS dataset in ERDDAP™.
<li>You MUST include a "creator_email" attribute in the global addAttributes,
  since that information is used within the license.
  A suitable email address can be found by reading the XML response from the sourceUrl.
<li>You may or may not be able to get the global attribute 
  <a rel="help" href="#subsetVariables"><kbd>&lt;subsetVariables&gt;</kbd></a> to work with
  a given OBIS server. If you try, just try one variable (for example, ScientificName or Genus).
<li><a class="selfLink" id="EDDTableFromOBISSkeletonXML" href="#EDDTableFromOBISSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromOBIS dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromOBIS" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  &lt;sourceCode&gt;...&lt;/sourceCode&gt;
    &lt;!-- If you read the XML response from the sourceUrl, the 
    source code (for example, GHMP) is the value from one of the 
    &lt;resource&gt;&lt;code&gt; tags. --&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;!-- All ...SourceMinimum and Maximum tags are OPTIONAL --&gt;
  &lt;longitudeSourceMinimum&gt;...&lt;/longitudeSourceMinimum&gt; 
  &lt;longitudeSourceMaximum&gt;...&lt;/longitudeSourceMaximum&gt; 
  &lt;latitudeSourceMinimum&gt;...&lt;/latitudeSourceMinimum&gt; 
  &lt;latitudeSourceMaximum&gt;...&lt;/latitudeSourceMaximum&gt; 
  &lt;altitudeSourceMinimum&gt;...&lt;/altitudeSourceMinimum&gt; 
  &lt;altitudeSourceMaximum&gt;...&lt;/altitudeSourceMaximum&gt; 
  &lt;!-- For timeSource... tags, use yyyy-MM-dd'T'HH:mm:ssZ format. --&gt;
  &lt;timeSourceMinimum&gt;...&lt;/timeSourceMinimum&gt; 
  &lt;timeSourceMaximum&gt;...&lt;/timeSourceMaximum&gt; 
  <a rel="help" href="#sourceNeedsExpandedFP_EQ">&lt;sourceNeedsExpandedFP_EQ&gt;</a>true(default)|false&lt;/sourceNeedsExpandedFP_EQ&gt;
    &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1.  This MUST include 
    "creator_email" --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
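
<p>For example, the required "creator_email" attribute could be supplied in the 
global addAttributes as in this small sketch. The address is a placeholder, 
not a real contact; use the one found in the XML response from the sourceUrl.
<pre>
&lt;addAttributes&gt;
  &lt;att name="creator_email"&gt;someone@example.org&lt;/att&gt;   &lt;!-- placeholder address --&gt;
&lt;/addAttributes&gt;
</pre>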


<p><a class="selfLink" id="EDDTableFromParquetFiles" href="#EDDTableFromParquetFiles" rel="bookmark"><strong>EDDTableFromParquetFiles</strong></a> 
  handles data from 
  <a rel="help"
      href="https://parquet.apache.org/"
      >Parquet<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
  See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.

  <ul>
  <li>Parquet is designed to compress very efficiently, so it may give you smaller 
    file sizes than other formats.

  <li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.

  <li>WARNING: When ERDDAP™ reads Parquet data files, if it finds an error 
  on a given line (e.g., incorrect number of items), it logs a warning message 
  ("WARNING: Bad line(s) of data" ... with a list of the bad lines on subsequent lines) 
  to the 
  <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt file</a> 
  and then continues to read the rest of the data file.
  Thus, it is your responsibility to look periodically (or write a script to do so)
  for that message in the log.txt so that you can fix the problems in the data
  files. ERDDAP™ is set up this way so that users can continue to read all of
  the available valid data even though some lines of the file have flaws. 
  <br>&nbsp;

  </ul>


<p><a class="selfLink" id="EDDTableFromSOS" href="#EDDTableFromSOS" rel="bookmark"><strong>EDDTableFromSOS</strong></a> handles data from a
Sensor Observation Service
(SWE/<a rel="help" href="https://www.ogc.org/standards/sos">SOS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) server.
<ul>
<li>This dataset type aggregates data from a group of stations which are all served by one SOS server. 
<li>The stations all serve the same set of variables (although the source for each station doesn't
  have to serve all variables).
<li>SOS servers expect an XML request and return an XML response.
<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.
  It is not easy to generate the dataset XML for SOS datasets by hand.
  To find the needed information, you must visit 
  sourceUrl+"?service=SOS&amp;request=GetCapabilities" in a browser; 
  look at the XML; make a GetObservation request by hand; 
  and look at the XML response to the request.
<li>With the occasional addition of new types of SOS servers
  and changes to the old servers, it is getting harder for ERDDAP™ to 
  automatically detect the server type from the server's responses. 
  The use of &lt;sosServerType&gt; (with a value of 
  IOOS_NDBC, IOOS_NOS, OOSTethys, or WHOI) is now STRONGLY RECOMMENDED. 
  If you have problems with any datasets of this type,
  try re-running GenerateDatasetsXml for the SOS server.
  GenerateDatasetsXml will let you try out the different &lt;sosServerType&gt; options
  until you find the right one for a given server.
<li>SOS overview:
    <ul>
    <li>SWE (Sensor Web Enablement) and SOS (Sensor Observation Service) are
      <a rel="help" href="https://www.ogc.org/standards">OpenGIS&reg; standards<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
      That website has the standards documents.
    <li>The OGC Web Services Common Specification ver 1.1.0 (OGC 06-121r3) covers construction of
      GET and POST queries (see section 7.2.3 and section 9).
    <li>If you send a GetCapabilities XML request to an SOS server
      (sourceUrl + "?service=SOS&amp;request=GetCapabilities"), you get an XML result
      with a list of stations and the observedProperties that they have data for.
    <li>An observedProperty is a formal URI reference to a property. For example,
     urn:ogc:phenomenon:longitude:wgs84 or https://mmisw.org/ont/cf/parameter/sea_water_temperature
    <li>An observedProperty isn't a variable.
    <li>More than one variable may have the same observedProperty (for example, insideTemp
      and outsideTemp might both have observedProperty 
      https://mmisw.org/ont/cf/parameter/air_temperature).
    <li>If you send a getObservation xml request to an SOS server, you get an xml result with
      descriptions of field names in the response, field units, and the data.
      The field names will include longitude, latitude, depth (perhaps), and time.
    <li>Each dataVariable for an EDDTableFromSOS must include an "observedProperty" attribute,
      which identifies the observedProperty that must be requested from the server to
      get that variable. Often, several dataVariables will list the same composite
      observedProperty.
    <li>The dataType for each dataVariable may not be specified by the server.
      If so, you must look at the XML data responses from the server and assign appropriate 
      <a rel="help" href="#dataType"><kbd>&lt;dataType&gt;</kbd>s</a> in the ERDDAP™ dataset dataVariable
      definitions.
    <li>(At the time of writing this) some SOS servers respond to getObservation requests for
      more than one observedProperty by just returning results for the first of the
      observedProperties. (No error message!)
      See the constructor parameter <kbd>requestObservedPropertiesSeparately</kbd>.
    </ul>
<li>EDDTableFromSOS automatically adds 
  <br><kbd>&lt;att name="<a rel="help" href="#subsetVariables">subsetVariables</a>"&gt;station_id, 
    longitude, latitude&lt;/att&gt;</kbd>
  <br>to the dataset's global attributes when the dataset is created.
<li>SOS servers usually express <a rel="help" href="#units">units</a> with the 
      <a rel="help" href="https://unitsofmeasure.org/ucum.html">UCUM<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> system.
  Most ERDDAP™ servers express units with the 
    <a rel="help" href="https://www.unidata.ucar.edu/software/udunits/">UDUNITS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> system.
  If you need to convert between the two systems, you can use
  <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/units.html">ERDDAP's web service to convert UCUM units to/from UDUNITS</a>.
<li><a class="selfLink" id="EDDTableFromSOSSkeletonXML" href="#EDDTableFromSOSSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableFromSOS dataset is:</a>
<pre>
&lt;dataset type="EDDTableFromSOS" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#sourceUrl">&lt;sourceUrl&gt;</a>...&lt;/sourceUrl&gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;sosServerType&gt;...&lt;/sosServerType&gt; &lt;!-- 0 or 1, but STRONGLY
    RECOMMENDED. This lets you specify the type of SOS server 
    (so ERDDAP™ doesn't have to figure it out).
    Valid values are: IOOS_NDBC, IOOS_NOS, OOSTethys, and WHOI. --&gt;
  &lt;responseFormat&gt;...&lt;/responseFormat&gt; &lt;!-- 0 or 1. Use this only if
    you need to override the default responseFormat for the 
    specified sosServerType.  --&gt;
  &lt;stationIdSourceName&gt;...&lt;/stationIdSourceName&gt; &lt;!-- 0 or 1. 
    Default="station_id". --&gt;
  &lt;longitudeSourceName&gt;...&lt;/longitudeSourceName&gt;
  &lt;latitudeSourceName&gt;...&lt;/latitudeSourceName&gt;
  &lt;altitudeSourceName&gt;...&lt;/altitudeSourceName&gt;
  &lt;altitudeSourceMinimum&gt;...&lt;/altitudeSourceMinimum&gt; &lt;!-- 0 or 1 --&gt;
  &lt;altitudeSourceMaximum&gt;...&lt;/altitudeSourceMaximum&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#altitudeMetersPerSourceUnit">&lt;altitudeMetersPerSourceUnit&gt;</a>...&lt;/altitudeMetersPerSourceUnit&gt; 
  &lt;timeSourceName&gt;...&lt;/timeSourceName&gt;
  &lt;timeSourceFormat&gt;...&lt;/timeSourceFormat&gt;
    &lt;!-- timeSourceFormat MUST be either
    * For numeric data: a <a rel="help"
        href="https://www.unidata.ucar.edu/software/udunits/">UDUnits<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."
        /></a>-compatible string (with the format 
      "<i>units</i> since <i>baseTime</i>") describing how to interpret
      source time values (for example, 
      "seconds since 1970-01-01T00:00:00Z"), where the
      base time is an ISO 8601:2004(E) formatted date time
      string (yyyy-MM-dd'T'HH:mm:ssZ).
    * For String date time data: specify 
      <a rel="help" href="#stringTimeUnits">units suitable for string times</a>
      describing how to interpret string times  (for example, the 
      ISO8601TZ_FORMAT "yyyy-MM-dd'T'HH:mm:ssZ"). --&gt;
  &lt;observationOfferingIdRegex&gt;...&lt;/observationOfferingIdRegex&gt;
    &lt;!-- Only observationOfferings with IDs (usually the station names) 
    which match this <a rel="help" 
    href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
    >regular expression<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> (<a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html">tutorial<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) will be included 
    in the dataset (".+" will catch all station names). --&gt;
  &lt;requestObservedPropertiesSeparately&gt;true|false(default)
    &lt;/requestObservedPropertiesSeparately&gt;
  <a rel="help" href="#sourceNeedsExpandedFP_EQ">&lt;sourceNeedsExpandedFP_EQ&gt;</a>true(default)|false&lt;/sourceNeedsExpandedFP_EQ&gt;
  <a rel="help" href="#globalAttributes">&lt;addAttributes&gt;</a>...&lt;/addAttributes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#dataVariable">&lt;dataVariable&gt;</a>...&lt;/dataVariable&gt; &lt;!-- 1 or more. 
    * Each dataVariable MUST include the <a rel="help" href="#dataType">dataType</a> tag.
    * Each dataVariable MUST include the observedProperty attribute. 
    * For IOOS SOS servers, *every* variable returned in the text/csv
      response MUST be included in this ERDDAP™ dataset definition. --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
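<p>As a rough illustration of the skeleton above, a filled-in EDDTableFromSOS chunk might look like
the following sketch. The datasetID, the sourceUrl, the source names, and the single variable shown
here are hypothetical placeholders, not values from a real server. The sketch also assumes that the
observedProperty attribute is supplied through the variable's &lt;addAttributes&gt;.
Use GenerateDatasetsXml to get the actual values for a given server.
<pre>
&lt;dataset type="EDDTableFromSOS" datasetID="exampleSosAirTemp" active="true"&gt;
  &lt;!-- Hypothetical server URL. Replace it with the real SOS endpoint. --&gt;
  &lt;sourceUrl&gt;https://sos.example.org/sos/server.php&lt;/sourceUrl&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;sosServerType&gt;IOOS_NDBC&lt;/sosServerType&gt;
  &lt;!-- The source names below are assumptions. Check the server's XML responses. --&gt;
  &lt;longitudeSourceName&gt;longitude&lt;/longitudeSourceName&gt;
  &lt;latitudeSourceName&gt;latitude&lt;/latitudeSourceName&gt;
  &lt;altitudeSourceName&gt;depth&lt;/altitudeSourceName&gt;
  &lt;!-- Assumes the source reports depth in meters, positive down. --&gt;
  &lt;altitudeMetersPerSourceUnit&gt;-1&lt;/altitudeMetersPerSourceUnit&gt;
  &lt;timeSourceName&gt;time&lt;/timeSourceName&gt;
  &lt;timeSourceFormat&gt;yyyy-MM-dd'T'HH:mm:ssZ&lt;/timeSourceFormat&gt;
  &lt;!-- ".+" includes every observationOffering (station). --&gt;
  &lt;observationOfferingIdRegex&gt;.+&lt;/observationOfferingIdRegex&gt;
  &lt;requestObservedPropertiesSeparately&gt;false&lt;/requestObservedPropertiesSeparately&gt;
  &lt;sourceNeedsExpandedFP_EQ&gt;true&lt;/sourceNeedsExpandedFP_EQ&gt;
  &lt;dataVariable&gt;
    &lt;sourceName&gt;air_temperature&lt;/sourceName&gt;
    &lt;destinationName&gt;air_temperature&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
    &lt;addAttributes&gt;
      &lt;att name="observedProperty"&gt;https://mmisw.org/ont/cf/parameter/air_temperature&lt;/att&gt;
      &lt;att name="units"&gt;degree_C&lt;/att&gt;
    &lt;/addAttributes&gt;
  &lt;/dataVariable&gt;
&lt;/dataset&gt;
</pre>
For an IOOS SOS server, a real dataset would need one &lt;dataVariable&gt; for every variable in
the text/csv response, not just the one shown in this sketch.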


<p><a class="selfLink" id="EDDTableFromThreddsFiles" href="#EDDTableFromThreddsFiles" rel="bookmark"
><strong>EDDTableFromThreddsFiles</strong></a> (deprecated) aggregates data files 
  with several variables, each with one or more shared dimensions 
  (for example, time, altitude (or depth), latitude, longitude), and served by a 
  <a rel="help" 
  href="https://www.unidata.ucar.edu/software/tds/"
  >THREDDS OPeNDAP server<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
<ul>
<li>This dataset type is <strong>DEPRECATED</strong>. 
  The newer and more general solution is to use the
   <a rel="help" href="#cacheFromUrl">cacheFromUrl option for EDDTableFromFiles</a> (or a variant),
   which makes a local copy of the remote files and serves the data from the local files.
   The &lt;cacheFromUrl&gt; option can be used with any type of tabular data file from
   any web-based source that publishes a directory-like list of files.
   <strong><br>If you can't make that work for some reason, email Chris.John at noaa.gov .
   <br>If there are no complaints before 2020, this dataset type may be removed.</strong>

<li>We strongly recommend using the
  <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml program</a> 
  to make a rough draft of the datasets.xml chunk for this dataset.
  You can then edit that to fine tune it.
<li>In most cases, each file has multiple values for the leftmost (first) dimension, for example, time.
<li>The files often have (but don't have to have) a single value for the other dimensions 
  (for example, altitude (or depth), latitude, longitude).
<li>The files may have character variables with an additional dimension (for example, nCharacters).
<li>THREDDS servers can be identified by the "/thredds/" in the URLs.
  For example, 
  <br>https://www.ncei.noaa.gov/thredds/catalog/uv/6h_strs_agg/catalog.html
<li>THREDDS servers have catalogs in various places. This class REQUIRES 
  that the URL include "/thredds/catalog/". You can usually find this 
  URL by starting in a browser in the root catalog, and then 
  clicking through to the desired subcatalog.
<li>This class reads the catalog.xml files served by THREDDS with the lists of 
  <kbd>&lt;catalogRefs&gt;</kbd>
  (references to additional catalog.xml sub-files) and <kbd>&lt;dataset&gt;</kbd>s (data files).
<li>The <kbd>&lt;fileDir&gt;</kbd> setting is ignored. Since this class downloads
  and makes a local copy of each remote data file, ERDDAP™ forces the fileDir
  to be <kbd><i>bigParentDirectory</i>/copy/<i>datasetID</i>/</kbd>.
<li>For <kbd>&lt;sourceUrl&gt;</kbd>, use the URL of the catalog.xml file for the dataset in the 
  THREDDS server.
  For example, for this URL, which may be used in a web browser,
  <br>https://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/catalog.html [2020-10-21 This server is no longer reliably available.],
  <br>use <kbd>&lt;sourceUrl&gt;<wbr>https://data.nodc.noaa.gov/thredds/catalog/nmsp/wcos/catalog.xml<wbr>&lt;/sourceUrl&gt;</kbd>
  <br>(but put it on one line).
<li>Since this class always downloads and makes a local copy of each remote
  data file, you should never wrap this dataset in 
  <a rel="help" href="#EDDTableCopy">EDDTableCopy</a>.
<li>This dataset type supports an OPTIONAL, rarely-used, special tag, 
  <kbd>&lt;specialMode&gt;<i>mode</i>&lt;/specialMode&gt;</kbd>
  which can be used to specify that special, hard-coded rules should be
  used to determine which files should be downloaded from the server.
  Currently, the only valid <kbd><i>mode</i></kbd> is <kbd>SAMOS</kbd> which is used with datasets
  from https://tds.coaps.fsu.edu/thredds/catalog/samos to download only the files with 
  the last version number.  
<li>See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for information on how this class works and how to use it.
<li>See the 1D, 2D, 3D, and 4D examples for 
  <a rel="help" href="#EDDTableFromNcFiles">EDDTableFromNcFiles</a>.
  <br>&nbsp;
</ul>
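<p>The recommended replacement for this deprecated class (an EDDTableFromFiles variant with
&lt;cacheFromUrl&gt;) might be sketched as follows. This is only an outline under stated
assumptions: the datasetID, the remote URL, the local directory, and the file name pattern are
hypothetical, and the remaining tags are elided. See the &lt;cacheFromUrl&gt; and
EDDTableFromFiles documentation for the authoritative details.
<pre>
&lt;dataset type="EDDTableFromNcFiles" datasetID="exampleCachedFiles" active="true"&gt;
  &lt;!-- Hypothetical remote directory listing to copy files from. --&gt;
  &lt;cacheFromUrl&gt;https://data.example.org/thredds/catalog/exampleDir/catalog.html&lt;/cacheFromUrl&gt;
  &lt;!-- Hypothetical local directory where the copied files are kept. --&gt;
  &lt;fileDir&gt;/data/erddap/cache/exampleCachedFiles/&lt;/fileDir&gt;
  &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
  &lt;recursive&gt;true&lt;/recursive&gt;
  &lt;!-- The remaining EDDTableFromFiles tags and the dataVariables go here. --&gt;
  ...
&lt;/dataset&gt;
</pre>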


<p><a class="selfLink" id="EDDTableFromWFSFiles" href="#EDDTableFromWFSFiles" rel="bookmark"><strong>EDDTableFromWFSFiles</strong></a> 
   (DEPRECATED) makes a local copy of all of the data from an ArcGIS MapServer WFS server
  so the data can then be re-served quickly to ERDDAP™ users.
<ul>
<li>You need to specify a specially formatted <kbd>sourceUrl</kbd> global attribute to tell
  ERDDAP™ how to request feature information from the server.  Please use this example as a template:
  <br><kbd>&lt;att name="sourceUrl"&gt;
  <br>http://<i>someUrl/dir1/dir2</i>/MapServer/WFSServer?
  <br>request=GetFeature&amp;amp;service=WFS&amp;amp;typename=aasg:BoreholeTemperature
  <br>&amp;amp;format=&amp;quot;text/xml;%20subType=gml/3.1.1/profiles/gmlsf/1.0.0/0<wbr>&quot;&lt;/att&gt;</kbd>
  <br>(but put it all on one line)
<li>You need to add a special global attribute to tell ERDDAP™ how to identify the
  names of the chunks of data that should be downloaded.  This will probably work for all 
  EDDTableFromWFSFiles datasets: 
  <br><kbd>&lt;att name="rowElementXPath"&gt;/wfs:FeatureCollection/gml:featureMember&lt;/att&gt;</kbd>
<li>Since this class always downloads and makes a local copy of each remote
  data file, you should never wrap this dataset in 
  <a rel="help" href="#EDDTableCopy">EDDTableCopy</a>.
<li>See this class' superclass, <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
  for additional information on how this class works and how to use it.
  <br>&nbsp;
</ul>


<p><a class="selfLink" id="EDDTableAggregateRows" href="#EDDTableAggregateRows" rel="bookmark"><strong>EDDTableAggregateRows</strong></a>
can make an EDDTable dataset from a group of "child" EDDTable datasets.  
<ul>
<li>Here are some uses for EDDTableAggregateRows:
  <ul>
  <li>You could make an EDDTableAggregateRows dataset from two different kinds 
    of files or data sources, for example, a dataset with data 
    up to the end of last month stored in .ncCF files 
    and a dataset with data for the current month stored in a relational database.
  <li>You could make an EDDTableAggregateRows dataset to deal with a change in 
    source files (for example, the time format changed, or 
    a variable name changed, or dataType/scale_factor/add_offset changed). 
    In this case, one child would get data from files made before the change
    and the other child would get data from files made after the change.
    This use of EDDTableAggregateRows is an alternative to using 
    <a rel="help" href="#NcML">NcML</a> or <a rel="help" href="#NCO">NCO</a>.
    Unless there is a distinguishing feature in the filenames (so you can use
    &lt;fileNameRegex&gt; to determine which file belongs to which child dataset),
    you probably need to store the files for the two child datasets in different directories.
  <li>You could make an EDDTableAggregateRows dataset which has 
    a shared subset of variables of one or more similar but different datasets,
    for example, a dataset which makes a Profile dataset 
    from the combination of a Profile dataset, a TimeSeriesProfile dataset, 
    and a TrajectoryProfile dataset 
    (which have some different variables and some variables in common --
    in which case you'll have to make special variants for the child datasets,
    with just the in-common variables).
  <li>You could have several standalone datasets, each with the same type of
    data but from a different station. You could leave those datasets intact,
    but also create an EDDTableAggregateRows dataset which has 
    data from all of the stations -- each of the child datasets could
    be a simple       
    <a rel="help" href="#EDDTableFromErddap">EDDTableFromErddap</a>, 
    which points to one of the existing station datasets. 
    If you do this, give each of the EDDTableFromErddap datasets
    a different datasetID than the original standalone datasets,
    e.g., by appending "Child" to the original datasetID.
  </ul>
<li>Each child <kbd>&lt;dataset&gt;</kbd> that you specify must be a complete
  dataset, as if it were a stand-alone dataset. 
  Each must have the same 
  <a rel="help" href="#dataVariable"><kbd>dataVariables</kbd></a>, 
  in the same order, with the same 
  <a rel="help" href="#destinationName"><kbd>destinationNames</kbd></a>, 
  <a rel="help" href="#dataType"><kbd>dataTypes</kbd></a>, 
  <a rel="help" href="#missing_value"><kbd>missing_values</kbd></a>, 
  <a rel="help" href="#FillValue"><kbd>_FillValues</kbd></a>, 
  and 
  <a rel="help" href="#units"><kbd>units</kbd></a>. 
  The metadata for each variable for the EDDTableAggregateRows dataset 
  comes from variables in the first child dataset,
  but EDDTableAggregateRows will update the 
  <a rel="help" href="#actual_range"><kbd>actual_range</kbd></a> metadata
  to be the actual range for all of the children.
<li>Recommendation: Get each of the child datasets working as stand-alone
  datasets. Then try to make the EDDTableAggregateRows dataset
  by cutting and pasting the datasets.xml chunk for each into the new
  EDDTableAggregateRows dataset.
<li>Dataset Default Sort Order -- The order of the child datasets determines the 
  overall default sort order of the results. 
  Of course, users can request a different sort order 
  for a given set of results by appending 
  <kbd>&amp;orderBy("<i>comma-separated list of variables</i>")</kbd>
  to the end of their query.
<li>The "source" 
  <a rel="help" href="#globalAttributes"><kbd>globalAttributes</kbd></a> 
  for the EDDTableAggregateRows is the combined globalAttributes from the first child dataset.
  The EDDTableAggregateRows can have a global &lt;addAttributes&gt; to provide
  additional global attributes or override the source global attributes.
<li><a class="selfLink" id="EDDTableAggregateRowsSkeletonXML" href="#EDDTableAggregateRowsSkeletonXML" rel="bookmark">The skeleton XML for an EDDTableAggregateRows dataset is:</a>
<pre>
&lt;dataset type="EDDTableAggregateRows" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a>true|false(default)&lt;/accessibleViaFiles&gt; 
    &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#updateEveryNMillis">&lt;updateEveryNMillis&gt;</a>...&lt;/updateEveryNMillis&gt; &lt;!-- 0 or 1. --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- 1 or more --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
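<p>The "several standalone station datasets" use described above might be sketched like this.
The datasetIDs and the sourceUrls are hypothetical. The sketch assumes that both remote datasets
already have identical dataVariables (same order, destinationNames, dataTypes, and units),
as required for all children.
<pre>
&lt;dataset type="EDDTableAggregateRows" datasetID="exampleAllStations" active="true"&gt;
  &lt;reloadEveryNMinutes&gt;60&lt;/reloadEveryNMinutes&gt;
  &lt;!-- The order of the children determines the default sort order of the results. --&gt;
  &lt;dataset type="EDDTableFromErddap" datasetID="exampleStationAChild" active="true"&gt;
    &lt;!-- Hypothetical URL of the existing standalone dataset for station A. --&gt;
    &lt;sourceUrl&gt;https://erddap.example.org/erddap/tabledap/exampleStationA&lt;/sourceUrl&gt;
  &lt;/dataset&gt;
  &lt;dataset type="EDDTableFromErddap" datasetID="exampleStationBChild" active="true"&gt;
    &lt;!-- Hypothetical URL of the existing standalone dataset for station B. --&gt;
    &lt;sourceUrl&gt;https://erddap.example.org/erddap/tabledap/exampleStationB&lt;/sourceUrl&gt;
  &lt;/dataset&gt;
&lt;/dataset&gt;
</pre>
Note that each child's datasetID differs from the datasetID of the original standalone dataset
(here, by the appended "Child").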


<p><a class="selfLink" id="EDDTableCopy" href="#EDDTableCopy" rel="bookmark"><strong>EDDTableCopy</strong></a>
    can make a local 
    copy of many types of EDDTable datasets and then re-serve the data quickly from the 
    local copy.
<ul>
<li>EDDTableCopy (and for grid data, <a rel="help" href="#EDDGridCopy">EDDGridCopy</a>)
    is a very easy-to-use and very effective
  <strong>solution to some of the biggest problems with serving data from remote data sources:</strong>
  <ul>
  <li>Accessing data from a remote data source can be slow.
    <ul>
    <li>The source may be inherently slow (for example, an inefficient type of server), 
    <li>it may be overwhelmed by too many requests,
    <li>or your server or the remote server may be bandwidth limited.
    </ul>
  <li>The remote dataset is sometimes unavailable (again, for a variety of reasons).
  <li>Relying on one source for the data doesn't scale well (for example, when many users and many 
    ERDDAPs utilize it).
    <br>&nbsp;
  </ul>


<li>How It Works -- EDDTableCopy solves these problems by automatically making and maintaining 
  a local copy of the data and serving data from the local copy.
  ERDDAP™ can serve data from the local copy very, very quickly.
  And making and using a local copy relieves the burden on the remote server.
  And the local copy is a backup of the original, which is useful in case something happens
  to the original.

  <p>There is nothing new about making a local copy of a dataset. What is new here is that this 
  class makes it *easy* to create and *maintain* a local copy of data from a *variety* of types 
  of remote data sources and *add metadata* while copying the data.

<li><a class="selfLink" id="EDDTableCopyVsCacheFromUrl" href="#EDDTableCopyVsCacheFromUrl" rel="bookmark"
  >EDDTableCopy vs &lt;cacheFromUrl&gt;</a>
  <br>&lt;cacheFromUrl&gt; is an alternative to EDDTableCopy. They work differently.
  <ul>
  <li>EDDTableCopy works by requesting chunks of data from a remote service and storing 
    those chunks in local files. 
    Thus, EDDTableCopy is useful in some cases where the data is accessible via a remote service.
  <li><a rel="help" href="#cacheFromUrl">&lt;cacheFromUrl&gt;</a>
    downloads the existing files listed on a remote website.
    &lt;cacheFromUrl&gt; is easier to use and more reliable since it can easily
    tell when there is a new remote data file or when a remote data file has changed
    and thus needs to be downloaded. 
  </ul>
  In situations where either EDDTableCopy or &lt;cacheFromUrl&gt; could be used,
  use &lt;cacheFromUrl&gt; because it is easier and more reliable.
    <br>&nbsp;

<li><kbd>&lt;extractDestinationNames&gt;</kbd> -- EDDTableCopy makes the local copy of the 
  data by requesting
  chunks of data from the remote dataset.
  EDDTableCopy determines which chunks to request by requesting the &amp;distinct() values
  for the <kbd>&lt;extractDestinationNames&gt;</kbd> (specified in the datasets.xml, see below), 
  which are the space-separated destination names of variables in the remote dataset.
  For example, 
  <br><kbd>&lt;extractDestinationNames&gt;drifter profile&lt;/extractDestinationNames&gt;</kbd> 
  <br>might
  yield distinct value combinations such as drifter=tig17,profile=1017, drifter=tig17,profile=1095,
  ... drifter=une12,profile=1223, drifter=une12,profile=1251, .... 

  <p>In situations where one column (for example, profile) may be all that is required to uniquely
  identify a group of rows of data, if there are a very large number of, for example, profiles,
  it may be useful to also specify an additional  extractDestinationName (for example, drifter)
  which serves to subdivide the profiles.
  That leads to fewer data files in a given directory, which may lead to faster access.


<li>Local Files -- Each chunk of data is stored in a separate NetCDF file in a subdirectory of
  <i>bigParentDirectory</i>/copy/<i>datasetID</i>/ (as specified in 
    <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a>).
  There is one subdirectory level for all but the last extractDestinationName. 
  For example, data for tig17+1017 would be stored in 
    <br><i>bigParentDirectory</i>/copy/sampleDataset/tig17/1017.nc .
  <br>For example, data for une12+1251 would be stored in 
    <br><i>bigParentDirectory</i>/copy/sampleDataset/une12/1251.nc .
  <br>Directory and filenames created from data values are modified to make them file-name-safe 
  (for example, spaces are replaced by "x20") -- this doesn't affect the actual data.
  <br>&nbsp;

<li>New Data -- Each time EDDTableCopy is reloaded, it checks the remote dataset
  to see what distinct chunks are available.
  If the file for a chunk of data doesn't already exist, a request to get the chunk is
  added to a queue.
  ERDDAP's taskThread processes all the queued requests for chunks of data, one-by-one.
  You can see statistics for the taskThread's activity on the
     <a rel="help" href="https://erddap.github.io/setup.html#statusPage">Status Page</a> and in the 
     <a rel="help" href="https://erddap.github.io/setup.html#dailyReport">Daily Report</a>.
  (Yes, ERDDAP™ could assign multiple tasks to this process, but that would use up 
  lots of the remote data source's bandwidth, memory, and CPU time, and
  lots of the local ERDDAP's bandwidth, memory, and CPU time, neither of which is a good idea.)

  <p>NOTE: The very first time an EDDTableCopy is loaded, (if all goes well) 
  lots of requests for chunks of data will be added to the taskThread's queue,
  but no local data files will have been created.
  So the constructor will fail but taskThread will continue to work and create local files.
  If all goes well, the taskThread will make some local data files and the next attempt to reload
  the dataset (in ~15 minutes) will succeed, but initially with a very limited amount of data.

  <p>NOTE: After the local dataset has some data and appears in your ERDDAP,
  if the remote dataset is temporarily or permanently not accessible,
  the local dataset will still work.

  <p>WARNING: If the remote dataset is large and/or the remote server is slow (that's the problem,
  isn't it?!), it will take a long time to make a complete local copy.
  In some cases, the time needed will be unacceptable.
  For example, transmitting 1 TB of data over a T1 line (1.544 Mbit/s, about 0.19 MB/s) takes at least 60 days,
  under optimal conditions.
  Plus, it uses lots of bandwidth, memory, and CPU time on the remote and local computers.
  The solution is to mail a hard drive to the administrator of the remote data set so that
  s/he can make a copy of the dataset and mail the hard drive back to you.
  Use that data as a starting point and EDDTableCopy will add data to it.
  (That is how Amazon's EC2 Cloud Service used to handle the problem, 
  even though their system has lots of bandwidth.)

  <p>WARNING: If a given combination of values disappears from a remote dataset,
  EDDTableCopy does NOT delete the local copied file. If you want to, you can delete it yourself.

<li><a class="selfLink" id="tableCopy_checkSourceData" href="#tableCopy_checkSourceData" rel="bookmark"><kbd>&lt;checkSourceData&gt;</kbd></a> -- 
The datasets.xml for this dataset can have an optional tag
<br><kbd>&lt;checkSourceData&gt;true&lt;/checkSourceData&gt;</kbd>
<br>The default value is true. If/when you set it to false, the dataset won't ever
check the source dataset to see if there is additional data available.
    <br>&nbsp;

<li>Recommended Use --
  <ol>
  <li>Create the <kbd>&lt;dataset&gt;</kbd> entry (the native type, not EDDTableCopy) 
    for the remote data source. 
    <strong>Get it working correctly, including all of the desired metadata.</strong> 
  <li>If it is too slow, add XML code to wrap it in an EDDTableCopy dataset.
    <ul>
    <li>Use a different datasetID (perhaps by changing the old datasetID slightly).
    <li>Copy the <kbd>&lt;accessibleTo&gt;, &lt;reloadEveryNMinutes&gt;</kbd> and 
      <kbd>&lt;onChange&gt;</kbd> from the
      remote EDDTable's XML to the EDDTableCopy's XML.
      (Their values for EDDTableCopy matter; their values for the inner dataset become irrelevant.)
    <li>Create the <kbd>&lt;extractDestinationNames&gt;</kbd> tag (see above).
    <li><kbd>&lt;orderExtractBy&gt;</kbd> is an OPTIONAL space separated list of destination 
      variable names
      in the remote dataset.
      When each chunk of data is downloaded from the remote server, the chunk will be sorted by
      these variables (by the first variable, then by the second variable if the first variable
      is tied, ...).
      In some cases, ERDDAP™ will be able to extract data faster from the local data files
      if the first variable in the list is a numeric variable ("time" counts as a numeric variable).
      But choose these variables in a way that is appropriate for the dataset.
    </ul>
  <li>ERDDAP™ will make and maintain a local copy of the data.
    <br>&nbsp;
  </ol>

<li>WARNING: EDDTableCopy assumes that the data values for each chunk don't ever change. 
  If/when they do, you need to manually delete the chunk files in 
     <i>bigParentDirectory</i>/copy/<i>datasetID</i>/
  which changed and <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a> 
     the dataset to be reloaded so that the deleted
  chunks will be replaced.
  If you have an email subscription to the dataset, you will get two emails:
  one when the dataset first reloads and starts to copy the data, 
  and another when the dataset loads again (automatically) and detects the new local data files.
  <br>&nbsp;

<li>Change Metadata -- If you need to change any addAttributes or change the order of the
  variables associated with the source dataset:
  <ol>
  <li>Change the addAttributes for the source dataset in datasets.xml, as needed. 
  <li>Delete one of the copied files.
  <li>Set a <a rel="help" href="https://erddap.github.io/setup.html#flag">flag</a> 
    to reload the dataset immediately.
    If you do use a flag and you have an email subscription to the dataset, you will get two emails:
    one when the dataset first reloads and starts to copy the data, 
    and another when the dataset loads again (automatically) and detects the new local data files.
  <li>The deleted file will be regenerated with the new metadata. 
    If the source dataset is ever unavailable, the EDDTableCopy dataset will get metadata 
    from the regenerated file, since it is the youngest file.
    <br>&nbsp;
  </ol>

<li><a rel="help" href="#EDDGridCopy">EDDGridCopy</a> is very similar to EDDTableCopy, 
   but works with gridded datasets.
    <br>&nbsp;

<li><a class="selfLink" id="EDDTableCopySkeletonXML" href="#EDDTableCopySkeletonXML" rel="bookmark"
>The skeleton XML for an EDDTableCopy dataset is:</a>
<pre>
&lt;dataset type="EDDTableCopy" <a rel="help" href="#datasetID">datasetID</a>="..." <a rel="help" href="#active">active</a>="..." &gt;
  <a rel="help" href="#accessibleTo">&lt;accessibleTo&gt;</a>...&lt;/accessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#graphsAccessibleTo">&lt;graphsAccessibleTo&gt;</a>auto|public&lt;/graphsAccessibleTo&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#accessibleViaFiles">&lt;accessibleViaFiles&gt;</a>true|false(default)&lt;/accessibleViaFiles&gt; 
    &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#reloadEveryNMinutes">&lt;reloadEveryNMinutes&gt;</a>...&lt;/reloadEveryNMinutes&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultDataQuery">&lt;defaultDataQuery&gt;</a>...&lt;/defaultDataQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#defaultGraphQuery">&lt;defaultGraphQuery&gt;</a>...&lt;/defaultGraphQuery&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#addVariablesWhere">&lt;addVariablesWhere&gt;</a>...&lt;/addVariablesWhere&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fgdcFile">&lt;fgdcFile&gt;</a>...&lt;/fgdcFile&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#iso19115File">&lt;iso19115File&gt;</a>...&lt;/iso19115File&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#onChange">&lt;onChange&gt;</a>...&lt;/onChange&gt; &lt;!-- 0 or more --&gt;
  &lt;extractDestinationNames&gt;...&lt;/extractDestinationNames&gt;  &lt;!-- 1 --&gt;
  &lt;orderExtractBy&gt;...&lt;/orderExtractBy&gt; &lt;!-- 0 or 1 --&gt;
  <a rel="help" href="#fileTableInMemory">&lt;fileTableInMemory&gt;</a>...&lt;/fileTableInMemory&gt; &lt;!-- 0 or 1 (true or false 
    (the default)) --&gt;
  <a rel="help" href="#tableCopy_checkSourceData">&lt;checkSourceData&gt;</a>...&lt;/checkSourceData&gt; &lt;!-- 0 or 1 --&gt;
  &lt;dataset&gt;...&lt;/dataset&gt; &lt;!-- 1 --&gt;
&lt;/dataset&gt;
</pre>
&nbsp;
</ul>
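<p>Putting the pieces above together, a wrapped dataset might look like the following sketch.
The datasetIDs are hypothetical, the inner dataset's type and contents are left elided, and the
choice of <kbd>time</kbd> for &lt;orderExtractBy&gt; assumes that the remote dataset has a
variable with that destination name.
<pre>
&lt;dataset type="EDDTableCopy" datasetID="sampleDataset" active="true"&gt;
  &lt;!-- These values matter for the copy. The inner dataset's values become irrelevant. --&gt;
  &lt;reloadEveryNMinutes&gt;60&lt;/reloadEveryNMinutes&gt;
  &lt;!-- One chunk (one local file) per distinct drifter+profile combination. --&gt;
  &lt;extractDestinationNames&gt;drifter profile&lt;/extractDestinationNames&gt;
  &lt;!-- Assumed variable name. Each chunk is sorted by it before being stored. --&gt;
  &lt;orderExtractBy&gt;time&lt;/orderExtractBy&gt;
  &lt;checkSourceData&gt;true&lt;/checkSourceData&gt;
  &lt;dataset type="..." datasetID="sampleDatasetSource" active="true"&gt;
    &lt;!-- The complete, already working definition of the remote dataset goes here. --&gt;
    ...
  &lt;/dataset&gt;
&lt;/dataset&gt;
</pre>
With these settings, the chunk for drifter=tig17, profile=1017 would be stored in
<i>bigParentDirectory</i>/copy/sampleDataset/tig17/1017.nc, as described under Local Files.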


<hr>
<h2><a class="selfLink" id="details" href="#details" rel="bookmark">Details</a></h2>


Here are detailed descriptions of common tags and attributes.
<ul>
<li><a class="selfLink" id="angularDegreeUnits" href="#angularDegreeUnits" rel="bookmark"
  ><kbd><strong>&lt;angularDegreeUnits&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml which contains a
  comma-separated list of units strings that ERDDAP™ should treat as angular degrees units.
  If a variable has one of these units, tabledap's orderByMean filter will 
  calculate the mean in a special way, then report the mean as a value from -180 to 180.
  See ERDDAP's EDStatic.java source code file for the current default list.
  Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <br>&nbsp;

<li><a class="selfLink" id="angularDegreeTrueUnits" href="#angularDegreeTrueUnits" rel="bookmark"
  ><kbd><strong>&lt;angularDegreeTrueUnits&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml which contains a
  comma-separated list of units strings that ERDDAP™ should treat as angular degrees true units.
  If a variable has one of these units, tabledap's orderByMean filter will 
  calculate the mean in a special way, then report the mean as a value from 0 to 360.
  See ERDDAP's EDStatic.java source file for the current default list.
  Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <br>&nbsp;
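  <p>The "special way" is not spelled out here. One common way to average directions
  is the circular (vector) mean, sketched below in Python. This is an illustration under
  that assumption, not ERDDAP's actual code; the function name
  <kbd>circular_mean_degrees</kbd> and its <kbd>true_degrees</kbd> parameter are hypothetical.

```python
import math


def circular_mean_degrees(angles, true_degrees=False):
    """Mean direction of a list of angles given in degrees.

    Returns a value from -180 to 180 by default, or from 0 to 360
    when true_degrees is True (as for "degrees true" units).
    """
    if not angles:
        return float("nan")
    # Sum the unit vectors that point in each direction.
    x = sum(math.cos(math.radians(a)) for a in angles)
    y = sum(math.sin(math.radians(a)) for a in angles)
    mean = math.degrees(math.atan2(y, x))  # atan2 yields -180 to 180
    if true_degrees:
        mean %= 360.0  # shift negative results into the 0 to 360 range
    return mean


print(circular_mean_degrees([350, 30], true_degrees=True))
print(circular_mean_degrees([-10, -30]))
```

  <p>A plain arithmetic mean of 350 and 10 degrees would be 180, which points the wrong way;
  the vector mean is approximately 0.
  <br>&nbsp;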

<li><a class="selfLink" id="commonStandardNames" href="#commonStandardNames" rel="bookmark"
  ><kbd><strong>&lt;commonStandardNames&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  a comma-separated list of common 
  <a rel="help" href="https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html">CF standard names<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>.
  E.g.,
  <br><kbd>&lt;commonStandardNames&gt;air_pressure, ..., wind_to_direction&lt;/commonStandardNames&gt;</kbd>
  <br>This list is used in DataProviderForm3.html as a convenience to users.
  <br>If you want to provide this information in datasets.xml, start by copying the current 
    default list in &lt;DEFAULT_commonStandardNames&gt; in ERDDAP's
    <br>[tomcat]/webapps/erddap/WEB-INF/classes/gov/noaa/pfel/erddap/util/messages.xml file.
  <br>&nbsp;

<li><a class="selfLink" id="cacheMinutes" href="#cacheMinutes" rel="bookmark"
  ><kbd><strong>&lt;cacheMinutes&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  the age (in minutes) at which files in the cache should be deleted (default=60). E.g.,
  <br><kbd>&lt;cacheMinutes&gt;60&lt;/cacheMinutes&gt;</kbd>
  <br>In general, only image files (because the same images are often 
  requested repeatedly) and .nc files (because they must be fully created before
  sending to the user) are cached.
  Although it might seem like a given request should always return the same 
  response, that isn't true. 
  For example, a tabledap request which includes time&gt;<i>someTime</i> will change 
  when new data arrives for the dataset. 
  And a griddap request which includes [last] for the time dimension will change 
  when new data arrives for the dataset.
  Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  Before ERDDAP™ v2.00, this was specified in setup.xml, which is still allowed
  but discouraged.
  <br>&nbsp;


<li><a class="selfLink" id="convertInterpolateRequestCSVExample" 
                     href="#convertInterpolateRequestCSVExample" 
rel="bookmark"><kbd><strong>&lt;convertInterpolateRequestCSVExample&gt;</strong></kbd></a>
  is an OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml 
  [starting with ERDDAP™ v2.10] which contains an
  example which will be shown on the Interpolate converter's web page.
  The default value is: jplMURSST41/analysed_sst/Bilinear/4 .

<li><a class="selfLink" id="convertInterpolateDatasetIDVariableList" 
                     href="#convertInterpolateDatasetIDVariableList" 
rel="bookmark"><kbd><strong>&lt;convertInterpolateDatasetIDVariableList&gt;</strong></kbd></a>
  is an OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml
  [starting with ERDDAP™ v2.10] which contains a
  CSV list of datasetID/variableName examples which will be used as suggestions 
  by the Interpolate converter's web page. The default value is: jplMURSST41/analysed_sst .

<li><a class="selfLink" id="convertToPublicSourceUrl" href="#convertToPublicSourceUrl" 
rel="bookmark"><kbd><strong>&lt;convertToPublicSourceUrl&gt;</strong></kbd></a>
  is an OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml which contains a
    "<kbd>from</kbd>" and a "<kbd>to</kbd>" attribute which together specify how to convert 
      a matching local sourceUrl (usually an IP number) into a public sourceUrl (a domain name).
    "<kbd>from</kbd>" must have the form "<kbd>[something]//[something]/</kbd>".
      There can be 0 or more of these tags.
    For more information see <a rel="help" href="#sourceUrl"><kbd>&lt;sourceUrl&gt;</kbd></a>.
    For example,
    <br><kbd>&lt;convertToPublicSourceUrl from="https://192.168.31.18/" to="https://oceanwatch.pfeg.noaa.gov/" /&gt;</kbd>
    <br>will convert a matching local sourceUrl (such as
    https://192.168.31.18/thredds/dodsC/satellite/BA/ssta/5day)
    <br>into a public sourceUrl (https://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/BA/ssta/5day).
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
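  <p>The conversion can be pictured as a simple prefix replacement. The Python sketch below
  is only a model of the behavior described above, not ERDDAP's code;
  <kbd>to_public_source_url</kbd> is a hypothetical helper, and taking the first matching
  rule is an assumption.

```python
def to_public_source_url(source_url, conversions):
    """Replace a matching local prefix with its public counterpart.

    conversions is a list of (from_prefix, to_prefix) pairs, one pair
    per convertToPublicSourceUrl tag.
    """
    for from_prefix, to_prefix in conversions:
        if source_url.startswith(from_prefix):
            return to_prefix + source_url[len(from_prefix):]
    return source_url  # no rule matched: leave the URL unchanged


conversions = [("https://192.168.31.18/", "https://oceanwatch.pfeg.noaa.gov/")]
local = "https://192.168.31.18/thredds/dodsC/satellite/BA/ssta/5day"
print(to_public_source_url(local, conversions))
```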

  <p>But, for security reasons and reasons related to the subscription system, 
    <strong>DON'T USE THIS TAG!</strong>
  <br>Instead, always use the public domain name in the <kbd>&lt;sourceUrl&gt;</kbd> tag
  and use the 
  <a rel="help" href="https://linux.die.net/man/5/hosts">/etc/hosts table</a>
  on your server to convert local domain names 
  to IP numbers without using a DNS server. 
  You can test if a domain name is properly converted into an IP number by using:
  <br><kbd>ping <i>some.domain.name</i></kbd>
  <br>&nbsp;

<li><a class="selfLink" id="dataImagePngBase64" href="#dataImagePngBase64" rel="bookmark"
  ><strong><kbd>data:image/png;base64,</kbd></strong></a> --
  When a user requests an .htmlTable response from ERDDAP™, 
  if the data in a String cell contains <kbd>data:image/png;base64,</kbd> followed by a base64 encoded .png image,
  ERDDAP™ will display an icon (so the user can see the image if they hover over it)
  and buttons to save the text or the image to the clipboard.    
  This feature was added in ERDDAP™ v2.19 by Marco Alba.
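  <p>For data providers, a String value of this form can be produced with any base64 encoder.
  The Python sketch below shows one way to build and unpack such a value. The helper names are
  hypothetical, and requiring the prefix at the start of the string is a simplifying assumption.

```python
import base64

PREFIX = "data:image/png;base64,"


def png_to_cell_value(png_bytes):
    """Encode raw .png bytes as a data:image/png;base64, string."""
    return PREFIX + base64.b64encode(png_bytes).decode("ascii")


def cell_value_to_png(cell):
    """Return the decoded .png bytes, or None if the prefix is absent."""
    if not cell.startswith(PREFIX):
        return None
    return base64.b64decode(cell[len(PREFIX):])


# The 8-byte PNG file signature stands in for a real image here.
signature = b"\x89PNG\r\n\x1a\n"
cell = png_to_cell_value(signature)
print(cell)
```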

<li><a class="selfLink" id="drawLandMask" href="#drawLandMask" rel="bookmark"
  ><strong>drawLandMask</strong></a>
  specifies the default setting which controls when and how the 
  landmask should be drawn when ERDDAP™ draws a map.
  It can be specified in three different places in datasets.xml (listed
  from lowest to highest priority):
  <ol>
  <li>If drawLandMask is specified within <kbd>&lt;erddapDatasets&gt;</kbd> 
    (not connected with any specific dataset), then it specifies the 
    default value of drawLandMask for all variables in all datasets.
    For example, 
    <br><kbd>&lt;drawLandMask&gt;under&lt;/drawLandMask&gt;</kbd>
    <br>Any changes to this tag's value will take effect the next time ERDDAP
      reads datasets.xml.
    <br>If this tag isn't present, the underlying default value is <kbd>under</kbd>.
    <br>&nbsp;
  <li>If drawLandMask is specified as a global attribute of a given dataset, 
    then it specifies the default value of drawLandMask for all variables
    in that dataset, overriding any lower priority setting. 
    For example, 
    <br><kbd>&lt;att name="drawLandMask"&gt;under&lt;/att&gt;</kbd>
    <br>Any changes to this tag's value will take effect the next time ERDDAP™ 
    reloads that dataset. 
    <br>&nbsp;
  <li>If drawLandMask is specified as a variable's attribute in a given dataset,
    then it specifies the default value of drawLandMask for that variable
    in that dataset, overriding any lower priority setting. 
    For example, 
    <br><kbd>&lt;att name="drawLandMask"&gt;under&lt;/att&gt;</kbd>
    <br>Any changes to this tag's value will take effect the next time ERDDAP™ 
    reloads that dataset. 
  </ol>
  <p>A user can override the default (wherever it is specified) 
  by selecting a value for "Draw land mask" 
  from a dropdown list on the dataset's Make A Graph web page, or by including
  <kbd>&amp;.land=<i>value</i></kbd> in the URL that requests a map from ERDDAP.

  <p>In all situations, there are 4 possible values for the attribute: 
  <ul>
  <li>"under" draws the landmask before it draws data on the map.
    <br>For gridded datasets, land appears as a constant light gray color.
    <br>For tabular datasets, "under" shows topography data over land and oceans.
  <li>"over" -- 
    For gridded datasets, "over" draws the landmask after it draws data on maps so that it will mask any data over land.
    For tabular datasets, "over" shows bathymetry of the ocean and a constant light gray where there is land, both drawn under the data.
  <li>"outline" just draws the outline of the landmask, political boundaries, lakes and rivers.
  <li>"off" doesn't draw anything.
    <br>&nbsp;
  </ul>
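  <p>The priority order above can be summarized as "the most specific setting wins".
  The following Python sketch models that rule. It is not ERDDAP's code: the function and
  parameter names are hypothetical, and ignoring unrecognized values is an assumption.

```python
VALID_VALUES = ("under", "over", "outline", "off")


def effective_draw_land_mask(erddap_default=None, dataset_att=None,
                             variable_att=None, url_land=None):
    """Pick the drawLandMask value, from lowest to highest priority."""
    chosen = "under"  # the underlying default when nothing is specified
    for candidate in (erddap_default, dataset_att, variable_att, url_land):
        if candidate in VALID_VALUES:
            chosen = candidate  # a later (higher priority) setting overrides
    return chosen


print(effective_draw_land_mask(erddap_default="over", variable_att="outline"))
```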

<li><a class="selfLink" id="emailDiagnosticsToErdData" href="#emailDiagnosticsToErdData" rel="bookmark"
  ><kbd><strong>&lt;emailDiagnosticsToErdData&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml. 
  The tag's value can be <kbd>true</kbd> (the default) or <kbd>false</kbd>.
  If <kbd>true</kbd>, ERDDAP™ will email the stack trace to Chris.John at noaa.gov (the ERDDAP™ development team).
  This should be safe and secure since no confidential information (e.g., the requestUrl) is included in the email.
  This should make it possible to catch any obscure, totally unexpected bugs that lead to NullPointerExceptions.
  Otherwise, the user sees the exceptions, but the ERDDAP™ development team doesn't 
  (so we don't know there is a problem that needs to be fixed).
  <br>&nbsp;

<li><a class="selfLink" id="graphBackgroundColor" href="#graphBackgroundColor" rel="bookmark"
  ><kbd><strong>&lt;graphBackgroundColor&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  the default background color on graphs.
  This affects almost all graphs. There are a few situations not affected.
  The color is specified as an 8 digit hexadecimal value in the form <kbd>0xAARRGGBB</kbd>,
  where AA, RR, GG, and BB are the opacity, red, green and blue components, 
  respectively. "0x" is case sensitive, but the hexadecimal digits are not
  case sensitive. For example, a fully opaque (ff) greenish-blue color with 
  red=22, green=88, blue=ee would be <kbd>0xff2288ee</kbd>. 
  Opaque white is <kbd>0xffffffff</kbd>. 
  The default is opaque light blue (<kbd>0xffccccff</kbd>), which has the
  advantage of being different from white, which is an important color in many 
  palettes used to draw data. 
  For example,
  <br><kbd>&lt;graphBackgroundColor&gt;0xffffffff&lt;/graphBackgroundColor&gt;</kbd>
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <br>&nbsp;
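  <p>To check a color value before putting it in datasets.xml, the <kbd>0xAARRGGBB</kbd> form can
  be split into its four components. This Python sketch is one way to do that;
  <kbd>parse_argb</kbd> is a hypothetical helper, not part of ERDDAP.

```python
import re


def parse_argb(value):
    """Split a 0xAARRGGBB string into (opacity, red, green, blue), each 0-255."""
    # "0x" must be lowercase; the 8 hex digits may be in either case.
    if not re.fullmatch(r"0x[0-9a-fA-F]{8}", value):
        raise ValueError("expected 0x followed by 8 hex digits: " + value)
    n = int(value[2:], 16)
    return ((n >> 24) & 0xFF, (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF)


print(parse_argb("0xff2288ee"))
print(parse_argb("0xffccccff"))
```
  <br>&nbsp;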

<li><a class="selfLink" id="ipAddressMaxRequests" href="#ipAddressMaxRequests" rel="bookmark"
  ><kbd><strong>&lt;ipAddressMaxRequests&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag (first supported with ERDDAP™ v2.12) 
  within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml
  that is part of a system to limit the ability of 
  overly aggressive legitimate users and malicious users to make a 
  large number of simultaneous requests which would degrade system performance for other users.
  ipAddressMaxRequests specifies the maximum number of simultaneous requests
        that will be accepted from any specific IP address. Additional 
        requests will receive an HTTP 429 error: Too Many Requests.
        The small, static files in erddap/download/ and erddap/images/
        are NOT exempt from this count.
        The default is 15. The maximum allowed is 1000, which is crazy high -- don't do it!
        ERDDAP™ won't accept a number less than 6 because
        many legitimate users (notably web browsers and WMS clients) make
        up to 6 requests at a time. The ERDDAP™ Daily Report, and the
        similar information written to the log.txt file with each Major Dataset Reload,
        include a tally of the requests by these IP addresses
        under the title "Requester's IP Address (Too Many Requests)".
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 

        <p>The "Major LoadDatasets Time Series" section of status.html includes
        a "tooMany" column which lists the number of requests which exceeded
        a user's ipAddressMaxRequests setting and thus saw a "Too Many Requests" error. 
        This lets you easily see
        when there are active overly aggressive legitimate users and malicious users
        so you can (optionally) look in the log.txt file and decide if you want to 
        blacklist those users.

   <p>There's nothing specifically wrong with setting this to a higher number. It's up to you.
     But doing so allows/encourages people to set up systems that use a large number of
     threads to work on projects and then gives them no feedback that what they are
     doing isn't getting them any benefit.

<li><a class="selfLink" id="ipAddressMaxRequestsActive" href="#ipAddressMaxRequestsActive" rel="bookmark"
  ><kbd><strong>&lt;ipAddressMaxRequestsActive&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag (first supported with ERDDAP™ v2.12) 
  within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml
  that is part of a system to limit the ability of 
  overly aggressive legitimate users and malicious users to make a 
  large number of simultaneous requests which would degrade system performance for other users.
  ipAddressMaxRequestsActive specifies the maximum number of simultaneous requests
       that will be actively processed from any specific IP address. 
       Additional requests will sit in a queue until the previous requests
       have been processed.
       The small, static files in erddap/download/ and erddap/images/
       ARE exempt from this count and the related throttling.
       The default is 2. The maximum allowed is 100, which is crazy high -- don't do it!
       You can set this to 1 to be strict, especially if you have problems
       with overly aggressive or malicious users.
       Users will still quickly get all the data they request (up to ipAddressMaxRequests),
       but they won't be able to hog system resources.
       We don't recommend setting this to a larger number because it allows 
       overly aggressive legitimate users and malicious users to dominate 
       ERDDAP's processing capacity.
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <br>&nbsp;
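  <p>Taken together, ipAddressMaxRequests and ipAddressMaxRequestsActive split the simultaneous
  requests from one IP address into three groups. The Python sketch below is a toy model of that
  split using the default values. It is not ERDDAP's implementation, it ignores the exemption for
  small static files and the ipAddressUnlimited list, and <kbd>classify_requests</kbd> is a
  hypothetical name.

```python
def classify_requests(n_simultaneous, max_requests=15, max_active=2):
    """Model how n simultaneous requests from one IP address are handled."""
    accepted = min(n_simultaneous, max_requests)
    active = min(accepted, max_active)     # processed right away
    queued = accepted - active             # wait for earlier requests to finish
    rejected = n_simultaneous - accepted   # receive HTTP 429: Too Many Requests
    return {"active": active, "queued": queued, "rejected": rejected}


print(classify_requests(20))
```
  <br>&nbsp;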

<li><a class="selfLink" id="ipAddressUnlimited" href="#ipAddressUnlimited" rel="bookmark"
  ><kbd><strong>&lt;ipAddressUnlimited&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag (first supported with ERDDAP™ v2.12) 
  within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml
  that is part of a system to limit the ability of 
  overly aggressive legitimate users and malicious users to make a 
  large number of simultaneous requests which would degrade system performance for other users.
  ipAddressUnlimited is a comma-separated list of IP addresses that
        you want to allow unlimited access to your ERDDAP. 
        Look in your log.txt file to see which format your server is using for the IP addresses.
        On some servers, the IP addresses will be in the format #.#.#.# (where # is an integer
        from 0 to 255); whereas on others it will be in the format #:#:#:#:#:#:#:# .
        Requesters on this list are not subject to either the ipAddressMaxRequests
        or the ipAddressMaxRequestsActive settings. This list might include
        a secondary ERDDAP™ or certain users or servers in your system.
        ERDDAP™ always adds "(unknownIPAddress)", which ERDDAP™ uses when 
        the requester's IP address can't be determined, e.g., for other 
        processes running on the same server.
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 

  <p>If for some reason all of a user's requests get the error message
  "Timeout waiting for your other requests to process.", then 
  you can solve the problem by adding the user's IP address to the 
  ipAddressUnlimited list, applying that change, then removing it from that list.

<li><a class="selfLink" id="loadDatasetsMinMinutes" href="#loadDatasetsMinMinutes" rel="bookmark"
  ><kbd><strong>&lt;loadDatasetsMinMinutes&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  the minimum time (in minutes, default=15) between major loadDatasets (when ERDDAP™ reprocesses
  datasets.xml, including checking each dataset to see if it needs to be reloaded
  according to its reloadEveryNMinutes setting). E.g.,
  <br><kbd>&lt;loadDatasetsMinMinutes&gt;15&lt;/loadDatasetsMinMinutes&gt;</kbd>
  <br>If a given run of loadDatasets takes less than this time,
  the loader just repeatedly looks at the flag directory and/or sleeps 
  until the remaining time has passed.
  The default is 15 minutes, which should be fine for almost everyone.
  The only disadvantage to setting this to a smaller number is that 
  it will increase the frequency that ERDDAP™ retries datasets that have
  errors that prevent them from being loaded (e.g., a remote server is down). 
  If there are lots of such datasets and they are retested frequently,
  the data source might consider it pestering/aggressive behavior.
  Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  Before ERDDAP™ v2.00, this was specified in setup.xml, which is still allowed
  but discouraged.
  <br>&nbsp;

<li><a class="selfLink" id="loadDatasetsMaxMinutes" href="#loadDatasetsMaxMinutes" rel="bookmark"
  ><kbd><strong>&lt;loadDatasetsMaxMinutes&gt;</strong></kbd></a>
  is an OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  the maximum time (in minutes) a major loadDatasets effort is allowed to take
  (before the loadDatasets thread is treated as "stalled" and is interrupted) (default=60). E.g.,
  <br><kbd>&lt;loadDatasetsMaxMinutes&gt;60&lt;/loadDatasetsMaxMinutes&gt;</kbd>
  <br>In general, this should be set to at least twice as long as you reasonably
  think that reloading all of the datasets (cumulatively) should take
  (since computers and networks sometimes are slower than expected).
  This should always be much longer than loadDatasetsMinMinutes.
  The default is 60 minutes.  Some people will set this to longer.
  Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
   Before ERDDAP™ v2.00, this was specified in setup.xml, which is still allowed
   but discouraged.
  <br>&nbsp;

<li><a class="selfLink" id="logLevel" href="#logLevel" rel="bookmark"
  ><kbd><strong>&lt;logLevel&gt;</strong></kbd></a>
  is an OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  how many diagnostic messages are sent to the log.txt file.  
  It can be set to "warning" (the fewest messages), "info" (the default), or "all" (the most messages). 
  E.g.,
  <br><kbd>&lt;logLevel&gt;info&lt;/logLevel&gt;</kbd>
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>.
  Before ERDDAP™ v2.00, this was specified in setup.xml, which is still allowed
  but discouraged.
  <br>&nbsp;

<li><a class="selfLink" id="partialRequestMaxBytes" href="#partialRequestMaxBytes" rel="bookmark"
  ><kbd><strong>&lt;partialRequestMaxBytes&gt;</strong></kbd></a> and 
  <a class="selfLink" id="partialRequestMaxCells" href="#partialRequestMaxCells" rel="bookmark"
  ><kbd><strong>&lt;partialRequestMaxCells&gt;</strong></kbd></a>
  are rarely used OPTIONAL tags within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml.
  When possible (and it isn't always possible), ERDDAP™ breaks large data requests
  into chunks to conserve memory.

  <p>With 32 bit Java, in a simplistic sense, the maximum number of simultaneous
  <i>large</i> requests is roughly 3/4 of the memory available
  (the -Xmx value passed to Tomcat) divided by the chunk size
  (e.g., 1200 MB / 100 MB => 12 requests).
  Other things require memory, so the actual number of requests will be less.
  In practice, chunking isn't always possible.
  So one huge or a few very large simultaneous non-chunkable requests
  could cause problems on 32 bit Java.

  <p>With 64 bit Java, the -Xmx value can be much larger. So memory is much less
  likely to be a constraint.

  <p>You can override the default chunk size by defining these tags in datasets.xml
  (with different values than shown here):
  <br>For grids:  <kbd>&lt;partialRequestMaxBytes&gt;100000000&lt;/partialRequestMaxBytes&gt;</kbd>
  <br>For tables: <kbd>&lt;partialRequestMaxCells&gt;1000000&lt;/partialRequestMaxCells&gt;</kbd>

  <p>partialRequestMaxBytes is the preferred maximum number of bytes for a partial
  grid data request (a chunk of the total request). default=100000000 (10^8).
  Larger sizes aren't necessarily better (and don't go over 500 MB because that is
  THREDDS's default limit for DAP responses). But larger sizes may require fewer accesses of
  tons of files (think of ERD's satellite data with each time point
  in a separate file - it's better to get more data from each file
  in each partial request).

  <p>partialRequestMaxCells is the preferred maximum number of cells
  (nRows * nColumns in the data table) for a partial TABLE data request
  (a chunk of the total request).  Default = 100000.
  Larger sizes aren't necessarily better. They result in a longer wait for the
  initial batch of data from the source.

  <p>Any changes to these tags' values will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  Before ERDDAP™ v2.00, these were specified in setup.xml, which is still allowed
  but discouraged.
  <br>&nbsp;
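  <p>The rough arithmetic above can be written out as follows. This Python sketch only restates
  the simplistic estimate from the text; the function names are hypothetical, and the 1600 MB
  -Xmx value is an assumed example chosen so that 3/4 of it is the 1200 MB used above.

```python
def max_simultaneous_large_requests(xmx_mb, chunk_mb=100):
    """Simplistic upper bound: 3/4 of the -Xmx memory divided by the chunk size."""
    usable_mb = xmx_mb * 3 // 4
    return usable_mb // chunk_mb


def rows_per_table_chunk(max_cells, n_columns):
    """Rows that fit in one partial table request (cells = nRows * nColumns)."""
    return max_cells // n_columns


print(max_simultaneous_large_requests(1600))  # 1200 MB usable / 100 MB chunks
print(rows_per_table_chunk(100000, 8))        # rows per chunk for an 8-column table
```

  <p>Other things require memory too, so treat the first number as a ceiling, not a target.
  <br>&nbsp;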


<li><a class="selfLink" id="requestBlacklist" href="#requestBlacklist" rel="bookmark"><kbd><strong>&lt;requestBlacklist&gt;</strong></kbd></a>
    <a class="selfLink" id="frequentCrashes" href="#frequentCrashes" rel="bookmark">is an OPTIONAL tag</a>
    within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml which contains a
    comma-separated list of numeric IP addresses which will be blacklisted.
    Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <ul>
  <li>This can be used to fend off a 
    <a rel="help" href="https://en.wikipedia.org/wiki/Denial_of_service">Denial of Service attack<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>, 
    an overly zealous 
    <a rel="help" href="https://en.wikipedia.org/wiki/Internet_bot">web robot<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
    or any other type of troublesome user.
  <li><a class="selfLink" id="troublesomeUser" href="#troublesomeUser" rel="bookmark">Troublesome User</a> -- 
    If ERDDAP™ slows to a crawl or freezes/stops, the cause is often a troublesome user
    who is running more than one script at once and/or making a large number of 
    simultaneous requests or requests that are very large, extremely inefficient, or invalid. Look in 
    <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt</a> 
    to see if this is the case and to find the numeric IP address of the troublesome user.
    If this is the problem, you should probably blacklist that user. 

    <p>When ERDDAP™ gets a request from a blacklisted IP address, it will return 
    <kbd>HTTP Error 403: Forbidden</kbd>.
    The accompanying text error message encourages the user to email you, the ERDDAP
    administrator, to work out the problems.
    If they take the time to read the error message (many apparently don't) and contact you,
    you can then work with them to get them to run just one script at a time, 
    make more efficient requests, fix the problems in their script
    (for example, requesting data from a remote dataset that can't respond before timing out),
    or whatever else was the source of trouble.
    
    <p>Users are often simply unaware that their requests are troublesome.
    They are often unaware of bugs, gross inefficiencies, or other problems with their scripts.
    They often think that because your ERDDAP™ offers data for free, that 
      they can ask for as much data as they want, e.g., by running multiple 
      scripts or by using multiple threads simultaneously.  
    <ul>
    <li>You can explain to them that each ERDDAP™, no matter how large and powerful,
      has finite resources (CPU time, hard drive I/O, network bandwidth, etc.) 
      and it isn't fair if one user requests data in a 
      way that crowds out other users or overburdens ERDDAP.  
    <li>Once a user knows how to make 2 simultaneous requests, they often see no
      reason not to make 5, 10 or 20 simultaneous requests, since the additional 
      requests cost them nothing.  It's like asymmetric warfare: here, the offensive
      weapons have a tremendous advantage (zero cost) over the defensive weapons
      (a finite installation with real costs).
    <li>Point out to them that there are diminishing returns to making more and more
      simultaneous requests; the additional requests just further block out 
      other users' requests; they don't yield a huge improvement for them.
    <li>Remind them that there are other users (both casual users and other users running scripts), 
      so it isn't fair of them to hog all of ERDDAP's resources.
    <li>Point out that the tech giants have induced users to expect infinite resources from web services.
      While there are ways to set up
      <a rel="help" href="https://erddap.github.io/setup.html#grids"
      >grids/clusters/federations of ERDDAPs</a>
      to make an ERDDAP™ system with more resources,
      most ERDDAP™ administrators don't have the money or the manpower to set up
      such systems, and such a system will still be finite.
      At ERD for example, there's one person (me) writing ERDDAP™, administering two ERDDAPs
      (with help from my boss),
      and managing several data sources, all with an annual hardware budget of $0
      (we rely on occasional grants to pay for hardware). This isn't
      Google, Facebook, Amazon, etc., with hundreds of engineers and millions
      of dollars of revenue to recycle into ever larger systems. 
      And we can't just move our ERDDAP™ to, for example, Amazon AWS, 
      because the data storage costs are large and the data egress charges
      are large and variable, while our budget for external services is a fixed $0.
    <li>My request to users is: for non-time-sensitive requests 
      (which is by far the most common case), 
      their system should just make one request at a time. 
      If the requests are time sensitive 
      (e.g., multiple .pngs on a web page, multiple tiles for a WMS client, etc.), 
      then perhaps 4 simultaneous requests should be the max
      (and just for a very short time).
    <li>If you explain the situation to the user, most users will understand and 
      be willing to make the necessary changes so that you
      can remove their IP address from the blacklist.
      <br>&nbsp;
    </ul>

  <li>To blacklist a user, add their numeric IP address to the comma-separated 
      list of IP addresses in 
      <kbd>&lt;requestBlacklist&gt;</kbd> in your datasets.xml file.
      To find the troublesome user's IP address, look in the ERDDAP™ 
      <i>bigParentDirectory</i>/logs/log.txt file (<i>bigParentDirectory</i> is specified in 
        <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a>).
      The IP address for every request is listed on the lines starting with 
      "{{{{#" and is 4 numbers separated by periods, for example, 123.45.67.8 .
      Searching for "ERROR" will help you find problems such as invalid requests.
  <li>You can also replace the last number in an IP address with * (for example, 202.109.200.*) to block a range of IP addresses, 0-255.
  <li>You can also replace the last 2 numbers in an IP address with *.* (for example, 121.204.*.*) to block a wider range of IP addresses, 0-255.0-255.
  <li>For example, 
    <br><kbd>&lt;requestBlacklist&gt;98.76.54.321, 202.109.200.*, 121.204.*.*&lt;/requestBlacklist&gt;</kbd>
  <li>You don't need to restart ERDDAP™ for the changes to <kbd>&lt;requestBlacklist&gt;</kbd> to take effect.
    The changes will be detected the next time ERDDAP™ checks if any datasets need to be reloaded.
    Or, you can speed up the process by visiting a 
    <a rel="help" href="https://erddap.github.io/setup.html#setDatasetFlag">setDatasetFlag URL</a> 
    for any dataset.
  <li>Your ERDDAP™ daily report includes a list/tally of the most active allowed and blocked requesters. 
  <li>If you want to figure out what domain/institution is related to a numeric IP address, 
     you can use a free, reverse DNS web service like 
    <a rel="help" href="https://network-tools.com/">https://network-tools.com/<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  <li>There may be times when it makes sense to block certain users at a higher level,
    for example, malicious users.
    For example, you can block their access to everything on your server, not just ERDDAP.
    On Linux, one such method is to use <a rel="help"
    href="https://www.linode.com/docs/guides/control-network-traffic-with-iptables/"
    >iptables<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    For example, you can add a rule that will block everything coming from 198.51.100.0
    with the command
    <br><kbd>iptables -I INPUT -s 198.51.100.0 -j DROP</kbd>
    <br>&nbsp;
  </ul>
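    <p>The wildcard forms described above can be illustrated with a short Python sketch.
    This is not ERDDAP's implementation: the function name <kbd>is_blocked</kbd> is hypothetical,
    and the sketch assumes that entries are compared number by number, with * matching
    any number in that position. (The sketch accepts * in any position, which is more
    permissive than the two forms documented above.)

```python
def is_blocked(ip: str, blacklist_csv: str) -> bool:
    """Return True if ip matches any entry of a requestBlacklist-style list.

    Hypothetical helper for illustration only; not part of ERDDAP.
    """
    ip_parts = ip.strip().split(".")
    for entry in blacklist_csv.split(","):
        entry_parts = entry.strip().split(".")
        if len(entry_parts) != len(ip_parts):
            continue  # skip empty or malformed entries
        # A "*" in the entry matches any number in that position.
        if all(e == "*" or e == p for e, p in zip(entry_parts, ip_parts)):
            return True
    return False


# The example list from the text above.
blacklist = "98.76.54.321, 202.109.200.*, 121.204.*.*"

print(is_blocked("202.109.200.17", blacklist))  # True: matches 202.109.200.*
print(is_blocked("121.204.3.250", blacklist))   # True: matches 121.204.*.*
print(is_blocked("202.109.201.17", blacklist))  # False: no entry matches
```

    <p>A check like this can be handy for confirming that a new entry covers the
    addresses you saw in log.txt before you edit datasets.xml.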

<li><a class="selfLink" id="slowDownTroubleMillis" href="#slowDownTroubleMillis" rel="bookmark"
    ><kbd><strong>&lt;slowDownTroubleMillis&gt;</strong></kbd></a>
    is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml which contains 
    an integer specifying the number of milliseconds (default=1000) to pause when 
    responding to all failed requests, e.g., unknown dataset, request too large, user on the blacklist. E.g., 
    <br><kbd>&lt;slowDownTroubleMillis&gt;2000&lt;/slowDownTroubleMillis&gt;</kbd>

    <p>If a script is making one request immediately after another, 
    then it might rapidly make one bad request after another.
    With this setting, you can slow down a failing script so ERDDAP™ isn't flooded with
    bad requests.
    If a human makes a bad request, they won't even notice this delay.
    Recommendations:
    <ul>
    <li>If the trouble is a Distributed Denial Of Service (DDOS) attack from 100+
      attackers, set this to a smaller number (100?).
      Slowing them all down for too long leads to too many active threads.
    <li>If the trouble is from 1-10 sources, set this to 1000 ms (the default),
      but a larger number (like 10000) is also reasonable.
      That slows them down so they waste fewer network resources.
      Also, 1000 ms or so won't annoy human users who make a bad request.
    </ul>
    Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <br>&nbsp;

<li><a class="selfLink" id="subscriptionEmailBlacklist" href="#subscriptionEmailBlacklist" rel="bookmark"><kbd><strong>&lt;subscriptionEmailBlacklist&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml which contains
  a comma-separated list of email addresses which are immediately blacklisted from the 
  <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/subscriptions">subscription system</a>, for example
  <br><kbd>&lt;subscriptionEmailBlacklist&gt;bob@badguy.com, john@badguy.com&lt;/subscriptionEmailBlacklist&gt;</kbd> 
  <br>This is a case-insensitive system.
  If an email address with existing subscriptions is added to this list,
  those subscriptions will be cancelled.
  If an email address on the list tries to subscribe, the request will be refused.
  Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  <br>&nbsp;


<li><a class="selfLink" id="standardText" href="#standardText" rel="bookmark"
  ><strong>Standard Text</strong></a> --
  There are several OPTIONAL tags (most are rarely used) 
  within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml
  to specify text that appears in various places in ERDDAP.
  If you want to change the default text, copy the existing value from
  the tag of the same name in 
  <br><i>tomcat</i>/webapps/erddap/WEB-INF/classes/gov/noaa/pfel/erddap/util/messages.xml
  into datasets.xml, then modify the content.
  The advantage of having these in datasets.xml is that you can specify new
  values at any time, even when ERDDAP™ is running.
  Any changes to these tags' values will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  The tag names describe their purpose, but see the default
  content in messages.xml for a deeper understanding.
    <ul>
    <li>&lt;standardLicense&gt;
    <li>&lt;standardContact&gt;
    <li>&lt;standardDataLicenses&gt;
    <li>&lt;standardDisclaimerOfEndorsement&gt;
    <li>&lt;standardDisclaimerOfExternalLinks&gt;
    <li>&lt;standardGeneralDisclaimer&gt;
    <li>&lt;standardPrivacyPolicy&gt;

    <li>&lt;startHeadHtml5&gt; 
    <li>&lt;startBodyHtml5&gt; is a good tag to change in order to customize
      the appearance of the top of every web page in your ERDDAP. 
      Notably, you can use this to easily add a temporary message on the ERDDAP™ 
      home page (e.g., "Check out the new JPL MUR SST v4.1 dataset ..." or
      "This ERDDAP™ will be offline for maintenance 2019-05-08T17:00:00 PDT
      through 2019-05-08T20:00:00 PDT.").

      One quirk of putting this tag in datasets.xml is: when you restart ERDDAP,
      the very first request to ERDDAP™ will return the default startBodyHtml5 HTML,
      but every subsequent request will use the startBodyHtml5 HTML specified
      in datasets.xml.
    <li>&lt;theShortDescriptionHtml&gt; is a good tag to change in order to customize
      the description of your ERDDAP. Note that you
      can easily change this to add a temporary message on the home page 
      (e.g., "This ERDDAP™ will be offline for maintenance 2019-05-08T17:00:00 PDT
      through 2019-05-08T20:00:00 PDT.").
    <li>&lt;endBodyHtml5&gt;
    </ul>
      <br>Before ERDDAP™ v2.00, these were specified in setup.xml, which is still allowed
       but discouraged.
      <br>&nbsp;

<li><a class="selfLink" id="unusualActivity" href="#unusualActivity" rel="bookmark"
  ><kbd><strong>&lt;unusualActivity&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  the maximum number of requests between two runs of LoadDatasets that is considered 
  normal (default=10000). If that number is exceeded, an email is sent to 
  emailEverythingTo (as specified in setup.xml). E.g.,
  <br><kbd>&lt;unusualActivity&gt;10000&lt;/unusualActivity&gt;</kbd>
  <br>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
  Before ERDDAP™ v2.00, this was specified in setup.xml, which is still allowed
  but discouraged.
  <br>&nbsp;

<li><a class="selfLink" id="updateMaxEvents" href="#updateMaxEvents" rel="bookmark"
  ><kbd><strong>&lt;updateMaxEvents&gt;</strong></kbd></a>
  is a rarely used OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml to specify
  the maximum number of file change events (default=10) that will be handled by the 
  <a rel="help" href="#updateEveryNMillis"><kbd>&lt;updateEveryNMillis&gt;</kbd></a> system
  before switching to reloading the dataset instead. For example,
  <br><kbd>&lt;updateMaxEvents&gt;10&lt;/updateMaxEvents&gt;</kbd>
  <br>The updateEveryNMillis system is intended to run very quickly right before a user's
  request is processed. If there are a lot of file change events, then presumably
  it can't run quickly, so it instead calls for the dataset to be reloaded. 
  If your ERDDAP™ deals with datasets that must be kept up-to-date even when there are
  changes to a large number of data files, you can set this to a larger number (100?).
  <br>&nbsp;

<li><a class="selfLink" id="user" href="#user" rel="bookmark"><kbd><strong>&lt;user&gt;</strong></kbd></a>
  is an OPTIONAL tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml that identifies a user's
  username, password (if authentication=custom), and roles (a comma-separated list).
  The use of username and password varies slightly based on the value of 
  <a rel="help" href="https://erddap.github.io/setup.html#authentication"
  ><kbd>&lt;authentication&gt;</kbd></a> in your ERDDAP's setup.xml file.
  <ul>
  <li>This is part of ERDDAP's 
    <a rel="help" href="https://erddap.github.io/setup.html#security">security system</a>
    for restricting access to some datasets to some users.
  <li>Make a separate <kbd>&lt;user&gt;</kbd> tag for each user.
    Optionally, if authentication=oauth2, you can set up two <kbd>&lt;user&gt;</kbd> tags for each user:
    one for when the user logs in via Google, one for when the user logs in via Orcid,
    presumably with the same <kbd>roles</kbd>.
  <li>If there is no <kbd>&lt;user&gt;</kbd> tag for a client, s/he will only be 
    able to access public datasets,
    i.e., datasets which don't have an <a rel="help" href="#accessibleTo"><kbd>&lt;accessibleTo&gt;</kbd></a>
    tag.
  <li><kbd>username</kbd>
    <br>For authentication=custom, the username is usually a combination of letters, 
      digits, underscores, and periods.
    <br>For authentication=email, the username is the user's email address.
      It may be any email address.
    <br>For authentication=google, 
      the username is the user's full Google email address.
      This includes Google-managed accounts like @noaa.gov accounts.    
    <br>For authentication=orcid, 
      the username is the user's Orcid account number (with dashes).    
    <br>For authentication=oauth2, 
      the username is the user's full Google email address or 
      the user's Orcid account number (with dashes).   
  <li><kbd>password</kbd>
    <br>For authentication=email, google, orcid, or oauth2, don't specify a password attribute.
    <br>For authentication=custom, you must specify a password attribute for each user.
    <ul>
    <li>The passwords that users enter are case sensitive 
      and must have 8 or more characters so they are harder to crack.
      Nowadays, even 8 characters can be cracked quickly and inexpensively
      by brute force using a cluster of computers on AWS.
      ERDDAP™ only enforces the 8-character minimum when the user tries
      to log in (not when the &lt;user&gt; tag is being processed, because
      that code only sees the hash digest of the password, not the plaintext password).
    <li>setup.xml's <kbd>&lt;passwordEncoding&gt;</kbd> determines how passwords are stored in the
      <kbd>&lt;user&gt;</kbd> tags in datasets.xml.  In order of increasing security, the options are: 
      <ul>
      <li><a rel="help" href="https://en.wikipedia.org/wiki/MD5">MD5<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
      (Don't use this!) -- for the password attribute, specify the MD5 hash digest
        of the user's password.
      <li>UEPMD5 (Don't use this!) -- for the password attribute, specify the 
        MD5 hash digest of <i>username</i>:ERDDAP:<i>password</i> .
        The username and "ERDDAP" are used to 
          <a rel="help" href="https://en.wikipedia.org/wiki/Salt_(cryptography)">salt<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        the hash value, making it more difficult to decode. 
      <li><a rel="help" href="https://en.wikipedia.org/wiki/SHA-2">SHA256<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        (not recommended) -- for the password attribute, specify the SHA-256 hash digest
        of the user's password.
      <li>UEPSHA256 (default, recommended passwordEncoding. 
        But much better: use the google, orcid, or oauth2 authentication options.)
        -- for the password attribute, 
        specify the SHA-256 hash digest of <i>username</i>:ERDDAP:<i>password</i> .
        The username and "ERDDAP" are used to salt
        the hash value, making it more difficult to decode. 
      </ul>
    <li>On Windows, you can generate MD5 password digest values by downloading an MD5 program 
      (such as <a rel="help" href="https://www.fourmilab.ch/md5/">MD5<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>)
        and using (for example): 
      <br> <kbd>md5 -djsmith:ERDDAP:<i>actualPassword</i></kbd>
    <li>On Linux/Unix, you can generate MD5 digest values by using the built-in md5sum program (for example):
      <br>  <kbd>echo -n "jsmith:ERDDAP:<i>actualPassword</i>" | md5sum</kbd>
    <li>Stored plaintext passwords are case sensitive. 
       The stored forms of MD5 and UEPMD5 passwords are not case sensitive.
    <li>For example (using UEPMD5), if username="jsmith" and password="myPassword", the
      <kbd>&lt;user&gt;</kbd> tag is:
      <br><kbd>&lt;user username="jsmith" 
      <br>password="57AB7ACCEB545E0BEB46C4C75CEC3C30" 
      <br>roles="JASmith, JASmithGroup" /&gt;</kbd>
      <br>where the stored password was generated with
      <br><kbd>md5 -djsmith:ERDDAP:myPassword</kbd>
    <li><kbd>roles</kbd> is a comma-separated list of roles for which the user is authorized.
      Any <kbd>&lt;dataset&gt;</kbd> may have an 
      <a rel="help" href="#accessibleTo"><kbd>&lt;accessibleTo&gt;</kbd></a> tag which lists
      the roles which are allowed to access that dataset.
      For a given user and a given dataset, if one of the roles in the user's list of roles 
      matches one of the roles in the dataset's list of <kbd>&lt;accessibleTo&gt;</kbd> roles,
      then the user is authorized to access that dataset.
      <p>Every user who logs in is automatically given the role <kbd>[anyoneLoggedIn]</kbd>,
      whether there is a <kbd>&lt;user&gt;</kbd> tag for them in datasets.xml or not.
      So if a given dataset has
      <br><kbd>&lt;accessibleTo&gt;[anyoneLoggedIn]&lt;/accessibleTo&gt;</kbd>
      <br>then any user that is logged in will be authorized to access that dataset,
      even if there is no <kbd>&lt;user&gt;</kbd> tag for them in datasets.xml.
    </ul>
  <li>Any changes to this tag's value will take effect the next time ERDDAP™ reads datasets.xml,
  including in response to a dataset
  <a rel="help" 
    href="https://erddap.github.io/setup.html#flag">flag</a>. 
    <br>&nbsp;
  </ul> 
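  <p>The command-line examples above generate MD5 digests. For the recommended UEPSHA256
  encoding, the same salted string can be hashed with SHA-256. The following Python sketch
  shows one way to do that. The function name <kbd>uep_sha256</kbd> is hypothetical,
  and the sketch assumes that the salted string is encoded as UTF-8 and that the digest
  is stored as hexadecimal text, as in the MD5 example above.

```python
import hashlib


def uep_sha256(username: str, password: str) -> str:
    """Return the SHA-256 hex digest of username:ERDDAP:password.

    Hypothetical helper for illustration; UTF-8 encoding is an assumption.
    """
    salted = f"{username}:ERDDAP:{password}"
    return hashlib.sha256(salted.encode("utf-8")).hexdigest()


# Same username and password as the UEPMD5 example above.
digest = uep_sha256("jsmith", "myPassword")

# Print a <user> tag that could be pasted into datasets.xml.
print(f'<user username="jsmith" password="{digest}" roles="JASmith, JASmithGroup" />')
```

  <p>On Linux/Unix, the built-in sha256sum program can be used in the same way that
  md5sum is used above.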

<li><a class="selfLink" id="pathRegex" href="#pathRegex" rel="bookmark"><kbd><strong>&lt;pathRegex&gt;</strong></kbd></a>
  lets you specify a regular expression which limits which paths 
  (which subdirectories) will be included in the dataset. The default is .*, 
  which matches all paths.
  This is a rarely used, rarely needed, OPTIONAL tag for 
  EDDGridFromFiles datasets, EDDTableFromFiles datasets, and a few other dataset types.
  However, when you need it, you really need it.

  <p>To make this work, you need to be really good with regular expressions. 
   See this <a rel="help"
          href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html"
          >regex documentation<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a> 
            and 
          <a rel="help" href="https://www.vogella.com/tutorials/JavaRegularExpressions/article.html"
          >regex tutorial<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.
  In particular, you need to know about capture groups (something inside parentheses),
  and the "or" symbol "|".
  <br>Together, these let you specify any number of options, e.g., <kbd>(option1|option2|option3)</kbd> .
  <br>Also, any of the options can be nothing, e.g., <kbd>(|option2|option3)</kbd> .
  <br>Also, you need to know that capture groups can be nested, i.e., any option in a capture
  group can contain another capture group, e.g., <kbd>(|option2(|option2b|option2c)|option3)</kbd> 
  which says that option2 can be followed by nothing, or option2b, or option2c.
  <br>For pathRegexes, each option will be one folder name followed by a /, e.g., bar/ .

  <p>The tricky part of the pathRegex is: 
  When ERDDAP™ recursively descends the directory tree, 
  the pathRegex must accept all the paths it 
  encounters on its way to the directories with data. 
  Regex's with nested capture groups are a good way to deal with this. 
  
  <p>An Example:
  <br>Suppose we have the following directory structure:
  <br>/foo/bar/D0001/a/*.nc
  <br>/foo/bar/D0001/b/*.nc
  <br>/foo/bar/D0002/a/*.nc
  <br>/foo/bar/D0002/b/*.nc
  <br>...
  <br>/foo/bar/E0001/a/*.nc
  <br>...
  <br>and the specified fileDirectory is /foo/bar/, 
    and we just want the .nc files in the D[0-9]{4}/a/ subdirectories.
  <br>The solution is to set pathRegex to <kbd>/foo/bar/(|D[0-9]{4}/(|a/))</kbd>
  <br>That says:
  <br>The path must start with /foo/bar/
  <br>&nbsp;&nbsp;That may be followed by nothing or D[0-9]{4}/
  <br>&nbsp;&nbsp;&nbsp;&nbsp;That may be followed by nothing or a/
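
  <p>Before putting a pathRegex into datasets.xml, it can help to test it against
  the directories that ERDDAP™ would encounter while descending the tree.
  The following Python sketch does that for the example above.
  It assumes that the regex must match the entire path (hence <kbd>fullmatch</kbd>);
  Python's regex syntax is close to Java's for simple patterns like this one,
  but the two are not identical, so treat the result as a sanity check only.

```python
import re

# The pathRegex from the example above.
path_regex = re.compile(r"/foo/bar/(|D[0-9]{4}/(|a/))")

# Directories that would be encountered while descending from /foo/bar/ .
candidates = [
    "/foo/bar/",
    "/foo/bar/D0001/",
    "/foo/bar/D0001/a/",
    "/foo/bar/D0001/b/",
    "/foo/bar/E0001/",
    "/foo/bar/E0001/a/",
]

# Keep only the paths that the regex matches in full.
accepted = [p for p in candidates if path_regex.fullmatch(p)]

for path in candidates:
    print(path, "accepted" if path in accepted else "rejected")
```

  <p>Only /foo/bar/, /foo/bar/D0001/, and /foo/bar/D0001/a/ are accepted:
  the first two are the intermediate directories on the way to the data,
  and the third is the directory with the desired .nc files.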

  <p>Yes, pathRegex's can be incredibly difficult to formulate.
  If you get stuck, ask a computer programmer (the closest thing in the real 
  world to a wizard spouting incantations?) or send an email to 
  Chris.John at noaa.gov.

<li><a class="selfLink" id="dataset" href="#dataset" rel="bookmark"><kbd><strong>&lt;dataset&gt;</strong></kbd></a>
  is an OPTIONAL (but always used) tag within an <kbd>&lt;erddapDatasets&gt;</kbd> tag in datasets.xml that 
  (if you include all of the information between 
  <kbd>&lt;dataset&gt;</kbd> and <kbd>&lt;/dataset&gt;</kbd>) 
  completely describes one dataset. 
  For example, 
  <br><kbd>&lt;dataset type="EDDGridFromDap" datasetID="erdPHssta8day" active="true"&gt; ... &lt;/dataset&gt;</kbd>
  <br>There MAY be any number of dataset tags in your datasets.xml file.
  <br>Three attributes MAY appear within a <kbd>&lt;dataset&gt;</kbd> tag:
  <br>&nbsp;

<ul>
<li><kbd><strong>type="<i>aType</i>"</strong></kbd> is a REQUIRED attribute within a 
  <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml which identifies the dataset type (for example,
  whether it is an EDDGrid/gridded or EDDTable/tabular dataset) and the source of the data 
  (for example, a database, files, or a remote OPeNDAP server).
  See the <a rel="help" href="#datasetTypes"><strong>List of Dataset Types</strong></a>.
  <br>&nbsp;
<li><a class="selfLink" id="datasetID" href="#datasetID" rel="bookmark"><kbd><strong>datasetID="<i>aDatasetID</i>"</strong></kbd></a> 
is a REQUIRED attribute within a 
  <kbd>&lt;dataset&gt;</kbd> tag which assigns a short
  (usually &lt;15 characters), unique, identifying name to a dataset.
    <ul>
    <li>Each datasetID MUST be a letter (A-Z, a-z) followed by any number of 
      A-Z, a-z, 0-9, and _ characters (but best if &lt;32 characters total).  
      <!-- Technically, valid characters are A-Z, a-z, 0-9, _, and -. -->
    <li>DatasetIDs are case sensitive, but DON'T create two datasetIDs that
      only differ in upper/lowercase letters.  It will cause problems
      on Windows computers (yours and/or a user's computer).
    <li>Best practices: We recommend using <a rel="help" href="https://en.wikipedia.org/wiki/CamelCase">camelCase<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
    <li>Best practices: We recommend that the first part be an acronym or abbreviation of the
      source institution's name and the second part be an acronym or abbreviation of the 
      dataset's name.
      When possible, we create a name which reflects the source's name for the dataset.
      For example, we used <kbd>datasetID="erdPHssta8day"</kbd> for a dataset from the 
      NOAA NMFS SWFSC Environmental Research Division (ERD) which is designated by the source
      to be satellite/PH/ssta/8day.
    <li>If you change a dataset's name, the old dataset (with the old name) will still
      be live in ERDDAP. This is an "orphan" dataset, because the specification for it
      in datasets.xml is now gone. This must be dealt with:
      <ol>
      <li>For ERDDAP™ v2.19 and later, you don't need to do anything. 
         ERDDAP™ will automatically remove these orphan datasets.
      <li>For ERDDAP™ v2.18 and earlier, you need to do something 
            to remove the orphan datasets: 
            Make an active="false" dataset, e.g.,
            <br><kbd>&lt;dataset type="EDDTableFromNcFiles" datasetID="<i>theOldName</i>" active="false" /&gt;</kbd>
            <br>After the next major loadDatasets, once the old dataset is inactive, you can remove that tag.
            <br>&nbsp;
      </ol>
    </ul>
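    <p>The naming rules above can be checked with a short script before you edit datasets.xml.
    The following Python sketch is one way to do that. The function name
    <kbd>check_dataset_ids</kbd> is hypothetical, and the regular expression is simply
    a transcription of the rule stated above, not code taken from ERDDAP.

```python
import re

# A letter followed by any number of letters, digits, and underscores.
DATASET_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_dataset_ids(ids):
    """Return a list of problems found in a list of datasetIDs.

    Hypothetical helper for illustration only; not part of ERDDAP.
    """
    problems = []
    seen_lower = {}  # lowercase form -> first datasetID seen with that form
    for dataset_id in ids:
        if not DATASET_ID.fullmatch(dataset_id):
            problems.append(f"{dataset_id}: invalid characters")
        key = dataset_id.lower()
        if key in seen_lower and seen_lower[key] != dataset_id:
            problems.append(
                f"{dataset_id}: differs from {seen_lower[key]} only by case"
            )
        seen_lower.setdefault(key, dataset_id)
    return problems


for problem in check_dataset_ids(["erdPHssta8day", "8day", "erdphssta8day"]):
    print(problem)
```

    <p>The sketch reports IDs that break the character rule and IDs that differ only
    in upper/lowercase letters. It does not check for exact duplicates.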

<li><a class="selfLink" id="active" href="#active" rel="bookmark"><kbd><strong>active="<i>boolean</i>"</strong></kbd></a> 
  is an OPTIONAL attribute within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml
  which indicates if a dataset is active (eligible for use in ERDDAP) or not.
  <ul>
  <li>Valid values are <kbd>true</kbd> (the default) and <kbd>false</kbd>.
  <li>Since the default is <kbd>true</kbd>, you don't need to use this attribute 
    until you want to temporarily or permanently remove this dataset from ERDDAP.
  <li>If you just remove an active="true" dataset from datasets.xml, the dataset
    will still be active in ERDDAP™ but will never be updated. 
    Such a dataset will be an "orphan" and will be listed as such on the status.html
    web page right below the list of datasets that failed to load.
  <li>If you set active="false", ERDDAP™ will deactivate the dataset the next time
    it tries to update the dataset. 
    When you do this, ERDDAP™ doesn't throw out any information it may have
    stored about the dataset and certainly doesn't do anything to the actual data.
  <li>In order to remove a dataset from ERDDAP™, see
      <a rel="help" href="https://erddap.github.io/setup.html#removingDatasets"
      >Force Dataset Removal</a>.
    <br>&nbsp;
  </ul>
</ul>


<p><strong>Several tags can appear between the <kbd>&lt;dataset&gt;</kbd> and 
<kbd>&lt;/dataset&gt;</kbd> tags.</strong>
<br>There is some variation in which tags are allowed by which types of datasets.
See the documentation for a specific 
<a rel="help" href="#datasetTypes">type&nbsp;of&nbsp;dataset</a>
for details.

<ul>
<li><a class="selfLink" id="accessibleTo" href="#accessibleTo" rel="bookmark"><kbd><strong>&lt;accessibleTo&gt;</strong></kbd></a>
    is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> 
    tag that specifies a comma-separated list of 
  <a rel="help" href="#user">roles</a> which are allowed to have access to this dataset.
  For example,
      <br><kbd>&lt;accessibleTo&gt;RASmith, NEJones&lt;/accessibleTo&gt;</kbd>
      <br>
  <ul>
  <li>This is part of ERDDAP's 
    <a rel="help" href="https://erddap.github.io/setup.html#security">security system</a>
    for restricting access to some datasets to some users.
  <li>If this tag is not present, all users (even if they haven't logged in) will have access to
    this dataset.
  <li>If this tag is present, this dataset will only be visible and accessible 
    to logged-in users who have one of the specified roles.
    This dataset won't be visible to users who aren't logged in.
  <li>Every user who logs in is automatically given the role <kbd>[anyoneLoggedIn]</kbd>,
    whether there is a <kbd>&lt;user&gt;</kbd> tag for them in datasets.xml or not.
    So if a given dataset has
    <br><kbd>&lt;accessibleTo&gt;[anyoneLoggedIn]&lt;/accessibleTo&gt;</kbd>
    <br>then any user that is logged in will be authorized to access that dataset,
    even if there is no <kbd>&lt;user&gt;</kbd> tag for them in datasets.xml.
    <br>&nbsp;
  </ul> 
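  <p>The access rules above can be summarized in a short Python sketch.
  This is an illustration of the documented behavior, not ERDDAP's code:
  the function names <kbd>parse_csv</kbd> and <kbd>may_access</kbd> are hypothetical,
  and the sketch assumes that role names are compared exactly after trimming spaces.

```python
ANYONE_LOGGED_IN = "[anyoneLoggedIn]"


def parse_csv(csv):
    """Split a comma-separated list into a set of trimmed, non-empty items."""
    return {item.strip() for item in csv.split(",") if item.strip()}


def may_access(accessible_to, user_roles, logged_in):
    """Decide whether a user may access a dataset.

    accessible_to: content of the dataset's <accessibleTo> tag, or None if absent.
    user_roles: the roles attribute of the user's <user> tag, or None if absent.
    Hypothetical helper for illustration only.
    """
    if accessible_to is None:
        return True  # no <accessibleTo> tag: the dataset is public
    if not logged_in:
        return False  # private datasets are hidden from users who aren't logged in
    # Every logged-in user automatically has the [anyoneLoggedIn] role.
    roles = parse_csv(user_roles or "") | {ANYONE_LOGGED_IN}
    return bool(roles & parse_csv(accessible_to))


print(may_access("RASmith, NEJones", "RASmith", True))   # True: shared role
print(may_access("RASmith, NEJones", "JASmith", True))   # False: no shared role
print(may_access("[anyoneLoggedIn]", None, True))        # True: logged in, no <user> tag
```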

<li><a class="selfLink" id="graphsAccessibleTo" href="#graphsAccessibleTo" rel="bookmark"><kbd><strong>&lt;graphsAccessibleTo&gt;</strong></kbd></a>
    is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml which
    determines whether graphics and metadata for the dataset are available to the public.
    It offers a way to partially override the dataset's
    <a rel="help" href="#accessibleTo"><kbd>&lt;accessibleTo&gt;</kbd></a> setting.
    The allowed values are:
    <ul>
    <li><kbd>auto</kbd> -- This value (or the absence of a 
      &lt;graphsAccessibleTo&gt; tag for the dataset) makes access
      to graphs and metadata from the dataset mimic the dataset's
      <kbd>&lt;accessibleTo&gt;</kbd> setting.
      <br>So if the dataset is private, its graphs and metadata will be private.
      <br>And if the dataset is public, its graphs and metadata will be public.
    <li><kbd>public</kbd> -- This setting makes the dataset's graphs and metadata
      accessible to anyone, even users who aren't logged in,
      even if the dataset is otherwise private because it has an 
      <kbd>&lt;accessibleTo&gt;</kbd> tag.
      <br>&nbsp;
    </ul>

<li><a class="selfLink" id="accessibleViaFiles" href="#accessibleViaFiles" rel="bookmark"><kbd><strong>&lt;accessibleViaFiles&gt;</strong></kbd></a>
    is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml for 
    <a rel="help" href="#EDDGridAggregateExistingDimension">EDDGridAggregateExistingDimension</a>,
    <a rel="help" href="#EDDGridCopy">EDDGridCopy</a>, 
    <a rel="help" href="#EDDGridFromEDDTable">EDDGridFromEDDTable</a>,
    <a rel="help" href="#EDDGridFromErddap">EDDGridFromErddap</a>,
    <a rel="help" href="#EDDGridFromEtopo">EDDGridFromEtopo</a>,
    <a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a> (including all subclasses),
    <a rel="help" href="#EDDGridSideBySide">EDDGridSideBySide</a>,
    <a rel="help" href="#EDDTableCopy">EDDTableCopy</a>,
    <a rel="help" href="#EDDTableFromErddap">EDDTableFromErddap</a>,
    <a rel="help" href="#EDDTableFromEDDGrid">EDDTableFromEDDGrid</a>, and
    <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a> (including all subclasses) datasets.

    It can have a value of <kbd>true</kbd> or <kbd>false</kbd>.
    For example,
    <br><kbd>&lt;accessibleViaFiles&gt;true&lt;/accessibleViaFiles&gt;</kbd>
    <br>If the value is <kbd>true</kbd>, ERDDAP™ will make it so that users can 
    browse and download the dataset's source data files via ERDDAP's 
<a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/files/">"files" system</a>. See the "files" system's 
<a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/files/documentation.html">documentation</a>
for more information. 

    <p>The default value of <kbd>&lt;accessibleViaFiles&gt;</kbd> 
    comes from <kbd>&lt;defaultAccessibleViaFiles&gt;</kbd>
    in <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a>.
    It has a default value of <kbd>false</kbd>, but we recommend that you 
    add that tag to your setup.xml with a value of <kbd>true</kbd>. 

    <p>Recommendation --
    We recommend making all relevant datasets accessible via the files system
    by setting <kbd>&lt;defaultAccessibleViaFiles&gt;</kbd> to <kbd>true</kbd> in setup.xml
    because there is a group of users for whom this is the preferred way to get the data.
    Among other reasons, the "files" system makes it easy for users to see which files
    are available and when they last changed, thus making it easy for a user
    to maintain their own copy of the entire dataset.
    If you generally don't want to make datasets accessible via the files system, set
    <kbd>&lt;defaultAccessibleViaFiles&gt;</kbd> to <kbd>false</kbd>.
    In either case, just use <kbd>&lt;accessibleViaFiles&gt;</kbd> for the 
    few datasets which are exceptions to the general policy set by 
    <kbd>&lt;defaultAccessibleViaFiles&gt;</kbd>
    (for example, when the dataset uses <a rel="help" href="#NcML">.ncml</a> files,
    which aren't really useful to users).

  <br>&nbsp;

<li><a class="selfLink" id="accessibleViaWMS" href="#accessibleViaWMS" rel="bookmark"><kbd><strong>&lt;accessibleViaWMS&gt;</strong></kbd></a>
    is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml for all
    <a rel="help" href="#EDDGrid">EDDGrid</a> subclasses.
    It can have a value of <kbd>true</kbd> (the default) or <kbd>false</kbd>.
    For example,
    <br><kbd>&lt;accessibleViaWMS&gt;true&lt;/accessibleViaWMS&gt;</kbd>
    <br>If the value is <kbd>false</kbd>, ERDDAP's WMS server won't be available for this dataset.
      This is commonly used for datasets that have some longitude values greater than 180
      (which technically is invalid for WMS services),
      and for which you are also offering a variant of the dataset with
      longitude values entirely in the range -180 to 180
      via <a rel="help" href="#EDDGridLonPM180">EDDGridLonPM180</a>.
    <br>If the value is <kbd>true</kbd>, ERDDAP™ will try to make the dataset
      available via ERDDAP's WMS server. But if the dataset is completely unsuitable
      for WMS (e.g., there is no longitude or latitude data), then the dataset
      won't be available via ERDDAP's WMS server, regardless of this setting. 
  <br>&nbsp;

<li><a class="selfLink" id="addVariablesWhere" href="#addVariablesWhere" rel="bookmark"><kbd><strong>&lt;addVariablesWhere&gt;</strong></kbd></a> 
is an OPTIONAL tag within the <kbd>&lt;dataset&gt;</kbd> tag for all EDDTable datasets.

<p>Requests to any EDDTable dataset can include 
<kbd>&amp;addVariablesWhere("<i>attributeName</i>","<i>attributeValue</i>")</kbd>,
which tells ERDDAP™ to add all of the variables in the dataset where 
<kbd><i>attributeName=attributeValue</i></kbd> to the list of requested variables.
For example, if a user adds
<kbd>&amp;addVariablesWhere("ioos_category","Wind")</kbd> to a query, ERDDAP
will add all variables in the dataset that have an <kbd>ioos_category=Wind</kbd> attribute
to the list of requested variables (for example, windSpeed, windDirection, windGustSpeed).
<i>attributeName</i> and <i>attributeValue</i> are case-sensitive.

<p>If the chunk of datasets.xml for a dataset has
<br><kbd>&lt;addVariablesWhere&gt;<i>attributeNamesCSV</i>&lt;/addVariablesWhere&gt;</kbd>
<br>for example,
<br><kbd>&lt;addVariablesWhere&gt;ioos_category,units&lt;/addVariablesWhere&gt;</kbd>
<br>the Data Access Form (.html web page) for the dataset will include a widget 
(for each attributeName in the comma-separated list) right below
the list of variables which lets users specify an attribute value.
If the user selects an attribute value for one or more of the attribute names,
they will be added to the request via <kbd>&amp;addVariablesWhere("<i>attributeName</i>","<i>attributeValue</i>")</kbd>.
Thus, this tag in datasets.xml lets you specify the list of attribute names which will
appear on the Data Access Form for that dataset and makes it easy for users
to add <kbd>&amp;addVariablesWhere</kbd> functions to the request.
The <i>attributeNamesCSV</i> list is case-sensitive.
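
<p>A script that builds request URLs itself needs to percent-encode the
<kbd>addVariablesWhere</kbd> function, since it contains double quotes.
The following Python sketch shows one way to do that.
The function name <kbd>add_variables_where</kbd>, the server name, the dataset ID, and the
variable names in the example URL are all hypothetical.

```python
from urllib.parse import quote


def add_variables_where(attribute_name, attribute_value):
    """Return a percent-encoded &addVariablesWhere(...) query fragment.

    Hypothetical helper for illustration only; not part of ERDDAP.
    """
    constraint = f'addVariablesWhere("{attribute_name}","{attribute_value}")'
    # Leave parentheses and commas as-is; encode the double quotes as %22.
    return "&" + quote(constraint, safe="(),")


fragment = add_variables_where("ioos_category", "Wind")
print(fragment)  # &addVariablesWhere(%22ioos_category%22,%22Wind%22)

# Hypothetical server, dataset ID, and variable names.
base = "https://example.com/erddap/tabledap/someDatasetID.csv?time,station"
print(base + fragment)
```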


<li><a class="selfLink" id="altitudeMetersPerSourceUnit" href="#altitudeMetersPerSourceUnit" rel="bookmark"><kbd><strong>&lt;altitudeMetersPerSourceUnit&gt;</strong></kbd></a>
  is an OPTIONAL tag within the <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml 
  for EDDTableFromSOS datasets (only!) that specifies a 
  number which is multiplied by the source altitude or depth values 
  to convert them into altitude values (in meters above sea level).   For example,
  <br><kbd>&lt;altitudeMetersPerSourceUnit&gt;-1&lt;/altitudeMetersPerSourceUnit&gt;</kbd>
  <br>This tag MUST be used if the dataset's vertical axis values aren't meters, positive=up.
  Otherwise, it is OPTIONAL, since the default value is 1.
  For example, 
  <ul>
  <li>If the source is already measured in meters above sea level, use 1
    (or don't use this tag, since 1 is the default value).
  <li>If the source is measured in meters below sea level, use -1.
    <br><kbd>&lt;altitudeMetersPerSourceUnit&gt;-1&lt;/altitudeMetersPerSourceUnit&gt;</kbd>
  <li>If the source is measured in km above sea level, use 1000.
    <br>&nbsp;
  </ul>
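  <p>The conversion is a single multiplication, as the following Python sketch shows.
  The function name <kbd>to_altitude_m</kbd> is hypothetical, and the feet example
  (0.3048 meters per foot) is an added illustration that does not come from the list above.

```python
def to_altitude_m(source_value, altitude_meters_per_source_unit=1.0):
    """Convert a source altitude or depth value to meters above sea level.

    Hypothetical helper for illustration only; the default of 1 mirrors the tag's default.
    """
    return source_value * altitude_meters_per_source_unit


# Source in meters below sea level (depth): use -1.
print(to_altitude_m(25.0, -1))

# Source in feet above sea level: use 0.3048, since 1 foot = 0.3048 m.
print(to_altitude_m(100.0, 0.3048))

# Source already in meters above sea level: the default of 1 applies.
print(to_altitude_m(10.0))
```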

<li><a class="selfLink" id="defaultDataQuery" href="#defaultDataQuery" rel="bookmark"><kbd><strong>&lt;defaultDataQuery&gt;</strong></kbd></a> is an OPTIONAL tag
  within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml that tells ERDDAP™ to use the specified
  query (the part of the URL after the "?") 
  if the .html fileType (the Data Access Form) is requested with no query.
  <ul>
  <li>You will probably rarely need to use this.
  <li>You need to XML-encode (not percent-encode) 
    the default queries since they are in an XML document.
    For example, &amp; becomes &amp;amp; , &lt; becomes &amp;lt; , &gt; becomes &amp;gt; .
  <li>Please check your work.  It is easy to make a mistake and not get what you want.
    ERDDAP™ will try to clean up your errors -- but don't rely on that, since *how* it is cleaned up may change.
  <li>For griddap datasets, a common use of this is to specify a different default 
    depth or altitude dimension value (for example, [0] instead of [last]).
    <br>In any case, you should always list all of the variables, 
      always use the same dimension values for all variables,
      and almost always use [0], [last], or [0:last] for the dimension values.
    <br>For example:        <br><kbd>&lt;defaultDataQuery&gt;u[last][0][0:last][0:last],<wbr>v[last][0][0:last][0:last]<wbr>&lt;/defaultDataQuery&gt;</kbd>
  <li>For tabledap datasets, if you don't specify any constraint, the request will return the entire dataset, 
       which may be impractically large, depending on the dataset.
     If you don't want to specify any constraints, rather than have an empty <kbd>&lt;defaultDataQuery&gt;</kbd>
     (which is the same as not specifying a defaultDataQuery),
     you need to explicitly list all of the variables you want to include in the defaultDataQuery.
  <li>For tabledap datasets, the most common use of this is to specify a different default 
    time range 
    (relative to <kbd>max(time)</kbd>, for example, <kbd>&amp;time&gt;=max(time)-1day</kbd>,
    or relative to <kbd>now</kbd>, for example, <kbd>&amp;time&gt;=now-1day</kbd> ).
    <br>Remember that requesting no data variables is the same as specifying all data variables, 
      so usually you can just specify the new time constraint.
    <br>For example:        
    <br><kbd>&lt;defaultDataQuery&gt;&amp;amp;time<wbr>&amp;gt;=max(time)-1day<wbr>&lt;/defaultDataQuery&gt;</kbd>
    <br>or 
    <br><kbd>&lt;defaultDataQuery&gt;&amp;amp;time<wbr>&amp;gt;=now-1day<wbr>&lt;/defaultDataQuery&gt;</kbd>
    <br>&nbsp;
  </ul>
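  <p>As a hedged sketch for a tabledap dataset (the variable names
  <kbd>stationID</kbd> and <kbd>temperature</kbd> are hypothetical), 
  a default query that lists the variables explicitly and adds a time constraint,
  which a user would type as 
  <kbd>stationID,time,temperature&amp;time&gt;=now-7days</kbd>,
  might be XML-encoded in datasets.xml as:
  <pre>
&lt;defaultDataQuery&gt;stationID,time,temperature&amp;amp;time&amp;gt;=now-7days&lt;/defaultDataQuery&gt;
</pre>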

<li><a class="selfLink" id="defaultGraphQuery" href="#defaultGraphQuery" rel="bookmark"><kbd><strong>&lt;defaultGraphQuery&gt;</strong></kbd></a> is an OPTIONAL tag
  within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml that tells ERDDAP™ to use the specified
  query (the part of the URL after the "?") 
  if the .graph fileType (the Make A Graph Form) is requested with no query.
  <ul>
  <li>You will probably rarely need to use this.
  <li>You need to XML-encode (not percent-encode) 
    the default queries since they are in an XML document.
    For example, &amp; becomes &amp;amp; , &lt; becomes &amp;lt; , &gt; becomes &amp;gt; .
  <li>Please check your work.  It is easy to make a mistake and not get what you want.
    ERDDAP™ will try to clean up your errors -- but don't rely on that, since *how* it is cleaned up may change.
  <li>For griddap datasets, the most common use of this is to specify a different default 
    depth or altitude dimension value (for example, [0] instead of [last]) and/or to specify that a specific
    variable be graphed.
    <br>In any case, you will almost always use [0], [last], or [0:last] for the dimension values.
    <br>For example:                
    <br><kbd>&lt;defaultGraphQuery&gt;temp[last][0][0:last][0:last]&amp;amp;.draw=surface<wbr>&amp;amp;.vars=longitude|latitude|temp<wbr>&lt;/defaultGraphQuery&gt;</kbd>
    <br>(but put it all on one line)
  <li>For tabledap datasets, if you don't specify any constraint, the request will graph the entire dataset, 
       which may take a long time, depending on the dataset.
  <li>For tabledap datasets, the most common use of this is to specify a different default 
    time range 
    (relative to <kbd>max(time)</kbd>, for example, <kbd>&amp;time&gt;=max(time)-1day</kbd>,
    or relative to <kbd>now</kbd>, for example, <kbd>&amp;time&gt;=now-1day</kbd> ).
    <br>Remember that requesting no data variables is the same as specifying all data variables, 
      so usually you can just specify the new time constraint.
    <br>For example:        
    <br><kbd>&lt;defaultGraphQuery&gt;&amp;amp;time<wbr>&amp;gt;=max(time)-1day<wbr>&lt;/defaultGraphQuery&gt;</kbd>
    <br>or 
    <br><kbd>&lt;defaultGraphQuery&gt;&amp;amp;time<wbr>&amp;gt;=now-1day<wbr>&lt;/defaultGraphQuery&gt;</kbd>
    <br>&nbsp;
  </ul>
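  <p>For comparison, here is a sketch of a tabledap default graph query 
  before and after XML encoding. The variable name <kbd>temperature</kbd> 
  and the choice of <kbd>.draw=lines</kbd> are assumptions for illustration; 
  substitute variables and graph settings that suit your dataset.
  <pre>
&lt;!-- As a user would type it in a URL:
     time,temperature&amp;time&gt;=now-7days&amp;.draw=lines --&gt;

&lt;!-- As it would be written in datasets.xml (on one line): --&gt;
&lt;defaultGraphQuery&gt;time,temperature&amp;amp;time&amp;gt;=now-7days&amp;amp;.draw=lines&lt;/defaultGraphQuery&gt;
</pre>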


<li><a class="selfLink" id="dimensionValuesInMemory" href="#dimensionValuesInMemory" rel="bookmark"
><kbd><strong>&lt;dimensionValuesInMemory&gt;</strong></kbd></a> 
  (<kbd>true</kbd> (the default) or <kbd>false</kbd>)
  is an OPTIONAL and rarely used tag within the <kbd>&lt;dataset&gt;</kbd> tag for any EDDGrid dataset 
  that tells ERDDAP™ where to keep 
  the source values of the dimensions (also known as the axisVariables): 
  <ul>
  <li><kbd>true</kbd> = in memory (which is faster but uses more memory) 
  <li><kbd>false</kbd> = on disk (which is slower but uses no memory)
  </ul>
  For example,
  <br><kbd>&lt;dimensionValuesInMemory&gt;false&lt;/dimensionValuesInMemory&gt;</kbd>
  <br>You should only use this with the non-default value of <kbd>false</kbd> if
  your ERDDAP™ has a lot of datasets with very large dimensions 
  (e.g., millions of values, as in EDDGridFromAudioFiles datasets) 
  and ERDDAP's In Use memory usage is always too high.
  See the 
  <kbd>Memory: currently using</kbd> line at [yourDomain]/erddap/status.html 
  to monitor ERDDAP™ memory usage. 
  <br>&nbsp;

<li><a class="selfLink" id="fileTableInMemory" href="#fileTableInMemory" rel="bookmark"><kbd><strong>&lt;fileTableInMemory&gt;</strong></kbd></a> 
  (<kbd>true</kbd> or <kbd>false</kbd> (the default))
  is an OPTIONAL tag within the <kbd>&lt;dataset&gt;</kbd> tag for any EDDGridFromFiles and EDDTableFromFiles
  dataset that tells ERDDAP™ where to keep the fileTable 
  (which has information about each source data file): 
  <ul>
  <li><kbd>true</kbd> = in memory (which is faster but uses more memory) 
  <li><kbd>false</kbd> = on disk (which is slower but uses no memory)
  </ul>
  For example,
  <br><kbd>&lt;fileTableInMemory&gt;true&lt;/fileTableInMemory&gt;</kbd>
  <br>If you set this to <kbd>true</kbd> for any dataset, keep an eye on the 
  <kbd>Memory: currently using</kbd> line at [yourDomain]/erddap/status.html 
  to ensure that ERDDAP™ still has plenty of free memory. 
  <br>&nbsp;

<li><a class="selfLink" id="fgdcFile" href="#fgdcFile" rel="bookmark"><kbd><strong>&lt;fgdcFile&gt;</strong></kbd></a> is an OPTIONAL tag
  within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml that tells ERDDAP™ to use a pre-made FGDC file
  instead of having ERDDAP™ try to generate the file.  Usage:
  <br><kbd>&lt;fgdcFile&gt;<i>fullFileName</i>&lt;/fgdcFile&gt;</kbd>
  <br><i>fullFileName</i> can refer to a local file 
    (somewhere on the server's file system)
    or the URL of a remote file.
  <br>If <kbd><i>fullFileName</i></kbd>="" or the file isn't found, 
    the dataset will have no FGDC metadata.
    So this is also useful if you want to suppress the FGDC metadata
    for a specific dataset.
  <br>Or, you can put <kbd>&lt;fgdcActive&gt;false&lt;/fgdcActive&gt;</kbd>
    in setup.xml to tell ERDDAP™ not to offer FGDC metadata for any dataset.
  <br>&nbsp;

<li><a class="selfLink" id="iso19115File" href="#iso19115File" rel="bookmark"
  ><kbd><strong>&lt;iso19115File&gt;</strong></kbd></a> is an OPTIONAL tag
  within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml that tells ERDDAP™ to use a pre-made ISO 19115 file
  instead of having ERDDAP™ try to generate the file.  Usage:
  <br><kbd>&lt;iso19115File&gt;<i>fullFileName</i>&lt;/iso19115File&gt;</kbd>
  <br><i>fullFileName</i> can refer to a local file 
    (somewhere on the server's file system)
    or the URL of a remote file.
  <br>If <kbd><i>fullFileName</i></kbd>="" or the file isn't found, 
    the dataset will have no ISO 19115 metadata.
    So this is also useful if you want to suppress the ISO 19115 metadata
    for a specific dataset.
  <br>Or, you can put <kbd>&lt;iso19115Active&gt;false&lt;/iso19115Active&gt;</kbd>
    in setup.xml to tell ERDDAP™ not to offer ISO 19115 metadata for any dataset.
  <br>&nbsp;
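  <p>The two ways of using <kbd>&lt;fgdcFile&gt;</kbd> and <kbd>&lt;iso19115File&gt;</kbd>
  described above can be sketched as follows. The file path and URL are hypothetical
  placeholders; a given dataset would use only one form of each tag.
  <pre>
&lt;!-- Use pre-made metadata files (hypothetical local path and remote URL). --&gt;
&lt;fgdcFile&gt;/data/metadata/hypotheticalDataset_fgdc.xml&lt;/fgdcFile&gt;
&lt;iso19115File&gt;https://www.example.com/metadata/hypotheticalDataset_iso19115.xml&lt;/iso19115File&gt;

&lt;!-- Or, leave fullFileName empty to suppress the metadata for this dataset. --&gt;
&lt;fgdcFile&gt;&lt;/fgdcFile&gt;
&lt;iso19115File&gt;&lt;/iso19115File&gt;
</pre>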

<li><a class="selfLink" id="matchAxisNDigits" href="#matchAxisNDigits" rel="bookmark"
  ><kbd><strong>&lt;matchAxisNDigits&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDGrid <kbd>&lt;dataset&gt;</kbd> tag 
  for EDDGrid datasets that are aggregations, e.g., aggregations of files.
  Each time the dataset is reloaded, ERDDAP™ checks that the axis values of 
  each component of the aggregation are the same.
  The precision of the testing is determined by the 
  <a rel="help" href="#matchAxisNDigits"><kbd>matchAxisNDigits</kbd></a>,
  which specifies the total number of digits which must match
  when testing double precision axis values, 0 - 18 (the default).
  When testing float axis values, the test is done with 
  <kbd>matchAxisNDigits</kbd>/2 digits.
  A value of 18 or above tells EDDGrid to do an exact test.
  A value of 0 tells EDDGrid not to do any testing, which is not recommended,
  except as described below.

  <p>Although EDDGrid allows the components of the aggregation to have
  slightly different axis values, only one set of axis values is shown
  to the user. The set is from the same component that provides the
  dataset's source metadata. For example, for EDDGridFromFiles datasets, that 
  is specified by the &lt;metadataFrom&gt; setting (default=<kbd>last</kbd>).

  <p>Use of <kbd>matchAxisNDigits</kbd>=0 is strongly discouraged in most cases, 
  because it turns off all checking. Even minimal checking is useful 
  because it ensures that the components are suitable for aggregating.
  We all assume that all the components are suitable, but it isn't always so.
  This is thus an important sanity test.
  Even values of <kbd>matchAxisNDigits</kbd>=1, 2, 3, or 4 are discouraged 
  because the different axis values often indicate that the components
  were created (binned?) a different way and are thus not suitable for
  aggregation.

  <p>There is one case where using <kbd>matchAxisNDigits</kbd>=0 is 
  useful and recommended: with aggregations of remote files, e.g., 
  data in S3 buckets. In this case, if the dataset uses
  <kbd>cacheFromUrl</kbd>, <kbd>cacheSizeGB</kbd>, 
  <kbd>matchAxisNDigits</kbd>=0, and the EDDGridFromFiles system for
  <a rel="help" href="#EDDGridFromFiles_AggregationViaFileNames"
  >Aggregation via File Names</a>,
  then EDDGrid doesn't have to read all of the remote files to do the aggregation.
  This allows datasets made from data in S3 buckets to load very quickly
  (as opposed to absurdly slowly if EDDGrid has to download and read all of the files).
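  <p>A minimal sketch of the remote-file case described above might look like this.
  The dataset type (EDDGridFromNcFiles is assumed here as one EDDGridFromFiles subclass), 
  the datasetID, the bucket URL, and the cache size are all hypothetical, 
  and the tags that set up Aggregation via File Names are omitted.
  <pre>
&lt;dataset type="EDDGridFromNcFiles" datasetID="hypotheticalS3Grid" active="true"&gt;
  &lt;!-- hypothetical bucket URL and cache size --&gt;
  &lt;cacheFromUrl&gt;https://hypothetical-bucket.s3.us-east-1.amazonaws.com/some/prefix/&lt;/cacheFromUrl&gt;
  &lt;cacheSizeGB&gt;10&lt;/cacheSizeGB&gt;
  &lt;!-- 0 = don't test axis values; only appropriate in this remote-file case --&gt;
  &lt;matchAxisNDigits&gt;0&lt;/matchAxisNDigits&gt;
  &lt;!-- the axisVariable that uses Aggregation via File Names
       and the other required tags are omitted --&gt;
&lt;/dataset&gt;
</pre>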


<li><a class="selfLink" id="nThreads" href="#nThreads" rel="bookmark"
><kbd><strong>&lt;nThreads&gt;</strong></kbd></a> -- 
<a class="selfLink" id="nGridThreads"  href="#nGridThreads"  rel="bookmark">Starting</a>
<a class="selfLink" id="nTableThreads" href="#nTableThreads" rel="bookmark">with</a> 
ERDDAP™ version 2.00, 
when any subclass of EDDTableFromFiles or EDDGrid reads data from its source, 
it can read one chunk of data (e.g., one source file) at a time (in one thread) (that's the default) 
or more than one chunk of data (e.g., 2+ source files) at a time (in 2 or more threads)
while processing each request. 
<br>&nbsp;

    <ul>
    <li>Rule of Thumb:
        <br>For most datasets on most systems, use nThreads=1, the default.
        If you have a powerful computer (lots of CPU cores, lots of memory), 
        then consider setting nThreads to 2, 3, 4, or higher 
        (but never more than the number of CPU cores in the computer)
        for datasets that might benefit:
        <ul>
        <li>Most EDDTableFromFiles datasets will benefit.
        <li>Datasets where something causes a lag before a chunk of data can actually 
          be processed will benefit, for example:
            <ul>
            <li>Datasets with 
                <a rel="help" href="#ExternallyCompressedFiles">externally-compressed (e.g., .gz)</a>     
                binary (e.g., .nc) files, because ERDDAP™ has to decompress the whole file
                before it can start to read the file.
            <li>Datasets that use 
                <a href="#cacheFromUrl" rel="bookmark">cacheSizeGB</a>,
                because ERDDAP™ often has to download the file before it can read it.
            <li>Datasets with data files stored on a high-bandwidth parallel file system,
                because it can deliver more data, faster, when requested. Examples of
                parallel file systems include 
                <a rel="help" href="https://en.wikipedia.org/wiki/Non-RAID_drive_architectures">JBOD<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>,

             <a rel="help" href="http://www.pnfs.com/">pNFS<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>, 

             <a rel="help" href="https://en.wikipedia.org/wiki/Gluster">GlusterFS<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>,

              Amazon S3, and Google Cloud Storage.
            <br>&nbsp;      
            </ul>
      </ul>
      Warning: When using nThreads&gt;1, keep an eye on ERDDAP's memory use, thread use, and overall responsiveness
      (see <a rel="help" href="https://erddap.github.io/setup.html#statusPage"
      >ERDDAP's status page</a>). See comments about these issues below.
      <br>&nbsp;      

    <li><a class="selfLink" id="nThreadsWhere" href="#nThreadsWhere" rel="bookmark">For a given dataset,</a>
      this <kbd>nThreads</kbd> setting can come from different places:
        <ul>
        <li>If the datasets.xml chunk for a dataset has an <kbd>&lt;nThreads&gt;</kbd> tag
          (within the <kbd>&lt;dataset&gt;</kbd> tag, not as a global attribute)
          with a value &gt;= 1,
          that value of nThreads is used. So, you can specify a different number for each 
          dataset.
        <li>Otherwise, if datasets.xml 
          has an <kbd>&lt;nTableThreads&gt;</kbd> tag (for EDDTableFromFiles datasets)
          or  an <kbd>&lt;nGridThreads&gt;</kbd> tag (for EDDGrid datasets)
          with a value &gt;= 1,
          outside of a <kbd>&lt;dataset&gt;</kbd> tag, that value of nThreads is used.
        <li>Otherwise, 1 thread is used, which is a safe choice since it uses the
          smallest amount of memory.
          <br>&nbsp;      
        </ul>
     For the <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/index.html"
      >original ERDDAP™ installation</a>, we use 
      <br><kbd>&lt;nTableThreads&gt;6&lt;/nTableThreads&gt;</kbd> (It's a powerful server.)
      Difficult requests now take 30% of the previous time.       
      <br>&nbsp;      
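      <p>The order of precedence above can be sketched as follows. 
      The datasetIDs, the dataset type, and the specific thread counts are hypothetical; 
      all other required tags are omitted.
      <pre>
&lt;erddapDatasets&gt;
  &lt;!-- Global defaults, outside of any &lt;dataset&gt; tag. --&gt;
  &lt;nGridThreads&gt;2&lt;/nGridThreads&gt;
  &lt;nTableThreads&gt;2&lt;/nTableThreads&gt;

  &lt;dataset type="EDDTableFromNcFiles" datasetID="hypotheticalBusyTable" active="true"&gt;
    &lt;!-- A per-dataset value takes precedence over nTableThreads. --&gt;
    &lt;nThreads&gt;4&lt;/nThreads&gt;
    &lt;!-- other required tags omitted --&gt;
  &lt;/dataset&gt;

  &lt;dataset type="EDDTableFromNcFiles" datasetID="hypotheticalOtherTable" active="true"&gt;
    &lt;!-- No &lt;nThreads&gt; tag here, so the global nTableThreads value (2) is used. --&gt;
    &lt;!-- other required tags omitted --&gt;
  &lt;/dataset&gt;
&lt;/erddapDatasets&gt;
</pre>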

    <li><a class="selfLink" id="nThreadsMonitor" href="#nThreadsMonitor" rel="bookmark">Monitor Resource Usage</a>
      <br>When you are experimenting with different nThreads settings (and perhaps
      making difficult sample requests to your ERDDAP),
      you can monitor your computer's resource usage:
      <ul>
      <li>On Macs, use <kbd>Finder : Applications : Utilities : Activity Monitor</kbd>
      <li>On Linux, use <kbd>top</kbd>
      <li>On Windows 10, use <kbd><i>Ctrl + Shift + Esc</i> to open Task Manager</kbd> 
      <br>&nbsp;      
      </ul>

    <li><a class="selfLink" id="nThreadsResponsiveness" href="#nThreadsResponsiveness" rel="bookmark"
      >WARNING: Decreased Responsiveness</a>
      <br>In isolation, ERDDAP™ will fulfill a request to a dataset with a higher 
      nThreads setting faster than if nThreads=1.  But while that request is being processed,
      other requests from other users will be somewhat crowded out and get a slower response.
      Also, when ERDDAP™ responds to a given request, other computing resources 
      (e.g., disk drive access, network bandwidth) may be limiting, especially 
      with higher nThreads settings. 
      Thus with higher nThreads settings, the overall system responsiveness 
      will be worse when there are multiple requests being processed -- 
      this can be very annoying to users!
      Because of this: never set nThreads to more than the number of CPU cores in the computer.
      nThreads=1 is the fairest setting since each request (among several simultaneous requests)
      will get an equal share of computing resources.
      But the more powerful the computer, the less this will be a problem.
      <br>&nbsp;      

    <li><a class="selfLink" id="nThreadsMemoryUse" href="#nThreadsMemoryUse" rel="bookmark"
      >WARNING: Higher Memory Use for EDDGrid Datasets</a>
      <br>Memory use while processing requests is directly proportional to the nThreads setting.
      A reasonably safe rule of thumb is: 
      you need to set 
      <a rel="help" href="https://erddap.github.io/setup.html#memory"
      >ERDDAP's memory settings</a> to at least 2GB + (2GB * nThreads).
      Some requests to some datasets will need more memory than that.
      For example, setting nThreads=3 for any EDDGrid dataset
      means that the -Xmx setting should be at least -Xmx8000M.
      If that memory setting is greater than 3/4 the physical memory of the computer,
      decrease the nThreads setting so that you can decrease the memory setting.
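      <p>Applying the rule of thumb above to a few settings gives the following
      (the computer with 16GB of physical memory is a hypothetical example):
      <pre>
nThreads=1:  2GB + (2GB * 1) =  4GB   (-Xmx4000M)
nThreads=2:  2GB + (2GB * 2) =  6GB   (-Xmx6000M)
nThreads=3:  2GB + (2GB * 3) =  8GB   (-Xmx8000M)
nThreads=4:  2GB + (2GB * 4) = 10GB   (-Xmx10000M)

With 16GB of physical memory: 3/4 * 16GB = 12GB.
All of the settings above fit within 12GB. 
nThreads=6 would need 2GB + (2GB * 6) = 14GB, which exceeds 12GB,
so nThreads should be decreased on that computer.
</pre>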

      <p>The memory use of threads processing requests to EDDTable datasets
      is almost always lower because the files are usually much smaller. 
      However, if a given EDDTable dataset has huge (e.g., &gt;=1 GB) data files,
      then the comments above will apply to those datasets as well.

      <p>Whatever the nThreads setting, keep a close eye on the memory usage statistics on your 
      <a rel="help" href="https://erddap.github.io/setup.html#statusPage"
      >ERDDAP's status page</a>. You shouldn't ever come close to maxing out the memory usage
      in ERDDAP; otherwise there will be serious errors and failures.

    <li><a class="selfLink" id="nThreads1" href="#nThreads1" rel="bookmark"
      >Temporarily Set to 1</a>
      <br>If current memory usage is even slightly high, 
        ERDDAP™ will set nThreads for this request to 1.
        Thus, ERDDAP™ conserves memory when memory is scarce.
      <br>&nbsp;

    <li>Diminishing Returns
      <br>There are diminishing returns to increasing the nThreads setting: 
      2 threads will be way better than 1 (if we ignore dynamic overclocking). 
      But 3 will be only a chunk better than 2. 
      And 4 will be only marginally better than 3.  

      <p>In one test of a difficult query to a large EDDTable dataset, 
      the response time using 1, 2, 3, 4, 5, 6 threads was 
      38, 36, 20, 18, 13, 11 seconds.  (We now use nTableThreads=6 on that server.)

      <p>nThreads=2:
      Although there is often a significant benefit to specifying nThreads=2 
      instead of nThreads=1, it often won't make much difference in the clock
      time needed to respond to a given user's request.
      The reason is: with nThreads=1, most modern CPUs will often
      <a rel="help" href="https://en.wikipedia.org/wiki/Intel_Turbo_Boost">dynamically overclock<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>
      (turbo boost) to temporarily increase the clock speed of the CPU. 
      Thus with nThreads=1, the one core will often be working at a higher clock speed than
      each of the two cores if you used nThreads=2.
      Regardless, we still think it is better to use nThreads=2 rather than nThreads=1,
      since that setting will yield better results in a wider variety of situations.
      And of course, if your computer has sufficient CPU cores, 
      an even higher nThreads setting should yield better results.
      
      <p>As discussed above, very high nThreads settings may lead to faster responses 
      to some requests,
      but the risk of overall decreased ERDDAP™ responsiveness and high memory
      use (as noted above) while those requests are being processed 
      means it generally isn't a good idea.

    <li><a class="selfLink" id="nThreadsCPUCores" href="#nThreadsCPUCores" rel="bookmark">CPU Cores</a>
      <br>You shouldn't ever set nThreads to a number larger than the number
      of CPU cores in the computer's CPU. 
      Essentially all modern CPUs have multiple cores (e.g., 2, 4, or 8).
      Some computers even have multiple CPUs (e.g., 2 CPUs * 4 cores/CPU = 8 CPU cores).
      To find out how many CPUs and cores a computer has:
      <ul>
      <li>On Macs, use <kbd><i>Option key</i> : Apple Menu : System Information</kbd>
      <li>On Linux, use <kbd>cat /proc/cpuinfo</kbd>
      <li>On Windows 10, use <kbd><i>Ctrl + Shift + Esc</i> to open Task Manager : Performance</kbd> 
        (<kbd>Cores</kbd> shows the number of physical CPU cores; 
        <kbd>Logical processors</kbd> also counts hyper-threads)
      </ul>

      <p>Yes, most processors these days say that they support 2 threads per core (via
      <a rel="help" href="https://en.wikipedia.org/wiki/Hyper-threading">hyper-threading<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>), 
      but the 2 threads share computing resources, so you won't see twice the
      throughput on a CPU under heavy load. 

      For example, a computer with one CPU with 4 cores may claim to support up to 8 threads,
      but you should never exceed nThreads=4 in that ERDDAP.
      Remember that:
      <ul>
      <li>The nThreads setting in ERDDAP™ is per request. ERDDAP™ often handles multiple requests simultaneously.
      <li>ERDDAP™ does things other than process requests, e.g., reload datasets.
      <li>When ERDDAP™ responds to a given request, other computing resources 
        (e.g., disk drive access, network bandwidth) may be limiting.
        The higher you set nThreads, the more likely that these other resources
        will be maxed out and will slow down ERDDAP's general responsiveness.
      <li>The operating system does things other than run ERDDAP.
      </ul>
      So it is best not to set the nThreads setting to more than the number
      of cores in the computer's CPU.
      <br>&nbsp;      

    <li>Your Mileage May Vary (YMMV)
    <br>The results of different nThreads settings will vary greatly for different
    requests to different datasets on different systems.
    If you really want to know the effect of different nThreads settings, 
    run realistic tests. 
    <br>&nbsp;

    <li>Pfft! Why is nThreads per request?
    <br>I can hear some of you thinking "Why is nThreads per request? If I were coding this,
    I would use one permanent worker thread pool and a messaging queue for better performance."
    The problem with using one worker thread pool and a messaging queue is that 
    one difficult request would flood the queue with numerous slow tasks. 
    That would effectively block ERDDAP™ from even starting work on tasks
    related to other requests until the initial request was (essentially) finished.
    Thus, even simple subsequent requests would respond super slowly. 
    ERDDAP's use of nThreads per request leads to a much fairer use of computing resources.
    <br>&nbsp;

    <li><a class="selfLink" id="nThreadsUnfortunately" href="#nThreadsUnfortunately" rel="bookmark"
    >nThreads vs. Multiple Worker Computers</a>
    <br>Unfortunately, ERDDAP's nThreads system will never be as effective
    as true parallelizing via multiple worker computers, 
    with each working on a chunk of data,
    in the way that Hadoop or Apache Spark are usually used. When the task is
    truly parallelized/distributed to multiple computers, each computer can use all of its
    resources on its part of the task. With ERDDAP's nThreads system, each of the
    threads is competing for the same computer's bandwidth, disk drives, memory, etc.
    Unfortunately, most of us don't have the resources or funds to set up or even rent
    (on Amazon Web Services (AWS) or Google Cloud Platform (GCP)) massive grids of computers. 
    Also, unlike a relational database which is allowed to return the result rows in any order,
    ERDDAP™ makes a promise to return the result rows in a consistent order. This constraint
    makes ERDDAP's nThreads implementation less efficient.
    But ERDDAP's nThreads is useful in many cases.
    <p>However, there are ways to make ERDDAP™ scale to handle a huge number of requests
    quickly by setting up a 
    <a href="https://erddap.github.io/grids.html"
      >grid/cluster/federation of ERDDAPs</a>.
    <br>&nbsp;

    </ul>

<li><a class="selfLink" id="palettes" href="#palettes" rel="bookmark"
      ><kbd><strong>&lt;palettes&gt;</strong></kbd></a> -- 
      Starting with ERDDAP™ version 2.12, datasets.xml can include a &lt;palettes&gt; tag 
      (within <kbd>&lt;erddapDatasets&gt;</kbd>) which overrides 
      the &lt;palettes&gt; tag value from messages.xml (or reverts to the messages.xml
      value if the tag in datasets.xml is empty). 
      This lets you change the list of available palettes while ERDDAP™ is running. 
      It also lets you make a change and have it persist when you install a new version of ERDDAP.
      <br>WARNING: The palettes listed in datasets.xml must be a superset of 
      the palettes listed in messages.xml; otherwise ERDDAP™ will throw an exception
      and stop processing datasets.xml. 
      This ensures that all ERDDAP™ installations at least support the same core palettes.
      <br>WARNING: ERDDAP™ checks that the palette files specified in messages.xml
        actually exist, but it doesn't check the palette files listed in datasets.xml. 
        It's your responsibility to ensure the files are present.

      <p>Also starting with ERDDAP™ version 2.12, if you make a cptfiles subdirectory
      in the ERDDAP™ content directory, ERDDAP™ will copy all the *.cpt files in that directory
      into the [tomcat]/webapps/erddap/WEB-INF/cptfiles directory each time ERDDAP™ starts up.
      Thus, if you put custom cpt files in that directory, those files will be 
      used by ERDDAP™, with no extra effort on your part, even when you install
      a new version of ERDDAP.

      <p>WARNING: If you add custom palettes to your ERDDAP™ and you have EDDGridFromErddap 
      and/or EDDTableFromErddap datasets in your ERDDAP™, then users will see
      your custom palette options on the ERDDAP™ Make A Graph web pages, but
      if the user tries to use them, they'll get a graph with the default (usually Rainbow)
      palette. This is because the image is made by the remote ERDDAP™ which 
      doesn't have the custom palette. The only solutions now are to 
      email the remote ERDDAP™ administrator to add your custom palettes to his/her ERDDAP
      or email Chris.John at noaa.gov to ask that the palettes be added to the standard
      ERDDAP™ distribution.
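      <p>As a rough sketch only, adding one custom palette might look like this. 
      The name MyCustomPalette is hypothetical, and the sketch assumes that a matching 
      MyCustomPalette.cpt file has been placed in the cptfiles subdirectory described above.
      The "..." stands for the complete list copied from the &lt;palettes&gt; tag in 
      your messages.xml (not reproduced here); keep the same separator that is used there.
      <pre>
&lt;erddapDatasets&gt;
  &lt;!-- "..." = the full list from messages.xml, so that this list is a superset of it --&gt;
  &lt;palettes&gt;..., MyCustomPalette&lt;/palettes&gt;
  &lt;!-- datasets and other tags omitted --&gt;
&lt;/erddapDatasets&gt;
</pre>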


<li><a class="selfLink" id="onChange" href="#onChange" rel="bookmark"><kbd><strong>&lt;onChange&gt;</strong></kbd></a>
  is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml that specifies an action which 
  will be done when this dataset is created (when ERDDAP™ is restarted) and whenever 
  this dataset changes in any way. 
  <ul>
  <li>Currently, for EDDGrid subclasses, any change to metadata or to an axis variable 
    (for example, a new time point for near-real-time data) is considered a change,
    but a reloading of the dataset is not considered a change (by itself).
  <li>Currently, for EDDTable subclasses, any reloading of the dataset is considered a change.
  <li>Currently, only two types of actions are allowed:
    <ul>
    <li>http:// or https:// -- If the action starts with "http://" or "https://",
      ERDDAP™ will send an HTTP GET request to the
      specified URL. The response will be ignored.
      For example, the URL might tell some other web service to do something.
      <ul>
      <li><a class="selfLink" id="PercentEncoded" href="#PercentEncoded" rel="bookmark">If the URL has a query part (after the "?"),</a> it MUST be already 
        <a rel="help" href="https://en.wikipedia.org/wiki/Percent-encoding">percent encoded<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        You need to encode special characters in the constraints 
        (other than the initial '&amp;' and the main '='
        in constraints) into the form %HH, where HH is the 2 digit hexadecimal value of the character.
        Usually, you just need to convert a few of the punctuation characters: % into %25, 
        &amp; into %26, " into %22, &lt; into %3C, = into %3D, &gt; into %3E, + into %2B,
        | into %7C, [ into %5B, ] into %5D, space into %20, 
        and convert all characters above #127 into their UTF-8 form and then percent encode
        each byte of the UTF-8 form into the %HH format (ask a programmer for help).
        <br>For example, <kbd>&amp;stationID&gt;="41004"</kbd>
        <br>becomes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<kbd>&amp;stationID%3E=%2241004%22</kbd>
        <br>Percent encoding is generally required when you access ERDDAP
          via software other than a browser. Browsers usually handle percent encoding for you.
        <br>In some situations, you need to percent encode all characters other than
        A-Za-z0-9_-!.~'()*, but still don't encode the initial '&amp;' or the main '=' in constraints.
        <br>Programming languages have tools to do this (for example, see Java's
          <a rel="help" href="https://docs.oracle.com/javase/8/docs/api/java/net/URLEncoder.html">java.net.URLEncoder<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
        and JavaScript's
          <a rel="help" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent">encodeURIComponent()<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>) 
          and there are
        <br><a rel="help" href="https://www.url-encode-decode.com/">websites that percent encode/decode for you<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
        <li>Since datasets.xml is an XML file, you MUST also &amp;-encode ALL '&amp;', '&lt;', and '&gt;' in the
          URL as '&amp;amp;', '&amp;lt;', and '&amp;gt;' after percent encoding.          
      <li>Example: For a URL that you might type into a browser as:
        <br><kbd>https://www.company.com/webService?department=R%26D&amp;param2=value2</kbd> 
        <br>You should specify an <kbd>&lt;onChange&gt;</kbd> tag via (on one line)
        <br><kbd>&lt;onChange&gt;https://www.company.com/webService?department=R%26D&amp;amp;param2=value2&lt;/onChange&gt;</kbd> 
      </ul>
    <li>mailto: -- If the action starts with "mailto:", ERDDAP™ will send an email to the subsequent
      email address indicating that the dataset has been updated/changed.
      <br>For example: <kbd>&lt;onChange&gt;mailto:john.smith@company.com&lt;/onChange&gt;</kbd>
    </ul>

    If you have a good reason for ERDDAP™ to support some other type of action, send us an email
    describing what you want.
  <li>This tag is OPTIONAL. There can be as many of these tags as you want.
    Use one of these tags for each action to be performed.
  <li>This is analogous to ERDDAP's email/URL subscription system, but these actions aren't stored
    persistently (i.e., they are only stored in an EDD object).
  <li>To remove a subscription, just remove the <kbd>&lt;onChange&gt;</kbd> tag.
    The change will be noted the next time the dataset is reloaded.
    <br>&nbsp;
  </ul>
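  <p>As a minimal sketch of the points above, a dataset that should both contact a
  web service and send an email when it changes could include two
  <kbd>&lt;onChange&gt;</kbd> tags, one per action. The URL and email address are
  the placeholder values from the examples above, not real endpoints:<pre>
&lt;onChange&gt;https://www.company.com/webService?department=R%26D&amp;amp;param2=value2&lt;/onChange&gt;
&lt;onChange&gt;mailto:john.smith@company.com&lt;/onChange&gt;
</pre>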

<li><a class="selfLink" id="reloadEveryNMinutes" href="#reloadEveryNMinutes" rel="bookmark"><kbd><strong>&lt;reloadEveryNMinutes&gt;</strong></kbd></a>
  is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml of almost all dataset types
  that specifies how often the dataset should be reloaded.   For example,
  <br><kbd>&lt;reloadEveryNMinutes&gt;60&lt;/reloadEveryNMinutes&gt;</kbd>
  <ul>
  <li>Generally, datasets that change frequently (for example, get new data files) 
    should be reloaded frequently, for example, every 60 minutes.
  <li>Datasets that change infrequently should be reloaded infrequently, for example,
    every 1440 minutes (daily) or 10080 minutes (weekly).
  <li>This tag is OPTIONAL, but recommended. The default is 10080.
  <li>An example is: <kbd>&lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;</kbd>
  <li>When a dataset is reloaded, all files in the 
      <i>bigParentDirectory</i>/cache/<i>datasetID</i>
    directory are deleted.
  <li>No matter what this is set to, a dataset won't be loaded more frequently than
    <kbd>&lt;loadDatasetsMinMinutes&gt;</kbd> (default = 15), as specified in 
     <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a>.
   So if you want datasets to be reloaded very frequently, you need to set both
     reloadEveryNMinutes and loadDatasetsMinMinutes to small values.
  <li>Don't set reloadEveryNMinutes to the same value as loadDatasetsMinMinutes,
    because the elapsed time is likely to be (for example) 14:58 or 15:02,
    so the dataset will only be reloaded in about half of the major reloads. 
    Instead, use a smaller (for example, 10) or larger (for example, 20) reloadEveryNMinutes value.
  <li>Regardless of reloadEveryNMinutes, you can manually tell ERDDAP™ to reload a specific dataset
    as soon as possible via a
     <a rel="help" href="https://erddap.github.io/setup.html#flag">flag file</a>.
  <li><a class="selfLink" id="ProactiveVsReactive" href="#ProactiveVsReactive" rel="bookmark">Proactive versus Reactive</a> -- 
    ERDDAP's reload system is proactive --
    datasets are reloaded soon after their reloadEveryNMinutes time is up 
    (i.e., they become "stale", but never very stale),
    whether the dataset is getting requests from users or not. So ERDDAP™ datasets
    are always up-to-date and ready for use. 
    This is in contrast to THREDDS' reactive approach:
    a user's request is what tells THREDDS to check if a dataset is stale (it may be very stale).
    If it is stale, THREDDS makes the user wait (often for a few minutes) while
    the dataset is reloaded.
  <li>For Curious Programmers -- In ERDDAP™, 
    the reloading of all datasets is handled by two single-purpose threads.
    One thread initiates a minor reload if it finds a flag file 
    or a major reload (which checks all datasets to see if they need to be reloaded).
    The other thread does the actual reload of the datasets one at a time.
    These threads work in the background ensuring that all datasets are kept up-to-date.
    The thread which actually does the reloads prepares a new version of a dataset
    then swaps it into place (essentially replacing the old version atomically). 
    So it is very possible that the following sequence of events occurs (it's a good thing):
    <ol>
    <li>ERDDAP™ starts reloading a dataset (making a new version) in the background.
    <li>User 'A' makes a request to the dataset. ERDDAP™ 
      uses the current version of the dataset to create the response.
      (That is good. There was no delay for the user, and the current version 
      of the dataset should never be very stale.)
    <li>ERDDAP™ finishes creating the new reloaded version of the dataset 
      and swaps that new version into production. 
      All subsequent new requests are handled by the new version of the dataset.
      For consistency, user A's request is still being filled by the 
      original version.
    <li>User 'B' makes a request to the dataset and ERDDAP™ 
      uses the new version of the dataset to create the response.
    <li>Eventually user A's and user B's requests are completed (perhaps A's finishes first,
      perhaps B's finishes first).
    </ol>

    <p>I can hear someone saying, "Just two threads! Ha! 
    That's lame! He should set that up so
    that reloading of datasets uses as many threads as are needed, so it all gets done 
    faster and with little or no lag." Yes and no. The problem is that 
    loading more than one dataset at a time creates several hard new problems. 
    They all need to be solved or dealt with. 
    The current system works well and has manageable problems (for example, potential 
    for lag before a flag is noticed).
    (If you need help managing them, email <kbd>erd dot data at noaa dot gov</kbd> .)
    The related 
    <a rel="help" href="#updateEveryNMillis">updateEveryNMillis</a>
    system works within response threads, so it can and does lead to multiple
    datasets being updated (not the full reload) simultaneously.
  </ul>
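  <p>To illustrate the advice above about <kbd>reloadEveryNMinutes</kbd> and
  <kbd>loadDatasetsMinMinutes</kbd>, here is a rough sketch. It assumes
  <kbd>loadDatasetsMinMinutes</kbd> is left at its default of 15; the
  <kbd>datasetID</kbd> and dataset type are hypothetical, and the other tags
  that a real dataset needs are omitted:<pre>
&lt;!-- setup.xml: minimum time between major reloads (15 is the default) --&gt;
&lt;loadDatasetsMinMinutes&gt;15&lt;/loadDatasetsMinMinutes&gt;

&lt;!-- datasets.xml: hypothetical dataset; other required tags omitted --&gt;
&lt;dataset type="EDDGridFromDap" datasetID="exampleFrequentDataset" active="true"&gt;
  &lt;!-- 10, not 15, so the dataset should be due at every major reload --&gt;
  &lt;reloadEveryNMinutes&gt;10&lt;/reloadEveryNMinutes&gt;
&lt;/dataset&gt;
</pre>
  With these values, the elapsed time at each major reload (about 15 minutes) is
  comfortably larger than 10, so the dataset should not be skipped the way it might be 
  if both values were 15.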

<li><a class="selfLink" id="updateEveryNMillis" href="#updateEveryNMillis" rel="bookmark"><kbd><strong>&lt;updateEveryNMillis&gt;</strong></kbd></a>
  is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml of some dataset types
  that helps ERDDAP™ work with datasets that change very frequently 
  (as often as roughly every second). 
  Unlike ERDDAP's regular, proactive, 
  <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>
  system for completely reloading each dataset, 
  this OPTIONAL additional system is reactive (triggered by a user request)
  and quicker because it is incremental 
  (just updating the information that needs to be updated).
  For example, if a request to an EDDGridFromDap dataset occurs more than
  the specified number of milliseconds since the last update,
  ERDDAP™ will see if there are any new values for the leftmost (first, usually "time") dimension
  and, if so, just download those new values before handling the user's request.
  This system is very good at keeping a rapidly changing dataset up-to-date 
  with minimal demands on the data source,
  but at the cost of slightly slowing down the processing of some user requests.
  <ul>  
  <li>To use this system, add (for example):
    <br><kbd>&lt;updateEveryNMillis&gt;1000&lt;/updateEveryNMillis&gt;</kbd>
    <br>right after the <kbd>&lt;reloadEveryNMinutes&gt;</kbd> tag for the dataset in datasets.xml.
    The number of milliseconds that you specify can be as small as 1
    (to ensure that the dataset is always up-to-date).
    A value of 0 (the default) or a negative number turns off the system. 
  <li>Due to their incremental nature, updates should finish very quickly, 
    so users should never have to wait a long time. 
  <li>If a second data request arrives before the previous update has finished,
    the second request won't trigger another update.
  <li>Throughout the documentation, we will try to use the word "reload" for regular, 
    full dataset reloads, and "update" for these new incremental, partial updates.
  <li>For testing purposes, some diagnostics are printed to 
    log.txt if 
    <a rel="help" href="#logLevel"><kbd>&lt;logLevel&gt;</kbd></a> in datasets.xml
    is set to "all".
  <li>If you use incremental updates, and especially if the
    leftmost (first) axis (for example, time) is large, you may want to set 
    <kbd>&lt;reloadEveryNMinutes&gt;</kbd> to a larger number (1440?),
    so that updates do most of the work to keep the dataset up-to-date,
    and full reloads are done infrequently. 
  <li>Note: this new update system updates metadata (for example, time actual_range,
    time_coverage_end, ...)
    but doesn't trigger onChange (email or touch URL) 
    or change the RSS feed (perhaps it should...).    
  <li><a class="selfLink" id="updateFiles" href="#updateFiles" rel="bookmark">For all datasets that use subclasses of</a> 
    <a rel="help" href="#EDDGridFromFiles">EDDGridFromFiles</a> and
    <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>:
    <ul>
    <li><strong>WARNING:</strong> 
      when you add a new data file to a dataset by copying it into the
      directory that ERDDAP™ looks at, there is a danger that ERDDAP™ will
      notice the partially written file; try to read it, but fail because the
      file is incomplete; declare the file to be a "bad" file and 
      remove it (temporarily) from the dataset. 
      <br>To avoid this, we <strong>STRONGLY RECOMMEND</strong> that you copy a new file
      into the directory with a temporary name (for example, 20150226.ncTmp) 
      that doesn't match the dataset's fileNameRegex (for example, .*\.nc), 
      then rename the file to the correct name (for example, 20150226.nc).
      If you use this approach, ERDDAP™ will ignore the temporary file
      and only notice the correctly named file when it is complete 
      and ready to be used.
    <li>If you modify existing datafiles in place (for example, to add a new data point),
      <kbd>&lt;updateEveryNMillis&gt;</kbd> will work well if the changes appear
      atomically (in an instant) and the file is always a valid file.
      For example, the netcdf-java library allows for additions to the
      unlimited dimension of a "classic" .nc v3 file to be made atomically.
      <br><kbd>&lt;updateEveryNMillis&gt;</kbd> will work badly if the file is invalid
      while the changes are being made.
    <li><kbd>&lt;updateEveryNMillis&gt;</kbd> will work well for datasets where 
      one or a few files change in a short amount of time.
    <li><kbd>&lt;updateEveryNMillis&gt;</kbd> will work poorly for datasets where 
      a large number of files change in a short amount of time
      (unless the changes appear atomically).
      For these datasets, it is better to not use <kbd>&lt;updateEveryNMillis&gt;</kbd> 
      and to set a 
      <a rel="help" 
      href="https://erddap.github.io/setup.html#setDatasetFlag">flag</a> to tell ERDDAP™ to reload the dataset.
    <li><kbd>&lt;updateEveryNMillis&gt;</kbd> does not update the information associated with the 
      <a rel="help" href="#subsetVariables">&lt;subsetVariables&gt;</a>.
      Normally, this is not a problem, because the subsetVariables have information
      about things that don't change very often (for example, the list of station names,
      latitudes, and longitudes).
      If the subsetVariables data changes (for example, when a new station is added
      to the dataset), then contact the
      <a rel="help" 
        href="https://erddap.github.io/setup.html#setDatasetFlag">flag URL</a> 
      for the dataset to tell ERDDAP™ to reload the dataset.
      Otherwise, ERDDAP™ won't notice the new subsetVariable information until
      the next time the dataset is reloaded (&lt;reloadEveryNMinutes&gt;).
    <li>Our generic recommendation is to use:
      <br><kbd>&lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
      <br>&lt;updateEveryNMillis&gt;10000&lt;/updateEveryNMillis&gt;</kbd>
    <li>TROUBLE? 
      On Linux computers, 
      if you are using <kbd>&lt;updateEveryNMillis&gt;</kbd>
      with EDDGridFromFiles or EDDTableFromFiles
      classes, you may see a problem where a dataset fails to load (occasionally or
      consistently) with the error message:
      "IOException: User limit of inotify instances reached or too many open files".
      The cause may be a bug in Java which causes inotify instances not to be garbage collected. 
      This problem is avoided in ERDDAP™ v1.66 and higher. 
      So the best solution is to switch to the latest version of ERDDAP™.
      <br>If that doesn't solve the problem (that is, if you have a really large
      number of datasets using <kbd>&lt;updateEveryNMillis&gt;</kbd>),
      you can fix this problem by calling:<kbd>
      <br>sudo sysctl fs.inotify.max_user_watches=65536
      <br>sudo sysctl fs.inotify.max_user_instances=1024
      <br>sudo sysctl -p</kbd>
      <br>Or, use higher numbers if the problem persists.
      The default for watches is 8192. The default for instances is 128.
    </ul>
  <li>You can put <kbd>&lt;updateMaxEvents&gt;10&lt;/updateMaxEvents&gt;</kbd>
    in datasets.xml (in with the other settings near the top) to change the maximum
    number of file changes (default=10) that will be processed by the updateEveryNMillis system.
    A larger number may be useful for datasets where it is very important that they
    be kept always up-to-date.
    See the <a rel="help" href="#updateMaxEvents">updateMaxEvents documentation</a>.
  <li>For Curious Programmers -- these incremental updates, unlike ERDDAP's full
    <a rel="help" href="#reloadEveryNMinutes"><kbd>reloadEveryNMinutes</kbd></a>
    system, occur within user request threads.
    So, any number of datasets can be updating simultaneously.
    There is code (and a lock) to ensure that only one thread is working
    on an update for any given dataset at any given moment. 
    Allowing multiple simultaneous updates was easy;
    allowing multiple simultaneous full reloads would be harder.
    <br>&nbsp;
  </ul>
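  <p>The recommendations above for file-based datasets can be sketched as the following
  fragment. It is only an illustration: the <kbd>datasetID</kbd>, the directory, and the
  choice of EDDTableFromNcFiles are assumptions, and the other tags that a real dataset
  needs are omitted:<pre>
&lt;dataset type="EDDTableFromNcFiles" datasetID="exampleBuoyFiles" active="true"&gt;
  &lt;!-- full reload once a day; incremental updates at most every 10 seconds --&gt;
  &lt;reloadEveryNMinutes&gt;1440&lt;/reloadEveryNMinutes&gt;
  &lt;updateEveryNMillis&gt;10000&lt;/updateEveryNMillis&gt;
  &lt;fileDir&gt;/data/exampleBuoys/&lt;/fileDir&gt;
  &lt;!-- matches 20150226.nc but not the temporary name 20150226.ncTmp --&gt;
  &lt;fileNameRegex&gt;.*\.nc&lt;/fileNameRegex&gt;
  &lt;!-- other required tags omitted --&gt;
&lt;/dataset&gt;
</pre>
  With a regex like this, a file copied in as 20150226.ncTmp should be ignored
  until it is renamed to 20150226.nc.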

<li><a class="selfLink" id="sourceCanConstrainStringEQNE" href="#sourceCanConstrainStringEQNE" rel="bookmark"><kbd><strong>&lt;sourceCanConstrainStringEQNE&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDTable <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml that 
  specifies if the source can constrain String variables with the = and != operators.
  <ul>
  <li>For EDDTableFromDapSequence, this applies to the outer sequence String variables only.
    It is assumed that the source can't handle any constraints on inner sequence variables.
  <li>This tag is OPTIONAL. Valid values are <kbd>true</kbd> (the default) and <kbd>false</kbd>.
  <li>For EDDTableFromDapSequence OPeNDAP DRDS servers, this should be set to true (the default).
  <li>For EDDTableFromDapSequence Dapper servers, this should be set to false.
  <li>An example is:
    <br><kbd>&lt;sourceCanConstrainStringEQNE&gt;true&lt;/sourceCanConstrainStringEQNE&gt;</kbd>
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="sourceCanConstrainStringGTLT" href="#sourceCanConstrainStringGTLT" rel="bookmark"><kbd><strong>&lt;sourceCanConstrainStringGTLT&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDTable <kbd>&lt;dataset&gt;</kbd> tag that 
  specifies if the source can constrain String variables with the &lt;, &lt;=, &gt;, and &gt;= operators.
  <ul>
  <li>For EDDTableFromDapSequence, this applies to the outer sequence String variables only.
     It is assumed that the source can't handle any constraints on inner sequence variables.
  <li>This tag is OPTIONAL. Valid values are <kbd>true</kbd> (the default) and <kbd>false</kbd>.
  <li>For EDDTableFromDapSequence OPeNDAP DRDS servers, this should be set to true (the default).
  <li>For EDDTableFromDapSequence Dapper servers, this should be set to false.
  <li>An example is: 
    <br><kbd>&lt;sourceCanConstrainStringGTLT&gt;true&lt;/sourceCanConstrainStringGTLT&gt;</kbd>
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="sourceCanConstrainStringRegex" href="#sourceCanConstrainStringRegex" rel="bookmark"><kbd><strong>&lt;sourceCanConstrainStringRegex&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDTable <kbd>&lt;dataset&gt;</kbd> tag that 
  specifies if the source can constrain String variables by regular expressions, and if so, 
  what the operator is.
  <ul>
  <li>Valid values are "=~" (the DAP standard), "~=" (mistakenly supported by many DAP servers), 
    or "" (indicating that the source doesn't support regular expressions).
  <li>This tag is OPTIONAL. The default is "".
  <li>For EDDTableFromDapSequence OPeNDAP DRDS servers, this should be set to "" (the default).
  <li>For EDDTableFromDapSequence Dapper servers, this should be set to "" (the default).
  <li>An example is:
    <br><kbd>&lt;sourceCanConstrainStringRegex&gt;=~&lt;/sourceCanConstrainStringRegex&gt;</kbd>
    <br>&nbsp;
  </ul>
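  <p>Taken together, the three <kbd>sourceCanConstrainString...</kbd> settings recommended
  above for a Dapper server might look like the following sketch. The <kbd>datasetID</kbd>
  is hypothetical, the other required tags are omitted, and writing the "" value as an 
  empty tag is an assumption about how that value is expressed:<pre>
&lt;dataset type="EDDTableFromDapSequence" datasetID="exampleDapperDataset" active="true"&gt;
  &lt;sourceCanConstrainStringEQNE&gt;false&lt;/sourceCanConstrainStringEQNE&gt;
  &lt;sourceCanConstrainStringGTLT&gt;false&lt;/sourceCanConstrainStringGTLT&gt;
  &lt;!-- empty value: no regular expression support (the default, so this tag could be left out) --&gt;
  &lt;sourceCanConstrainStringRegex&gt;&lt;/sourceCanConstrainStringRegex&gt;
  &lt;!-- other required tags omitted --&gt;
&lt;/dataset&gt;
</pre>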

<li><a class="selfLink" id="sourceCanDoDistinct" href="#sourceCanDoDistinct" rel="bookmark"><kbd><strong>&lt;sourceCanDoDistinct&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDTableFromDatabase <kbd>&lt;dataset&gt;</kbd> tag that 
  specifies if the source database should handle &amp;distinct() constraints
  in user queries.
  <ul>
  <li>This tag is OPTIONAL. Valid values are <kbd>no</kbd> (ERDDAP™ handles distinct; the default),
    <kbd>partial</kbd> (the source handles distinct and ERDDAP™ handles it again),
    and <kbd>yes</kbd> (the source handles distinct).
  <li>If you are using <kbd>no</kbd> and ERDDAP™ is running out of memory when
    handling distinct, use <kbd>yes</kbd>.
  <li>If you are using <kbd>yes</kbd> and the source database handles distinct too slowly,
    use <kbd>no</kbd>.
  <li><kbd>partial</kbd> gives you the worst of both: 
    it is slow because the database handling of distinct is slow and
    it may run out of memory in ERDDAP.
  <li>Databases interpret <kbd>DISTINCT</kbd> as a request for just
    unique rows of results, whereas ERDDAP™ interprets it as a request for
    a sorted list of unique rows of results.
    If you set this to <kbd>partial</kbd> or <kbd>yes</kbd>, ERDDAP™ automatically
    also tells the database to sort the results. 
  <li>One small difference in the results: 
    <br>With <kbd>no|partial</kbd>, ERDDAP™ will sort "" at the start of the results (before non-"" strings).
    <br>With <kbd>yes</kbd>, the database may (Postgres will) sort "" at the end of the results 
      (after non-"" strings).
    <br>This may also affect the sorting of short words
      versus longer words that start with the short word.
      For example, ERDDAP™ will sort "Simon" before "Simons".
  <li>An example is:
    <br><kbd>&lt;sourceCanDoDistinct&gt;yes&lt;/sourceCanDoDistinct&gt;</kbd>
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="sourceCanOrderBy" href="#sourceCanOrderBy" rel="bookmark"><kbd><strong>&lt;sourceCanOrderBy&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDTableFromDatabase <kbd>&lt;dataset&gt;</kbd> tag that 
  specifies if the source database should handle &amp;orderBy(...) constraints
  in user queries.
  <ul>
  <li>This tag is OPTIONAL. Valid values are <kbd>no</kbd> (ERDDAP™ handles orderBy(...); the default),
    <kbd>partial</kbd> (the source handles orderBy and ERDDAP™ handles it again),
    and <kbd>yes</kbd> (the source handles orderBy(...)).
  <li>If you are using <kbd>no</kbd> and ERDDAP™ is running out of memory when
    handling orderBy(...), use <kbd>yes</kbd>.
  <li>If you are using <kbd>yes</kbd> and the source database handles orderBy(...) too slowly,
    use <kbd>no</kbd>.
  <li><kbd>partial</kbd> gives you the worst of both: 
    it is slow because the database handling of orderBy(...) is slow and
    it may run out of memory in ERDDAP.
  <li>One small difference in the results: 
    <br>With <kbd>no|partial</kbd>, ERDDAP™ will sort "" at the start of the results (before non-"" strings).
    <br>With <kbd>yes</kbd>, the database may (Postgres will) sort "" at the end of the results 
      (after non-"" strings).
    <br>This may also affect the sorting of short words
      versus longer words that start with the short word.
      For example, ERDDAP™ will sort "Simon" before "Simons", but I'm not sure 
      about how a database will sort them.
  <li>An example is:
    <br><kbd>&lt;sourceCanOrderBy&gt;yes&lt;/sourceCanOrderBy&gt;</kbd>
    <br>&nbsp;
  </ul>
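  <p>As a sketch of how <kbd>&lt;sourceCanDoDistinct&gt;</kbd> and
  <kbd>&lt;sourceCanOrderBy&gt;</kbd> might be combined, suppose ERDDAP™ is running out 
  of memory when handling distinct and orderBy(...) for a large database table, 
  so both tasks are handed to the database. The <kbd>datasetID</kbd> is hypothetical
  and the other required tags are omitted:<pre>
&lt;dataset type="EDDTableFromDatabase" datasetID="exampleDatabaseDataset" active="true"&gt;
  &lt;!-- let the database handle &amp;amp;distinct() and &amp;amp;orderBy(...) --&gt;
  &lt;sourceCanDoDistinct&gt;yes&lt;/sourceCanDoDistinct&gt;
  &lt;sourceCanOrderBy&gt;yes&lt;/sourceCanOrderBy&gt;
  &lt;!-- other required tags omitted --&gt;
&lt;/dataset&gt;
</pre>
  If the database then proves too slow, the text above suggests switching back to
  <kbd>no</kbd> rather than trying <kbd>partial</kbd>.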

<li><a class="selfLink" id="sourceNeedsExpandedFP_EQ" href="#sourceNeedsExpandedFP_EQ" rel="bookmark"><kbd><strong>&lt;sourceNeedsExpandedFP_EQ&gt;</strong></kbd></a>
  is an OPTIONAL tag within an EDDTable <kbd>&lt;dataset&gt;</kbd> tag that 
  specifies (true (the default) or false) if the source needs help with queries with 
  <kbd>&lt;numericVariable&gt;=&lt;floatingPointValue&gt;</kbd> (and !=, &gt;=, &lt;=).  
  For example,
  <br><kbd>&lt;sourceNeedsExpandedFP_EQ&gt;false&lt;/sourceNeedsExpandedFP_EQ&gt;</kbd>
  <ul>
  <li>For some data sources, numeric queries involving =, !=, &lt;=, or &gt;= may not work as desired
    with floating point numbers.
    For example, a search for <kbd>longitude=220.2</kbd> may fail if the value is stored as 220.20000000000001.
  <li>This problem arises because floating point numbers are
    <a rel="help" href="https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/">not represented exactly within computers<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
  <li>If sourceNeedsExpandedFP_EQ is set to true (the default), 
    ERDDAP™ modifies the queries sent to the data source to avoid this problem.
    It is always safe and fine to leave this set to true.
    <br>&nbsp;
  </ul>

<li><a class="selfLink" id="sourceUrl" href="#sourceUrl" rel="bookmark"><kbd><strong>&lt;sourceUrl&gt;</strong></kbd></a> 
  is a common tag within a dataset's global <kbd>&lt;addAttributes&gt;</kbd> 
  tag that specifies the URL that is the source of the data.
  <ul>
  <li>An example is: 
  <br><kbd>&lt;sourceUrl&gt;https://oceanwatch.pfeg.noaa.gov/thredds/dodsC/<wbr>satellite/<wbr>VH/chla/1day<wbr>&lt;/sourceUrl&gt;</kbd>
  <br>(but put it all on one line)
  <li>In ERDDAP™, all datasets will have a "sourceUrl" in the combined global attributes
    which are shown to the users.
  <li>For most dataset types, this tag is REQUIRED.  
    See the dataset type's description to find out if this is REQUIRED or not.
  <li>For some datasets, the separate &lt;sourceUrl&gt; tag is not allowed.
    Instead, you must provide a "sourceUrl" 
    <a rel="help" href="#globalAttributes">global attribute</a>, usually in 
    the global <kbd>&lt;addAttributes&gt;</kbd>.
    If there is no actual source URL (for example, if the data is stored in local files),
    this attribute often just has a placeholder value,
    for example, <kbd>&lt;att name="sourceUrl"&gt;(local files)&lt;/att&gt;</kbd> .
  <li>For most datasets, this is the base of the URL that is used to request data.
     For example, for DAP servers, this is the URL to which .dods, .das, .dds, 
     or .html could be added.
  <li>Since datasets.xml is an XML file, you MUST also encode '&amp;', '&lt;', and '&gt;'
    in the URL as '&amp;amp;', '&amp;lt;', and '&amp;gt;'.
  <li>For most dataset types, ERDDAP™ adds the original sourceUrl (the "localSourceUrl" 
    in the source code)
    to the <a rel="help" href="#globalAttributes">global attributes</a> (where it becomes the "publicSourceUrl" in the source code).
    When the data source is local files, ERDDAP™ adds sourceUrl="(local files)" to the global attributes
    as a security precaution.
    When the data source is a database, ERDDAP™ adds sourceUrl="(source database)" to the global 
    attributes as a security precaution.
    If some of your datasets use non-public sourceUrls (usually because the source computer is
    in your DMZ or on a local LAN) you can use
      <a rel="help" href="#convertToPublicSourceUrl"><kbd>&lt;convertToPublicSourceUrl&gt;</kbd></a>
      tags to specify 
    how to convert the local sourceUrls to public sourceUrls.
  <li><a class="selfLink" id="handshakeAlert" href="#handshakeAlert" rel="bookmark"
    >A sourceUrl may begin with</a> 
    http://, https://, ftp://, and perhaps other prefixes.
    https connections read and check the source's digital certificate to ensure
    that the source is who they say they are. 
    In rare cases, this check may fail with the error 
    "javax.net.ssl.SSLProtocolException: handshake alert: unrecognized_name".
    This is probably due to the domain name on the certificate not matching the 
    domain name that you are using. 
    You can and should read the details 
    of the sourceUrl's certificate in your web browser,
    notably, the list of "DNS Name"s in the "Subject Alternative Name" section.

    <p>In some cases, the sourceUrl you are using may be an alias of the  
    domain name on the certificate. For example,  
    <br>https://podaac-opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/
    will throw this error, but 
    <br>https://opendap.jpl.nasa.gov/opendap/allData/ccmp/L3.5a/monthly/flk/
    , which uses the domain name on the certificate, won't. 
    The solution in these cases is therefore to find and use the domain name 
    on the certificate.
    If you can't find it on the certificate, contact the data provider.
 
    <p>In other cases, the domain name on the certificate may be for a 
    group of names. If this occurs or the problem is otherwise unsolvable, 
    please email Chris.John at noaa.gov to report the problem.

    <br>&nbsp;
  </ul>
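  <p>Two of the cases described above can be sketched as follows. Both are illustrations
  only: the URL is a made-up address on www.example.com, and the second fragment assumes
  a dataset type that requires a "sourceUrl" global attribute instead of a separate 
  <kbd>&lt;sourceUrl&gt;</kbd> tag:<pre>
&lt;!-- hypothetical source URL containing '&amp;', which is written as &amp;amp; in datasets.xml --&gt;
&lt;sourceUrl&gt;https://www.example.com/dataService?station=abc&amp;amp;format=dods&lt;/sourceUrl&gt;

&lt;!-- placeholder global attribute for data stored in local files --&gt;
&lt;addAttributes&gt;
  &lt;att name="sourceUrl"&gt;(local files)&lt;/att&gt;
&lt;/addAttributes&gt;
</pre>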


<li><a class="selfLink" id="addAttributes" href="#addAttributes" rel="bookmark"><kbd><strong>&lt;addAttributes&gt;</strong></kbd></a> 
    is an OPTIONAL tag for each dataset and for each variable which lets ERDDAP
  administrators control the metadata attributes associated with a dataset and its variables.  
  <ul>
  <li>ERDDAP™ combines the attributes from the dataset's source ("sourceAttributes")
    and the "addAttributes" which you define in datasets.xml (which have priority) 
    to make the "combinedAttributes", which are what ERDDAP™ users see. 
    Thus, you can use addAttributes to redefine the values of sourceAttributes, add new
    attributes, or remove attributes.
  <li>The <kbd>&lt;addAttributes&gt;</kbd> tag encloses 0 or more 
    <kbd><strong>&lt;att&gt;</strong></kbd> subtags, which are used
    to specify individual attributes. 
  <li>Each attribute consists of a name and a value (which has a specific data type, 
    for example, double). 
  <li>There can be only one attribute with a given name.
    If there are more, the last one has priority.
  <li>The value can be a single value or a space-separated list of values.

  <li>Syntax 
    <ul>
    <li>The order of the <kbd>&lt;att&gt;</kbd> 
      subtags within <kbd>addAttributes</kbd> is not important.
    <li>The <kbd>&lt;att&gt;</kbd> subtag format is 
      <br><kbd>&lt;att name="<i>name</i>" [type="<i>type</i>"] &gt;<i>value</i>&lt;/att&gt;</kbd>
    <li>The destination name of all attributes MUST start with a letter (A-Z, a-z)
      and MUST contain only the characters A-Z, a-z, 0-9, or '_'.
    <li>If an <kbd>&lt;att&gt;</kbd> subtag has no value or a value of <kbd>null</kbd>, 
      that attribute will be removed from the combined attributes.
      <br>For example, <kbd>&lt;att name="rows" /&gt;</kbd> will remove <kbd>rows</kbd> from the combined attributes.
      <br>For example, <kbd>&lt;att name="coordinates"&gt;null&lt;/att&gt;</kbd> will remove <kbd>coordinates</kbd>
      from the combined  attributes.
    <li><a class="selfLink" id="attributeType" href="#attributeType" rel="bookmark">The OPTIONAL <kbd>type</kbd> value for <kbd>&lt;att&gt;</kbd> 
      subtags</a> indicates the data type for the values.
      The default type is <kbd>String</kbd>. An example of a String attribute is:
      <br><kbd>&lt;att name="creator_name"&gt;NASA/GSFC OBPG&lt;/att&gt;</kbd>
      <ul>
      <li>Valid types for single values are
        <kbd>byte (8-bit signed integer), short (16-bit signed integer), 
        int (32-bit signed integer), long (64-bit signed integer), 
        float (32-bit floating point), double (64-bit floating point),
        char</kbd>, 
        and <kbd>String</kbd>. For example,
        <br><kbd>&lt;att name="scale_factor" type="float"&gt;0.1&lt;/att&gt;</kbd>

        <p>See these notes about the <a rel="help" href="#charData">char data type</a>.
        <br>See these notes about the <a rel="help" href="#longData">long data type</a>.

      <li>Valid types for space-separated lists of values (or single values) are 
        <kbd>byteList, shortList, 
        unsignedShortList, charList, intList, longList, floatList, doubleList</kbd>.  
        For example, 
        <br><kbd>&lt;att name="actual_range" type="doubleList"&gt;10.34 23.91&lt;/att&gt;</kbd>
        <br>An unsignedShortList lets you specify a list of unsigned shorts, but
          they will be converted into a list of the corresponding Unicode characters (e.g.,
          "65 67 69" will be converted into "A C E").
        <br>If you specify a charList, encode any special characters (e.g., space, 
          double quotes, backslash, &lt;#32, or &gt;#127) as you would 
          encode them in the data section of an NCCSV file
          (e.g., " ", "\"" or """", "\\", "\n", "\u20ac").
        <br>There is no <kbd>stringList</kbd>. Store the String values as a multi-line String. 
        For example,
        <br><kbd>&lt;att name="history"&gt;2011-08-05T08:55:02Z ATAM - made CF-1.6 compliant. 
        <br>2012-04-08T08:34:58Z ATAM - Changed &#039;height&#039; from double to float.&lt;/att&gt;</kbd>
        
      <!-- li>Boolean values are case-insensitive. 
         For true, use "TRUE", "true", "T", or "t".
         For false, use "FALSE", "false", "F", or "f". -->
         <br>&nbsp;
      </ul> 
    </ul>
  </ul>
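  <p>The syntax rules above can be pulled together in one small sketch of a 
  variable's <kbd>&lt;addAttributes&gt;</kbd>. The attribute values are illustrative 
  (several are reused from the examples above), not taken from a real dataset:<pre>
&lt;addAttributes&gt;
  &lt;!-- no type given, so this is a String attribute --&gt;
  &lt;att name="long_name"&gt;Sea Surface Temperature&lt;/att&gt;
  &lt;!-- a single typed value --&gt;
  &lt;att name="scale_factor" type="float"&gt;0.1&lt;/att&gt;
  &lt;!-- a space-separated list of typed values --&gt;
  &lt;att name="actual_range" type="doubleList"&gt;10.34 23.91&lt;/att&gt;
  &lt;!-- two ways to remove a source attribute from the combined attributes --&gt;
  &lt;att name="coordinates"&gt;null&lt;/att&gt;
  &lt;att name="rows" /&gt;
&lt;/addAttributes&gt;
</pre>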

  <li><a class="selfLink" id="globalAttributes" href="#globalAttributes" rel="bookmark"><strong>Global Attributes / Global <kbd>&lt;addAttributes&gt;</kbd></strong></a> -- 
     <br><kbd>&lt;addAttributes&gt;</kbd> is an OPTIONAL tag within the 
       <kbd>&lt;dataset&gt;</kbd> tag which
       is used to
     change attributes that apply to the entire dataset.
     <ul>
     <li><strong>Use the global <kbd>&lt;addAttributes&gt;</kbd> 
       to change the dataset's global attributes.</strong>
       ERDDAP™ combines the global attributes from the dataset's source 
       (<kbd><strong>sourceAttributes</strong></kbd>)
       and the global <kbd><strong>addAttributes</strong></kbd> 
       which you define in datasets.xml (which have priority) 
       to make the global <kbd><strong>combinedAttributes</strong></kbd>, which are what ERDDAP™ users see. 
       Thus, you can use addAttributes to redefine the values of sourceAttributes, add new
       attributes, or remove attributes.
     <li>See the <a rel="help" href="#addAttributes"><kbd><strong>&lt;addAttributes&gt;</strong></kbd> information</a>
       which applies to 
       global and variable <kbd><strong>&lt;addAttributes&gt;</strong></kbd>.

     <li><a rel="help"
     href="https://www.fgdc.gov/standards/projects/FGDC-standards-projects/metadata/base-metadata/index_html"
     >FGDC<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> and
       <a rel="help" href="https://en.wikipedia.org/wiki/Geospatial_metadata">ISO&nbsp;19115-2/19139<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
       Metadata
       -- Normally, ERDDAP™ will automatically generate
       ISO 19115-2/19139 and FGDC (FGDC-STD-001-1998) XML metadata files for each 
       dataset using information
       from the dataset's metadata. So, <strong>good dataset metadata leads to good ERDDAP-generated 
       ISO 19115 and FGDC metadata. Please consider putting lots of time and effort into
       improving your datasets' metadata (which is a good thing to do anyway).</strong>
       Most of the dataset metadata attributes which are used to generate the ISO 19115 
       and FGDC metadata are from the 
       <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3">ACDD metadata standard<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        and are so noted below. 

     <li>Many global attributes are special in that ERDDAP™ looks for them and 
       uses them in various ways.
       For example, a link to the infoUrl is included on web pages with lists of datasets, and
       other places, so that users can find out more about the dataset.
     <li>When a user selects a subset of data, the global attributes related to the ranges of 
       the dataset's longitude, latitude, altitude (or depth), and time variables 
       (for example, Southernmost_Northing,
       Northernmost_Northing, time_coverage_start, time_coverage_end) are automatically
       generated or updated.
     <li>A simple sample global <kbd>&lt;addAttributes&gt;</kbd> is:<pre>
&lt;addAttributes&gt; 
  &lt;att name="Conventions"&gt;COARDS, CF-1.6, ACDD-1.3&lt;/att&gt;
  &lt;att name="infoUrl"&gt;<wbr>https://coastwatch.pfeg.noaa.gov/infog/PH_ssta_las.html<wbr>&lt;/att&gt;
  &lt;att name="institution"&gt;NOAA CoastWatch, West Coast Node&lt;/att&gt;
  &lt;att name="title"&gt;SST, Pathfinder Ver 5.0, Day and Night, Global&lt;/att&gt;
  &lt;att name="cwhdf_version" /&gt;
&lt;/addAttributes&gt;  </pre>
         The empty <kbd>cwhdf_version</kbd> attribute causes the source 
         <kbd>cwhdf_version</kbd> attribute (if any)
         to be removed from the final, combined list of attributes.
     <li>Supplying this information helps ERDDAP™ do a better job 
       and helps users understand the datasets.
       <br>Good metadata makes a dataset usable. 
       <br>Insufficient metadata makes a dataset useless.
       <br>Please take the time to do a good job with metadata attributes.
     </ul>
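     <p>As a sketch of how the three sets of attributes relate, suppose (hypothetically) 
       that a source file supplies the global attributes <kbd>title</kbd>, <kbd>history</kbd>, 
       and <kbd>cwhdf_version</kbd>. The source values shown in the comments below are 
       invented for illustration; only the merging rule (addAttributes have priority, 
       and an empty attribute removes the source attribute) comes from the description above.<pre>
&lt;!-- sourceAttributes (hypothetical values read from the source):
     title=sst   history=created 2009   cwhdf_version=3.4 --&gt;
&lt;addAttributes&gt;
  &lt;!-- redefines the source value --&gt;
  &lt;att name="title"&gt;SST, Pathfinder Ver 5.0, Day and Night, Global&lt;/att&gt;
  &lt;!-- adds a new attribute --&gt;
  &lt;att name="institution"&gt;NOAA CoastWatch, West Coast Node&lt;/att&gt;
  &lt;!-- removes the source attribute --&gt;
  &lt;att name="cwhdf_version" /&gt;
&lt;/addAttributes&gt;
&lt;!-- combinedAttributes (what users would see):
     title=SST, Pathfinder Ver 5.0, Day and Night, Global
     history=created 2009
     institution=NOAA CoastWatch, West Coast Node --&gt;</pre>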

    <p><strong>Comments about global attributes that are special in ERDDAP:</strong>
      <ul>

      <li><a class="selfLink" id="acknowledgement" href="#acknowledgement" rel="bookmark"><strong>acknowledgement</strong></a> and 
          <a class="selfLink" id="acknowledgment"  href="#acknowledgment"  rel="bookmark"><strong>acknowledgment</strong></a>
          (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3">ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a RECOMMENDED way to acknowledge the group or groups that provided support 
        (notably, financial)
        for the project that created this data.  For example,
        <br><kbd>&lt;att name="acknowledgement"&gt;AVISO&lt;/att&gt;</kbd>
        <p>Note that ACDD 1.0 and 1.1 used the spelling "acknowledgment" (which is the
        usual spelling in the U.S.), but ACDD 1.3 changed this to "acknowledgement"
        (which is the usual spelling in the U.K.). My understanding is that the change
        was essentially an accident and that they certainly didn't recognize 
        the ramifications of the change. What a mess! Now there are millions of 
        data files around the world that have "acknowledgment" and millions that have
        "acknowledgement". This highlights the folly of "simple" changes to a standard,
        and emphasizes the need for stability in standards.
        Because ACDD 1.3 (which is the version of ACDD that ERDDAP™ supports) 
        says "acknowledgement", 
        that is what ERDDAP™ (notably GenerateDatasetsXml) encourages. 
        <br>&nbsp;

      <li><a class="selfLink" id="cdm_altitude_proxy" href="#cdm_altitude_proxy" rel="bookmark"><strong>cdm_altitude_proxy</strong></a> is just for EDDTable datasets 
        that don't have an altitude or depth variable but do have 
        a variable that is a proxy for altitude or depth (for example, pressure, sigma, bottleNumber).
        You may use this attribute to identify that variable.  For example,
        <br><kbd>&lt;att name="cdm_altitude_proxy"&gt;pressure&lt;/att&gt;</kbd>
        <br>If the <a rel="help" href="#cdm_data_type">cdm_data_type</a> is Profile or TrajectoryProfile 
          and there is no altitude or depth variable,
        cdm_altitude_proxy MUST be defined.
        If cdm_altitude_proxy is defined, ERDDAP™ will add the following metadata to the variable:
        <kbd>_CoordinateAxisType=Height</kbd> and <kbd>axis=Z</kbd>.
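        <p>For instance, a minimal sketch of the global attributes for a Profile dataset 
          that has no altitude or depth variable might look like the following. The variable 
          names (<kbd>pressure</kbd>, <kbd>profile_number</kbd>) are assumptions for 
          illustration, not names that ERDDAP™ requires.<pre>
&lt;addAttributes&gt;
  &lt;att name="cdm_data_type"&gt;Profile&lt;/att&gt;
  &lt;!-- pressure stands in for the missing altitude or depth variable --&gt;
  &lt;att name="cdm_altitude_proxy"&gt;pressure&lt;/att&gt;
  &lt;att name="cdm_profile_variables"&gt;profile_number,time,latitude,longitude&lt;/att&gt;
&lt;/addAttributes&gt;</pre>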
        <br>&nbsp;

      <li><a class="selfLink" id="cdm_data_type" href="#cdm_data_type" rel="bookmark"><strong>cdm_data_type</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a global attribute that indicates the 
          Unidata <a rel="help" href="https://www.unidata.ucar.edu/software/netcdf-java/v4.6/CDM/index.html"
          >Common Data Model<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        data type for the dataset.    For example,
        <br><kbd>&lt;att name="cdm_data_type"&gt;Point&lt;/att&gt;</kbd>
        <br>The CDM is still evolving and may change again. ERDDAP™ complies with the 
        related and more detailed 
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries">Discrete&nbsp;Sampling&nbsp;Geometries (DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> chapter of the 
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html">CF 1.6<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        metadata conventions
        (previously called the CF Point Observation Conventions).
        <ul>
        <li>Either the dataset's global <a rel="help" href="#globalAttributes">sourceAttributes</a> 
          or its global <kbd>&lt;addAttributes&gt;</kbd> MUST 
         include the cdm_data_type attribute.
         A few dataset types (like EDDTableFromObis) will set this automatically.
        <li>For EDDGrid datasets, the cdm_data_type options are Grid (the default and by far
          the most common type for EDDGrid datasets), MovingGrid, Other, Point, Profile, 
          RadialSweep, TimeSeries, TimeSeriesProfile, Swath, Trajectory, and TrajectoryProfile.
          Currently, EDDGrid does not require that any related metadata be specified,
          nor does it check that the data matches the cdm_data_type. That will probably 
          change in the near future.
        <li>EDDTable uses cdm_data_type in a rigorous way, following CF's DSG specification
          rather than CDM, which for some reason hasn't been updated to be consistent
          with DSG. If a dataset's metadata doesn't comply with
          ERDDAP's requirements for its cdm_data_type (see below), the dataset will fail to load and will generate an
          <a rel="help" href="#errorMessages">error message</a>. (That's a good thing, 
          in the sense that the error message will tell
          you what is wrong so that you can fix it.)
          And if the dataset's data doesn't match the dataset's metadata setup
          (e.g., if there is more than one latitude value for a given station in a timeseries dataset),
          some requests for data will return incorrect data in the response.
          So make sure you get all of this right.

          <p>For all of these datasets, in the <kbd>Conventions</kbd> and
            <kbd>Metadata_Conventions</kbd> global attributes, please refer to 
            CF-1.6 (not CF-1.0, 1.1, 1.2, 1.3, 1.4, or 1.5), 
            since CF-1.6 is the first version to include the changes related to 
            Discrete Sampling Geometry (DSG) conventions.

          <p><a class="selfLink" id="ERDDAPandDSG" href="#ERDDAPandDSG" rel="bookmark"
            >ERDDAP™ has a not-simple relationship to CF DSG.</a>
            <ul>
            <li>ERDDAP™ can make a valid DSG dataset out of a source dataset 
              that is already a valid DSG file(s), or out of a source dataset 
              that isn't set up for DSG but can be made so via changes to metadata
              (some of which are ERDDAP-specific in order to provide a more
              general approach to specifying the DSG setup). 
            <li>ERDDAP™ does a lot of validity tests when it loads a dataset. 
              If a dataset that has a cdm_data_type (or featureType) attribute
              successfully loads in ERDDAP™, then ERDDAP™ is saying the dataset
              meets the DSG requirements (otherwise, ERDDAP™ will throw an 
              exception explaining the first problem that it found). 
              <br>WARNING: A successfully-loaded dataset appears to meet the 
              DSG requirements (it has the right combination of attributes),
              but still may be incorrectly set up, leading to incorrect results
              in .ncCF and .ncCFMA response files. 
              (Software is smart in some ways and clueless in others.)
            <li>When you look at the dataset's metadata in ERDDAP™, 
              the DSG dataset appears to be in ERDDAP's internal format (a giant, database-like table).
              It isn't in one of the DSG formats (e.g., the dimensions and metadata aren't right),
              but the information needed to treat the dataset as a DSG dataset is in the metadata 
              (for example, <kbd>cdm_data_type=TimeSeries</kbd> and 
              <kbd>cdm_timeseries_variables=<i>aCsvListOfStationRelatedVariables</i></kbd>
              in the global metadata and <kbd>cf_role=timeseries_id</kbd> for some variable). 
            <li>If a user requests a subset of the dataset in a .ncCF file
              (a .nc file in DSG's Contiguous Ragged Array file format)
              or .ncCFMA file (a .nc file in DSG's Multidimensional Array file format), 
              that file will be a valid CF DSG file.
              <br>WARNING: However, if the dataset was set up incorrectly (so that the promises
              made by the metadata aren't true), then the response file will be 
              technically valid but will be incorrect in some way.
              <br>&nbsp;
            </ul>

          <li><a class="selfLink" id="EDDTableCdmTypes" href="#EDDTableCdmTypes" rel="bookmark">For EDDTable datasets,</a>
          the cdm_data_type options (and related requirements in ERDDAP) are 

          <ul>
          <li><a class="selfLink" id="cdmPoint" href="#cdmPoint" rel="bookmark">Point</a> -- 
          is for a set of measurements taken at unrelated times and locations.
            <ul>
            <li>As with all cdm_data_types other than Other, Point datasets MUST have longitude,
              latitude, and time variables. 
              <br>&nbsp;
            </ul>

          <li><a class="selfLink" id="cdmProfile" href="#cdmProfile" rel="bookmark">Profile</a> -- 
          is a set of measurements all taken at one time, at one latitude longitude location,
          but at more than one depth (or altitude). 
          The dataset may be a collection of these Profiles, 
          for example, 7 profiles from different locations. 
          This cdm_data_type doesn't imply any logical connection between any of the profiles.
            <ul>
            <li>One of the variables (for example, profile_number) MUST have the variable attribute cf_role=profile_id
              to identify the variable that uniquely identifies the profiles.
              <br><kbd>&lt;att name="cf_role"&gt;profile_id&lt;/att&gt;</kbd>
              <br>If no other variable is suitable, consider using the time variable.
            <li>The dataset MUST include the globalAttribute 
              <a class="selfLink" id="cdm_profile_variables" href="#cdm_profile_variables" rel="bookmark">cdm_profile_variables</a>, 
              where the
              value is a comma-separated list of the variables which have the information
              about each profile.  For a given profile, the values of these variables
              MUST be constant. For example,
              <pre>&lt;att name="cdm_profile_variables"&gt;profile_number,time,latitude,longitude&lt;/att&gt;</pre>
              The list MUST include the cf_role=profile_id variable 
              and all other variables with information about the profile, 
              and time, latitude and longitude.
              <br>The list will never include altitude, depth, or any observation variables.
              <br>&nbsp;
            </ul>
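          <p>Putting the two requirements above together, a Profile dataset's setup might be 
            sketched as follows. The sourceName and destinationName values are hypothetical, 
            and other parts of the dataset definition (including the longitude, latitude, 
            time, and depth or altitude variables) are omitted.<pre>
&lt;!-- global attributes --&gt;
&lt;addAttributes&gt;
  &lt;att name="cdm_data_type"&gt;Profile&lt;/att&gt;
  &lt;att name="cdm_profile_variables"&gt;profile_number,time,latitude,longitude&lt;/att&gt;
&lt;/addAttributes&gt;

&lt;!-- the variable that uniquely identifies each profile --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;profile_number&lt;/sourceName&gt;
  &lt;destinationName&gt;profile_number&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;profile_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>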
          [Opinion: cdm_data_type=Profile should rarely be used. In practice, a given dataset is
          usually actually either a TimeSeriesProfile (profiles at a fixed position)
          or a TrajectoryProfile (profiles along a trajectory), and so should be 
          properly identified as such.]
          <br>&nbsp;

          <li><a class="selfLink" id="cdmTimeSeries" href="#cdmTimeSeries" rel="bookmark">TimeSeries</a> -- 
          is a sequence of measurements (e.g., sea water temperature) taken at
          one, fixed, latitude, longitude, depth (or altitude) location. 
          (Think of it as "station".) The dataset may be a collection of these TimeSeries,
          for example, a  sequence from each of 3 different locations.
            <ul>
            <li>One of the variables (for example, station_id) MUST have the variable attribute cf_role=timeseries_id
              to identify the variable that uniquely identifies the stations.
              <br><kbd>&lt;att name="cf_role"&gt;timeseries_id&lt;/att&gt;</kbd>
            <li>The dataset MUST include the globalAttribute 
              <a class="selfLink" id="cdm_timeseries_variables" href="#cdm_timeseries_variables" rel="bookmark">cdm_timeseries_variables</a>, 
              where the
              value is a comma-separated list of the variables which have the information
              about each station.                
              For a given station, the values of these variables
              MUST be constant. 
              For example,
              <pre>&lt;att name="cdm_timeseries_variables"&gt;station_id,station_type,latitude,longitude&lt;/att&gt;</pre>
              The list MUST include the cf_role=timeseries_id variable 
              and all other variables with information about the station, 
              which almost always includes latitude and longitude (and altitude or depth, if present).
              <br>The list will never include time or any observation variables.
            <li>For some moored buoys, a dataset may have two sets of latitude and longitude variables:
              <ol>
              <li>One pair of latitude and longitude values that are constant
                (i.e., the fixed location of the mooring).
                In ERDDAP™, give these variables the destinationNames of latitude and longitude,
                and include these variables in the list of cdm_timeseries_variables.
              <li>Precise latitude and longitude values associated with each observation.
                In ERDDAP™, give these variables different destinationNames (e.g., preciseLat and preciseLon)
                and don't include these variables in the list of cdm_timeseries_variables.
                <br>&nbsp;
              </ol>
              The reasoning for this is: from a theoretical perspective, for a DSG TimeSeries dataset, 
              the latitude and longitude (and altitude or depth, if present) 
              location of the station MUST be constant.
            </ul>
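            <p>The moored-buoy arrangement described above might be sketched as follows. 
              The sourceNames (<kbd>station</kbd>, <kbd>mooring_lat</kbd>, <kbd>gps_lat</kbd>) 
              are hypothetical, and only the latitude half of each pair is shown.<pre>
&lt;addAttributes&gt;
  &lt;att name="cdm_data_type"&gt;TimeSeries&lt;/att&gt;
  &lt;att name="cdm_timeseries_variables"&gt;station_id,latitude,longitude&lt;/att&gt;
&lt;/addAttributes&gt;

&lt;dataVariable&gt;
  &lt;sourceName&gt;station&lt;/sourceName&gt;
  &lt;destinationName&gt;station_id&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;timeseries_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;!-- constant mooring position: listed in cdm_timeseries_variables --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;mooring_lat&lt;/sourceName&gt;
  &lt;destinationName&gt;latitude&lt;/destinationName&gt;
&lt;/dataVariable&gt;
&lt;!-- per-observation position: NOT listed in cdm_timeseries_variables --&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;gps_lat&lt;/sourceName&gt;
  &lt;destinationName&gt;preciseLat&lt;/destinationName&gt;
&lt;/dataVariable&gt;
&lt;!-- longitude and preciseLon would follow the same pattern --&gt;</pre>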

          <li><a class="selfLink" id="cdmTimeSeriesProfile" href="#cdmTimeSeriesProfile" rel="bookmark">TimeSeriesProfile</a> --
             is for a sequence of profiles taken at one, fixed, latitude longitude location. 
             Each profile is a set of measurements taken at multiple altitudes or depths. 
             The dataset may be a collection of these TimeSeriesProfiles, 
             for example, a sequence of profiles taken at each of 12 different locations.
            <ul>
            <li>One of the variables (for example, station_id) MUST have the 
              variable attribute cf_role=timeseries_id
              to identify the variable that uniquely identifies the stations.
              <br><kbd>&lt;att name="cf_role"&gt;timeseries_id&lt;/att&gt;</kbd>
            <li>One of the variables (for example, profile_number) MUST have the 
              variable attribute cf_role=profile_id
              to identify the variable that uniquely identifies the profiles.
              <br><kbd>&lt;att name="cf_role"&gt;profile_id&lt;/att&gt;</kbd>
              <br>(A given profile_id only has to be unique for a given timeseries_id.)
              If no other variable is suitable, consider using the time variable.
            <li>The dataset MUST include the globalAttribute cdm_timeseries_variables,
              where the
              value is a comma-separated list of the variables which have the information
              about each station.  For a given station, the values of these variables
              MUST be constant. 
              For example,
              <pre>&lt;att name="cdm_timeseries_variables"&gt;station_id,station_type,latitude,longitude&lt;/att&gt;</pre>
              The list MUST include the cf_role=timeseries_id variable 
              and all other variables with information about the station, 
              which almost always includes latitude and longitude.
              <br>The list will never include time, altitude, depth, or any observation variables.
            <li>The dataset MUST include the globalAttribute cdm_profile_variables, where the
              value is a comma-separated list of the variables which have the information
              about each profile.  For a given profile, the values of these variables
              MUST be constant. For example,
              <pre>&lt;att name="cdm_profile_variables"&gt;profile_number,time&lt;/att&gt;</pre>
              The list MUST include the cf_role=profile_id variable 
              and all other variables with information about the profile, 
              which almost always includes time.
              <br>The list will never include latitude, longitude, altitude, depth, or any observation variables.
              <br>&nbsp;
          </ul>
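          <p>As a sketch, the four requirements above could be combined like this for a 
            TimeSeriesProfile dataset. The variable names are the illustrative ones used 
            in the examples above (assumed here to be both the sourceNames and the 
            destinationNames); other parts of the dataset definition are omitted.<pre>
&lt;addAttributes&gt;
  &lt;att name="cdm_data_type"&gt;TimeSeriesProfile&lt;/att&gt;
  &lt;att name="cdm_timeseries_variables"&gt;station_id,station_type,latitude,longitude&lt;/att&gt;
  &lt;att name="cdm_profile_variables"&gt;profile_number,time&lt;/att&gt;
&lt;/addAttributes&gt;

&lt;dataVariable&gt;
  &lt;sourceName&gt;station_id&lt;/sourceName&gt;
  &lt;destinationName&gt;station_id&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;timeseries_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;profile_number&lt;/sourceName&gt;
  &lt;destinationName&gt;profile_number&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;profile_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>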

          <li><a class="selfLink" id="cdmTrajectory" href="#cdmTrajectory" rel="bookmark">Trajectory</a> -- 
             is a sequence of measurements taken along a trajectory (a path through space and time)
             (e.g., sea_water_temperature taken by a ship as it moves through the water). 
             The dataset may be a collection of these Trajectories, 
             for example, a sequence from each of 4 different ships.
            <ul>
            <li>One of the variables (for example, ship_id) MUST have the variable attribute cf_role=trajectory_id
              to identify the variable that uniquely identifies the trajectories.
              <br><kbd>&lt;att name="cf_role"&gt;trajectory_id&lt;/att&gt;</kbd>
            <li>The dataset MUST include the globalAttribute 
              <a class="selfLink" id="cdm_trajectory_variables" href="#cdm_trajectory_variables" rel="bookmark">cdm_trajectory_variables</a>, 
              where the
              value is a comma-separated list of the variables which have the information
              about each trajectory.  For a given trajectory, 
              the values of these variables 
              MUST be constant. For example,
              <pre>&lt;att name="cdm_trajectory_variables"&gt;ship_id,ship_type,ship_owner&lt;/att&gt;</pre>
              The list MUST include the cf_role=trajectory_id variable
              and all other variables with information about the trajectory.              
              <br>The list will never include time, latitude, longitude, 
              or any observation variables.
              <br>&nbsp;
            </ul>
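          <p>A minimal sketch of a Trajectory setup, using the illustrative ship variables 
            from above (the names are assumptions, and the required longitude, latitude, 
            and time variables are omitted), might be:<pre>
&lt;addAttributes&gt;
  &lt;att name="cdm_data_type"&gt;Trajectory&lt;/att&gt;
  &lt;att name="cdm_trajectory_variables"&gt;ship_id,ship_type,ship_owner&lt;/att&gt;
&lt;/addAttributes&gt;

&lt;dataVariable&gt;
  &lt;sourceName&gt;ship_id&lt;/sourceName&gt;
  &lt;destinationName&gt;ship_id&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;trajectory_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>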

          <li><a class="selfLink" id="cdmTrajectoryProfile" href="#cdmTrajectoryProfile" rel="bookmark">TrajectoryProfile</a> -- 
            is a sequence of profiles taken along a trajectory. 
            The dataset may be a collection of these TrajectoryProfiles, 
            for example, a sequence of profiles taken by 14 different ships.           
            <ul>
            <li>One of the variables (for example, ship_id) MUST have the variable attribute cf_role=trajectory_id
              to identify the variable that uniquely identifies the trajectories.
              <br><kbd>&lt;att name="cf_role"&gt;trajectory_id&lt;/att&gt;</kbd>
            <li>One of the variables (for example, profile_number) MUST have the variable attribute cf_role=profile_id
              to identify the variable that uniquely identifies the profiles.
              <br><kbd>&lt;att name="cf_role"&gt;profile_id&lt;/att&gt;</kbd>
              <br>(A given profile_id only has to be unique for a given trajectory_id.)
              If no other variable is suitable, consider using the time variable.
            <li>The dataset MUST include the globalAttribute cdm_trajectory_variables, where the
              value is a comma-separated list of the variables which have the information
              about each trajectory.  For a given trajectory, the values of these variables
              MUST be constant. For example,
              <pre>&lt;att name="cdm_trajectory_variables"&gt;ship_id,ship_type,ship_owner&lt;/att&gt;</pre>
              The list MUST include the cf_role=trajectory_id variable
              and all other variables with information about the trajectory.
              <br>The list will never include profile-related variables, time, 
              latitude, longitude, or any observation variables.
            <li>The dataset MUST include the globalAttribute cdm_profile_variables, where the
              value is a comma-separated list of the variables which have the information
              about each profile.  For a given profile, the values of these variables
              MUST be constant. For example,
              <pre>&lt;att name="cdm_profile_variables"&gt;profile_number,time,latitude,longitude&lt;/att&gt;</pre>
              The list MUST include the cf_role=profile_id variable 
              and all other variables with information about the profile, 
              which almost always includes time, latitude and longitude.
              <br>The list will never include altitude, depth, or any observation variables.
              <br>&nbsp;
            </ul>
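          <p>As a sketch, a TrajectoryProfile dataset that meets the four requirements above 
            might be set up like this. The variable names are the illustrative ones from 
            the examples above (assumed here to be both the sourceNames and the 
            destinationNames); the depth or altitude variable and the observation 
            variables are omitted.<pre>
&lt;addAttributes&gt;
  &lt;att name="cdm_data_type"&gt;TrajectoryProfile&lt;/att&gt;
  &lt;att name="cdm_trajectory_variables"&gt;ship_id,ship_type,ship_owner&lt;/att&gt;
  &lt;att name="cdm_profile_variables"&gt;profile_number,time,latitude,longitude&lt;/att&gt;
&lt;/addAttributes&gt;

&lt;dataVariable&gt;
  &lt;sourceName&gt;ship_id&lt;/sourceName&gt;
  &lt;destinationName&gt;ship_id&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;trajectory_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;profile_number&lt;/sourceName&gt;
  &lt;destinationName&gt;profile_number&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="cf_role"&gt;profile_id&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>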

          <li><a class="selfLink" id="cdmOther" href="#cdmOther" rel="bookmark">Other</a> -- 
            has no requirements. Use it if the dataset doesn't 
            fit one of the other options, notably, if the dataset doesn't include 
            latitude, longitude and time variables.
            <br>&nbsp;
          </ul>

          <a class="selfLink" id="cdmRelatedNotes" href="#cdmRelatedNotes" rel="bookmark">Related notes:</a>
          <ul>
          <li>All EDDTable datasets with a cdm_data_type other than "Other" MUST have longitude,
            latitude, and time variables. 
          <li>Datasets with profiles MUST have an altitude variable, a depth variable, or a
            <a rel="help" href="#cdm_altitude_proxy">cdm_altitude_proxy</a> variable.
          <li>If you can't make a dataset comply with all of the requirements for the ideal
            cdm_data_type, use "Point" (which has few requirements) or "Other" 
            (which has no requirements) instead.
          <li>This information is used by ERDDAP™ in various ways, but mostly for making
            .ncCF files (.nc files which comply with the 
            Contiguous Ragged Array Representations associated with the dataset's cdm_data_type)
            and .ncCFMA files (.nc files which comply with the 
            Multidimensional Array Representations associated with the dataset's cdm_data_type)
            as defined in the
            <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
            >Discrete&nbsp;Sampling&nbsp;Geometries&nbsp;(DSG)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
          chapter of the 
            <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
            >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
               metadata conventions,
            which were previously named "CF Point Observation Conventions".
          <li>Hint: For these datasets, the correct setting for  
            <a rel="help" href="#subsetVariables">subsetVariables</a>
            is usually the combination of all the variables listed in the cdm_..._variables attributes.  
            For example, for TimeSeriesProfile, use the 
            cdm_timeseries_variables plus the cdm_profile_variables.
            <br>&nbsp;
          </ul>
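          <p>To illustrate the subsetVariables hint above, a TimeSeriesProfile dataset that 
            uses the example lists from this section might combine them as follows. This is 
            only a sketch: whether every listed variable is a sensible choice for 
            subsetVariables depends on the dataset.<pre>
&lt;att name="cdm_timeseries_variables"&gt;station_id,station_type,latitude,longitude&lt;/att&gt;
&lt;att name="cdm_profile_variables"&gt;profile_number,time&lt;/att&gt;
&lt;!-- the timeseries variables followed by the profile variables --&gt;
&lt;att name="subsetVariables"&gt;station_id,station_type,latitude,longitude,profile_number,time&lt;/att&gt;</pre>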
        </ul>

      <li><a class="selfLink" id="contributor_name" href="#contributor_name" rel="bookmark"><strong>contributor_name</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify a person, organization, or project 
        which contributed to this dataset (for example, the original creator of the data,
        before it was reprocessed by the creator of this dataset).   For example,
        <br><kbd>&lt;att name="contributor_name"&gt;NOAA OceanWatch - Central Pacific&lt;/att&gt;</kbd>
        <br>If "contributor" doesn't really apply to a dataset, omit this attribute.
        Compared to <a rel="help" href="#creator_name">creator_name</a>, this is sometimes more
        focused on the funding source.
        <br>&nbsp;

      <li><a class="selfLink" id="contributor_role" href="#contributor_role" rel="bookmark"><strong>contributor_role</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the role of 
        <a rel="help" href="#contributor_name">contributor_name</a>.  For example,
        <br><kbd>&lt;att name="contributor_role"&gt;Source of Level 2b data&lt;/att&gt;</kbd>
        <br>If "contributor" doesn't really apply to a dataset, omit this attribute.
        <br>&nbsp;

      <li><a class="selfLink" id="Conventions" href="#Conventions" rel="bookmark"><strong>Conventions</strong></a> 
        (from the
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        metadata standard) 
        is STRONGLY RECOMMENDED. (It may be REQUIRED in the future.) 
        The value is a comma-separated list of metadata standards that this dataset follows.
        For example:
        <br><kbd>&lt;att name="Conventions"&gt;COARDS, CF-1.6, ACDD-1.3&lt;/att&gt;</kbd>
        <br>The common metadata conventions used in ERDDAP™ are:
        <ul>
        <li><a rel="help" 
        href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
        >COARDS Conventions<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        is the precursor to CF. 

        <li><a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >Climate and Forecast (CF) Conventions<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        is the source of many of the recommended and required attributes in ERDDAP. 
        The current version of CF is identified as "CF-1.6".

        <li>The NetCDF Attribute Convention for Dataset Discovery (ACDD)
        is the source of many of the recommended and required attributes in ERDDAP. 
        The original 1.0 version of ACDD (a brilliant piece of work by Ethan Davis) was identified as
        <a rel="help" href="https://wiki.esipfed.org/ArchivalCopyOfVersion1"
        >Unidata Dataset Discovery v1.0<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        The current (starting in 2015) 1.3 version of ACDD is identified as 
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD-1.3<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        If your datasets have been using <kbd>Unidata Dataset Discovery v1.0</kbd>,
        we encourage you to 
        <a rel="help" href="#switchToACDD13">switch your datasets to use ACDD-1.3</a>.

        </ul>        
        If your dataset follows some additional metadata standard, please add the name
        to the CSV list in the Conventions attribute.
        <br>&nbsp;

      <li><a class="selfLink" id="coverage_content_type" href="#coverage_content_type" rel="bookmark"><strong>coverage_content_type</strong></a> 
        (from the
        <a rel="help" href="https://en.wikipedia.org/wiki/Geospatial_metadata">ISO 19115<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the type of gridded data (in EDDGrid datasets).   
        For example,
        <br><kbd>&lt;att name="coverage_content_type"&gt;modelResult&lt;/att&gt;</kbd>
        <br>The only allowed values are auxiliaryInformation, image, modelResult, 
        physicalMeasurement (the default when ISO 19115 metadata is generated), 
        qualityInformation, referenceInformation, and thematicClassification.
        (Don't use this attribute for EDDTable datasets.)
        <br>&nbsp;

      <li><a class="selfLink" id="creator_name" href="#creator_name" rel="bookmark"><strong>creator_name</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the person, organization, or project 
        (if not a specific person or organization), 
        most responsible for the creation (or most recent reprocessing) of this data.
        For example,
        <br><kbd>&lt;att name="creator_name"&gt;NOAA NMFS SWFSC ERD&lt;/att&gt;</kbd>
        <br>If the data was extensively reprocessed (for example, 
        satellite data from level 2 to level 3 or 4),
        then usually the reprocessor is listed as the creator and
        the original creator is listed via <a rel="help" href="#contributor_name">contributor_name</a>.
        Compared to <a rel="help" href="#project">project</a>, this is more flexible,
        since it may identify a person, an organization, or a project.
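        <p>As a sketch of the reprocessing case described above, the global 
        <kbd>&lt;addAttributes&gt;</kbd> for a level 3 satellite product might list the
        reprocessing group as the creator and the original data provider as a contributor.
        The names, the email address, and the <kbd>contributor_role</kbd> value below are
        hypothetical placeholders, not values from a real dataset:
        <pre>
&lt;addAttributes&gt;
    &lt;!-- Hypothetical: the group that reprocessed the data to level 3 --&gt;
    &lt;att name="creator_name"&gt;Example Reprocessing Group&lt;/att&gt;
    &lt;att name="creator_email"&gt;data@example.org&lt;/att&gt;
    &lt;!-- Hypothetical: the provider of the original level 2 data --&gt;
    &lt;att name="contributor_name"&gt;Example Satellite Mission&lt;/att&gt;
    &lt;att name="contributor_role"&gt;Source of level 2 data&lt;/att&gt;
&lt;/addAttributes&gt;
</pre>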
        <br>&nbsp;

      <li><a class="selfLink" id="creator_email" href="#creator_email" rel="bookmark"><strong>creator_email</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify an email address 
        (correctly formatted) that provides a way to contact the creator.   For example,
        <br><kbd>&lt;att name="creator_email"&gt;erd.data@noaa.gov&lt;/att&gt;</kbd>
        <br>&nbsp;       

      <li><a class="selfLink" id="creator_url" href="#creator_url" rel="bookmark"><strong>creator_url</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify a URL for the organization that created the dataset, 
        or a URL with 
        the creator's information about this dataset (but that is more the purpose of
        <a rel="help" href="#infoUrl">infoUrl</a>).   For example,
        <br><kbd>&lt;att name="creator_url"&gt;https://www.pfeg.noaa.gov&lt;/att&gt;</kbd>
        <br>&nbsp;

        
      <li><a class="selfLink" id="date_created" href="#date_created" rel="bookmark"><strong>date_created</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the date on which the data was first created 
        (for example, processed into this form), 
        in ISO 8601 format. For example,
        <br><kbd>&lt;att name="date_created"&gt;2010-01-30&lt;/att&gt;</kbd>
        <br>If data is periodically added to the dataset, this is the first date
        that the original data was made available.           
        <br>&nbsp;

      <li><a class="selfLink" id="date_modified" href="#date_modified" rel="bookmark"><strong>date_modified</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the date on which the data was last modified 
        (for example, when an error was fixed or when the latest data was added), 
        in ISO 8601 format. For example,
        <br><kbd>&lt;att name="date_modified"&gt;2012-03-15&lt;/att&gt;</kbd>
        <br>&nbsp;       

      <li><a class="selfLink" id="date_issued" href="#date_issued" rel="bookmark"><strong>date_issued</strong></a> (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the date on which the data was first 
        made available to others, 
        in ISO 8601 format. For example,
        <br><kbd>&lt;att name="date_issued"&gt;2010-07-30&lt;/att&gt;</kbd>
        <br>
        In this example, the dataset might have a 
        <a rel="help" href="#date_created">date_created</a> of
        <kbd>2010-01-30</kbd>, but only have been made publicly available on <kbd>2010-07-30</kbd>.
        date_issued is less commonly used than date_created and date_modified. 
        If date_issued is omitted, it is assumed to be the same as date_created.
        <br>&nbsp;

      <li><a class="selfLink" id="globalDrawLandMask" href="#globalDrawLandMask" rel="bookmark"><strong>drawLandMask</strong></a> -- 
        This is an OPTIONAL global attribute used by ERDDAP™ (and no metadata 
        standards) which specifies the default value for the "Draw Land Mask" 
        option on the dataset's Make A Graph form (<i>datasetID</i>.graph) 
        and for the <kbd>&amp;.land</kbd> parameter in a URL requesting 
        a map of the data. For example,
        <br><kbd>&lt;att name="drawLandMask"&gt;over&lt;/att&gt;</kbd>
        <br>See the <a rel="help" href="#drawLandMask">drawLandMask overview</a>.
        <br>&nbsp;

      <li><a class="selfLink" id="featureType" href="#featureType" rel="bookmark"><strong>featureType</strong></a> 
        (from the
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        metadata standard) 
        is IGNORED and/or REPLACED.  If the dataset's <a rel="help" href="#cdm_data_type"><kbd>cdm_data_type</kbd></a>
        is appropriate, ERDDAP™ will automatically use it to create a
        <kbd>featureType</kbd> attribute.  So there is no need for you to add it.

        <p>However, if you are using 
        <a rel="help" href="#EDDTableFromNcCFFiles">EDDTableFromNcCFFiles</a>
        to create a dataset from files that follow the 
        <a href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#discrete-sampling-geometries"
        >CF Discrete Sampling Geometries (DSG) standard<img 
      src="../images/external.png" alt=" (external link)" 
      title="This link to an external website does not constitute an endorsement."></a>,
        the files themselves must have <kbd>featureType</kbd> correctly defined,
        so that ERDDAP™ can read the files correctly.
        That is part of the CF DSG requirements for that type of file.
        <br>&nbsp;

      <li><a class="selfLink" id="history" href="#history" rel="bookmark"><strong>history</strong></a> 
        (from the
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        and
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standards) 
        is a RECOMMENDED multi-line String global attribute with a line for
        every processing step that the data has undergone.  For example,
        <br><kbd>&lt;att name="history"&gt;2011-08-05T08:55:02Z CMOR: Rewrote data 
          to comply with CF standards. 
        <br>2012-04-08T08:34:58Z CMOR: Converted &#039;height&#039; type 
          from &#039;d&#039; to &#039;f&#039;.&lt;/att&gt;</kbd>
        <ul>
         <li>Ideally, each line has an ISO 8601:2004(E) formatted date+timeZ 
           (for example, <kbd>2011-08-05T08:55:02Z</kbd>) 
           followed by a description of the processing step.
         <li>ERDDAP™ creates this if it doesn't already exist.
         <li>If it already exists, ERDDAP™ will append new information to the 
           existing information.
         <li><kbd>history</kbd> is important because it allows clients to backtrack
           to the original source of the data.
          <br>&nbsp;
         </ul>

      <li><a class="selfLink" id="infoUrl" href="#infoUrl" rel="bookmark"><strong>infoUrl</strong></a> 
        is a REQUIRED global attribute with the URL of a web page with more
        information about this dataset (usually at the source institution's website).  
        For example,
        <br><kbd>&lt;att name="infoUrl"&gt;http://www.globec.org/&lt;/att&gt;</kbd>
        <ul>
        <li>Either the dataset's global 
          <a rel="help" href="#globalAttributes">sourceAttributes</a> or its 
          global <kbd>&lt;addAttributes&gt;</kbd> MUST 
          include this attribute.
        <li><kbd>infoUrl</kbd> is important because it allows clients to find out 
          more about the data from the original source.
        <li>ERDDAP™ displays a link to the infoUrl on the dataset's 
          Data Access Form (<i>datasetID</i>.html), 
          Make A Graph web page (<i>datasetID</i>.graph), and other web pages.
        <li>If the URL has a query part (after the "?"), it MUST already be 
          <a rel="help" href="https://en.wikipedia.org/wiki/Percent-encoding"
          >percent encoded<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        You need to encode special characters in the query part 
        (other than the initial '&amp;' and the main '=',
        if any) into the form %HH, where HH is the 2-digit hexadecimal value of the character.
        Usually, you just need to convert a few of the punctuation characters: % into %25, 
        &amp; into %26, " into %22, &lt; into %3C, = into %3D, &gt; into %3E, + into %2B, 
        | into %7C, [ into %5B, ] into %5D, space into %20,
        and convert all characters above #127 into their UTF-8 form and then percent encode
        each byte of the UTF-8 form into the %HH format (ask a programmer for help).
        <br>For example, <kbd>&amp;stationID&gt;="41004"</kbd>
        <br>becomes &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<kbd>&amp;stationID%3E=%2241004%22</kbd>
        <br>Percent encoding is generally required when you access ERDDAP
          via software other than a browser. Browsers usually handle percent encoding for you.
        <br>In some situations, you need to percent encode all characters other than
        A-Za-z0-9_-!.~'()*, but still don't encode the initial '&amp;' or the main '='.
        <br>Programming languages have tools to do this (for example, see Java's
          <a rel="help" href="https://docs.oracle.com/javase/8/docs/api/java/net/URLEncoder.html"
          >java.net.URLEncoder<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
        <br>and JavaScript's
          <a rel="help" 
          href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent"
          >encodeURIComponent()<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>) 
          and there are
        <br><a rel="help" href="https://www.url-encode-decode.com/"
          >websites that percent encode/decode for you<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
        <li>Since datasets.xml is an XML file, you MUST also &amp;-encode ALL 
          '&amp;', '&lt;', and '&gt;' in the
          URL as '&amp;amp;', '&amp;lt;', and '&amp;gt;' after percent encoding.          
        <li><kbd>infoUrl</kbd> is unique to ERDDAP. It is not from any metadata standard.
          <br>&nbsp;
        </ul>
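        <p>To make the two encoding steps above concrete, here is a sketch that uses
        a made-up URL (<kbd>www.example.org</kbd> and its <kbd>station</kbd> and 
        <kbd>name</kbd> parameters are hypothetical). First, the space in the second 
        parameter's value is percent encoded as %20. Then, the '&amp;' that separates 
        the two parameters is written as <kbd>&amp;amp;</kbd>, because datasets.xml is an XML file:
        <pre>
&lt;!-- Hypothetical original URL:
     https://www.example.org/info?station=41004&amp;name=Frying Pan --&gt;
&lt;att name="infoUrl"&gt;https://www.example.org/info?station=41004&amp;amp;name=Frying%20Pan&lt;/att&gt;
</pre>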

      <li><a class="selfLink" id="institution" href="#institution" rel="bookmark"><strong>institution</strong></a> 
        (from the
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        and
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standards) 
        is a REQUIRED global attribute with the short version of the
        name of the institution which is the source of this data (usually an acronym,
        usually &lt;20 characters).  For example,
        <br><kbd>&lt;att name="institution"&gt;NASA GSFC&lt;/att&gt;</kbd>        
        <ul>
        <li>Either the dataset's global 
          <a rel="help" href="#globalAttributes">sourceAttributes</a> or its 
           global <kbd>&lt;addAttributes&gt;</kbd> MUST 
           include this attribute.
        <li>ERDDAP™ displays the institution whenever it displays a list of datasets.
           If an institution's name here is longer than 20 characters, 
           only the first 20 characters
           will be visible in the list of datasets (but the whole institution name
           can be seen by putting the mouse cursor over the adjacent "?" icon).
        <li>If you add <kbd>institution</kbd> to the list of <kbd>&lt;categoryAttributes&gt;</kbd> 
          in ERDDAP's 
           <a rel="help" href="https://erddap.github.io/setup.html#setup.xml"
           >setup.xml<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
          file, users can easily find datasets from the same institution via ERDDAP's 
          "Search for Datasets by Category" on the home page.
          <br>&nbsp;
      </ul>

      <li><a class="selfLink" id="keywords" href="#keywords" rel="bookmark"><strong>keywords</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a RECOMMENDED comma-separated list of words 
        and short phrases (for example,
        <a rel="help" href="https://wiki.earthdata.nasa.gov/display/CMR/GCMD+Keyword+Access"
        >GCMD Science Keywords<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>) 
        that describe the dataset in a general way, and not assuming any other knowledge of the
        dataset (for example, for oceanographic data, include <kbd>ocean</kbd>).   
        For example,
        <br><kbd>&lt;att name="keywords"&gt;ano, circulation, coastwatch, currents, derived, 
        Earth Science &amp;gt; Oceans &amp;gt; Ocean Circulation &amp;gt; Ocean Currents, eastward, eastward_sea_water_velocity, experimental, hf radio, meridional, noaa, northward, 
        northward_sea_water_velocity, nuevo, ocean, oceans, radio, radio-derived, 
        scan, sea, seawater, velocity, water, zonal&lt;/att&gt;</kbd>
        <br>Since datasets.xml is an XML document, the characters 
        &amp;, &lt;, and &gt; in an attribute like keywords (e.g., the &gt; characters
        in GCMD science keywords) must be encoded as
        <kbd>&amp;amp;</kbd>, <kbd>&amp;lt;</kbd>, and <kbd>&amp;gt;</kbd>, respectively.
        <br>When a dataset is loaded in ERDDAP,
        <ul>
        <li>"Earth Science &gt; " is added to the start of any GCMD keyword that lacks it. 
        <li>GCMD keywords are converted to Title Case (i.e., the first letter of each word is capitalized).
        <li>The keywords are rearranged into sorted order 
          and any newline characters are removed.  
        </ul>
        <br>&nbsp;
       

      <li><a class="selfLink" id="keywords_vocabulary" href="#keywords_vocabulary" rel="bookmark"><strong>keywords_vocabulary</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a RECOMMENDED attribute: 
        if you are following a guideline for the words/phrases in your 
        <kbd>keywords</kbd> attribute
        (for example, GCMD Science Keywords), put the name of that guideline here.  
        For example,
        <br><kbd>&lt;att name="keywords_vocabulary"&gt;GCMD Science Keywords&lt;/att&gt;</kbd>
        <br>&nbsp;
        

      <li><a class="selfLink" id="license" href="#license" rel="bookmark"><strong>license</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a STRONGLY RECOMMENDED global attribute with the license and/or usage
        restrictions.   For example,
        <br><kbd>&lt;att name="license"&gt;[standard]&lt;/att&gt;</kbd>
        <ul>
        <li>If "<kbd>[standard]</kbd>" occurs in the attribute value, it will be 
          replaced by the
          standard ERDDAP™ license from the <kbd>&lt;standardLicense&gt;</kbd> tag in ERDDAP's
          <br>[tomcat]/webapps/erddap/WEB-INF/classes/gov/noaa/pfel/erddap/util/messages.xml file.
          <br>&nbsp;
        </ul>
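        <p>Because the replacement happens wherever <kbd>[standard]</kbd> occurs in the value,
        it should be possible to combine dataset-specific wording with the standard license. 
        A minimal sketch, in which the first sentence of the value is hypothetical 
        wording and not recommended text:
        <pre>
&lt;!-- Hypothetical dataset-specific sentence, followed by the standard license --&gt;
&lt;att name="license"&gt;These data are preliminary and subject to revision.
[standard]&lt;/att&gt;
</pre>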

      <li><a class="selfLink" id="Metadata_Conventions" href="#Metadata_Conventions" rel="bookmark"><strong>Metadata_Conventions</strong></a> 
        is from the outdated 
        <a rel="help" 
        href="https://wiki.esipfed.org/ArchivalCopyOfVersion1"
        >ACDD 1.0<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        (which was identified in Metadata_Conventions as
        "Unidata Dataset Discovery v1.0")
        metadata standard. The attribute value was a comma-separated list
        of metadata conventions used by this dataset.  
        <br>If a dataset uses ACDD 1.0, this attribute is STRONGLY RECOMMENDED, 
          for example,
        <br><kbd>&lt;att name="Metadata_Conventions"&gt;COARDS, CF-1.6, 
          Unidata Dataset Discovery v1.0&lt;/att&gt;</kbd>
        <br>But ERDDAP™ now recommends ACDD-1.3. 
        If you have 
        <a rel="help" href="#switchToACDD13">switched your datasets to use ACDD-1.3</a>, 
        use of <kbd>Metadata_Conventions</kbd> is STRONGLY DISCOURAGED: just use the
        <a rel="help" href="#Conventions"><kbd>Conventions</kbd></a> attribute instead.
        <br>&nbsp;
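        <p>As a rough before-and-after sketch of that switch for a dataset's global 
        <kbd>&lt;addAttributes&gt;</kbd>: the convention lists are taken from the example above,
        and removing the old attribute by setting its value to <kbd>null</kbd> is an 
        assumption here, so treat the 
        <a rel="help" href="#switchToACDD13">switch instructions</a> as the authoritative steps.
        <pre>
&lt;!-- Before (ACDD 1.0 style) --&gt;
&lt;att name="Conventions"&gt;COARDS, CF-1.6&lt;/att&gt;
&lt;att name="Metadata_Conventions"&gt;COARDS, CF-1.6, Unidata Dataset Discovery v1.0&lt;/att&gt;

&lt;!-- After (ACDD-1.3 style); "null" is assumed to remove the old attribute --&gt;
&lt;att name="Conventions"&gt;COARDS, CF-1.6, ACDD-1.3&lt;/att&gt;
&lt;att name="Metadata_Conventions"&gt;null&lt;/att&gt;
</pre>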

      <li><a class="selfLink" id="processing_level" href="#processing_level" rel="bookmark"><strong>processing_level</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a RECOMMENDED textual description of the processing level
        (for example, one of the <a rel="help" href="https://en.wikipedia.org/wiki/Remote_sensing#Data_processing_levels">NASA
        satellite data processing levels<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>, 
        such as <kbd>Level 3</kbd>)
        or quality control level (for example, <kbd>Science Quality</kbd>) of the data.
        For example,
        <br><kbd>&lt;att name="processing_level"&gt;3&lt;/att&gt;</kbd>
        <br>&nbsp;
        

      <li><a class="selfLink" id="project" href="#project" rel="bookmark"><strong>project</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is an OPTIONAL attribute to identify the project that the dataset is part of. 
        For example,
        <br><kbd>&lt;att name="project"&gt;GTSPP&lt;/att&gt;</kbd>
        <br>If the dataset isn't part of a project, don't use this attribute.
        Compared to <a rel="help" href="#creator_name">creator_name</a>, this is 
        focused on the project (not a person or an organization, which may
        be involved in multiple projects).
        <br>&nbsp;

      <li><a class="selfLink" id="publisher_name" href="#publisher_name" rel="bookmark"><strong>publisher_name</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify the person, organization, or project 
        which is publishing this dataset.  For example,
        <br><kbd>&lt;att name="publisher_name"&gt;JPL&lt;/att&gt;</kbd>
        <br>For example, you are the publisher if another person or group <a rel="help" href="#creator_name">created</a>
        the dataset and you are just re-serving it via ERDDAP.
        If "publisher" doesn't really apply to a dataset, omit this attribute.
        Compared to <a rel="help" href="#creator_name">creator_name</a>, the publisher
        probably didn't significantly modify or reprocess the data; the publisher
        is just making the data available in a new venue.
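        <p>For the re-serving case described above, a sketch of how the two roles 
        might appear side by side in the global <kbd>&lt;addAttributes&gt;</kbd>.
        Both organization names are hypothetical placeholders:
        <pre>
&lt;addAttributes&gt;
    &lt;!-- Hypothetical: the group that created the data --&gt;
    &lt;att name="creator_name"&gt;Example Ocean Observatory&lt;/att&gt;
    &lt;!-- Hypothetical: your organization, which re-serves the data via ERDDAP --&gt;
    &lt;att name="publisher_name"&gt;Example Data Center&lt;/att&gt;
&lt;/addAttributes&gt;
</pre>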
        <br>&nbsp;

      <li><a class="selfLink" id="publisher_email" href="#publisher_email" rel="bookmark"><strong>publisher_email</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify an email address 
        (correctly formatted, for example, john_smith@great.org) 
        that provides a way to contact the publisher.  For example,
        <br><kbd>&lt;att name="publisher_email"&gt;john_smith@great.org&lt;/att&gt;</kbd>
        <br>If "publisher" doesn't really apply to a dataset, omit this attribute.
        <br>&nbsp;

      <li><a class="selfLink" id="publisher_url" href="#publisher_url" rel="bookmark"><strong>publisher_url</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is the RECOMMENDED way to identify a URL for the organization that
        published the dataset, 
        or a URL with the publisher's information about this dataset 
        (but that is more the purpose of
        <a rel="help" href="#infoUrl">infoUrl</a>).  For example,
        <br><kbd>&lt;att name="publisher_url"&gt;https://podaac.jpl.nasa.gov&lt;/att&gt;</kbd>
        <br>If "publisher" doesn't really apply to a dataset, omit this attribute.
        <br>&nbsp;

      <li><a class="selfLink" id="real_time" href="#real_time" rel="bookmark"><strong>real_time</strong></a> 
        is a global String attribute (not from any standard) indicating whether this is a real-time dataset.
        For example,
        <br><kbd>&lt;att name="real_time"&gt;true&lt;/att&gt;</kbd>
        <br>If this is <kbd>false</kbd> (the default), ERDDAP™ will cache responses to requests 
          for file types where the entire file must be created
          before ERDDAP™ can begin to send the response to the user (e.g., .nc, .png),
          and will reuse the cached files for up to about 15 minutes.
        <br>If this is set to <kbd>true</kbd>, ERDDAP™ will never cache the response files 
          and will always return newly created files.
        <br>&nbsp;
        
      <li><a class="selfLink" id="sourceUrlAttribute" href="#sourceUrlAttribute" rel="bookmark"><strong>sourceUrl</strong></a> 
        is a global attribute with the URL of the source
        of the data.  For example,
        <br><kbd>&lt;att name="sourceUrl"&gt;https://opendap.co-ops.nos.noaa.gov/<wbr>ioos-dif-sos/SOS<wbr>&lt;/att&gt;</kbd>  
        <br>(but put it all on one line)
        <ul>
        <li>ERDDAP™ usually creates this global attribute automatically.
          Two exceptions are EDDTableFromHyraxFiles and EDDTableFromThreddsFiles.
        <li>If the source is local files and the files were created by your organization, use
          <br><kbd>&lt;att name="sourceUrl"&gt;(local files)&lt;/att&gt;</kbd>  
        <li>If the source is a local database and the data was created by your organization, use
          <br><kbd>&lt;att name="sourceUrl"&gt;(local database)&lt;/att&gt;</kbd>  
        <li><kbd>sourceUrl</kbd> is important because it allows clients to backtrack 
          to the original source of the data.
        <li><kbd>sourceUrl</kbd> is unique to ERDDAP. It is not from any metadata standard.
          <br>&nbsp;
        </ul>

      <li><a class="selfLink" id="standard_name_vocabulary" href="#standard_name_vocabulary" rel="bookmark"><strong>standard_name_vocabulary</strong></a> 
        (from the
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a RECOMMENDED attribute to identify 
        the name of the controlled vocabulary 
        from which variable 
        <a rel="help" href="#standard_name">standard_name</a>s 
        are taken.
        For example,
        <br><kbd>&lt;att name="standard_name_vocabulary"&gt;CF Standard Name Table v77&lt;/att&gt;</kbd>
        <br>for version 77 of the 
        <a rel="help" href="https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html">CF standard name table<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        <br>&nbsp;

      <li><a class="selfLink" id="subsetVariables" href="#subsetVariables" rel="bookmark"><strong>subsetVariables</strong></a> (for EDDTable datasets only) 
        is a RECOMMENDED global attribute that lets you specify
        a comma-separated list of 
        <a rel="help" href="#dataVariable"><kbd>&lt;dataVariable&gt;</kbd></a>
        <a rel="help" href="#destinationName">destinationName</a>s
        to identify variables which have a limited number of
        values (stated another way: variables for which each of the values has 
        many duplicates). For example,
        <br><kbd>&lt;att name="subsetVariables"&gt;station_id, longitude, latitude&lt;/att&gt;</kbd>
        <br>If this attribute is present, the dataset will have a 
          <i>datasetID</i>.subset web page (and a link to it
          on every dataset list) which lets users quickly and easily select 
          various subsets of the data.
        <ul>
        <li>Each time a dataset is loaded, ERDDAP
          loads and stores on disk a table with all of the distinct() combinations
          of the values of the subsetVariables.
          ERDDAP™ can read that subsetVariables table and process it very quickly (especially
          compared to reading lots of data files or getting data from 
          a database or other external service).
        <li>That allows ERDDAP™ to do 3 things:
          <ol>
          <li>It allows ERDDAP™ to put a list of possible values in a dropdown list on the
            Data Access Form, Make A Graph web page, and .subset webpages.
          <li>It allows ERDDAP™ to offer a .subset webpage for that dataset. 
            That page is interesting because it makes it easy to find valid combinations
            of the values of those variables, which for some datasets and some variables 
            is very, very hard (almost impossible).
            Then, all user requests for distinct() subsetVariable data will be very fast.
          <li>If there is a user request that only refers to a subset of those variables, 
            ERDDAP™ can quickly read the subsetVariables table, and respond to the request. 
            That can save a ton of time and effort for ERDDAP.
          </ol>
        <li>The order of the destinationNames you specify determines the sort order on the 
          <i>datasetID</i>.subset web page,
          so you will usually specify the most important variables first, 
          then the least important.
          For example, for datasets with time series data for several stations, 
          you might use
          <br><kbd>&lt;att name="subsetVariables"&gt;station_id, longitude,
            latitude&lt;/att&gt;</kbd>
          <br>so that the values are sorted by station_id.
        <li>Obviously, it is your choice which variables to include in the 
          subsetVariables list, but the suggested usage is: 
          <p>In general, include variables for which 
            you want ERDDAP™ to display a drop-down list of options on the dataset's 
            Data Access Form (.html) and Make-A-Graph (.graph) web pages.
          <p>In general, do include variables with information about 
            the dataset's features (the stations, profiles, and/or trajectories,
            notably from 
            <a rel="help" href="#cdm_timeseries_variables">cdm_timeseries_variables</a>,
            <a rel="help" href="#cdm_profile_variables">cdm_profile_variables</a>,
            <a rel="help" href="#cdm_trajectory_variables">cdm_trajectory_variables</a>).
            There are only a few different values for these variables 
            so they work well with drop-down lists.
          <p>Don't ever include any data variables associated with individual observations 
            (e.g., time, temperature, salinity, current speed)
            in the subsetVariables list.  
            There are too many different values for these variables,
            so a drop-down list would be slow to load and be hard to work with 
            (or not work).
        <li>If the number of distinct combinations of these variables is greater 
          than about 1,000,000,
          you should consider restricting the subsetVariables that you specify to reduce the 
          number of distinct combinations to below 1,000,000; otherwise, 
          the <i>datasetID</i>.subset web pages may be generated slowly.
          In extreme cases, the dataset may not load in ERDDAP™ because 
          generating the list of distinct combinations uses too much memory.
          If so, you MUST remove some variables from the subsetVariables list.
        <li>If the number of distinct values of any one subset variable is greater
          than about 20,000,
          you should consider not including that variable in the list of subsetVariables;
          otherwise, it takes a long time to transmit the 
          <i>datasetID</i>.subset, <i>datasetID</i>.graph, and <i>datasetID</i>.html web pages.
          Also, on a Mac, it is very hard to make selections from a drop-down 
          list with more than 500 items because of the lack of a scroll bar.
          A compromise is: remove variables from the list when users are not
          likely to select values from a drop-down list.
        <li>You should test each dataset to see if the subsetVariables setting is okay.
          If the source data server is slow and it takes too long (or fails) to 
          download the data,
          either reduce the number of variables specified or remove the 
          subsetVariables global attribute.
        <li>SubsetVariables is very useful. So if your dataset is suitable, 
          please create a subsetVariables attribute.
        <li>EDDTableFromSOS automatically adds 
          <br><kbd>&lt;att name="subsetVariables"&gt;station_id, longitude, latitude&lt;/att&gt;</kbd>
          <br>when the dataset is created.
        <li>Possible warning: if a user using the <i>datasetID</i>.subset web
          page selects a value which
          has a carriageReturn or newline character, <i>datasetID</i>.subset will fail.
          ERDDAP™ can't work around this issue because of some HTML details. In any case, it is
          almost always a good idea to remove the carriageReturn and newline characters
          from the data.
          To help you fix the problem, if the EDDTable.subsetVariablesDataTable 
          method in ERDDAP
          detects data values that will cause trouble, it will email a warning 
          with a list of 
          offending values to the emailEverythingTo email addresses specified 
          in setup.xml.
          That way, you know what needs to be fixed.
        <li><a class="selfLink" id="PregeneratedSubsetTables" href="#PregeneratedSubsetTables" rel="bookmark">Pre-generated subset tables.</a> 
          Normally, when ERDDAP™ loads a dataset, it requests
          the distinct() subset variables data table from the data source, 
          just via a normal data request.
          In some cases, this data is not available from the data source or 
          retrieving from the
          data source may be hard on the data source server.
          If so, you can supply a table with the information in a .json or .csv file
          with the name <i>tomcat</i>/content/erddap/subset/<i>datasetID</i>.json (or .csv).
          If present, ERDDAP™ will read it once when the dataset is loaded and use it as the 
          source of the subset data.
          <ul>
          <li>If there is an error while reading it, the dataset will fail to load.
          <li>It MUST have the exact same column names (for example, same case) as 
            <kbd>&lt;subsetVariables&gt;</kbd>, 
            but the columns MAY be in any order.
          <li>It MAY have extra columns (they'll be removed and newly redundant
            rows will be removed).
          <li><a rel="help" href="#timeUnits">Time and timestamp</a> columns
            should have ISO 8601:2004(E) formatted date+timeZ strings
           (for example, <kbd>1985-01-31T15:31:00Z</kbd>).
          <li>Missing values should be missing values (not fake numbers like -99).
          <li>.json files may be a little harder to create but deal with Unicode
            characters well.
            .json files are easy to create if you create them with ERDDAP.
          <li>.csv files are easy to work with, but suitable for ISO 8859-1 characters only.
            .csv files MUST have column names on the first row and data on subsequent rows.
          </ul>
       <li>For huge datasets or when &lt;subsetVariables&gt; is misconfigured,
        the table of combinations of values can be large enough to cause Too Much Data or OutOfMemory errors.
        The solution is to remove variables from the list of &lt;subsetVariables&gt;
        for which there are a large number of values, or remove variables 
        as needed until the size of that table is reasonable.
        Regardless of the error, the parts of ERDDAP™ that use the subsetVariables system don't work well
        (e.g., web pages load very slowly) when there are too many rows (e.g., more than a million)
        in that table.
       <li>subsetVariables has nothing to do with specifying which variables users 
         can use in constraints, i.e., how users can request subsets of the dataset.
         ERDDAP™ always allows constraints to refer to any of the variables.
        <br>&nbsp;

        </ul>

      <li><a class="selfLink" id="summary" href="#summary" rel="bookmark"><strong>summary</strong></a> 
        (from the
        <a rel="help"
        href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        and 
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standards) 
        is a REQUIRED global attribute with a long description of the 
        dataset (usually &lt;500 characters).    For example,
        <br><kbd>&lt;att name="summary"&gt;VIIRSN Level-3 Standard Mapped Image, 
        Global, 4km, Chlorophyll a, Daily.  The Visible and Infrared Imager/Radiometer 
        Suite (VIIRS) is a multi-disciplinary instrument that flies on the National 
        Polar-orbiting Operational Environmental Satellite System (NPOESS) series 
        of spacecraft, including the NPOESS Preparatory Project (NPP).&lt;/att&gt;</kbd>
        <ul>
        <li>Either the dataset's global 
          <a rel="help" href="#globalAttributes">sourceAttributes</a> or its 
          global <kbd>&lt;addAttributes&gt;</kbd> MUST 
          include this attribute.
        <li><kbd>summary</kbd> is very important because it allows clients to read
          a description of the dataset 
          that has more information than the title and thus quickly understand 
          what the dataset is.
        <li>Advice: please write the summary so it would work to describe the 
          dataset to some random
          person you meet on the street or to a colleague.  Remember to include the 
          <a rel="help" href="https://en.wikipedia.org/wiki/Five_Ws">Five W's and one H<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>:
          Who created the dataset?  What information was collected? 
          When was the data collected?
          Where was it collected? Why was it collected?  How was it collected?
        <li>ERDDAP™ displays the summary on the dataset's 
          Data Access Form (<i>datasetID</i>.html), 
          Make A Graph web page (<i>datasetID</i>.graph),
          and other web pages.  ERDDAP™ uses the summary when creating FGDC and
          ISO 19115 documents.
          <br>&nbsp;
        </ul>

      <li><a class="selfLink" id="testOutOfDate" href="#testOutOfDate" 
      rel="bookmark"><strong>testOutOfDate</strong></a>
        (an optional ERDDAP-specific global metadata attribute, not from any standard)
        specifies, in a simplistic way, when the data for a near-real-time dataset is considered out-of-date,
        specified as <kbd>now-<i>nUnits</i></kbd>, for example, <kbd>now-2days</kbd> 
        for data that usually appears 24-48 hours after the time value.
        For forecast data, use <kbd>now<strong>+</strong><i>nUnits</i></kbd>, 
        for example, <kbd>now+6days</kbd> for forecast data that is, at most, 8 days
        in the future.
        (See the 
        <a rel="help"
        href="https://coastwatch.pfeg.noaa.gov/erddap/tabledap/documentation.html#now"
        ><kbd>now-<i>nUnits</i></kbd> syntax description</a>.)
        If the maximum time value for the dataset is 
        more recent than the specified time, the dataset is considered up-to-date.
        If the maximum time value is older than the specified time,
        the dataset is considered out-of-date.
        For out-of-date datasets, there is presumably a problem with the data source,
        so ERDDAP™ is unable to access data from more recent time points.
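
        <p>As a minimal sketch of the syntax described above 
        (assuming the attribute is placed in the dataset's global 
        <kbd>&lt;addAttributes&gt;</kbd> block and that new data usually 
        appears 24-48 hours after the time value), the setting might look like:
        <pre>
&lt;addAttributes&gt;
  &lt;att name="testOutOfDate"&gt;now-2days&lt;/att&gt;
&lt;/addAttributes&gt; </pre>
        With this setting, a dataset whose maximum time value is more than 
        2 days older than the current time would be flagged as out-of-date.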

        <p>The testOutOfDate value is displayed as a column in the
        <a rel="help" href="#EDDTableFromAllDatasets">allDatasets dataset</a> in your ERDDAP.
        It is also used to calculate the outOfDate index,
        which is another column in the allDatasets dataset.
        <br>If the index is &lt;1, the dataset is considered up-to-date.
        <br>If the index is &gt;=1, the dataset is considered out-of-date.
        <br>If the index is &gt;=2, the dataset is considered very out-of-date.

        <p>The testOutOfDate value is also used by ERDDAP™ to generate the 
        https://<i>yourDomain</i>/erddap/outOfDateDatasets.html web page
        (<a rel="help"
        href="https://coastwatch.pfeg.noaa.gov/erddap/outOfDateDatasets.html"
        >example</a>)
        which shows the datasets which have a testOutOfDate attribute,
        with the datasets ranked by how out-of-date they are.
        If you change the file type (from .html to .csv, .jsonlCSV, .nc, .tsv, ...), 
        you can get that information in different file formats.

        <p>When possible, <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a>
        adds a testOutOfDate attribute to the global addAttributes of a dataset.
        This value is a suggestion based on the information available to
        GenerateDatasetsXml. If the value isn't appropriate, change it.

        <p>"Out-of-date" here is very different from 
        <a rel="help" href="#reloadEveryNMinutes"><kbd>&lt;reloadEveryNMinutes&gt;</kbd></a>,
        which deals with how up-to-date ERDDAP's knowledge of the dataset is.
        The &lt;testOutOfDate&gt; system assumes that ERDDAP's knowledge of the 
        dataset is up-to-date. The question &lt;testOutOfDate&gt; deals with is: 
        does there appear to be something wrong with the source of the data,
        causing more recent data to be not accessible by ERDDAP?

      <li><a class="selfLink" id="title" href="#title" rel="bookmark"><strong>title</strong></a> 
        (from the
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        and
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standards) 
        is a REQUIRED global attribute with the short description of the dataset
        (usually &lt;=95 characters).   For example,
        <br><kbd>&lt;att name="title"&gt;VIIRSN Level-3 Mapped, Global, 4km, Chlorophyll a, Daily&lt;/att&gt;</kbd>
        <ul>
        <li>Either the dataset's global 
          <a rel="help" href="#globalAttributes">sourceAttributes</a> or its 
          global <kbd>&lt;addAttributes&gt;</kbd> MUST 
          include this attribute.
        <li><kbd>title</kbd> is important because every list of datasets presented by ERDDAP
          (other than search
          results) lists the datasets in alphabetical order, by title. 
          So if you want to specify the order of datasets, or have some datasets 
          grouped together, 
          you have to create titles with that in mind.
          Many lists of datasets (for example, in response to a category search) 
          show a subset of the full list and in a different order.
          So the title for each dataset should stand on its own.
        <li>If the title contains the word "DEPRECATED" (all capital letters),
          then the dataset will get a lower ranking in searches.
          <br>&nbsp;
        </ul>
      </ul> <!-- end of global attributes -->

  <li><a class="selfLink" id="axisVariable" href="#axisVariable" rel="bookmark"><kbd><strong>&lt;axisVariable&gt;</strong></kbd></a> is used to
    describe a dimension (also called "axis").
    <br>For EDDGrid datasets, one or more axisVariable tags are REQUIRED,
       and all <a rel="help" href="#dataVariable">dataVariables</a> always 
       share/use all axis variables.
       (<a rel="help" href="#dataStructures">Why?</a> 
        <a rel="help" href="#differentDimensions">What if they don't?</a>)
    <br>There MUST be an axis variable for each dimension of the data variables.
    <br>Axis variables MUST be specified in the order that the data variables use them.
    <br>(EDDTable datasets can NOT use <kbd>&lt;axisVariable&gt;</kbd> tags.)
    <br>A fleshed out example is:
    <pre>
&lt;axisVariable&gt;
  &lt;<a rel="help" href="#sourceName">sourceName</a>&gt;MT&lt;/sourceName&gt; 
  &lt;<a rel="help" href="#destinationName">destinationName</a>&gt;time&lt;/destinationName&gt;
  &lt;addAttributes&gt;
    &lt;att name="<a rel="help" href="#units">units</a>"&gt;days since 1902-01-01T12:00:00Z&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/axisVariable&gt; </pre>

      
    <kbd>&lt;axisVariable&gt;</kbd> supports the following subtags:
      <ul>
      <li><kbd>&lt;<a rel="help" href="#sourceName">sourceName</a>&gt;</kbd> -- 
       the data source's name for the variable. 
      This is the name that ERDDAP™ will use when requesting data from the data source. 
      This is the name that ERDDAP™ will look for when data is returned from the data source.
      This is case sensitive.
      This is REQUIRED.
      <li><kbd>&lt;<a rel="help" href="#destinationName">destinationName</a>&gt;</kbd> 
      is the name for the variable that will be shown to and used by ERDDAP™ users. 
        <ul>
        <li>This is OPTIONAL. If absent, the sourceName is used.
        <li>This is useful because it allows you to change a cryptic or odd sourceName.
        <li>destinationName is case sensitive.
        <li>destinationNames MUST start with a letter (A-Z, a-z) and MUST be 
            followed by 0 or more
            characters (A-Z, a-z, 0-9, and _).  ('-' was allowed before ERDDAP™ version 1.10.)
          This restriction allows axis variable names to be the same in ERDDAP™, 
          in the response files, and in all the software where those files will be 
          used, including programming languages (like Python, Matlab, and JavaScript)
          where there are similar restrictions on variable names.
        <li>In EDDGrid datasets, the 
          <a rel="help" href="#LLAT">longitude, latitude, altitude, depth, and time</a> 
          axis variables are special.
          <br>&nbsp;
        </ul>

      <li><a rel="help" href="#variableAttributes"><kbd>&lt;addAttributes&gt;</kbd></a>  
        defines an OPTIONAL set of attributes 
        (<i>name</i> = <i>value</i>)
        which are added to the source's attributes for a variable, to make the combined
        attributes for a variable. 
        <br>If the variable's <a rel="help" href="#variableAttributes">sourceAttributes</a> or 
        <kbd>&lt;addAttributes&gt;</kbd> include 
        <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>      
        attributes, their values will be used to unpack the data from the source 
        before distribution to the client 
        <br>(resultValue = sourceValue * scale_factor + add_offset) .
        The unpacked variable will be of the same data type (for example, float) as the 
        <kbd>scale_factor</kbd> and <kbd>add_offset</kbd> values.
        <br>&nbsp;
      </ul>
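
    <p>The <kbd>scale_factor</kbd> / <kbd>add_offset</kbd> unpacking described 
    above can be sketched as follows. This is only an illustration: the values 
    0.1 and 5 are made up, and the <kbd>type="float"</kbd> declarations are 
    an assumption about how you want the unpacked data typed.
    <pre>
&lt;addAttributes&gt;
  &lt;att name="scale_factor" type="float"&gt;0.1&lt;/att&gt;
  &lt;att name="add_offset" type="float"&gt;5&lt;/att&gt;
&lt;/addAttributes&gt; </pre>
    With these hypothetical values, a packed source value of 250 would be 
    distributed to the client as 250 * 0.1 + 5 = 30 (a float).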

  <li><a class="selfLink" id="dataVariable" href="#dataVariable" rel="bookmark"><kbd><strong>&lt;dataVariable&gt;</strong></kbd></a> is a REQUIRED 
    (for almost all datasets) tag within the <kbd>&lt;dataset&gt;</kbd> tag which
    is used to describe a data variable. There MUST be 1 or more instances of this tag. 
    A fleshed out example is: 
      <pre>
&lt;dataVariable&gt;
  &lt;<a rel="help" href="#sourceName">sourceName</a>&gt;waterTemperature&lt;/sourceName&gt;
  &lt;<a rel="help" href="#destinationName">destinationName</a>&gt;sea_water_temperature&lt;/destinationName&gt;
  <a rel="help" href="#dataType">&lt;dataType&gt;</a>float&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;att name="<a rel="help" href="#ioos_category">ioos_category</a>"&gt;Temperature&lt;/att&gt;
    &lt;att name="<a rel="help" href="#long_name">long_name</a>"&gt;Sea Water Temperature&lt;/att&gt;
    &lt;att name="<a rel="help" href="#standard_name">standard_name</a>"&gt;sea_water_temperature&lt;/att&gt;
    &lt;att name="<a rel="help" href="#units">units</a>"&gt;degree_C&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt;  </pre>

    <kbd>&lt;dataVariable&gt;</kbd> supports the following subtags: 

      <ul>
      <li><a class="selfLink" id="sourceName" href="#sourceName" rel="bookmark"><kbd>&lt;sourceName&gt;</kbd></a> -- the data source's
        name for the variable. 
        This is the name that ERDDAP™ will use when requesting data from the data source. 
        This is the name that ERDDAP™ will look for when data is returned from the data source.
        This is case sensitive. 
        This is REQUIRED.

        <p><a class="selfLink" id="Groups" href="#Groups" rel="bookmark"><strong>Groups</strong> --</a> 
        CF added support for groups with CF v1.8. 
        Starting in ~2020, NetCDF tools support putting variables into groups in a .nc file. 
        In practice, this just means that the variables have a long name 
        which identifies the group(s) and the variable name (for example, group1a/group2c/varName).
        ERDDAP™ supports groups by converting the "/"'s in the variable's &lt;sourceName&gt;
        into "_"'s in the variable's &lt;destinationName&gt;, for example, 
        group1a_group2c_varName . 
        (When you see that, you should realize that groups are not much more than a syntax convention.) 
        When the variables are listed in ERDDAP™, all the variables in a group will appear together,
        mimicking the underlying group. 
        [If ERDDAP™, notably GenerateDatasetsXml, doesn't perform as well as it could
        with source files that have groups, please email a sample file
        to Chris.John at noaa.gov .]
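
        <p>As a sketch of the naming convention described above, a variable 
        in nested groups might be declared like this 
        (the <kbd>dataType</kbd> of float is an assumption for illustration, 
        and other attributes are omitted for brevity):
        <pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;group1a/group2c/varName&lt;/sourceName&gt;
  &lt;destinationName&gt;group1a_group2c_varName&lt;/destinationName&gt;
  &lt;dataType&gt;float&lt;/dataType&gt;
&lt;/dataVariable&gt; </pre>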

        <p>EDDTableFromFiles datasets can use some specially-encoded, pseudo sourceNames
        to define new data variables, e.g., to promote a global attribute to be a data variable.
        See <a rel="help" href="#EDDTableFromFiles_PseudoSourceNames">this documentation</a>.        

        <p><a class="selfLink" id="hdfStructureSourceNames" href="#hdfStructureSourceNames" rel="bookmark"><strong>HDF Structures</strong> --</a> 
        Starting with ERDDAP™ v2.12, EDDGridFromNcFiles and EDDGridFromNcFilesUnpacked
        can read data from "structures" in .nc4 and .hdf4 files.
        To identify a variable that is from a structure, the &lt;sourceName&gt; must use the format: 
        <kbd><i>fullStructureName</i>|<i>memberName</i></kbd>, for example group1/myStruct|myMember .
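
        <p>For illustration only, a data variable that reads one member of a 
        structure might be sketched as follows. The <kbd>destinationName</kbd> 
        and <kbd>dataType</kbd> here are hypothetical; an explicit 
        <kbd>destinationName</kbd> is shown because "/" and "|" are not valid 
        characters in a destinationName.
        <pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;group1/myStruct|myMember&lt;/sourceName&gt;
  &lt;destinationName&gt;myMember&lt;/destinationName&gt;
  &lt;dataType&gt;float&lt;/dataType&gt;
&lt;/dataVariable&gt; </pre>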

        <p><a class="selfLink" id="fixedValue" href="#fixedValue" rel="bookmark"><strong>Fixed Value SourceNames</strong> --</a> 
        <br>In an EDDTable dataset, if you want to create a variable (with a single, fixed value)
        that isn't in the source dataset, use:
        <br><kbd>&lt;sourceName&gt;=<i>fixedValue</i>&lt;/sourceName&gt;</kbd>
        <br>The initial equals sign tells ERDDAP™ that a fixedValue will follow.
        <ul>
          <li>For numeric variables, the fixed value must be a single finite value or <kbd>NaN</kbd> 
            (case insensitive, e.g., <kbd>=NaN</kbd> ).
          <li>For String variables, the fixed value must be a single, 
            <a rel="help" href="https://www.json.org/json-en.html" >JSON-style string<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>
            (with special characters escaped with \ characters), e.g., <kbd>="My \"Special\" String"</kbd> .
          <li>For a timestamp variable, specify the fixed value as a number in "seconds since 1970-01-01T00:00:00Z" and use 
            <br>units=<kbd>seconds since 1970-01-01T00:00:00Z</kbd> .  
        </ul>
        <br>The other tags for the <kbd>&lt;dataVariable&gt;</kbd> work as if this
          were a regular variable.
        <br>For example, to create a variable called <kbd>altitude</kbd> with a 
          fixed value of 0.0 (float), use:
        <br><kbd>&lt;sourceName&gt;=0&lt;/sourceName&gt;</kbd>
        <br><kbd>&lt;<a rel="help" href="#destinationName">destinationName</a>&gt;altitude&lt;/destinationName&gt;</kbd>
        <br><a rel="help" href="#dataType"><kbd>&lt;dataType&gt;float&lt;/dataType&gt;</kbd></a>
        <br>For unusual situations, you can even specify an <kbd>actual_range</kbd> addAttribute, which will
          override the expected values of destinationMin and destinationMax 
          (which would otherwise equal the fixedValue).
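
        <p>The String and timestamp cases listed above might be sketched as 
        follows. The destinationNames (<kbd>institution_name</kbd>, 
        <kbd>deployment_time</kbd>) and the values are hypothetical, and other 
        attributes (e.g., ioos_category, long_name) are omitted for brevity. 
        The number 1104537600 corresponds to 2005-01-01T00:00:00Z.
        <pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;="My \"Special\" String"&lt;/sourceName&gt;
  &lt;destinationName&gt;institution_name&lt;/destinationName&gt;
  &lt;dataType&gt;String&lt;/dataType&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
  &lt;sourceName&gt;=1104537600&lt;/sourceName&gt;
  &lt;destinationName&gt;deployment_time&lt;/destinationName&gt;
  &lt;dataType&gt;double&lt;/dataType&gt;
  &lt;addAttributes&gt;
    &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
  &lt;/addAttributes&gt;
&lt;/dataVariable&gt; </pre>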
        <br>&nbsp;

        <p><a class="selfLink" id="scriptSourceNames" href="#scriptSourceNames" rel="bookmark"><strong>Script SourceNames / Derived Variables</strong> --</a> 
        <br>Starting with ERDDAP™ v2.10, in an 
        <a rel="help" href="#EDDTableFromFiles">EDDTableFromFiles</a>, 
        <a rel="help" href="#EDDTableFromDatabase">EDDTableFromDatabase</a>, or 
        <a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a> 
        dataset, the &lt;sourceName&gt; can be 
        <br>an expression (an equation that evaluates to
        a single value), using the format
        <br><kbd>&lt;sourceName&gt;=<i>expression</i>&lt;/sourceName&gt;</kbd>
        <br>or a script (a series of statements that returns a single value), using the format
        <br><kbd>&lt;sourceName&gt;=<i>script</i>&lt;/sourceName&gt;</kbd>
        <br>ERDDAP™ relies on the
          <a rel="help" href="https://www.apache.org/">Apache project's<img 
              src="../images/external.png" alt=" (external link)" 
              title="This link to an external website does not constitute an endorsement."></a>
          <a rel="help" href="https://commons.apache.org/proper/commons-jexl/">Java Expression Language (JEXL)<img 
              src="../images/external.png" alt=" (external link)" 
              title="This link to an external website does not constitute an endorsement."></a>
          (license: <a rel="bookmark" href="https://www.apache.org/licenses/LICENSE-2.0">Apache<img 
              src="../images/external.png" alt=" (external link)" 
              title="This link to an external website does not constitute an endorsement."></a>)
          to evaluate the expressions and run the scripts.
        <br>The calculation for a given new variable is done within one row of the results, repeatedly for all rows.
        <br>The expressions and scripts use a Java- and JavaScript-like syntax and can use any of the 
        <br><a rel="bookmark" href="https://commons.apache.org/proper/commons-jexl/reference/syntax.html"
              >operators and methods which are built into JEXL<img 
              src="../images/external.png" alt=" (external link)" 
              title="This link to an external website does not constitute an endorsement."></a>.          
        <br>The scripts can also use methods (functions) from these classes:
          <ul>
          <li><a rel="help" href="ScriptCalendar2.html">Calendar2</a>, 
            which is a wrapper for some of the static, time- and calendar-related methods in 
            com.cohort.util.Calendar2 
            (<a rel="help" href="setup.html#licenseCoHortSoftware">license</a>).
            For example, 
            <br><kbd>Calendar2.parseToEpochSeconds(<i>sourceTime, dateTimeFormat</i>)</kbd>
            will parse the sourceTime string via the dateTimeFormat string and 
            return a "seconds since 1970-01-01T00:00:00Z" (epochSeconds) double value.
          <li><a rel="help" href="ScriptMath.html">Math</a>, 
            which is a wrapper for almost all of the static, math-related methods in 
            <a rel="help" href="https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html"
              >java.lang.Math<img 
              src="../images/external.png" alt=" (external link)" 
              title="This link to an external website does not constitute an endorsement."></a>.
            For example, 
            <kbd>Math.atan2(<i>y, x</i>)</kbd>
            takes in rectangular coordinates (y, x) and returns the angle theta 
            (a double, in radians) of the corresponding polar coordinates (r, theta).
          <li><a rel="help" href="ScriptMath2.html">Math2</a>, 
            which is a wrapper for almost all of the static, math-related methods in 
            com.cohort.util.Math2 
            (<a rel="help" href="setup.html#licenseCoHortSoftware">license</a>).
            For example,
            <br><kbd>Math2.roundTo(<i>d, nPlaces</i>)</kbd>
            will round d to the specified number of digits to the right of the 
            decimal point.
          <li>String, which gives you access to all of the static, String-related methods in 
            <a rel="help" href="https://docs.oracle.com/javase/8/docs/api/java/lang/String.html"
              >java.lang.String<img 
              src="../images/external.png" alt=" (external link)" 
              title="This link to an external website does not constitute an endorsement."></a>.
            String objects in ERDDAP™ expressions and scripts can use any of their
            associated Java methods, as described in the java.lang.String documentation.
            For example,
            <kbd>String.valueOf(d)</kbd>
            will convert the double value d into a String
            (although you can also use <kbd>""+d</kbd>).
          <li><a rel="help" href="ScriptString2.html">String2</a>,
            which is a wrapper for most of the static, String- and array-related methods in
            com.cohort.util.String2 
            (<a rel="help" href="setup.html#licenseCoHortSoftware">license</a>).
            For example,
            <kbd>String2.zeroPad(<i>number, nDigits</i>)</kbd>
            will add 0's to the left of the number String so that the 
            total number of digits is nDigits (e.g., <kbd>String2.zeroPad("6", 2)</kbd>
            will return "06").
          <li><a rel="help" href="ScriptRow.html">row</a>,
            which has non-static methods for accessing the data from the 
            various columns in the current row of the source data table.
            For example, <kbd>row.columnString("year")</kbd> reads the value from the "year" column as a String, 
            whereas, <kbd>row.columnInt("year")</kbd> reads the value from the "year" column as an integer.  
          </ul>
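
        <p>As a small sketch that combines two of the classes above 
        (the source column name "station", the destinationName, and the width 
        of 4 are all hypothetical), a derived variable that zero-pads a station 
        number could be written as:
        <pre>
&lt;dataVariable&gt;
  &lt;sourceName&gt;=String2.zeroPad(row.columnString("station"), 4)&lt;/sourceName&gt;
  &lt;destinationName&gt;station_id&lt;/destinationName&gt;
  &lt;dataType&gt;String&lt;/dataType&gt;
&lt;/dataVariable&gt; </pre>
        Going by the description of <kbd>String2.zeroPad</kbd> above, 
        a source value of "42" should become "0042" in the new column.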

        <p>For security reasons, expressions and scripts can't use
        classes other than those 6. 
        ERDDAP™ enforces this limitation by creating a default
        blacklist (which blacklists all classes) and then a whitelist
        (which specifically allows the 6 classes described above).
        If you need other methods and/or other classes to do your work,
        please email your requests to Chris.John at noaa.gov . 

        <p>Efficiency 
        <br>For EDDTableFromFiles datasets, there is only a very, very minimal
        (probably not noticeable) slowdown for requests for data from
        these variables. For EDDTableFromDatabase, there is a huge speed penalty
        for requests that include constraints on these variables (e.g., 
        <kbd>&amp;longitude0360&gt;30&amp;longitude0360&lt;40</kbd>) because
        the constraints can't be passed through to the database,
        so the database has to return much much more data to ERDDAP™ (which is very time consuming) 
        so that ERDDAP™ can create the new variable and apply the constraint.
        To avoid the worst case (where there are no constraints
        being passed to the database), ERDDAP™ throws an error message
        so that the database doesn't have to return the entire contents of the 
        table. (If you want to bypass this, add a constraint to a non-script column 
        which will always
        be true, e.g., <kbd>&amp;time&lt;3000-01-01</kbd>.) For this reason,
        with EDDTableFromDatabase, it is probably always better to create a
        derived column in the database rather
        than use sourceName=script in ERDDAP.

        <p>Overview Of How An Expression (Or Script) Is Used:
        <br>In response to a user's request for tabular data,
        ERDDAP™ gets data from a series of source files. Each source file
        will generate a table of raw (straight from the source) data.
        ERDDAP™ will then go through the table of raw data, row by row, 
        and evaluate the expression or script once for every row, in order
        to create a new column which has that expression or script as a sourceName.

        <p>GenerateDatasetsXml
        <br>Note that GenerateDatasetsXml is completely unaware when there is a 
        need to create a variable with <kbd>&lt;sourceName&gt;=<i>expression</i>&lt;/sourceName&gt;</kbd>.
        You have to create the variable in datasets.xml by hand.

        <p>Expression Examples:
        <br>Here are some complete examples of data variables which use an expression
        to create a new column of data. We expect that these examples (and variants of them) 
        will cover about 95% of the usage of all expression-derived sourceNames.

        <ul>
        <li><a class="selfLink" id="scriptSourceNameTime" href="#scriptSourceNameTime" rel="bookmark"
          >Combining separate "date" and "time" columns into a unified time column:</a>
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=Calendar2.parseToEpochSeconds(row.columnString("date") + "T" + 
        row.columnString("time") + "Z", "yyyy-MM-dd'T'HH:mm:ss'Z'")&lt;/sourceName&gt; 
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;seconds since 1970-01-01&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>
          That sourceName expression makes a new "time" column 
          by concatenating the String values from the "date" (yyyy-MM-dd) and "time" (HH:mm:ss) 
          columns on each row of the source file, 
          and by converting that string into a "seconds since 1970-01-01" (epochSeconds) double value.

          <p>Of course, you'll have to customize the time format string to deal
          with the specific format in each dataset's source date and time columns; see the 
          <br><a rel="help" href="#stringTimeUnits">time units documentation</a>. 

          <p>Technically, you don't have to use Calendar2.parseToEpochSeconds()
          to convert the combined date+time into epochSeconds. You could just 
          pass the date+time String to ERDDAP™ and specify the format (e.g., 
          <br>yyyy-MM-dd'T'HH:mm:ss'Z')
          via the units attribute. But there are significant advantages
          to converting to epochSeconds -- notably, EDDTableFromFiles can then 
          easily keep track of the range of time values in each file and so
          quickly decide whether to look in a given file when responding to a 
          request which has time constraints.
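
          <p>For comparison, here is a minimal sketch of that alternative, in which
          the combined value stays a String and the units attribute carries the format.
          It assumes the same "date" and "time" source columns as the example above.
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=row.columnString("date") + "T" + row.columnString("time") + "Z"&lt;/sourceName&gt; 
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;yyyy-MM-dd'T'HH:mm:ss'Z'&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>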

          <p>A related problem is the need to create a unified date+time column
          from a source with separate year, month, date, hour, minute, second.
          The solution is very similar, but you will often need to zero-pad many
          of the fields, so that, for example, month (1 - 12) and date (1 - 31) 
          always have 2 digits. Here's an example with year, month, date: 
          <pre>&lt;sourceName&gt;=Calendar2.parseToEpochSeconds(row.columnString("year") + 
  String2.zeroPad(row.columnString("month"), 2) + 
  String2.zeroPad(row.columnString("date"), 2), "yyyyMMdd")&lt;/sourceName&gt;</pre>
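
          <p>As a sketch of the full six-column case, the same approach extends to
          hour, minute, and second. The column names "hour", "minute", and "second"
          are assumptions for illustration, and the sketch assumes each column holds
          whole numbers; adjust the names and the format string to match your source files.
          <pre>&lt;sourceName&gt;=Calendar2.parseToEpochSeconds(row.columnString("year") + 
  String2.zeroPad(row.columnString("month"), 2) + 
  String2.zeroPad(row.columnString("date"), 2) + 
  String2.zeroPad(row.columnString("hour"), 2) + 
  String2.zeroPad(row.columnString("minute"), 2) + 
  String2.zeroPad(row.columnString("second"), 2), "yyyyMMddHHmmss")&lt;/sourceName&gt;</pre>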

          <p>A related problem is the need to create
          a unified latitude or longitude column by combining the data 
          in the source table's separate degrees, minutes, and seconds columns,
          each stored as integers.
          For example,
          <pre>&lt;sourceName&gt;=row.columnInt("deg") + row.columnInt("min")/60.0 + 
    row.columnInt("sec")/3600.0&lt;/sourceName&gt;</pre>
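
          <p>Put into a complete variable definition, that might look like the
          following sketch. The source column names ("latDeg", "latMin", "latSec")
          are hypothetical, and the simple sum assumes non-negative degrees;
          southern or western positions stored with signed degrees need extra sign handling.
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=row.columnInt("latDeg") + row.columnInt("latMin")/60.0 + 
        row.columnInt("latSec")/3600.0&lt;/sourceName&gt; 
    &lt;destinationName&gt;latitude&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degrees_north&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>
          For example, 36 degrees, 30 minutes, 36 seconds becomes 
          36 + 30/60.0 + 36/3600.0 = 36 + 0.5 + 0.01 = 36.51 degrees.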

        <li><a class="selfLink" id="scriptSourceNameLongitude" href="#scriptSourceNameLongitude" rel="bookmark"
          >Converting a column named "lon" with longitude values</a> from 0 - 360&deg; into 
          a column named "longitude" with values from -180 - 180&deg; 
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=Math2.anglePM180(row.columnDouble("lon"))&lt;/sourceName&gt; 
    &lt;destinationName&gt;longitude&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degrees_east&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>
          That sourceName expression makes a new "longitude" column 
          by reading the double value from the "lon" column on each row of the source file
          (presumably with 0 - 360 values)
          and converting it into a -180 to 180 double value.
          
          <p>If you instead want to convert source longitude values of -180 - 180&deg; 
            into 0 - 360&deg;, use 
          <pre>&lt;sourceName&gt;=Math2.angle0360(row.columnDouble("lon"))&lt;/sourceName&gt;</pre>

          <p>Naming the Two Longitude Variables:
          <br>If the dataset will have 2 longitude variables, we recommend using 
            destinationName=longitude for the -180 - 180&deg; variable and 
            destinationName=longitude0360 (and long_name="Longitude 0-360&deg;") for the 0 - 360&deg; variable.            
            This is important because users sometimes use Advanced Search to search for 
            data within a specific longitude range. That search will work better
            if longitude consistently has -180 - 180&deg; values for all datasets. 
            Also, the dataset's geospatial_lon_min, geospatial_lon_max,
            Westernmost_Easting and Easternmost_Easting global attributes will 
            then be set in a consistent way (with longitude values -180 to 180&deg;).
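
          <p>A sketch of a dataset that offers both variables, assuming the source
          column is named "lon" (as in the example above):
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=Math2.anglePM180(row.columnDouble("lon"))&lt;/sourceName&gt; 
    &lt;destinationName&gt;longitude&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degrees_east&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
    &lt;sourceName&gt;=Math2.angle0360(row.columnDouble("lon"))&lt;/sourceName&gt; 
    &lt;destinationName&gt;longitude0360&lt;/destinationName&gt;
    &lt;dataType&gt;double&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="long_name"&gt;Longitude 0-360&deg;&lt;/att&gt;
        &lt;att name="units"&gt;degrees_east&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>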

        <li><a class="selfLink" id="scriptSourceNameDegrees" href="#scriptSourceNameDegrees" rel="bookmark"
          >Converting a column named "tempF" with temperature values in degree_F into  
          a column named "tempC" with temperatures in degree_C:</a>
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=(row.columnFloat("tempF")-32)*5/9&lt;/sourceName&gt; 
    &lt;destinationName&gt;tempC&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degrees_C&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>
          That sourceName expression makes a new "tempC" column 
          by converting the float degree_F value from the "tempF" column on each row of the source file 
          into a float degree_C value.
          
          <p>Note that your dataset can have both the original tempF variable
            and the new tempC variable by having another variable with 
            <br><kbd>&lt;sourceName&gt;tempF&lt;/sourceName&gt;</kbd>
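
          <p>For example, the two variables might sit side by side like this
          (a sketch; the units string for the source variable is an assumption
          about what the source files contain):
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;tempF&lt;/sourceName&gt; 
    &lt;destinationName&gt;tempF&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degree_F&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;dataVariable&gt;
    &lt;sourceName&gt;=(row.columnFloat("tempF")-32)*5/9&lt;/sourceName&gt; 
    &lt;destinationName&gt;tempC&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degrees_C&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>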

        <li><a class="selfLink" id="scriptSourceNameVectors" href="#scriptSourceNameVectors" rel="bookmark"
          >Converting wind "speed" and "direction" columns into two columns with the u,v components</a>
          <ul>
          <li>To make a u variable, use
            <pre>&lt;sourceName&gt;=row.columnFloat("speed") * Math.cos(row.columnFloat("direction"))&lt;/sourceName&gt;</pre>
          <li>To make a v variable, use
            <pre>&lt;sourceName&gt;=row.columnFloat("speed") * Math.sin(row.columnFloat("direction"))&lt;/sourceName&gt;</pre>
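          </ul>
          Java's Math.cos() and Math.sin() expect angles in radians, so if
          "direction" is stored in degrees, convert it first 
          (for example, with Math.toRadians()).
          <br>Or, given u,v: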
          <ul>
          <li>To make a speed variable, use
            <pre>&lt;sourceName&gt;=Math.sqrt(Math.pow(row.columnDouble("u"), 2) + Math.pow(row.columnDouble("v"), 2))&lt;/sourceName&gt;</pre>
          <li>To make a direction variable, use
            <pre>&lt;sourceName&gt;=Math.toDegrees(Math.atan2(row.columnDouble("v"), row.columnDouble("u")))&lt;/sourceName&gt;</pre>
          </ul>
        </ul>

        <p><a class="selfLink" id="scriptSourceNameScript" href="#scriptSourceNameScript" rel="bookmark"
          >Script Example:</a>
        <br>Here is an example of using a script, not just an expression,
        as a sourceName. We expect that scripts, as opposed to expressions,
        won't be needed often.
        In this case the goal is to convert tempC values into degrees_F, but to return 
        a non-NaN missing value (-99) for source temperatures outside a specific range
        (here, above 35 or below -5 degrees_C).
        Note that the script is the part after the "=".
        <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;=var tc=row.columnFloat("tempC"); return tc&amp;gt;35 || tc&amp;lt;-5? -99.0f : tc*9/5+32;&lt;/sourceName&gt; 
    &lt;destinationName&gt;tempF&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;degrees_F&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>               

        <p>Hard Flag
        <br>If you change the expression or script defined in a sourceName, 
        you must set a 
        <a rel="help" href="https://erddap.github.io/setup.html#hardFlag">hard flag</a>
        for the dataset so that ERDDAP™ deletes all of the cached information for the dataset
        and re-reads every data file (using the new expression or script) the next time it loads the dataset.
        Alternatively, you can use <a rel="help" href="#DasDds">DasDds</a>
        which does the equivalent of setting a hard flag.
        
        <p><a class="selfLink" id="scriptSourceNamePercentEncode" href="#scriptSourceNamePercentEncode" rel="bookmark"
          >Percent Encode</a>
          <br>This is only rarely relevant: 
          Because the expressions and scripts are written in datasets.xml, which is an XML document,
          you must encode any <kbd>&lt;</kbd>, <kbd>&gt;</kbd>, and <kbd>&amp;</kbd> 
          characters in the expressions and scripts as the XML character entities
          <kbd>&amp;lt;</kbd>, <kbd>&amp;gt;</kbd>, and <kbd>&amp;amp;</kbd>.
          (Strictly speaking, this is XML character encoding, not URL percent encoding.)
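
          <p>As a small illustration (the "depth" column and the thresholds are
          hypothetical, and the sketch assumes the expression language accepts the
          ternary operator, as the script example above does), an expression that you
          would think of as
          <br><kbd>=row.columnFloat("depth") &gt;= 0 &amp;&amp; row.columnFloat("depth") &lt; 11000 ? row.columnFloat("depth") : -999.0f</kbd>
          <br>would be written in datasets.xml as
          <pre>&lt;sourceName&gt;=row.columnFloat("depth") &amp;gt;= 0 &amp;amp;&amp;amp; row.columnFloat("depth") &amp;lt; 11000 ? row.columnFloat("depth") : -999.0f&lt;/sourceName&gt;</pre>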

        <p><a class="selfLink" id="scriptSourceNameProblems" href="#scriptSourceNameProblems" rel="bookmark"
          >Common Problems</a>
        <br>A common problem is that you create a variable with sourceName=<i>expression</i>
        but the resulting column of data just has missing values. 
        Alternatively, some rows of the new column have missing values and you think they shouldn't.
        The underlying problem is that something is wrong with the expression and ERDDAP
        is converting that error into a missing value. 
        To solve the problem, 
        <ul>
        <li>Look at the expression to see what the problem might be. 
        <li>Look in 
          <a rel="help" href="https://erddap.github.io/setup.html#log">log.txt</a>, 
          which will show the first error message generated 
          during the creation of each new column.
        </ul>
        Common causes are: 
        <ul>
        <li>You used the wrong case. Expressions and scripts are case sensitive.
        <li>You omitted the name of the class. For example, you must use <kbd>Math.abs()</kbd>,
          not just <kbd>abs()</kbd>.
        <li>You didn't do type conversions. For example, if a method's parameter is a String
          and you have a double value <kbd>d</kbd>,
          you need to convert the double into a String via <kbd>""+d</kbd>.
        <li>The column name in the expression doesn't exactly match the column
          name in the file (or the name might be different in some files).
        <li>There is a syntax error in the expression (e.g., a missing or extra ')').
        </ul>
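
        <p>To illustrate the type-conversion point, here is a sketch that assumes
        "month" is read as an integer (the column name is hypothetical) and must be 
        turned into a String before it is passed to String2.zeroPad():
        <pre>&lt;sourceName&gt;=String2.zeroPad("" + row.columnInt("month"), 2)&lt;/sourceName&gt;</pre>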
        
        <p>If you get stuck or need help,
        <br>please send an email with the details to <kbd>erd dot data at noaa dot gov</kbd>.
        <br>Or, you can join the <a rel="help"
        href="#ERDDAPMailingList">ERDDAP™ Google Group / Mailing List</a> 
        and post your question there.


      <li><a class="selfLink" id="destinationName" href="#destinationName" rel="bookmark"><kbd>&lt;destinationName&gt;</kbd></a> -- 
        the name for the variable that will be shown to 
        and used by ERDDAP™ users. 
        <ul>
        <li>This is OPTIONAL. If absent, the <a rel="help" href="#sourceName">sourceName</a> 
          is used.
        <li>This is useful because it allows you to change a cryptic or odd sourceName.
        <li>destinationName is case sensitive.
        <li>destinationNames MUST start with a letter (A-Z, a-z) and MUST be followed by 0 or
          more characters (A-Z, a-z, 0-9, and _).  
          ('-' was allowed before ERDDAP™ version 1.10.)
          This restriction allows data variable names to be the same in ERDDAP™, 
          in the response files, and in all the software where those files will be 
          used, including programming languages (like Python, Matlab, and JavaScript)
          where there are similar restrictions on variable names.
        <li>In EDDTable datasets, 
          <a rel="help" href="#LLAT">longitude, latitude, altitude (or depth), and time</a> 
          data variables are special.
        <br>&nbsp;
        </ul>
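
        <p>For instance, a minimal sketch that gives a cryptic source name a
        friendlier destinationName (both names here are hypothetical):
        <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;TMP_2m&lt;/sourceName&gt; 
    &lt;destinationName&gt;air_temperature&lt;/destinationName&gt;
    &lt;dataType&gt;float&lt;/dataType&gt;
&lt;/dataVariable&gt;</pre>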

      <li><a class="selfLink" id="dataType" href="#dataType" rel="bookmark"><kbd>&lt;dataType&gt;</kbd></a> -- 
        specifies the data type coming from the source. 
        (In some cases, for example, when reading data from ASCII files, 
        it specifies how the data coming from the source should be stored.)
        <ul>
        <li>This is REQUIRED by some dataset types and IGNORED by others. 
          Dataset types that require this for their dataVariables are: 
          EDDGridFromXxxFiles, EDDTableFromXxxFiles,
          EDDTableFromMWFS, EDDTableFromNOS, EDDTableFromSOS.
          Other dataset types ignore this tag because they get the information from the source.
          <br>&nbsp;

        <li>Valid values are any of the standard 
          <a rel="bookmark" href="#dataTypes">ERDDAP™ data types</a>
          plus <kbd>boolean</kbd> (see below).
          The dataType names are case-sensitive.
          <br>&nbsp;

        <li><a class="selfLink" id="booleanData" href="#booleanData" rel="bookmark">"boolean"</a> is a special case. 
          <ul>
          <li>Internally, ERDDAP™ doesn't support a boolean type because booleans 
            can't store missing values and most file types don't support booleans.
            Also, DAP doesn't support booleans, so there would be no standard way 
            to query boolean variables.
          <li>Specifying "boolean" for the dataType in datasets.xml will cause 
            boolean values to
            be stored and represented as bytes: 0=false, 1=true, 127=missing_value.
          <li>Users can specify constraints by using the numeric values 
            (for example, "isAlive=1").
          <li>ERDDAP™ administrators sometimes need to use the "boolean" dataType in datasets.xml
            to tell ERDDAP™ how to interact with the data source (e.g., to read
            boolean values from a relational database and convert them to 0, 1, or 127).
            <br>&nbsp;
          </ul>
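
          <p>A sketch of such a variable (the column name "isAlive" is borrowed
          from the constraint example above; the long_name is an assumption):
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;isAlive&lt;/sourceName&gt; 
    &lt;destinationName&gt;isAlive&lt;/destinationName&gt;
    &lt;dataType&gt;boolean&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="long_name"&gt;Is Alive&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>
          Users would then see a byte variable with the values 0, 1, or 127.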

        <li>If you want to change a data variable from the dataType 
          in the source files 
          (for example, short) into some other dataType in the dataset (for example, int), 
          don't use &lt;dataType&gt; to specify what you want. 
          (It works for some types of datasets, but not others.) 
          Instead:
          <ul>
          <li>Use &lt;dataType&gt; to specify what is in the files (for example, short).
          <li>In the &lt;addAttributes&gt; for the variable, add a 
            <a rel="help" href="#scale_factor">scale_factor</a> attribute
            with the new dataType (for example, int) and a value of 1, for example,
            <br><kbd>&lt;att name="scale_factor" type="int"&gt;1&lt;/att&gt;</kbd>
            <br>&nbsp;
          </ul>
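
          <p>Put together, a complete variable that reads shorts from the files and
          serves ints might look like this sketch (the variable name "count" is hypothetical):
          <pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;count&lt;/sourceName&gt; 
    &lt;destinationName&gt;count&lt;/destinationName&gt;
    &lt;dataType&gt;short&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="scale_factor" type="int"&gt;1&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;</pre>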
        </ul>

      <li><a rel="help" href="#variableAttributes"><kbd>&lt;addAttributes&gt;</kbd></a> -- 
        defines a set of attributes 
        (<i>name</i> = <i>value</i>) which are
        added to the source's attributes for a variable, to make the combined attributes
        for a variable. This is OPTIONAL.
        <br>If the variable's 
          <a rel="help" href="#variableAttributes">sourceAttributes</a> or 
          <kbd>&lt;addAttributes&gt;</kbd> include 
        <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>
        attributes, their values will be used to unpack the data from
        the source before distribution to the client.
        The unpacked variable will be of the same data type (for example, float) as the 
        <kbd>scale_factor</kbd> and <kbd>add_offset</kbd> values.
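
        <p>As a worked sketch with made-up numbers, assuming the usual CF packing
        convention (unpacked = packed * scale_factor + add_offset):
        <pre>
&lt;addAttributes&gt;
    &lt;att name="scale_factor" type="float"&gt;0.01&lt;/att&gt;
    &lt;att name="add_offset" type="float"&gt;273.15&lt;/att&gt;
&lt;/addAttributes&gt;</pre>
        With these attributes, a packed source value of 1234 would be served as
        1234 * 0.01 + 273.15 = 285.49, and the unpacked variable would be of type float.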
        <br>&nbsp;
      </ul>

  <li><a class="selfLink" id="variableAttributes" href="#variableAttributes" rel="bookmark"><strong>Variable Attributes / Variable <kbd>&lt;addAttributes&gt;</kbd></strong></a> -- 
    <kbd>&lt;addAttributes&gt;</kbd> is an OPTIONAL tag within an 
      <kbd>&lt;axisVariable&gt;</kbd> or 
      <kbd>&lt;dataVariable&gt;</kbd> 
    tag which is used to change the variable's attributes.
    <ul>
    <li><strong>Use a variable's <kbd>&lt;addAttributes&gt;</kbd>
      to change the variable's attributes.</strong>
      ERDDAP™ combines a variable's attributes from the dataset's source 
      (<kbd><strong>sourceAttributes</strong></kbd>)
      and the variable's <kbd><strong>addAttributes</strong></kbd> which you define in datasets.xml 
      (which have priority) 
      to make the variable's "<strong>combinedAttributes</strong>", which are what ERDDAP™ users see. 
      Thus, you can use addAttributes to redefine the values of sourceAttributes, add new
      attributes, or remove attributes.
     <li>See the <a rel="help" href="#addAttributes"><kbd><strong>&lt;addAttributes&gt;</strong></kbd> information</a>
       which applies to 
       global and variable <kbd><strong>&lt;addAttributes&gt;</strong></kbd>.
    <li>ERDDAP™ looks for and uses many of these attributes in various ways.
      For example, the colorBar values are required to make a variable available via WMS,
      so that maps can be made with consistent colorBars.
    <li><a rel="help" href="#LLAT">The longitude, latitude, altitude (or depth), and time variables</a> 
      get lots of appropriate metadata automatically 
      (for example, <a rel="help" href="#units">units</a>).
    <li>A sample <kbd>&lt;addAttributes&gt;</kbd> for a data variable is: <pre>
&lt;addAttributes&gt; 
  &lt;att name="actual_range" type="doubleList"&gt;10.34 23.91&lt;/att&gt;
  &lt;att name="colorBarMinimum" type="double"&gt;0&lt;/att&gt;
  &lt;att name="colorBarMaximum" type="double"&gt;32&lt;/att&gt;
  &lt;att name="<a rel="help" href="#ioos_category">ioos_category</a>"&gt;Temperature&lt;/att&gt;
  &lt;att name="<a rel="help" href="#long_name">long_name</a>"&gt;Sea Surface Temperature&lt;/att&gt;
  &lt;att name="numberOfObservations" /&gt; 
  &lt;att name="<a rel="help" href="#units">units</a>"&gt;degree_C&lt;/att&gt;
&lt;/addAttributes&gt;</pre>
       The empty <kbd>numberOfObservations</kbd> attribute causes the source 
       <kbd>numberOfObservations</kbd> attribute (if any)
       to be removed from the final, combined list of attributes.

    <li>Supplying this information helps ERDDAP™ do a better job and helps users 
      understand the datasets.
      <br>Good metadata makes a dataset usable. 
      <br>Insufficient metadata makes a dataset useless.
      <br>Please take the time to do a good job with metadata attributes. 
    </ul>

    <p><strong>Comments about variable attributes that are special in ERDDAP:</strong>
      <ul>
      <li><a class="selfLink" id="actual_range" href="#actual_range" rel="bookmark"><strong>actual_range</strong></a> 
        is a RECOMMENDED variable attribute.  For example,
        <br><kbd>&lt;att name="actual_range" 
          <a rel="help" href="#attributeType">type="floatList"</a>&gt;0.17 23.58&lt;/att&gt;</kbd>
        <ul>
        <li>This attribute is from the 
          <a rel="help" href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
          >CDC COARDS<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
          and
          <a rel="help"
          href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
          >CF 1.7+<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
          metadata standards.
        <li>If present, it MUST be an array of two values of the same data type 
          as the destination data type of the variable,
          specifying the actual (not the theoretical or the allowed) minimum and maximum 
          values of the data for that variable.
        <li>If the data is packed with 
          <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>,
          actual_range must have unpacked values and be of the same data type
          as the unpacked values.
        <li>For some data sources (for example, all EDDTableFrom...Files datasets),
          ERDDAP™ determines the 
          actual_range of each variable and sets the actual_range attribute. 
          With other data sources (for example, relational databases, Cassandra,
          DAPPER, Hyrax), it might be troublesome or burdensome for the source
          to calculate the range, so ERDDAP™ doesn't request it. 
          In this case, it is best if you can set actual_range (especially for the 
          longitude, latitude, altitude, depth, and time variables)
          by adding an actual_range attribute to each variable's 
          <a rel="help" href="#addAttributes"><kbd>&lt;addAttributes&gt;</kbd></a>
          for this dataset in datasets.xml, for example,
          <br><kbd>&lt;att name="actual_range" 
            <a rel="help" href="#attributeType">type="doubleList"</a>&gt;-180 180&lt;/att&gt;</kbd>
        <li>For numeric <a rel="help" href="#timeUnits">time and timestamp variables</a>,
          the values specified should be the 
          relevant source (not destination) numeric values. For example, 
          if the source time values 
          are stored as "days since 1985-01-01", then the actual_range should be specified 
          in "days since 1985-01-01".
          And if you want to refer to NOW as the second value for near-real-time data that
          is periodically updated, 
          you should use <kbd>NaN</kbd>. For example, to specify a data range of
          1985-01-17 until NOW, use
          <br><kbd>&lt;att name="actual_range" 
            <a rel="help" href="#attributeType">type="doubleList"</a>&gt;16 NaN&lt;/att&gt;</kbd>
        <li>If actual_range is known (either by ERDDAP™ calculating it or by you 
          adding it via
          <kbd>&lt;addAttributes&gt;</kbd>), ERDDAP™ will display it
          to the user on the Data Access Form (<i>datasetID</i>.html)
          and Make A Graph web pages (<i>datasetID</i>.graph) for that dataset
          and use it when generating the FGDC and ISO 19115 metadata.
          Also, the last 7 days of time's actual_range are used as the default time subset.         
        <li>If actual_range is known, users can
          use the 
          <a rel="help" 
          href="https://coastwatch.pfeg.noaa.gov/erddap/tabledap/documentation.html#min"
          >min() and max() functions</a> in requests,
          which is often very useful.
        <li>For all EDDTable... datasets, if actual_range is known 
          (either by you specifying it or
          by ERDDAP™ calculating it), ERDDAP™ will be able to quickly reject any requests for 
          data outside that range.  For example, if the dataset's lowest time value
          corresponds to 
          1985-01-17, then a request for all data from 1985-01-01 through 1985-01-16 will be 
          immediately rejected with the error message 
          "Your query produced no matching results."
          This makes actual_range a very important piece of metadata, as it can 
          save ERDDAP™ a lot of effort and save the user a lot of time.
          And this highlights that the actual_range values must not be narrower 
          than the data's actual range; otherwise, ERDDAP™ may erroneously report
          "Your query produced no matching results." when in fact there is relevant data.
        <li>When a user selects a subset of data and requests a file type 
          that includes metadata (for example, .nc), ERDDAP™ modifies actual_range
          in the response file to reflect the subset's range.
        <li>See also <a rel="help" href="#data_min">data_min and data_max</a>,
          which are an alternative way to specify the actual_range.
          However, these are deprecated now that actual_range is defined by CF 1.7+.
          <br>&nbsp;
        </ul>

      <!-- <li><a class="selfLink" id="charset" href="#charset" rel="bookmark"><strong>charset</strong></a>
        <ul>
        <li>This attribute may only be used with char variables 
          (see also <a rel="help" href="#_Encoding">_Encoding</a> which is 
          only used with String variables).
        <li>This attribute is strongly recommended.
        <li>This attribute is not from any standard, although <kbd>charset</kbd>
          is part of the HTML <kbd>content</kbd> setting.
        <li>Internally in ERDDAP™, char variables are 2-byte characters that use the          
          <a rel="help"
          href="https://en.wikipedia.org/wiki/UTF-16"
          >Unicode UCS-2 character set<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
        <li>Many file types only support 1-byte characters and thus need this
          attribute to identify an associated
          <br><a rel="help"
          href="https://en.wikipedia.org/wiki/Code_page"
          >charset (AKA code page)<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
          which defines how to map the 256 possible values to a set of 
          256 characters drawn from the UCS-2 character set.
        <li>Values for <kbd>charset</kbd> are case-insensitive.
        <li>In theory, ERDDAP™ could support charset identifiers from           
          <a rel="help"
          href="https://www.iana.org/assignments/character-sets/character-sets.xhtml"
          >this IANA list<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>,
          but in practice, ERDDAP™ currently just supports <kbd>ISO-8859-1</kbd> 
          (note that it has dashes, not underscores), which has the advantage
          that it is identical to the first 256 characters of Unicode.
        <li>The default value is <kbd>ISO-8859-1</kbd> .
        <li><kbd>UTF-8</kbd> is not a valid option for <kbd>charset</kbd>, 
          since UTF-8 requires between 
          1 and 4 bytes per character and char variables are composed of individual 
          1 or 2-byte characters.
        <li>This is an ongoing troublesome issue because many source files use 
          charsets that are different from ISO-8859-1, 
          but don't identify the charset.
          For example, many source data files have some metadata copied and pasted from
          Microsoft Word on Windows and thus have fancy hyphens and apostrophes
          from a Windows-specific charset instead of ASCII hyphens and apostrophes.
          These characters then show up as odd characters or '?' in ERDDAP.
          <br>&nbsp;
        </ul>
        -->
      <li><a class="selfLink" id="colorBar" href="#colorBar" rel="bookmark"><strong>Color Bar Attributes</strong></a> -- 
        There are several OPTIONAL variable attributes which specify the suggested
        default attributes for a color bar 
        (used to convert data values into colors on images)
        for this variable.
        <ul>
        <li>If present, this information is used as default information by 
          griddap and tabledap
          whenever you request an image that uses a color bar.
        <li>For example, when latitude-longitude gridded data is plotted as a 
          coverage on a map,
          the color bar specifies how the data values are converted to colors.
        <li>Having these values allows ERDDAP™ to create images which use a 
          consistent color bar
          across different requests, even when the time or other dimension values vary.
        <li>These attribute names were created for use in ERDDAP. 
          They are not from a metadata standard.
        <!-- These WMS requirements are also in setupDatasetXml. -->
        <li><a class="selfLink" id="WMS" href="#WMS" rel="bookmark">WMS</a> -- The main requirements for a variable to be 
          accessible via ERDDAP's WMS
          server are: 
          <ul>
          <li>The dataset must be an EDDGrid... dataset.
          <li>The data variable MUST be a gridded variable.
          <li>The data variable MUST have longitude and latitude axis variables.
            (Other axis variables are OPTIONAL.)
          <li>There MUST be some longitude values between -180 and 180.
          <li>The colorBarMinimum and colorBarMaximum attributes MUST be specified.
             (Other color bar attributes are OPTIONAL.)
          </ul>
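
          <p>On the metadata side, the smallest addition that satisfies the color bar
          requirement is a pair of attributes like these (the values are placeholders, 
          to be chosen to suit the variable's data):
          <pre>
&lt;addAttributes&gt;
    &lt;att name="colorBarMinimum" type="double"&gt;0&lt;/att&gt;
    &lt;att name="colorBarMaximum" type="double"&gt;32&lt;/att&gt;
&lt;/addAttributes&gt;</pre>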
        <li>The attributes related to the color bar are:
          <ul>
          <li><kbd><strong>colorBarMinimum</strong></kbd> specifies the minimum value on the
            colorBar.  For example,
            <br><kbd>&lt;att name="colorBarMinimum" 
              <a rel="help" href="#attributeType">type="double"</a>&gt;-5&lt;/att&gt;</kbd>
            <ul>
            <li>If the data is packed with 
              <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>,
              specify the colorBarMinimum as an unpacked value.
            <li>Data values lower than colorBarMinimum are represented by the same color as
              colorBarMinimum values.
            <li>The attribute should be of 
              <a rel="help" href="#attributeType">type="double"</a>, 
              regardless of the data variable's type.
            <li>The value is usually a nice round number. 
            <li>Best practices: We recommend a value slightly higher than the 
              minimum data value.
            <li>There is no default value.
            </ul>
          <li><kbd><strong>colorBarMaximum</strong></kbd> specifies the maximum value on the
            colorBar.  For example,
            <br><kbd>&lt;att name="colorBarMaximum" 
              <a rel="help" href="#attributeType">type="double"</a>&gt;5&lt;/att&gt;</kbd>
            <br>
            <ul>
            <li>If the data is packed with 
              <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>,
              specify the colorBarMaximum as an unpacked value.
            <li>Data values higher than colorBarMaximum are represented by the same color as
              colorBarMaximum values.
            <li>The attribute should be of 
              <a rel="help" href="#attributeType">type="double"</a>, 
              regardless of the data variable's type.
            <li>The value is usually a nice round number.
            <li>Best practices: We recommend a value slightly lower than the
              maximum data value.
            <li>There is no default value.
            </ul>
          <li><kbd><strong>colorBarPalette</strong></kbd> specifies the palette for the colorBar.
            For example,
            <br><kbd>&lt;att name="colorBarPalette"&gt;WhiteRedBlack&lt;/att&gt;</kbd>
            <ul>
            <li>All ERDDAP™ installations support these standard palettes:
                <!--the standard palettes are listed in 'palettes' in Bob's messages.xml, 
                    EDDTable.java, EDDGrid.java, and setupDatasetsXml.html -->
              <kbd>BlackBlueWhite, 
              BlackRedWhite, BlackWhite, BlueWhiteRed, LightRainbow, Ocean, 
              OceanDepth, Rainbow,
              RedWhiteBlue, ReverseRainbow, Topography, TopographyDepth [added in v1.74], 
              WhiteBlack, WhiteBlueBlack,</kbd>
              and <kbd>WhiteRedBlack</kbd>.
            <li>If you have installed
              <a rel="help" 
              href="https://erddap.github.io/setup.html#palettes"
              >additional palettes</a>, you can refer to one of them.
            <li>If this attribute isn't present, the default is <kbd>BlueWhiteRed</kbd> if
              <kbd>-1*colorBarMinimum = colorBarMaximum</kbd>; otherwise the default is 
              <kbd>Rainbow</kbd>.
            </ul>
          <li><kbd><strong>colorBarScale</strong></kbd> specifies the scale for the colorBar.  
            For example,
            <br><kbd>&lt;att name="colorBarScale"&gt;Log&lt;/att&gt;</kbd>
            <ul>
            <li>Valid values are <kbd>Linear</kbd> and <kbd>Log</kbd>.
            <li>If the value is <kbd>Log</kbd>, colorBarMinimum must be greater than 0.
            <li>If this attribute isn't present, the default is <kbd>Linear</kbd>.
            </ul>
          <li><kbd><strong>colorBarContinuous</strong></kbd> specifies whether the colorBar 
            has a continuous palette
              of colors, or whether the colorBar has a few discrete colors.  For example,
            <br><kbd>&lt;att name="colorBarContinuous"&gt;false&lt;/att&gt;</kbd>
            <ul>
            <li>Valid values are the strings <kbd>true</kbd> and <kbd>false</kbd>.
            <li>If this attribute isn't present, the default is <kbd>true</kbd>.
            </ul>
          <li><kbd><strong>colorBarNSections</strong></kbd> specifies the default number
            of sections on the colorBar.  For example,
            <br><kbd>&lt;att name="colorBarNSections" type="int"&gt;6&lt;/att&gt;</kbd>
            <ul>
            <li>Valid values are positive integers.
            <li>If this attribute isn't present, the default is <kbd>-1</kbd>,
              which tells ERDDAP™ to pick the number of sections based on the
              range of the colorBar.
              <br>&nbsp;
            </ul>
          </ul>
        </ul>
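        <p>As a rough sketch of how the colorBar attributes fit together, 
        here are the attributes for a hypothetical chlorophyll variable 
        (the variable names and all of the values are assumptions made up for illustration). 
        Because colorBarScale is <kbd>Log</kbd>, colorBarMinimum is greater than 0.
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;chla&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;chlorophyll&lt;/destinationName&gt;
    &lt;addAttributes&gt;
        &lt;att name="colorBarMinimum" type="double"&gt;0.03&lt;/att&gt;
        &lt;att name="colorBarMaximum" type="double"&gt;30&lt;/att&gt;
        &lt;att name="colorBarPalette"&gt;Rainbow&lt;/att&gt;
        &lt;att name="colorBarScale"&gt;Log&lt;/att&gt;
        &lt;att name="colorBarContinuous"&gt;true&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>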

      <li><a class="selfLink" id="data_min" href="#data_min" rel="bookmark"><strong>data_min</strong> and <strong>data_max</strong></a> -- 
        These are deprecated variable attributes defined in the World Ocean Circulation Experiment (WOCE) metadata description.  For example,
        <br><kbd>&lt;att name="data_min" 
          <a rel="help" href="#attributeType">type="float"</a>&gt;0.17&lt;/att&gt;</kbd>
        <br><kbd>&lt;att name="data_max" 
          <a rel="help" href="#attributeType">type="float"</a>&gt;23.58&lt;/att&gt;</kbd>
        <ul>
        <li>We recommend that you use 
          <a rel="help" href="#actual_range"><kbd>actual_range</kbd></a>,
          instead of data_min and data_max,
          because actual_range is now defined by the CF specification.
        <li>If present, they must be of the same data type as the destination data type of the variable, 
          and specify the actual (not the theoretical or the allowed) minimum and maximum 
          values of the data for that variable.
        <li>If the data is packed with 
          <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>,
          data_min and data_max must be unpacked values using the unpacked data type.
          <br>&nbsp;
        </ul>
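        <p>For comparison, the data_min and data_max example above might be expressed 
        with actual_range roughly as follows 
        (a sketch which assumes the variable's destination data type is float):
<pre>
&lt;att name="actual_range" type="floatList"&gt;0.17 23.58&lt;/att&gt;
</pre>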

      <li><a class="selfLink" id="variableDrawLandMask" href="#variableDrawLandMask" rel="bookmark"><strong>drawLandMask</strong></a> -- 
        This is an OPTIONAL variable attribute used by ERDDAP™ (and no metadata standards) 
        which specifies the default value for the "Draw Land Mask" option on the dataset's
        Make A Graph form (<i>datasetID</i>.graph)
        and for the <kbd>&amp;.land</kbd> parameter in a URL requesting a 
        map of the data.  For example,
        <br><kbd>&lt;att name="drawLandMask"&gt;under&lt;/att&gt;</kbd>
        <br>See the <a rel="help" href="#drawLandMask">drawLandMask overview</a>.

      <li><a class="selfLink" id="Encoding" href="#Encoding" rel="bookmark"><strong>_Encoding</strong></a>
        <ul>
        <li>This attribute may only be used with String variables 
          <!--(see also <a rel="help" href="#charset">charset</a> which is 
          only used with char variables)-->.
        <li>This attribute is strongly recommended.
        <li>This attribute is from the 
          <a rel="help"
          href="https://docs.unidata.ucar.edu/netcdf-java/current/userguide/index.html"
          >NetCDF User's Guide (NUG)<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
        <li>Internally in ERDDAP™, Strings are a sequence of 2-byte characters 
          that use the          
          <a rel="help"
          href="https://en.wikipedia.org/wiki/UTF-16"
          >Unicode UCS-2 character set<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>.
        <li>Many file types only support 1-byte characters in Strings and thus need this
          attribute to identify an associated
          <br><a rel="help"
          href="https://en.wikipedia.org/wiki/Code_page"
          >charset (AKA code page)<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
          which defines how to map the 256 possible values to a set of 
          256 characters drawn from the UCS-2 character set
          and/or the encoding system, e.g., 
          <a rel="help"
          href="https://en.wikipedia.org/wiki/UTF-8"
          >UTF-8<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>
          (which requires between 1 and 4 bytes per character).
        <li>Values for <kbd>_Encoding</kbd> are case-insensitive.
        <li>In theory, ERDDAP™ could support _Encoding identifiers from           
          <a rel="help"
          href="https://www.iana.org/assignments/character-sets/character-sets.xhtml"
          >this IANA list<img 
          src="../images/external.png" alt=" (external link)" 
          title="This link to an external website does not constitute an endorsement."></a>,
          but in practice, ERDDAP™ currently just supports 
          <ul>
          <li><kbd>ISO-8859-1</kbd> 
            (note that it has dashes, not underscores), which has the advantage
            that it is identical to the first 256 characters of Unicode, and
          <li><kbd>UTF-8</kbd>.
          </ul>
        <li>When reading source files, the default value is <kbd>ISO-8859-1</kbd>,
          except for netcdf-4 files, where the default is <kbd>UTF-8</kbd>.
        <li>This is an ongoing troublesome issue because many source files use 
          charsets or encodings that are different from ISO-8859-1, 
          but don't identify the charset or encoding.
          For example, many source data files have some metadata copied and pasted from
          Microsoft Word on Windows and thus have fancy hyphens and apostrophes
          from a Windows-specific charset instead of ASCII hyphens and apostrophes.
          These characters then show up as odd characters or '?' in ERDDAP.
          <br>&nbsp;
        </ul>
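        <p>For example, here is a sketch of a String variable whose source file 
        stores text as UTF-8 
        (the variable name is hypothetical and only for illustration):
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;station_name&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="_Encoding"&gt;UTF-8&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>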

      <li><strong><a class="selfLink" id="fileAccessBaseUrl" href="#fileAccessBaseUrl" rel="bookmark">fileAccessBaseUrl</a> and <a class="selfLink" id="fileAccessSuffix" href="#fileAccessSuffix" rel="bookmark">fileAccessSuffix</a></strong> 
        are very rarely used attributes that are not from any standard.        
        If an EDDTable column has filenames of web accessible files (e.g., 
        image, video, or audio files), you can add 
        <br><kbd>&lt;att name="fileAccessBaseUrl"&gt;<i>someBaseURL</i>&lt;/att&gt;</kbd>
        <br>to specify the base URL (ending with / ) needed
        to make the filenames into complete URLs. 
        In unusual cases, such as when a column has references to .png files but 
        the values lack ".png", you can add 
        <br><kbd>&lt;att name="fileAccessSuffix"&gt;<i>someSuffix</i>&lt;/att&gt;</kbd>
        <br>(for example, <kbd>&lt;att name="fileAccessSuffix"&gt;.png&lt;/att&gt;</kbd>)
        <br>to specify a suffix to be added to make the filenames into complete URLs.        
        Then for .htmlTable responses, ERDDAP™ will 
        show the filename as a link to the full URL (the baseUrl plus the filename 
        plus the suffix). 
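        <p>As a sketch, suppose a hypothetical column named <kbd>image_file</kbd> 
        holds values like <kbd>cam_0042</kbd> 
        (the column name, the base URL, and the filename are all assumptions for illustration):
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;image_file&lt;/sourceName&gt;  &lt;!-- hypothetical column of filenames --&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="fileAccessBaseUrl"&gt;https://www.example.com/images/&lt;/att&gt;
        &lt;att name="fileAccessSuffix"&gt;.png&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
&lt;!-- In an .htmlTable response, the value cam_0042 would then link to
     https://www.example.com/images/cam_0042.png --&gt;
</pre>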

        <p>If you want ERDDAP™ to serve the related files, 
        make a separate 
        <a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a>
        dataset for those files
        (it may be a private dataset).
 
      <li><a class="selfLink" id="fileAccessArchiveUrl" href="#fileAccessArchiveUrl" rel="bookmark"><strong>fileAccessArchiveUrl</strong></a> 
        is a very rarely used attribute that is not from any standard.        
        If an EDDTable column has filenames of web accessible files (e.g., 
        image, video, or audio files) which are accessible via an archive (e.g., .zip file)
        accessible via a URL, use 
        <kbd>&lt;att name="fileAccessArchiveUrl"&gt;<i>theURL</i>&lt;/att&gt; 
        </kbd>to specify the URL for the archive. 

        <p>If you want ERDDAP™ to serve the archive file, 
        make a separate 
        <a rel="help" href="#EDDTableFromFileNames">EDDTableFromFileNames</a>
        dataset for that file
        (it may be a private dataset).
 
      <li><a class="selfLink" id="ioos_category" href="#ioos_category" rel="bookmark"><strong>ioos_category</strong></a> -- 
        This is a REQUIRED variable attribute if <kbd>&lt;variablesMustHaveIoosCategory&gt;</kbd>
        is set to true (the default) in
          <a rel="help" href="https://erddap.github.io/setup.html#setup.xml"
          >setup.xml</a>;
          otherwise, it is OPTIONAL.  
        <br>For example, <kbd>&lt;att name="ioos_category"&gt;Salinity&lt;/att&gt;</kbd>
        <br>The categories are from 
          <a rel="help" href="https://ioos.noaa.gov/">NOAA's 
          Integrated Ocean Observing System (IOOS)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        <ul>
        <li>At the time of writing, we aren't aware of formal definitions of these names.
        <li>The core names are from Zdenka Willis' .ppt "Integrated Ocean Observing System (IOOS)
           NOAA's Approach to Building an Initial Operating Capability" and from the
           <a rel="help"
           href="https://www.iooc.us/wp-content/uploads/2010/11/US-IOOS-Blueprint-for-Full-Capability-Version-1.0.pdf"
           >US IOOS Blueprint<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
          (page 1-5).
        <li>It is likely that this list will be revised in the future.
          If you have requests, please email <kbd>Chris.John at noaa.gov</kbd>.
        <li>ERDDAP™ supports a larger list of categories than IOOS does because
          Bob Simons added additional names (mostly based on the names of scientific fields,
          for example, Biology, Ecology, Meteorology, Statistics, Taxonomy)
          for other types of data.
        <li>The current valid values in ERDDAP™ are <kbd>Bathymetry, Biology, Bottom Character, 
          CO2, Colored Dissolved Organic Matter, Contaminants,          
          Currents, Dissolved Nutrients, Dissolved O2, Ecology, Fish Abundance,
          Fish Species, Heat Flux, Hydrology, Ice Distribution, Identifier, Location,
          Meteorology, Ocean Color, Optical Properties, Other, 
          Pathogens, Phytoplankton Species, Pressure, Productivity, Quality, Salinity, Sea Level,
          Statistics, Stream Flow, Surface Waves, Taxonomy, Temperature, Time, 
          Total Suspended Matter, Unknown, Wind, 
          Zooplankton Species,</kbd> and <kbd>Zooplankton Abundance</kbd>.
        <li>There is some overlap and ambiguity between different terms -- do your best.
        <li>If you add ioos_category to the list of <kbd>&lt;categoryAttributes&gt;</kbd> 
          in ERDDAP's 
          <a rel="help" href="https://erddap.github.io/setup.html#setup.xml">setup.xml</a> 
            file, users can easily find datasets with similar data via ERDDAP's
          "Search for Datasets by Category" on the home page.
          <br><a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/categorize/ioos_category/index.html?page=1&amp;itemsPerPage=1000"
          >Try using ioos_category to search for datasets of interest.</a>
        <li>There was 
        <a rel="bookmark" 
href="https://groups.google.com/forum/#!topic/erddap/TnwbgzpSS0w"
>a discussion about ERDDAP™ and ioos_category in the ERDDAP™ Google Group.<img 
  src="../images/external.png" alt=" (external link)" 
  title="This is a link to an external website."/></a>
        </ul>

        <p>You may be tempted to set <kbd>&lt;variablesMustHaveIoosCategory&gt;</kbd> 
        to <kbd>false</kbd> so that this attribute isn't required. ("Pfft! What's it to me?")
        Some reasons to leave it set to <kbd>true</kbd> (the default) and use ioos_category are:
        <ul>
        <li>If setup.xml's <kbd>&lt;variablesMustHaveIoosCategory&gt;</kbd> is set to true, 
          <a rel="help" href="#GenerateDatasetsXml">GenerateDatasetsXml</a> 
          always creates/suggests an ioos_category attribute for each variable 
          in each new dataset. So why not just leave it in?
        <li>ERDDAP™ lets users search for datasets of interest by category.
          ioos_category is a very useful search category because the ioos_categories 
          (for example, Temperature) are quite broad.
          This makes ioos_category much better for this purpose than, for example,
          the much finer-grained CF standard_names 
          (which aren't so good for this purpose because of all
          the synonyms and slight variations, for example, 
          sea_surface_temperature versus sea_water_temperature).
          <br>(Using ioos_category for this purpose is controlled by 
          <kbd>&lt;categoryAttributes&gt;</kbd> in your setup.xml file.)
          <br><a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/categorize/ioos_category/index.html?page=1&amp;itemsPerPage=1000"
          >Try using ioos_category to search for datasets of interest.</a>
        <li>These categories are from 
          <a rel="help" href="https://ioos.noaa.gov/">NOAA's 
          Integrated Ocean Observing System (IOOS)<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>. 
          These categories
          are fundamental to IOOS's description of IOOS's mission.
          If you are in NOAA, supporting ioos_category is a good One-NOAA thing to do.
          (Watch this 
          <a rel="help" href="https://www.youtube.com/watch?v=nBnCsMYm2yQ">One NOAA video<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
          and be inspired!)
          If you are in some other U.S. or international agency, 
          or work with governmental agencies, or work with some other Ocean Observing System, 
          isn't it a good idea to cooperate with the U.S. IOOS office?                    
        <li>Sooner or later, you may want some other ERDDAP™ to link to your datasets via
          <a rel="help" href="#EDDGridFromErddap">EDDGridFromErddap</a> and
          <a rel="help" href="#EDDTableFromErddap">EDDTableFromErddap</a>.
          If the other ERDDAP™ requires ioos_category, your datasets must have
          ioos_category in order for EDDGridFromErddap and EDDTableFromErddap to work.
        <li>It is psychologically much easier to include ioos_category 
          when you create the dataset
          (it's just another thing that ERDDAP™ requires to add the dataset to ERDDAP),
          than to add it after the fact (if you decide to use it in the future).
          <br>&nbsp;
        </ul>
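        <p>The two setup.xml settings mentioned above might look roughly like this. 
        This is only a sketch: the list of categoryAttributes is shortened for illustration, 
        and the list in your setup.xml will probably include other attributes.
<pre>
&lt;!-- in setup.xml --&gt;
&lt;variablesMustHaveIoosCategory&gt;true&lt;/variablesMustHaveIoosCategory&gt;
&lt;categoryAttributes&gt;ioos_category, long_name, standard_name&lt;/categoryAttributes&gt;
</pre>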
  
      <li><a class="selfLink" id="long_name" href="#long_name" rel="bookmark"><strong>long_name</strong></a> 
        (<a rel="help" href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
        >COARDS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        and
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standards) 
        is a RECOMMENDED variable attribute in ERDDAP.  For example,
        <br><kbd>&lt;att name="long_name"&gt;Eastward Sea Water Velocity&lt;/att&gt;</kbd>
        <ul>
        <li>ERDDAP™ uses the long_name for labeling axes on graphs.
        <li>Best practices: Capitalize the words in the long_name as if it were a title
          (capitalize the first word and all non-article words).
          Don't include the units in the long_name. 
          The long name shouldn't be very long (usually &lt;20 characters), 
          but should be more
          descriptive than the <a rel="help" href="#destinationName">destinationName</a>,
          which is often very concise.
        <li>If "long_name" isn't defined in the 
          variable's <a rel="help" href="#variableAttributes">sourceAttributes</a> or 
          <kbd>&lt;addAttributes&gt;</kbd>, 
          ERDDAP™ will generate it by cleaning up the 
          <a rel="help" href="#standard_name">standard_name</a> (if present) or 
          the destinationName. 
          <br>&nbsp;
        </ul>

      <li><a class="selfLink" id="missing_value" href="#missing_value" rel="bookmark"><strong>missing_value</strong></a> and
        <a class="selfLink" id="FillValue" href="#FillValue" rel="bookmark"><strong>_FillValue</strong></a> 
        (<a rel="help" href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
        >COARDS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        and
        <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>)
        are variable attributes which describe a number
        (for example, -9999) which is used to represent a missing value.  For example,
        <br><kbd>&lt;att name="missing_value" 
          <a rel="help" href="#attributeType">type="double"</a>&gt;-9999&lt;/att&gt;</kbd>
        <br>For String variables, the default for both is "" (the empty string).
        <br>For numeric variables, the default for both is NaN.
        <ul>
        <li>ERDDAP™ supports both missing_value and _FillValue, since some data 
          sources assign slightly different meanings to them.
        <li>If present, they should be of the same data type as the variable.
        <li>If the data is packed with 
          <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>,
          the missing_value and _FillValue values should be likewise packed.
          Similarly, for a column with String date/time values that use a local
          <a rel="help" 
          href="#time_zone"><kbd>time_zone</kbd></a>,
          the missing_value and _FillValue values should use the local time zone.
        <li>If a variable's data uses a special number to mark missing values, 
          the missing_value and/or _FillValue attributes are REQUIRED.
        <li>For 
          <a rel="help" href="#timeUnits">time and timestamp variables</a>
          (whether the source is strings or numeric),
          missing_values and _FillValues appear as "" (the empty string) when
          the time is written as a String and as NaN when the time is written 
          as a double. 
          The source values for missing_value and _FillValue will not appear 
          in the variable's metadata.
        <li>For String variables, ERDDAP™ always converts any missing_values or _FillValue
          data values into "" (the empty string). 
          The source values for missing_value and _FillValue will not appear 
          in the variable's metadata.
        <li>For numeric variables:
          <br>The missing_value and _FillValue will appear in the variable's metadata.
          <br>For some output data formats, 
            ERDDAP™ will leave these special numbers intact, e.g., you will see -9999.
          <br>For other output data formats (notably text-like formats like 
            .csv and .htmlTable), 
            ERDDAP™ will replace these special numbers with NaN or "".
        <li><a class="selfLink" id="inherentMissingValues" href="#inherentMissingValues" rel="bookmark">Some</a>
           data types have inherent missing value markers
           that don't need to be explicitly identified with missing_value or
           _FillValue attributes: float and double variables have NaN (Not a Number), 
           String values use the empty string, and char values have character
           \uffff (character #65535, which is Unicode's value for Not a Character).
           Integer data types do not have inherent missing value markers.
        <li>If an integer variable has a missing value (for example, an empty 
            position in a .csv file), ERDDAP™ will interpret the value as 
            the defined missing_value or _FillValue for that variable.
            If none is defined, ERDDAP™ will interpret the value as the 
            default missing value for that data type, which is always
            the maximum value which can be held by that data type: 
            <br>127 for byte variables, 32767 for short, 2147483647 for int, 9223372036854775807 for long, 
            <br>255 for ubyte, 65535 for ushort, 4294967295 for uint, and 18446744073709551615 for ulong.
        <li><a class="selfLink" id="addFillValueAttributes" href="#addFillValueAttributes" rel="bookmark">ADD _FillValue ATTRIBUTES?</a>                  
            <br>Each time ERDDAP™ loads a dataset, it checks if the variables 
            with integer source data types have a defined missing_value or
            _FillValue attribute. 
            If a variable doesn't, then ERDDAP™ prints a message to the log file
            (starting with "Add _FillValue Attribute?")
            recommending that the ERDDAP™ administrator add a _FillValue
            attribute for this variable in datasets.xml. 
            It is very useful for every variable to have a _FillValue or missing_value
            because missing values are always possible, e.g., 
            if a given file in a dataset doesn't have a given variable,
            ERDDAP™ needs to be able to present that variable as having 
            all missing values for that variable.
            If you decide a variable should not have a _FillValue attribute, you can add
            <br><kbd>&lt;att name="_FillValue"&gt;null&lt;/att&gt;</kbd>
            instead, which will suppress the message for that datasetID+variable
            combination in the future. 
            <p>Each time ERDDAP™ starts up, it collects all of those recommendations
            into a message which is written to the log file
            (starting with "ADD _FillValue ATTRIBUTES?"),
            emailed to the ERDDAP™ administrator,
            and written to a CSV data file in the [bigParentDirectory]/logs/ directory.
            If you wish to, 
            you can use the GenerateDatasetsXml program (and the <kbd>AddFillValueAttributes</kbd> option)
            to apply all the suggestions in the CSV file to the datasets.xml file.
            For any of the datasetID/variable combinations in that file, 
            if you decide there is no need to add the attribute, you can
            change the attribute to <kbd>&lt;att name="_FillValue"&gt;null&lt;/att&gt;</kbd>
            to suppress the recommendation for that datasetID+variable
            combination in the future.
            <p>This is important! 
            <br>As Bob has often said: it would be bad (and embarrassing)
            if some of the evidence of global warming was caused by 
            unidentified missing values in the data (e.g., temperature values 
            of 99 or 127 degree_C that should have been marked as missing values and thus
            skewed the mean and/or median statistics higher).
        <li><a class="selfLink" id="consistentMissingValues" href="#consistentMissingValues" rel="bookmark"
          >The _FillValue and missing_value values</a>
          for a given variable in different source files must be consistent;
          otherwise, ERDDAP™ will accept files with one set of values and 
          reject all of the other files as "Bad Files".
          To solve the problem,
          <ul>
          <li>If the files are gridded .nc files, you can use 
            <a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a>.
          <li>If the files are tabular data files, you can use
            EDDTableFrom...Files'
            <a rel="help" href="#EDDTableFromFiles_standardizeWhat">standardizeWhat</a> to tell ERDDAP
            to standardize the source files as they are read into ERDDAP.
          <li>For harder problems, you can use
            <a rel="help" href="#NcML">NcML</a> or
            <a rel="help" href="#NCO">NCO</a>
            to solve the problem.
            <br>&nbsp;
          </ul>       
        </ul>
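        <p>To summarize the integer case, here is a sketch of the two choices 
        for a variable with an integer source data type 
        (the data type and the value -9999 are assumptions for illustration; 
        use whatever marker your source files actually use):
<pre>
&lt;!-- Choice 1: a hypothetical short (int16) variable that uses -9999 
     in the source files to mark missing data --&gt;
&lt;att name="_FillValue" type="short"&gt;-9999&lt;/att&gt;

&lt;!-- Choice 2: you have decided that the variable should not have a _FillValue;
     this suppresses the "Add _FillValue Attribute?" message --&gt;
&lt;att name="_FillValue"&gt;null&lt;/att&gt;
</pre>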

      <li><a class="selfLink" id="scale_factor" href="#scale_factor" rel="bookmark"><strong>scale_factor</strong></a> (default = 1) and 
        <a class="selfLink" id="add_offset" href="#add_offset" rel="bookmark"><strong>add_offset</strong></a> (default = 0)
        (<a rel="help" 
        href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
        >COARDS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
        and
        <a rel="help"
        href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>)
        are OPTIONAL variable attributes which describe data which is packed in a 
        simpler data type via a simple transformation.
        <ul>
        <li>If present, their data type is different from the source data type and describes the
          data type of the destination values.
          <br>For example, a data source might have stored float data values with 
            one decimal digit
            packed as short ints (int16), using scale_factor = 0.1 and add_offset = 0. 
            For example,
          <br><kbd>&lt;att name="scale_factor" 
            <a rel="help" href="#attributeType">type="float"</a>&gt;0.1&lt;/att&gt;</kbd>
          <br><kbd>&lt;att name="add_offset" 
            <a rel="help" href="#attributeType">type="float"</a>&gt;0&lt;/att&gt;</kbd>
          <br>In this example, ERDDAP™ would unpack the data and present it to the 
            user as float data values. 
        <li>If present, ERDDAP™ will extract the values from these attributes,
          remove the attributes, and automatically unpack the data for the user:
          <br><kbd>destinationValue = sourceValue * scale_factor + add_offset</kbd>
          <br>Or, stated another way:
          <br><kbd>unpackedValue = packedValue * scale_factor + add_offset</kbd>
        <li><a class="selfLink" id="consistentScaleAddOffset" href="#consistentScaleAddOffset" rel="bookmark"
          >The scale_factor and add_offset values</a>
          for a given variable in different source files must be consistent;
          otherwise, ERDDAP™ will accept files with one set of values and 
          reject all of the other files as "Bad Files".
          To solve the problem,
          <ul>
          <li>If the files are gridded .nc files, you can use 
            <a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a>.
          <li>If the files are tabular data files, you can use
            EDDTableFrom...Files'
            <a rel="help" href="#EDDTableFromFiles_standardizeWhat">standardizeWhat</a> to tell ERDDAP
            to standardize the source files as they are read into ERDDAP.
          <li>For harder problems, you can use
            <a rel="help" href="#NcML">NcML</a> or
            <a rel="help" href="#NCO">NCO</a>
            to solve the problem.
            <br>&nbsp;
          </ul>       
        </ul>
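        <p>As a worked example of the unpacking formula, 
        consider a hypothetical temperature variable that is stored in the source files 
        as short values (the attribute values and the data value are made up for illustration):
<pre>
&lt;att name="scale_factor" type="float"&gt;0.01&lt;/att&gt;
&lt;att name="add_offset" type="float"&gt;20&lt;/att&gt;
&lt;!-- A packed source value of 215 is unpacked as
       unpackedValue = packedValue * scale_factor + add_offset
                     = 215 * 0.01 + 20
                     = 22.15
     and is presented to the user as a float. --&gt;
</pre>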

      <li><a class="selfLink" id="standard_name" href="#standard_name" rel="bookmark"><strong>standard_name</strong></a> 
        (from the
        <a rel="help"
        href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        is a RECOMMENDED variable attribute in ERDDAP.
        CF maintains the list of allowed
          <a rel="help" href="https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html"
          >CF standard names<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.  
        For example,
        <br><kbd>&lt;att name="standard_name"&gt;eastward_sea_water_velocity&lt;/att&gt;</kbd>
        <ul>
        <li>If you add <kbd>standard_name</kbd> to variables' attributes and add 
          <kbd>standard_name</kbd> to the list of
          <kbd>&lt;categoryAttributes&gt;</kbd> in ERDDAP's 
            <a rel="help" href="https://erddap.github.io/setup.html#setup.xml"
            >setup.xml</a> 
            file, users can easily find datasets with
          similar data via ERDDAP's "Search for Datasets by Category" on the home page.
        <li>If you specify a CF standard_name for a variable, 
          the units attribute for the variable
          doesn't have to be identical to the Canonical Units specified for the 
          standard name in the CF Standard Name table, 
          but the units MUST be convertible to the Canonical Units.
          For example, all temperature-related
          CF standard_names have "K" (Kelvin) as the Canonical Units. So a variable
          with a temperature-related standard_name
          MUST have units of K, degree_C, degree_F, or some UDUnits variant of those names,
          since they are all inter-convertible.
        <li>Best practices: Part of the power of 
            <a rel="help" href="https://en.wikipedia.org/wiki/Controlled_vocabulary"
            >controlled vocabularies<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
            comes from using only the terms
          in the list. 
          So we recommend sticking to the terms defined in the controlled vocabulary,
          and we recommend against making up a term if there isn't an appropriate 
          one in the list.
          If you need additional terms, see if the standards committee will add 
          them to the controlled  vocabulary.
        <li>standard_name values are the only CF attribute values which are
          case sensitive. They are always all lowercase. 
          Starting in ERDDAP™ v1.82, GenerateDatasetsXml will convert uppercase 
          letters to lowercase letters. And when a dataset is loaded in ERDDAP,
          uppercase letters are silently changed to lowercase letters.
          <br>&nbsp;
        </ul>
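        <p>As a sketch of how standard_name relates to the other attributes, 
        here is a hypothetical temperature variable 
        (the source and destination names are assumptions for illustration). 
        The units are <kbd>degree_C</kbd>, which differ from the Canonical Units (K) 
        but are convertible to them.
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;temp&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;temperature&lt;/destinationName&gt;
    &lt;addAttributes&gt;
        &lt;att name="ioos_category"&gt;Temperature&lt;/att&gt;
        &lt;att name="long_name"&gt;Sea Water Temperature&lt;/att&gt;
        &lt;att name="standard_name"&gt;sea_water_temperature&lt;/att&gt;
        &lt;att name="units"&gt;degree_C&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>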

      <li><a class="selfLink" id="time_precision" href="#time_precision" rel="bookmark"><strong>time_precision</strong></a> 
        <ul>
        <li>time_precision is an OPTIONAL attribute used by ERDDAP™ (and no metadata standards)
          for <a rel="help" href="#timeUnits">time and timestamp variables</a>,
          which may be in gridded datasets or tabular datasets, 
          and in axisVariables or dataVariables.  For example,
          <br><kbd>&lt;att name="time_precision"&gt;1970-01-01&lt;/att&gt;</kbd>
          <br>time_precision specifies the precision to be used whenever ERDDAP™ formats the time
          values from that variable as strings on web pages, including .htmlTable responses.
          In file formats where ERDDAP™ formats times as strings (for example, .csv and .json),
          ERDDAP™ only uses the time_precision-specified format if it includes 
          fractional seconds; otherwise, ERDDAP™ uses the 1970-01-01T00:00:00Z format.
        <li>Valid values are <kbd>1970-01, 1970-01-01, 1970-01-01T00Z,
          1970-01-01T00:00Z, 1970-01-01T00:00:00Z</kbd> (the default), 
          <kbd>1970-01-01T00:00:00.0Z, 1970-01-01T00:00:00.00Z, 1970-01-01T00:00:00.000Z</kbd>.
          [<kbd>1970</kbd> is not an option because it is a single number,
          so ERDDAP™ can't know if it is a formatted time string (a year) or if it is 
          some number of seconds since 1970-01-01T00:00:00Z.]
        <li>If time_precision isn't specified or the value isn't matched,
          the default value will be used.
        <li>Here, as in other parts of ERDDAP™, any fields of the formatted time that are 
          not displayed are assumed to have the minimum value.  For example,
          <kbd>1985-07, 1985-07-01, 1985-07-01T00Z, 1985-07-01T00:00Z,</kbd> 
          and <kbd>1985-07-01T00:00:00Z</kbd> are all considered equivalent,
          although with different levels of precision implied.
          This matches the 
          <a rel="help" href="https://www.iso.org/iso/date_and_time_format"
          >ISO 8601:2004 "extended" Time Format Specification<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>.
        <li><strong>WARNING:</strong> You should only use a limited time_precision if 
          <strong>all</strong> of the data values for the variable have only the minimum value 
          for all of the fields that are hidden.
          <ul>
          <li>For example, you can use a time_precision of <kbd>1970-01-01</kbd> 
            if all of the data values have hour=0, minute=0, and second=0 (for example
            <kbd>2005-03-04T00:00:00Z</kbd> and <kbd>2005-03-05T00:00:00Z</kbd>). 
          <li>For example, don't use a time_precision of <kbd>1970-01-01</kbd> 
            if there are non-0 hour, minute, or second values
            (for example <kbd>2005-03-05T12:00:00Z</kbd>),
            because the non-default hour value wouldn't be displayed.
            If you do, and a user asks for all data with <kbd>time=2005-03-05</kbd>,
            the request will fail unexpectedly.
            <br>&nbsp;
          </ul>
        </ul>
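        <p>A minimal sketch of a date-only time variable in datasets.xml.
        The sourceName and the units value are illustrative assumptions, and the sketch 
        assumes every source value falls exactly at 00:00:00Z, as the warning above requires:
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;time&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;days since 1970-01-01&lt;/att&gt;
        &lt;!-- hide hour, minute, second: only safe if they are always 0 --&gt;
        &lt;att name="time_precision"&gt;1970-01-01&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>
        <p>With this setting, a value such as 2005-03-04T00:00:00Z should appear as
        <kbd>2005-03-04</kbd> on web pages and in .htmlTable responses, while .csv and .json
        responses should still use the full 1970-01-01T00:00:00Z format.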

      <li><a class="selfLink" id="time_zone" href="#time_zone" rel="bookmark"><strong>time_zone</strong></a> 
        <ul>
        <li>time_zone is an OPTIONAL attribute used by ERDDAP™ (and no metadata standards)
          for <a rel="help" href="#timeUnits">time and timestamp variables</a>,
          which may be in gridded datasets or tabular datasets.
        <li>The default is "Zulu" (which is the modern time zone version of GMT).
        <li>Background information: "time offsets" (e.g., Pacific Standard Time,
          -08:00, GMT-8) are fixed, specific offsets relative to Zulu (GMT).
          In contrast, "time zones" are the much more complex things that are affected by 
          Daylight Saving (e.g., "US/Pacific"),
          which have had different rules in different places at different times. 
          The time zones always have names since they can't be summarized by a simple offset value
          (see the "TZ database names" column in the table at
          <a rel="help" 
            href="https://en.wikipedia.org/wiki/List_of_tz_database_time_zones"
            >https://en.wikipedia.org/wiki/List_of_tz_database_time_zones<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>).
          ERDDAP's time_zone attribute helps you deal with local time data from some time zone
          (e.g., 1987-03-25T17:32:05 Pacific Time). 
          If you have string or numeric time data with a (fixed) time offset, 
          you should simply adjust the data to Zulu (which is what ERDDAP™ wants) 
          by specifying a different base time in the units attribute (e.g., 
          "hours since 1970-01-01T08:00:00Z", note the T08 to specify the time offset),
          and always check the results to ensure you get the results you want.

        <li>For timestamp variables with source data from Strings, this attribute
          lets you specify a time zone which leads ERDDAP™ to convert the 
          local-time-zone source times (some in Standard time, 
          some in Daylight Saving time) into Zulu times (which are always in Standard time).
          The list of valid time zone names is probably identical to the list 
          in the TZ column at
          <a rel="help" 
            href="https://en.wikipedia.org/wiki/List_of_tz_database_time_zones"
            >https://en.wikipedia.org/wiki/List_of_tz_database_time_zones<img 
            src="../images/external.png" alt=" (external link)" 
            title="This link to an external website does not constitute an endorsement."></a>.
          Common US time zones are:
          US/Hawaii, US/Alaska, US/Pacific, US/Mountain, US/Arizona, US/Central, US/Eastern.
        <li>For timestamp variables with numeric source data, you can specify the
          "time_zone" attribute, but the value must be "Zulu" or "UTC". If you 
          need support for other time zones, please email Chris.John at noaa.gov .
          <br>&nbsp;
        </ul>
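        <p>The two cases above might be sketched as follows. The variable names are
        hypothetical, and the String pattern assumes source values shaped like 
        1987-03-25T17:32:05 with no zone suffix:
<pre>
&lt;!-- String source times recorded in local (Pacific) time --&gt;
&lt;dataVariable&gt;
    &lt;sourceName&gt;localTime&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;att name="units"&gt;yyyy-MM-dd'T'HH:mm:ss&lt;/att&gt;
        &lt;att name="time_zone"&gt;US/Pacific&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;

&lt;!-- Numeric source times with a fixed offset: no time_zone attribute;
     shift the base time in the units instead --&gt;
&lt;att name="units"&gt;hours since 1970-01-01T08:00:00Z&lt;/att&gt;
</pre>
        <p>In either case, compare a few converted values against the source data to confirm
        that the Zulu times ERDDAP™ shows are the ones you expect.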

      <li><a class="selfLink" id="units" href="#units" rel="bookmark"><strong>units</strong></a> 
        (<a rel="help" href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
        >COARDS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
        <a rel="help"
        href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html">CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
         and
        <a rel="help" href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3"
        >ACDD<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata standard) 
        defines the units of the data values.   For example,
        <br><kbd>&lt;att name="units"&gt;degree_C&lt;/att&gt;</kbd>
        <ul>
        <li>"units" is REQUIRED as either a sourceAttribute or an addAttribute 
          for "time" variables
          and is STRONGLY RECOMMENDED for other variables whenever appropriate (which is
          almost always).
        <li>In general, we recommend 
          <a rel="help" href="https://www.unidata.ucar.edu/software/udunits/">UDUnits<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."
          ></a>-compatible units, which are required by the 
          <a rel="help" href="https://ferret.pmel.noaa.gov/noaa_coop/coop_cdf_profile.html"
          >COARDS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
          and
          <a rel="help" href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
          >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
          standards.
        <li>Another common standard is <a rel="help" href="https://unitsofmeasure.org/ucum.html">UCUM<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
          -- the Unified Code for Units of Measure.
          <a rel="help" href="https://www.ogc.org/">OGC<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        services such as
            <a rel="help" href="https://www.ogc.org/standards/sos">SOS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>,
            <a rel="help" href="https://www.ogc.org/standards/wcs">WCS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>, 
        and
            <a rel="help" href="https://www.ogc.org/standards/wms">WMS<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
            require UCUM and often refer to UCUM 
            as UOM (Units Of Measure).
        <li>We recommend that you use one units standard for all datasets in your ERDDAP.
          You should tell ERDDAP™ which standard you are using with 
          <kbd>&lt;units_standard&gt;</kbd>,
          in your
            <a rel="help" 
            href="https://erddap.github.io/setup.html#setup.xml"
            >setup.xml</a>
            file.
        <li><a class="selfLink" id="consistentTimeUnits" href="#consistentTimeUnits" rel="bookmark"
          >The units for a given variable in different source files must be consistent.</a>
          If you have a collection of data files where 
          one subset of the files uses different units values
          than one or more other subsets of the files (for example, 
          <br>"days since 1985-01-01" versus "days since 2000-01-01", 
          <br>"degree_Celsius" versus "deg_C", or
          <br>"knots" versus "m/s")
          you need to find a way to standardize the units values;
          otherwise, ERDDAP™ will only load one subset of the files.
          Think about it: if one file has windSpeed units=knots and 
          another has windSpeed units=m/s, then the values from the two files 
          shouldn't be included in the same aggregated dataset. 
          <ul>
          <li>If the files are gridded .nc files, in many situations you can use 
            <a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a>.
          <li>If the files are tabular data files, in many situations you can use
            EDDTableFrom...Files'
            <a rel="help" href="#EDDTableFromFiles_standardizeWhat">standardizeWhat</a> to tell ERDDAP
            to standardize the source files as they are read into ERDDAP.
          <li>For harder problems, you can use
            <a rel="help" href="#NcML">NcML</a> or
            <a rel="help" href="#NCO">NCO</a>
            to solve the problem.
          </ul>

        <li>The CF standard section 8.1 says that if a variable's data is packed via 
           <a rel="help" href="#scale_factor">scale_factor and/or add_offset</a>,
           "The units of a variable should be representative of the unpacked data."
        <li><a class="selfLink" id="timeUnits" href="#timeUnits" rel="bookmark"
          >For time and timestamp variables,</a> either the variable's <a rel="help" href="#variableAttributes">sourceAttributes</a>
          or <kbd>&lt;addAttributes&gt;</kbd> (which takes precedence) MUST have 
            <kbd><a rel="help" href="#units">units</a></kbd> which is either 
          <ul>
          <li>For time axis variables or time data variables with numeric data: a
            <a rel="help" href="https://www.unidata.ucar.edu/software/udunits/">UDUnits<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."
          ></a>-compatible
            string (with the format <kbd><i>units</i> since <i>baseTime</i></kbd>) 
            describing how to
            interpret source time values 
            (for example, <kbd>seconds since 1970-01-01T00:00:00Z</kbd>).

            <p><kbd><i>units</i></kbd> can be any one of:
            <br>ms, msec, msecs, millis, millisec, millisecs, millisecond, milliseconds, 
            <br>s, sec, secs, second, seconds, m, min, mins, minute, minutes, 
              h, hr, hrs, hour, hours, 
            <br>d, day, days, week, weeks, mon, mons, month, months, yr, yrs, year, or years.
            <br>Technically, ERDDAP™ does NOT follow the UDUNITS standard when
              converting "years since" and
              "months since" time values to "seconds since". The UDUNITS standard 
              defines a year as a 
              fixed, single value: 3.15569259747e7 seconds. 
              And UDUNITS defines a month as year/12.
              Unfortunately, most/all datasets that we have seen that use "years since" or 
              "months since" clearly intend the values to be calendar years or calendar months. 
              For example, 3 "months since 1970-01-01" is usually intended to mean 1970-04-01.
              So, ERDDAP™ interprets "years since" and "months since" as calendar 
              years and months,
              and does not strictly follow the UDUNITS standard.
             
             <p>The <kbd><i>baseTime</i></kbd> must be an ISO 8601:2004(E) 
                 formatted date time string 
             <span class="nowrap">(yyyy-MM-dd'T'HH:mm:ssZ,</span> for example, 
             <span class="nowrap">1970-01-01T00:00:00Z),</span> 
             or some variation of that (for example, with parts missing at the end).
             ERDDAP™ tries to work with a wide range of variations of that ideal 
             format, for example, "1970-1-1 0:0:0" is supported.
             If the time zone information is missing, it is assumed to be the
             Zulu time zone (AKA GMT).
             Even if another time offset is specified, ERDDAP™ never uses Daylight Saving Time.
             If the baseTime uses some other format, you must use
             <kbd>&lt;addAttributes&gt;</kbd> to specify a new units string which 
             uses a variation of the ISO 8601:2004(E) format (e.g.,
             change <kbd>days since Jan 1, 1985</kbd> into 
             <kbd>days since 1985-01-01</kbd>).

             <p>You can test ERDDAP's ability to deal with a specific <kbd><i>units</i> since <i>baseTime</i></kbd>
             with ERDDAP's 
             <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html"
             >Time Converter</a>.
             Hopefully, you can plug in a number (the first time value from the data source?) 
             and a units string, click on <kbd>Convert</kbd>, and ERDDAP™ will 
             be able to convert it into an ISO 8601:2004(E) formatted date time string.
             The converter will return an error message if the units string isn't recognizable.

          <li><a class="selfLink" id="stringTimeUnits" href="#stringTimeUnits" rel="bookmark"
            >For the units attribute for time or timestamp data variables with String data,</a>           
            you must specify a 
            <a rel="help" 
      href="https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html"
      >java.time.DateTimeFormatter<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a>
            pattern (which is mostly compatible with java.text.SimpleDateFormat) 
            which describes how to interpret the 
            string times. 
            
            <p>For the commonly used time formats that are variations of the 
            ISO 8601:2004(E) standard format (for example, 2018-01-02T00:00:00Z),
            you can specify variations of <kbd>yyyy-MM-dd'T'HH:mm:ssZ</kbd>,
            for example, use <kbd>yyyy-MM-dd</kbd> if the string time only has a date.
            For any format that starts with <kbd>yyyy-M</kbd>, ERDDAP
            uses a special parser that is very forgiving of minor variations
            in the format. The parser can handle time zones in the 
            'Z', "UTC", "GMT", &plusmn;XX:XX, &plusmn;XXXX, and &plusmn;XX
            formats. If parts of the date time are not specified (for example, 
            minutes and seconds), ERDDAP™ assumes the lowest value for that field 
            (e.g., if seconds aren't specified, seconds=0 is assumed).

            <p>For all other string time formats, you need to precisely
            specify a DateTimeFormatter-compatible time format string.
            Like <kbd>yyyy-MM-dd'T'HH:mm:ssZ</kbd>, these format strings are built 
            from characters which identify a specific type of information
            from the time string, e.g., <kbd>m</kbd> means minute-of-hour.
            If you repeat the format character some number of times, it 
            further refines the meaning, e.g., <kbd>m</kbd> means that the value
            may be specified by any number of digits, <kbd>mm</kbd> means that the
            value must be specified by 2 digits.
            The Java documentation for DateTimeFormatter is a crude overview
            and does not make these details clear.
            So here is a list of format character variations and their meaning
            within ERDDAP™ (which is sometimes slightly different from Java's DateTimeFormatter):
<table class="erd">
<tr><th>Characters</th><th>Examples</th><th>Meaning</th></tr>
<tr>
<td>u, y, Y</td>
<td>-4712, 0, 1, 10, 100, 2018</td>
<td>a year number, any number of digits. 
ERDDAP™ treats y (year-of-era) and Y (week-based-year, because this is often mistakenly 
used instead of y) as u, the 
<a rel="help" href="https://en.wikipedia.org/wiki/Astronomical_year_numbering"
    >astronomical year number<img 
    src="../images/external.png" alt=" (external link)" 
    title="This link to an external website does not constitute an endorsement."></a>. 
Astronomical years are positive
or negative integers that don't use the BCE (BC) or CE (AD) era designators: 
2018=2018CE, ..., 2=2CE, 1=1CE, 0=1BCE, -1=2BCE, -2=3BCE, ...</td>
</tr>
<tr>
<td>uuuu, yyyy, YYYY</td>
<td>-4712, 0000, 0001, 0010, 0100, 2018</td>
<td>a 4 digit astronomical year number (ignoring any preceding '-')</td>
</tr>
<tr>
<td>M</td>
<td>1, 01, 12</td>
<td>a month number, any number of digits (1=January)</td>
</tr>
<tr>
<td>MM</td>
<td>01, 12</td>
<td>a 2 digit (zero padded) month number</td>
</tr>
<tr>
<td>MMM</td>
<td>Jan, jan, JAN</td>
<td>a 3 letter English month name, case insensitive</td>
</tr>
<tr>
<td>MMMM</td>
<td>Jan, jan, JAN, January, january, JANUARY</td>
<td>a 3 letter or full English month name, case insensitive</td>
</tr>
<tr>
<td>d</td>
<td>1, 01, 31</td>
<td>a day-of-month number, any number of digits</td>
</tr>
<tr>
<td>dd</td>
<td>01, 31</td>
<td>a 2 digit (zero padded) day-of-month. The first 'digit' may be a space.</td>
</tr>
<tr>
<td>D</td>
<td>1, 001, 366</td>
<td>day-of-year, any number of digits, 001=Jan 1</td>
</tr>
<tr>
<td>DDD</td>
<td>001, 366</td>
<td>day-of-year, 3 digits, 001=Jan 1</td>
</tr>
<tr>
<td>EEE</td>
<td>thu, THU, Thu</td>
<td>a 3 letter day-of-week, value is ignored when parsing</td>
</tr>
<tr>
<td>EEEE</td>
<td>thu, THU, Thu, thursday, THURSDAY, Thursday</td>
<td>a 3 letter or full English day-of-week, case insensitive, value is ignored when parsing</td>
</tr>
<tr>
<td>H</td>
<td>0, 00, 23</td>
<td>hour-of-day (0-23), any number of digits</td>
</tr>
<tr>
<td>HH</td>
<td>00, 23</td>
<td>hour-of-day (00-23), 2 digits. The first 'digit' may be a space.</td>
</tr>
<tr>
<td>a</td>
<td>am, AM, pm, PM</td>
<td>AM or PM, case-insensitive</td>
</tr>
<tr>
<td>h</td>
<td>12, 1, 01, 11</td>
<td>clock-hour-of-am-pm (12, 1, 2, ... 11), any number of digits</td>
</tr>
<tr>
<td>hh</td>
<td>12, 01, 11</td>
<td>clock-hour-of-am-pm (12, 1, 2, ... 11), 2 digits. The first 'digit' may be a space.</td>
</tr>
<tr>
<td>K</td>
<td>0, 1, 11</td>
<td>hour-of-am-pm (0, 1, ...11), any number of digits</td>
</tr>
<tr>
<td>KK</td>
<td>00, 01, 11</td>
<td>hour-of-am-pm, 2 digits</td>
</tr>
<tr>
<td>m</td>
<td>0, 00, 59</td>
<td>minute-of-hour, any number of digits</td>
</tr>
<tr>
<td>mm</td>
<td>00, 59</td>
<td>minute-of-hour, 2 digits</td>
</tr>
<tr>
<td>s</td>
<td>0, 00, 59</td>
<td>second-of-minute, any number of digits</td>
</tr>
<tr>
<td>ss</td>
<td>00, 59</td>
<td>second-of-minute, 2 digits</td>
</tr>
<tr>
<td>S</td>
<td>0, 000, 9, 999</td>
<td>fraction-of-second, as if following a decimal point, any number of digits</td>
</tr>
<tr>
<td>SS</td>
<td>00, 99 </td>
<td>hundredths of a second, 2 digits</td>
</tr>
<tr>
<td>SSS</td>
<td>000, 999</td>
<td>thousandths of a second, 3 digits</td>
</tr>
<tr>
<td>A</td>
<td>0, 0000, 86399999</td>
<td>millisecond-of-day, any number of digits</td>
</tr>
<tr>
<td>AAAAAAAA</td>
<td>00000000, 86399999</td>
<td>millisecond-of-day, 8 digits</td>
</tr>
<tr>
<td>N</td>
<td>0, 00000000000000, 86399999999999</td>
<td>nanosecond-of-day, any number of digits. In ERDDAP™, this is truncated to nMillis.</td>
</tr>
<tr>
<td>NNNNNNNNNNNNNN</td>
<td>00000000000000, 86399999999999</td>
<td>nanosecond-of-day, 14 digits. In ERDDAP™ this is truncated to nMillis.</td>
</tr>
<tr>
<td>n</td>
<td>0, 00000000000, 59999999999</td>
<td>nanosecond-of-second, any number of digits. In ERDDAP™ this is truncated to nMillis.</td>
</tr>
<tr>
<td>nnnnnnnnnnn</td>
<td>00000000000, 59999999999</td>
<td>nanosecond-of-second, 11 digits. In ERDDAP™ this is truncated to nMillis.</td>
</tr>
<tr>
<td>XXX, ZZZ</td>
<td>Z, -08:00, +01:00</td>
<td>a time zone with the format 'Z' or &plusmn;(2 digit hour offset):(2 digit minute offset).
This treats <i>space</i> as + (non-standard).
ZZZ supporting 'Z' is non-standard but deals with a common user error.</td>
</tr>
<tr>
<td>XX, ZZ</td>
<td>Z, -0800, +0100</td>
<td>a time zone with the format 'Z' or &plusmn;(2 digit hour offset)(2 digit minute offset).
This treats <i>space</i> as + (non-standard).
ZZ supporting 'Z' is non-standard but deals with a common user error.</td>
</tr>
<tr>
<td>X, Z</td>
<td>Z, -08, +01</td>
<td>a time zone with the format 'Z' or &plusmn;(2 digit hour offset).
This treats <i>space</i> as + (non-standard).
Z supporting 'Z' is non-standard but deals with a common user error.</td>
</tr>
<tr>
<td>xxx</td>
<td>-08:00, +01:00</td>
<td>a time zone with the format &plusmn;(2 digit hour offset):(2 digit minute offset).
This treats <i>space</i> as + (non-standard).</td>
</tr>
<tr>
<td>xx</td>
<td>-0800, +0100</td>
<td>a time zone with the format &plusmn;(2 digit hour offset)(2 digit minute offset).
This treats <i>space</i> as + (non-standard).</td>
</tr>
<tr>
<td>x</td>
<td>-08, +01</td>
<td>a time zone with the format &plusmn;(2 digit hour offset).
This treats <i>space</i> as + (non-standard).</td>
</tr>
<tr>
<td>'</td>
<td>'T', 'Z', 'GMT'</td>
<td>start and end of a series of literal characters</td>
</tr>
<tr>
<td>'' (two single quotes)</td>
<td>''</td>
<td>two single quotes denotes a literal single quote</td>
</tr>
<tr>
<td>[]</td>
<td>[ ]</td>
<td>the start ("[") and end ("]") of an optional section. 
This notation is only supported for literal characters and at the end 
of the format string.</td>
</tr>
<tr>
<td>#, {, }</td>
<td>#, {, }</td>
<td>reserved for future use</td>
</tr>
<tr>
<td>G,L,Q,e,c,V,z,O,p</td>
<td></td>
<td>These formatting characters are supported by Java's DateTimeFormatter,
but currently not supported by ERDDAP. If you need support for them,
email Chris.John at noaa.gov .</td>
</tr>
</table>

<p>Notes:
<ul>
<li>In a date time with punctuation, numeric values may have a variable number of digits
  (e.g., in the US slash date format "1/2/1985", the month and the date may be 1 or 2 digits)
  so the format must use 1-letter tokens, e.g., M/d/yyyy, which accept any number of digits
  for month and date.
<li>If the number of digits for an item is constant, e.g., 01/02/1985, then specify the 
  number of digits in the format, e.g., MM/dd/yyyy for 2-digit month, 2-digit date, and
  4 digit year.
<li>These formats are tricky to work with. A given format may work for most, but not all,
  time strings for a given variable. 
  Always check that the format you specify is working as expected in ERDDAP
  for all of a variable's time strings.
<li>When possible, GenerateDatasetsXml will suggest time format strings.
<li>If you need help generating a format string, please email Chris.John at noaa.gov .  
</ul>
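<p>As an illustration of the table and notes above, here is a sketch of a String time variable 
for US slash dates such as "1/2/1985". The sourceName is hypothetical, and the 
alternative patterns in the comments are readings of the table that you should verify 
against your own time strings:
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;sampleDate&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;dataType&gt;String&lt;/dataType&gt;
    &lt;addAttributes&gt;
        &lt;!-- 1/2/1985: month and day may be 1 or 2 digits --&gt;
        &lt;att name="units"&gt;M/d/yyyy&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;

&lt;!-- Other source strings and the patterns that should correspond to them:
     01/02/1985                   MM/dd/yyyy
     2018-01-02                   yyyy-MM-dd
     25 Mar 1987 17:32:05 -0800   d MMM yyyy HH:mm:ss xx
--&gt;
</pre>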

</ul>

          <p>The main time data variable (for tabular datasets) and the main time
          axis variable
          (for gridded datasets) are recognized by the 
          <a rel="help" href="#destinationName">destinationName</a> <kbd>time</kbd>.
          Their <kbd>units</kbd> metadata must be a UDUnits-compatible
          units string for numeric time values, e.g., "days since 1970-01-01" 
          (for tabular or gridded datasets), or 
          <a rel="help" href="#stringTimeUnits">units suitable for string times</a>,
          e.g., "M/d/yyyy" (for tabular datasets).
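          <p>For a gridded dataset, the main time axis might be declared as in this sketch.
          It assumes a source file whose units use a non-ISO base time 
          (such as "days since Jan 1, 1985"), so the units are replaced via 
          <kbd>&lt;addAttributes&gt;</kbd>; the sourceName is hypothetical:
<pre>
&lt;axisVariable&gt;
    &lt;sourceName&gt;time&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;time&lt;/destinationName&gt;
    &lt;addAttributes&gt;
        &lt;!-- overrides the source's units attribute --&gt;
        &lt;att name="units"&gt;days since 1985-01-01&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/axisVariable&gt;
</pre>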

          <p>Different Time Units in Different Gridded .nc Files -
          If you have a collection of gridded .nc files where, for the time variable, 
          one subset of the files uses different time units 
          than one or more other subsets of the files, 
          you can use
          <a rel="help" href="#EDDGridFromNcFilesUnpacked">EDDGridFromNcFilesUnpacked</a>.
          It converts time values to "seconds since 1970-01-01T00:00:00Z" 
          at a lower level, thereby hiding the differences, 
          so that you can make one dataset from the collection of heterogeneous files.

          <p><a class="selfLink" id="timeStampVariable" href="#timeStampVariable" rel="bookmark">TimeStamp Variables</a> -- Any other variable 
          (axisVariable or dataVariable, in an EDDGrid or EDDTable dataset)
          can be a timeStamp variable. 
          Timestamp variables are variables that have time-related units and time data,
          but have a &lt;destinationName&gt; other than <kbd>time</kbd>.
          TimeStamp variables behave like the main time variable in that they
          convert the source's time format into 
          "seconds since 1970-01-01T00:00:00Z" and/or ISO 8601:2004(E) format. 
          ERDDAP™ recognizes timeStamp variables by their time-related
          "<a rel="help" href="#units">units</a>"
          metadata, which must match this regular expression "[a-zA-Z]+ +since +[0-9].+" 
          (for numeric dateTimes, for example, "seconds since 1970-01-01T00:00:00Z")
          or be a dateTime format string containing "uuuu", "yyyy" or "YYYY" (for example, "yyyy-MM-dd'T'HH:mm:ssZ").
          But please still use the destinationName "time" for the main dateTime variable.
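          <p>A sketch of a timeStamp variable in a tabular dataset. The source and 
          destination names are hypothetical; the point is that the destinationName is not 
          <kbd>time</kbd> and that the units match the "<i>units</i> since <i>baseTime</i>" 
          pattern:
<pre>
&lt;dataVariable&gt;
    &lt;sourceName&gt;deployTime&lt;/sourceName&gt;  &lt;!-- hypothetical source name --&gt;
    &lt;destinationName&gt;deployment_time&lt;/destinationName&gt;
    &lt;addAttributes&gt;
        &lt;!-- time-related units mark this as a timeStamp variable --&gt;
        &lt;att name="units"&gt;seconds since 1970-01-01T00:00:00Z&lt;/att&gt;
    &lt;/addAttributes&gt;
&lt;/dataVariable&gt;
</pre>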

          <p><strong>Always check your work to be sure that the time data that shows up
          in ERDDAP™ is the correct time data.</strong>
          Working with time data is always tricky and error prone. 

          <p>See <a rel="help" href="#LLAT">more information about time variables</a>.
          <br>ERDDAP™ has a utility to 
            <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html"
            >Convert
            a Numeric Time to/from a String Time</a>.
          <br>See <a rel="help" href="https://coastwatch.pfeg.noaa.gov/erddap/convert/time.html#erddap">How
            ERDDAP™ Deals with Time</a>.
          <br>&nbsp;
        </ul>

      <li><a class="selfLink" id="valid_range" href="#valid_range" rel="bookmark"><strong>valid_range</strong>, or <strong>valid_min</strong> and <strong>valid_max</strong></a> -- 
        These are OPTIONAL variable attributes 
        defined in the 
        <a rel="help"
        href="https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html"
        >CF<img 
        src="../images/external.png" alt=" (external link)" 
        title="This link to an external website does not constitute an endorsement."></a> 
        metadata conventions.  For example,
        <br><kbd>&lt;att name="valid_range" 
          <a rel="help" href="#attributeType">type="floatList"</a>&gt;0.0 40.0&lt;/att&gt;</kbd>
        <br>or
        <br><kbd>&lt;att name="valid_min" 
          <a rel="help" href="#attributeType">type="float"</a>&gt;0.0&lt;/att&gt;</kbd>
        <br><kbd>&lt;att name="valid_max" 
          <a rel="help" href="#attributeType">type="float"</a>&gt;40.0&lt;/att&gt;</kbd>
        <ul>
        <li>If present, they should be of the same data type as the variable, 
          and specify the valid minimum and maximum values of the data for that variable.
          Users should consider values outside this range to be invalid.
        <li>ERDDAP™ does not apply the valid_range. Said another way: ERDDAP™ does
          not convert data values outside the valid_range to the _FillValue or 
          missing_value. ERDDAP™ just passes on this metadata and leaves the
          application up to you. 
          <br>Why? That's what this metadata is for. If the data 
          provider had wanted to, the data provider could have converted 
          the data values outside of the valid_range to be _FillValues. 
          ERDDAP™ doesn't second guess the data provider. 
          This approach is safer: if it is later shown that the valid_range 
          was too narrow or otherwise incorrect, ERDDAP™ won't have obliterated the data.
        <li>If the data is packed with 
          <a rel="help" href="#scale_factor"><kbd>scale_factor</kbd>&nbsp;and/or&nbsp;<kbd>add_offset</kbd></a>,
          valid_range, valid_min and valid_max should use the packed data type and values.
          Since ERDDAP™ applies scale_factor and add_offset when it loads the dataset, 
          ERDDAP™ will unpack the valid_range, valid_min and valid_max
          values so that the destination metadata (shown to users) will indicate
          the unpacked data type and range.
          <br>Or, if an <kbd>unpacked_valid_range</kbd> attribute is present, 
          it will be renamed <kbd>valid_range</kbd> when ERDDAP™ loads the dataset.
        </ul>
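        <p>As a worked illustration of the packed case, suppose a variable is stored as 
        shorts with the attributes below. The numbers are made up, and such attributes 
        would usually come from the source file; they are written here in datasets.xml 
        notation only for readability:
<pre>
&lt;!-- unpacked value = packed value * scale_factor + add_offset --&gt;
&lt;att name="scale_factor" type="float"&gt;0.01&lt;/att&gt;
&lt;att name="add_offset" type="float"&gt;0.0&lt;/att&gt;
&lt;!-- packed limits: 0 * 0.01 + 0.0 = 0.0 and 4000 * 0.01 + 0.0 = 40.0 --&gt;
&lt;att name="valid_range" type="shortList"&gt;0 4000&lt;/att&gt;
</pre>
        <p>Under these assumptions, the destination metadata shown to users should report a 
        valid_range of about 0.0 to 40.0 as floats, matching the unpacked data.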

      </ul> <!-- end of variable attributes -->
      <li><a class="selfLink" id="removeMVRows" href="#removeMVRows" rel="bookmark"><kbd><strong>&lt;removeMVRows&gt;</strong></kbd></a> 
        is an OPTIONAL tag within a <kbd>&lt;dataset&gt;</kbd> tag in datasets.xml for EDDTableFromFiles (including all subclasses) datasets, though it is only used for EDDTableFromMultidimNcFiles. It can have a value of true or false. For example,
          <br><kbd>&lt;removeMVRows&gt;true&lt;/removeMVRows&gt;</kbd>
          <br>
          This removes any block of rows at the end of a group where all the values are missing_value, _FillValue, or the CoHort ...Array native missing value (or char=#32 for CharArrays).
          This is for the CF DSG Multidimensional Array file type and similar files.
          If true, this does the proper test and so always loads all the max dim variables, which may take extra time. 
          <br>
          The default value of <kbd>&lt;removeMVRows&gt;</kbd> is false.
            <br>
          Recommendation -- If possible for your dataset, we recommend setting removeMVRows to false. Setting removeMVRows to true can significantly slow down requests, though it may be needed for some datasets.
          <br>&nbsp;
        </ul>
    </ul>
</ul>


<br>&nbsp;
<hr>
<h2><a class="selfLink" id="contact" href="#contact" rel="bookmark">Contact</a></h2> 
Questions, comments, suggestions?  Please send an email to 
  <kbd>erd dot data at noaa dot gov</kbd>
and include the ERDDAP™ URL directly related to your question or comment.

<p><a class="selfLink" id="ERDDAPMailingList" href="#ERDDAPMailingList" rel="bookmark">Or,
you can join the ERDDAP™ Google Group / Mailing List</a> by visiting
<a class="N" rel="help"
href="https://groups.google.com/forum/#!forum/erddap"
>https://groups.google.com/forum/#!forum/erddap<img 
  src="../images/external.png" alt=" (external link)" 
  title="This link to an external website does not constitute an endorsement."></a> 
and clicking on "Apply for membership". 
Once you are a member, you can post your question there
or search to see if the question has already been asked and answered. 

<br>&nbsp;
<hr>
<p>ERDDAP, Version 2.25
<br><a rel="bookmark" href="https://coastwatch.pfeg.noaa.gov/erddap/legal.html">Disclaimers</a> | 
    <a rel="bookmark" href="https://coastwatch.pfeg.noaa.gov/erddap/legal.html#privacyPolicy"
    >Privacy Policy</a>

</div>
</body>
</html>