Documentation for Granules

DataGranules is the class earthaccess uses to query CMR at the granule level.

Bases: GranuleQuery

A granule-oriented client for NASA CMR.

API

https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html

Source code in earthaccess/search.py
def __init__(self, auth: Optional[Auth] = None, *args: Any, **kwargs: Any) -> None:
    super().__init__(*args, **kwargs)

    self.session = (
        # To search, we need the new bearer tokens from NASA Earthdata
        auth.get_session(bearer_token=True)
        if auth and auth.authenticated
        else requests.session()
    )

    if auth:
        self.mode(auth.system.cmr_base_url)

    self._debug = False

bounding_box(lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat)

Filter by granules that overlap a bounding box. Must be used in combination with a collection filtering parameter such as short_name or entry_title.

Parameters:

Name Type Description Default
lower_left_lon FloatLike

lower left longitude of the box

required
lower_left_lat FloatLike

lower left latitude of the box

required
upper_right_lon FloatLike

upper right longitude of the box

required
upper_right_lat FloatLike

upper right latitude of the box

required

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

A coordinate could not be converted to a float.

Source code in earthaccess/search.py
@override
def bounding_box(
    self,
    lower_left_lon: FloatLike,
    lower_left_lat: FloatLike,
    upper_right_lon: FloatLike,
    upper_right_lat: FloatLike,
) -> Self:
    """Filter by granules that overlap a bounding box. Must be used in combination
    with a collection filtering parameter such as short_name or entry_title.

    Parameters:
        lower_left_lon: lower left longitude of the box
        lower_left_lat: lower left latitude of the box
        upper_right_lon: upper right longitude of the box
        upper_right_lat: upper right latitude of the box

    Returns:
        self

    Raises:
        ValueError: A coordinate could not be converted to a float.
    """
    return super().bounding_box(
        lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat
    )
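
To make the argument order concrete, here is a minimal sketch of how the four corners might be checked and joined into one comma-separated value. The helper name `bounding_box_param` and the `FloatLike` alias below are hypothetical, and the string layout is an assumption based on CMR's `bounding_box` search parameter; this is not earthaccess's implementation.

```python
from typing import Union

# Assumed stand-in for the library's FloatLike alias.
FloatLike = Union[str, int, float]


def bounding_box_param(
    lower_left_lon: FloatLike,
    lower_left_lat: FloatLike,
    upper_right_lon: FloatLike,
    upper_right_lat: FloatLike,
) -> str:
    """Return 'west,south,east,north' after converting every corner to float."""
    corners = (lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat)
    # float() raises ValueError for strings that are not numeric, e.g. "north".
    return ",".join(str(float(value)) for value in corners)


print(bounding_box_param(-46.5, 61.0, "-42.5", 63))  # -46.5,61.0,-42.5,63.0
```

Note that longitude comes before latitude for both corners.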

cloud_cover(min_cover=0, max_cover=100)

Filter by the percentage of cloud cover present in the granule.

Parameters:

Name Type Description Default
min_cover Optional[FloatLike]

minimum percentage of cloud cover

0
max_cover Optional[FloatLike]

maximum percentage of cloud cover

100

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

min_cover or max_cover is not convertible to a float, or min_cover is greater than max_cover.

Source code in earthaccess/search.py
@override
def cloud_cover(
    self,
    min_cover: Optional[FloatLike] = 0,
    max_cover: Optional[FloatLike] = 100,
) -> Self:
    """Filter by the percentage of cloud cover present in the granule.

    Parameters:
        min_cover: minimum percentage of cloud cover
        max_cover: maximum percentage of cloud cover

    Returns:
        self

    Raises:
        ValueError: `min_cover` or `max_cover` is not convertible to a float,
            or `min_cover` is greater than `max_cover`.
    """
    return super().cloud_cover(min_cover, max_cover)
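
The two error conditions listed under Raises can be sketched as follows. The helper `check_cloud_cover` is hypothetical, and treating `None` as an open end of the range is an assumption suggested by the `Optional` type hints rather than documented behavior.

```python
from typing import Optional, Tuple, Union

# Assumed stand-in for the library's FloatLike alias.
FloatLike = Union[str, int, float]


def check_cloud_cover(
    min_cover: Optional[FloatLike] = 0,
    max_cover: Optional[FloatLike] = 100,
) -> Tuple[Optional[float], Optional[float]]:
    """Validate a cloud-cover range; None leaves that side of the range open."""
    # float() raises ValueError when a bound is not convertible.
    low = None if min_cover is None else float(min_cover)
    high = None if max_cover is None else float(max_cover)
    if low is not None and high is not None and low > high:
        raise ValueError("min_cover must not be greater than max_cover")
    return low, high


print(check_cloud_cover(0, 20))  # (0.0, 20.0)
```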

cloud_hosted(cloud_hosted=True)

Only match granules that are hosted in the cloud. This is valid for public collections and when using the short_name parameter. A concept ID already identifies a single collection, so it needs no such filter.

Tip

Cloud-hosted collections can be public or restricted. Restricted collections will not be matched using this parameter.

Parameters:

Name Type Description Default
cloud_hosted bool

If True, obtain only granules from cloud-hosted collections.

True

Returns:

Type Description
Self

self

Raises:

Type Description
TypeError

cloud_hosted is not of type bool.

Source code in earthaccess/search.py
def cloud_hosted(self, cloud_hosted: bool = True) -> Self:
    """Only match granules that are hosted in the cloud.
    This is valid for public collections and when using the short_name parameter.
    Concept-Id is unambiguous.

    ???+ Tip
        Cloud-hosted collections can be public or restricted.
        Restricted collections will not be matched using this parameter.

    Parameters:
        cloud_hosted: If `True`, obtain only granules from cloud-hosted collections.

    Returns:
        self

    Raises:
        TypeError: `cloud_hosted` is not of type `bool`.
    """
    if not isinstance(cloud_hosted, bool):
        raise TypeError("cloud_hosted must be of type bool")

    if "short_name" in self.params:
        provider = find_provider_by_shortname(
            self.params["short_name"], cloud_hosted
        )
        if provider is not None:
            self.params["provider"] = provider
    return self

daac(daac_short_name)

Only match collections for a given DAAC. Unless cloud_hosted is already set in the query parameters, the DAAC's on-prem provider is used.

Parameters:

Name Type Description Default
daac_short_name str

a DAAC shortname, e.g. NSIDC, PODAAC, GESDISC

required

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
def daac(self, daac_short_name: str) -> Self:
    """Only match collections for a given DAAC. Default to on-prem collections for
    the DAAC.

    Parameters:
        daac_short_name: a DAAC shortname, e.g. NSIDC, PODAAC, GESDISC

    Returns:
        self
    """
    if "cloud_hosted" in self.params:
        cloud_hosted = self.params["cloud_hosted"]
    else:
        cloud_hosted = False
    self.DAAC = daac_short_name
    self.params["provider"] = find_provider(daac_short_name, cloud_hosted)
    return self

data_center(data_center_name)

An alias for the daac method.

Parameters:

Name Type Description Default
data_center_name str

DAAC shortname, e.g. NSIDC, PODAAC, GESDISC

required

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
def data_center(self, data_center_name: str) -> Self:
    """An alias for the `daac` method.

    Parameters:
        data_center_name (String): DAAC shortname, e.g. NSIDC, PODAAC, GESDISC

    Returns:
        self
    """
    return self.daac(data_center_name)

day_night_flag(day_night_flag)

Filter by period of the day the granule was collected during.

Parameters:

Name Type Description Default
day_night_flag str

"day", "night", or "unspecified"

required

Returns:

Type Description
Self

self

Raises:

Type Description
TypeError

day_night_flag is not of type str.

ValueError

day_night_flag is not one of "day", "night", or "unspecified".

Source code in earthaccess/search.py
@override
def day_night_flag(self, day_night_flag: str) -> Self:
    """Filter by period of the day the granule was collected during.

    Parameters:
        day_night_flag: "day", "night", or "unspecified"

    Returns:
        self

    Raises:
        TypeError: `day_night_flag` is not of type `str`.
        ValueError: `day_night_flag` is not one of `"day"`, `"night"`, or
            `"unspecified"`.
    """
    return super().day_night_flag(day_night_flag)

debug(debug=True)

If True, prints the actual query sent to CMR. Note that pagination happens in the headers.

Parameters:

Name Type Description Default
debug bool

If True, print the CMR query.

True

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
def debug(self, debug: bool = True) -> Self:
    """If True, prints the actual query to CMR, notice that the pagination happens
    in the headers.

    Parameters:
        debug: If `True`, print the CMR query.

    Returns:
        self
    """
    self._debug = debug
    return self

doi(doi)

Search data granules by DOI.

Tip

Not all datasets have an associated DOI. Internally, if a collection is found for the DOI, earthaccess uses its concept_id for the query to CMR.

Parameters:

Name Type Description Default
doi str

DOI of a dataset, e.g. 10.5067/AQR50-3Q7CS

required

Returns:

Type Description
Self

self

Raises:

Type Description
RuntimeError

The CMR query to get the collection for the DOI fails.

Source code in earthaccess/search.py
def doi(self, doi: str) -> Self:
    """Search data granules by DOI.

    ???+ Tip
        Not all datasets have an associated DOI, internally if a DOI is found
        earthaccess will grab the concept_id for the query to CMR.

    Parameters:
        doi: DOI of a dataset, e.g. 10.5067/AQR50-3Q7CS

    Returns:
        self

    Raises:
        RuntimeError: The CMR query to get the collection for the DOI fails.
    """
    # TODO consider deferring this query until the search is executed
    collection = DataCollections().doi(doi).get()

    # TODO consider raising an exception when there are multiple collections, since
    # we can't know which one the user wants, and choosing one is arbitrary.
    if len(collection) > 0:
        concept_id = collection[0].concept_id()
        self.params["concept_id"] = concept_id
    else:
        # TODO consider removing this print statement since we don't print such
        # a message in other cases where no results are found.  Seems arbitrary.
        logger.info(
            f"earthaccess couldn't find any associated collections with the DOI: {doi}"
        )

    return self

downloadable(downloadable=True)

Only match granules that are available for download. The inverse of this method is online_only.

Parameters:

Name Type Description Default
downloadable bool

If True, obtain only granules that are downloadable.

True

Returns:

Type Description
Self

self

Raises:

Type Description
TypeError

downloadable is not of type bool.

Source code in earthaccess/search.py
@override
def downloadable(self, downloadable: bool = True) -> Self:
    """Only match granules that are available for download. The inverse of this
    method is `online_only`.

    Parameters:
        downloadable: If `True`, obtain only granules that are downloadable.

    Returns:
        self

    Raises:
        TypeError: `downloadable` is not of type `bool`.
    """
    return super().downloadable(downloadable)

get(limit=2000)

Get all the granules that match the current parameters, up to some limit, even if the results span multiple pages.

Tip

The default page size is 2000. Be careful with the request size, because all the JSON elements are loaded into memory. This is more of an issue with granules than with collections, as there can potentially be millions of them.

Parameters:

Name Type Description Default
limit int

The number of results to return.

2000

Returns:

Type Description
List[DataGranule]

Query results as a (possibly empty) list of DataGranule instances.

Raises:

Type Description
RuntimeError

The CMR query failed.

Source code in earthaccess/search.py
@override
def get(self, limit: int = 2000) -> List[DataGranule]:
    """Get all the granules that match the current parameters, up to some limit,
    even if the results span multiple pages.

    ???+ Tip
        The default page size is 2000. Be careful with the request size, because
        all the JSON elements are loaded into memory. This is more of an issue
        with granules than with collections, as there can potentially be
        millions of them.

    Parameters:
        limit: The number of results to return.

    Returns:
        Query results as a (possibly empty) list of `DataGranule` instances.

    Raises:
        RuntimeError: The CMR query failed.
    """
    response = get_results(self.session, self, limit)
    cloud = len(response) > 0 and self._is_cloud_hosted(response[0])

    return [DataGranule(granule, cloud_hosted=cloud) for granule in response]
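
Because every result is held in memory, it can help to estimate the size of a request before calling `get`. The sketch below assumes results arrive in pages of at most 2000 (the default page size mentioned in the tip) and that the total count comes from `hits()`; the helper `pages_needed` is hypothetical and says nothing about how earthaccess actually pages.

```python
import math

PAGE_SIZE = 2000  # default page size mentioned in the tip


def pages_needed(hits: int, limit: int, page_size: int = PAGE_SIZE) -> int:
    """Number of page requests needed to fetch min(hits, limit) results."""
    if hits < 0 or limit < 0 or page_size <= 0:
        raise ValueError("hits and limit must be >= 0 and page_size must be > 0")
    wanted = min(hits, limit)
    return math.ceil(wanted / page_size)


# A query with 1.25 million hits, capped at 5000 results.
print(pages_needed(hits=1_250_000, limit=5000))  # 3
```

Keeping `limit` small while exploring, then raising it once the query is narrowed, keeps both the number of requests and the memory footprint predictable.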

granule_name(granule_name)

Find granules matching either the granule UR or the producer granule ID. Queries use the readable_granule_name metadata field.

Tip

We can use wildcards on a granule name to further refine our search, e.g. MODGRNLD.*.daily.*.

Parameters:

Name Type Description Default
granule_name str

granule name (accepts wildcards)

required

Returns:

Type Description
Self

self

Raises:

Type Description
TypeError

granule_name is not of type str.

Source code in earthaccess/search.py
def granule_name(self, granule_name: str) -> Self:
    """Find granules matching either granule ur or producer granule id,
    queries using the readable_granule_name metadata field.

    ???+ Tip
        We can use wildcards on a granule name to further refine our search,
        e.g. `MODGRNLD.*.daily.*`.

    Parameters:
        granule_name: granule name (accepts wildcards)

    Returns:
        self

    Raises:
        TypeError: if `granule_name` is not of type `str`
    """
    if not isinstance(granule_name, str):
        raise TypeError("granule_name must be of type string")

    self.params["readable_granule_name"] = granule_name
    self.params["options[readable_granule_name][pattern]"] = True
    return self
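
To get a feel for what a pattern such as `MODGRNLD.*.daily.*` selects, the wildcard can be approximated locally with Python's `fnmatch` module. This is only an illustration: the real matching is done by CMR, whose wildcard rules may differ in detail, and the file names below are made up.

```python
from fnmatch import fnmatchcase

pattern = "MODGRNLD.*.daily.*"

# Hypothetical granule names, for illustration only.
names = [
    "MODGRNLD.2019001.daily.v01.nc",
    "MODGRNLD.2019001.monthly.v01.nc",
    "MYDGRNLD.2019001.daily.v01.nc",
]

# Keep only the names that fit the wildcard pattern (case-sensitive).
matches = [name for name in names if fnmatchcase(name, pattern)]
print(matches)  # ['MODGRNLD.2019001.daily.v01.nc']
```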

hits()

Returns the number of hits the current query will return.

This is done by making a lightweight query to CMR and inspecting the returned headers.

Returns:

Type Description
int

The number of results reported by the CMR.

Raises:

Type Description
RuntimeError

The CMR query failed.

Source code in earthaccess/search.py
@override
def hits(self) -> int:
    """Returns the number of hits the current query will return.

    This is done by making a lightweight query to CMR and inspecting the returned
    headers.

    Returns:
        The number of results reported by the CMR.

    Raises:
        RuntimeError: The CMR query failed.
    """
    url = self._build_url()

    response = self.session.get(url, headers=self.headers, params={"page_size": 0})

    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as ex:
        if ex.response is not None:
            raise RuntimeError(ex.response.text) from ex
        else:
            raise RuntimeError(str(ex)) from ex

    return int(response.headers["CMR-Hits"])

instrument(instrument)

Filter by the instrument associated with the granule.

Parameters:

Name Type Description Default
instrument str

name of the instrument

required

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

instrument is not a non-empty string.

Source code in earthaccess/search.py
@override
def instrument(self, instrument: str) -> Self:
    """Filter by the instrument associated with the granule.

    Parameters:
        instrument: name of the instrument

    Returns:
        self

    Raises:
        ValueError: `instrument` is not a non-empty string.
    """
    return super().instrument(instrument)

line(coordinates)

Filter by granules that overlap a series of connected points. Must be used in combination with a collection filtering parameter such as short_name or entry_title.

Parameters:

Name Type Description Default
coordinates Sequence[PointLike]

a list of (lon, lat) tuples

required

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

coordinates is not a sequence of at least 2 coordinate pairs, or a coordinate could not be converted to a float.

Source code in earthaccess/search.py
@override
def line(self, coordinates: Sequence[PointLike]) -> Self:
    """Filter by granules that overlap a series of connected points. Must be used
    in combination with a collection filtering parameter such as short_name or
    entry_title.

    Parameters:
        coordinates: a list of (lon, lat) tuples

    Returns:
        self

    Raises:
        ValueError: `coordinates` is not a sequence of at least 2 coordinate
            pairs, or a coordinate could not be converted to a float.
    """
    return super().line(coordinates)

online_only(online_only=True)

Only match granules that are listed online and not available for download. The inverse of this method is downloadable.

Parameters:

Name Type Description Default
online_only bool

If True, obtain only granules that are online (not downloadable)

True

Returns:

Type Description
Self

self

Raises:

Type Description
TypeError

online_only is not of type bool.

Source code in earthaccess/search.py
@override
def online_only(self, online_only: bool = True) -> Self:
    """Only match granules that are listed online and not available for download.
    The inverse of this method is `downloadable`.

    Parameters:
        online_only: If `True`, obtain only granules that are online (not
            downloadable)

    Returns:
        self

    Raises:
        TypeError: `online_only` is not of type `bool`.
    """
    return super().online_only(online_only)

orbit_number(orbit1, orbit2=None)

Filter by the orbit number the granule was acquired during. Either a single orbit can be targeted or a range of orbits.

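Parameters:

Name Type Description Default
orbit1 FloatLike

orbit to target (lower limit of range when orbit2 is provided)

required
orbit2 Optional[FloatLike]

upper limit of range

None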

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
@override
def orbit_number(
    self,
    orbit1: FloatLike,
    orbit2: Optional[FloatLike] = None,
) -> Self:
    """Filter by the orbit number the granule was acquired during. Either a single
    orbit can be targeted or a range of orbits.

    Parameters:
        orbit1: orbit to target (lower limit of range when orbit2 is provided)
        orbit2: upper limit of range

    Returns:
        self
    """
    return super().orbit_number(orbit1, orbit2)

parameters(**kwargs)

Provide query parameters as keyword arguments. The keyword needs to match the name of the method, and the value should either be the value or a tuple of values.

Example
query = DataGranules().parameters(
    short_name="AST_L1T",
    temporal=("2015-01", "2015-02"),
    point=(-101.25, 42.5)
)

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

The name of a keyword argument is not the name of a method.

TypeError

The value of a keyword argument is not an argument or tuple of arguments matching the number and type(s) of the method's parameters.

Source code in earthaccess/search.py
@override
def parameters(self, **kwargs: Any) -> Self:
    """Provide query parameters as keyword arguments. The keyword needs to match the
    name of the method, and the value should either be the value or a tuple of
    values.

    ???+ Example
        ```python
        query = DataGranules().parameters(
            short_name="AST_L1T",
            temporal=("2015-01", "2015-02"),
            point=(-101.25, 42.5)
        )
        ```

    Returns:
        self

    Raises:
        ValueError: The name of a keyword argument is not the name of a method.
        TypeError: The value of a keyword argument is not an argument or tuple
            of arguments matching the number and type(s) of the method's parameters.
    """
    methods = {}
    for name, func in getmembers(self, predicate=ismethod):
        methods[name] = func

    for key, val in kwargs.items():
        # verify the key matches one of our methods
        if key not in methods:
            raise ValueError("Unknown key {}".format(key))

        # call the method
        if isinstance(val, tuple):
            methods[key](*val)
        else:
            methods[key](val)

    return self
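
The dispatch logic above can be tried in isolation with a small stand-in class. `TinyQuery` is hypothetical and not part of earthaccess; it copies only the keyword-to-method lookup so that the tuple-unpacking rule and both error cases are easy to see.

```python
from inspect import getmembers, ismethod
from typing import Any, Dict


class TinyQuery:
    """Hypothetical stand-in for a query class; not part of earthaccess."""

    def __init__(self) -> None:
        self.params: Dict[str, Any] = {}

    def short_name(self, short_name: str) -> "TinyQuery":
        self.params["short_name"] = short_name
        return self

    def point(self, lon: float, lat: float) -> "TinyQuery":
        self.params["point"] = f"{float(lon)},{float(lat)}"
        return self

    def parameters(self, **kwargs: Any) -> "TinyQuery":
        # Map every bound method name to the method itself.
        methods = dict(getmembers(self, predicate=ismethod))
        for key, val in kwargs.items():
            if key not in methods:
                raise ValueError(f"Unknown key {key}")
            # A tuple is unpacked into positional arguments; any other value
            # is passed as the single argument.
            if isinstance(val, tuple):
                methods[key](*val)
            else:
                methods[key](val)
        return self


query = TinyQuery().parameters(short_name="AST_L1T", point=(-101.25, 42.5))
print(query.params)  # {'short_name': 'AST_L1T', 'point': '-101.25,42.5'}
```

A tuple with the wrong number of elements, such as `point=(1.0,)`, surfaces as the `TypeError` raised by the underlying method call.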

platform(platform)

Filter by the satellite platform the granule came from.

Parameters:

Name Type Description Default
platform str

name of the satellite

required

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

platform is not a non-empty string.

Source code in earthaccess/search.py
@override
def platform(self, platform: str) -> Self:
    """Filter by the satellite platform the granule came from.

    Parameters:
        platform: name of the satellite

    Returns:
        self

    Raises:
        ValueError: `platform` is not a non-empty string.
    """
    return super().platform(platform)

point(lon, lat)

Filter by granules that include a geographic point.

Parameters:

Name Type Description Default
lon FloatLike

longitude of geographic point

required
lat FloatLike

latitude of geographic point

required

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

lon or lat cannot be converted to a float.

Source code in earthaccess/search.py
@override
def point(self, lon: FloatLike, lat: FloatLike) -> Self:
    """Filter by granules that include a geographic point.

    Parameters:
        lon: longitude of geographic point
        lat: latitude of geographic point

    Returns:
        self

    Raises:
        ValueError: `lon` or `lat` cannot be converted to a float.
    """
    return super().point(lon, lat)

polygon(coordinates)

Filter by granules that overlap a polygonal area. Must be used in combination with a collection filtering parameter such as short_name or entry_title.

Parameters:

Name Type Description Default
coordinates Sequence[PointLike]

list of (lon, lat) tuples

required

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

coordinates is not a sequence of at least 4 coordinate pairs, any of the coordinates cannot be converted to a float, or the first and last coordinate pairs are not equal.

Source code in earthaccess/search.py
@override
def polygon(self, coordinates: Sequence[PointLike]) -> Self:
    """Filter by granules that overlap a polygonal area. Must be used in combination
    with a collection filtering parameter such as short_name or entry_title.

    Parameters:
        coordinates: list of (lon, lat) tuples

    Returns:
        self

    Raises:
        ValueError: `coordinates` is not a sequence of at least 4 coordinate
            pairs, any of the coordinates cannot be converted to a float, or the
            first and last coordinate pairs are not equal.
    """
    return super().polygon(coordinates)
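
The three conditions listed under Raises can be sketched as a standalone check. The helper `check_polygon` and the type aliases are hypothetical; the sketch mirrors the documented rules (at least 4 pairs, float-convertible values, closed ring) and is not the library's implementation.

```python
from typing import List, Sequence, Tuple, Union

# Assumed stand-ins for the library's FloatLike and PointLike aliases.
FloatLike = Union[str, int, float]
PointLike = Tuple[FloatLike, FloatLike]


def check_polygon(coordinates: Sequence[PointLike]) -> List[Tuple[float, float]]:
    """Convert (lon, lat) pairs to floats and verify that the ring is closed."""
    if len(coordinates) < 4:
        raise ValueError("a polygon needs at least 4 coordinate pairs")
    # float() raises ValueError for values that are not numeric.
    ring = [(float(lon), float(lat)) for lon, lat in coordinates]
    if ring[0] != ring[-1]:
        raise ValueError("first and last coordinate pairs must be equal")
    return ring


# A triangle needs 4 pairs because the first point is repeated at the end.
triangle = [(-100, 40), (-90, 40), (-95, 45), (-100, 40)]
print(check_polygon(triangle))
```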

provider(provider)

Only match collections from a given provider.

A NASA data center or DAAC can have one or more providers. For example, PODAAC is a data center (DAAC); PODAAC is also its default provider for on-prem data, and POCLOUD is its provider for data in the cloud.

Parameters:

Name Type Description Default
provider str

a provider code for any DAAC, e.g. POCLOUD, NSIDC_CPRD, etc.

required

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
@override
def provider(self, provider: str) -> Self:
    """Only match collections from a given provider.

    A NASA datacenter or DAAC can have one or more providers.
    For example, PODAAC is a data center or DAAC,
    PODAAC is the default provider for on-prem data, and POCLOUD is
    the PODAAC provider for their data in the cloud.

    Parameters:
        provider: a provider code for any DAAC, e.g. POCLOUD, NSIDC_CPRD, etc.

    Returns:
        self
    """
    self.params["provider"] = provider
    return self

short_name(short_name)

Filter by short name (aka product or collection name).

Parameters:

Name Type Description Default
short_name str

name of a collection

required

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
@override
def short_name(self, short_name: str) -> Self:
    """Filter by short name (aka product or collection name).

    Parameters:
        short_name: name of a collection

    Returns:
        self
    """
    return super().short_name(short_name)

temporal(date_from=None, date_to=None, exclude_boundary=False)

Filter by an open or closed date range. Dates can be provided as date objects or ISO 8601 strings. Multiple ranges can be provided by successive method calls.

Tip

Giving either datetime.date(YYYY, MM, DD) or "YYYY-MM-DD" as the date_to parameter includes that entire day (i.e. the time is set to 23:59:59). Using datetime.datetime(YYYY, MM, DD) is different, because datetime.datetime objects have 00:00:00 as their built-in default.

Parameters:

Name Type Description Default
date_from Optional[Union[str, date, datetime]]

start of temporal range

None
date_to Optional[Union[str, date, datetime]]

end of temporal range

None
exclude_boundary bool

whether to exclude the date_from and date_to in the matched range

False

Returns:

Type Description
Self

self

Raises:

Type Description
ValueError

date_from or date_to is a non-None value that is neither a datetime object nor a string that can be parsed as a datetime object; or date_from and date_to are both datetime objects (or parsable as such) and date_from is after date_to.

Source code in earthaccess/search.py
@override
def temporal(
    self,
    date_from: Optional[Union[str, dt.date, dt.datetime]] = None,
    date_to: Optional[Union[str, dt.date, dt.datetime]] = None,
    exclude_boundary: bool = False,
) -> Self:
    """Filter by an open or closed date range. Dates can be provided as date objects
    or ISO 8601 strings. Multiple ranges can be provided by successive method calls.

    ???+ Tip
        Giving either `datetime.date(YYYY, MM, DD)` or `"YYYY-MM-DD"` as the `date_to`
        parameter includes that entire day (i.e. the time is set to `23:59:59`).
        Using `datetime.datetime(YYYY, MM, DD)` is different, because `datetime.datetime`
        objects have `00:00:00` as their built-in default.

    Parameters:
        date_from: start of temporal range
        date_to: end of temporal range
        exclude_boundary: whether to exclude the date_from and date_to in the matched range

    Returns:
        self

    Raises:
        ValueError: `date_from` or `date_to` is a non-`None` value that is
            neither a datetime object nor a string that can be parsed as a datetime
            object; or `date_from` and `date_to` are both datetime objects (or
            parsable as such) and `date_from` is after `date_to`.
    """
    return super().temporal(date_from, date_to, exclude_boundary)
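
The difference described in the tip can be illustrated with a small helper. `end_of_range` is hypothetical and only mimics the documented behavior for `date_to`; it is not how earthaccess parses dates internally.

```python
import datetime as dt
from typing import Union


def end_of_range(date_to: Union[str, dt.date, dt.datetime]) -> dt.datetime:
    """Illustrate the tip: date-only values cover the whole day."""
    if isinstance(date_to, dt.datetime):
        # Check datetime first because it is a subclass of date.
        # The time is kept as given, which is 00:00:00 if none was set.
        return date_to
    if isinstance(date_to, dt.date):
        return dt.datetime.combine(date_to, dt.time(23, 59, 59))
    try:
        # "YYYY-MM-DD" has no time component, so extend it to the end of the day.
        day = dt.datetime.strptime(date_to, "%Y-%m-%d")
    except ValueError:
        # Other ISO 8601 strings carry their own time and are used as given.
        return dt.datetime.fromisoformat(date_to)
    return dt.datetime.combine(day.date(), dt.time(23, 59, 59))


print(end_of_range("2020-01-31"))                # 2020-01-31 23:59:59
print(end_of_range(dt.datetime(2020, 1, 31)))    # 2020-01-31 00:00:00
```

To include all of January 31 when passing a `datetime.datetime`, set the time explicitly, for example `dt.datetime(2020, 1, 31, 23, 59, 59)`.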

version(version)

Filter by version. Note that CMR defines this as a string. For example, MODIS version 6 products must be searched for with "006".

Parameters:

Name Type Description Default
version str

version string

required

Returns:

Type Description
Self

self

Source code in earthaccess/search.py
@override
def version(self, version: str) -> Self:
    """Filter by version. Note that CMR defines this as a string. For example,
    MODIS version 6 products must be searched for with "006".

    Parameters:
        version: version string

    Returns:
        self
    """
    return super().version(version)