earthaccess and NASA EDL
earthaccess lets us access datasets protected by NASA Earthdata Login (EDL). The library provides convenience methods to generate an access token and to create an authenticated Python requests session or fsspec file accessors.
The following simple examples show what we can do with them.
In [1]:
import earthaccess
auth = earthaccess.login()
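earthaccess.login() can pick up Earthdata Login credentials from more than one place. The sketch below is only an illustration and is not part of earthaccess: `pick_login_strategy` is a hypothetical helper, and it assumes the `strategy` values `"environment"`, `"netrc"`, and `"interactive"` and the `EARTHDATA_USERNAME` / `EARTHDATA_PASSWORD` variables described in the earthaccess documentation.

```python
import netrc
import os
from pathlib import Path

EDL_HOST = "urs.earthdata.nasa.gov"  # Earthdata Login host used in .netrc entries


def pick_login_strategy(env=None, netrc_path=None):
    """Hypothetical helper: guess which earthaccess login strategy is usable."""
    env = os.environ if env is None else env
    # Environment variables take priority in this sketch.
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        return "environment"

    path = Path(netrc_path) if netrc_path is not None else Path.home() / ".netrc"
    try:
        if netrc.netrc(str(path)).authenticators(EDL_HOST) is not None:
            return "netrc"
    except (OSError, netrc.NetrcParseError):
        pass  # missing or unreadable file: fall back to a prompt
    return "interactive"


# With no variables set and no .netrc file, only a prompt is left.
print(pick_login_strategy(env={}, netrc_path="/nonexistent/.netrc"))

# Assumed usage (needs earthaccess installed and a valid account):
# auth = earthaccess.login(strategy=pick_login_strategy())
```

Choosing the strategy explicitly can be useful in non-interactive environments such as CI jobs, where a password prompt would block.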
Data in AWS
If the data we want to access is hosted on AWS, we can use earthaccess to generate temporary S3 credentials for any of the DAACs.
In [2]:
s3_credentials = auth.get_s3_credentials("NSIDC")
s3_credentials
Out[2]:
{'accessKeyId': 'ASIA2D3OGJNTL5KJD3X2', 'secretAccessKey': 'AWM9EwgjmyNNOVhj3anRvYqIBFqGsPi+hkFGbKH+', 'sessionToken': 'FwoGZXIvYXdzEKr//////////wEaDB7/MqJEMI4AuUv6zyLcAfI26EAEL052p1j5kDFQFJDOfq870KxjutvEP7HyZmN+ArZ2d+j6rOHTN5ohGkJvEtaN7NnBRJAGSCdae8k4iyQ3gEOFTSkz3W7D6DJ4naUh74d9F6MeXnINehbpkJ88slqEUpQWCDmBov7jo6Cxj4uhGR7+eN/WDo9sbpK9ngaDKxQqO5xVCZwQFbihwSo08Uv+ofCuVBeJp+BQUzEOqTXykNA2Y4rYEX4eezqMeCYxORXrP1N3ul4rHgWnUA5cZIoq06sPRKHQEn9BGjWcoGE2kKEa2opsOgWo9d8ogMnusQYyLbWCypFj3oByH/OnY10QneJ/KnzzdrBDYYhuzuMkPEZnCZoq2aJvXXsH1xsw6Q==', 'expiration': '2024-05-08 17:12:48+00:00'}
These temporary S3 credentials are valid for 1 hour and can be used with third-party libraries that support S3 buckets.
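As a minimal sketch of handing these credentials to a third-party library, the helpers below rename the dictionary keys to the `key`, `secret`, and `token` arguments accepted by `s3fs.S3FileSystem`, and compute how long the credentials remain valid. Both helper names are hypothetical, and the example dictionary holds placeholder values with the same keys and timestamp format as the output above.

```python
from datetime import datetime, timezone


def to_s3fs_kwargs(creds):
    """Map the earthaccess credential dict to s3fs.S3FileSystem keyword names."""
    return {
        "key": creds["accessKeyId"],
        "secret": creds["secretAccessKey"],
        "token": creds["sessionToken"],
    }


def seconds_until_expiry(creds, now=None):
    """Return the number of seconds left before the credentials expire."""
    expires = datetime.fromisoformat(creds["expiration"])
    now = datetime.now(timezone.utc) if now is None else now
    return (expires - now).total_seconds()


# Placeholder values; real ones come from auth.get_s3_credentials("NSIDC").
example_creds = {
    "accessKeyId": "EXAMPLE_KEY_ID",
    "secretAccessKey": "EXAMPLE_SECRET",
    "sessionToken": "EXAMPLE_TOKEN",
    "expiration": "2024-05-08 17:12:48+00:00",
}

one_hour_before = datetime(2024, 5, 8, 16, 12, 48, tzinfo=timezone.utc)
print(seconds_until_expiry(example_creds, now=one_hour_before))  # 3600.0

# Assumed usage (needs s3fs installed and real credentials):
# import s3fs
# fs = s3fs.S3FileSystem(**to_s3fs_kwargs(s3_credentials))
```

Checking the remaining lifetime before a long-running job makes it easier to request fresh credentials instead of failing partway through.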
HTTPS access
We can also access data over HTTPS using authenticated Python requests sessions. The advantage of these sessions is that they work with every DAAC, and with data stored in S3 when it is accessed through HTTPS.
In [3]:
nsidc_url = "https://n5eil01u.ecs.nsidc.org/DP7/ATLAS/ATL06.005/2019.02.21/ATL06_20190221121851_08410203_005_01.h5"
lpcloud_url = "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc"
# this is a Python requests session
session = earthaccess.get_requests_https_session()
In [4]:
headers = {"Range": "bytes=0-100"}
r = session.get(lpcloud_url, headers=headers)
r.text
Out[4]:
'�HDF\r\n\x1a\n\x00\x00\x00\x00\x00\x08\x08\x00\x04\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00���������HUn\x00\x00\x00\x00��������\x00\x00\x00\x00\x00\x00\x00\x00`\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00OHDR\x02'
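Two details of the request above are worth spelling out. HTTP byte ranges are inclusive at both ends, so `bytes=0-100` asks for 101 bytes, and `r.text` decodes the binary payload as text, which is why replacement characters appear in the output; `r.content` returns the raw bytes instead. The following sketch uses a hypothetical helper, `range_header`, to build the header for a given offset and length.

```python
def range_header(offset, length):
    """Build a Range header requesting exactly `length` bytes from `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Both ends of an HTTP byte range are inclusive.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


print(range_header(0, 100))  # {'Range': 'bytes=0-99'}

# Assumed usage with the session created earlier:
# r = session.get(lpcloud_url, headers=range_header(0, 100))
# r.status_code  # 206 (Partial Content) when the server honors the range
# r.content      # raw bytes rather than decoded text
```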
Accessing remote files as if they were local with fsspec
In [5]:
fs = earthaccess.get_fsspec_https_session()
In [6]:
with fs.open(lpcloud_url) as f:
    data = f.read(100)
data
Out[6]:
b'\x89HDF\r\n\x1a\n\x00\x00\x00\x00\x00\x08\x08\x00\x04\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xd7HUn\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00`\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00OHDR'
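The bytes returned above begin with the 8-byte HDF5 format signature, which is expected because the `.nc` file is a netCDF-4 (HDF5-based) file. As a small sketch, the hypothetical helper below checks for that signature on any seekable binary file object; an `io.BytesIO` buffer stands in for the remote file here so the example runs without network access. It only looks at offset 0 and ignores files whose superblock starts later.

```python
import io

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of an HDF5 file


def looks_like_hdf5(fileobj):
    """Return True if the file object starts with the HDF5 signature."""
    position = fileobj.tell()
    try:
        fileobj.seek(0)
        return fileobj.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
    finally:
        fileobj.seek(position)  # leave the caller's read position untouched


# Stand-in for a remote file: the first 16 bytes of the output above.
fake_file = io.BytesIO(b"\x89HDF\r\n\x1a\n\x00\x00\x00\x00\x00\x08\x08\x00")
print(looks_like_hdf5(fake_file))  # True

# Assumed usage with the fsspec session created earlier:
# with fs.open(lpcloud_url) as f:
#     looks_like_hdf5(f)
```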
In [7]:
%%time
import xarray as xr
# earthaccess can open a list of files
files = earthaccess.open([lpcloud_url])
ds = xr.open_dataset(files[0], group="sensor_band_parameters")
ds
CPU times: user 2.53 s, sys: 211 ms, total: 2.74 s
Wall time: 25.5 s
Out[7]:
<xarray.Dataset>
Dimensions:           (bands: 285)
Dimensions without coordinates: bands
Data variables:
    wavelengths       (bands) float32 ...
    fwhm              (bands) float32 ...
    good_wavelengths  (bands) float32 ...